Short and long-read ultra-deep sequencing profiles emerging heterogeneity across five platform Escherichia coli strains
Journal article, 2021
Reprogramming organisms for large-scale bioproduction counters their evolutionary objective of fast growth and often leads to mutational collapse of the engineered production pathways during cultivation. Yet the mutational susceptibility of academic and industrial Escherichia coli bioproduction host strains is poorly understood. In this study, we apply second- and third-generation deep sequencing to profile simultaneous modes of genetic heterogeneity that decimate engineered biosynthetic production in five popular E. coli hosts, BL21(DE3), TOP10, MG1655, W, and W3110, producing 2,3-butanediol and mevalonic acid. Combining short-read and long-read sequencing, we detect strain- and sequence-specific mutational modes, including single nucleotide polymorphism, inversion, and mobile element transposition, as well as complex structural variations that disrupt the integrity of the engineered biosynthetic pathway. Our analysis suggests that organism engineers should avoid chassis strains hosting active insertion sequence (IS) subfamilies such as IS1 and IS10, which are present in the popular E. coli TOP10. We also recommend monitoring for increased mutagenicity in the pathway transcription initiation regions and recombinogenic repeats. Together, short and long sequencing reads identified latent low-frequency mutation events, such as a short detrimental inversion within a pathway gene driven by 8-bp inverted repeats. This demonstrates the power of combining ultra-deep DNA sequencing technologies to profile genetic heterogeneities of engineered constructs and to explore the markedly different mutational landscapes of common E. coli host strains. The observed multitude of evolving variants underlines the usefulness of early mutational profiling for new synthetic pathways designed to be sustained in organisms over long cultivation timescales.
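
The abstract recommends watching for recombinogenic repeats such as the 8-bp inverted repeats that drove an inversion inside a pathway gene. As a rough illustration only, and not the sequencing-based analysis used in the study, the sketch below scans a construct sequence for pairs of short inverted repeats, that is, a k-mer followed downstream by its reverse complement. The function names, the default `k=8`, and the `max_span` window are assumptions chosen for this example.

```python
from collections import defaultdict

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, k: int = 8, max_span: int = 2000):
    """List (left_start, right_start, kmer) for inverted repeat pairs.

    A pair is reported when a k-mer at left_start is followed by its
    reverse complement at right_start, the two arms do not overlap, and
    the whole unit (both arms plus spacer) is at most max_span bases.
    Hypothetical helper for illustration; coordinates are 0-based.
    """
    seq = seq.upper()

    # Index every k-mer by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    hits = []
    for kmer, starts in positions.items():
        rc = reverse_complement(kmer)
        if rc not in positions:
            continue
        for i in starts:
            for j in positions[rc]:
                non_overlapping = j >= i + k
                within_window = (j + k - i) <= max_span
                if non_overlapping and within_window:
                    hits.append((i, j, kmer))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: an 8-bp arm, a 10-bp spacer, then the arm's reverse complement.
    toy = "GGGG" + "ACCGTTAG" + "TTTTTTTTTT" + "CTAACGGT" + "GGGG"
    for left, right, arm in find_inverted_repeats(toy):
        print(f"arm {arm} at {left}, inverted copy at {right}")
```

A screen like this only flags candidate sites in a design; whether a given repeat actually causes inversions in a particular host would still need the kind of deep sequencing described above.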