Computational methods for analysis of fragmented sequence data
Doctoral thesis, 2015

Recent developments in genomic and proteomic sequencing technologies have revolutionized research in the life sciences, providing new opportunities for the study of biological systems. However, modern sequence data sets are large, diverse, and heavily fragmented, which presents new challenges for their analysis and interpretation. In this thesis we present six research papers that describe novel methods for studying bacteria and bacterial communities through the analysis of large data sets produced by modern DNA and protein sequencing technologies. In Paper I, we describe a method for discovering fragments of fluoroquinolone antibiotic resistance genes in short fragments of DNA. The resistance phenotypes of the predicted resistance genes were then validated by expression in an Escherichia coli host (Paper II). The method was further improved to handle larger and more fragmented data sets in Paper III. In Paper IV, we present Tentacle, an easy-to-use tool for high-performance gene quantification that can be run on distributed computing resources, enabling fast and efficient analysis of terabase-scale metagenomes. In Paper V, we introduce proteotyping, an approach for microbial identification in clinical samples based on shotgun proteomics. Finally, in Paper VI, we describe and evaluate a method for proteotyping analysis suited for application to clinical diagnostics of bacterial infections. The rapidly increasing volumes of data produced by new sequencing technologies provide new opportunities for understanding microbial biology. Unlocking the full potential of large sequence data sets requires novel methods and approaches such as those presented in this thesis.



distributed computing



Pascal, Mathematical Sciences, Chalmers
Opponent: Erik Sonnhammer


Fredrik Boulund

Chalmers, Mathematical Sciences, Mathematical Statistics

University of Gothenburg

Tentacle: distributed quantification of genes in metagenomes

GigaScience, Vol. 4 (2015), article no. 40

Journal article

Functional verification of computationally predicted qnr genes

Annals of Clinical Microbiology and Antimicrobials, Vol. 12 (2013), article no. 34

Journal article


C3SE (Chalmers Centre for Computational Science and Engineering)

Subject Categories

Bioinformatics (Computational Biology)

Bioinformatics and Systems Biology

Areas of Advance

Life Science Engineering (2010-2018)



Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 3962
