Computational methods for analysis of fragmented sequence data

Fredrik Boulund

Doctoral thesis, 2015

Recent developments in genomic and proteomic sequencing technologies have revolutionized research in the life sciences, providing new opportunities for the study of biological systems. However, modern sequence data sets are large, diverse, and heavily fragmented, which presents new challenges for their analysis and interpretation. In this thesis we present six research papers that describe novel methods for studying bacteria and bacterial communities through the analysis of large data sets produced by modern DNA and protein sequencing technologies. In Paper I, we describe a method for discovering fragments of fluoroquinolone antibiotic resistance genes in short fragments of DNA. The resistance phenotypes of the predicted resistance genes were then validated by expression in an Escherichia coli host (Paper II). The method was further improved to handle larger and more fragmented data sets in Paper III. In Paper IV, we present Tentacle, an easy-to-use tool for high-performance gene quantification in metagenomes that can be run on distributed computing resources, enabling fast and efficient gene quantification in terabase-scale metagenomes. In Paper V, we introduce proteotyping, an approach for microbial identification in clinical samples based on shotgun proteomics. Finally, in Paper VI we describe and evaluate a method for proteotyping analysis suited for application to clinical diagnostics of bacterial infections. The rapidly increasing volumes of data produced by new sequencing technologies provide new opportunities for understanding microbial biology. Unlocking the full potential of large sequence data sets requires novel methods and approaches such as those presented in this thesis.

bioinformatics

sequencing

distributed computing

proteomics

metagenomics

Pascal, Mathematical Sciences, Chalmers

Opponent: Erik Sonnhammer

Author

Fredrik Boulund

Chalmers, Mathematical Sciences, Mathematical Statistics

University of Gothenburg

Tentacle: distributed quantification of genes in metagenomes

GigaScience, Vol. 4 (2015), article no. 40

Journal article

A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences

BMC Genomics, Vol. 13 (2012), p. 695

Journal article

Functional verification of computationally predicted qnr genes

Annals of Clinical Microbiology and Antimicrobials, Vol. 12 (2013), article no. 34

Journal article

Infrastructure

C3SE (-2020, Chalmers Centre for Computational Science and Engineering)

Subject Categories (SSIF 2011)

Bioinformatics (Computational Biology)

Bioinformatics and Systems Biology

Areas of Advance

Life Science Engineering (2010-2018)

ISBN

978-91-7597-281-7

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 3962

More information

Created

10/8/2017
