Statistical assessment of genomic variability in tumours and bacterial communities
Doctoral thesis, 2019

Current high-throughput DNA sequencing technologies can generate large amounts of high-resolution genomic data. However, the high dimensionality, combined with the substantial levels of technical error and biological variability typically present in the data, makes the analysis challenging. Tailored statistical methods are therefore crucial for reaching valid biological conclusions. In this thesis, such methods were developed and applied to address research questions in biology and medicine.
First, a method for identifying tumour-specific (somatic) mutations was developed, which included steps for noise reduction, sensitive detection of DNA alterations and removal of systematic errors. In Paper I, the method was applied to exome-sequenced paired normal–tumour samples from pheochromocytoma patients. A significantly higher mutation rate was found in malignant than in benign tumours, and three genes with recurrent somatic mutations, found exclusively in malignant tumours, were identified. In Papers II and III, somatic mutations were identified in patients with acute myeloid leukemia and evaluated as biomarkers in personalised deep sequencing analysis of remaining cancer cells after treatment. In Paper III, a statistical model correcting for position-specific errors in the data was developed and shown to provide superior sensitivity compared to standard techniques. In Paper IV, clinically relevant molecular subgroups of metastatic small intestinal neuroendocrine tumours were identified based on miRNA expression profiles. Survival analysis and subsequent validation suggested miR-375 as a prognostic biomarker. In Paper V, a hierarchical Bayesian model for detecting differences at the nucleotide level between microbial communities was proposed. By including between-sample variability and using a shrinkage approach, the model performed well both with few samples and under high biological variability. Finally, the model was used to detect antibiotic resistance mutations in bacteria.
This thesis demonstrates that dedicated statistical analysis and knowledge of the underlying error structure in high-dimensional biological data are important for accurate interpretation and sound conclusions.

somatic mutations

hierarchical Bayesian modelling

cancer genetics

high-throughput sequencing

metagenomics

personalised diagnostics

Room Pascal, Mathematical Sciences, Chalmers tvärgata 3
Opponent: Associate Professor Patrik Rydén, Department of Mathematics and Mathematical Statistics, Umeå University

Author

Anna Rehammar

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

Malignant pheochromocytomas/paragangliomas harbor mutations in transport and cell adhesion genes.

International Journal of Cancer, Vol. 138 (2016), p. 2201-11

Journal article

A hierarchical Bayesian model for assessing differential nucleotide composition between metagenomes, Anna Rehammar, Anders Sjögren, Erik Kristiansson

Subject Categories

Bioinformatics (Computational Biology)

Medical Genetics

Probability Theory and Statistics

ISBN

978-91-7905-135-8

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4602

Publisher

Chalmers

More information

Latest update

6/3/2019 7