Using HaMMLET for Bayesian segmentation of WGS read-depth data
Book chapter, 2018

CNV detection requires a high-quality segmentation of genomic data. In many WGS experiments, sample and control are sequenced together in a multiplexed fashion using DNA barcoding for economic reasons. Using the differential read depth of these two conditions cancels out systematic additive errors. Due to this detrending, the resulting data is appropriate for inference using a hidden Markov model (HMM), arguably one of the principal models for labeled segmentation. However, although the usual frequentist approaches such as Baum-Welch are problematic for several reasons, they are often preferred to Bayesian HMM inference, which normally requires prohibitively long running times and exceeds a typical user’s computational resources on genome-scale data. HaMMLET solves this problem using a dynamic wavelet compression scheme, which makes Bayesian segmentation of WGS data feasible on standard consumer hardware.
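The detrending step mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the chapter and does not use HaMMLET itself; the function name `differential_depth`, the bin count, the bias range, and the simulated copy-number gain are all hypothetical values chosen for illustration. It assumes sample and control share the same per-bin additive bias, in which case the bias drops out of the per-bin difference.

```python
import random


def differential_depth(sample, control):
    """Return the per-bin difference between sample and control read depth."""
    if len(sample) != len(control):
        raise ValueError("sample and control must cover the same bins")
    return [s - c for s, c in zip(sample, control)]


random.seed(0)
n_bins = 200
baseline = 30.0

# Hypothetical systematic additive bias, identical in both conditions
# (an assumption of this sketch, not a measured quantity).
bias = [random.uniform(-5.0, 5.0) for _ in range(n_bins)]

# Hypothetical true signal: a gain of 10 reads per bin in bins 80..119.
true_signal = [10.0 if 80 <= i < 120 else 0.0 for i in range(n_bins)]

control = [baseline + b for b in bias]
sample = [baseline + b + t for b, t in zip(bias, true_signal)]

# The shared bias cancels, leaving only the piecewise-constant signal
# that a segmentation method would then have to recover.
diff = differential_depth(sample, control)
```

In real data each condition also carries its own independent noise, which the difference does not remove; the sketch omits it so that the cancellation of the shared term is visible on its own.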

Whole genome sequencing

Segmentation

HaMMLET

CNV

Bayesian inference

Hidden Markov Model

Author

John Wiedenhoeft

Rutgers University

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

Alexander Schliep

University of Gothenburg

Methods in Molecular Biology

1064-3745 (ISSN) 1940-6029 (eISSN)

83-93

Subject Categories

Bioinformatics (Computational Biology)

Bioinformatics and Systems Biology

Computer Vision and Robotics (Autonomous Systems)

DOI

10.1007/978-1-4939-8666-8_6

More information

Latest update

3/21/2023