HattCI: Fast and Accurate attC site Identification Using Hidden Markov Models.
Journal article, 2016

Integrons are genetic elements that facilitate horizontal gene transfer in bacteria and are known to harbor genes associated with antibiotic resistance. Gene mobility in integrons is governed by the presence of attC sites, which are imperfect inverted repeats 55 to 141 nucleotides long. Here we present HattCI, a new method for fast and accurate identification of attC sites in large DNA data sets. The method is based on a generalized hidden Markov model that describes each core component of an attC site individually. In twofold cross-validation experiments on a manually curated reference data set of 231 attC sites from class 1 and 2 integrons, HattCI showed high sensitivities of up to 91.9% while maintaining satisfactory false-positive rates. When applied to a metagenomic data set of 35 microbial communities from different environments, HattCI found a substantially higher number of attC sites in the samples that are known to contain more horizontally transferred elements. HattCI will significantly increase the ability to identify attC sites, and thus integron-mediated genes, in genomic and metagenomic data. HattCI is implemented in C and is freely available at http://bioinformatics.math.chalmers.se/HattCI.
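The abstract describes attC sites as imperfect inverted repeats. As a minimal sketch of what that means, the Python snippet below counts how many positions differ between the start of a sequence and the reverse complement of its end. This is not HattCI's method, which uses a generalized hidden Markov model; the function names, the `arm` parameter, and the toy sequences are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: this is NOT the generalized HMM used by HattCI.
# All names and the toy sequences below are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_mismatches(seq: str, arm: int) -> int:
    """Count mismatches between the first `arm` bases of `seq` and the
    reverse complement of its last `arm` bases.

    Zero mismatches means a perfect inverted repeat; a small positive
    count corresponds to an imperfect one.
    """
    if arm <= 0 or 2 * arm > len(seq):
        raise ValueError("arm must be positive and at most half the sequence length")
    left = seq[:arm].upper()
    right_rc = reverse_complement(seq[-arm:])
    return sum(a != b for a, b in zip(left, right_rc))


# Toy examples (made-up sequences, far shorter than real attC sites).
perfect = "GTTAGGC" + "AAAA" + "GCCTAAC"
imperfect = "GTTAGGC" + "AAAA" + "GCCTAAG"
print(inverted_repeat_mismatches(perfect, 7))
print(inverted_repeat_mismatches(imperfect, 7))
```

A mismatch count like this ignores the internal structure of an attC site, which is why a model that describes each core component separately, as the abstract states HattCI does, is better suited to real data.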

Authors

Mariana Buongermino Pereira

Chalmers, Mathematical Sciences, Mathematical Statistics

University of Gothenburg

Mikael Wallroth

Chalmers, Mathematical Sciences

Erik Kristiansson

University of Gothenburg

Chalmers, Mathematical Sciences, Applied Mathematics and Statistics

Marina Axelson-Fisk

University of Gothenburg

Chalmers, Mathematical Sciences, Mathematical Statistics

Journal of Computational Biology

1066-5277 (ISSN)

Vol. 23, Issue 11, pp. 891-902

Driving Forces

Sustainable development

Foundations

Basic sciences

Subject Categories

Microbiology

Probability Theory and Statistics

Areas of Advance

Life Science Engineering (2010-2018)

DOI

10.1089/cmb.2016.0024

PubMed

27428829

More information

Last updated

2018-11-20