Predicting functional upstream open reading frames in Saccharomyces cerevisiae
Journal article, 2009

Background: Some upstream open reading frames (uORFs) regulate gene expression (i.e., they are functional) and can play key roles in keeping organisms healthy. However, how uORFs are involved in gene regulation is not yet fully understood. A complete view of their role is expected to require a large number of experimentally verified functional uORFs. Unfortunately, wet-lab experiments to verify that uORFs are functional are expensive.

Results: In this paper, a new computational approach to predicting functional uORFs in the yeast Saccharomyces cerevisiae is presented. Our approach is based on inductive logic programming and makes use of a novel combination of knowledge about biological conservation, Gene Ontology annotations and genes' responses to different conditions. Our method results in a set of simple and informative hypotheses with an estimated sensitivity of 76%. The hypotheses predict 301 further genes to have 398 novel functional uORFs. For three of these 301 genes (RPC11, TPK1, and FOL1), a related study has hypothesised, following wet-lab experiments, that they have functional uORFs. A comparison with another related study suggests that eleven of the predicted functional uORFs, from genes LDB17, HEM3, CIN8, BCK2, PMC1, FAS1, APP1, ACC1, CKA2, SUR1, and ATH1, are strong candidates for wet-lab experimental studies.

Conclusions: Learning-based prediction of functional uORFs can be done with a high sensitivity. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help to elucidate the regulatory roles of uORFs.
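For readers unfamiliar with the term, the following minimal Python sketch illustrates what a uORF is at the sequence level: a start codon (ATG) in the 5' untranslated region followed by an in-frame stop codon. It is not the authors' method, which uses inductive logic programming to predict which uORFs are functional. The function name `find_uorfs`, the restriction to uORFs that terminate inside the supplied region, and the example sequence are all assumptions made for illustration.

```python
# Illustrative sketch only: enumerates candidate uORFs in a 5' UTR.
# It does NOT predict whether a uORF is functional (the topic of the paper).
# Assumption: only uORFs that end within the supplied sequence are reported.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_uorfs(utr5: str) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for every ATG that reaches an
    in-frame stop codon before the sequence ends (0-based, end exclusive)."""
    seq = utr5.upper()
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the reading frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return uorfs


# Hypothetical 5' UTR fragment, invented for this example.
example_utr = "CCATGAAATAGGG"
print(find_uorfs(example_utr))
```

On the invented fragment the scan reports a single candidate starting at the ATG at offset 2. Deciding which of the candidates found this way actually regulate expression is the prediction problem the paper addresses.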

inductive logic programming

genomics

post-transcriptional regulation

machine learning

Authors

Selpi Selpi

Chalmers, Applied Mechanics, Vehicle Safety

Christopher H. Bryant

University of Salford

Graham Kemp

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

Janeli Sarv

Chalmers, Mathematical Sciences, Mathematical Statistics

University of Gothenburg

Erik Kristiansson

University of Gothenburg

Per Sunnerhagen

University of Gothenburg

BMC Bioinformatics

1471-2105 (eISSN)

Vol. 10, 451

Subject Categories

Biochemistry and Molecular Biology

Bioinformatics and Systems Biology

Probability Theory and Statistics

Computer Science

DOI

10.1186/1471-2105-10-451

More information

Latest update

9/6/2018