Predicting functional upstream open reading frames in Saccharomyces cerevisiae
Journal article, 2009

Background: Some upstream open reading frames (uORFs) regulate gene expression (i.e., they are functional) and can play key roles in keeping organisms healthy. However, how uORFs are involved in gene regulation is not yet fully understood. A complete view of their role is expected to require a large number of experimentally verified functional uORFs. Unfortunately, wet-lab experiments to verify that uORFs are functional are expensive.

Results: This paper presents a new computational approach to predicting functional uORFs in the yeast Saccharomyces cerevisiae. The approach is based on inductive logic programming and uses a novel combination of knowledge about biological conservation, Gene Ontology annotations, and genes' responses to different conditions. The method yields a set of simple and informative hypotheses with an estimated sensitivity of 76%. The hypotheses predict that 301 further genes have 398 novel functional uORFs. Three of these 301 genes (RPC11, TPK1, and FOL1) have been hypothesised by a related study, following wet-lab experiments, to have functional uORFs. A comparison with another related study suggests that eleven of the predicted functional uORFs, from genes LDB17, HEM3, CIN8, BCK2, PMC1, FAS1, APP1, ACC1, CKA2, SUR1, and ATH1, are strong candidates for wet-lab experimental studies.

Conclusions: Learning-based prediction of functional uORFs can be done with high sensitivity. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help to elucidate the regulatory roles of uORFs.

post-transcriptional regulation


inductive logic programming

machine learning


Selpi Selpi

Chalmers, Applied Mechanics, Vehicle Safety

Christopher H. Bryant

University of Salford

Graham Kemp

Chalmers, Computer Science and Engineering, Computing Science

Janeli Sarv

Chalmers, Mathematical Sciences, Mathematical Statistics

University of Gothenburg

Erik Kristiansson

University of Gothenburg

Per Sunnerhagen

University of Gothenburg

BMC Bioinformatics

1471-2105 (ISSN)

Vol. 10, 451


Biochemistry and Molecular Biology

Bioinformatics and Systems Biology

Probability Theory and Statistics

Computer Science