Transcriptional and metabolic data integration and modeling for identification of active pathways
Journal article, 2012

With the growing availability of omics data generated to describe different cells and tissues, the modeling and interpretation of such data have become increasingly important. Pathways are sets of reactions involving genes, metabolites, and proteins that highlight functional modules in the cell. Therefore, to discover activated or perturbed pathways when comparing two conditions, for example two different tissues, it is beneficial to use several types of omics data. We present a model that integrates transcriptomic and metabolomic data in order to make an informed pathway-level decision. Since metabolites can be seen as end-points of perturbations happening at the gene level, the gene expression data constitute the explanatory variables in a sparse regression model for the metabolite data. Sophisticated model selection procedures are developed to determine an appropriate model. We demonstrate that the transcript profiles can be used to informatively explain the metabolite data from cancer cell lines. Simulation studies further show that the proposed model identifies active pathways better than, for example, enrichment methods performed separately on the transcript and metabolite data.
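The idea of using gene expression as explanatory variables in a sparse regression for a metabolite can be illustrated with a minimal sketch. This is not the authors' model, which adds pathway-level decisions and its own model selection procedures; it is a plain lasso fitted by coordinate descent on simulated data. All names (`lasso_coordinate_descent`, `simulate`), the penalty value `lam=0.3`, and the simulated effect sizes are assumptions made for illustration only.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of n rows (samples) with p gene expression values each;
    y is a list of n metabolite measurements. Hypothetical helper.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of gene j with the residual, with gene j's own
            # current contribution added back in.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def simulate(n=80, p=10, active=(0, 1, 2), seed=0):
    """Simulated data (assumption): one metabolite driven by a few genes."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(1.5 * row[j] for j in active) + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = simulate()
beta = lasso_coordinate_descent(X, y, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("genes with non-zero coefficients:", selected)
```

The penalty `lam` controls how many genes enter the model: larger values shrink more coefficients to exactly zero. Here it is fixed by hand, whereas in practice it would be chosen by a data-driven model selection step.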

Pathways

regularization

germination

arabidopsis

Integrated modeling

Enrichment

profiles

selection

levels reveals

protein

Transcriptomics

Metabolomics

Authors

Alexandra Jauhiainen

Karolinska Institutet

Olle Nerman

Chalmers, Mathematical Sciences, Mathematical Statistics

University of Gothenburg

G. Michailidis

University of Michigan

Rebecka Jörnsten

University of Gothenburg

Chalmers, Mathematical Sciences, Mathematical Statistics

Biostatistics

1465-4644 (ISSN) 1468-4357 (eISSN)

Vol. 13, Issue 4, pp. 748-761

Subject Categories (SSIF 2011)

Mathematics

DOI

10.1093/biostatistics/kxs016

More information

Latest update

4/20/2018