Transcriptional and metabolic data integration and modeling for identification of active pathways
Journal article, 2012
With the growing availability of omics data generated to describe different cells and tissues, the modeling and interpretation of such data has become increasingly important. Pathways are sets of reactions involving genes, metabolites, and proteins highlighting functional modules in the cell. Therefore, to discover activated or perturbed pathways when comparing two conditions, for example two different tissues, it is beneficial to use several types of omics data. We present a model that integrates transcriptomic and metabolomic data in order to make an informed pathway-level decision. Since metabolites can be seen as end-points of perturbations happening at the gene level, the gene expression data constitute the explanatory variables in a sparse regression model for the metabolite data. Sophisticated model selection procedures are developed to determine an appropriate model. We demonstrate that the transcript profiles can be used to informatively explain the metabolite data from cancer cell lines. Simulation studies further show that the proposed model offers a better performance in identifying active pathways than, for example, enrichment methods performed separately on the transcript and metabolite data.