Comparative network analysis of human cancer: sparse graphical models with modular constraints and sample size correction
Licentiate thesis, 2013

In the study of transcriptional data for different groups (e.g. cancer types), it is reasonable to assume that some dependencies between genes, at the transcriptional or genetic-variant level, are common across groups. It is also reasonable to assume that this property is preserved locally, which defines a modular structure in the model networks. For ease of interpretation, sparsity in the resulting model is also desirable. In this thesis we assume that genomic data follow a multivariate normal distribution and estimate the networks by optimizing a penalized log-likelihood function for the corresponding inverse covariance matrices. We apply the fused elastic net penalty to obtain sparsity and commonality. To achieve a modular topology we propose a novel adaptive penalty, which is computed from an initial zero-consistent solution. We also propose a generalization of the method that allows for fusion penalties defined by a graph. This generalization can be used to correct estimates when the groups have different sample sizes. It can also be used to penalize correctly in the presence of ordered variables such as survival. We optimize the penalized log-likelihood using the alternating direction method of multipliers (ADMM). Simulation studies show that our method identifies differential connectivity (network edges that differ between cancer classes) more accurately than standard methods. We also apply our method to tumor data from glioblastoma, breast and ovarian cancer, integrating two types of data, mRNA (messenger RNA expression) and CNA (copy number aberration), by defining a prior distribution over the plausible links in the corresponding networks.
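As a rough illustration of the kind of optimization the abstract describes, the following is a minimal sketch of ADMM applied to the standard single-group graphical lasso, i.e. minimizing tr(SΘ) − log det Θ plus an L1 penalty on the off-diagonal entries of Θ. It is not the thesis's method: it omits the fusion across groups, the elastic net ridge term, and the adaptive modular penalty. The function names, the fixed iteration count, and the choice to leave the diagonal unpenalized are assumptions made for this example.

```python
import numpy as np


def soft_threshold(a, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * |a|."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)


def graphical_lasso_admm(S, lam, rho=1.0, n_iter=200):
    """Sparse precision-matrix estimate from a sample covariance S.

    Illustrative single-group sketch (hypothetical helper, not the thesis code).
    Splits the problem as Theta = Z: Theta carries the Gaussian log-likelihood,
    Z carries the L1 penalty on off-diagonal entries, U is the scaled dual.
    """
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    off_diag = ~np.eye(p, dtype=bool)

    for _ in range(n_iter):
        # Theta-update: solve rho*Theta - inv(Theta) = rho*(Z - U) - S
        # in closed form through an eigendecomposition.
        eigvals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigvals + np.sqrt(eigvals**2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T

        # Z-update: soft-threshold the off-diagonal entries only
        # (assumption of this sketch: the diagonal is not penalized).
        Z = Theta + U
        Z[off_diag] = soft_threshold(Z[off_diag], lam / rho)

        # Scaled dual update.
        U = U + Theta - Z

    return Z


# Usage on simulated data: 200 samples of 5 independent standard normals.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
S = np.cov(X, rowvar=False)
Theta_hat = graphical_lasso_admm(S, lam=0.2)
```

Returning Z rather than Theta gives exact zeros for the removed edges, which is what makes the estimated network directly readable as a graph. A multi-group extension along the lines of the abstract would add a penalty on differences between the groups' precision matrices in the Z-update.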

fused lasso

graphical models

cancer

elastic net

precision matrix

high-dimension

networks

inverse covariance matrix

low-sample

sparsity

Pascal, Chalmers Tvärgata 3, Chalmers Tekniska Högskola
Opponent: Docent Patrik Rydén, Mathematics and Mathematical Statistics, Umeå Universitet, Sweden

Author

José Sánchez

Göteborgs universitet

Chalmers, Mathematical Sciences

Foundations

Basic sciences

Subject categories

Bioinformatics and systems biology

Probability theory and statistics

Preprint - Department of Mathematical Sciences, Chalmers University of Technology and Göteborg University
