Development and application of omics data analysis tools to examine molecular associations linking complex exposures to cardiovascular disease
Doctoral thesis, 2025

Preventing cardiovascular disease (CVD) requires better understanding of underlying metabolic perturbations to identify targets for intervention. Most studies have investigated exposures separately, and tools to analyze health effects of multiple exposures jointly are largely lacking. The work in this thesis aims to develop tools for complex omics analysis and apply them to population-based cohorts to investigate omics patterns linking combined exposures including diet, persistent organic pollutants (POPs), and gut microbiota to CVD risk.

This thesis includes the development of two tools, presented as R packages, that make advanced omics data analysis accessible to a broader research community: 1) MUVR2 provides supervised machine learning with variable selection for high-dimensional data, using nested cross-validation to mitigate overfitting and false discoveries. MUVR2 also includes elastic net regression, which allows covariate adjustment and thereby strengthens epidemiological modeling. 2) TriplotGUI is a user-friendly tool that integrates omics data reduction with meet-in-the-middle and mediation analyses to explore exposure-omics-outcome associations through intuitive visualizations.
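The nested cross-validation principle mentioned above can be illustrated with a minimal Python sketch. It is not MUVR2's algorithm or API (MUVR2 is an R package whose internals are not described here); the classifier (nearest centroid), the variable-ranking rule (difference in class means), all function names, and the synthetic data are assumptions chosen only to show why variable selection and tuning must stay inside the training folds.

```python
import random


def stratified_folds(labels, n_folds, rng):
    """Split sample indices into folds that each contain both classes."""
    folds = [[] for _ in range(n_folds)]
    for cls in (0, 1):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def class_mean(X, y, cls, j):
    vals = [row[j] for row, lab in zip(X, y) if lab == cls]
    return sum(vals) / len(vals)


def accuracy(X_tr, y_tr, X_te, y_te, k):
    """Select the top-k variables on the training data only, fit a
    nearest-centroid classifier, and score it on the test data."""
    n_feat = len(X_tr[0])
    gap = [abs(class_mean(X_tr, y_tr, 1, j) - class_mean(X_tr, y_tr, 0, j))
           for j in range(n_feat)]
    feats = sorted(range(n_feat), key=lambda j: -gap[j])[:k]
    cents = {c: [class_mean(X_tr, y_tr, c, j) for j in feats] for c in (0, 1)}

    def predict(row):
        dist = {c: sum((row[j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in (0, 1)}
        return min(dist, key=dist.get)

    hits = sum(predict(row) == lab for row, lab in zip(X_te, y_te))
    return hits / len(y_te)


def subset(X, y, idx):
    return [X[i] for i in idx], [y[i] for i in idx]


def nested_cv(X, y, k_grid=(1, 2, 5), n_outer=5, n_inner=4, seed=0):
    rng = random.Random(seed)
    outer_scores, chosen_k = [], []
    for test_idx in stratified_folds(y, n_outer, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr, y_tr = subset(X, y, train_idx)
        X_te, y_te = subset(X, y, test_idx)

        # Inner loop: tune k using the outer-training samples only.
        inner_folds = stratified_folds(y_tr, n_inner, rng)

        def inner_score(k):
            total = 0.0
            for val_idx in inner_folds:
                val = set(val_idx)
                fit_idx = [i for i in range(len(y_tr)) if i not in val]
                X_fit, y_fit = subset(X_tr, y_tr, fit_idx)
                X_val, y_val = subset(X_tr, y_tr, val_idx)
                total += accuracy(X_fit, y_fit, X_val, y_val, k)
            return total / len(inner_folds)

        best_k = max(k_grid, key=inner_score)
        # Outer loop: the held-out fold is used once, for evaluation only.
        outer_scores.append(accuracy(X_tr, y_tr, X_te, y_te, best_k))
        chosen_k.append(best_k)
    return outer_scores, chosen_k


# Synthetic data (not from the thesis): 60 samples, 20 variables,
# of which only variables 0 and 1 carry signal.
data_rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[data_rng.gauss(2.0 * lab if j < 2 else 0.0, 1.0) for j in range(20)]
     for lab in y]

outer_scores, chosen_k = nested_cv(X, y)
print("outer-fold accuracy:", outer_scores)
print("selected number of variables per fold:", chosen_k)
```

Because the outer test fold never influences which variables are kept or how many, the outer-fold accuracies give a less optimistic performance estimate than a single cross-validation loop that selects variables on all samples first.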

Applying earlier versions of these tools to the Swedish Mammography Cohort revealed two distinct omics sub-patterns linking POPs to CVD. The first involved perturbed lipid metabolism and inflammatory pathways and was associated with higher levels of organochlorine compounds (OCs), lower levels of per- and polyfluoroalkyl substances (PFAS), and higher myocardial infarction (MI) risk. The second involved carnitines and possible mitochondrial dysfunction and was associated with OCs and stroke.

MUVR2 and TriplotGUI were applied to discover and replicate metabolites associated with diet, POPs, gut microbiota, and CVD incidence in four Nordic cohorts. Notable findings supporting metabolites as mediators of exposure-outcome associations included an association between intake of nuts and dried fruit and reduced MI risk, possibly mediated by pipecolate, and associations between fish intake and reduced MI risk, possibly mediated by phosphatidylethanolamine(P-16:0/22:6) and an unknown metabolite. Importantly, only a few exposure-metabolite-outcome associations were reproduced across cohorts, stressing the importance of replication for generalizable conclusions.
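The idea of a metabolite mediating an exposure-outcome association can be made concrete with a minimal product-of-coefficients sketch in Python. This is an illustration under stated assumptions, not TriplotGUI's implementation: it uses ordinary least squares with a continuous outcome (the thesis studies CVD incidence, which would call for other models), the function names are hypothetical, and the data are simulated rather than taken from any cohort.

```python
import random


def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def mediation(x, m, y):
    """Product-of-coefficients mediation from ordinary least squares.

    x: exposure, m: candidate mediator (e.g. a metabolite), y: outcome.
    """
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx                            # slope of M ~ X
    total = sxy / sxx                        # slope of Y ~ X
    det = sxx * smm - sxm ** 2
    direct = (sxy * smm - smy * sxm) / det   # X coefficient in Y ~ X + M
    b = (smy * sxx - sxy * sxm) / det        # M coefficient in Y ~ X + M
    return {"a": a, "b": b, "direct": direct,
            "indirect": a * b, "total": total}


# Simulated data (assumed effect sizes, not thesis results): the exposure
# raises the mediator, and the mediator lowers the outcome score.
rng = random.Random(0)
exposure = [rng.gauss(0.0, 1.0) for _ in range(500)]
metabolite = [0.8 * x + rng.gauss(0.0, 1.0) for x in exposure]
outcome = [-0.1 * x - 0.5 * m + rng.gauss(0.0, 1.0)
           for x, m in zip(exposure, metabolite)]

result = mediation(exposure, metabolite, outcome)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this linear setting the total effect equals the direct effect plus the indirect effect `a * b`, so the share of an association that runs through the metabolite can be read off directly. In practice, confidence intervals (for example by bootstrapping) and confounder adjustment would be needed before interpreting such an estimate.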

This thesis contributes advanced, accessible methods for linking environmental exposures to health outcomes through omics-based mediators, with MUVR2 and TriplotGUI improving the identification and interpretation of molecular signatures. Application of these tools to CVD enabled characterization of molecular signatures linking diet to health and underscored the necessity of rigorous external validation to minimize spurious associations.

omics

meet-in-the-middle analysis

mediation analysis

molecular epidemiology

cross-cohort design

cardiovascular disease

diet

machine learning

gut microbiota

persistent organic pollutants

KB, Kemihuset
Opponent: Marc Chadeau-Hyam, Professor of Computational Epidemiology and Biostatistics, School of Public Health - Faculty of Medicine, Imperial College London, United Kingdom.

Author

Yingxiao Yan

Chalmers, Life Sciences, Food and Nutrition Science

Adjusting for covariates and assessing modeling fitness in machine learning using MUVR2

Bioinformatics Advances, Vol. 4 (2024)

Journal article

Yan Y, Schillemans T, Ribbenstedt A, Brunius C. Software Application Profile: TriplotGUI, A Molecular Epidemiology Toolbox for Investigating Associations between Exposures, Omics and Outcomes

OMICs Signatures Linking Persistent Organic Pollutants to Cardiovascular Disease in the Swedish Mammography Cohort

Environmental Science & Technology, Vol. 58 (2024), p. 1036-1047

Journal article

Yan Y, Schillemans T, Toubon G, Ribbenstedt A, Åkesson A, Johansson I, Bergdahl I, Brunius C. Metabolic signatures linking multiple environmental exposures to cardiovascular disease risk: A multi-cohort discovery and validation study

Cardiovascular disease, including heart attack and stroke, is the leading cause of death worldwide, claiming over 17 million lives annually and incurring healthcare costs exceeding $1 trillion per year. A better understanding of how factors such as diet, gut microbiota, and pollutants affect cardiovascular health may lead to more precise and effective prevention strategies.

This thesis describes the development of new tools to analyze how different environmental factors jointly influence disease risk through biological perturbations, using data from several thousand individuals across the Nordic countries. Intake of nuts and dried fruits was associated with reduced heart attack risk, with evidence suggesting that the metabolite pipecolate may contribute to this protective effect, possibly by reducing inflammation and regulating cellular processes. Eating salmon was also associated with lower heart attack risk. While salmon offers cardioprotective nutrients such as omega-3 fatty acids, it is also a major dietary source of organochlorine compounds and per- and polyfluoroalkyl substances (PFAS), chemical pollutants that adversely impact health at multiple stages of life. This highlights that choosing healthy food involves weighing both nutritional benefits and potential risks. Another key finding was that most associations found in single study populations did not hold when tested in other cohorts, stressing the importance of replication across cohorts with different characteristics for wider generalizability.

The work in this thesis has advanced methods and used them to link environmental factors to cardiovascular diseases, providing knowledge that may contribute to better prevention.

The combined impact of environmental exposures on metabolic health (Miljöexponeringars kombinerade inverkan på metabol hälsa)

Formas (2020-01653), 2021-01-01 -- 2024-12-31.

Dynamic longitudinal exposome trajectories in cardiovascular and metabolic non-communicable diseases (LONGITOOLS)

European Commission (EU) (EC/H2020/874739), 2019-12-31 -- 2023-12-31.

Subject categories (SSIF 2025)

Public Health, Global Health and Social Medicine

Bioinformatics (Computational Biology)

Bioinformatics and Computational Biology

Driving forces

Sustainable development

Infrastructure

Chalmers Mass Spectrometry Infrastructure

Areas of Advance

Health Engineering

DOI

10.63959/chalmers.dt/5694

ISBN

978-91-8103-236-9

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5694, ISSN 0346-718X

Publisher

Chalmers

More information

Last updated

2025-09-05