Graph Neural Network based Hierarchy-Aware Embeddings of Knowledge Graphs: Applications to Yeast Phenotype Prediction
Preprint, 2026

We present a method for finding hierarchy-aware embeddings of knowledge graphs (KGs) using graph neural networks (GNNs) enriched with a semantic loss derived from underlying ontologies. This method yields embeddings that better reflect domain knowledge. To demonstrate their utility, we predict and interpret the effects of gene deletions in the yeast Saccharomyces cerevisiae and learn box embeddings for KGs in the absence of a prediction task. We further show how box embeddings can serve as the basis for evaluating KG revisions.
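As a rough illustration of how an ontology's class hierarchy can become a loss term for box embeddings, the sketch below penalises a subclass box for extending outside its superclass box. It is a minimal example under stated assumptions, not the loss used in the paper: the function names, the class names, and the coordinates are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# A box is an axis-aligned region given by (lower corner, upper corner).
Box = Tuple[Sequence[float], Sequence[float]]


def containment_loss(child: Box, parent: Box) -> float:
    """Penalty that is zero exactly when `child` lies inside `parent`."""
    (c_lo, c_hi), (p_lo, p_hi) = child, parent
    loss = 0.0
    for cl, ch, pl, ph in zip(c_lo, c_hi, p_lo, p_hi):
        loss += max(0.0, pl - cl)  # child extends below the parent's lower bound
        loss += max(0.0, ch - ph)  # child extends above the parent's upper bound
    return loss


def semantic_loss(boxes: Dict[str, Box], subclass_pairs: List[Tuple[str, str]]) -> float:
    """Sum of containment penalties over all (subclass, superclass) pairs."""
    return sum(containment_loss(boxes[c], boxes[p]) for c, p in subclass_pairs)


# Hypothetical two-dimensional boxes for three illustrative classes.
boxes = {
    "trait": ([0.0, 0.0], [4.0, 4.0]),
    "stress_resistance": ([1.0, 1.0], [3.0, 3.0]),
    "osmotic_stress_resistance": ([2.0, 0.5], [3.5, 2.0]),
}
hierarchy = [
    ("stress_resistance", "trait"),
    ("osmotic_stress_resistance", "stress_resistance"),
]

print(semantic_loss(boxes, hierarchy))
```

In a trainable setting a term of this kind would be added to the prediction loss, so that gradient updates pull each subclass box inside its superclass box.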

Our yeast KG is constructed from community databases and ontology terms. Low-dimensional box embeddings combined with GNNs are used to predict cell growth for double gene knockouts. Across 10-fold cross-validation, these predictions have a mean R² score of 0.360, significantly higher than baseline comparisons, demonstrating that high-level qualitative knowledge is informative about experimental outcomes. Incorporating semantic loss terms in the training of the models improves their predictive performance (R² = 0.377) by aligning embeddings with ontology structure. This shows that class hierarchies from ontologies can be exploited for quantitative prediction. We also test the trained models on triple gene knockouts, showing that they generalise to data beyond those seen in training.
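For readers unfamiliar with the metric, the following sketch shows how a mean coefficient of determination over cross-validation folds can be computed. It assumes the standard definition R² = 1 - SS_res / SS_tot; the helper names and the toy values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_r2(folds: List[Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Average R² over (observed, predicted) pairs, one pair per held-out fold."""
    scores = [r2_score(y_true, y_pred) for y_true, y_pred in folds]
    return sum(scores) / len(scores)


# Toy folds: one predicted perfectly, one predicted by the fold mean.
toy_folds = [
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),
]
print(mean_r2(toy_folds))
```

A score of 1 means perfect prediction and a score of 0 means the model does no better than predicting the mean of the observed values.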

Additionally, by identifying co-occurring relations in the yeast KG important for the cell-growth predictions, we construct hypotheses about interacting traits in yeast. A biological experiment validates one such finding, revealing an association between inositol utilisation and osmotic stress resistance, highlighting the model's potential to guide biological discovery.

Author

Filip Kronström

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Alexander Gower

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Daniel Brunnsåker

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Ievgeniia Tiukova

Chalmers, Life Sciences, Infrastructures

Ross King

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Subject Categories (SSIF 2025)

Bioinformatics (Computational Biology)

Bioinformatics and Computational Biology

Computer Sciences

DOI

10.48550/arXiv.2605.03690

More information

Created

5/13/2026