Knowledge Models and Inference Frameworks for Scientific Discovery
Doctoral thesis, 2026
Systems biology is an integrationist approach to biological science, meaning organisms are treated as complex systems whose behaviour is dictated by the interaction of their constituent parts. Eukaryotic organisms are extremely complex, and research progress in systems biology can be slow. Recent advances in robotics and artificial intelligence (AI) offer great opportunity for automating scientific discovery in this field. Using the model organism Saccharomyces cerevisiae (baker’s yeast), this thesis explores: the philosophical motivations for automation in biological research; knowledge models and hypotheses in systems biology; and computational models of metabolism.
The first main contribution is a first-order logic framework for modelling cellular physiology, which enables abduction of hypotheses for improvement of knowledge models, using the automated theorem prover (ATP) iProver. The second contribution is an ontology for describing theory changes and hypotheses in a semantic and storage-efficient manner. The third main contribution is an application of graph neural networks (GNNs) to learn knowledge graph embeddings grounded in empirical data and ontology structures. The final contribution is an end-to-end demonstration of autonomous hypothesis generation and experimentation, with hypotheses modelled using ontology terms to support large language model (LLM) agents and human scientists.
These contributions demonstrate the power of knowledge graphs for autonomous scientific discovery. This thesis also argues that scientific discovery is better modelled as supervised learning (specifically, active learning for AI scientists) than as reinforcement learning; mapping concepts from machine learning algorithms to the domain produces systems that align with established scientific values, leading to improved theories.
automated theorem provers
systems biology
knowledge modelling
machine learning
scientific discovery
artificial intelligence
abduction
ontologies
Author
Alexander Gower
Chalmers, Computer Science and Engineering, Data Science and AI
RIMBO - An Ontology for Model Revision Databases
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14276 LNAI (2023), p. 523-534
Paper in proceedings
Investigating uncharacterised genes in Saccharomyces cerevisiae using robot scientists
Scientific Reports, Vol. 16 (2026)
Journal article
For a machine to do science, its knowledge must be stored in a form it can reason about—precise enough for a computer, and meaningful enough for human scientists. It must also be given tools to use its knowledge to reason about natural phenomena.
Machines and human scientists have complementary strengths. Machines do not tire, can run many experiments in parallel, are highly consistent, and have powerful reasoning tools at their disposal, not least due to advances in artificial intelligence (AI). Humans bring creativity, intuition, ethical values, and a bodily experience of the world. Together with machines, we can do better science than either could alone, and more of it.
This thesis explores these ideas in the study of baker's yeast—Saccharomyces cerevisiae. New methods are developed for storing biological knowledge in structured, machine-readable forms, and a robot scientist is demonstrated that autonomously designs and runs real laboratory experiments.
Subject categories (SSIF 2025)
Bioinformatics (Computational Biology)
Computer Sciences
Roots
Basic sciences
Infrastructure
Chalmers e-Commons (incl. C3SE, 2020-)
DOI
10.63959/chalmers.dt/5896
ISBN
978-91-8103-439-4
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5896
Publisher
Chalmers