Knowledge Models and Inference Frameworks for Scientific Discovery

Alexander Gower

doi:10.63959/chalmers.dt/5896

Doctoral thesis, 2026

Scientific discovery is an active process of designing, testing, and improving theories about the natural world. Automating this process is a grand challenge for 21st-century science. This thesis examines scientific inquiry as it relates to machine learning, offering contributions to knowledge representations and reasoning frameworks, demonstrated in systems biology.

Systems biology is an integrationist approach to biological science, meaning organisms are treated as complex systems whose behaviour is dictated by the interaction of their constituent parts. Eukaryotic organisms are extremely complex, and research progress in systems biology can be slow. Recent advances in robotics and artificial intelligence (AI) offer great opportunity for automating scientific discovery in this field. Using the model organism Saccharomyces cerevisiae (baker’s yeast), this thesis explores: the philosophical motivations for automation in biological research; knowledge models and hypotheses in systems biology; and computational models of metabolism.

The first main contribution is a first-order logic framework for modelling cellular physiology, which enables abduction of hypotheses for improvement of knowledge models, using the automated theorem prover (ATP) iProver. The second contribution is an ontology for describing theory changes and hypotheses in a semantic and storage-efficient manner. The third main contribution is an application of graph neural networks (GNNs) to learn knowledge graph embeddings grounded in empirical data and ontology structures. The final contribution is an end-to-end demonstration of autonomous hypothesis generation and experimentation, with hypotheses modelled using ontology terms to support large language model (LLM) agents and human scientists.
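The idea of abducing hypotheses that repair a knowledge model can be illustrated with a deliberately small sketch. It is not the thesis's framework: it assumes propositional Horn rules and brute-force search instead of first-order logic with iProver, and every atom name (`met_A`, `enzyme_E2`, `growth`) and function name is hypothetical. Given a model that cannot explain an observation, it returns the smallest sets of candidate facts that would make the observation derivable.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (body atoms, head atom)


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Compute every atom derivable from `facts` using the Horn rules."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived


def abduce(
    facts: Set[str],
    rules: List[Rule],
    abducibles: List[str],
    observation: str,
) -> List[Set[str]]:
    """Return the smallest sets of abducibles that make `observation` derivable."""
    for size in range(len(abducibles) + 1):
        explanations = [
            set(candidate)
            for candidate in combinations(abducibles, size)
            if observation in forward_chain(facts | set(candidate), rules)
        ]
        if explanations:
            return explanations
    return []


# Toy model (hypothetical): two alternative routes from met_A to met_C.
toy_rules: List[Rule] = [
    (frozenset({"enzyme_E1", "met_A"}), "met_B"),
    (frozenset({"enzyme_E2", "met_B"}), "met_C"),
    (frozenset({"enzyme_E3", "met_A"}), "met_C"),
    (frozenset({"met_C"}), "growth"),
]
toy_facts = {"met_A", "enzyme_E1"}

if __name__ == "__main__":
    # Growth is observed but not derivable; which missing enzyme explains it?
    print(abduce(toy_facts, toy_rules, ["enzyme_E2", "enzyme_E3"], "growth"))
```

Each returned set is a candidate hypothesis, and choosing between competing candidates is a matter for experiment. A brute-force search like this scales poorly, which is one reason to hand the reasoning to a dedicated theorem prover.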

These contributions demonstrate the power of knowledge graphs for autonomous scientific discovery. This thesis also argues that scientific discovery is better modelled as supervised learning (specifically, active learning for AI scientists) than as reinforcement learning. Mapping concepts from machine learning algorithms to the domain produces systems that align with established scientific values, leading to improved theories.

Keywords

automated theorem provers

systems biology

knowledge modelling

machine learning

scientific discovery

artificial intelligence

abduction

ontologies

EDIT Lecture Hall EF

Opponent: Professor Jan Komorowski, Uppsala University, Sweden

Online disputation

Author

Alexander Gower

Chalmers, Computer Science and Engineering, Data Science and AI

The Use of AI-Robotic Systems for Scientific Discovery

Preprint

RIMBO - An Ontology for Model Revision Databases

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14276 LNAI (2023), p. 523-534

Paper in proceedings

LGEM+: Automated Improvement of Metabolic Network Models and Model-Driven Experimental Design through Abduction

Preprint

Graph Neural Network based Hierarchy-Aware Embeddings of Knowledge Graphs: Applications to Yeast Phenotype Prediction

Preprint

Investigating uncharacterised genes in Saccharomyces cerevisiae using robot scientists

Scientific Reports, Vol. 16 (2026)

Journal article

Agentic AI Integrated with Scientific Knowledge: Laboratory Validation in Systems Biology

Preprint

Scientific discovery follows a cycle: starting from what you know, form a hypothesis, test it with an experiment, and use the result to improve your knowledge. This is also a description of active learning, a branch of machine learning in which algorithms select what to test next based on what they have already learned. Recognising this connection is powerful. It means that the design of AI systems for science can draw directly on machine learning research.
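The cycle described above can be sketched as a minimal active learning loop. This is an illustrative toy under stated assumptions, not a method from the thesis: the unknown is a single numeric threshold, the "experiment" is a noise-free yes/no test, and the names `active_learning_loop` and `toy_experiment` are hypothetical. At each step the learner queries the point it is least certain about, then narrows its knowledge to stay consistent with the result.

```python
from typing import Callable, List, Tuple


def active_learning_loop(
    experiment: Callable[[float], bool],
    low: float,
    high: float,
    budget: int,
) -> Tuple[float, float, List[Tuple[float, bool]]]:
    """Narrow down an unknown threshold in [low, high] using `budget` experiments.

    Current knowledge is the interval (low, high) that must contain the threshold.
    `experiment(x)` returns True when x is at or above the threshold.
    """
    history: List[Tuple[float, bool]] = []
    for _ in range(budget):
        # Hypothesis: the threshold sits at the midpoint, the most uncertain
        # point given what is currently known.
        query = (low + high) / 2.0
        # Test: run the experiment at the chosen point.
        outcome = experiment(query)
        history.append((query, outcome))
        # Update: shrink the interval so it stays consistent with every result.
        if outcome:
            high = query
        else:
            low = query
    return low, high, history


if __name__ == "__main__":
    hidden_threshold = 0.3  # unknown to the learner; toy value

    def toy_experiment(x: float) -> bool:
        return x >= hidden_threshold

    low, high, history = active_learning_loop(toy_experiment, 0.0, 1.0, budget=10)
    print(f"threshold lies in [{low:.4f}, {high:.4f}] after {len(history)} experiments")
```

Because the learner chooses each query itself, ten experiments shrink the interval by a factor of 1024. Real experiments are noisy and real hypotheses far richer, but the loop has the same form/test/update structure.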

For a machine to do science, its knowledge must be stored in a form it can reason about—precise enough for a computer, and meaningful enough for human scientists. It must also be given tools to use its knowledge to reason about natural phenomena.

Machines and human scientists have complementary strengths. Machines do not tire, can run many experiments in parallel, are highly consistent, and have powerful reasoning tools at their disposal, not least due to advances in artificial intelligence (AI). Humans bring creativity, intuition, ethical values, and a bodily experience of the world. Together with machines, we can do better science than either could alone, and more of it.

This thesis explores these ideas in the study of baker's yeast—Saccharomyces cerevisiae. New methods are developed for storing biological knowledge in structured, machine-readable forms, and a robot scientist is demonstrated that autonomously designs and runs real laboratory experiments.

Subject categories (SSIF 2025)

Bioinformatics (Computational Biology)

Computer Sciences

Roots

Basic sciences

Infrastructure

Chalmers e-Commons (incl. C3SE, 2020-)

DOI

10.63959/chalmers.dt/5896

ISBN

978-91-8103-439-4

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5896

Publisher

Chalmers

More information

Last updated

2026-05-13
