Theory Exploration: Automated Conjecturing for Programs and Proofs
Doctoral thesis, 2025

Theory exploration is an approach to automating the discovery of interesting and useful properties about computer programs and mathematical structures. Such properties can be used to guide automated and interactive reasoning. Coming up with new lemmas is often crucial in proof automation, and can provide vital assistance to a user of an interactive proof system. Generating properties that specify the behavior of a program is beneficial for software verification, testing, and debugging. Automated conjecturing is a challenging endeavor due to the vast search space and the difficulty in identifying the most interesting and useful properties. Developing effective conjecturing techniques is therefore critical for advancing both automated and interactive formal reasoning about programs and proofs.

In this thesis, we present novel symbolic and neuro-symbolic methods for theory exploration, along with the design, development, and evaluation of associated tools. First, we present a coinductive lemma discovery tool, the first system designed to automatically discover and prove lemmas about potentially infinite structures. Then, we integrate theory exploration and automated theorem proving in a state-of-the-art inductive proof system. Next, we introduce template-based theory exploration, which narrows the conjecturing search space and makes theory exploration faster and more targeted. In addition, we provide empirical evidence for the effectiveness of template-based theory exploration in finding interesting and useful lemmas for mathematical formalizations. Finally, we use Large Language Models (LLMs) for lemma conjecturing, both directly and as part of a neuro-symbolic template-based tool. We present the first neuro-symbolic lemma conjecturing tool that can automatically conjecture lemmas across all formalization domains.
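As a rough illustration of the template-based idea mentioned above, the following minimal Python sketch fills the holes of a few property templates with functions from a small signature and keeps only the instantiations that survive testing. It is not the thesis's tooling: the names `SIGNATURE`, `TEMPLATES`, `sample_inputs`, and `explore`, the three list functions, and the three templates are all hypothetical and chosen purely for illustration. Surviving properties are conjectures only, since passing tests does not prove them.

```python
import itertools
import random

# Hypothetical signature: the functions whose properties we want to explore.
SIGNATURE = {
    "reverse": lambda xs: xs[::-1],
    "sort": lambda xs: sorted(xs),
    "dedup": lambda xs: list(dict.fromkeys(xs)),  # drop repeats, keep first occurrences
}

# Hypothetical templates: (shape with holes, number of holes, executable check).
TEMPLATES = [
    ("{F}({F}(x)) = x", 1, lambda fs, x: fs[0](fs[0](x)) == x),
    ("{F}({F}(x)) = {F}(x)", 1, lambda fs, x: fs[0](fs[0](x)) == fs[0](x)),
    ("{F}({G}(x)) = {G}({F}(x))", 2,
     lambda fs, x: fs[0](fs[1](x)) == fs[1](fs[0](x))),
]


def sample_inputs(rng, count=200, max_len=6):
    """A few fixed edge cases plus random short lists of small integers."""
    fixed = [[], [1], [2, 1, 2, 3]]
    randoms = [
        [rng.randint(0, 3) for _ in range(rng.randint(0, max_len))]
        for _ in range(count)
    ]
    return fixed + randoms


def explore(signature, templates, seed=0):
    """Instantiate each template with functions from the signature and
    keep the instances that hold on every test input."""
    tests = sample_inputs(random.Random(seed))
    conjectures = []
    for shape, holes, check in templates:
        # Only instantiations matching a template are ever considered,
        # which is what keeps the search space small.
        for names in itertools.permutations(sorted(signature), holes):
            fs = [signature[name] for name in names]
            if all(check(fs, x) for x in tests):
                conjectures.append(shape.format(**dict(zip("FG", names))))
    return conjectures


if __name__ == "__main__":
    for conjecture in explore(SIGNATURE, TEMPLATES):
        print(conjecture)
```

In this sketch, the search only visits terms that match one of the given shapes, rather than enumerating all terms up to some size, which is the sense in which templates narrow the search space.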

Automated Reasoning

AI for Math

Conjecturing

Proof Assistants

Theory Exploration

Coinduction

Functional Programming

Formalization

Induction

Theorem Proving

EA, EDIT building, Hörsalsvägen 11, Chalmers Campus Johanneberg
Opponent: Josef Urban, Czech Technical University in Prague, Czechia

Author

Sólrún Einarsdóttir

Data Science and AI 2

Into the infinite - theory exploration for coinduction

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11110 LNAI (2018), p. 70-86

Paper in proceeding

Lemma Discovery and Strategies for Automated Induction

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 14739 LNAI (2024), p. 214-232

Paper in proceeding

Template-based Theory Exploration: Discovering Properties of Functional Programs by Testing

ACM International Conference Proceeding Series (2020), p. 67-78

Paper in proceeding

S. H. Einarsdóttir, M. Johansson, N. Smallbone. LOL: A Library Of Lemma templates for data-driven conjecturing

Y. Alhessi, S. H. Einarsdóttir, G. Granberry, E. First, M. Johansson, S. Lerner, N. Smallbone. Lemmanaid: Neuro-Symbolic Lemma Conjecturing

Theory exploration is an approach to automatically discovering interesting and useful properties of computer programs and mathematical structures. These properties can be used as lemmas in a mathematical proof, enabling more theorems to be proved automatically or by a human user of an interactive tool. They can also function as specifications for a program, facilitating software verification, testing, and debugging. Improved methods for inventing useful properties will lower the barrier to computer formalization of mathematics and to the development of robust and secure software.

In this thesis we present new techniques and tools for theory exploration. We use AI methods ranging from symbolic AI to modern generative LLMs. First, we present the first tool designed to automatically discover and prove lemmas about potentially infinite structures. Then, we combine theory exploration and automated theorem proving in a system for proof by induction, achieving state-of-the-art results. Next, we introduce template-based theory exploration, which makes theory exploration faster and more targeted by narrowing the search space. Finally, we use Large Language Models (LLMs) to discover lemmas for mathematical formalization in the first neuro-symbolic theory exploration tool. We demonstrate the effectiveness of theory exploration for finding interesting and useful properties in a variety of settings.

Subject Categories (SSIF 2025)

Formal Methods

Computer Sciences

Artificial Intelligence

DOI

10.63959/chalmers.dt/5776

ISBN

978-91-8103-319-9

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5776

Publisher

Chalmers

More information

Latest update

11/5/2025