Merry Christmas, my dear colleagues in Isabelle.
Neural Theorem Proving (NTP) is an automated reasoning technique that applies LLMs or other neural models to generate proofs, offering a promising approach to free ITP users from manual proof construction. While Lean may appear to dominate NTP development in recent years, Isabelle actually led the field as recently as two years ago. When I asked researchers why they have shifted to Lean, they cited Isabelle's lack of adequate infrastructure for machine learning and data mining.
We believe this gap has now been addressed through our development of a comprehensive machine learning toolkit for Isabelle, which includes a REPL, precompiled AFP heap images, and a minimalist proof language specifically designed for machine learning. This language, _Minilang_, aims to reduce the conceptual complexity that LLMs must learn, thereby improving training efficiency. Experiments demonstrate that models trained with the most basic approach (supervised finetuning) can prove 79% of PISA problems (a set of randomly selected proof goals from AFP) within 8 attempts. A translator from Isar to Minilang is also provided, successfully translating 85% of proofs in AFP and the stdlib of Isabelle/HOL. This work has been accepted in OOPSLA'26, and a pre-print is available on arXiv.
While Lean currently maintains its lead in AI for mathematical competitions, the landscape differs for practical proof engineering tasks (e.g., those from Isabelle's AFP and Lean's mathlib). On LeanDojo, a counterpart of PISA in Lean, the best result achieves 60% (by Alchemy), while Minilang's most basic LLM (a basic SFT with naive prompt context setting) reaches 79% on PISA. Although these results are from different benchmarks and not directly comparable, Minilang still positions Isabelle once again at the forefront of AI-driven theorem proving for practical proof engineering.
We envision Minilang as a foundational component of the AI toolkit ecosystem for Isabelle. To this end, we have open-sourced all code and data (see the end of our paper) and are committed to providing full technical support to Minilang and our toolkit users.
Our team will continue developing AI provers and tools for Isabelle, with the ultimate goal of providing production-ready, push-button solutions. We will update on our progress.
Long live Isabelle, and may it continue to prosper.
Qiyuan
Last updated: Jan 09 2026 at 12:51 UTC