[isabelle] The capability to construct AST from Isabelle ... · Mirror: Isabelle Users Mailing List

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] The capability to construct AST from Isabelle ...

Email Gateway (Jan 03 2022 at 10:04):

From: Alex Meyer <alex153@outlook.lv>
Hello!

Group of scientists (I am not connected with them, of course) has published today the important preprint https://arxiv.org/abs/2112.15594 in Arxiv:
"A Neural Network Solves and Generates Mathematics Problems by Program Synthesis: Calculus, Differential Equations, Linear Algebra, and More"
They are using OpenAI Codex (automation of software synthesis from the natural language commands) and the pretrained neural language model GPT-3 to solve the math problems and to generate new problems.

The most interesting technical aspect is that they encoded the math problems and their solutions as python or smth like that (industrial programming language) expressions (I am still tryig to grasp). I.e. they are just using calculus (analysis - differential equations, algebra - polynomials, etc.) and they are not actually doing the full math (i.e. the invention of calculus and the proof of their correctness) that is possible with Isabelle.

Last year I asked here several questions about the possibility to construct full AST from the Isabelle expressions and here was quite a scepticism about the possibility to do this. I am including example of my construction at the end of this message to show what I mean.

The most important obstacle against the construction of AST from Isabelle expresions (that can be encoded/embedded by Graph Neural Networks as a vectors for the further processing - e.g. Isabelle specification/theorem can be encoded and used as a input to the neural solver (to search for the solution and search for the proof of the solution and use Isabelle estimation of the proof status as the reward that guides reinforcement learning search)) - as I understand - is the necessity to expand every node towards the first types (i, o) un the most primitive applications. E.g. my AST contains Orderings.ord_class.less, HOL.eq that can be expanded in such a way.

But I am not sure whether this is a problem indeed. E.g. one can do such expansion, it can be costly, but one can do such expansion and create embedding vectors for Orderings.ord_class.less, HOL.eq and us them hierarchical fashion further, in more abstract reasoning. I.e. there may be no need to create fully expanded ASTs in every single case. On can construct partially expanded ASTs and assign precomputed embeddings to the leafs/nodes that nodes the complex notions (like function Orderings.ord_class.less).

In any case, my aim with this message was to raise the importance of possibility to extract ASTs from Isabelle and to give back ASTs to Isabelle, to extract Isabelle expressions in some languages (JSON, Python, XML) that can be used for the embedding of those expressions into neural networks and for construction of language models. This interoperability aspect can be the Gutenberg factor for adoption of one or another proof asistant in the AI systems. There are Scala-Isabelle and Isabelle-MMT projects and I read about them half a year ago, maybe they are the foundations for the building this integration capability or (if it already exists) for raising awareness about it. Thanks!

Alex

{
"data": {
"elementType": "term",
"termType": "App",
"pretty1": "(⟹) (b < a)",
"term1": {
"elementType": "term",
"termType": "App",
"pretty1": "(⟹)",
"term1": {
"elementType": "term",
"termType": "Const",
"name": "Pure.imp",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" }
},
"pretty2": "b < a",
"term2": {
"elementType": "term",
"termType": "App",
"pretty1": "Trueprop",
"term1": {
"elementType": "term",
"termType": "Const",
"name": "HOL.Trueprop",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" }
},
"pretty2": "b < a",
"term2": {
"elementType": "term",
"termType": "App",
"pretty1": "(<) b",
"term1": {
"elementType": "term",
"termType": "App",
"pretty1": "(<)",
"term1": {
"elementType": "term",
"termType": "Const",
"name": "Orderings.ord_class.less",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" }
},
"pretty2": "b",
"term2": {
"elementType": "term",
"termType": "Free",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" },
"name": "b"
}
},
"pretty2": "a",
"term2": {
"elementType": "term",
"termType": "Free",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" },
"name": "a"
}
}
}
},
"pretty2": "two_integer_max_case_def a b = a",
"term2": {
"elementType": "term",
"termType": "App",
"pretty1": "Trueprop",
"term1": {
"elementType": "term",
"termType": "Const",
"name": "HOL.Trueprop",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" }
},
"pretty2": "two_integer_max_case_def a b = a",
"term2": {
"elementType": "term",
"termType": "App",
"pretty1": "(=) (two_integer_max_case_def a b)",
"term1": {
"elementType": "term",
"termType": "App",
"pretty1": "(=)",
"term1": {
"elementType": "term",
"termType": "Const",
"name": "HOL.eq",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" }
},
"pretty2": "two_integer_max_case_def a b",
"term2": {
"elementType": "term",
"termType": "App",
"pretty1": "two_integer_max_case_def a",
"term1": {
"elementType": "term",
"termType": "App",
"pretty1": "two_integer_max_case_def",
"term1": {
"elementType": "term",
"termType": "Free",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" },
"name": "two_integer_max_case_def"
},
"pretty2": "a",
"term2": {
"elementType": "term",
"termType": "Free",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" },
"name": "a"
}
},
"pretty2": "b",
"term2": {
"elementType": "term",
"termType": "Free",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" },
"name": "b"
}
}
},
"pretty2": "a",
"term2": {
"elementType": "term",
"termType": "Free",
"type": { "elementType" : "type", "typeType" : "MLValueTyp" },
"name": "a"
}
}
}
}
}

Last updated: Aug 20 2025 at 20:23 UTC