Hi, I want to write a tool that can consume, manipulate, and produce THY files. For instance, a tool that can tell me which lemmas in a THY file are unused. How might I do this? It would be ideal if there were a tool that could export/import THY files to/from XML files, as I know how to parse XML. Does that exist? Thanks.
Hmm, converting to and from a particular file format simply because you know how to parse it doesn’t seem to be a good idea. Wouldn’t it be better to parse theory files directly?
Let me generalise my question then: how can I access and modify the AST of a THY file? I don’t want to build my own parser/pretty-printer for THY files, as that would be reinventing the wheel.
That’s a good general question. :smile: Unfortunately, I’m not deep enough into Isabelle that I could satisfyingly answer this question; others here might be able to do so. That said, my understanding is that there is some ML code in the core part of Isabelle for creating and modifying abstract syntax trees for _inner_ syntax (terms and types), but _outer_ syntax fragments, the parts _around_ terms and types, are not really parsed but rather processed directly using ML code, with _some_ additional foundation using finite state machines.
I'm looking for the same thing as you, also as a beginner. I noticed that there is a SideKick panel in jEdit IDE showing the structure of the file. But I haven't figured out how to call SideKick parser from the command line.
John Wickerson said:
Let me generalise my question then: how can I access and modify the AST of a THY file? I don’t want to build my own parser/pretty-printer for THY files, as that would be reinventing the wheel.
Dear John,
I noticed you're keen on writing a tool for identifying unused lemmas in a THY file. While the idea is great, I believe using the AST of a THY file might not yield the results you're hoping for.
For instance, if you prove a lemma named "foo" and store it in the default simp set of the simp method in a THY file. Determining whether "foo" has been used or not becomes challenging when we rely solely on the AST of the THY file without running Isabelle.
Just wanted to share this observation to potentially save you some time and effort. Let me know if you have any questions or if there's any other way I can assist!
Best wishes,
Yutaka
If you are just interested in finding unused theorems, then you can use unused_thms
More generally, while it is certainly possible to obtain a syntax tree of the outer syntax, you can't really parse inner syntax of Isabelle without running Isabelle itself. This is due to the complexity of inner syntax that allows you to do things like parse translations, adhoc overloading, etc.
There is a linter that works with the AST of the outer syntax. Perhaps @Fabian Huch can point you to the file where the AST of the outer syntax is obtained.
Lukas Stevens said:
There is a linter that works with the AST of the outer syntax. Perhaps Fabian Huch can point you to the file where the AST of the outer syntax is obtained.
We should have such terrific tools highlighted somewhere they are easily overlooked and underutilised...
With NixOS, you just do isabelle.withComponents (components: [ components.linter ])
:D
We could add it to isabelle.systems
It is already at https://isabelle.systems/addons
Lukas Stevens said:
We could add it to isabelle.systems
This website is awesome!
There is no "Isabelle AST" for outer syntax, at least not in the classical sense. The reason for this is that defining syntax in Isabelle is a user-space operation, i.e., no static syntax tree can even exist. The abstraction that we have are categorized tokens, i.e., you can statically tokenize Isabelle theories, where the token information yields you what kind it is.
If you want more than that, you will want to work with Isabelle code that has already been parsed by the actual syntax parsers (i.e., you need to process the content in Isabelle first). Then, you get the tokens with semantic markup -- syntax trees still don't exist as they don't exist in the parsing language. But the markup allows you to look at individual command spans and analyze tokens, defined entities, etc.
You could export that markup into xml (in fact, the representation is in xml internally), but you really want to use the machinery inside Isabelle -- otherwise you'll have to reinvent everything rather than just using a faithful implementation. You can do that either in Isabelle/Scala or Isabelle/ML.
The Isabelle linter does not actually parse Isabelle: Rather, it parsers the anti-patterns that people should avoid.
Finally, the actual problem of finding unused theorems can not be solved on a syntactical level as there are many ways in which theorems can be used that do not involve mentioning its name (e.g., fact
, implicit rule
, simp
set or other lemma collections, ...)
If you insist on still trying to have static syntax trees of Isabelle, I can point you to the work of Yiannos Stathopolous: He built an Isabelle/Isar parser (at least on the Pure level IIRC) a while ago, though maintenance is an issue, and I'm not sure the code is publicly available (you can ask him though).
Last updated: Dec 21 2024 at 16:20 UTC