[isabelle] Isabelle parsers · Mirror: Isabelle Users Mailing List

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Isabelle parsers

Email Gateway (Jun 02 2022 at 17:36):

From: Talia Ringer <tringer@cs.washington.edu>
Hi all,

I'm curious if there is an Isabelle/HOL parser available somewhere. I'm
trying to do some data processing with Isabelle/HOL theories, and I think a
parser would make this much easier. I'm open to any language since this is
just data processing.

Thanks!

Talia

Email Gateway (Jun 02 2022 at 18:45):

From: Manuel Eberl <manuel@pruvisto.org>
Hm, I think this might be tough. Others who are more knowledgeable may
say more, but this is my understanding:

The inner syntax (i.e. types, terms, etc) is highly extensible, and even
in the outer syntax It is possible to register new commands, and those
commands (also proof methods) can have their own parsers (written in ML
with parser combinators) as well.

So there really isn't any way to properly parse Isabelle without, well,
running Isabelle. And you can't parse a single file in isolation either,
you need to know the theories it depends on.

That said, I think some approximation is possible. After all,
Isabelle/jEdit does reasonable things even when there are parse errors,
so there is probably some well-defined way to parse an Isabelle file
approximately even when you don't fully know what each command parser does.

Manuel

Email Gateway (Jun 02 2022 at 22:08):

From: Wenda Li <wl302@cam.ac.uk>
Some of the older discussions in the mailing list could be relevant here:

https://lists.cam.ac.uk/sympa/arc/cl-isabelle-users/2021-02/msg00012.html

Wenda

Email Gateway (Jun 06 2022 at 12:28):

From: Dominique Unruh <unruh@ut.ee>
As far as I know, there is no existing implementation besides Isabelle
itself for this.

What Isabelle does can be accessed in Isabelle/ML or through
scala-isabelle (and possibly other way).

Isabelle itself does not, however, generate a parse tree of the theories
or similar. It does the following:

* It splits the theory into commands (so the transformation
theory-text -> list(commands-as-text) is relatively easy to access).

* For each command, one of a set of command parsers are invoked. These
do not output parse trees but directly return functions that
transform the state of the Isabelle toplevel.

* Those commands are applied, so one can access the current theory
after each command.

* In addition, Isabelle produces "reporting" which corresponds to the
IDE highlighting (e.g. what you get when ctrl-hovering on an element
of a term). This may be useful information but it is not enough to
get a full parse tree.

In addition, given a term (as a string), it is easy to parse it into a
term (as a datastructure).

Anything that goes beyond this might be possible but I think not easy.

If you can be more specific what kind of data you need to extract I
could give an educated guess as to whether/how it could be achieved.

Best wishes,
Dominique.

Email Gateway (Jun 06 2022 at 17:58):

From: Talia Ringer <tringer@cs.washington.edu>
Thanks! Similarly to another user here yesterday, I'm looking to get some
kind of dependency graph. I have a hacky way of getting them by text, but
it's not perfect and doesn't handle cross-theory dependencies.

So, an alternative to proper parsing would be any of the ML APIs that might
give us access to dependencies. For the sake of data processing, I think it
is OK to do this through Isabelle.

Talia

Email Gateway (Jun 07 2022 at 14:09):

From: Dominique Unruh <unruh@ut.ee>
Hi,

if you are happy with the same information that the IDE gives you (i.e.,
only the theorems that are explicitly mentioned by the user, but not any
theorems that are implicitly used, e.g., by the simp method), then I
think you are in luck:

The command "isabelle dump" (Section 2.6 of the system manual) does, if
I remember right, dump the full annotated theory code, including all the
references (from which you can then most likely extract the theorems).

It is dumped in YXML format which is a compressed XML format (see
Section 1.5). If you convert it back to XML (not sure there is a
user-level too for it, but the format is easy) you can explore it and
see what links it contains.

I cannot help with the details of this approach because the above
information is based on some experimentation I did a while back and I
don't remember the exact details.

Of course, all this information can also be extracted in Isabelle/Scala,
but if you don't need more than this static dump, it could be easier to
parse the files rather then learn the API.

Best wishes,
Dominique.

Email Gateway (Jun 07 2022 at 14:17):

From: Makarius <makarius@sketis.net>
If this is not just a throw-away experiment, you should invest in
Isabelle/Scala programming. The "isabelle dump" tool may server as example of
how to do it, but it also does things a bit indirectly.

The regular Isabelle HTML presentation in Isabelle2021-1 could be a better
starting point to understand semantically annotated Isabelle sources.

Side-remark: You should study Isabelle/Scala sources with the help of IntelliJ
IDEA, see also "isabelle scala_project" that documented in the "system
manual". There might be some genuine problems with the IDE project setup, but
then we should sort that out and not give up too early.

Makarius

Email Gateway (Jun 08 2022 at 12:31):

From: Fabian Huch <huch@in.tum.de>
In [1], you can find some functionality to extract dependency graphs
into a graph database using Isabelle/Scala and the Isabelle build + export.

Hope that helps,

Fabian

[1]: https://github.com/Dacit/afp-complex-networks

Last updated: Jul 18 2025 at 20:25 UTC