Stream: Beginner Questions

Topic: Parser for Isabelle


view this post on Zulip Minh D (Feb 21 2024 at 05:20):

I am wondering if there is a way to parse the syntax of Isabelle/Isar into an AST that I can implement in Isabelle/jEdit, I know that there is a parser in Isabelle/ML in the source of Isabelle, but how can I see the actual parsed tree?

view this post on Zulip Fabian Huch (Feb 21 2024 at 08:17):

There is no parse tree - parsing is done via parsing combinators without ever creating one. As the syntax is not static but user-extensible, parsing is dependent on the theory context.

Not sure what you're trying to implement but you can view the markup tree already in jEdit in the Sidekick panel (select isabelle-markup).

view this post on Zulip Manuel Eberl (Feb 21 2024 at 10:18):

Just for the sake of completeness: There is an AST for the inner syntax. But Isar is part of the outer syntax, which indeed does not have an AST.

view this post on Zulip Minh D (Feb 21 2024 at 13:49):

Thank you both for the answers.

As the syntax is not static but user-extensible, parsing is dependent on the theory context.

As I am new to this, what do you mean by user-extensible? Isn't the syntax for Isabelle/Isar is already mentioned in the Isabelle/Isar reference manual?

There is an AST for the inner syntax.

How can I get to produce the AST for the inner syntax?

view this post on Zulip Manuel Eberl (Feb 21 2024 at 15:46):

At almost any point in an Isabelle theory, you can open an ML block and do some stuff there. Such as: declaring new commands (like proof, lemma, by, qed, fun), new methods (like simp, auto) etc., and when you do that you can specify a parser (in the form of a parser combinator) to parse the arguments of that command or method.

view this post on Zulip Manuel Eberl (Feb 21 2024 at 15:46):

The "built-in" Isar commands like "have", "also", etc. are not special in that regard. You could define your own Isar-style commands in your theories, and indeed some people do.

view this post on Zulip Manuel Eberl (Feb 21 2024 at 15:47):

The reference manual just lists the canonical pre-defined Isar commands.

view this post on Zulip Manuel Eberl (Feb 21 2024 at 15:48):

The only thing that is really fixed (as far as I am aware) is the tokenisation.

view this post on Zulip Manuel Eberl (Feb 21 2024 at 15:54):

I actually have no idea how to look at the inner syntax AST. I just know it exists because there are parse/print translations (kind of "hooks" that can manipulate syntax after parsing) that operate on it. Chapter 8 of the reference manual has some information about this. But this is quite obscure internal stuff that only very few Isabelle users actually care about.

view this post on Zulip Yutaka Nagashima (Feb 21 2024 at 23:16):

Dear @Minh D

Do you already know what you intend to do with the inner syntax ASTs?

I have some experience in analyzing them to build AI tools for Isabelle. Perhaps I am familiar with the APIs you are searching for.

view this post on Zulip Minh D (Feb 22 2024 at 02:42):

Thank you @Manuel Eberl for the detailed explanation, I finally see what it means to say "user-extensible" syntax for Isabelle.
@Yutaka Nagashima Thank you, it would be great if I can understand how to work around with the inner syntax AST, can I DM you?

view this post on Zulip Yutaka Nagashima (Feb 22 2024 at 20:07):

@Minh D
Sure. I have a full-time job, so please be aware that my replies may be slow.

view this post on Zulip Chengsong Tan (Mar 18 2024 at 08:33):

Yutaka Nagashima said:

Dear Minh D

Do you already know what you intend to do with the inner syntax ASTs?

I have some experience in analyzing them to build AI tools for Isabelle. Perhaps I am familiar with the APIs you are searching for.

Dear @Yutaka Nagashima

Could I get some help from you as well, please?
I have similar questions of needing a parser for Isabelle to do some source-to-source transformation.
For instance, some search-and-replace cannot be intuitively implemented using regex only, such as replacing occurrences of variables that are not inside a lambda expression.
Could I DM you about the details?
Thanks in advance!

Best wishes,
Chengsong

view this post on Zulip Manuel Eberl (Mar 18 2024 at 09:15):

There is some prior work on that. Some students at TUM were working on building refactoring tools for Isabelle/jEdit. As I recall they used the markup generated by PIDE, i.e. the same stuff Isabelle/jEdit uses for highlighting and annotations as well. But I honestly don't know anything about this stuff. Maybe @Fabian Huch knows?

view this post on Zulip Fabian Huch (Mar 18 2024 at 15:19):

I don't know of any students doing that.

view this post on Zulip Manuel Eberl (Mar 19 2024 at 16:54):

Hm, someone ought to know. Sounds like the kind of thing you'd have supervised. Maybe ask Tobias.

view this post on Zulip Yutaka Nagashima (Mar 19 2024 at 23:12):

Chengsong Tan said:

Yutaka Nagashima said:

Dear Minh D

Do you already know what you intend to do with the inner syntax ASTs?

I have some experience in analyzing them to build AI tools for Isabelle. Perhaps I am familiar with the APIs you are searching for.

Dear Yutaka Nagashima

Could I get some help from you as well, please?
I have similar questions of needing a parser for Isabelle to do some source-to-source transformation.
For instance, some search-and-replace cannot be intuitively implemented using regex only, such as replacing occurrences of variables that are not inside a lambda expression.
Could I DM you about the details?
Thanks in advance!

Best wishes,
Chengsong

Sure, you can send me direct messages. It might take a bit longer to respond since I have a full-time job.

However, if you are working on a non-confidential project, I think it's better to have our conversation in public. That way, others can also read the messages. They might learn something new or may know something that I don't.

view this post on Zulip Chengsong Tan (Mar 26 2024 at 22:47):

(deleted)

view this post on Zulip Chengsong Tan (Mar 26 2024 at 22:51):

Yutaka Nagashima said:

Chengsong Tan said:

Yutaka Nagashima said:

Dear Minh D

Do you already know what you intend to do with the inner syntax ASTs?

I have some experience in analyzing them to build AI tools for Isabelle. Perhaps I am familiar with the APIs you are searching for.

Dear Yutaka Nagashima

Could I get some help from you as well, please?
I have similar questions of needing a parser for Isabelle to do some source-to-source transformation.
For instance, some search-and-replace cannot be intuitively implemented using regex only, such as replacing occurrences of variables that are not inside a lambda expression.
Could I DM you about the details?
Thanks in advance!

Best wishes,
Chengsong

Sure, you can send me direct messages. It might take a bit longer to respond since I have a full-time job.

However, if you are working on a non-confidential project, I think it's better to have our conversation in public. That way, others can also read the messages. They might learn something new or may know something that I don't.

Yes sure, I am happy to discuss in public. Do you know of any existing Isabelle parsers? I essentially want to do some source-to-source transformations in Isabelle, and to do that I need to be able to parse the code such that I can refer to the constructs like a lemma, a conjunct in a logic formula, a subgoal, a named fact inside a lemma and etc. It would be great if there's already something that can be used out of the box. Or if you have any experience in hand-rolling such a parser and know the caveats it would be very helpful.

Thanks a lot,
Chengsong

view this post on Zulip Yutaka Nagashima (Apr 07 2024 at 16:43):

@Chengsong Tan

Hi Chengsong,

Sorry for this late reply.

I do not have a monolithic framework for this.

However, in my projects, I worked on source-to-source transformation, partly using a domain-specific language (DSL).

The ML code I used for this resides here: https://github.com/data61/PSL/tree/master/SeLFiE.
The DSL is described in this paper: https://arxiv.org/abs/2010.10296.

need to be able to parse the code such that I can refer to the constructs like a lemma, a conjunct in a logic formula, a subgoal, a named fact inside a lemma and etc.

I am familiar with referring to constructs like lemmas, conjuncts in a logic formula, subgoals, and named facts inside a lemma, but in the context of Isabelle/ML.

Would you prefer to work on Isabelle/ML?

Regards,
Yutaka

view this post on Zulip Chengsong Tan (Apr 25 2024 at 22:40):

Yutaka Nagashima said:

Chengsong Tan

Hi Chengsong,

Sorry for this late reply.

I do not have a monolithic framework for this.

However, in my projects, I worked on source-to-source transformation, partly using a domain-specific language (DSL).

The ML code I used for this resides here: https://github.com/data61/PSL/tree/master/SeLFiE.
The DSL is described in this paper: https://arxiv.org/abs/2010.10296.

need to be able to parse the code such that I can refer to the constructs like a lemma, a conjunct in a logic formula, a subgoal, a named fact inside a lemma and etc.

I am familiar with referring to constructs like lemmas, conjuncts in a logic formula, subgoals, and named facts inside a lemma, but in the context of Isabelle/ML.

Would you prefer to work on Isabelle/ML?

Regards,
Yutaka

Hi Yukata,

Thanks a lot for replying!

Yes I can work with Isabelle/ML, any tool that gets the job done would be great (I haven't used it though).

So if I have an Isabelle theory file, what's the quickest way to set up a parser and play with it by manipulating the .thy file?

I am now starting to read your paper, please feel free to tell me the best way to get started on this.

Best wishes,
Chengsong


Last updated: Apr 27 2024 at 20:14 UTC