[isabelle] Splicing runtime ML values into Isar · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Splicing runtime ML values into Isar

Email Gateway (Aug 22 2022 at 10:12):

From: Lars Hupel <hupel@in.tum.de>
Yes. This may also happen to the "tactic" method:

ML‹
structure Method = struct
fun set_tactic _ = I
end
›

lemma True
apply (tactic ‹all_tac›)

(* exception Fail raised (line 284 of "Isar/method.ML"):
Undefined ML tactic *)

Hygiene is often a problem in languages which support nesting and
(anti)quotation of language elements.

Cheers
Lars

Email Gateway (Aug 22 2022 at 10:34):

From: Lars Hupel <hupel@in.tum.de>
Dear list,

we've had antiquotations for a while now -- a syntactic method to embed
formal entities into ML code. I've been wondering about the opposite
direction -- embedding (properly typed) ML values into a formal context.
With the arrival of cartouches, this is actually possible now. The
attached theory allows one to write:

term "(λx. SPLICE ‹Bound 0›)"

... which results in a formal expression logically equivalent to

term "(λx. x)"

This also works with theorems:

ML‹val mythm = @{thm conjI[where P = True]}›
lemmas mythm = [[splice ‹mythm›]]

Of course, this is not very useful by itself. But it can also be used to
compute larger terms using ML, e.g. for lemma statements:

lemma "0 < SPLICE ‹HOLogic.mk_number @{typ nat} (1+1)›"
oops

... produces the goal "0 < 2". There might be use cases in program
synthesis here.

A nice side effect of using cartouches is that markup works out of the
box, i.e. all the ML entities (and embedded antiquotations) can be
Ctrl-clicked.

Disclaimer: Do not use this for production purposes.

Comments welcome!

Cheers
Lars
splice.ML
Splice.thy
Splice_Examples.thy

Email Gateway (Aug 22 2022 at 10:34):

From: Andreas Lochbihler <andreas.lochbihler@inf.ethz.ch>
Dear Lars,

This looks like an interesting experiment. Do you think that something similar could be
done for formal comments inside types and terms, i.e., to change from the term language to
the language context "document", as discussed in a thread from last year?

https://lists.cam.ac.uk/mailman/htdig/cl-isabelle-users/2014-September/msg00056.html

Andreas

Email Gateway (Aug 22 2022 at 10:34):

From: Lars Hupel <hupel@in.tum.de>
Certainly. I only discovered that terms may contain cartouches by
accident, but given that, the possibilities for embedded syntax are endless:

syntax "_doc" :: "cartouche_position ⇒ 'a ⇒ 'a" ("DOC _ (_)")

This allows you to write

term "(DOC ‹abc› 3) + 4"

Of course, a suitable parse translation is required to make some sense
of "abc".

Cheers
Lars

Email Gateway (Aug 22 2022 at 10:35):

From: Makarius <makarius@sketis.net>
Cartouches were indeed introduced to make arbitrary nesting of languages
synactically easy, although the concept of sublanguages has been there in
Isabelle from early on.

There are endless possibilities, but it also needs some efforts to make
concrete applications work robustly with the endless features we have
already accumulated.

Just a few notes on this particular experiment:

* There is no need to poke into global refs. The implicit ML context
can be used to transport values in and out of ML "eval" in a
value-oriented manner. See the implementation of the "tactic" proof
method, with its Method.set_tactic slot.

* Term translations operate on rather raw parse trees. You have already
discovered that in the examples: referring to a local "x" is not
immediately clear. It would require some mechanism to "protect"
already internal terms within the parse tree, but that would have to
cooperate properly with the other syntax phases and the term "check"
phases (type inference etc.).

* Morphisms, which are particularly important for attribute expressions,
introduce another dimension of higher abstract nonsense. One needs to
look closely that embedded ML works with that, but not all aspects are
properly implemented in the system and existing antiquotations.
E.g. ML antiquotations themselves ignore morphisms.

Isabelle/ML in Isabelle2015 also provides literal token syntax for formal
input source. Here is an example of this madness, used together with the
SPLICE inner syntax:

theory Scratch
imports Main "~/tmp/Splice"
begin

ML ‹
val ctxt = @{context};
val input = ‹λx. x + y + SPLICE ‹Bound 0››;

val t = Syntax.read_term ctxt (Syntax.implode_input input);
›

end

You see how the nested source inside the ML block gets formally annotated
by PIDE markup.

Makarius

Email Gateway (Aug 22 2022 at 10:35):

From: Makarius <makarius@sketis.net>
You can of course embed something that looks like document source within
terms, or any other Isabelle language.

The remaining problem, though, is to have the document preparation system
present it properly: that is a bit old, it only looks at the outer syntax
for pretty printing.

To get this right, a substantial reform of the Isabelle document
preparation is required, to "print" everything formally from the depths of
arbitrary nesting.

Makarius

Email Gateway (Aug 22 2022 at 10:35):

From: Andreas Lochbihler <andreas.lochbihler@inf.ethz.ch>
Dear Lars and Makarius,

Thanks for the quick replies.

The parse translation is the problem, as Makarius has pointed out. At the moment, I use
informal comments (* abc *) inside terms - they are neither checked nor printed. If I knew
how to formally check these comments in a parse translation, this would already be one
step further, as I could then use document antiquotation to check terms, types and
constants automatically. Document preparation should still work, although it would be fine
for me in the first iteration to ignore these comments (as (* *) are at the moment). Is
something like this possible in the current situation?

Best,
Andreas

Email Gateway (Aug 22 2022 at 10:36):

From: Lars Hupel <hupel@in.tum.de>
First of all, thanks for your valuable comments.

There are endless possibilities, but it also needs some efforts to make
concrete applications work robustly with the endless features we have
already accumulated.

Indeed. That's why I posted this particular use case here in order to
get an idea of whether this is something people would like to use.

* There is no need to poke into global refs. The implicit ML context
can be used to transport values in and out of ML "eval" in a
value-oriented manner. See the implementation of the "tactic" proof
method, with its Method.set_tactic slot.

While refactoring my code to use theory/context data slots, I got
carried away and implemented a general mechanism for typed evaluation. I
uploaded it to a dedicated repository: <http://git.io/vINfn>

It can be used like this:

typed_evaluation foo = ‹int›

Typed_Evaluation.eval @{token foo} ‹3› @{context}

There might be a way to get rid of the explicit registration of the
"token", but I haven't found one yet.

* Term translations operate on rather raw parse trees. You have already
discovered that in the examples: referring to a local "x" is not
immediately clear. It would require some mechanism to "protect"
already internal terms within the parse tree, but that would have to
cooperate properly with the other syntax phases and the term "check"
phases (type inference etc.).

I haven't quite understood the mechanics yet. It appears that already
the parsing fails, which makes me wonder whether it'd be sufficient to
leave a spliced ML expression uninterpreted until checking. This would
probably warrant a separate check phase.

* Morphisms, which are particularly important for attribute expressions,
introduce another dimension of higher abstract nonsense. One needs to
look closely that embedded ML works with that, but not all aspects are
properly implemented in the system and existing antiquotations.
E.g. ML antiquotations themselves ignore morphisms.

I don't understand at all how morphisms come into play here. Are you
suggesting that the embedded ML should somehow have access to a
morphism? Which morphism?

Cheers
Lars

Email Gateway (Aug 22 2022 at 10:36):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Lars,

While refactoring my code to use theory/context data slots, I got
carried away and implemented a general mechanism for typed evaluation. I
uploaded it to a dedicated repository: <http://git.io/vINfn>

It can be used like this:

typed_evaluation foo = ‹int›

Typed_Evaluation.eval @{token foo} ‹3› @{context}

There might be a way to get rid of the explicit registration of the
"token", but I haven't found one yet.

this looks definitely interesting.

Can a malicious attacker with access to the ML name space »hijack« your
evaluation machinery, e.g. by defining it own structure
Typed_Evaluation? This is something difficult to handle?

We should discuss this on my next visit to TUM.

Cheers,
Florian

* Term translations operate on rather raw parse trees. You have already
discovered that in the examples: referring to a local "x" is not
immediately clear. It would require some mechanism to "protect"
already internal terms within the parse tree, but that would have to
cooperate properly with the other syntax phases and the term "check"
phases (type inference etc.).

I haven't quite understood the mechanics yet. It appears that already
the parsing fails, which makes me wonder whether it'd be sufficient to
leave a spliced ML expression uninterpreted until checking. This would
probably warrant a separate check phase.

* Morphisms, which are particularly important for attribute expressions,
introduce another dimension of higher abstract nonsense. One needs to
look closely that embedded ML works with that, but not all aspects are
properly implemented in the system and existing antiquotations.
E.g. ML antiquotations themselves ignore morphisms.

I don't understand at all how morphisms come into play here. Are you
suggesting that the embedded ML should somehow have access to a
morphism? Which morphism?

Cheers
Lars

signature.asc

Last updated: Oct 09 2025 at 01:37 UTC