Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Persisting proof data in theory


view this post on Zulip Email Gateway (May 02 2023 at 05:26):

From: hannobecker@posteo.de
Hi all,

(How) Is it possible to persist proof data in the surrounding theory?

I'm logging (in ML) some dynamic data (mostly statistics) about custom
tactics, storing them in the Proof_Context via the Proof_Data
mechanism. This allows me to analyze a proof while still within its
proof block, which has been very useful. I'd now like to persist this
information in the surrounding theory to allow for multiple proofs (and
even theories) to be analyzed at once.

The stored data type supports merging, but naively implementing it as
Theory_Data modified from within the proof does not work (or I'm doing
it wrong).

Are there ways to merge back proof-data into the surrounding theory?

Thanks!
Hanno

view this post on Zulip Email Gateway (May 02 2023 at 12:11):

From: Thomas Sewell <tals4@cam.ac.uk>
This will be a little tricky.

The design is set up so that no data escapes from a proof except for the final theorem. Proofs can then be run in parallel, since all their inputs are known without waiting for other proofs to complete (the theorem is handled specially). If you want to move stats out of a proof into the main context, you'll have to break out of this design somehow, e.g. by using stateful ML references to move data from place to place.

The timing panel in jEdit gathers some stats. I guess that it works by having each proof step produce output on the PIDE protocol and having the Scala side gather the statistics. You could presumably implement something similar, but extending the inter-language protocol has always seemed far more daunting than adding some ML code. Maybe others have a better idea?

Good luck,
Thomas.

view this post on Zulip Email Gateway (May 02 2023 at 16:22):

From: Makarius <makarius@sketis.net>
Good answer --- as first approximation of a difficult problem.

Conceptually, proofs are indeed formally isolated and strictly "irrelevant".

You can't "leak" information from proofs, subproofs, proof blocks, without
violating structural principles of Isabelle/Isar. In retrospect, I am rather
glad that we have such great principles, and not just an adhoc mess of
stateful commands as in ancient times.

Note that within a proof context, you cannot even store anything, that is
meant to persist outside the current local block.

Instead, in order to oversee and analyze what is happening in Isabelle
theories and proofs, the proper approach is to use Isabelle/Scala.

The most basic mechanism to expose information from Isabelle/ML into the build
database of Isabelle/Scala is via the ML function Export.export (and some
variants of it).

As usual, you will find examples via hypersearch over existing sources. This
reveals, e.g. src/HOL/Tools/Mirabelle/mirabelle.ML and the corresponding
mirabelle.scala.

Here the approach is more advanced than strictly necessary, using a
user-defined PIDE protocol handler. Alternatively, it is also possible to
retrieve theory exports from the session build database, like reading a zip
file. E.g. see "isabelle export" and its Isabelle/Scala implementation in
src/Pure/Thy/export.scala

Or you emit regular messages of a special form, e.g. "writeln" or "tracing"
and then use an approach like "isabelle log" to retrieve the collected
information in Isabelle/Scala.

Exports and messages leave the ML world and cannot be retrieved from the
running ML session. If you need incremental access to your data within ML, the
Isabelle/Scala approach will not work.

Instead, you can try to manage your own global state in Isabelle/ML, using
Synchronized.var or even weak references (together with locks). This requires
substantial understanding of threads and memory management in Isabelle/ML and
the Poly/ML runtime system. (Not necessarily at the start of such a project,
but as a consequence at the successful end of it.)

Makarius

view this post on Zulip Email Gateway (May 03 2023 at 19:10):

From: hannobecker@posteo.de
Thank you for the replies!

If possible, I'd like to go into more detail on how this can be achieved
in ML, at the cost of breaking structural principles.

The problem with updating state through references is that I would like
to be able to roll back state changes when looking at the state of a
theory prior to a proof vs. after the proof. Within the proof itself, I
am currently achieving this by introducing a method-wrapper which
temporarily enables proof-local state updates through references,
performed by tactics called within the scope of this wrapper (whether
successful or not). When the wrapper concludes, the state behind the
reference is then persisted into the proof context. I'd like to play a
similar 'trick' to persist the so-gathered proof local data into the
surrounding theory when the proof concludes, but it's not clear where
exactly that should happen (without writing an adjusted 'lemma'
top-level command, at least).

Any further pointers would be most welcome,
Thank you,
Hanno

view this post on Zulip Email Gateway (May 03 2023 at 19:57):

From: Makarius <makarius@sketis.net>
Breaking structural principles will merely lead to a broken setup --- it is
not going to work properly.

It is certainly possible to go beyond the purely functional structures of
Isabelle/Isar, but it requires a lot of extra considerations about the
workings of Isabelle/ML and Poly/ML.

Makarius

view this post on Zulip Email Gateway (May 03 2023 at 20:10):

From: hannobecker@posteo.de
Thank you Makarius, I understand that it's easy to shoot myself in the
foot here.
Yet, in the proof-local situation, things have so far worked well for me
despite
somewhat tricky/hacky handling of references, leading to useful insights
being
generated. So if, following up on

view this post on Zulip Email Gateway (May 03 2023 at 20:31):

From: Makarius <makarius@sketis.net>
On 03/05/2023 22:10, hannobecker@posteo.de wrote:

Thank you Makarius, I understand that it's easy to shoot myself in the foot here.
Yet, in the proof-local situation, things have so far worked well for me despite
somewhat tricky/hacky handling of references, leading to useful insights being
generated. So if, following up on

As long as you merely shoot yourself privately, nobody will notice.

Hopefully such experiments won't come back to the general public, e.g.
Isabelle/AFP.

It is certainly possible to go beyond the purely functional structures
of Isabelle/Isar, but it requires a lot of extra considerations about
the workings of Isabelle/ML and Poly/ML.

you had some further pointers on how a ref-based, reversible migration of
data from the proof context into the theory could work, I'd be grateful.
Otherwise, I'll likely stick with what I have.

I can't say much on the spot, because I don't know anything about your
application. The usual Question 0 is: Do you really need what you have in
mind? Question 1: What do you actually have in mind?

Often there is a mismatch of what is possible vs. required vs. actually done.

As a starting point to get acquainted with parallel ML, you can try to work
with global Synchronized.var maybe together with weak references (to
participate in ML garbage collection properly). The latter is documented in
the source: You get to the source by C-hover-click in Isabelle/PIDE on this text:

ML ‹open Weak›

Makarius


Last updated: Apr 26 2024 at 20:16 UTC