[isabelle] Remaining uses of "isabelle process" and "isab... · Mirror: Isabelle Users Mailing List

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Remaining uses of "isabelle process" and "isab...

Email Gateway (Feb 27 2021 at 15:27):

From: Makarius <makarius@sketis.net>
Dear Isabelle users,

Isabelle2021 makes it easy to invoke Isabelle/Scala functions from
Isabelle/ML, but this requires a proper PIDE session for the underlying
protocol of Isabelle/Scala.

This works routinely in:

isabelle build
isabelle server
isabelle jedit
isabelle vscode
...

but not in:

isabelle process
isabelle console

What are remaining uses of the latter two, before they get removed for the
next release?

This should be seen as an opportunity to do "Isabelle system programming"
right in Isabelle/Scala --- using current Isabelle2021. There are many
possibilities, and I don't think that raw ML processes are still needed. Are
there genuine counter examples to this claim?

Makarius

Email Gateway (Mar 01 2021 at 11:29):

From: Dominique Unruh <unruh@ut.ee>
My scala-isabelle library currently uses 'isabelle process'. It uses it
in the following way:

* From a Scala application, the Isabelle process is invoked with '-e
somefile.ML' where somefile.ML contains a loop that reads commands
from one named pipe (or socket in Windows) and writes responses
asynchronously to another named pipe (or socket).

I would be happy to use Isabelle/Scala for this (or some other builtin
method), but I would like to keep the following features:

* It should be possible that the Scala application starts Isabelle.
(As opposed to the Scala application being invoked as a component in
Isabelle.)

* I do not want to have to install any components inside the Isabelle
home / user home.
(Because I want that an application that uses scala-isabelle can run
with an existing Isabelle installation without the user having to do
anything but tell the Scala application where Isabelle is located.
In particular, I don't want to potentially mess up an existing
installation.)

* I do not want to use the PIDE protocol for communication between the
Scala application and scala-isabelle (but instead my own binary
protocol).
(Because I got a factor 1000 or so in round-trip time when switching
to my own minimalistic RPC protocol compared to libisabelle that
used PIDE.)

It may very well be that this can be easily achieved without 'isabelle
process' or Isabelle/Scala. For example, one idea would be to simply
load a theory in a headless noninteractive session that contains a
(nonterminating) ML_file command which contains the communication loop.
(Of course, this theory would never be included in a heap because it
does not terminate.) Or an ML_file command that forks the loop into a
separate thread and then returns. But I would need some tips for that
because my experiments with these things has mainly lead to me getting
some messages like "invalid execution id" etc.

On the long run, I want to support Isabelle/Scala in scala-isabelle in
the sense that the existing Isabelle/Scala infrastructure should be an
optional mechanism how an scala-isabelle application can interact with
Isabelle. But here also my initial attempts ran into "invalid execution
id" like problems.

Best wishes,
Dominique.

Email Gateway (Mar 01 2021 at 12:45):

From: Makarius <makarius@sketis.net>
On 01/03/2021 12:02, Dominique Unruh wrote:

My scala-isabelle library currently uses 'isabelle process'. It uses it in the
following way:

* From a Scala application, the Isabelle process is invoked with '-e
somefile.ML' where somefile.ML contains a loop that reads commands from
one named pipe (or socket in Windows) and writes responses asynchronously
to another named pipe (or socket).

The proper way to invoke an Isabelle ML process in Scala is is via
isabelle.ML_Process() --- it has a lot of flexibility. At the same time I
reckon that this approach will not do the job, because it is not connected
properly to Isabelle/Scala via the PIDE protocol.

* I do not want to use the PIDE protocol for communication between the Scala
application and scala-isabelle (but instead my own binary protocol).
(Because I got a factor 1000 or so in round-trip time when switching to my
own minimalistic RPC protocol compared to libisabelle that used PIDE.)

The PIDE protocol is already very minimalistic and efficient. Maybe you have
just noticed the default delay that was there until recently (after
Isabelle2021):

user: wenzelm
date: Sat Feb 20 21:38:23 2021 +0100
files: src/Pure/System/message_channel.ML
description:
more reactive protocol messages, e.g. for Scala.function (relevant for
Bash.process);

Afterwards, the measured roundtrip for Scala.function is approx. 0..2ms.

If there are remaining inconveniences, we can try to sort it out, but
dismissing PIDE is not an option.

It may very well be that this can be easily achieved without 'isabelle
process' or Isabelle/Scala. For example, one idea would be to simply load a
theory in a headless noninteractive session that contains a (nonterminating)
ML_file command which contains the communication loop. (Of course, this theory
would never be included in a heap because it does not terminate.) Or an
ML_file command that forks the loop into a separate thread and then returns.
But I would need some tips for that because my experiments with these things
has mainly lead to me getting some messages like "invalid execution id" etc.

How about using "isabelle server" or Headless.Session directly in
Isabelle/Scala? The rest is done declaratively within the PIDE document model,
without any homegrown protocols around it.

There are many possibilities to do it correctly within the Isabelle framework,
without disrupting it.

On the long run, I want to support Isabelle/Scala in scala-isabelle in the
sense that the existing Isabelle/Scala infrastructure should be an optional
mechanism how an scala-isabelle application can interact with Isabelle. But
here also my initial attempts ran into "invalid execution id" like problems.

Once more: Isabelle/Scala (with PIDE protocol) is non-optional for
Isabelle/ML. In the next release this will be even more noticeable in everyday
use.

Makarius

Email Gateway (Mar 08 2021 at 10:26):

From: Dominique Unruh <unruh@ut.ee>

My scala-isabelle library currently uses 'isabelle process'. It uses it in the
following way:

* From a Scala application, the Isabelle process is invoked with '-e
somefile.ML' where somefile.ML contains a loop that reads commands from
one named pipe (or socket in Windows) and writes responses asynchronously
to another named pipe (or socket).
The proper way to invoke an Isabelle ML process in Scala is is via
isabelle.ML_Process() --- it has a lot of flexibility. At the same time I
reckon that this approach will not do the job, because it is not connected
properly to Isabelle/Scala via the PIDE protocol.

I confirm that I got things to work this way. I am using the following code:

override def startIsabelleProcess(cwd: Path, mlCode: String, logic: String,
sessionRoots: Array[Path], build: Boolean,
userDir: Optional[Path]): Process = {
Isabelle_System.init(isabelle_root = isabelleRoot.toString)
val channel = System_Channel()
val options = Options.init()
val channel_options = options.string.update("system_channel_address", channel.address).
string.update("system_channel_password", channel.password)

val sessionRoots2 = sessionRoots.map(p => isabelle.Path.explode(p.toString)).toList
val sessions_structure = Sessions.load_structure(options = options, dirs = sessionRoots2)
val store = Sessions.store(options)

if (build) {
Isabelle_Thread.fork(name ="Build Isabelle", daemon =true) {
Build.build_logic(options = options, logic = logic, build_heap =true, dirs = sessionRoots2)
}.join()
}

ML_Process(channel_options, sessions_structure, store, eval_main = mlCode, logic = logic, cwd = cwd.toFile)
}

Can you confirm that I am using the proper functions here? (Most of it
is guess work and try and error.)

And one additional question: Is there a way to set the Isabelle user
home? I tried the following code (inspired by some fragment I found in
the Isabelle source somewhere), but it has no effect. Maybe there is a
different way? (I do not want to require an environment variable to be
set outside the application. But I want to give Scala apps the
possibility to use a local setup that does not interfere with the
Isabelle config of the user, if needed.)

val userDir2 = isabelle.File.standard_path(userDir.get().toString)
val putenv = Class.forName("org.gjt.sp.jedit.MiscUtilities")
.getMethod("putenv", classOf[String], classOf[String])
putenv.invoke(null,"ISABELLE_HOME_USER", userDir2)
putenv.invoke(null,"USER_HOME", userDir2)

One more question: I am passing the session root directories both to
Sessions.load_structure and to Build.build_logic. I am worried that this
means that the time-intensive scan of the session directories
(especially if AFP is loaded) happens twice. Is that correct? Can it be
avoided?

But these remaining questions notwithstanding, it seems that I will not
need the command line "isabelle process" as long as ML_Process will be
there.

* I do not want to use the PIDE protocol for communication between the Scala
application and scala-isabelle (but instead my own binary protocol).
(Because I got a factor 1000 or so in round-trip time when switching to my
own minimalistic RPC protocol compared to libisabelle that used PIDE.)

Afterwards, the measured roundtrip for Scala.function is approx. 0..2ms.

That is indeed better than what I experienced, but still is far from the
round-trip time of 0.03ms I measured with a minimalistic protocol. But I
will explore a possible transition to PIDE / Isabelle/Scala as a backend
further in the future, as soon as time permits.

(Maybe switching to named pipes on Linux instead of network sockets is
the main reason for the speed difference. At least in my experiments,
the round trip time was much faster with named pipes.)

Best wishes,
Dominique.

Email Gateway (Mar 08 2021 at 11:49):

From: Makarius <makarius@sketis.net>
The named pipes were there until some years ago, when I made everything
uniform for the sake of Windows:
https://isabelle-dev.sketis.net/rISABELLEacba5d6fdb2

It might be possible to do named pipes again with current Java versions (we
are presently at Java 15, soon at Java 17), but I did not have time to
investigate (and no particular reason for doing it).

Makarius

Email Gateway (Mar 08 2021 at 12:41):

From: Makarius <makarius@sketis.net>
On 08/03/2021 10:49, Dominique Unruh wrote:

The proper way to invoke an Isabelle ML process in Scala is is via
isabelle.ML_Process() --- it has a lot of flexibility. At the same time I
reckon that this approach will not do the job, because it is not connected
properly to Isabelle/Scala via the PIDE protocol.

I confirm that I got things to work this way. I am using the following code:

override def startIsabelleProcess(cwd: Path, mlCode: String, logic: String,
sessionRoots: Array[Path], build: Boolean,
userDir: Optional[Path]): Process = {
Isabelle_System.init(isabelle_root = isabelleRoot.toString)
val channel = System_Channel()
val options = Options.init()
val channel_options = options.string.update("system_channel_address", channel.address).
string.update("system_channel_password", channel.password)

val sessionRoots2 = sessionRoots.map(p => isabelle.Path.explode(p.toString)).toList
val sessions_structure = Sessions.load_structure(options = options, dirs = sessionRoots2)
val store = Sessions.store(options)

if (build) {
Isabelle_Thread.fork(name = "Build Isabelle", daemon = true) {
Build.build_logic(options = options, logic = logic, build_heap = true, dirs = sessionRoots2)
}.join()
}

ML_Process(channel_options, sessions_structure, store, eval_main = mlCode, logic = logic, cwd = cwd.toFile)
}

Can you confirm that I am using the proper functions here? (Most of it is
guess work and try and error.)

It looks fine as a start. Just a few side-remarks:

* It is better to invoke Isabelle_System.init() only once at the start of
the application, not for each Isabelle process.

* Likewise it is better to invoke Options.init() only once, and later pass
around (options: Options) in a purely functional manner.

* sessionRoots: Array[Path] looks very strange. Are you trying to imitate
very old Java conventions (or Eclipse)? Array is the worst data structure
ever. In recent Java, I see more and more uses of java.util.List, together
with convenient List.of() functions to create adhoc instances. And Scala 2.13
has good conversions asJava / asScala --- not that I using that myself, unless
there is no way around it.

* isabelle.Path.explode(p.toString) looks quite fragile; it is unlikely to
work on Windows. Assuming that your Path type is java.nio.file.Path you can
use isabelle.File.path(path.toFile)

* cwd could be null (a rare exception in Isabelle/Scala signatures!) and
thus cwd.toFile could crash.

* Instead of Isabelle_Thread.fork, I often use the more convenient
Future.thread.

Anyway, the above would merely replace the external "isabelle process" by
internal Isabelle/Scala operations. It does not address the problem of a
missing PIDE session context for Isabelle/ML/Scala function invocations ---
the starting point of this thread.

You can probably get through with isabelle.Isabelle_Process instead of
isabelle.ML_Process: that will also take care of the System_Channel.

Note that a PIDE session/protocol context does not necessarily mean PIDE
document model: Build.build does a regular batch-build within that context,
without any Document.update.

The point of the whole exercise is to have official protocol commands and
protocol messages, instead of low-level stream communication.

And one additional question: Is there a way to set the Isabelle user home?

You can do it indirectly by giving an alternative ISABELLE_IDENTIFIER, e.g.
like this on the command-line:

env ISABELLE_IDENTIFIER=My_Great_App_for_Isabelle2021
Isabelle2011/bin/isabelle getenv ISABELLE_HOME_USER
ISABELLE_HOME_USER=/home/makarius/.isabelle/My_Great_App_for_Isabelle2021

That environment needs to be present for Isabelle_System.init().

One more question: I am passing the session root directories both to
Sessions.load_structure and to Build.build_logic. I am worried that this means
that the time-intensive scan of the session directories (especially if AFP is
loaded) happens twice. Is that correct? Can it be avoided?

Sessions.load_structure alone is not very expensive: it merely traverses all
ROOT files. This takes approx. 0.1s for Isabelle + AFP, after the first run of
approx. 0.3s.

Build.build_logic does a rather expensive up-to-date check of the specified
logic image (not other sessions). E.g. 1s for HOL, 3s for HOL-Analysis.

The standard approach is to do it only once on application startup. (Existing
"isabelle process" or ML_Process(), Isabelle_Process() don't do a build either.)

Makarius

Email Gateway (Mar 17 2021 at 12:59):

From: Dominique Unruh <unruh@ut.ee>
Hi,

thanks for your feedback.

The issues about invoking .init() are covered in my specific context
because the code fragment I wrote to you will be invoked only once
(global instance). And I am aware that Array[...] is a somewhat
un-Scala-like datastructure. The reason is again very specific to my
application: Due to the fact that my library cannot guarantee that it
will be invoked under the same Scala version as Isabelle is compiled
for, I have to do some classloader trickery. The effect is that the
signature of the wrapper method I use needs to use pure Java. (I can
provide more details on that trick if desired.) Concerning
isabelle.File.path – I was prepared to do some manual conversion to
cygwin paths, but having a method that already takes care of this, great! :)

Anyway, the above would merely replace the external "isabelle
process" by internal Isabelle/Scala operations. It does not address the
problem of a missing PIDE session context for Isabelle/ML/Scala function
invocations --- the starting point of this thread.

True. Further transition is planned for the future, but time constraints
force me to stick to this approach for now, at least. Is the ML_Process
class that I am using planned to be discontinued soon?

Best wishes,
Dominique.

Email Gateway (Mar 17 2021 at 14:27):

From: Makarius <makarius@sketis.net>
Scala certainly has its own problems concerning jar non-portability.

My complaint of the data structure was actually about current Java practice.
Starting with Java 8, it has become a mostly functional/immutable environment;
this trend has continued with Java 11 LTS, and Java 17 LTS (end of 2021) will
go even further.

Thus the speed of Java development has outpaced Scala in recent years.

Makarius

Email Gateway (Mar 17 2021 at 15:03):

From: Makarius <makarius@sketis.net>
No, it remains the official way to start a low-level ML process from
Isabelle/Scala.

The very point of the whole affair is that the process alone will no longer
suffice to run Isabelle sessions, because more and more Isabelle/Scala/ML
functions are coming.

A recent example is for remote ATPs in Sledgehammer:
https://isabelle-dev.sketis.net/phame/post/view/35/remote_provers_from_systemontptp_via_isabelle_scala

Other examples: add-on file-system operations (Isabelle_System.rm_tree).
Potentially also SHA1.digest: instead of a low-level C module for Poly/ML it
might be better to have a high-level Scala function.

Makarius

Last updated: Jul 15 2022 at 23:21 UTC