Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] ML in LCF architecture


view this post on Zulip Email Gateway (Aug 18 2022 at 16:47):

From: Makarius <makarius@sketis.net>
Here I am sensing some issues that are specific to the classic HOL family,
not Isabelle nor LCF itself. Looking at the sources of good old Cambridge
LCF (from Larry's website), you have a LISP engine that implements an ML
interpreter that is merely a symbolic computation engine, so ML proof
scripts cannot affect the system integrity itself.

Later HOL implementations have done this differently, by starting with an
off-the-shelf ML system (SML or OCaml) and implementing the prover such
that it shares the ML environment with user tools and proof scripts. This
has opened a few potential problems of trustworthiness, although I would
call nothing of this really critical.

In Isabelle we have moved more and more away from that raw exposure of ML,
not just by avoiding ML in regular proof scripts. Isabelle/ML is embedded
into a managed context of the Isar infrastructure. There where numerous
practical demands that motivated that, such as robust undo history and
support for parallel execution.

Taking this approach of to the extreme, one could easily make Isabelle/ML
a sandboxed environment for purely symbolic computations, without any
access to system functions of Standard ML, or the Isabelle/Pure
implementation itself. As mentioned before, Poly/ML provides robust means
to invoke ML at runtime, such that user code cannot mess up your system
implementation.

Funnily, when Mark encountered the Isabelle tty for the first time some
months ago, his first impulse was to drop out of the managed Isar
environment, in order to have raw access to the underlying ML :-)

BTW, this no longer works in the Isabelle/Scala Prover IDE protocol, but
for other reasons than anxiety about vulnerability of the prover kernel.

Makarius

view this post on Zulip Email Gateway (Aug 18 2022 at 16:48):

From: mark@proof-technologies.com
on 19/1/11 12:21 PM, Makarius <makarius@sketis.net> wrote:

Here I am sensing some issues that are specific to the classic HOL family,
not Isabelle nor LCF itself. Looking at the sources of good old Cambridge
LCF (from Larry's website), you have a LISP engine that implements an ML
interpreter that is merely a symbolic computation engine, so ML proof
scripts cannot affect the system integrity itself.

(Cambridge HOL and HOL88 were the same in this respect as Cambridge LCF, I
think...)

But surely the user still interacted in ML, on top of the ML implementation
of LCF, so I would have thought that such an ML interpreter has the same
behaviour as any other, with the same risks of users overwriting crucial ML
identifiers, etc... Maybe I'm missing something here.

In Isabelle we have moved more and more away from that raw exposure of ML,
not just by avoiding ML in regular proof scripts. Isabelle/ML is embedded
into a managed context of the Isar infrastructure. There where numerous
practical demands that motivated that, such as robust undo history and
support for parallel execution.

Taking this approach of to the extreme, one could easily make Isabelle/ML
a sandboxed environment for purely symbolic computations, without any
access to system functions of Standard ML, or the Isabelle/Pure
implementation itself. As mentioned before, Poly/ML provides robust means
to invoke ML at runtime, such that user code cannot mess up your system
implementation.

Yes, sandboxing would be the ultimate I suppose. Presumably you mean that
this would involve some versatile API giving the user effectively the power
to extend the theorem prover like in normal LCF-style systems? But for the
time being, this all sounds a bit complicated, when all that needs to be
done to secure things as far as the domain of normal running of the ML
program is concerned (i.e. ignoring system functions, somehow circumventing
the ML language, OS-type hacks, etc) is to stop certain crucial ML values
and pretty printers from being overwritten, which I think you said was
already possible in Poly/ML.

Funnily, when Mark encountered the Isabelle tty for the first time some
months ago, his first impulse was to drop out of the managed Isar
environment, in order to have raw access to the underlying ML :-)

Yes this is how I understand how things really work. Maybe I'm just a
low-level command line junkie :-)

Mark.

view this post on Zulip Email Gateway (Aug 18 2022 at 16:48):

From: Makarius <makarius@sketis.net>
Fine. So the next step is to see how raw ML implements Isar, and Isar
incorporates ML again, with some sandboxing already in place. You should
then get some ideas how to implement a secure system, by adding
appropriate infrastructure instead of chopping everything off as in
HOL-Light, and probably also HOL Zero right now.

Makarius

view this post on Zulip Email Gateway (Aug 18 2022 at 16:48):

From: mark@proof-technologies.com
on 20/1/11 1:53 PM, Makarius <makarius@sketis.net> wrote:

I'm not sure what you mean by "chopping everything off" ...

And are you saying that Isar currently incorporates full-blown ML, so that
users can write their own extensions with all the power of ML?

Mark.

view this post on Zulip Email Gateway (Aug 18 2022 at 16:49):

From: Makarius <makarius@sketis.net>
Yes, the ML inside Isar is the full Poly/ML moderated by some of our
infrastructure. It is a bit like an operating system that turns the raw
CPU and memory into virtualized versions, such that user space programs
cannot mess up the whole thing, and the whole system becomes much more
useful and powerful than bare metal.

Makarius


Last updated: May 06 2024 at 16:21 UTC