[isabelle] Isabelle2014-RC0 segfault / sledgehammer raise... · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Isabelle2014-RC0 segfault / sledgehammer raise...

Email Gateway (Aug 19 2022 at 14:52):

From: Fabian Immler <immler@in.tum.de>
Hi Isabelle,

I noticed two problems, it could be that they are related but i don't know. Note that (especially the first one) might be related to the ML process crashing in thread [1].

Here is how i can reproduce them -- it seems to be important to have a clean ISABELLE_HOME_USER.
I could reproduce them on my Mac (OS X 10.8.5) and on a Linux machine. The errors also occur with "current" isabelle bc957769b584.

1.) When I start Isabelle loading the following theory,

theory Scratch
imports Main
begin

notepad
begin
fix P and x::'a
{
fix x::'a
have "P x ⟹ P x"
by simp
} hence True
sledgehammer

sledgehammer finds a proof and appears to have terminated, but the poly-Process keeps running with 100% CPU until I get the following error message (via Isabelle Syslog) indicating a segfault:

Welcome to Isabelle/HOL (Isabelle2014-RC0: July 2014)
message_output terminated
/Users/immler/Desktop/Isabelle2014-RC0.app/Contents/Resources/Isabelle2014-RC0/lib/scripts/run-polyml-5.5.2: line 84: 85096 Segmentation fault: 11 "$POLY" -q -i $ML_OPTIONS --eval "$(perl "$ISABELLE_HOME/lib/scripts/recode.pl" "$MLTEXT")" --error-exit < /dev/null

standard_error terminated
standard_output terminated
process terminated
command_input terminated
process_manager terminated
Return code: 139

2.) When I load the following theory (again with a clean ISABELLE_HOME_USER)

theory Scratch
imports Main
begin

lemma True
sledgehammer learn_isar
..
(* sufficiently many blank lines to have the sledgehammer invocation below outside of the visible part of the text area *)
notepad
begin
fix P and x::'a
{
fix x::'a
have "P x ⟹ P x"
by simp
} hence True
sledgehammer

while making sure that "sledgehammer" is not started before "sledgehammer learn_isar" ends, sledgehammer fails with

exception DUP "⋀\^E\^Fxml_elem\^Fxml_name=typing\^E\^E\^Fxml_body\^E\^E\^Fxml_elem\^Fxml_name=sorting\^E\^E\^Fxml_body\^Etype\^E\^F\^E'a\^E\^F ...... l_elem\^Fxml_name=typing\^E\^E\^Fxml_body\^E\^E\^Fxml_elem\^Fxml_name=sorting\^E\^E\^Fxml_body\^Etype\^E\^F\^E'a\^E\^F\^E\^E\^F\^Exa\^E\^F\^E" raised (line 261 of "General/table.ML")

Best regards,
Fabian

[1] http://thread.gmane.org/gmane.science.mathematics.logic.isabelle.user/9091

Email Gateway (Aug 19 2022 at 14:52):

From: Christian Sternagel <c.sternagel@gmail.com>
Just for the record: I have also seen the second issue that Fabian
describes several times but did not investigate yet, how to reproduce it
reliably.

cheers

chris

Email Gateway (Aug 19 2022 at 14:53):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
Hi Fabian, Christian,

I noticed two problems, it could be that they are related but i don't know. Note that (especially the first one) might be related to the ML process crashing in thread [1].

Thanks for your reports. I will look more closely into them in the coming days. I actually ran into the second issue last Friday when demoing Sledgehammer to Daniel Matichuk but could not reproduce it afterward. I also experience the first issue once in a while, sometimes with Sledgehammer, sometimes without; it seems to be related to multithreading.

I will disable MaSh for now -- it is the likely source of both problems, and I would rather play safe with the coming release. It's a pity, since the tool really does help Sledgehammer, but I don't want to see reports like this one about Isabelle2014 final if it can be avoided.

Cheers,

Jasmin

Email Gateway (Aug 19 2022 at 14:53):

From: Lars Noschinski <noschinl@in.tum.de>
Yesterday, PolyML crashed two times for me (once with a segfault, once
with an illegal instruction). This happened, when I replaced a "by
(metis ...)" by an "apply (metis ...)". The "by (metis ...)" was of
course just inserted by sledgehammer, so this might have been an
instance of the same problem.

-- Lars

Email Gateway (Aug 19 2022 at 14:53):

From: Makarius <makarius@sketis.net>
A segfault is ultimately a problem in the ML runtime system, but if there
is a clearly defined ML component that triggers the problem, David
Matthews can probably make educated guesses what is wrong.

We have had spurious crashes of this kind since last winter, but they were
so rare that it was hard to guess anything. The problem (if it is the
same) seems to be specific to parallel Intel hardware, but not AMD.

If anybody observes any such segfaults on AMD, please report this.

Another possibility for further testing is to downgrade to an older
Poly/ML version from http://isabelle.in.tum.de/components e.g. by claiming
it in $ISABELLE_HOME_USER/etc/settings like this:

init_component "$HOME/.isabelle/contrib/polyml-5.5.1-1"

Then use "isabelle components -a" on the command-line to resolve missing
components, and restart the Isabelle application.

Makarius

Email Gateway (Aug 19 2022 at 14:53):

From: Christian Sternagel <c.sternagel@gmail.com>
One further data point:

When I run Fabian's first example

theory Scratch
imports Main
begin

notepad
begin
fix P and x::'a
{
fix x::'a
have "P x ⟹ P x"
by simp
} hence True
sledgehammer

I get the second error he reported (instead of a segfault; this is not
with a clean ISABELLE_HOME_USER though), i.e.,

exception DUP

"⋀\^E\^Fxml_elem\^Fxml_name=typing\^E\^E\^Fxml_body\^E\^E\^Fxml_elem\^Fxml_name=sorting\^E\^E\^Fxml_body\^Etype\^E\^F\^E'a\^E\^F ...... l_elem\^Fxml_name=typing\^E\^E\^Fxml_body\^E\^E\^Fxml_elem\^Fxml_name=sorting\^E\^E\^Fxml_body\^Etype\^E\^F\^E'a\^E\^F\^E\^E\^F\^Exa\^E\^F\^E"
raised (line 261 of "General/table.ML")

So the two issues seem related.

cheers

chris

Email Gateway (Aug 19 2022 at 14:53):

From: Fabian Immler <immler@in.tum.de>
I get the same behavior with polyml-5.5.1-1 and polyml-5.4.1 (have not checked the other ones).

Fabian

Email Gateway (Aug 19 2022 at 14:53):

From: Makarius <makarius@sketis.net>
Thanks for testing -- I am presently sitting in the lecture hall at ITP
and can't do experiments myself.

polyml-5.4.1 does not have the advanced memory management of polyml-5.5.x
so the crash source is different, but this also means it is potentially
easier to isolate.

Makarius

Email Gateway (Aug 19 2022 at 14:54):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
Fabian wrote:

For the record: I followed the instructions and unfortunately wasn't able to reproduce either bug on my MacBook Pro. I'll try it again when I get a chance. Otherwise, we'll need to investigate when we're both at the same physical location.

Jasmin

Email Gateway (Aug 19 2022 at 14:54):

From: Lars Noschinski <noschinl@in.tum.de>
Another segfault, this time triggered directly while using sledgehammer.

$ cat ~/.isabelle/Isabelle2014-RC0/etc/settings
ML_OPTIONS="-H 1000 --gcthreads 1"
JEDIT_JAVA_OPTIONS="-Xms512m -Xmx4096m -Xss4m -Dactors.corePoolSize=4
-Dactors.enableForkJoin=false"

init_component "$HOME/P/afp"

Following Makarius' suggestion, I'm leaving MaSh enabled for now, but
switch to poly 5.5.1-1 and will report on further possible crashes.

Email Gateway (Aug 19 2022 at 14:55):

From: Lars Noschinski <noschinl@in.tum.de>
Crashed again; even at the same position in the theory (with poly
5.5.1-1). I'll try tomorrow whether this can be reproduced reliably.

Welcome to Isabelle/HOL (Isabelle2014-RC0: July 2014)
message_output terminated
/home/noschinl/opt/Isabelle2014-RC0/lib/scripts/run-polyml-5.5.1: line
84: 27780 Segmentation fault "$POLY" -q -i $ML_OPTIONS --eval
"$(perl "$ISABELLE_HOME/lib/scripts/recode.pl" "$MLTEXT")" --error-exit
< /dev/null

-- Lars

Email Gateway (Aug 19 2022 at 14:55):

From: Lars Noschinski <noschinl@in.tum.de>
Since it is already tomorrow: I started the minimization process and
polyml indeed crashed reliably at the same place, again and again. Then
it suddenly stopped, and I haven't been able to reproduce it again (even
when going back to the original theory).

But I stumbled upon some interesting behaviour: Apparently, in this
proof, sledgehammer tries to use a theorem whose proof (in another
theory) failed. So, in the sledgehammer window, I don't see the output
of sledgehammer, but rather the error message belonging to the failed
"by" of said theorem.

-- Lars

Email Gateway (Aug 19 2022 at 14:58):

From: Makarius <makarius@sketis.net>
While sitting together at the "Heuriger" on Friday, Cezary mentioned in
passing that the machine-learning stuff uses a lot of archaic array
programming. That might well be a reason for Poly/ML vomiting: it has
happened occasionally over the years, but arrays are extremely rare these
days. A more recent incident was reported by Cezary on the thread
"[polyml] Heap does not grow up to --maxheap"
http://lists.inf.ed.ac.uk/pipermail/polyml/2013-March/001181.html

Maybe there is an easy way to avoid mutable data structures, and get
things back into proper shape for the Isabelle2014 release.

Independently of that, it would be nice if the reason of the crash could
be isolated and sent to David Matthews.

Makarius

Email Gateway (Aug 19 2022 at 15:01):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
Am 20.07.2014 um 19:12 schrieb Makarius <makarius@sketis.net>:

While sitting together at the "Heuriger" on Friday, Cezary mentioned in passing that the machine-learning stuff uses a lot of archaic array programming. That might well be a reason for Poly/ML vomiting: it has happened occasionally over the years, but arrays are extremely rare these days. A more recent incident was reported by Cezary on the thread "[polyml] Heap does not grow up to --maxheap" http://lists.inf.ed.ac.uk/pipermail/polyml/2013-March/001181.html

Maybe there is an easy way to avoid mutable data structures, and get things back into proper shape for the Isabelle2014 release.

I don't think there's an easy way. This code has to be as fast as possible. It currently runs in about 300 ms on my machine, which is good, but I can imagine that this number looks more like 3 s or worse for the L4.verified guys (due to inherent quadratic components). Sticking to pure functional data structure would add a logarithmic component and a whole lot of memory allocations in a tight loop. If the opposite of "archaic" means "slow", I'm siding with "archaic".

The only realistic alternative I see is to use an external C or C++ program and providing it as a component. I would rather avoid this.

Independently of that, it would be nice if the reason of the crash could be isolated and sent to David Matthews.

Yes, I'll look into this once the Vienna Summer of Logic is definitely over.

It would be nice to be able to fix this before the Isabelle2014 release, but I'm currently working from the hypothesis that we won't make it on time.

Jasmin

Email Gateway (Aug 19 2022 at 15:02):

From: Makarius <makarius@sketis.net>
The opposite of archaic means fast on the usual side-conditions of our ML
environment (persistent data structures in a parallel environment). The
overall performance of Isabelle has increased a lot in that respect in the
past 7 years. In contrast, OCaml guys usually optimize for a different
execution model, and well-known provers on that platform are perceived as
very slow today.

Sometimes there are archaic algorithms that insist in poking around in
some array in a certain way, but that is often not necessary. I can't say
anything specific about the machine-learning business -- it might require
some actual work to reform it. There might be also quick-and-dirty
workarounds to allocate many small arrays with a normal tree table for
access -- Poly/ML does not like large blobs on the heap.

In my everyday encounter with arrarys in legacy APIs on the JVM, they
usually inflict a performance hit that could have been avoided by proper
immutable structures from the Scala library. This old "arrays are fast
myth" is really just a myth and needs to be overcome.

Makarius

Email Gateway (Aug 19 2022 at 15:03):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
Hi Fabian, Makarius,

Fabian wrote:

1.) When I start Isabelle loading the following theory,

theory Scratch
imports Main
begin

notepad
begin
fix P and x::'a
{
fix x::'a
have "P x ⟹ P x"
by simp
} hence True
sledgehammer

sledgehammer finds a proof and appears to have terminated, but the poly-Process keeps running with 100% CPU until I get the following error message (via Isabelle Syslog) indicating a segfault:

I can reproduce the problem with the following (array-free!) theory:

theory Scratch
imports Main
begin

ML {*
fun launch_thread timeout task =
let
val birth_time = Time.now ()
val death_time = Time.+ (birth_time, timeout)
val desc = ("", "")
in
Async_Manager.thread "" birth_time death_time desc task
end
*}

ML {*
launch_thread (seconds 10.0) (fn _ => error "FOO")
*}

end

In short: Any uncaught exception raised from an old-style "Async_Manager" worker thread eventually leads to a segfault.

2.) When I load the following theory (again with a clean ISABELLE_HOME_USER)

theory Scratch
imports Main
begin

lemma True
sledgehammer learn_isar
..
(* sufficiently many blank lines to have the sledgehammer invocation below outside of the visible part of the text area *)
notepad
begin
fix P and x::'a
{
fix x::'a
have "P x ⟹ P x"
by simp
} hence True
sledgehammer

while making sure that "sledgehammer" is not started before "sledgehammer learn_isar" ends, sledgehammer fails with

exception DUP "`⋀

I have yet to investigate this closer, but the exception that was thrown in (1) was "DUP", so the bug appears to be a... duplicate. ;)

Thanks again for taking the time to produce such a useful bug report!

Jasmin

Email Gateway (Aug 19 2022 at 15:03):

From: Makarius <makarius@sketis.net>
The problem can be pushed further down to Runtime.thread, which is marked
as "Proof General legacy", but that is not the problem here.

For the moment, the included changeset should prevent that particular
crash, but I first need to understand what is really going on.

BTW, the other hard crash with Isabelle2014-RC0 PIDE interaction under
heavy load and many auxiliary files is already sorted out: it is a failure
of the ML/C foreign language interface and our SHA1.digest that is based
on it. (That is one of these array algorithms.)

Makarius
ch

Email Gateway (Aug 19 2022 at 15:03):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
The issues with MaSh should all be addressed now (cffd1d6ae1e5). If anybody sees anything remotely strange with MaSh in the run-up to Isabelle2014, please report it again.

Jasmin

Email Gateway (Aug 19 2022 at 15:04):

From: Makarius <makarius@sketis.net>
Just an update on these speculative ML_OPTIONS that I proposed to you at
ITP: "--gcthreads 1" should not make a difference, you can remove it or
use a default that makes sense for your hardware (e.g. 2 or 4). The 1 was
based on the suspicion that parallel GC could be unstable, but the
well-known ML_file editing crash was caused by SHA1.digest via ML/C FFI.
That should work better in the coming Isabelle2014-RC1.

Also note that JEDIT_JAVA_OPTIONS now have simpler defaults. You can omit
the properties for the actors framework, because it is no longer used in
Isabelle2014.

There is a remaining problem that the JVM uses the full hyperthreaded
number of CPU cores by default, according to
Runtime.getRuntime().availableProcessors(), but I don't know how to change
that. Oracle is not as quick as David Matthews to adapt to current
hardware.

Makarius

Email Gateway (Aug 19 2022 at 15:25):

From: Fabian Immler <immler@in.tum.de>
I have recently seen this behavior as well (in Isabelle2014-RC1) and can reproduce it as follows (it seems important that mashine learning is "finished"):

theory Scratch imports Main begin

lemma True sledgehammer learn_isar by (rule TrueI)

notepad begin
fix P
have "P"
by nothing
have "P"
(* invoke s/h here *)

inserting a textual "sledgehammer" or "sledgehammer learn_isar" command shows nothing in the output, but colors the command in red.

Fabian

Email Gateway (Aug 19 2022 at 15:26):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
Hi Lars & Fabian,

Thank you for your report! The attached changeset for "isabelle-release" fixes it. In short, it's naive to expect "Thm.proof_body_of" never to throw any exceptions.

Makarius: Could you apply the attached changeset to "isabelle-release"? Thanks.

Jasmin
try_proof_body

Email Gateway (Aug 19 2022 at 15:32):

From: Makarius <makarius@sketis.net>
On Fri, 1 Aug 2014, Jasmin Christian Blanchette wrote:

I have recently seen this behavior as well (in Isabelle2014-RC1) and can reproduce it as follows (it seems important that mashine learning is "finished"):

theory Scratch imports Main begin

lemma True sledgehammer learn_isar by (rule TrueI)

notepad begin
fix P
have "P"
by nothing
have "P"
(* invoke s/h here *)

inserting a textual "sledgehammer" or "sledgehammer learn_isar" command shows nothing in the output, but colors the command in red.

The attached changeset for "isabelle-release" fixes it. In short, it's
naive to expect "Thm.proof_body_of" never to throw any exceptions.

There is some documentation about it in the implementation manual:

\item @{ML Thm.proof_of}~@{text "thm"} and @{ML
Thm.proof_body_of}~@{text "thm"} produce the proof term or proof
body (with digest of oracles and theorems) from a given theorem.
Note that this involves a full join of internal futures that fulfill
pending proof promises, and thus disrupts the natural bottom-up
construction of proofs by introducing dynamic ad-hoc dependencies.
Parallel performance may suffer by inspecting proof terms at
run-time.

This makes it clear that looking at the internal derivation of a theorem
means to side-step the normal order of processing -- as a consequence you
may see failures from forked branches of the parallel execution.

Makarius: Could you apply the attached changeset to "isabelle-release"?

OK, I will push this over to bitbucket soon.

Although the deeper problem remains: How to inspect parallel derivations
robustly while the system is running? The Thm.get_name_hint approach is
fragile anyway -- I have just been there with another guy today. Isn't
that one of the hints from the Isabelle cookbook that don't quite work?

Makarius

Last updated: Jun 30 2025 at 20:23 UTC