[isabelle] RC4: sledgehammer on Cygwin crashed once; temp... · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] RC4: sledgehammer on Cygwin crashed once; temp...

Email Gateway (Aug 19 2022 at 15:35):

From: Makarius <makarius@sketis.net>
That is one of these hard crashes of the Poly/ML runtime system that
become the more likely the more it approaches certain resource limits.
In the screenshot the sledgehammer invocation has quite a lot of provers.
What is your Multithreading.max_threads_value() in Isabelle/ML?
The question is how many are actuallt run in parellel.

How big is the theory that you were working on?

Makarius

Email Gateway (Aug 19 2022 at 15:36):

From: Gottfried Barrow <igbi@gmx.com>
The theory is not that big, about 1700 lines, which takes about 8
seconds to prove.

I was just starting with RC4, so I think threads was set by the default
"0" in the options, which appears to give me 4 threads, since I have an
older quad core.

I've been running with the thread options set to 4, and it has only
crashed that one time, and I've now been running sledgehammer quite a
bit over 6 hours or so.

For observational purposes, I have the "sledgehammer_params" set to use
17 provers, and I also have those 17 in the "Provers" option for the
Sledgehammer panel.

I list the 17 below and I've set threads=12 now, and, watching the
number of processes, it appears willing to actually start the number of
threads that max_threads_value() is set to.

For the first provers in the list, I interleave local provers with
remote provers, with the best 3 locals at the beginning, z3, e, and
spass, and I think it starts all the provers in the order listed. So,
with threads=4, remote_spass_pirate wasn't returning with an error
quickly, but with threads=12, it was failing after a few seconds, for a
particular proof.

Provers:

z3 remote_vampire e remote_e_tofof spass remote_iprover remote_snark
remote_e_sine remote_iprover_eq agsyhol remote_spass_pirate satallax
remote_waldmeister_new leo2 remote_satallax remote_agsyhol remote_leo2

The short story is that RC0 and RC4 have failed in a way that Cygwin
Isabelle hasn't failed before, so something has changed. RC4 has only
failed once, so it could be a rare thing.

On the surface, Isabelle2014 doesn't seem to be a big change, though I
suppose that's because I've already been exposed to it a while, such as
to datatype_new.

There is one huge surface addition, and that's \<open> and \<closed>.
Those are huge, because they're subtle and graphical, and they help to
clearly delimit the beginning and end of text, better than {* and *},
though those have their use.

In watching Sledgehammer with "verbose=true", it also appears that
machine learning is in the house. The prover "e" has become very
competitive with remote_vampire.

Regards,
GB

Email Gateway (Aug 19 2022 at 15:37):

From: Gottfried Barrow <igbi@gmx.com>
Finally another crash.

This is following a pattern. The crash doesn't happen until after I
terminate the sledgehammer command, and then it's not immediate.

In this case, I haven't changed threads back to threads=4, it's still at
12, with the 17 provers.

About 30 seconds after I terminated a sledgehammer command, I was
staring at the screen thinking about what I wanted to do, and it
terminated. The "Isabelle Syslog" message is below:

Regards,
GB

Welcome to Isabelle/HOL (Isabelle2014-RC4: August 2014)
assertion "baseAddr > (PolyWord*)obj && baseAddr <
((PolyWord*)obj)+length" failed: file "gc_mark_phase.cpp", line 439,
function: virtual void
MTGCProcessMarkPointers::ScanAddressesInObject(PolyObject*, POLYUNSIGNED)

message_output terminated
/cygdrive/e/E_2/dev/Isabelle2014/lib/scripts/run-polyml-5.5.2: line 84:
6368 Aborted (core dumped) "$POLY" -q -i $ML_OPTIONS --eval "$(perl
"$ISABELLE_HOME/lib/scripts/recode.pl" "$MLTEXT")" --error-exit < /dev/null

rmdir
:

failed to remove
‘/tmp/isabelle-6160’
:
Directory not empty

standard_output terminated
standard_error terminated
process terminated
command_input terminated
process_manager terminated
Return code: 134

Email Gateway (Aug 19 2022 at 15:37):

From: Makarius <makarius@sketis.net>
This one is already quite well-known in internal circles.

It can happen when the Poly/ML runtime system and memory management is
under high pressure -- we've increased that pressure on the application
side significantly since the last release. Trying to hunt it down and pin
it up, I have found out that Windows has a much higher chance to expose
that bad behaviour, presumably due to more narrow memory resources on
Cygwin. The situation might become better again, after switching to
Cygwin x86_64, but I have not tried it yet.

Note that David Matthews did not manage to eradicate the source of the
problem, because it never happened on any of his machines, and the people
who have seen it elsewhere could not repeat it reliably.

Makarius

Email Gateway (Aug 19 2022 at 15:37):

From: Gottfried Barrow <igbi@gmx.com>
Makarius,

I thought there was nothing more to say about it, and I didn't want to
reply just to say, "Switching to Cygwin x86_64 sounds exciting, if it
works". Now I have something to add.

"Auto Methods", mysteriously, could also be putting demands on the CPU
when Sledgehammer is running.

I usually enable, all together, "Auto Methods", "Auto Nitpick", and
"Auto Quickcheck" and forget about them.

Today, "Auto Methods" is acting as the only culprit for what I describe,
because I've enabled them one at a time for the setup I describe below.
"Auto Solve direct" seems to not be a problem, because I've left it
enabled, and I don't use "Auto Sledgehammer".

I attach a screenshot to show what's above where I have the cursor, and
what's below. It doesn't seem like any attempts should be made to prove
or disprove anything, and it's not like I have this problem all the time.

My setup for the screenshot is that I started PIDE with those 3 auto
options disabled. I then enabled them, with where the cursor is shown,
(I then took the time to do them one at a time, with only "Auto Methods"
causing this).

The CPU goes to about 50% as shown for all four cores, and then I have
to terminate. Disabling the three options is not enough.

As a final note, in repeatedly starting up, enabling "Auto Methods",
disabling, and then terminating PIDE, I had problems terminating, so I
rebooted.

So, on a fresh boot, I started PIDE, saw the purple bar show proving
until got to the cursor position I show in the screenshot, I enabled
"Auto Methods", all 4 cores started working to about 50% if the CPU, I
disabled "Auto Methods", and then I had to terminate.

Regards,
GB
140826__Auto stuff runs CPU mysteriously.png

Email Gateway (Aug 19 2022 at 15:38):

From: Makarius <makarius@sketis.net>
On Tue, 26 Aug 2014, Gottfried Barrow wrote:

"Auto Methods", mysteriously, could also be putting demands on the CPU
when Sledgehammer is running.

This is what the Isabelle/jEdit manual says in Isabelle2014:

\item @{system_option_ref auto_methods} controls automatic use of a
combination of standard proof methods (@{method auto}, @{method
simp}, @{method blast}, etc.). This corresponds to the Isar command
@{command_ref "try0"} \cite{isabelle-isar-ref}.

The tool is disabled by default, since unparameterized invocation of
standard proof methods often consumes substantial CPU resources
without leading to success.

There might be additional reasons for its CPU hunger. If so, I can refine
the text accordingly.

"Auto Solve direct" seems to not be a problem, because I've left it
enabled, and I don't use "Auto Sledgehammer".

More quotes from the manual:

\item @{system_option_ref auto_sledgehammer} controls a significantly
reduced version of @{command_ref sledgehammer}, which attempts to prove
a subgoal using external automatic provers. See also the
Sledgehammer manual \cite{isabelle-sledgehammer}.

This tool is disabled by default, due to the relatively heavy nature
of Sledgehammer.

\item @{system_option_ref auto_solve_direct} controls automatic use of
@{command_ref solve_direct}, which checks whether the current subgoals
can be solved directly by an existing theorem. This also helps to
detect duplicate lemmas.

This tool is \emph{enabled} by default.

The CPU goes to about 50% as shown for all four cores, and then I have
to terminate. Disabling the three options is not enough.

You can also experiment with the improved Monitor panel, to get some
statistics of the running ML system from inside.

Makarius

Last updated: Jul 04 2025 at 01:50 UTC