Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Isabelle2019-RC2 sporadic smt failures


view this post on Zulip Email Gateway (Aug 22 2022 at 19:42):

From: Makarius <makarius@sketis.net>
This is one of the open obscurities that requires further investigations.

I have copied the above example lemma + proof many times, to get many
tests sequentially (apply ... done) or in parallel (by ...): it works
fine on Linux and 3 different Windows installations.

Do you maybe have some anti-malware software that acts itself as
malware, by hindering external processes?

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:42):

From: Mathias Fleury <mathias.fleury12@gmail.com>
Hi Fabian,

Can you replace in ~~/src/Tools/SMT/smt_systems.ML:

fun outcome_of unsat sat unknown solver_name line =
if String.isPrefix unsat line then SMT_Solver.Unsat
else if String.isPrefix sat line then SMT_Solver.Sat
else if String.isPrefix unknown line then SMT_Solver.Unknown
else raise SMT_Failure.SMT (SMT_Failure.Other_Failure ("Solver " ^ quote solver_name ^
" failed -- enable tracing using the " ^ quote (Config.name_of SMT_Config.trace) ^
" option for details"))

by

fun outcome_of unsat sat unknown solver_name line =
if String.isPrefix unsat line then SMT_Solver.Unsat
else if String.isPrefix sat line then SMT_Solver.Sat
else if String.isPrefix unknown line then SMT_Solver.Unknown
else raise SMT_Failure.SMT (SMT_Failure.Other_Failure ("Solver " ^ quote solver_name ^
" failed -- enable tracing using the " ^ quote (Config.name_of SMT_Config.trace) ^
" option for details and the error is " ^ (@{make_string} line)))

and then run your example (you will have to recompile HOL and HOL-Analysis) and after that tell me the exact error message?

Moreover could you run put the SMT problem outside of Isabelle in a terminal (I have attached the file, but it is exactly the problem that is included in your message) with the following command:

$ ~/.isabelle/contrib/z3-4.4.0pre-2/x86-windows/z3 --st smt.random_seed=1 smt.refine_inj_axioms=false -T:20 <path/to/prob_z3.smt_in>

I am interested in the statistics at the end, not the proof itself.

Thanks,
Mathias

view this post on Zulip Email Gateway (Aug 22 2022 at 19:42):

From: Fabian Immler <immler@in.tum.de>
On 5/23/2019 5:54 PM, Mathias Fleury wrote:

Hi Fabian,

Can you replace in ~~/src/Tools/SMT/smt_systems.ML:

fun outcome_of unsat sat unknown solver_name line =
if String.isPrefix unsat line then SMT_Solver.Unsat
else if String.isPrefix sat line then SMT_Solver.Sat
else if String.isPrefix unknown line then SMT_Solver.Unknown
else raise SMT_Failure.SMT (SMT_Failure.Other_Failure ("Solver " ^ quote solver_name ^
" failed -- enable tracing using the " ^ quote (Config.name_of SMT_Config.trace) ^
" option for details"))

by

fun outcome_of unsat sat unknown solver_name line =
if String.isPrefix unsat line then SMT_Solver.Unsat
else if String.isPrefix sat line then SMT_Solver.Sat
else if String.isPrefix unknown line then SMT_Solver.Unknown
else raise SMT_Failure.SMT (SMT_Failure.Other_Failure ("Solver " ^ quote solver_name ^
" failed -- enable tracing using the " ^ quote (Config.name_of SMT_Config.trace) ^
" option for details and the error is " ^ (@{make_string} line)))

and then run your example (you will have to recompile HOL and HOL-Analysis) and after that tell me the exact error message?
Yes, I replaced it (in ~~/src/HOL/Tools/SMT/smt_systems.ML) and got:
Solver z3: Solver "z3" failed -- enable tracing using the "smt_trace"
option for details and the error is ""

Moreover could you run put the SMT problem outside of Isabelle in a terminal (I have attached the file, but it is exactly the problem that is included in your message) with the following command:

$ ~/.isabelle/contrib/z3-4.4.0pre-2/x86-windows/z3 --st smt.random_seed=1 smt.refine_inj_axioms=false -T:20 <path/to/prob_z3.smt_in>
Yes, I ran this from ~/.isabelle/contrib and also from
.../Isabelle2019-RC2/contrib.
It works fine all the time (for 100+ sequential runs)

I am interested in the statistics at the end, not the proof itself.
(:add-rows 23
:added-eqs 29
:arith-conflicts 1
:assert-lower 15
:assert-upper 19
:conflicts 1
:eq-adapter 15
:fixed-eqs 6
:max-generation 3
:memory 0.61
:mk-clause 43
:num-checks 1
:pivots 12
:propagations 17
:quant-instantiations 27
:time 0.00
:total-time 1.16)

Fabian

On 23. May 2019, at 17:19, Makarius <makarius@sketis.net> wrote:

On 14/05/2019 13:54, Fabian Immler wrote:

With Isabelle2019-RC2 on Windows 10, I get sporadic failures of the smt
method.

One way to reproduce this is to insert/delete whitespace in the lemma
statement below. The error 'Solver z3: Solver "z3" failed -- enable
tracing using the "smt_trace" option for details' will occur in about 1
out of 10 smt invocations.

I've attached a trace with [[smt_trace]] enabled.

theory Scratch
imports
"HOL-Analysis.Analysis"
begin

lemma "fst ((1 - t1) *⇩R fst l1 + t1 *⇩R snd l1) ≤ fst (snd l1)"
if "t1 ∈ {0..1}"
"p1 = (1 - t1) *⇩R fst l1 + t1 *⇩R snd l1"
"fst (fst l1) < fst (snd l1)"
"fst (fst l2) < fst (snd l2)"
"fst (snd l1) < fst (fst l2)"
for l1 l2::"(realreal)real*real"
using that
by (smt atLeastAtMost_iff fst_add fst_scaleR scaleR_collapse
scaleR_left_mono)

end

This is one of the open obscurities that requires further investigations.

I have copied the above example lemma + proof many times, to get many
tests sequentially (apply ... done) or in parallel (by ...): it works
fine on Linux and 3 different Windows installations.

Do you maybe have some anti-malware software that acts itself as
malware, by hindering external processes?

Makarius

smime.p7s

view this post on Zulip Email Gateway (Aug 22 2022 at 19:42):

From: Mathias Fleury <mathias.fleury12@gmail.com>
Thank you very much for the detailed information.

To give you idea, on my mac I get

:total-time 0.01

So z3 on my machine is around ten times faster (!) on that problem.

I will try to play with the timeout to see if I can reproduce the problem.

Mathias

view this post on Zulip Email Gateway (Aug 22 2022 at 19:43):

From: Mathias Fleury <mathias.fleury12@gmail.com>
Hi Fabian,

I managed to find an old windows machine and I can reproduce the bug (sometimes at least). And the results are just weird.

smt works as follows:
(exec 2>&1; z3 ‹input_file›)> ‹temporary_file›

From this, two information are extracted:

1. the return code to indicate the kind of result (error/unknown)
2. the temporary_file

Here comes the interesting part. Under heavy load*, the ‹temporary_file› might not contain what I would expect to be there. The content can be corrupted and in particular, it can be empty.

I have no idea what changed since Isabelle2018 that could cause a change in the behaviour. I will think about it, but if anyone has an idea…

Out of curiosity, do you have an SSD or a hard drive?

Best,
Mathias

view this post on Zulip Email Gateway (Aug 22 2022 at 19:43):

From: Makarius <makarius@sketis.net>
There is a change in the Cygwin version for Windows. Cygwin introduces
extra complexity and can fail in funny ways, e.g. due to extra nesting
of processes as above.

The "(exec ...) > ..." scheme goes back to the original version by
Jasmin Blanchette from 2014:
http://isabelle.in.tum.de/repos/isabelle/rev/624faeda77b5#l12.56 -- so
the history does not say anything about the reasons for that. Maybe
Jasmin can recall them.

To robustify that bash script, I have made the following changes for the
next release candidate:

* Update of Cygwin:
https://isabelle.sketis.net/repos/isabelle-release/rev/b0fd8167bb9b

* Avoid extra subprocess:
https://isabelle.sketis.net/repos/isabelle-release/rev/2e101846ad8f

I will publish Isabelle2019-RC3 tomorrow, with all the other minor
changes that have accumulated in the past 2 weeks.

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:43):

From: Makarius <makarius@sketis.net>
The above changes are already in Isabelle2019-RC3. Can you try if it
makes a difference.

On my various test machines (Linux, macOS, Windows) I don't see any such
problems.

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:44):

From: Fabian Immler <immler@in.tum.de>
I have an SSD.

Fabian
smime.p7s

view this post on Zulip Email Gateway (Aug 22 2022 at 19:44):

From: Makarius <makarius@sketis.net>
Is there an easy way to provide the 400 runs above as isolated example?
Alternatively something derived from $ISABELLE_HOME/src/HOL/SMT_Examples?

For completeness, here is a copy from what I wrote on the other
(potentially unrelated) thread on Z3:

"""
I somehow suspect that the problems we see now are due to the change of
the environment, i.e. latent fragilities now become apparent due to one
or more of these reasons:

* unknown changes in Cygwin (64bit version, other changes)
* unknown changes in Windows 10
* unknown changes in the hardware: much faster process creation due to
SSD etc.
"""

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:44):

From: Fabian Immler <immler@in.tum.de>
Not sure if that is what you mean, but I was referring to simply 400
copies of the smt-call that I reported earlier in this thread:

theory Scratch
imports
"HOL-Analysis.Analysis"
begin

lemma "fst ((1 - t1) *⇩R fst l1 + t1 *⇩R snd l1) ≤ fst (snd l1)"
if "t1 ∈ {0..1}"
"p1 = (1 - t1) *⇩R fst l1 + t1 *⇩R snd l1"
"fst (fst l1) < fst (snd l1)"
"fst (fst l2) < fst (snd l2)"
"fst (snd l1) < fst (fst l2)"
for l1 l2::"(realreal)real*real"
using that
by (smt atLeastAtMost_iff fst_add fst_scaleR scaleR_collapse
scaleR_left_mono)

Fabian
smime.p7s

view this post on Zulip Email Gateway (Aug 22 2022 at 19:44):

From: Makarius <makarius@sketis.net>
OK, I will try this. I've thought that you have an actual application
with 400 runs somewhere in the background.

The implicit question behind this: How serious is the impact on
applications?

I will make one more round of refinement, but it might be all we can do
for now (the Isabelle2019 release).

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:44):

From: Fabian Immler <immler@in.tum.de>
Well, there is a large probability that I cannot build sessions with
more than a handful of smt-calls.

But I realized that Windows' "Virus & threat protection" seems to be the
culprit.

When building the theory with 400 smt calls, Task Manager shows that
"Antimalware Service Executable" is very active (100-150% CPU).

Disabling it temporarily
(Virus & threat protection ->
Virus & threat protection settings ->
Mange settings ->
Real-time protection), the problems with smt disappear.

Fabian
smime.p7s

view this post on Zulip Email Gateway (Aug 22 2022 at 19:45):

From: Makarius <makarius@sketis.net>
I have experimented with this a bit. Using 500 copies of your example
(as in the included compressed Test.thy), I see approx. 2 errors on one
of my Windows 10 test machines.

In the Windows Defender Settings there is a section "Virus & threat
protection settings". The second-last item at the bottom is for
"Exclusions". Here I have added the whole Isabelle home directory (as
over-approximation of all Isabelle components, cygwin bash etc.).

This appears to help: in a few subsequent tests there were no errors.

Makarius
Test.thy.gz

view this post on Zulip Email Gateway (Aug 22 2022 at 19:45):

From: Makarius <makarius@sketis.net>
These refinements are:

https://isabelle.sketis.net/repos/isabelle-release/rev/c7e9d3a0a681
https://isabelle.sketis.net/repos/isabelle-release/rev/9f3441164e92

I have no further ideas in the pipeline. We have already learned that
the Windows Defender is the main attack on Isabelle tool integrity.

You can try that in the informal snapshot Isabelle_27-May-2019 from
http://www21.in.tum.de/~wenzelm/test/website (it will disappear eventually).

Here is some text that I wrote some weeks ago about success and failure
of "Desktop Application Bundles on Linux, Windows, macOS":

"""
{\success} The majority of users is happy with the all-inclusive Isabelle
application bundle (2019: 300\,MB download size, 1\,GB unpacked size). It
works almost as smoothly as major Open Source products (e.g. Firefox,
LibreOffice).

{\failure} Full equality of all three platforms families has still not
been
achieved. Java GUI rendering on Linux is worse than on Windows and macOS;
exotic Linux/X11 window managers can cause problems. Add-on tools are
sometimes not as portable (and robust) on Windows: Isabelle often
refers to
Cygwin as auxiliary POSIX platform, but that may cause its own problems.
Even for native Windows tools (e.g. via MinGW) the Unix-style
orchestration
of multiple processes can have timing problems (e.g. Sledgehammer provers)
or fail due to antivirus software. Moreover, the recent tendency of
Windows
and macOS towards ``application stores'' makes it harder to run downloaded
Isabelle bundles on the spot: users need to bypass extra vendor checks.

"""

See also
https://sketis.net/2019/interaction-with-formal-mathematical-documents-in-isabelle-pide-at-cicm-2019-prague
and
https://sketis.net/2019/isabelle-presentations-at-lsv-paris-cachan

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:50):

From: Fabian Immler <immler@in.tum.de>
With Isabelle2019-RC2 on Windows 10, I get sporadic failures of the smt
method.

One way to reproduce this is to insert/delete whitespace in the lemma
statement below. The error 'Solver z3: Solver "z3" failed -- enable
tracing using the "smt_trace" option for details' will occur in about 1
out of 10 smt invocations.

I've attached a trace with [[smt_trace]] enabled.

Best regards,
Fabian

theory Scratch
imports
"HOL-Analysis.Analysis"
begin

lemma "fst ((1 - t1) *⇩R fst l1 + t1 *⇩R snd l1) ≤ fst (snd l1)"
if "t1 ∈ {0..1}"
"p1 = (1 - t1) *⇩R fst l1 + t1 *⇩R snd l1"
"fst (fst l1) < fst (snd l1)"
"fst (fst l2) < fst (snd l2)"
"fst (snd l1) < fst (fst l2)"
for l1 l2::"(realreal)real*real"
using that
by (smt atLeastAtMost_iff fst_add fst_scaleR scaleR_collapse
scaleR_left_mono)

end
smt_trace
smime.p7s

view this post on Zulip Email Gateway (Aug 22 2022 at 19:50):

From: Makarius <makarius@sketis.net>
I have seen such failures sporadically on Linux, too.

The Z3 setup is unchanged: the situation should be the same as
Isabelle2018 (see also
http://isabelle.in.tum.de/repos/isabelle/rev/1463c4996fb2).

Note that I don't understand the details of Z3 / smt myself. Sascha
Boehme had a look at it recently, but I am unsure if there will be an
update for the Isabelle2019 release (there are still a few weeks left).

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 19:51):

From: Fabian Immler <immler@in.tum.de>
But the situation is worse in Isabelle2019-RC2:
I just tried it once again:
In a theory file with 400 smt calls, I got 20 errors with
Isabelle2019-RC2 and none with Isabelle2018.

Fabian
smime.p7s


Last updated: Nov 21 2024 at 12:39 UTC