[isabelle] sledgehammer in RC4 · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] sledgehammer in RC4

Email Gateway (Aug 19 2022 at 12:28):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
Randy,

I didn't see that anyone answered this, and it was of interest to me,
because of how Sledgehammer used to deal with virtual cores (# of
threads), but things have changed to how I wanted them. It ceased to be
an issue for me because I currently use an older quad core with no
virtual threads.

I have a laptop with an i3-330m:

http://ark.intel.com/products/47663/

As the Intel web page shows, it has 2 hardware cores and 4 virtual
threads. My use of "ML {* Multithreading.max_threads_value () *}" for
that laptop returns "it = 4". For the older quad core that I use, which
has no virtual threads, it returns "it = 4". For a newer i7 with 4
cores, and 8 threads, I assume it would return "it = 8".

For the i3-330m, when I run sledgehammer with "e spass z3_tptp z3", on a
theorem it can't prove, with a 60 second timeout, I see 4 processes
start up, SPASS.exe, eprover.exe, and two copies of z3-solver.exe, all
taking up to about 25% of the CPU.

That's good, because with Isabelle2012 (or whatever I was using at the
time), it would only run 2 provers at a time on that laptop. So when I
get an i7, it hopefully will run 8 at a time.

I guess if you get "it = 2", it means you have an older dual core that
has no virtual threads.

Regards,
GB

Email Gateway (Aug 19 2022 at 12:30):

From: Jasmin Blanchette <jasmin.blanchette@gmail.com>
Hi Randy,

Sorry for the delay in answering.

In Proof General, this should do the trick:

ML {* Multithreading.max_threads := 4 *}

In jEdit, there appears to be an option called "threads" that one can set (judging from the code in "Pure/PIDE/protocol.ML" -- I'm faster at grepping than at searching my way through GUIs).

I hope this helps. And please let us (or me) know if you run into more issues with Sledgehammer.

Regards,

Jasmin

Email Gateway (Aug 19 2022 at 12:31):

From: Randy Pollack <rpollack@inf.ed.ac.uk>
Hi Jasmin,

Thanks for your answer. I'm not sure if it is useful or harmful to
count the hyperthreads (double the number of real cores). The host OS
(OSX or Windows or Linux) can see the difference between cores and
hyperthreads, and hopefully schedules accordingly (i.e. distributes
threads to different cores before using a hyperthread on a core
already running).

The issue is complicated by the fact that I'm running Ubuntu in a VBox
virtual machine. Probably (but I don't know, and haven't been able to
ascertain from VBox documentation) VBox simply spawns guest threads
and leave it to the host OS to schedule them. That would imply that
no harm is done by telling isabelle the number of hyperthreads rather
than the number of cores.

Unless someone on the list knows about this, experimentation is required.

Best,
Randy

Email Gateway (Aug 19 2022 at 12:31):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
You probably already know this, but for others who haven't used it, the
settings I show are useful to keep VirtualBox from hogging up all the CPUs.

For a machine in VirtualBox, there is the setting
"Setting/System/Processor", where you can set the number of processors
for a VirtualBox machine.

I attach a screen shot for my dual core laptop. It shows I have it set
to 2, but that 4 processors are available. I don't know if I set that,
or if that's the default.

Regards,
GB
vBox cpus.png

Email Gateway (Aug 19 2022 at 12:39):

From: Makarius <makarius@sketis.net>
On Thu, 14 Nov 2013, Randy Pollack wrote:

I'm not sure if it is useful or harmful to count the hyperthreads
(double the number of real cores). The host OS (OSX or Windows or
Linux) can see the difference between cores and hyperthreads, and
hopefully schedules accordingly (i.e. distributes threads to different
cores before using a hyperthread on a core already running).

This confusion of real cores vs. hyperthreaded virtual cores will probably
entertain the computing industry for a few more years. My experience is
that (1) the OSes today are smart enough to schedule threads for real
cores first and (2) the additional virtaul cores actually help a bit with
performance.

On my old 8 core Xeon * hyperthreading system (2009), I can get a true
factor of 9.6 in the best possible situation. That extra burning of CPU
cycles consumes quite a lot of energy, though. Users of mobile systems
should make sure that they don't overload the CPU unneccessary.

Isabelle provides a global system option "threads" for that (both for
Isabelle/Scala/jEdit and in Proof General via its own elips-based
mechanism).

Moreover, the Poly/ML settings can be fine-tuned in
$ISABELLE_HOME_USER/etc/settings like this:

ML_OPTIONS="-H 500 --gcthreads 4"

Further tuning might be required to make the JVM run smoothly on a given
system -- this is relevant for Isabelle/Scala and the PIDE front-ends.

Modern hardware requires considerable manual tuning to get the best
performance vs. energy consumption ratio.

I've heard occasionally complaints about too much burning of cycles, by
people who did not look at any of their hardware specifications nor the
Isabelle options.

The issue is complicated by the fact that I'm running Ubuntu in a VBox
virtual machine.

What is your host platform then? Since Isabelle claims to support all
major OSes routinely, I am always interested to see counterexamples.
(Android will not work, though.)

Makarius

Email Gateway (Aug 19 2022 at 12:50):

From: Randy Pollack <rpollack@inf.ed.ac.uk>
I just installed Isabelle2013-1-RC4 on a fresh installation of Ubuntu
13.10, using the standard "download for Linux" from the Isabelle
webpage. I built the HOL image without any problem.

Sledgehammer fails to find proofs that it found in the Isabelle2013
version. So I tried the simple example from the "First Steps" section
of "Hammering Away" from the Isabelle documentation. Only e and spass
were being called.

BTW, I fixed Z3_NON_COMMERCIAL= yes, and did not get the ususal
Isabelle warning saying to do that.

What have I forgotten?

Thanks for help,
Randy

Email Gateway (Aug 19 2022 at 12:50):

From: Jasmin Blanchette <jasmin.blanchette@gmail.com>
Hi Randy,

Which default provers end up being run depend on a number of factors. The algorithm has changed between 2013 and 2013-1RC*, to avoid running e.g. 5 provers on 4 cores. I'd need a bit more information from you to debug this:

How many cores does your machine have?
What is the value reported by ML {* Multithreading.max_threads_value () *} ?
Do you use Proof General or jEdit/PIDE?
How do you invoke Sledgehammer? Do you type "sledgehammer" in the editor window or use the new jEdit Sledgehammer panel?
What is the output of the "sledgehammer_params" command?

Thank you in advance.

Regards,

Jasmin

Email Gateway (Aug 19 2022 at 12:50):

From: Randy Pollack <rpollack@inf.ed.ac.uk>
Hi Jasmine,

I'm running in a VBox virtual machine on a MacPro that has 2 cores

How many cores does your machine have?
The VM is set with 2 CPUs

What is the value reported by ML {* Multithreading.max_threads_value () *} ?
2

Do you use Proof General or jEdit/PIDE?
Proof General

How do you invoke Sledgehammer? Do you type "sledgehammer" in the editor window or use the new jEdit Sledgehammer panel?
I type "sledgehammer"

What is the output of the "sledgehammer_params" command?
Default parameters for Sledgehammer:
blocking = false
debug = false
fact_filter = smart
fact_thresholds = 0.45 0.85
isar_compress = 10
isar_proofs = smart
isar_try0 = true
lam_trans = smart
learn = true
max_facts = smart
max_mono_iters = smart
max_new_mono_instances = smart
minimize = smart
overlord = false
preplay_timeout = 3
provers = e spass
slice = true
spy = false
strict = false
timeout = 30
type_enc = smart
uncurried_aliases = smart
verbose = false

Thanks,
Randy

Email Gateway (Aug 19 2022 at 12:50):

From: Jasmin Blanchette <jasmin.blanchette@gmail.com>
Hi Randy,

OK, that explains it.

Earlier versions of Sledgehammer ran at most 2 local ATPs (for a 2-CPU system) + a number of remote ATPs in that case, since remote ATPs do not put a heavy load on the local machine. However, this lead to some weird effects when using jEdit's new Sledgehammer tab, which users the thread scheduling of PIDE instead of Sledgehammer's somewhat deprecated solution. (Namely, two or three rounds of running ATPs would be necessary, doubling or trippling the effective time out and leading users to wonder what's going on.) It would be possible to have different defaults for Proof General and jEdit, but Sledgehammer already has too many modes, so I'd rather not go that way. Also, I am betting on the trend toward 4-/8-core systems and jEdit to continue.

The default provers invoked can be changed in Proof General using the Isabelle menu. This is preserved across sessions, so you should need to do it only once. An alternative is to put

sledgehammer_params [provers = e spass remote_vampire remote_z3]

at the top of your theory (or in one of your base theories).

I hope this helps.

Regards,

Jasmin

Email Gateway (Aug 19 2022 at 12:50):

From: Randy Pollack <rpollack@inf.ed.ac.uk>
Hi Jasmin,

You gave me useful information, and I can now get sledgehammer working for me.

There seems to be some problem with remote_z3:

sledgehammer_params [provers = remote_z3]
lemma "[a] = [b] \<Longrightarrow> a = b"
sledgehammer

gets the response:

Sledgehammer: "remote_z3" on goal
[a] = [b] \<Longrightarrow> a = b
A prover error occurred:
Solver "remote_z3" failed. [...]

while:

sledgehammer_params [provers = z3]
lemma "[a] = [b] \<Longrightarrow> a = b"
sledgehammer

gets the expected response. Also something strange about remote_e_sine:

sledgehammer_params [provers = remote_e_sine]
lemma "[a] = [b] \<Longrightarrow> a = b"
sledgehammer

gets the response:

Sledgehammer: "remote_e_sine" on goal
[a] = [b] \<Longrightarrow> a = b
Timed out.

while:

sledgehammer_params [provers = spass remote_e_sine]
lemma "[a] = [b] \<Longrightarrow> a = b"
sledgehammer

gets the expected response. So remote_e_sine works under some
circumstances and not under others. This seems to be repeatable.

BTW, how does Multithreading.max_threads_value get set to 2, can I
change it, and would that do any good? You and Makarius probably
understand this point better than I: a modern "dual core" Intel
machine has 4 hardware threads.

Thanks,
Randy

Email Gateway (Aug 19 2022 at 12:51):

From: Makarius <makarius@sketis.net>
I don't understand the "fixed Z3_NON_COMMERCIAL" part. What was broken
here?

Makarius

Email Gateway (Aug 19 2022 at 12:51):

From: Makarius <makarius@sketis.net>
The situation is indeed at a point where it is unclear to me what really
happens internally. The Proof General legacy is a heavy burden, sucks up
maintenance resources, and reduces possibilities to get it right by clear
and simple means.

We have already > 2 years of official stable releases of Isabelle/jEdit
(Isabelle2011-1, Isabelle2012, Isabelle2013, Isabelle2013-1). These are
in fact 4 generations of Prover IDE implementations: the steps between
each of them are quite substantial.

People who are still using Proof General should say more explicitly what
are the reasons for it, apart from old habits.

Makarius

Email Gateway (Aug 19 2022 at 12:52):

From: Jasmin Blanchette <jasmin.blanchette@gmail.com>
No. The message did not show up because only two provers were run (E and SPASS). That's perfectly normal. The message is shown only if users attempt to run Z3.

Jasmin

Email Gateway (Aug 19 2022 at 12:52):

From: Andreas Lochbihler <andreas.lochbihler@inf.ethz.ch>
Dear Makarius,

People who are still using Proof General should say more explicitly what are the reasons
for it, apart from old habits.
I use both jEdit and Proof General, but for different purposes. I see the following
advantages of ProofGeneral 3.7.1.1 with XEmacs:

XEmacs/ProofGeneral can be used completely without a mouse. With jEdit, I have to
switch to the mouse far more often, in particular for sledgehammer, find_theorems and to
navigate from one theory file to another.
Navigation by cursor keys is much slower. Moving a word to the left/right in XEmacs
jumps to the beginning / end of the next word, where a word consists of at least one
alphanumeric character. In jEdit, Ctrl-left/-right also stops at every parenthesis,
white-space, operator and the like.

Similarly, I often want to look somewhere else in my theory file (and use PgUp/PgDn for
that); with ProofGeneral, I just type C-c C-. to get back to where I was before, but I do
not know an equally simple way in jEdit.

XEmacs has less spacing between the lines (this is the main reason why I still use
ProofGeneral 3.7.1.1, GNU Emacs with PG 4.x has a similar spacing as jEdit), so more of
the proof script fits on the screen at the same font size.
The output buffers are not usable with a key board. This is particularly annoying when
I examine diagnostic output as produced by code_thms, export_code, print_classes, thm
<large Named_THM collection>, etc. In jEdit, I first have to open a new buffer and copy
the output there before I can use search and friends to navigate through.
After a few hours of work, jEdit reacts slower and slower. This gets particularly
annoying when typing fast. For example, I want that entering < = TAB produces the \<le>
symbol, always. Half of the time, I am faster than the popup and I get "<= " instead.
(I have tried this with 2013-1-RC3 last week, but not yet with RC4).

And finally, there's of course habits. I write my LaTeX documents in Emacs, so using the
same editor with the same handling is very convenient.

Andreas

Email Gateway (Aug 19 2022 at 12:52):

From: René Neumann <rene.neumann@in.tum.de>
Am 11.11.2013 12:34, schrieb Makarius:

People who are still using Proof General should say more explicitly what
are the reasons for it,

I think, I had put it in another mail already, but well:

jEdit is SLOW, as in: There is a noticeable lag when moving from one
menu to the other per mouse (rendering also consumes 50% to 100% of one
CPU).
Fonts are not antialiased. I found that setting some obscure
environment variable [1] made it work, but honestly: no AA by default is
a no-go.
no switching of sessions (at least last time I checked)
(obsolete: my WM fails to show jEdits dialogs. Just noticed that there
is a patch available for it).
jEdit is too mouse-centric. When I have to use "Ctrl-hover" to get
some information, I won't use it.
last time I checked, the 'abbreviation to unicode' conversion that
finally(!!) works great in PG, was somewhat unusable in jEdit.

apart from old habits.

And this is also not to be underrated. Over the last two years, I have
compiled a 150 lines .emacs (and I'm only using Emacs for Isabelle) plus
some 'isabelle emacs' startup magic. Hence being urged to spend all this
time again to get the editor in a shape one likes, does not feel good.
Especially when there are NEWS items reading "you have to re-do all your
current jEdit keybinding customizations".

René

[1] export _JAVA_OPTIONS='-Dawt.useSystemAAFontSettings=gasp
-Dswing.aatext=true'
smime.p7s

Last updated: Jun 30 2025 at 20:23 UTC