From: Thomas Sewell <Thomas.Sewell@nicta.com.au>
Hey Isabelle users.
I've been playing with Isabelle 2009-1's parallelism features for a 
while now, and have come to the conclusion that I can never get quite 
what I want from them. There are two problems. On my desktop machine at 
work, I can't safely enable parallelism. On my desktop machine at home, 
I can't completely disable parallelism.
The first problem is that the L4.verified proof repository I work on is 
huge (tens of thousands of lemmas and hundreds of thousands of lines of 
proof). This exposes a problem with parallel proof checking. CPU usage 
looks good to begin with but then our systems very rapidly run out of 
memory and start thrashing. I suspect the problem here is that for every 
proof saved for later checking, the relevant context needs to be saved as 
well. Our contexts get quite large, so the theory-parsing thread doesn't 
have to get far ahead of the proof-checking threads before memory runs 
out. The solution to this is to turn off parallel proof checking.
Furthermore, although Isabelle 2009-1 loads theories in parallel, this 
does not seem to translate to CPU parallelism. It is clear from the log 
chatter that multiple theories are being loaded at once; however, when 
parallel proofs are disabled, CPU usage remains at one core's worth. 
Running UNIX top and showing thread decompositions (press "H") reveals 
that there are multiple polyml threads running, but their CPU usage 
always adds up to exactly one core. It looks a lot like there is a lock 
somewhere that all the threads need in order to run. This prevents me 
from getting any parallel execution on my work machine.
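(For reference, one way to get this per-thread view, assuming a Linux 
procps top and that the Poly/ML process is called poly:

    # watch per-thread CPU usage of the running Poly/ML process
    top -H -p "$(pgrep -n poly)"

Starting top with -H is equivalent to pressing "H" in an already running 
instance.)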
Finally, on my machine at home, which seems to act up in the context of 
parallelism, I can't disable it. I can turn off parallel proof checking 
and set the maximum thread count to 1. UNIX top confirms that this 
reduces the number of polyml threads created from 7 to 2, which then 
share the CPU between them in exactly the same way. This prevents me 
from disabling real parallelism on my home machine. I can still run a busy 
loop on the other core, which seems to hide the mysterious issues.
Do other users see the same behaviour? Do the developers know what's 
going on? Could it be easily fixed?
Yours,
    Thomas.
From: Makarius <makarius@sketis.net>
On Fri, 9 Apr 2010, Thomas Sewell wrote:
The first problem is that the L4.verified proof repository I work on is
huge (tens of thousands of lemmas and hundreds of thousands of lines of
proof). This exposes a problem with parallel proof checking. CPU usage
looks good to begin with but then our systems very rapidly run out of
memory and start thrashing. I suspect the problem here is that for every 
proof saved for later checking, the relevant context needs to be saved as 
well. Our contexts get quite large, so the theory-parsing thread doesn't 
have to get far ahead of the proof-checking threads before memory runs 
out. The solution to this is to turn off parallel proof checking.
In a low memory situation, you have two options to play with:
  * Multithreading.max_threads (usedir option -M)
  * Goal.parallel_proofs (usedir option -q)
See also section 3.4 in the Isabelle System manual for some further 
explanations.  In particular, -q1 vs. -q2 might sometimes help.
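Both can be set at once via ISABELLE_USEDIR_OPTIONS, e.g. in the user 
settings file; a sketch, where the concrete values are merely 
illustrative:

    # at most 2 worker threads, parallel proofs at level 1
    ISABELLE_USEDIR_OPTIONS="-M 2 -q 1"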
Since your application is rather big, I would say the real solution is to 
install more memory.  As a rule of thumb, you need to scale up memory with 
the number of cores.  Less than 2 GB per core is unlikely to work here. 
Since the total heap size of Poly/ML is limited to 2-3 GB on 32bit, you 
might have to switch to 64bit by editing the ML settings accordingly. 
There is an example in Isabelle2009-1/etc/settings.  Multiply the memory 
by 1.5 or even 2 for 64bit mode.
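As a rough sketch, the 64bit variant could look like this; the platform 
identifier, path and heap size below are only placeholders, the actual 
template in Isabelle2009-1/etc/settings is authoritative:

    # illustrative 64bit ML settings (adapt to the local Poly/ML installation)
    ML_PLATFORM="x86_64-linux"
    ML_HOME="/usr/local/polyml/$ML_PLATFORM"
    ML_OPTIONS="-H 1000"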
Furthermore, although Isabelle 2009-1 loads theories in parallel, this 
does not seem to translate to CPU parallelism. It is clear from the log 
chatter that multiple theories are being loaded at once; however, when 
parallel proofs are disabled, CPU usage remains at one core's worth. 
Running UNIX top and showing thread decompositions (press "H") reveals 
that there are multiple polyml threads running, but their CPU usage 
always adds up to exactly one core. It looks a lot like there is a lock 
somewhere that all the threads need in order to run. This prevents me 
from getting any parallel execution on my work machine.
The degree of parallelism in typical theory development graphs is 
surprisingly low.  Nonetheless, you should get some speedup, unless the 
graph is all linear.
If CPU usage is really just 100% there might be some ML tools using 
something like setmp_CRITICAL internally.  The official Isabelle 
distribution is essentially free from such exclusive execution, but maybe 
you have your own tools.  Or maybe it is just some critical wrapper around 
the main use_thys invocation in ROOT.ML of the session (only relevant for 
batch mode).
Finally, on my machine at home, which seems to act up in the context of
parallelism, I can't disable it. I can turn off parallel proof checking
and set the maximum thread count to 1. UNIX top confirms that this
reduces the number of polyml threads created from 7 to 2, which then
share the CPU between them in exactly the same way. This prevents me 
from disabling real parallelism on my home machine. I can still run a busy 
loop on the other core, which seems to hide the mysterious issues.
I have just tried ISABELLE_USEDIR_OPTIONS="-M 1" on my 2-core laptop and 
it works as expected: there are only 2 threads, the main ML thread 
(typically 80% CPU time) and the GC thread (20% CPU time).  Total CPU 
usage is < 50% accounting for the 2 cores, i.e. this is really single-core 
mode.  The ML thread and the GC thread alternate with each other.
Makarius