Hi, I'm having a problem with a large project, where every couple of seconds, there's a large ML cleanup that halts the entire JEdit interface for a few seconds. Here's a condensed code example that causes the problem (Isabelle 2021):
theory MWE
imports "HOL-Library.Word"
begin
fun numbits' :: "nat ⇒ nat" where
"numbits' 0 = 0"
| "numbits' n = 1 + numbits' (n div 2)"
(* Ceiling of log2 a *)
definition numbits :: "'a::len word ⇒ nat" where
"numbits x = numbits' ((unat x)-1)"
lemma
assumes m_nonzero: "m > 1"
shows md_less': "2 ^ (2 * numbits m) div unat (m :: 128 word) < 2 ^ 129"
proof -
show ?thesis sorry
qed
end
on loading this file into JEdit, very often the ML memory viewer in the bottom right corner shows "ML Cleanup" for a few seconds, during which the JEdit interface becomes unresponsive. It looks like this example also uses a very disproportionate amount of memory as well. I was encountering very similar problems in Isabelle 2020. Any ideas for helping this?
How much memory do you have? In any case, performance-related issues like this are also good to mention on the mailing list, since this is something Makarius can answer best, and he does not use Zulip.
a lot. It seems to be capping out at around 4-5gb, and triggering the ML cleanup thing
but I have plenty of ram on this device
I'll also mention this in the mailing list, thanks
Hm, I have a 32 GiB RAM machine (Arch Linux) and your MWE goes through fine within a few seconds with no responsiveness issues at all. I sometimes get into situations like you describe with very large developments that have been open a long time, especially when I jump back and forth a lot so that a lot of material has to get rebuilt. Of when there's a proof method running amok with lots of backtracking. But never on a tiny example like this.
Perhaps you ought to post this on the mailing list. Ideally also check whether it also happens with the latest release candidates. And describe exactly what happens when you run your MWE (i.e. when does it start freezing up).
sounds good, I'll try all of that. It's definitely consistent for me, whenever I load the file in JEdit, it sits at <100mb used as it runs all the dependencies, then it hops up to 4gb and triggers a cleanup as soon as it tries to run md_less'. I also have a colleague that was experiencing very similar problems, I'll try to have them run this MWE as well
question, is it possible to disabelle auto-quickcheck for one specific lemma? Perhaps that could be causing an issue?
sent a message to the mailing list as well
Also to clarify, it's not hanging in purple on any proof methods or definitions, the interface itself is getting hung up. It's very bizzare
I don't know about disabling auto-quickcheck for a single lemma, but you can definitely turn it of globally in Preferences -> Plugin Options -> Isabelle -> General. Maybe that would be helpful to test.
OK, so disabling auto quickcheck seems to solve the problem. Very interesting
I'm going to try running the larger project without quickcheck...
Yup. Problem goes away
Did you install Isabelle from the official Tarball? Or using some kind of package manager?
From the official tarball
Hm, no idea then. What happens if you run quickcheck manually on that example?
It immediately triggers an ML cleanup haha
(and it balloons the memory usage of ML)
Oh yeah I see the problem.
If I run quickcheck manually, it consumes loads of memory and freezes up.
if I delete the Quickcheck call, it catches its breath after a while though.
Yeah, even after that, for me it is using up tons of memory. If I never trigger quickcheck it seems to be OK though. I wonder if there's a way to tell Isabelle not to quickcheck specific things...
Okay I see the problem.
The problem is that your example includes the expression 2 ^ 129 :: nat
. Since you haven't loaded HOL-Library.Code_Target_Numeral
, arithmetic on nat
and int
is code-generated using successor arithmetic, i.e. data nat = Zero | Suc nat
. That is, of course, very inefficient, especially if you have something like 2^129
.
If you do value "2 ^ 129 :: nat"
, you get basically the same effect.
Oh! I've never heard of HOL-Library.Code_Target_Numeral
, I've been looking for that kind of thing for a while
Yeah it basically just sets up the code generator in such a way that arithmetic on int
and nat
uses whatever arbitrary-precision arithmetic the target language offers.
So in ML and Haskell this will boil down to just using IntInf.int
or Integer
, both of which use GMP. On Scala it's probably BigInteger
.
To be completely honest, I just assumed it did that by default
Perhaps it ought to. I guess it increases the trusted code base (both the target-language arbitrary-precision arithmetic implementation and the translation rules in the code generator).
But it's practically unusable without it.
(fun fact, I think Fabian Immler did find a bug in the Poly/ML integer implementation once)
Wow. Makes sense then. Thanks so much for your help!
In any case, do feel free to send this to the mailing list. Perhaps someone has an idea of how to address this properly.
Andreas found a bug in machine words in a similar setting
Under normal circumstances, auto quickcheck should probably be killed very quickly when it doesn't find anything, but I suppose that in this case perhaps it goes so crazy that whatever is supposed to kill it doesn't get a chance to do so.
or perhaps auto quickcheck could use native ints in situations like this, since it isn't really trusted that much anyway
Well that doesn't really solve the problem in general.
The general problem is that it is basically impossible to predict how many resources the evaluation of generated code will take in advance, and if it takes an absurd amount (like here) it might degrade the usability of the rest of the system significantly.
You could e.g. have something like replicate (2 ^ 30) True = replicate (2 ^ 30) True
where you will also run into trouble.
(strangely, this behaves differently when importing Code_Target_Numeral
, it just makes quickcheck crash with an "out of memory". I wonder why)
Congratulations, you just solved an ongoing mystery on the mailing list involving Lars Hupel. They ran into the same problem as you but did not realise that Quickcheck was the culprit. (I linked this thread there)
Matthew already mailed it, but that was indeed some good sleuthing!
I mixed up the mail threads, but I'm glad to see a mystery solved!
Matthew Torrence has marked this topic as resolved.
Last updated: Dec 21 2024 at 12:33 UTC