✔ ML cleanup problems · General

Hi, I'm having a problem with a large project, where every couple of seconds, there's a large ML cleanup that halts the entire JEdit interface for a few seconds. Here's a condensed code example that causes the problem (Isabelle 2021):

theory MWE

imports "HOL-Library.Word"

begin

fun numbits' :: "nat ⇒ nat" where
  "numbits' 0 = 0"
| "numbits' n = 1 + numbits' (n div 2)"

(* Ceiling of log2 a *)
definition numbits :: "'a::len word ⇒ nat" where
  "numbits x = numbits' ((unat x)-1)"

lemma
  assumes m_nonzero: "m > 1"
  shows md_less': "2 ^ (2 * numbits m) div unat (m :: 128 word) < 2 ^ 129"
proof -
  show ?thesis sorry
qed


end

on loading this file into JEdit, very often the ML memory viewer in the bottom right corner shows "ML Cleanup" for a few seconds, during which the JEdit interface becomes unresponsive. It looks like this example also uses a very disproportionate amount of memory as well. I was encountering very similar problems in Isabelle 2020. Any ideas for helping this?

Manuel Eberl (Nov 29 2021 at 13:31):

How much memory do you have? In any case, performance-related issues like this are also good to mention on the mailing list, since this is something Makarius can answer best, and he does not use Zulip.

Matthew Torrence (Nov 29 2021 at 13:32):

a lot. It seems to be capping out at around 4-5gb, and triggering the ML cleanup thing

Matthew Torrence (Nov 29 2021 at 13:32):

Matthew Torrence (Nov 29 2021 at 13:34):

Manuel Eberl (Nov 29 2021 at 13:37):

Hm, I have a 32 GiB RAM machine (Arch Linux) and your MWE goes through fine within a few seconds with no responsiveness issues at all. I sometimes get into situations like you describe with very large developments that have been open a long time, especially when I jump back and forth a lot so that a lot of material has to get rebuilt. Of when there's a proof method running amok with lots of backtracking. But never on a tiny example like this.

Perhaps you ought to post this on the mailing list. Ideally also check whether it also happens with the latest release candidates. And describe exactly what happens when you run your MWE (i.e. when does it start freezing up).

Matthew Torrence (Nov 29 2021 at 13:40):

sounds good, I'll try all of that. It's definitely consistent for me, whenever I load the file in JEdit, it sits at <100mb used as it runs all the dependencies, then it hops up to 4gb and triggers a cleanup as soon as it tries to run md_less'. I also have a colleague that was experiencing very similar problems, I'll try to have them run this MWE as well

Matthew Torrence (Nov 29 2021 at 13:40):

question, is it possible to disabelle auto-quickcheck for one specific lemma? Perhaps that could be causing an issue?

Matthew Torrence (Nov 29 2021 at 13:45):

Matthew Torrence (Nov 29 2021 at 13:48):

Also to clarify, it's not hanging in purple on any proof methods or definitions, the interface itself is getting hung up. It's very bizzare

Alex Weisberger (Nov 29 2021 at 13:52):

I don't know about disabling auto-quickcheck for a single lemma, but you can definitely turn it of globally in Preferences -> Plugin Options -> Isabelle -> General. Maybe that would be helpful to test.

Matthew Torrence (Nov 29 2021 at 13:56):

Matthew Torrence (Nov 29 2021 at 13:57):

Manuel Eberl (Nov 29 2021 at 13:57):

Did you install Isabelle from the official Tarball? Or using some kind of package manager?

Matthew Torrence (Nov 29 2021 at 13:57):

Manuel Eberl (Nov 29 2021 at 13:58):

Matthew Torrence (Nov 29 2021 at 13:59):

Manuel Eberl (Nov 29 2021 at 13:59):

Matthew Torrence (Nov 29 2021 at 14:00):

Yeah, even after that, for me it is using up tons of memory. If I never trigger quickcheck it seems to be OK though. I wonder if there's a way to tell Isabelle not to quickcheck specific things...

Manuel Eberl (Nov 29 2021 at 14:03):

Manuel Eberl (Nov 29 2021 at 14:04):

The problem is that your example includes the expression 2 ^ 129 :: nat. Since you haven't loaded HOL-Library.Code_Target_Numeral, arithmetic on nat and int is code-generated using successor arithmetic, i.e. data nat = Zero | Suc nat. That is, of course, very inefficient, especially if you have something like 2^129.

Manuel Eberl (Nov 29 2021 at 14:05):

Matthew Torrence (Nov 29 2021 at 14:05):

Oh! I've never heard of HOL-Library.Code_Target_Numeral, I've been looking for that kind of thing for a while

Manuel Eberl (Nov 29 2021 at 14:06):

Yeah it basically just sets up the code generator in such a way that arithmetic on int and nat uses whatever arbitrary-precision arithmetic the target language offers.

Manuel Eberl (Nov 29 2021 at 14:06):

So in ML and Haskell this will boil down to just using IntInf.int or Integer, both of which use GMP. On Scala it's probably BigInteger.

Matthew Torrence (Nov 29 2021 at 14:07):

Manuel Eberl (Nov 29 2021 at 14:07):

Perhaps it ought to. I guess it increases the trusted code base (both the target-language arbitrary-precision arithmetic implementation and the translation rules in the code generator).

Manuel Eberl (Nov 29 2021 at 14:07):

Manuel Eberl (Nov 29 2021 at 14:08):

(fun fact, I think Fabian Immler did find a bug in the Poly/ML integer implementation once)

Matthew Torrence (Nov 29 2021 at 14:08):

Manuel Eberl (Nov 29 2021 at 14:08):

In any case, do feel free to send this to the mailing list. Perhaps someone has an idea of how to address this properly.

Mathias Fleury (Nov 29 2021 at 14:09):

Manuel Eberl (Nov 29 2021 at 14:09):

Under normal circumstances, auto quickcheck should probably be killed very quickly when it doesn't find anything, but I suppose that in this case perhaps it goes so crazy that whatever is supposed to kill it doesn't get a chance to do so.

Matthew Torrence (Nov 29 2021 at 14:13):

or perhaps auto quickcheck could use native ints in situations like this, since it isn't really trusted that much anyway

Manuel Eberl (Nov 29 2021 at 14:15):

Manuel Eberl (Nov 29 2021 at 14:16):

The general problem is that it is basically impossible to predict how many resources the evaluation of generated code will take in advance, and if it takes an absurd amount (like here) it might degrade the usability of the rest of the system significantly.

Manuel Eberl (Nov 29 2021 at 14:20):

You could e.g. have something like replicate (2 ^ 30) True = replicate (2 ^ 30) True where you will also run into trouble.

Manuel Eberl (Nov 29 2021 at 14:20):

(strangely, this behaves differently when importing Code_Target_Numeral, it just makes quickcheck crash with an "out of memory". I wonder why)

Manuel Eberl (Nov 29 2021 at 16:38):

Congratulations, you just solved an ongoing mystery on the mailing list involving Lars Hupel. They ran into the same problem as you but did not realise that Quickcheck was the culprit. (I linked this thread there)

Stream: General

Topic: ✔ ML cleanup problems

Matthew Torrence (Nov 29 2021 at 13:29):

Manuel Eberl (Nov 29 2021 at 13:31):

Matthew Torrence (Nov 29 2021 at 13:32):

Matthew Torrence (Nov 29 2021 at 13:32):

Matthew Torrence (Nov 29 2021 at 13:34):

Manuel Eberl (Nov 29 2021 at 13:37):

Matthew Torrence (Nov 29 2021 at 13:40):

Matthew Torrence (Nov 29 2021 at 13:40):

Matthew Torrence (Nov 29 2021 at 13:45):

Matthew Torrence (Nov 29 2021 at 13:48):

Alex Weisberger (Nov 29 2021 at 13:52):

Matthew Torrence (Nov 29 2021 at 13:56):

Matthew Torrence (Nov 29 2021 at 13:57):

Matthew Torrence (Nov 29 2021 at 13:57):

Manuel Eberl (Nov 29 2021 at 13:57):

Matthew Torrence (Nov 29 2021 at 13:57):

Manuel Eberl (Nov 29 2021 at 13:58):

Matthew Torrence (Nov 29 2021 at 13:59):

Matthew Torrence (Nov 29 2021 at 13:59):

Manuel Eberl (Nov 29 2021 at 13:59):

Manuel Eberl (Nov 29 2021 at 13:59):

Manuel Eberl (Nov 29 2021 at 13:59):

Matthew Torrence (Nov 29 2021 at 14:00):

Manuel Eberl (Nov 29 2021 at 14:03):

Manuel Eberl (Nov 29 2021 at 14:04):

Manuel Eberl (Nov 29 2021 at 14:05):

Matthew Torrence (Nov 29 2021 at 14:05):

Manuel Eberl (Nov 29 2021 at 14:06):

Manuel Eberl (Nov 29 2021 at 14:06):

Matthew Torrence (Nov 29 2021 at 14:07):

Manuel Eberl (Nov 29 2021 at 14:07):

Manuel Eberl (Nov 29 2021 at 14:07):

Manuel Eberl (Nov 29 2021 at 14:08):

Matthew Torrence (Nov 29 2021 at 14:08):

Manuel Eberl (Nov 29 2021 at 14:08):

Mathias Fleury (Nov 29 2021 at 14:09):

Manuel Eberl (Nov 29 2021 at 14:09):

Matthew Torrence (Nov 29 2021 at 14:13):

Manuel Eberl (Nov 29 2021 at 14:15):

Manuel Eberl (Nov 29 2021 at 14:16):

Manuel Eberl (Nov 29 2021 at 14:20):

Manuel Eberl (Nov 29 2021 at 14:20):

Manuel Eberl (Nov 29 2021 at 16:38):

Lars Hupel (Nov 29 2021 at 16:43):

Matthew Torrence (Nov 29 2021 at 16:45):

Notification Bot (Dec 03 2021 at 12:43):