Stream: General

Topic: ✔ ML cleanup problems


view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:29):

Hi, I'm having a problem with a large project, where every couple of seconds, there's a large ML cleanup that halts the entire JEdit interface for a few seconds. Here's a condensed code example that causes the problem (Isabelle 2021):

theory MWE

imports "HOL-Library.Word"

begin

fun numbits' :: "nat ⇒ nat" where
  "numbits' 0 = 0"
| "numbits' n = 1 + numbits' (n div 2)"

(* Ceiling of log2 a *)
definition numbits :: "'a::len word ⇒ nat" where
  "numbits x = numbits' ((unat x)-1)"

lemma
  assumes m_nonzero: "m > 1"
  shows md_less': "2 ^ (2 * numbits m) div unat (m :: 128 word) < 2 ^ 129"
proof -
  show ?thesis sorry
qed


end

on loading this file into JEdit, very often the ML memory viewer in the bottom right corner shows "ML Cleanup" for a few seconds, during which the JEdit interface becomes unresponsive. It looks like this example also uses a very disproportionate amount of memory as well. I was encountering very similar problems in Isabelle 2020. Any ideas for helping this?

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:31):

How much memory do you have? In any case, performance-related issues like this are also good to mention on the mailing list, since this is something Makarius can answer best, and he does not use Zulip.

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:32):

a lot. It seems to be capping out at around 4-5gb, and triggering the ML cleanup thing

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:32):

but I have plenty of ram on this device

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:34):

I'll also mention this in the mailing list, thanks

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:37):

Hm, I have a 32 GiB RAM machine (Arch Linux) and your MWE goes through fine within a few seconds with no responsiveness issues at all. I sometimes get into situations like you describe with very large developments that have been open a long time, especially when I jump back and forth a lot so that a lot of material has to get rebuilt. Of when there's a proof method running amok with lots of backtracking. But never on a tiny example like this.

Perhaps you ought to post this on the mailing list. Ideally also check whether it also happens with the latest release candidates. And describe exactly what happens when you run your MWE (i.e. when does it start freezing up).

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:40):

sounds good, I'll try all of that. It's definitely consistent for me, whenever I load the file in JEdit, it sits at <100mb used as it runs all the dependencies, then it hops up to 4gb and triggers a cleanup as soon as it tries to run md_less'. I also have a colleague that was experiencing very similar problems, I'll try to have them run this MWE as well

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:40):

question, is it possible to disabelle auto-quickcheck for one specific lemma? Perhaps that could be causing an issue?

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:45):

sent a message to the mailing list as well

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:48):

Also to clarify, it's not hanging in purple on any proof methods or definitions, the interface itself is getting hung up. It's very bizzare

view this post on Zulip Alex Weisberger (Nov 29 2021 at 13:52):

I don't know about disabling auto-quickcheck for a single lemma, but you can definitely turn it of globally in Preferences -> Plugin Options -> Isabelle -> General. Maybe that would be helpful to test.

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:56):

OK, so disabling auto quickcheck seems to solve the problem. Very interesting

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:57):

I'm going to try running the larger project without quickcheck...

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:57):

Yup. Problem goes away

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:57):

Did you install Isabelle from the official Tarball? Or using some kind of package manager?

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:57):

From the official tarball

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:58):

Hm, no idea then. What happens if you run quickcheck manually on that example?

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:59):

It immediately triggers an ML cleanup haha

view this post on Zulip Matthew Torrence (Nov 29 2021 at 13:59):

(and it balloons the memory usage of ML)

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:59):

Oh yeah I see the problem.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:59):

If I run quickcheck manually, it consumes loads of memory and freezes up.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 13:59):

if I delete the Quickcheck call, it catches its breath after a while though.

view this post on Zulip Matthew Torrence (Nov 29 2021 at 14:00):

Yeah, even after that, for me it is using up tons of memory. If I never trigger quickcheck it seems to be OK though. I wonder if there's a way to tell Isabelle not to quickcheck specific things...

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:03):

Okay I see the problem.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:04):

The problem is that your example includes the expression 2 ^ 129 :: nat. Since you haven't loaded HOL-Library.Code_Target_Numeral, arithmetic on nat and int is code-generated using successor arithmetic, i.e. data nat = Zero | Suc nat. That is, of course, very inefficient, especially if you have something like 2^129.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:05):

If you do value "2 ^ 129 :: nat", you get basically the same effect.

view this post on Zulip Matthew Torrence (Nov 29 2021 at 14:05):

Oh! I've never heard of HOL-Library.Code_Target_Numeral, I've been looking for that kind of thing for a while

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:06):

Yeah it basically just sets up the code generator in such a way that arithmetic on int and nat uses whatever arbitrary-precision arithmetic the target language offers.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:06):

So in ML and Haskell this will boil down to just using IntInf.int or Integer, both of which use GMP. On Scala it's probably BigInteger.

view this post on Zulip Matthew Torrence (Nov 29 2021 at 14:07):

To be completely honest, I just assumed it did that by default

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:07):

Perhaps it ought to. I guess it increases the trusted code base (both the target-language arbitrary-precision arithmetic implementation and the translation rules in the code generator).

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:07):

But it's practically unusable without it.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:08):

(fun fact, I think Fabian Immler did find a bug in the Poly/ML integer implementation once)

view this post on Zulip Matthew Torrence (Nov 29 2021 at 14:08):

Wow. Makes sense then. Thanks so much for your help!

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:08):

In any case, do feel free to send this to the mailing list. Perhaps someone has an idea of how to address this properly.

view this post on Zulip Mathias Fleury (Nov 29 2021 at 14:09):

Andreas found a bug in machine words in a similar setting

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:09):

Under normal circumstances, auto quickcheck should probably be killed very quickly when it doesn't find anything, but I suppose that in this case perhaps it goes so crazy that whatever is supposed to kill it doesn't get a chance to do so.

view this post on Zulip Matthew Torrence (Nov 29 2021 at 14:13):

or perhaps auto quickcheck could use native ints in situations like this, since it isn't really trusted that much anyway

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:15):

Well that doesn't really solve the problem in general.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:16):

The general problem is that it is basically impossible to predict how many resources the evaluation of generated code will take in advance, and if it takes an absurd amount (like here) it might degrade the usability of the rest of the system significantly.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:20):

You could e.g. have something like replicate (2 ^ 30) True = replicate (2 ^ 30) True where you will also run into trouble.

view this post on Zulip Manuel Eberl (Nov 29 2021 at 14:20):

(strangely, this behaves differently when importing Code_Target_Numeral, it just makes quickcheck crash with an "out of memory". I wonder why)

view this post on Zulip Manuel Eberl (Nov 29 2021 at 16:38):

Congratulations, you just solved an ongoing mystery on the mailing list involving Lars Hupel. They ran into the same problem as you but did not realise that Quickcheck was the culprit. (I linked this thread there)

view this post on Zulip Lars Hupel (Nov 29 2021 at 16:43):

Matthew already mailed it, but that was indeed some good sleuthing!

view this post on Zulip Matthew Torrence (Nov 29 2021 at 16:45):

I mixed up the mail threads, but I'm glad to see a mystery solved!

view this post on Zulip Notification Bot (Dec 03 2021 at 12:43):

Matthew Torrence has marked this topic as resolved.


Last updated: Dec 21 2024 at 12:33 UTC