Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Isabelle2020-RC5 is now available --- Isabelle...


view this post on Zulip Email Gateway (Aug 23 2022 at 08:54):

From: Makarius <makarius@sketis.net>
On 12/04/2020 20:53, Peter Lammich wrote:

I was interrupted by a "Prover process terminated with error" dialogue displaying the message below.

Welcome to Isabelle/HOL (Isabelle2020-RC5: April 2020)
Interrupt
Interrupt
Interrupt
standard_error terminated
standard_output terminated
process terminated
Malformed message:
bad chunk (unexpected EOF after 679 of 836 bytes)
message_output terminated
command_input terminated
process_manager terminated
Return code: 137

The return code 137 = 128 + 9 means that the poly process was terminated by a
hard kill, i.e. the runtime system no longer reacted to SIGINT or SIGTERM.
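
As an aside, the 128 + signal convention can be reproduced with any process in a
plain bash shell (a generic illustration, unrelated to Isabelle itself):

$ sleep 100 &          # start a throw-away background job
$ kill -9 %1           # hard kill it with SIGKILL (signal 9)
$ wait %1; echo $?     # bash reports 128 + 9 = 137
137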

Moreover, the Interrupt exceptions in the syslog look suspicious: they indicate
resource problems in ML, e.g. severe shortage of heap or stack.

What is the output of the following?

* isabelle getenv ML_PLATFORM ML_OPTIONS

* ulimit -a

On the command line, I see

j65266pl@cspool450:~/devel/isabelle/RF2/llvm$ ~/opt/Isabelle2020-RC5/bin/isabelle jedit -d ~/devel/isabelle/afp-devel/thys/ basic/LLVM_Basic_Main.thy

Where can I get Isabelle/LLVM for Isabelle2020?

Makarius

view this post on Zulip Email Gateway (Aug 23 2022 at 08:54):

From: Peter Lammich <lammich@in.tum.de>

What is the output of the following?

* isabelle getenv ML_PLATFORM ML_OPTIONS

* ulimit -a

j65266pl@cspool450:~$ ~/opt/Isabelle2020-RC5/bin/isabelle getenv ML_PLATFORM ML_OPTIONS
ML_PLATFORM=x86_64_32-linux
ML_OPTIONS=--minheap 500
j65266pl@cspool450:~$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127434
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 127434
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

Where can I get Isabelle/LLVM for Isabelle2020?

https://github.com/lammich/isabelle_llvm

However, the session that failed used a private copy with some
experimental changes. Anyway, the error does not seem to be reproducible.

view this post on Zulip Email Gateway (Aug 23 2022 at 08:54):

From: Makarius <makarius@sketis.net>
On 13/04/2020 17:44, Peter Lammich wrote:

What is the output of the following?

* isabelle getenv ML_PLATFORM ML_OPTIONS

* ulimit -a

j65266pl@cspool450:~$ ~/opt/Isabelle2020-RC5/bin/isabelle getenv ML_PLATFORM ML_OPTIONS
ML_PLATFORM=x86_64_32-linux
ML_OPTIONS=--minheap 500
j65266pl@cspool450:~$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127434
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 127434
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

The parameters are fine -- the usual defaults on Linux.

Where can I get Isabelle/LLVM for Isabelle2020?

https://github.com/lammich/isabelle_llvm

However, the session that failed used a private copy with some
experimental changes. Anyway, the error does not seem to be reproducible.

The point is to get an impression of overall resource usage. After playing
with it for 30 min, it all looks fine and smooth to me.

There is significant usage of ML heap, though. On a mobile machine with 16GB
total, the heap converges towards 10-11GB.

I have occasionally seen instabilities of the Poly/ML runtime system in batch
builds in such rather full situations.

For now, I would say we keep watching it carefully. David Matthews has done a
lot of fine-tuning of heap management recently, and a bit more is to be
expected soon (after the Isabelle2020 release).

Makarius

view this post on Zulip Email Gateway (Aug 23 2022 at 08:55):

From: Peter Lammich <lammich@in.tum.de>

There is significant usage of ML heap, though. On a mobile machine with 16GB
total, the heap converges towards 10-11GB.

I have occasionally seen instabilities of the Poly/ML runtime system in batch
builds in such rather full situations.

I have 32GB of memory.

For now, I would say we keep watching it carefully. David Matthews has done a
lot of fine-tuning of heap management recently, and a bit more is to be
expected soon (after the Isabelle2020 release).

OK, I'll watch it. Right now, it looks like it was a one-in-a-million
thing, as it has not (yet) occurred again.

view this post on Zulip Email Gateway (Aug 23 2022 at 08:55):

From: Makarius <makarius@sketis.net>
OK, I also have 32GB on my main machine and will continue testing with that.

Note that Poly/ML in x86_64_32 mode (default) is limited to approx. 16GB, and
these 10-11GB user heap are rather typical in practice. David Matthews has
some ideas in the pipeline to double that space --- it will be flushed the
next time there is sufficient funding for such a project.
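
For readers who hit that ceiling: the ML platform can be switched to the full
64-bit mode via $ISABELLE_HOME_USER/etc/settings (a sketch based on the standard
settings mechanism; the minheap value is only an example, and heaps grow
noticeably larger in this mode):

# use full 64-bit Poly/ML instead of the default x86_64_32 mode
ML_PLATFORM="$ISABELLE_PLATFORM64"
ML_OPTIONS="--minheap 1000"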

Which base logic image are you typically using for editing Isabelle/LLVM?

Makarius

view this post on Zulip Email Gateway (Aug 23 2022 at 08:55):

From: Peter Lammich <lammich@in.tum.de>

Which base logic image are you typically using for editing Isabelle/LLVM?

Typically, afp/Refine_Monadic.

When the error happened, I had only loaded vcg/LLVM_VCG_Main and
a few other theories I'm currently working on, from the default HOL
image.

Peter


view this post on Zulip Email Gateway (Aug 23 2022 at 08:57):

From: Peter Lammich <lammich@in.tum.de>
Hi,

I got the error again, and this time I investigated it a bit. Linux
invoked the OOM killer on the poly process! I attached the log; the last 3
messages show which process was selected for killing. There were two
Isabelle instances running, and the OOM killer picked one.

Can I somehow tell poly to do more aggressive garbage collection,
BEFORE it gets OOM-killed?

What changed in the poly configuration? I never had those problems with
Isabelle2019, and sometimes had up to 3 instances running in parallel.
How can I find out if I changed something? A diff between the
.isabelle folders reveals nothing suspicious to me.

Best, Peter

------------------
Apr 16 23:58:12 cspool450 org.gnome.Shell.desktop[3621]: Memory pressure relief: Total: res = 34840576/34803712/-36864, res+swap = 28217344/28217344/0
Apr 16 23:58:17 cspool450 kernel: PacerThread invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=300
Apr 16 23:58:17 cspool450 kernel: CPU: 6 PID: 28464 Comm: PacerThread Not tainted 5.3.0-46-generic #38~18.04.1-Ubuntu
Apr 16 23:58:17 cspool450 kernel: Hardware name: Dell Inc. Precision 3540/0M14W7, BIOS 1.6.5 12/26/2019
Apr 16 23:58:17 cspool450 kernel: Call Trace:
Apr 16 23:58:17 cspool450 kernel:  dump_stack+0x6d/0x95
Apr 16 23:58:17 cspool450 kernel:  dump_header+0x4f/0x200
Apr 16 23:58:17 cspool450 kernel:  oom_kill_process+0xe6/0x120
Apr 16 23:58:17 cspool450 kernel:  out_of_memory+0x109/0x510
Apr 16 23:58:17 cspool450 kernel:  __alloc_pages_slowpath+0xad1/0xe10
Apr 16 23:58:17 cspool450 kernel:  ? __switch_to_asm+0x34/0x70
Apr 16 23:58:17 cspool450 kernel:  __alloc_pages_nodemask+0x2cd/0x320
Apr 16 23:58:17 cspool450 kernel:  alloc_pages_current+0x6a/0xe0
Apr 16 23:58:17 cspool450 kernel:  __page_cache_alloc+0x6a/0xa0
Apr 16 23:58:17 cspool450 kernel:  pagecache_get_page+0x9c/0x2b0
Apr 16 23:58:17 cspool450 kernel:  filemap_fault+0x3a2/0xb00
Apr 16 23:58:17 cspool450 kernel:  ? unlock_page_memcg+0x12/0x20
Apr 16 23:58:17 cspool450 kernel:  ? page_add_file_rmap+0x5e/0x150
Apr 16 23:58:17 cspool450 kernel:  ? alloc_set_pte+0x52e/0x5f0
Apr 16 23:58:17 cspool450 kernel:  ? filemap_map_pages+0x18f/0x380
Apr 16 23:58:17 cspool450 kernel:  ext4_filemap_fault+0x31/0x44
Apr 16 23:58:17 cspool450 kernel:  __do_fault+0x57/0x117
Apr 16 23:58:17 cspool450 kernel:  __handle_mm_fault+0xda1/0x1230
Apr 16 23:58:17 cspool450 kernel:  handle_mm_fault+0xcb/0x210
Apr 16 23:58:17 cspool450 kernel:  __do_page_fault+0x2a1/0x4d0
Apr 16 23:58:17 cspool450 kernel:  do_page_fault+0x2c/0xe0
Apr 16 23:58:17 cspool450 kernel:  page_fault+0x34/0x40
Apr 16 23:58:17 cspool450 kernel: RIP: 0033:0x5566f01d66a2
Apr 16 23:58:17 cspool450 kernel: Code: Bad RIP value.
Apr 16 23:58:17 cspool450 kernel: RSP: 002b:00007fb61fad6790 EFLAGS: 00010206
Apr 16 23:58:17 cspool450 kernel: RAX: 00005566eba0ba18 RBX: 7fffffffffffffff RCX: 00005566eba0ba18
Apr 16 23:58:17 cspool450 kernel: RDX: 0000132de75dc948 RSI: 000000000001d4ae RDI: 0000132de75dc8f0
Apr 16 23:58:17 cspool450 kernel: RBP: 00007fb61fad67b0 R08: 0000000000022580 R09: 00007fb61fad6630
Apr 16 23:58:17 cspool450 kernel: R10: 00007fb61fad65f0 R11: 0000000000000206 R12: 0000132de75dc8f0
Apr 16 23:58:17 cspool450 kernel: R13: 00000000000495d4 R14: 0000132de75dc8f0 R15: 00000020c0b47fed
Apr 16 23:58:17 cspool450 kernel: Mem-Info:
Apr 16 23:58:17 cspool450 kernel: active_anon:7390182 inactive_anon:571000 isolated_anon:0 active_file:276 inactive_file:1 isolated_file:0 unevictable:25173 dirty:1 writeback:0 unstable:0 slab_reclaimable:25025 slab_unreclaimable:37942 mapped:238137 shmem:516419 pagetables:38113 bounce:0 free:50911 free_pcp:2186 free_cma:0
Apr 16 23:58:17 cspool450 kernel: Node 0 active_anon:29560728kB inactive_anon:2284000kB active_file:1104kB inactive_file:4kB unevictable:100692kB isolated(anon):0kB isolated(file):0kB mapped:952548kB dirty:4kB writeback:0kB shmem:2065676kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Apr 16 23:58:17 cspool450 kernel: Node 0 DMA free:15900kB min:32kB low:44kB high:56kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15900kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr 16 23:58:17 cspool450 kernel: lowmem_reserve[]: 0 1773 31843 31843 31843
Apr 16 23:58:17 cspool450 kernel: Node 0 DMA32 free:124016kB min:3760kB low:5576kB high:7392kB active_anon:1392156kB inactive_anon:355744kB active_file:76kB inactive_file:0kB unevictable:3132kB writepending:0kB present:1948512kB managed:1882692kB mlocked:0kB kernel_stack:0kB pagetables:316kB bounce:0kB free_pcp:112kB local_pcp:4kB free_cma:0kB
Apr 16 23:58:17 cspool450 kernel: lowmem_reserve[]: 0 0 30069 30069 30069
Apr 16 23:58:17 cspool450 kernel: Node 0 Normal free:63728kB min:63788kB low:94576kB high:125364kB active_anon:28168572kB inactive_anon:1928256kB active_file:484kB inactive_file:612kB unevictable:97560kB writepending:4kB present:31399936kB managed:30799340kB mlocked:264kB kernel_stack:24384kB pagetables:152136kB bounce:0kB free_pcp:8632kB local_pcp:1032kB free_cma:0kB
Apr 16 23:58:17 cspool450 kernel: lowmem_reserve[]: 0 0 0 0 0
Apr 16 23:58:17 cspool450 kernel: Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15900kB
Apr 16 23:58:17 cspool450 kernel: Node 0 DMA32: 348*4kB (UME) 313*8kB (UME) 265*16kB (UME) 324*32kB (UME) 334*64kB (UME) 288*128kB (UME) 89*256kB (UME) 45*512kB (UME) 2*1024kB (UM) 0*2048kB 0*4096kB = 124616kB
Apr 16 23:58:17 cspool450 kernel: Node 0 Normal: 5111*4kB (UM) 1671*8kB (UME) 490*16kB (UME) 298*32kB (UME) 219*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 65204kB
Apr 16 23:58:17 cspool450 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Apr 16 23:58:17 cspool450 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 16 23:58:17 cspool450 kernel: 672338 total pagecache pages
Apr 16 23:58:17 cspool450 kernel: 155339 pages in swap cache
Apr 16 23:58:17 cspool450 kernel: Swap cache stats: add 1433360, delete 1278008, find 462505/467558
Apr 16 23:58:17 cspool450 kernel: Free swap = 0kB
Apr 16 23:58:17 cspool450 kernel: Total swap = 999420kB
Apr 16 23:58:17 cspool450 kernel: 8341110 pages RAM
Apr 16 23:58:17 cspool450 kernel: 0 pages HighMem/MovableOnly
Apr 16 23:58:17 cspool450 kernel: 166627 pages reserved
Apr 16 23:58:17 cspool450 kernel: 0 pages cma reserved
Apr 16 23:58:17 cspool450 kernel: 0 pages hwpoisoned
Apr 16 23:58:17 cspool450 kernel: Tasks state (memory values in pages):
Apr 16 23:58:17 cspool450 kernel: [ pid ]   uid  tgid total_vm    rss pgtables_bytes swapents oom_score_adj name
Apr 16 23:58:17 cspool450 kernel: [  564]     0   564    45374    349   380928        0     0 systemd-journal
Apr 16 23:58:17 cspool450 kernel: [  574]     0   574    26475     56    94208        0     0 lvmetad
Apr 16 23:58:17 cspool450 kernel: [  582]     0   582    11752    513   122880        0 -1000 systemd-udevd
Apr 16 23:58:17 cspool450 kernel: [  889]   101   889    17827    318   172032        0     0 systemd-resolve
Apr 16 23:58:17 cspool450 kernel: [  891] 62583   891    36528    141   192512        0     0 systemd-timesyn
Apr 16 23:58:17 cspool450 kernel: [  988]     0   988    17708    232   180224        0     0 systemd-logind
Apr 16 23:58:17 cspool450 kernel: [  990]     0   990    74256    194   208896        0     0 boltd
Apr 16 23:58:17 cspool450 kernel: [  997]     0   997   108599    396   356352        0     0 ModemManager
Apr 16 23:58:17 cspool450 kernel: [  998]     0   998    44408   1990   241664        0     0 networkd-dispat
Apr 16 23:58:17 cspool450 kernel: [  999]     0   999    27637     95   114688        0     0 irqbalance
Apr 16 23:58:17 cspool450 kernel: [ 1001]     0  1001    46749    209   196608        0     0 thermald
Apr 16 23:58:17 cspool450 kernel: [ 1006]     0  1006   125919    656   348160        0     0 udisksd
Apr 16 23:58:17 cspool450 kernel: [ 1017]   116  1017    11815     86   126976        0     0 avahi-daemon
Apr 16 23:58:17 cspool450 kernel: [ 1022]   102  1022    65758    325   167936        0     0 rsyslogd
Apr 16 23:58:17 cspool450 kernel: [ 1025]     0  1025     9123    102   114688        0     0 bluetoothd
Apr 16 23:58:17 cspool450 kernel: [ 1031]     0  1031     9606     74   118784        0     0 cron
Apr 16 23:58:17 cspool450 kernel: [ 1035]   103  1035    13205    685   143360        0  -900 dbus-daemon
Apr 16 23:58:17 cspool450 kernel: [ 1047]   116  1047    11768     86   122880        0     0 avahi-daemon
Apr 16 23:58:17 cspool450 kernel: [ 1096]     0  1096    11431    280   126976        0     0 wpa_supplicant
Apr 16 23:58:17 cspool450 kernel: [ 1099]     0  1099    73693    262   208896        0     0 accounts-daemon
Apr 16 23:58:17 cspool450 kernel: [ 1102]     0  1102     1137     16    53248        0     0 acpid
Apr 16 23:58:17 cspool450 kernel: [ 1105]     0  1105   160255   1235   466944        0     0 NetworkManager
Apr 16 23:58:17 cspool450 kernel: [ 1111]     0  1111   548623   2902   425984        0  -900 snapd
Apr 16 23:58:17 cspool450 kernel: [ 1390]     0  1390    75962    891   237568        0     0 polkitd
Apr 16 23:58:17 cspool450 kernel: [ 1643]   117  1643    81510   1538   266240        0     0 colord
Apr 16 23:58:17 cspool450 kernel: [ 1690]     0  1690    48585   1972    26624
[message truncated]

view this post on Zulip Email Gateway (Aug 23 2022 at 08:57):

From: Makarius <makarius@sketis.net>
On 17/04/2020 01:12, Peter Lammich wrote:

I got the error again, and this time I investigated it a bit. Linux invoked
the OOM killer on the poly process! I attached the log; the last 3 messages
show which process was selected for killing. There were two Isabelle
instances running, and the OOM killer picked one.

I did not know about the OOM killer before, but it seems to cause problems
occasionally, e.g. see
https://serverfault.com/questions/101916/turn-off-the-linux-oom-killer-by-default?rq=1
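
A generic Linux knob in that direction (a sketch only; the PID below is a
placeholder) is the per-process OOM score: lowering /proc/<pid>/oom_score_adj
makes the kernel less eager to pick that process as the victim.

$ pgrep poly          # find the PID of the running poly process (example output)
12345
$ echo -500 | sudo tee /proc/12345/oom_score_adj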

Can I somehow tell poly to do more aggressive garbage collection, BEFORE it
gets OOM killed?

You need to ask David Matthews about that.

Maybe you just need more swap space.
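
The log above shows only about 1 GB of swap, all of it used ("Free swap = 0kB").
Adding a swap file is a standard recipe on ext4 (a generic sketch; the size is
arbitrary):

$ sudo fallocate -l 16G /swapfile    # reserve a 16 GB file
$ sudo chmod 600 /swapfile           # restrict permissions as required by swapon
$ sudo mkswap /swapfile              # format it as swap space
$ sudo swapon /swapfile              # enable it immediately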

Or you should limit the poly process explicitly, e.g. in
$ISABELLE_HOME_USER/etc/settings:

ML_OPTIONS="--minheap 1500 --maxheap 8g"

Makarius


Last updated: Nov 21 2024 at 12:39 UTC