Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Algebraic_Numbers


view this post on Zulip Email Gateway (Aug 22 2022 at 13:28):

From: Lars Hupel <hupel@in.tum.de>
Hi René,

as far as I can tell, there is no easy solution just yet. The
medium-term goal is to automate publishing of artifacts from stable AFP
using Jenkins. This will mean a publicly-browsable directory of all
PDFs, HTMLs, tars, ...

Cheers
Lars

view this post on Zulip Email Gateway (Aug 22 2022 at 13:29):

From: Gerwin Klein <Gerwin.Klein@nicta.com.au>
The solution is to do it the other way around.

Algebraic_Numbers should be the session with everything, generating all documents. Then you add a separate small session without the tests that others can use if they need to conserve resources. This one doesn’t even have to produce a document at all.

Cheers,
Gerwin


The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

view this post on Zulip Email Gateway (Aug 22 2022 at 13:30):

From: Lars Hupel <hupel@in.tum.de>

Thanks for this easy solution. I just pushed a corresponding change
with new target: Algebraic_Number_Lib

Because of this we now have the unfortunate situation that a significant
part of Algebraic_Numbers is now being built twice, without any actual
need to do so.

Cheers
Lars

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Makarius <makarius@sketis.net>
Indeed. When there are bulky sessions, one should figure out how to make
them less bulky, and not just duplicate bulkiness.

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Gerwin Klein <Gerwin.Klein@nicta.com.au>
I thought the point was that the part that is duplicated is not bulky. If it is, we should think more, but I don’t have any better solutions ready.

For comparison, we’re doing this already: since we can’t merge multiple parent sessions, all entries that re-use more than one other entry will currently definitely rebuild all of at least one parent anyway. Given that, and the effort I’m spending periodically to make sure the AFP does not rebuild HOL-Library 100 times each day, I don’t think the light part of one entry being rebuilt is a problem.

Cheers,
Gerwin


The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Makarius <makarius@sketis.net>
I think the worries of Lars Hupel are based on the rather weak
Isabelle/Jenkins testing hardware. My own worries are based on general
observations that Algebraic_Numbers could be close to the infamous
"invisible concrete wall" (without detailed measurements yet).

Here are some numbers on the fastest test machine that I have access to
(i.e. my own workstation):

Isabelle/bcf4828bb125 + AFP/83c0fef83e3f
x86-linux, 6 threads

Finished Algebraic_Number_Lib (0:00:48 elapsed time, 0:04:12 cpu time,
factor 5.22) ## without heap image
Finished Algebraic_Numbers (0:04:48 elapsed time, 0:22:56 cpu time,
factor 4.78) ## based on Pre_Algebraic_Numbers

Now the same, but Algebraic_Numbers is based on Algebraic_Number_Lib,
i.e. the first session builds a heap and the second only runs the bulky
tests:

Finished Algebraic_Number_Lib (0:01:17 elapsed time, 0:05:28 cpu time,
factor 4.23)
Finished Algebraic_Numbers (0:03:54 elapsed time, 0:17:54 cpu time,
factor 4.57)

This makes some difference in overall CPU time usage. It is reminiscent
of ancient times, when Jinja (not JinjaThreads) had to be split into two
sessions: the full compactification of the first heap makes the second
one more comfortable. Although the effect today is not that important,
due to online heap sharing.

There are already AFP entries with more than one session, and also
add-on documents. Maybe that is already sufficient to publish the tests
separately, or maybe not. Lars Hupel might be able to say that.

Stepping back further, the question is why Algebraic_Number_Tests
requires so much persistent heap memory that sessions built on it might
get into problems. Maybe there is a deeper resource problem behind it.

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Makarius <makarius@sketis.net>
BTW, for years I am trying to get to a situation where sessions and heap
images are independent (also for document preparation). That would allow
to run many AFP sessions within the same Poly/ML process, providing a
chance that the big number of small sessions uses much less resources.

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Gerwin Klein <Gerwin.Klein@nicta.com.au>
That would be very useful!

Cheers,
Gerwin


The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Lars Hupel <hupel@in.tum.de>

I think the worries of Lars Hupel are based on the rather weak
Isabelle/Jenkins testing hardware. My own worries are based on general
observations that Algebraic_Numbers could be close to the infamous
"invisible concrete wall" (without detailed measurements yet).

Better hardware is on its way.

There are already AFP entries with more than one session, and also
add-on documents. Maybe that is already sufficient to publish the tests
separately, or maybe not. Lars Hupel might be able to say that.

Presently, all documents for the development version are built, but not
published. The reason for that is that it turns out the disk space
requirements to retain all HTML and PDF files is excessive (dozens of GB
amassed during two or three weeks). I have a changeset to the build
server in my pipeline which will retain the latest set of documents.

(NB, build logs are retained and will be retained forever; they are
relatively small due to compression.)

The remaining question is how these documents will be presented in AFP.
Currently only the documents of the main sessions are linked from the
entry pages.

Stepping back further, the question is why Algebraic_Number_Tests
requires so much persistent heap memory that sessions built on it might
get into problems. Maybe there is a deeper resource problem behind it.

That is an excellent question. I still can't quite put my finger on it
as to why that single theory forces us to run AFP tests in bulky 64 bit
mode.

Cheers
Lars

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Makarius <makarius@sketis.net>
I did not know that the situation is that bad.

An important point if AFP tests is to see if everything still works
nicely in the small x86 heap model. That is one aspect of the "invisible
concrete wall".

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 13:31):

From: Gerwin Klein <Gerwin.Klein@nicta.com.au>

Presently, all documents for the development version are built, but not
published. The reason for that is that it turns out the disk space
requirements to retain all HTML and PDF files is excessive (dozens of GB
amassed during two or three weeks). I have a changeset to the build
server in my pipeline which will retain the latest set of documents.

That is a good idea, I don’t think we need to keep all artefacts.

(NB, build logs are retained and will be retained forever; they are
relatively small due to compression.)

Good, these are worth keeping, at least for a moderately longer time span. They enable statistics etc.

The remaining question is how these documents will be presented in AFP.
Currently only the documents of the main sessions are linked from the
entry pages.

Extra sessions and documents should be very rare, i.e. almost all AFP sessions should have exactly one document, which is the one that is linked.

Is this not the case or am I misunderstanding the question?

Stepping back further, the question is why Algebraic_Number_Tests
requires so much persistent heap memory that sessions built on it might
get into problems. Maybe there is a deeper resource problem behind it.

That is an excellent question. I still can't quite put my finger on it
as to why that single theory forces us to run AFP tests in bulky 64 bit
mode.

Does sound worth investigating, I have no direct ideas either.

Maybe it is indeed time to officially enable multiple sessions for AFP entries. That will take some infrastructure work, but it would give us more flexibility in treating things like this, for instance with the technique Makarius mentioned, which we used for Jinja years back.

Cheers,
Gerwin


The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

view this post on Zulip Email Gateway (Aug 22 2022 at 13:32):

From: Makarius <makarius@sketis.net>
The split into two sessions had mainly the effect to enforce a full heap
compaction (and reload). In contempory Poly/ML this can be done at
runtime, e.g. by putting this into a suitable place (e.g.
Algebraic_Number_Tests.thy):

ML_command %invisible \<open>ML_Heap.share_common_data ()\<close>

Here are some measurements for Isabelle/d8884c111bca + AFP/44c49a721891
on x86-linux with 6 threads.

(0) standard setup:
Finished Algebraic_Numbers (0:04:21 elapsed time, 0:20:03 cpu time,
factor 4.60)

(1) ML_command share_common_data:
Finished Algebraic_Numbers (0:05:23 elapsed time, 0:23:52 cpu time,
factor 4.43)

(2) Algebraic_Numbers based on Algebraic_Number_Lib:
Finished Algebraic_Number_Lib (0:01:18 elapsed time, 0:05:31 cpu time,
factor 4.21)
Finished Algebraic_Numbers (0:04:07 elapsed time, 0:17:55 cpu time,
factor 4.34)

The times for (1) are approximately equal to the sums of times for (2).

Maybe Lars can try share_common_data on the AFP test hardware to see if
it still happens to work on 32bit architecture.

Makarius

view this post on Zulip Email Gateway (Aug 22 2022 at 13:36):

From: Manuel Eberl <eberlm@in.tum.de>
The Algebraic_Numbers theory is notoriously big and resource-intensive.
I have a development that imports Algebraic_Numbers and then adds some
small things on top of it and exports code. Oddly, this then causes
stack overflows unless I switch to 64 bit Isabelle.

I noticed that a huge chunk of the resource hunger of that development
actually comes from Algebraic_Number_Tests.thy, which one does not even
need in order to use the development.

I therefore propose removing this theory from the Algebraic_Numbers
session and introducing a separate Algebraic_Number_Tests session for
it. I tried this locally and this is theh difference that it made:

Before: Finished Algebraic_Numbers (0:04:33 elapsed time, 0:15:05 cpu
time, factor 3.31)
After: Finished Algebraic_Numbers (0:01:18 elapsed time, 0:04:05 cpu
time, factor 3.12)

Now, I don't know if and by how much this reduces the memory
requirements and I have no idea how to measure this reliably. In any
case, I do not see any drawbacks of doing this.

Any opinions?

Cheers,

Manuel

view this post on Zulip Email Gateway (Aug 22 2022 at 13:37):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
IMHO it is good style that things on which others are supposed to build
upon contain only libraries, not tests. So, I would suggest to go ahead.

Cheers,
Florian Haftmann
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 13:39):

From: "Thiemann, Rene" <Rene.Thiemann@uibk.ac.at>
Dear all,

it is perfectly fine to me, to create a separate target in the ROOT file without the tests,
e.g., as indicated below. However, I’m unsure how the document preparation will work in this
way. Somehow both entries contribute to the final PDF, i.e., both parts should be displayed on
the AFP-website.

Does anybody know an easy solution to this?

Cheers,
René

session Algebraic_Numbers_With_Tests (AFP) = Algebraic_Numbers +
description {* Testing Algebraic Number *}
options [timeout = 1200]
theories
"Algebraic_Number_Tests"
document_files
"root.bib"
"root.tex"

session Algebraic_Numbers (AFP) = Pre_Algebraic_Numbers +
description {* Algebraic Numbers *}
options [timeout = 600]
theories
"Missing_Multiset"
"Missing_List"
"Compare_Complex"
"Improved_Code_Equations"
"Precomputation"
"Is_Rat_To_Rat"
"Ring_Hom_Poly"
"Order_Polynomial"
"Binary_Exponentiation"
"Explicit_Roots"
"Rational_Root_Test"
"Kronecker_Factorization"
"Square_Free_Factorization"
"Berlekamp_Hensel_Factorization"
"Rational_Factorization"
"Real_Factorization"
"Show_Real_Approx“
"Show_Real_Precise"
document_files
"root.bib"
"root.tex"
signature.asc


Last updated: Nov 21 2024 at 12:39 UTC