From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Dear power users of the bit and word libraries,
I have finished polishing the bit and word material as envisaged before
the next Isabelle release.
The changes are by no means as invasive as for the last Isabelle
release, but you might check your existing applications against before
the upcoming release – there is still time for corrections and improvements.
Cheers,
Florian
OpenPGP_signature
From: Gerwin Klein <kleing@unsw.edu.au>
On 11 Aug 2021, at 01:34, Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de> wrote:
Dear power users of the bit and word libraries,
I have finished polishing the bit and word material as envisaged before
the next Isabelle release.
This sounds good, but I don't quite remember any more what exactly that was, can you give a summary?
The changes are by no means as invasive as for the last Isabelle
release, but you might check your existing applications against before
the upcoming release – there is still time for corrections and improvements.
As you know, it's a bit of work to check that for l4v, esp since it is still on Isabelle2020 (no real issues, mostly organisational upheavals have taken up all available attention). It will take a while, but it looks like it will break in the usually hopefully small ways.
Do you have a list of changes that users will likely ned to apply?
Cheers,
Gerwin
signature.asc
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Gerwin,
This sounds good, but I don't quite remember any more what exactly that was, can you give a summary?
As you know, it's a bit of work to check that for l4v, esp since it is still on Isabelle2020 (no real issues, mostly organisational upheavals have taken up all available attention). It will take a while, but it looks like it will break in the usually hopefully small ways.
Do you have a list of changes that users will likely ned to apply?
it’s in the NEWS
Infix syntax for bit operations AND, OR, XOR is now organized in
bundle bit_operations_syntax. INCOMPATIBILITY.Bit operations set_bit, unset_bit and flip_bit are now class
operations. INCOMPATIBILITY.Theory Bit_Operations is now part of HOL-Main. Minor INCOMPATIBILITY.
Simplified class hierarchy for bit operations: bit operations reside
in classes (semi)ring_bit_operations, class semiring_bit_shifts is
gone.Abbreviation "max_word" has been moved to session Word_Lib in the AFP,
as also have constants "shiftl1", "shiftr1", "sshiftr1", "bshiftr1",
"setBit", "clearBit". See there further the changelog in theory Guide.
INCOMPATIBILITY.
and as a separate section in the »Guide« in session Word_Lib:
➧[Changes since AFP 2021] ~
:black_small_square: Theory \<^theory>‹Word_Lib.Ancient_Numeral› is no part of \<^theory>‹Word_Lib.Word_Lib_Sumo›
any longer.:black_small_square: Infix syntax for \<^term>‹(AND)›, \<^term>‹(OR)›, \<^term>‹(XOR)› organized in
syntax bundle \<^bundle>‹bit_operations_syntax›.:black_small_square: Abbreviation \<^abbrev>‹max_word› moved from distribution into theory
\<^theory>‹Word_Lib.Legacy_Aliases›.:black_small_square: Operation \<^const>‹test_bit› replaced by input abbreviation \<^abbrev>‹test_bit›.
:black_small_square: Operation \<^const>‹shiftl› replaced by abbreviation \<^abbrev>‹shiftl›.
:black_small_square: Operation \<^const>‹shiftr› replaced by abbreviation \<^abbrev>‹shiftr›.
:black_small_square: Operation \<^const>‹sshiftr› replaced by abbreviation \<^abbrev>‹sshiftr›.
:black_small_square: Abbreviations \<^abbrev>‹bin_nth›, \<^abbrev>‹bin_last›, \<^abbrev>‹bin_rest›,
\<^abbrev>‹bintrunc›, \<^abbrev>‹sbintrunc›, \<^abbrev>‹norm_sint›,
\<^abbrev>‹bin_cat› moved into theory \<^theory>‹Word_Lib.Legacy_Aliases›.:black_small_square: Operations \<^abbrev>‹shiftl1›, \<^abbrev>‹shiftr1›, \<^abbrev>‹sshiftr1›, \<^abbrev>‹bshiftr1›,
\<^abbrev>‹setBit›, \<^abbrev>‹clearBit› moved from distribution into theory
\<^theory>‹Word_Lib.Legacy_Aliases› and replaced by input abbreviations.:black_small_square: Operation \<^const>‹complement› replaced by input abbreviation \<^abbrev>‹complement›.
Hope this helps,
Florian
OpenPGP_signature
From: Makarius <makarius@sketis.net>
Side-remark: 2021 is a year of 2 Isabelle releases. So maybe you manage to
catch up for Isabelle2021-1 (December 2021). The release process will start
approx. 01-Nov-2021 and finish 15-Dec-2021.
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Makarius <makarius@sketis.net>
Is this failure a consequence of it? (Seen in current Isabelle/0f051404f487 +
AFP/bd6c4c7c76ec)
Native_Word FAILED
(see also
/Users/wenzelm/.isabelle/heaps/polyml-5.8.2_x86_64_32-darwin/log/Native_Word)
*** Type unification failed: Clash of types "_ ⇒ _" and "uint64"
*** Type error in application: operator not of function type
*** Operator: x :: uint64
*** Operand: AND :: ??'a
*** At command "lemma" (line 25 of "$AFP/Native_Word/Native_Word_Test_GHC.thy")
*** At command "test_code" (line 13 of
"$AFP/Native_Word/Native_Word_Test_GHC.thy")
*** Type unification failed: Clash of types "_ ⇒ _" and "uint64"
*** Type error in application: operator not of function type
*** Operator: x :: uint64
*** Operand: AND :: ??'a
*** At command "lemma" (line 25 of "$AFP/Native_Word/Native_Word_Test_GHC.thy")
Unfinished session(s): Native_Word
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Makarius <makarius@sketis.net>
OK. It looks fine.
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Gerwin Klein <kleing@unsw.edu.au>
Having finally managed to get some time to look over the changes to Word_Lib, I have strong concerns about this development and how it was made.
We only finished updating to Isabelle2021 a week ago -- it was major amount of work, mostly due to Word_Lib change, but I could see the value of the generalisations of the last round.
I do not see the value in most of these current changes. They have neither been articulated, nor are they obvious to me. In fact, to my mind the opposite is true for several of the changes.
For instance, who on earth wants to read something like "is_aligned (push_bit m k) m" instead of "is_aligned (k << m) m"? It is entirely unacceptable to make such a change in an AFP library without even consulting the original authors. I would never write this. Please revert it. The syntax was introduced for good reasons. It is fine to make it a bundle to not pollute a global syntax space, it is not fine to change original material that was written with a purpose.
It is absolutely not fine to produce commits like "dropped junk" which removes a theory that was critical to how this library is used after in previous commits having made it unusable, entirely missing the point of its existence. Please revert that entire series of commits and reproduce the old setup.
Further, a whole lot of constants have been moved around. Why? What does this improve? This can cause major amounts of renaming work for no gain to anyone. Please either produce a convincing argument for the benefit of these moves or revert them.
Changing definitions such as shiftl1 to input abbreviations is likely to cause major breakage because term structure changes significantly. Again, what rationale has been considered, what cost/benefit analysis led to this decision? I strongly prefer definitions for these concepts. Please articulate these before you make such changes, and either convince the authors of the benefits or revert the change.
It is not acceptable to make such sweeping changes to other people's work without getting the author's consent first. As you very well know 90% of the use of this library is outside the AFP, and Word_Lib is the interface to that work, as well as an ingredient of a number of automated tools.
Gerwin
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Gerwin and all reading this thread,
I went through the repositories’ history and my memory to recap what
happened here in the post-Isabelle2020 and post-Isabelle2021 time.
Overall, the changes at a glance are guided by the following aims,
quoted from the write-up at
<https://isabelle.in.tum.de/~haftmann/bits_and_word/>:
Theory Word contains the word type and its operations proper, located
in session HOL-Library.
Additional material of somehow unclear status in relevance is
modularized into separate theories and moved to session Word_Lib in the AFP.
The historically grown Word_Lib is minimally structured and augmented
with a guide.
The ultimate goal is that future development can happen mostly in the AFP.
Some of that work has already happened before Isabelle2021; what
happened for Isabelle2021-1 as far as I can see from the history:
Some disentanglement of theory dependencies to enable users to use the
library more selectively (»Word_Lib is minimally structured«) – this
affects genuine Word_Lib material .
Movement of material from the distribution / HOL-Library.Word to
Word_Lib (»moved to session Word_Lib in the AFP«)
Restructuring of that (originally non-Word_Lib (!)) material
(»modularized into separate theories«).
(I emphasize here that most of the changes seen in the AFP actually
originate from the Bit/Word material in the distribution, where the
cumulative reworking over two releases gives me kind of responsibility
and »legitimacy« for proactive developments. Coming back to the original
aims: »The ultimate goal is that future development can happen mostly in
the AFP.« Ie. without subtle and hard to maintain dependencies between
AFP and distribution. Note also that the changes towards Isabelle2021
are far more massive than those afterwards.)
If these intentions and measures by itself infringe your authorship or
the intended purpose or usability of Word_Lib, this is a fundamental
issue and we should make a stop here and find a way to work that out in
particular.
Otherwise I would appreciate to get to the particular problems with the
current matter of affairs. I thankfully accept any hints on slips,
omissions, misperceptions, doubtful quality, theory and tool break-downs
both in the visible and the non-visible universe – be it due to changes
or not. In that manner both authors and non-authors of Word_Lib in
private conversation have already given feedback and also contributions.
Generally: If there are any repositories I should have a look at for
some kind of assessment, please let me know. Also if there is particular
experience of the migration to Isabelle2021 I should take into
consideration.
For instance, who on earth wants to read something like "is_aligned (push_bit m k) m" instead of "is_aligned (k << m) m"? . The syntax was introduced for good reasons. It is fine to make it a bundle to not pollute a global syntax space, it is not fine to change original material that was written with a purpose.
If your point is to to apply the ‹_ << _› syntax etc. thoroughly and
consistently among Word_Lib theories, I am happy to look after these slips.
It is absolutely not fine to produce commits like "dropped junk" which removes a theory that was critical to how this library is used after in previous commits having made it unusable, entirely missing the point of its existence. Please revert that entire series of commits and reproduce the old setup.
After a second study of history I notice that I messed this up already
in the AFP release for Isabelle2021; the change after that point of
reference was indeed just a cleanup of dead theories. Looks I should
just revert to pre-Isabelle2021 there?
Further, a whole lot of constants have been moved around. Why? What does this improve? This can cause major amounts of renaming work for no gain to anyone. Please either produce a convincing argument for the benefit of these moves or revert them
These moves AFAICS originate from the distribution (see above), where
the elimination of the ancient HOL-Word session for Isabelle2021 already
had massive impact on internal names for operations. Is your concern
about renaming work just abstract or based on concrete observations? I
am asking since if any tool survived the Isabelle2021 movements, there
is little reason to assume that it won’t survive the current state of
affairs – Isabelle is far more robust against such things as, say, 15
years ago.
Changing definitions such as shiftl1 to input abbreviations is likely to cause major breakage because term structure changes significantly.
The shift*1 Operations have been auxiliary definitions in the
distribution to bootstrap the bit shifts operations, similarly to
iszero, equipped only with a minimal set of lemmas. If one tool out
there relies on it as explicit constant i. e. due to pattern matching, I
won’t shed a tear to keep it as constant.
(Aside, the primitive definitions then would be maybe: shiftl1 a = (a <<
1), shiftr1 a = (a >> 1), sshiftr1 a = (a >>> 1)).
Cheers,
Florian
OpenPGP_signature
From: Gerwin Klein <kleing@unsw.edu.au>
On 6 Oct 2021, at 23:30, Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de> wrote:
Theory Word contains the word type and its operations proper, located
in session HOL-Library.Additional material of somehow unclear status in relevance is
modularized into separate theories and moved to session Word_Lib in the AFP.The historically grown Word_Lib is minimally structured and augmented
with a guide.The ultimate goal is that future development can happen mostly in the AFP.
So far that is as we discussed back then and I continue to agree with it. The problems are in the details.
Some disentanglement of theory dependencies to enable users to use the
library more selectively (»Word_Lib is minimally structured«) – this
affects genuine Word_Lib material .Movement of material from the distribution / HOL-Library.Word to
Word_Lib (»moved to session Word_Lib in the AFP«)Restructuring of that (originally non-Word_Lib (!)) material
(»modularized into separate theories«).(I emphasize here that most of the changes seen in the AFP actually
originate from the Bit/Word material in the distribution, where the
cumulative reworking over two releases gives me kind of responsibility
and »legitimacy« for proactive developments. Coming back to the original
aims: »The ultimate goal is that future development can happen mostly in
the AFP.« Ie. without subtle and hard to maintain dependencies between
AFP and distribution. Note also that the changes towards Isabelle2021
are far more massive than those afterwards.)
This is the thing: I don't object to the massive changes for Isabelle2021, at least not the ones we discussed and looked at together. I saw (and still see) the value in the reorganisation and generalisation you did there. The problem are the subsequent "cleanup" etc commits, which to me did not actually clean much up but have managed to leave Word_Lib unusable (it'll be fine with the changes I wanted). If we had gone through the same cycle of discussion beforehand I could have pointed these out very easily.
I can see where our misunderstand originated and I'll get over my annoyance. I'm glad I had the time now to look at it before we have to freeze this for the release.
Generally: If there are any repositories I should have a look at for
some kind of assessment, please let me know. Also if there is particular
experience of the migration to Isabelle2021 I should take into
consideration.
There is some relevant experience from Isabelle2021. As a rough order of magnitude, it took about 2 full person months to do the complete proof update for l4v, which finished a bit more than a week ago now. 2 person months is about 1 person month higher than a usual Isabelle update. Not quite all, but most of the changes were due to Word_Lib.
You did an initial foray into this proof update for one architecture (ARM, 32 bit) after the first half of the Isabelle2021 changes. The result of that foray was that those changes were relatively benign, which is why I was surprised by the much higher effort later. I think the main fallout was from a few small changes in the second half of the Isabelle2021 update. What is relevant for the next update are these:
things started breaking immediately for 64-bit architectures, because the Sumo concept alone does not quite work. It is not enough to make all lemmas for all word lengths available. The point of the Word_SetupXX theories is to additionally establish a set of common names that refer to the default word type of the program (basically, whatever "unsigned int" is in C for that architecture, and what the register width is for the machine). Having these names common means that the same proof text can work on different concrete types. This is different to actually generic lemmas that work for any word length or that need to resolve preconditions on the word length before they can be applied (the latter can be solved in theories like Word8 etc). It is not the prettiest form of genericity, but it is crucial for not having to duplicate thousands of lemmas that for other reasons do need to refer to a concrete word length (which is known in the proof state, but unknown to the proof text). Ultimately this is comes from C, which as painful as it is, works on exactly that principle that the same type name can refer to different representations, depending on architecture.
the default simpset setup changed, it no longer reliably normalises ground terms with numerals. This caused a bunch of proof changes, hunting for which rules are needed etc. My reading of NEWS it that this is getting worse with NOT terms not being normalised any more. This potentially makes sense for int/nat, but I would suggest to keep them for Word (and to make sure the rest of the numeral simplification actually covers all cases -- usually 0, 1, Suc 0 are the corner cases to look out for).
the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
punning max_word with -1 conflicts with normalising numerals to the interval [0; 2^len). There were a few test lemmas in the repository that made sure this was not accidentally removed, I'm not sure where these went. Removing normalisation for -1 in turn leads to terms like unat(-1) which proofs then get stuck on. This can be fixed locally by finding the right rule to re-establish the old normalisation, but that will of course mean that abstract lemmas that refer to -1 won't apply any more. This means there is no principled way to deal with ground terms any more. I can see the appeal of using -1 for generic lemmas, because it has nice algebraic meaning (as opposed to max_word). Maybe the solution is to build a rule set for arithmetic that can be added on demand or as a bundle. It's a bit of a crutch, but at least it would provide a choice.
the new simpset often loops in the l4v proofs, because it either reversed polarity or added additional rules about ucast/unat. Some of this was fall-out from the more general treatment (e.g. unsigned)
not all new looping is due to that specific problem. I haven't been able to track down what rules exactly are the problem, but adding field_simps now almost always loops in word proofs. This might be a symptom of a different problem (i.e. not related to Word_Lib), because the usual fix was to instantiate specific commutativity rules. This indicates that something doesn't quite work any more with ordered rewriting. Possibly there is now an additional rewriting step or something like that so that the simple test in simp
no longer suffices.
related to simp rules for "unsigned", there are many (hundreds) of instances where casts or unat now leave terms of the form "take_bit 32 .." in the goal which have little chance of making progress. This was such frequent cause of breakage that we introduced a bundle which removes the rules producing take_bit terms. Trying to remove them globally is hopeless, because any theory merge with a theory that still has them, will re-introduce them.
similarly, there is a rule that automatically rewrites "x = 0" to something in the direction of "x dvd 2^len". This is rarely useful with concrete len, and preventing these terms from being produced is similarly manual as take_bit.
the unat_arith tactic solves fewer goals. I might be mistaken, but I think this is mostly due to simpset changes, but I have not had the time to track down which exactly.
Overall, it looks like a number of aspects of the new simpset are convenient for reasoning abstractly about machine words when you are working within the library (esp take_bit and 0/dvd), but counterproductive once you are working with specific word lengths as you do in program verification.
I think it would make sense to attempt to tune the simpset such that both scenarios (concrete and abstract) are available separately as bundles, making the global default conservative.
The problem is that the old setup was fairly well tuned and recreating it will not be easy, but at least removing some of the global rules that only work well for the abstract setup would help, because adding simp rules later is easy, but removing them is annoying.
There is a separate question of what a good setup is for the interplay between casts to different types and what good normal forms for these are. We had settled on never unfolding ucast/unat/uint etc automatically, instead producing abstract rules that describe the interaction with the usual operators and relationships between each other. I still think that is a reasonable setting.
For instance, who on earth wants to read something like "is_aligned (push_bit m k) m" instead of "is_aligned (k << m) m"? . The syntax was introduced for good reasons. It is fine to make it a bundle to not pollute a global syntax space, it is not fine to change original material that was written with a purpose.
If your point is to to apply the ‹_ << _› syntax etc. thoroughly and
consistently among Word_Lib theories, I am happy to look after these slips.
That would be appreciated. My complaint was not that a new lemma did not use the word syntax (although it would be nice to remain consistent), but that old lemmas were rewritten to not use it. Possibly as part of another change.
It is absolutely not fine to produce commits like "dropped junk" which removes a theory that was critical to how this library is used after in previous commits having made it unusable, entirely missing the point of its existence. Please revert that entire series of commits and reproduce the old setup.
After a second study of history I notice that I mess
[message truncated]
From: Makarius <makarius@sketis.net>
Side-remark: About 50% of the time I do look at the hg history (using e.g. "hg
grep --all") to see how names evolve, and what needs to be done. This also
applies to magic (simp add: ...) or (simp del: ...) details.
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
There is some relevant experience from Isabelle2021. As a rough order of magnitude, it took about 2 full person months to do the complete proof update for l4v, which finished a bit more than a week ago now. 2 person months is about 1 person month higher than a usual Isabelle update. Not quite all, but most of the changes were due to Word_Lib.
You did an initial foray into this proof update for one architecture (ARM, 32 bit) after the first half of the Isabelle2021 changes. The result of that foray was that those changes were relatively benign, which is why I was surprised by the much higher effort later. I think the main fallout was from a few small changes in the second half of the Isabelle2021 update.
In the retrospective I did underestimate the side conditions imposed by
the life cycle of l4v wrt. to Isabelle releases.
Do I understand correctly that there is a running version built on
Isabelle2021 / AFP 2021? If yes, this can be used as point of reference
when appropriate.
It seems there are a couple of things to iron out properly. There is
still some time towards the upcoming Isabelle release. Beyond that, how
strictly is l4v coupled to the Isabelle / AFP release cycle? Maybe
consolidations for Word_Lib could also happen on a dedicated AFP branch
after the Isabelle release and later converge to the (then) next AFP
release.
- things started breaking immediately for 64-bit architectures, because the Sumo concept alone does not quite work. It is not enough to make all lemmas for all word lengths available. The point of the Word_SetupXX theories is to additionally establish a set of common names that refer to the default word type of the program (basically, whatever "unsigned int" is in C for that architecture, and what the register width is for the machine). Having these names common means that the same proof text can work on different concrete types. This is different to actually generic lemmas that work for any word length or that need to resolve preconditions on the word length before they can be applied (the latter can be solved in theories like Word8 etc). It is not the prettiest form of genericity, but it is crucial for not having to duplicate thousands of lemmas that for other reasons do need to refer to a concrete word length (which is known in the proof state, but unknown to the proof text). Ultimately this is comes from C, which as painful as it is, works on exactly that principle that the same type name can refer to different representations, depending on architecture.
My proposal:
Theories Word_SetupXX re-appear offering the same name bindings as
pre-Isabelle2021
Their purpose is documented in the Guide.thy
The shift*1 Operations have been auxiliary definitions in the
distribution to bootstrap the bit shifts operations, similarly to
iszero, equipped only with a minimal set of lemmas. If one tool out
there relies on it as explicit constant i. e. due to pattern matching, I
won’t shed a tear to keep it as constant.Great, let's keep them constants.
(Aside, the primitive definitions then would be maybe: shiftl1 a = (a << 1), shiftr1 a = (a >> 1), sshiftr1 a = (a >>> 1)).
I'd be fine with these as definitions as long as the old forms still available as lemmas, preferably under the previous names. Maybe we can make sure over time that these constants are not produced any more by any tools and then they can finally be removed, but that is a number of release cycles away.
My proposal:
Proper constants for shiftl1 a = (a << 1), shiftr1 a = (a >> 1),
sshiftr1 a = (a >>> 1)
Facts for those available under the same names as Isabelle2021 / AFP
2021 (point of reference)
In terms of Isabelle's robustness: I'd agree with it being more robust for moving constants where antiquotations can be used to refer to unqualified names, but renaming is still one of the major pain points in almost every Isabelle release -- compounded by the fact that by far not all of these are documented in a way that would be usable for a simple search/replace. Those names that are listed in NEWS with their replacements are highly appreciated, but the lists are not complete, and there is no such NEWS for Word_Lib.
Maybe the impression of robustness comes from the incremental way we update the AFP. Each specific renaming is obvious to the author of that change and likely not hard to automate. On the other hand, being confronted with all of these at the same time, and having to find all instances by proof failure is time consuming and can be extremely frustrating when you have to hunt in hg history for what something has been renamed to.
Concerning NEWS, the Guide.thy has a changelog section.
Concerning robustness I did refer to changes to the internal
identifiers, e.g. due to a move to a different theory, not to a renaming
proper, ie. a change of the typical (unqualified) name visible to the
end-user. And according to my memory such renamings should not have
happened. Did you stumble over a renaming though?
For instance, who on earth wants to read something like "is_aligned (push_bit m k) m" instead of "is_aligned (k << m) m"? . The syntax was introduced for good reasons. It is fine to make it a bundle to not pollute a global syntax space, it is not fine to change original material that was written with a purpose.
If your point is to to apply the ‹_ << _› syntax etc. thoroughly and
consistently among Word_Lib theories, I am happy to look after these slips.That would be appreciated. My complaint was not that a new lemma did not use the word syntax (although it would be nice to remain consistent), but that old lemmas were rewritten to not use it. Possibly as part of another change.
There is one issue which prevents a 100% satisfactory solution here: the
syntax needs abbreviations, and these cannot be organized in bundles.
Taking into consideration that there seem to be enough rough edges at
the moment, I would not object to postpone the reconciliation of the two
different layers of bit shift operations to a future release, ie.
keeping << >> >>> as distinct constants at the moment.
(This would be one remaining item on my todo list, the other one being
the final dismantling of Ancient_Numeral.thy sometime in the future)
- the unat_arith tactic solves fewer goals. I might be mistaken, but I think this is mostly due to simpset changes, but I have not had the time to track down which exactly.
Tactic unat_arith has a technical deficiency: at the time of its
declaration it takes a simpset »as it is« as base, and hence its
behavior is brittle wrt. to movements in the surroundings. Since there
are many things to work out concerning the simp setup, I would postpone
this issue after these have been resolved.
- the default simpset setup changed, it no longer reliably normalises ground terms with numerals.
Ie. terms involving operations like AND + - * and numerals 0, 1, 42? If
that breaks down, something is really weird.
This caused a bunch of proof changes, hunting for which rules are needed etc. My reading of NEWS it that this is getting worse with NOT terms not being normalised any more. This potentially makes sense for int/nat, but I would suggest to keep them for Word (and to make sure the rest of the numeral simplification actually covers all cases -- usually 0, 1, Suc 0 are the corner cases to look out for).
After Thomas Sewell pointed out some problems with simplifying bit
expressions over natural numbers, I took over the elementary numeral
rewriting approach by Andreas Lochbihler originating in session
Native_Word to overcome that; in the first stage that relied on NOT not
being simplified by default, but with everything now in place, it should
not be difficult to recover that.
Simplification of bit expressions involving word numerals works still
the same way as pre-Isabelle2021: rewriting to int first. If this by
itself exhibits problems.
- the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
Yes, that is definitely supposed to work. Do you have examples at hand?
Otherwise I will augment the Examples.thy to cover more combinations.
- punning max_word with -1 conflicts with normalising numerals to the interval [0; 2^len). There were a few test lemmas in the repository that made sure this was not accidentally removed, I'm not sure where these went. Removing normalisation for -1 in turn leads to terms like unat(-1) which proofs then get stuck on. This can be fixed locally by finding the right rule to re-establish the old normalisation, but that will of course mean that abstract lemmas that refer to -1 won't apply any more. This means there is no principled way to deal with ground terms any more. I can see the appeal of using -1 for generic lemmas, because it has nice algebraic meaning (as opposed to max_word). Maybe the solution is to build a rule set for arithmetic that can be added on demand or as a bundle. It's a bit of a crutch, but at least it would provide a choice.
Concerning normalizing numerals to the interval [0; 2^len) – is that
supposed to work universally? There seems to be no such mechanism
neither in Isabelle2020 nor Isabelle2021:
lemma ‹w = 2342342› for w :: ‹4 word›
apply simp
(I think something like that would require a simproc).
Concerning max_word – would it help to let it abbreviate 2^len - 1
rather than -1? If it is used in terms involving concrete numerals, the
abstract properties of -1 are not that relevant. (I pondered that issue
into different directions, but will spare the space here for the moment).
Coming back to word numeral norm
[message truncated]
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
(Repost after using a bad mail address – sorry for the noise)
There is some relevant experience from Isabelle2021. As a rough order of magnitude, it took about 2 full person months to do the complete proof update for l4v, which finished a bit more than a week ago now. 2 person months is about 1 person month higher than a usual Isabelle update. Not quite all, but most of the changes were due to Word_Lib.
You did an initial foray into this proof update for one architecture (ARM, 32 bit) after the first half of the Isabelle2021 changes. The result of that foray was that those changes were relatively benign, which is why I was surprised by the much higher effort later. I think the main fallout was from a few small changes in the second half of the Isabelle2021 update.
In the retrospective I did underestimate the side conditions imposed by
the life cycle of l4v wrt. to Isabelle releases.
Do I understand correctly that there is a running version built on
Isabelle2021 / AFP 2021? If yes, this can be used as point of reference
when appropriate.
It seems there are a couple of things to iron out properly. There is
still some time towards the upcoming Isabelle release. Beyond that, how
strictly is l4v coupled to the Isabelle / AFP release cycle? Maybe
consolidations for Word_Lib could also happen on a dedicated AFP branch
after the Isabelle release and later converge to the (then) next AFP
release.
- things started breaking immediately for 64-bit architectures, because the Sumo concept alone does not quite work. It is not enough to make all lemmas for all word lengths available. The point of the Word_SetupXX theories is to additionally establish a set of common names that refer to the default word type of the program (basically, whatever "unsigned int" is in C for that architecture, and what the register width is for the machine). Having these names common means that the same proof text can work on different concrete types. This is different to actually generic lemmas that work for any word length or that need to resolve preconditions on the word length before they can be applied (the latter can be solved in theories like Word8 etc). It is not the prettiest form of genericity, but it is crucial for not having to duplicate thousands of lemmas that for other reasons do need to refer to a concrete word length (which is known in the proof state, but unknown to the proof text). Ultimately this is comes from C, which as painful as it is, works on exactly that principle that the same type name can refer to different representations, depending on architecture.
My proposal:
Theories Word_SetupXX re-appear offering the same name bindings as
pre-Isabelle2021
Their purpose is documented in the Guide.thy
The shift*1 Operations have been auxiliary definitions in the
distribution to bootstrap the bit shifts operations, similarly to
iszero, equipped only with a minimal set of lemmas. If one tool out
there relies on it as explicit constant i. e. due to pattern matching, I
won’t shed a tear to keep it as constant.Great, let's keep them constants.
(Aside, the primitive definitions then would be maybe: shiftl1 a = (a << 1), shiftr1 a = (a >> 1), sshiftr1 a = (a >>> 1)).
I'd be fine with these as definitions as long as the old forms still available as lemmas, preferably under the previous names. Maybe we can make sure over time that these constants are not produced any more by any tools and then they can finally be removed, but that is a number of release cycles away.
My proposal:
Proper constants for shiftl1 a = (a << 1), shiftr1 a = (a >> 1),
sshiftr1 a = (a >>> 1)
Facts for those available under the same names as Isabelle2021 / AFP
2021 (point of reference)
In terms of Isabelle's robustness: I'd agree with it being more robust for moving constants where antiquotations can be used to refer to unqualified names, but renaming is still one of the major pain points in almost every Isabelle release -- compounded by the fact that by far not all of these are documented in a way that would be usable for a simple search/replace. Those names that are listed in NEWS with their replacements are highly appreciated, but the lists are not complete, and there is no such NEWS for Word_Lib.
Maybe the impression of robustness comes from the incremental way we update the AFP. Each specific renaming is obvious to the author of that change and likely not hard to automate. On the other hand, being confronted with all of these at the same time, and having to find all instances by proof failure is time consuming and can be extremely frustrating when you have to hunt in hg history for what something has been renamed to.
Concerning NEWS, the Guide.thy has a changelog section.
Concerning robustness I did refer to changes to the internal
identifiers, e.g. due to a move to a different theory, not to a renaming
proper, ie. a change of the typical (unqualified) name visible to the
end-user. And according to my memory such renamings should not have
happened. Did you stumble over a renaming though?
For instance, who on earth wants to read something like "is_aligned (push_bit m k) m" instead of "is_aligned (k << m) m"? . The syntax was introduced for good reasons. It is fine to make it a bundle to not pollute a global syntax space, it is not fine to change original material that was written with a purpose.
If your point is to to apply the ‹_ << _› syntax etc. thoroughly and
consistently among Word_Lib theories, I am happy to look after these slips.That would be appreciated. My complaint was not that a new lemma did not use the word syntax (although it would be nice to remain consistent), but that old lemmas were rewritten to not use it. Possibly as part of another change.
There is one issue which prevents a 100% satisfactory solution here: the
syntax needs abbreviations, and these cannot be organized in bundles.
Taking into consideration that there seem to be enough rough edges at
the moment, I would not object to postpone the reconciliation of the two
different layers of bit shift operations to a future release, ie.
keeping << >> >>> as distinct constants at the moment.
(This would be one remaining item on my todo list, the other one being
the final dismantling of Ancient_Numeral.thy sometime in the future)
- the unat_arith tactic solves fewer goals. I might be mistaken, but I think this is mostly due to simpset changes, but I have not had the time to track down which exactly.
Tactic unat_arith has a technical deficiency: at the time of its
declaration it takes a simpset »as it is« as base, and hence its
behavior is brittle wrt. to movements in the surroundings. Since there
are many things to work out concerning the simp setup, I would postpone
this issue after these have been resolved.
- the default simpset setup changed, it no longer reliably normalises ground terms with numerals.
Ie. terms involving operations like AND + - * and numerals 0, 1, 42? If
that breaks down, something is really weird.
This caused a bunch of proof changes, hunting for which rules are needed etc. My reading of NEWS it that this is getting worse with NOT terms not being normalised any more. This potentially makes sense for int/nat, but I would suggest to keep them for Word (and to make sure the rest of the numeral simplification actually covers all cases -- usually 0, 1, Suc 0 are the corner cases to look out for).
After Thomas Sewell pointed out some problems with simplifying bit
expressions over natural numbers, I took over the elementary numeral
rewriting approach by Andreas Lochbihler originating in session
Native_Word to overcome that; in the first stage that relied on NOT not
being simplified by default, but with everything now in place, it should
not be difficult to recover that.
Simplification of bit expressions involving word numerals works still
the same way as pre-Isabelle2021: rewriting to int first. If this by
itself exhibits problems.
- the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
Yes, that is definitely supposed to work. Do you have examples at hand?
Otherwise I will augment the Examples.thy to cover more combinations.
- punning max_word with -1 conflicts with normalising numerals to the interval [0; 2^len). There were a few test lemmas in the repository that made sure this was not accidentally removed, I'm not sure where these went. Removing normalisation for -1 in turn leads to terms like unat(-1) which proofs then get stuck on. This can be fixed locally by finding the right rule to re-establish the old normalisation, but that will of course mean that abstract lemmas that refer to -1 won't apply any more. This means there is no principled way to deal with ground terms any more. I can see the appeal of using -1 for generic lemmas, because it has nice algebraic meaning (as opposed to max_word). Maybe the solution is to build a rule set for arithmetic that can be added on demand or as a bundle. It's a bit of a crutch, but at least it would provide a choice.
Concerning normalizing numerals to the interval [0; 2^len) – is that
supposed to work universally? There seems to be no such mechanism
neither in Isabelle2020 nor Isabelle2021:
lemma ‹w = 2342342› for w :: ‹4 word›
apply simp
(I think something like that would require a simproc).
Concerning max_word – would it help to let it abbreviate 2^len - 1
rather than -1? If it is used in terms involving concrete numerals, the
abstract properties of -1 are not that relevant. (I pondered that issue
into different directions, but will spare th
[message truncated]
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
No need to investigate that further – the critical rules are
word_of_nat_eq_0_iff and word_of_int_eq_0_iff
Florian
OpenPGP_signature
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
A quick report after reaching isabelle 807b094a9b78 / AFP c42c2c2447c2.
NOT <numeric expression> is normalized without interfering with other
normalization rules; rules have been augmented accordingly.
Conversions are not normalized over-aggressively, ie. only if no
auxiliary »glue« operations have to be inserted.
In the pipeline is a change which makes the normalizing set on int, nat,
word more complete, esp. wrt. negative numerals; while there is no
evidence that is the reason for the observed breakdown in normalization
of words, it has been a good opportunity to tackle that systematically.
An observation: Normalization rules for words typically work by
rewriting to int. This approach is historic – normalization could be
achieved by more elementary rewriting in most cases. At least this seems
to be the cause for the illusion of implicit normalization of word numerals:
unbundle bit_operations_syntax
lemma
‹w = 1705› for w :: ‹8 word›
apply simp \<comment> ‹no normalization›
oops
lemma
‹w = 1705 AND 255› for w :: ‹8 word›
apply simp \<comment> ‹normalizes due to rewriting to int›
oops
My next steps are to conclude the obvious normalization issues in the
distribution and then tackle the open more technical issues in the AFP.
Then I have to find a way to find out whether these resolve the observed
issues in l4v, particularly:
- the default simpset setup changed, it no longer reliably normalises ground terms with numerals.
- the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
There is a separate question of what a good setup is for the interplay between casts to different types and what good normal forms for these are. We had settled on never unfolding ucast/unat/uint etc automatically, instead producing abstract rules that describe the interaction with the usual operators and relationships between each other. I still think that is a reasonable setting.
- the new simpset often loops in the l4v proofs, because it either reversed polarity or added additional rules about ucast/unat. Some of this was fall-out from the more general treatment (e.g. unsigned)
- not all new looping is due to that specific problem. I haven't been able to track down what rules exactly are the problem, but adding field_simps now almost always loops in word proofs. This might be a symptom of a different problem (i.e. not related to Word_Lib), because the usual fix was to instantiate specific commutativity rules. This indicates that something doesn't quite work any more with ordered rewriting. Possibly there is now an additional rewriting step or something like that so that the simple test in
simp
no longer suffices.
Just a further observation:
Overall, it looks like a number of aspects of the new simpset are convenient for reasoning abstractly about machine words when you are working within the library (esp take_bit and 0/dvd), but counterproductive once you are working with specific word lengths as you do in program verification.
Simp rules for concrete values are often oriented the other way round
than those for abstract reasoning; typical instances I’m familiar with
are abstract code equations and simp rules for numerals – which makes it
painful to prove new simp rules for numerals since the concrete
rewrites always get into the way.
But from what I have seen so far I don’t think that this applies here –
the bias towards take_bit stems from a few new simp rules on
conversions, which are now not default any longer. Whether more rules
on take_bit would have resolved the problems is unclear to me, but the
idea behind the current setup to leave conversions in case of doubt
sounds like a reasonable and understandable approach to settle on.
Cheers,
Florian
OpenPGP_signature
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
(Repost after using a bad mail address (again) – sorry for the noise)
A quick report after reaching isabelle 807b094a9b78 / AFP c42c2c2447c2.
NOT <numeric expression> is normalized without interfering with other
normalization rules; rules have been augmented accordingly.
Conversions are not normalized over-aggressively, ie. only if no
auxiliary »glue« operations have to be inserted.
In the pipeline is a change which makes the normalizing set on int, nat,
word more complete, esp. wrt. negative numerals; while there is no
evidence that this the reason for the observed breakdown in
normalization of words, it has been a good opportunity to tackle that
systematically.
An observation: Normalization rules for words typically work by
rewriting to int. This approach is historic – normalization could be
achieved by more elementary rewriting in most cases. At least this seems
to be the cause for the illusion of implicit normalization of word numerals:
unbundle bit_operations_syntax
lemma
‹w = 1705› for w :: ‹8 word›
apply simp \<comment> ‹no normalization›
oops
lemma
‹w = 1705 AND 255› for w :: ‹8 word›
apply simp \<comment> ‹normalizes due to rewriting to int›
oops
My next steps are to conclude the obvious normalization issues in the
distribution and then tackle the open more technical issues in the AFP.
Then I have to find a way to find out whether these resolve the observed
issues in l4v, particularly:
- the default simpset setup changed, it no longer reliably normalises ground terms with numerals.
- the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
There is a separate question of what a good setup is for the interplay between casts to different types and what good normal forms for these are. We had settled on never unfolding ucast/unat/uint etc automatically, instead producing abstract rules that describe the interaction with the usual operators and relationships between each other. I still think that is a reasonable setting.
- the new simpset often loops in the l4v proofs, because it either reversed polarity or added additional rules about ucast/unat. Some of this was fall-out from the more general treatment (e.g. unsigned)
- not all new looping is due to that specific problem. I haven't been able to track down what rules exactly are the problem, but adding field_simps now almost always loops in word proofs. This might be a symptom of a different problem (i.e. not related to Word_Lib), because the usual fix was to instantiate specific commutativity rules. This indicates that something doesn't quite work any more with ordered rewriting. Possibly there is now an additional rewriting step or something like that so that the simple test in
simp
no longer suffices.
Just a further observation:
Overall, it looks like a number of aspects of the new simpset are convenient for reasoning abstractly about machine words when you are working within the library (esp take_bit and 0/dvd), but counterproductive once you are working with specific word lengths as you do in program verification.
Simp rules for concrete values are often oriented the other way round
than those for abstract reasoning; typical instances I’m familiar with
are abstract code equations and simp rules for numerals – which makes it
painful to prove new simp rules for numerals since the concrete
rewrites always get into the way.
But from what I have seen so far I don’t think that this applies here –
the bias towards take_bit stems from a few new simp rules on
conversions, which are now not default any longer. Whether more rules
on take_bit would have resolved the problems is unclear to me, but the
idea behind the current setup to leave conversions in case of doubt
sounds like a reasonable and understandable approach to settle on.
Cheers,
Florian
OpenPGP_signature
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
A clarifying remark:
This does only refer to Word.thy proper.
In Word_Lib, there is theory Norm_Words.thy which does a normalization
of word numerals except ‹- 1›; according to its examples, it works as
advertized.
Cheers,
Florian
OpenPGP_signature
From: Gerwin Klein <kleing@unsw.edu.au>
On 11 Oct 2021, at 03:19, Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de> wrote:
Signed PGP part
A quick report after reaching isabelle 807b094a9b78 / AFP c42c2c2447c2.
- NOT <numeric expression> is normalized without interfering with other
normalization rules; rules have been augmented accordingly.
Excellent.
- Conversions are not normalized over-aggressively, ie. only if no
auxiliary »glue« operations have to be inserted.
Also sounds good.
An observation: Normalization rules for words typically work by
rewriting to int. This approach is historic – normalization could be
achieved by more elementary rewriting in most cases. At least this seems
to be the cause for the illusion of implicit normalization of word numerals:unbundle bit_operations_syntax
lemma
‹w = 1705› for w :: ‹8 word›
apply simp \<comment> ‹no normalization›
oopslemma
‹w = 1705 AND 255› for w :: ‹8 word›
apply simp \<comment> ‹normalizes due to rewriting to int›
oopsMy next steps are to conclude the obvious normalization issues in the
distribution and then tackle the open more technical issues in the AFP.
This sounds all good.
Then I have to find a way to find out whether these resolve the observed
issues in l4v, particularly:
- the default simpset setup changed, it no longer reliably normalises ground terms with numerals.
- the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
There is a separate question of what a good setup is for the interplay between casts to different types and what good normal forms for these are. We had settled on never unfolding ucast/unat/uint etc automatically, instead producing abstract rules that describe the interaction with the usual operators and relationships between each other. I still think that is a reasonable setting.
- the new simpset often loops in the l4v proofs, because it either reversed polarity or added additional rules about ucast/unat. Some of this was fall-out from the more general treatment (e.g. unsigned)
- not all new looping is due to that specific problem. I haven't been able to track down what rules exactly are the problem, but adding field_simps now almost always loops in word proofs. This might be a symptom of a different problem (i.e. not related to Word_Lib), because the usual fix was to instantiate specific commutativity rules. This indicates that something doesn't quite work any more with ordered rewriting. Possibly there is now an additional rewriting step or something like that so that the simple test in
simp
no longer suffices.
The last update fixed all the occurrences that Isabelle2021 introduces, so it might be nontrivial to find examples. I can try to start a partial update to Isabelle2021-1-RC0 and report on how that goes. I can probably also still relatively easily find the positions where looping with commutativity was the specific problem.
Just a further observation:
Overall, it looks like a number of aspects of the new simpset are convenient for reasoning abstractly about machine words when you are working within the library (esp take_bit and 0/dvd), but counterproductive once you are working with specific word lengths as you do in program verification.
Simp rules for concrete values are often oriented the other way round
than those for abstract reasoning; typical instances I’m familiar with
are abstract code equations and simp rules for numerals – which makes it
painful to prove new simp rules for numerals since the concrete
rewrites always get into the way.But from what I have seen so far I don’t think that this applies here –
the bias towards take_bit stems from a few new simp rules on
conversions, which are now not default any longer. Whether more rules
on take_bit would have resolved the problems is unclear to me, but the
idea behind the current setup to leave conversions in case of doubt
sounds like a reasonable and understandable approach to settle on.
I'd be happy with that.
Cheers,
Gerwin
signature.asc
From: Gerwin Klein <kleing@unsw.edu.au>
On 9 Oct 2021, at 19:37, Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de> wrote:
Signed PGP part
There is some relevant experience from Isabelle2021. As a rough order of magnitude, it took about 2 full person months to do the complete proof update for l4v, which finished a bit more than a week ago now. 2 person months is about 1 person month higher than a usual Isabelle update. Not quite all, but most of the changes were due to Word_Lib.
You did an initial foray into this proof update for one architecture (ARM, 32 bit) after the first half of the Isabelle2021 changes. The result of that foray was that those changes were relatively benign, which is why I was surprised by the much higher effort later. I think the main fallout was from a few small changes in the second half of the Isabelle2021 update.
In the retrospective I did underestimate the side conditions imposed by
the life cycle of l4v wrt. to Isabelle releases.Do I understand correctly that there is a running version built on
Isabelle2021 / AFP 2021? If yes, this can be used as point of reference
when appropriate.
Yes, the current master branch of https://github.com/seL4/l4v <https://github.com/seL4/l4v> is on Isabelle2021
It seems there are a couple of things to iron out properly. There is
still some time towards the upcoming Isabelle release. Beyond that, how
strictly is l4v coupled to the Isabelle / AFP release cycle?
l4v is strongly coupled to the release cycle in the sense that it only uses Isabelle release versions. It maybe lag behind Isabelle releases sometimes, and it may take new versions of AFP entries as long as these still work with the corresponding Isabelle release (or may add its own changes) .
Maybe consolidations for Word_Lib could also happen on a dedicated AFP branch
after the Isabelle release and later converge to the (then) next AFP
release.
Yes, that would be a possibility.
- things started breaking immediately for 64-bit architectures, because the Sumo concept alone does not quite work. It is not enough to make all lemmas for all word lengths available. The point of the Word_SetupXX theories is to additionally establish a set of common names that refer to the default word type of the program (basically, whatever "unsigned int" is in C for that architecture, and what the register width is for the machine). Having these names common means that the same proof text can work on different concrete types. This is different to actually generic lemmas that work for any word length or that need to resolve preconditions on the word length before they can be applied (the latter can be solved in theories like Word8 etc). It is not the prettiest form of genericity, but it is crucial for not having to duplicate thousands of lemmas that for other reasons do need to refer to a concrete word length (which is known in the proof state, but unknown to the proof text). Ultimately this is comes from C, which as painful as it is, works on exactly that principle that the same type name can refer to different representations, depending on architecture.
My proposal: * Theories Word_SetupXX re-appear offering the same name bindings as
pre-Isabelle2021 * Their purpose is documented in the Guide.thy
Sounds good.
The shift*1 Operations have been auxiliary definitions in the
distribution to bootstrap the bit shifts operations, similarly to
iszero, equipped only with a minimal set of lemmas. If one tool out
there relies on it as explicit constant i. e. due to pattern matching, I
won’t shed a tear to keep it as constant.Great, let's keep them constants.
(Aside, the primitive definitions then would be maybe: shiftl1 a = (a << 1), shiftr1 a = (a >> 1), sshiftr1 a = (a >>> 1)).
I'd be fine with these as definitions as long as the old forms still available as lemmas, preferably under the previous names. Maybe we can make sure over time that these constants are not produced any more by any tools and then they can finally be removed, but that is a number of release cycles away.
My proposal: * Proper constants for shiftl1 a = (a << 1), shiftr1 a = (a >> 1),
sshiftr1 a = (a >>> 1) * Facts for those available under the same names as Isabelle2021 / AFP
2021 (point of reference)
Perfect.
For instance, who on earth wants to read something like "is_aligned (push_bit m k) m" instead of "is_aligned (k << m) m"? . The syntax was introduced for good reasons. It is fine to make it a bundle to not pollute a global syntax space, it is not fine to change original material that was written with a purpose.
If your point is to to apply the ‹_ << _› syntax etc. thoroughly and
consistently among Word_Lib theories, I am happy to look after these slips.That would be appreciated. My complaint was not that a new lemma did not use the word syntax (although it would be nice to remain consistent), but that old lemmas were rewritten to not use it. Possibly as part of another change.
There is one issue which prevents a 100% satisfactory solution here: the
syntax needs abbreviations, and these cannot be organized in bundles.Taking into consideration that there seem to be enough rough edges at
the moment, I would not object to postpone the reconciliation of the two
different layers of bit shift operations to a future release, ie.
keeping << >> >>> as distinct constants at the moment.
I think that would be Ok, yes.
(This would be one remaining item on my todo list, the other one being
the final dismantling of Ancient_Numeral.thy sometime in the future)
That is probably a good idea, although some of these old lemmas are still used. It should be possible to find where these are given a list of names.
- the unat_arith tactic solves fewer goals. I might be mistaken, but I think this is mostly due to simpset changes, but I have not had the time to track down which exactly.
Tactic unat_arith has a technical deficiency: at the time of its
declaration it takes a simpset »as it is« as base, and hence its
behavior is brittle wrt. to movements in the surroundings. Since there
are many things to work out concerning the simp setup, I would postpone
this issue after these have been resolved.
I'm fine with that, and yes, it would be much better for unat_arith to use a more controlled simpset (possibly augmentable with a named theorem set).
- the default simpset setup changed, it no longer reliably normalises ground terms with numerals.
Ie. terms involving operations like AND + - * and numerals 0, 1, 42? If
that breaks down, something is really weird.
I don't remember the concrete term, I think it was something with shifting + AND/OR, getting stuck either on one of the usual suspects (0, 1, Suc 0 or -1).
This caused a bunch of proof changes, hunting for which rules are needed etc. My reading of NEWS it that this is getting worse with NOT terms not being normalised any more. This potentially makes sense for int/nat, but I would suggest to keep them for Word (and to make sure the rest of the numeral simplification actually covers all cases -- usually 0, 1, Suc 0 are the corner cases to look out for).
After Thomas Sewell pointed out some problems with simplifying bit
expressions over natural numbers, I took over the elementary numeral
rewriting approach by Andreas Lochbihler originating in session
Native_Word to overcome that; in the first stage that relied on NOT not
being simplified by default, but with everything now in place, it should
not be difficult to recover that.Simplification of bit expressions involving word numerals works still
the same way as pre-Isabelle2021: rewriting to int first. If this by
itself exhibits problems.
pre-Isabelle2021 it worked reliably enough, so I think that is Ok.
- the same issue caused C/assembly refinement to fail, as well as generated bitfield proofs (for 64 bit architectures) because arithmetic leaves different terms. This might be unavoidable for arbitrary terms in a larger update, but Isabelle should be able to compute with at least numeral ground terms reliably.
Yes, that is definitely supposed to work. Do you have examples at hand?
Otherwise I will augment the Examples.thy to cover more combinations.
I can have a look if I can recover them from the update diff (it's over 480 changed files, so it might take a bit).
- punning max_word with -1 conflicts with normalising numerals to the interval [0; 2^len). There were a few test lemmas in the repository that made sure this was not accidentally removed, I'm not sure where these went. Removing normalisation for -1 in turn leads to terms like unat(-1) which proofs then get stuck on. This can be fixed locally by finding the right rule to re-establish the old normalisation, but that will of course mean that abstract lemmas that refer to -1 won't apply any more. This means there is no principled way to deal with ground terms any more. I can see the appeal of using -1 for generic lemmas, because it has nice algebraic meaning (as opposed to max_word). Maybe the solution is to build a rule set for arithmetic that can be added on demand or as a bundle. It's a bit of a crutch, but at least it would provide a choice.
Concerning normalizing numerals to the interval [0; 2^len) – is that
supposed to work universally? There seems to be no such mechanism
neither in Isabelle2020 nor Isabelle2021:lemma ‹w = 2342342› for w :: ‹4 word›
apply simp
Yes, that was definitely supposed to work, also deeper in the term, e.g. P (9::3 word) used to normalise to P 1. I don't know exactly when that was lost, it might have been before Isabelle2021. It looks like it is enough for us to not actually normalise, but instead be able to decide the usual operators (=, <, etc), which did
[message truncated]
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Gerwin,
Maybe consolidations for Word_Lib could also happen on a dedicated AFP
branch
after the Isabelle release and later converge to the (then) next AFP
release.Yes, that would be a possibility.
At the moment I am optimistic we won’t need this.
- things started breaking immediately for 64-bit architectures,
because the Sumo concept alone does not quite work. It is not enough
to make all lemmas for all word lengths available. The point of the
Word_SetupXX theories is to additionally establish a set of common
names that refer to the default word type of the program (basically,
whatever "unsigned int" is in C for that architecture, and what the
register width is for the machine). Having these names common means
that the same proof text can work on different concrete types. This
is different to actually generic lemmas that work for any word length
or that need to resolve preconditions on the word length before they
can be applied (the latter can be solved in theories like Word8 etc).
It is not the prettiest form of genericity, but it is crucial for not
having to duplicate thousands of lemmas that for other reasons do
need to refer to a concrete word length (which is known in the proof
state, but unknown to the proof text). Ultimately this is comes from
C, which as painful as it is, works on exactly that principle that
the same type name can refer to different representations, depending
on architecture.My proposal: * Theories Word_SetupXX re-appear offering the same name bindings as
pre-Isabelle2021 * Their purpose is documented in the Guide.thySounds good.
After having a second look, the story appears more delicate: there is
not only Word_Setup_ARCH.thy, but also Word_Lemmas_ARCH.thy; while
Word_Lemmas_32.thy and Word_Lemmas_64.thy mimic each other, they contain
both »generic« lemmas with same name like
lemma word_bits_len_of:
"len_of TYPE (32) = word_bits"
by (simp add: word_bits_conv)
vs.
lemma word_bits_len_of:
"len_of TYPE (64) = word_bits"
by (simp add: word_bits_conv)
and »specific« lemmas with differentiated name like
lemma of_nat32_0:
"\<lbrakk>of_nat n = (0::word32); n < 2 ^ word_bits\<rbrakk>
\<Longrightarrow> n = 0"
by (erule of_nat_0, simp add: word_bits_def)
vs.
lemma of_nat64_0:
"\<lbrakk>of_nat n = (0::word64); n < 2 ^ word_bits\<rbrakk>
\<Longrightarrow> n = 0"
by (erule of_nat_0, simp add: word_bits_def)
And none of them uses the »generic« type alias machine_word introduced
in the corresponding Word_Setup_ARCH.thy theory.
What is the state supposed to be achieved here? Naively I would think
that Word_Setup_ARCH.thy should contain all »generic« lemmas and make
use of the »generic« type alias, wheres »specific« lemmas should stay in
Word_32 / Word_64.
Florian
OpenPGP_signature
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Gerwin et al.,
my report about the state of affairs reached inisabelle 8ab92e40dde6 /
afp: 807c0639d73d tip
A. << >> >>> are back as conventional operations (constants)
The glitch of two different sets of bit shift operations naturally
sometimes produces unclear situations.
An ideal solution would be available if abbreviations could be organized
in bundles.
B. A confluent set of rewrite rules on ground terms
Particularly for signed operations like <s <=s sdiv smod this has not
been present even for int so far.
An explicit exception is the mask :: nat => 'a, where there is too
little context to determine whether it is supposed to be understood as
abstract or concrete value.
A related cause is the singleton bit expression 2 ^ numeral _ which
normalizes due to the conventional rules on exponentiation – but this
has never been different.
C. Poor man's genericity 32 vs. 64
I still need feedback on this -- my further post maybe got lost due to
my ongoing confusion of mail adresses.
Cheers,
Florian
OpenPGP_signature
From: Gerwin Klein <kleing@unsw.edu.au>
On 30 Oct 2021, at 21:04, Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de> wrote:
Hi Gerwin et al.,
my report about the state of affairs reached inisabelle 8ab92e40dde6 /
afp: 807c0639d73d tipA. << >> >>> are back as conventional operations (constants)
That sounds good.
The glitch of two different sets of bit shift operations naturally
sometimes produces unclear situations.
I'll have to get deeper into a proof update to see how that interacts. Might be Ok.
B. A confluent set of rewrite rules on ground terms
Particularly for signed operations like <s <=s sdiv smod this has not
been present even for int so far.
Yes, that right, those were underdeveloped.
An explicit exception is the mask :: nat => 'a, where there is too
little context to determine whether it is supposed to be understood as
abstract or concrete value.
Agreed, we have so far left mask abstract and not automatically reduced it for that reason.
A related cause is the singleton bit expression 2 ^ numeral _ which
normalizes due to the conventional rules on exponentiation – but this
has never been different.
Also good.
C. Poor man's genericity 32 vs. 64
I still need feedback on this -- my further post maybe got lost due to
my ongoing confusion of mail adresses.My proposal: * Theories Word_SetupXX re-appear offering the same name bindings as
pre-Isabelle2021 * Their purpose is documented in the Guide.thyAfter having a second look, the story appears more delicate: there is
not only Word_Setup_ARCH.thy, but also Word_Lemmas_ARCH.thy; while
Word_Lemmas_32.thy and Word_Lemmas_64.thy mimic each other, they contain
both »generic« lemmas with same name likelemma word_bits_len_of:
"len_of TYPE (32) = word_bits"
by (simp add: word_bits_conv)vs.
lemma word_bits_len_of:
"len_of TYPE (64) = word_bits"
by (simp add: word_bits_conv)and »specific« lemmas with differentiated name like
lemma of_nat32_0:
"\<lbrakk>of_nat n = (0::word32); n < 2 ^ word_bits\<rbrakk>
\<Longrightarrow> n = 0"
by (erule of_nat_0, simp add: word_bits_def)vs.
lemma of_nat64_0:
"\<lbrakk>of_nat n = (0::word64); n < 2 ^ word_bits\<rbrakk>
\<Longrightarrow> n = 0"
by (erule of_nat_0, simp add: word_bits_def)And none of them uses the »generic« type alias machine_word introduced
in the corresponding Word_Setup_ARCH.thy theory.
lemmas that can be phrased with machine_word usually should be, so that would be an improvement.
What is the state supposed to be achieved here? Naively I would think
that Word_Setup_ARCH.thy should contain all »generic« lemmas and make
use of the »generic« type alias, whereas »specific« lemmas should stay in
Word_32 / Word_64.
Yes, that would make sense and would clarify the setup. It is possible that some lemmas are not in the right place due to dependencies, but a lot of material has been generalised and moved into generic Word_Lemmas instead, so it is quite possible that this could now be applied consistently.
Cheers,
Gerwin
signature.asc
From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Gerwin et. al.,
I have arranged this now in
isabelle: d1117655110c
afp: 0b1dccde39f0
This his been my tentative last work on that before the upcoming release.
Are there still issues I should look after?
Cheers,
Florian
OpenPGP_signature
From: Gerwin Klein <kleing@unsw.edu.au>
Hi Florian,
On 4 Nov 2021, at 22:11, Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de> wrote:
[..]
isabelle: d1117655110c
afp: 0b1dccde39f0This his been my tentative last work on that before the upcoming release.
Are there still issues I should look after?
There is nothing open on my list at least. The next step should probably be for me to get a bit deeper into at least one of the proof updates to flush out any remaining issues. Will report back when I have anything.
Cheers,
Gerwin
signature.asc
Last updated: Dec 07 2024 at 16:22 UTC