Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] mergesort (was: BUG in conv.ML: rewr_conv and ...


view this post on Zulip Email Gateway (Aug 19 2022 at 08:52):

From: Makarius <makarius@sketis.net>
On Wed, 24 Oct 2012, Christian Sternagel wrote:

On 10/23/2012 08:17 PM, Makarius wrote:

(Next time I will tell a story how an efficient and fully verified
merge-sort function included in the core sources caused several days of
worries.)

Please tell the story.

I will try, but it has to be shorter than usual: I've just returned from
travel and I am preparing for the next departure starting in 1.5 days.

I'm particularly interested since in the near future I was going to
suggest to run some tests whether it would make sense (performance-wise)
to replace the existing sorting algorithm in the Isabelle sources by the
one from (sorry for the self-advertisement):

http://link.springer.com/article/10.1007/s10817-012-9260-7

The website seems to be down at the moment, and it does not look like a
publicly available thing anyway (without paying extra).

As I'm writing this email I saw that the current sorting algorithm in
src/Pure/library.ML actually already is a very similar mergesort (last
time I checked, it wasn't ;)).

You probably mean this version
http://isabelle.in.tum.de/repos/isabelle/rev/b28defd0b5a5 and not "the
current" which is hardly defined for an ever growing set of approx. 50000
elements. Its changelog reads "replaced quicksort by mergesort, which
might be a bit more efficient for key operations like Ord_List.make,
Sorts.minimize_sort;". This was the most positive formulation I could
find after spending too much time with the issue to gain almost nothing.

The code has become a bit longer, a tiny little bit better-defined in what
it does, but we also had to unearth a much older sort function to
accomodate the HO unification code with its special demands, see
http://isabelle.in.tum.de/repos/isabelle/rev/a0d8abca8d7a

When you proposed to use your verified version of mergesort (derived from
the Haskell library) some month ago, I was not very excited for formal
software-quality reasons. Things deep down there that have been used for a
long time in certain ways follow their own requirements. Formal
specification and formal proofs are fine, but not yet good software
engineering. Too much "eating your own dogfood" is bad for health.

Since I am usually a bit too open to experimentation, when Stefan
Berghofer came with some measurements about a hot-spot in the management
of sorts in the inference kernel, we gave it a try nonetheless. One
result was no measureable performance improvement, see the log entry
quoted above. Another result was the discovery that the Unify module by
Larry from 25 years ago does not really "sort" where it invokes a function
called "sort", but something else. The "quasi order" it gives here is not
transitive, resulting in funny results with the proven mergesort.

Just curious: Is there a specific reason for not using the function
composition "trick" to avoid 'rev' in 'ascending'?

IIRC, the original version copied from Haskell was composing many
functional closures according to the length of the list. This might be
efficient in Haskell (I have no substantial experience with it), but in
the conventional evaluation model of ML it is not. Plain rev is not very
expensive, and there is no reason to avoid it as "premature optimization".
David Matthews is quite smart to manage the heap for us, so you don't have
to be afraid of some temporary allocations.

Makarius

view this post on Zulip Email Gateway (Aug 19 2022 at 09:04):

From: Christian Sternagel <c-sterna@jaist.ac.jp>
Please tell the story. I'm particularly interested since in the near
future I was going to suggest to run some tests whether it would make
sense (performance-wise) to replace the existing sorting algorithm in
the Isabelle sources by the one from (sorry for the self-advertisement):

http://link.springer.com/article/10.1007/s10817-012-9260-7

As I'm writing this email I saw that the current sorting algorithm in
src/Pure/library.ML actually already is a very similar mergesort (last
time I checked, it wasn't ;)). Just curious: Is there a specific reason
for not using the function composition "trick" to avoid 'rev' in
'ascending'?

cheers

chris


Last updated: Nov 21 2024 at 12:39 UTC