Stream: Mirror: Isabelle Development Mailing List

Topic: [isabelle-dev] NEWS: Zstd compression for Isabelle/Scala ...


view this post on Zulip Email Gateway (Oct 21 2022 at 18:08):

From: Makarius <makarius@sketis.net>
* ML *

* System *

This refers to Isabelle/f2b98eb6a7a9. See also
https://github.com/luben/zstd-jni and https://github.com/facebook/zstd

Here is an example in Isabelle/ML (proper timing requires to re-check that 1-3
times):

theory Scratch
imports Pure
begin

ML ‹
val b1 = Bytes.read @{file
‹~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy›};
val b2 = timeap Zstd.compress b1;
val b3 = timeap Zstd.uncompress b2;
@{assert} (Bytes.eq (b1, b3));

ML ‹
val b1 = Bytes.read @{file
‹~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy›};
val b2 = timeap XZ.compress b1;
val b3 = timeap XZ.uncompress b2;
@{assert} (Bytes.eq (b1, b3));

end

This shows that Zstd (JNI library) is much faster than XZ (pure Java), while
the compression ratio is only slightly lower.

It remains to be seen how we can use this super-fast compression, e.g. for
blobs within build databases (presently XZ), or even for heaps of Poly/ML ---
for the latter it might be better to see if this could be included in Poly/ML.

Makarius


isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

view this post on Zulip Email Gateway (Oct 21 2022 at 18:09):

From: Makarius <makarius@sketis.net>
Here the same example for Isabelle/Scala, e.g. the Console/Scala plugin in
Isabelle/jEdit:

val b1 =
Bytes.read(Path.explode("~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy"))
val b2 = Timing.timeit() { b1.compress(Compress.Options_Zstd()) }
val b3 = Timing.timeit() { b2.uncompress() }

val b1 =
Bytes.read(Path.explode("~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy"))
val b2 = Timing.timeit() { b1.compress(Compress.Options_XZ()) }
val b3 = Timing.timeit() { b2.uncompress() }

It shows how Isabelle/ML (world of mathematics) and Isabelle/Scala (world of
physics) nicely work together. The ML functions from before actually invoke
these Scala operations via the PIDE protocol channel, using a custom-made
Bytes type on both sides (for better scalability).

Makarius


isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

view this post on Zulip Email Gateway (Oct 21 2022 at 20:01):

From: Makarius <makarius@sketis.net>
Here are some more tests with Isabelle/cff0828c374f --- where large files work
much better. This is for Isabelle/jEdit with Console/Scala:

def test_heap(name: String, options: Compress.Options): Unit = {
val store = Sessions.store(Options.init())
val heap = store.find_heap(name).getOrElse(error("Bad heap " + quote(name)))
val b = Timing.timeit("read") { Bytes.read(heap) }
val c = Timing.timeit("compress") { b.compress(options, store.cache.compress) }
val d = Timing.timeit("uncompress") { c.uncompress(store.cache.compress) }
Console.println("uncompressed size = " + d.length + "\ncompressed size = "

scala> test_heap("Pure", Compress.Options_XZ(3))

read: 0.009s elapsed time

compress: 1.792s elapsed time

uncompress: 0.328s elapsed time

uncompressed size = 20848737
compressed size = 3430468
ratio = 6.077519743661798

scala> test_heap("Pure", Compress.Options_Zstd(3))

read: 0.009s elapsed time

compress: 0.081s elapsed time

uncompress: 0.022s elapsed time

uncompressed size = 20848737
compressed size = 4690553
ratio = 4.444835608935663

scala> test_heap("Pure", Compress.Options_Zstd(6))

read: 0.008s elapsed time

compress: 0.206s elapsed time

uncompress: 0.021s elapsed time

uncompressed size = 20848737
compressed size = 4255723
ratio = 4.898988256519515

scala> test_heap("HOL", Compress.Options_XZ(3))

read: 0.148s elapsed time

compress: 27.193s elapsed time

uncompress: 6.162s elapsed time

uncompressed size = 340480517
compressed size = 61810868
ratio = 5.508424780574186

scala> test_heap("HOL", Compress.Options_Zstd(3))

read: 0.370s elapsed time

compress: 1.559s elapsed time

uncompress: 0.511s elapsed time

uncompressed size = 340480517
compressed size = 88856965
ratio = 3.8317819767983297

scala> test_heap("HOL", Compress.Options_Zstd(6))

read: 0.164s elapsed time

compress: 3.594s elapsed time

uncompress: 0.378s elapsed time

uncompressed size = 340480517
compressed size = 82115162
ratio = 4.146378192616852

Makarius


isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev


Last updated: Apr 19 2024 at 16:20 UTC