From: Makarius <makarius@sketis.net>
* ML *
Operations for Zstd compression (via Isabelle/Scala):
Zstd.compress: Bytes.T -> Bytes.T
Zstd.uncompress: Bytes.T -> Bytes.T
* System *
This refers to Isabelle/f2b98eb6a7a9. See also
https://github.com/luben/zstd-jni and https://github.com/facebook/zstd
Here is an example in Isabelle/ML (proper timing requires to re-check that 1-3
times):
theory Scratch
imports Pure
begin
ML ‹
val b1 = Bytes.read @{file
‹~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy›};
val b2 = timeap Zstd.compress b1;
val b3 = timeap Zstd.uncompress b2;
@{assert} (Bytes.eq (b1, b3));
›
ML ‹
val b1 = Bytes.read @{file
‹~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy›};
val b2 = timeap XZ.compress b1;
val b3 = timeap XZ.uncompress b2;
@{assert} (Bytes.eq (b1, b3));
›
end
This shows that Zstd (JNI library) is much faster than XZ (pure Java), while
the compression ratio is only slightly lower.
It remains to be seen how we can use this super-fast compression, e.g. for
blobs within build databases (presently XZ), or even for heaps of Poly/ML ---
for the latter it might be better to see if this could be included in Poly/ML.
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Makarius <makarius@sketis.net>
Here the same example for Isabelle/Scala, e.g. the Console/Scala plugin in
Isabelle/jEdit:
val b1 =
Bytes.read(Path.explode("~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy"))
val b2 = Timing.timeit() { b1.compress(Compress.Options_Zstd()) }
val b3 = Timing.timeit() { b2.uncompress() }
val b1 =
Bytes.read(Path.explode("~~/src/HOL/Analysis/Henstock_Kurzweil_Integration.thy"))
val b2 = Timing.timeit() { b1.compress(Compress.Options_XZ()) }
val b3 = Timing.timeit() { b2.uncompress() }
It shows how Isabelle/ML (world of mathematics) and Isabelle/Scala (world of
physics) nicely work together. The ML functions from before actually invoke
these Scala operations via the PIDE protocol channel, using a custom-made
Bytes type on both sides (for better scalability).
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
From: Makarius <makarius@sketis.net>
Here are some more tests with Isabelle/cff0828c374f --- where large files work
much better. This is for Isabelle/jEdit with Console/Scala:
def test_heap(name: String, options: Compress.Options): Unit = {
val store = Sessions.store(Options.init())
val heap = store.find_heap(name).getOrElse(error("Bad heap " + quote(name)))
val b = Timing.timeit("read") { Bytes.read(heap) }
val c = Timing.timeit("compress") { b.compress(options, store.cache.compress) }
val d = Timing.timeit("uncompress") { c.uncompress(store.cache.compress) }
Console.println("uncompressed size = " + d.length + "\ncompressed size = "
scala> test_heap("Pure", Compress.Options_XZ(3))
uncompressed size = 20848737
compressed size = 3430468
ratio = 6.077519743661798
scala> test_heap("Pure", Compress.Options_Zstd(3))
uncompressed size = 20848737
compressed size = 4690553
ratio = 4.444835608935663
scala> test_heap("Pure", Compress.Options_Zstd(6))
uncompressed size = 20848737
compressed size = 4255723
ratio = 4.898988256519515
scala> test_heap("HOL", Compress.Options_XZ(3))
uncompressed size = 340480517
compressed size = 61810868
ratio = 5.508424780574186
scala> test_heap("HOL", Compress.Options_Zstd(3))
uncompressed size = 340480517
compressed size = 88856965
ratio = 3.8317819767983297
scala> test_heap("HOL", Compress.Options_Zstd(6))
uncompressed size = 340480517
compressed size = 82115162
ratio = 4.146378192616852
Makarius
isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev
Last updated: Dec 21 2024 at 16:20 UTC