Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Questions about Quickcheck


view this post on Zulip Email Gateway (Aug 22 2022 at 13:16):

From: Manuel Eberl <eberlm@in.tum.de>
Hallo,

I am wondering about the following things w.r.t. Quickcheck:

– What is the real difference between exhaustive and full_exhaustive?
There are a sentence or two on this in isar-ref about term
reconstruction, but it's not quite clear to me what that actually means.

– Why does quickcheck not fall back to random testing when exhaustive
testing is not available?

– Why is Quickcheck for multisets only set up for random and
full_exhaustive, but not for exhaustive?

– Is there any reason not to simply replace the entire Quickcheck setup
for multisets with "quickcheck_generator multiset construtors: mset"? (I
have done that locally and it seems to work well)

Manuel

view this post on Zulip Email Gateway (Aug 22 2022 at 13:16):

From: Andreas Lochbihler <andreas.lochbihler@inf.ethz.ch>
Hi Manuel,

– What is the real difference between exhaustive and full_exhaustive? There are a sentence
or two on this in isar-ref about term reconstruction, but it's not quite clear to me what
that actually means.

AFAIK exchaustive and full_exhaustive are two different compilation schemes, but I don't know
the details. You can switch between the two schemes with the option quickcheck_full_support.
Since this option is enabled by default, the exhaustive tester is essentially untested.
Most types in the library do not have any setup for exhaustive any more (in particular all
the datatypes, it seems).

– Is there any reason not to simply replace the entire Quickcheck setup for multisets with
"quickcheck_generator multiset construtors: mset"? (I have done that locally and it seems
to work well)
AFAICS, you get essentially the same testers.

Cheers,
Andreas

view this post on Zulip Email Gateway (Aug 22 2022 at 13:27):

From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Dear Manuel,

since I am subscribed the isabelle users' mailing list for a couple of
days now, I can tell you some of the rationales for the design
decisions in quickcheck.

Three years ago, I did some profiling of quickcheck's execution and
observed that much time was actually spent constructing and
deconstructing the term representations for printing no matter if
counterexamples were found or not. For fair comparison between
different testing approaches that all are based on the same code
generation setup and are quasi-complete, in the sense, if there is a
counterexample, it will eventually be found (assuming the
counterexample can be represented as ground term with the given code
generation setup), the testing performance is essential. The overall
testing performance would render one approach superior or inferior to
another one.

Exhaustive generators are more performant than full-exhaustive, but do
not allow to print terms when functions are involved.
Full_exhaustive generator are less performant but can always print the
terms, and hence, are generally more useful.

I was always hoping that a lifting and transfer mechanism would allow
to transfer the function space of whole theory developments into the
Andreas' FinFuns function space and make this automatic transferred
code setup usable for quickcheck. This would make the fast exhaustive
generators as powerful as the full_exhaustive ones, but (I believe)
this development has not happened yet.

In practice (thanks to much work on the IDE, non-blocking interaction
and use of multiple cores), quickcheck is powerful and useful even if
it is not the slightly optimized version yet.

Commonly, the manual and automatic setup for random and exhaustive
testing is done in one go. So both strategies would usually work and
start searching for counterexamples or both would fail, as they rely
on the same code generation setup. So then to find counterexamples
quickly, applying one strategy after the other seems awkward, and I
preferred to implement that multiple strategies can be applied in
parallel. As back in 2012, we still had a blocking IDE in wider use,
the blocking auto-quickcheck was limited to only one strategy; I
decided for exhaustive testing.

Without further investigation of the past, I cannot tell you the
reason for the definition of the current Multiset quickcheck setup.
Possibly, during some work on the new datatype package or on code
generation of multisets, some one touched the quickcheck setup, but
was not aware of the quickcheck_generator command---but that's just a
wild guess from my side; one would really need to look at the relevant
changesets and their context. The multiset theory has quite some
evolution since I looked deeply into that file.

I hope that helps.

Best regards,

Lukas


Last updated: Mar 28 2024 at 12:29 UTC