[isabelle] Disproof methods with Word.thy and AutoCorres · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Disproof methods with Word.thy and AutoCorres

Email Gateway (Aug 19 2022 at 16:34):

From: Scott Constable <sdconsta@syr.edu>
I’m currently working with AutoCorres v0.999, and I am trying to find a good disproof method to check my Hoare triples. For trivial proofs on words such as

lemma “(x :: 8 word) = y”

quickcheck easily finds a counterexample, but nitpick struggles with the potentially massive cardinalities of the underlying types. For example,

lemma "(x :: 8 word) = y"
nitpick[sat_solver = MiniSat_JNI]

outputs:

Nitpicking formula...
Timestamp: 12:28:39.
Too many scopes. Skipping 95000 scopes. (Consider using "mono" or
"merge_type_vars" to prevent this.)
Batch 1 of 100: Trying 50 scopes:
card int = 1, card nat = 1, card "4 list" = 1, card "2 list" = 1,
card "1 list" = 1, card "1" = 1, card unit = 1, card "8 word" = 1,
card "8" = 1, card "4" = 1, and card "2" = 1;
card int = 2, card nat = 2, card "4 list" = 2, card "2 list" = 2,
card "1 list" = 2, card "1" = 1, card unit = 1, card "8 word" = 2,
card "8" = 2, card "4" = 2, and card "2" = 2;
…

In my experience nitpick’s recommendations do not seem to alleviate the problem. By manually constraining the cardinalities, I was however able to found a counterexample for 4 bit words, but nothing greater. Quickcheck on the other hand does not seem to cooperate with anything that AutoCorres outputs. I typically receive something like the following:

Quickchecking...
Type unification failed: No type arity lifted_globals_ext :: full_exhaustive

Type error in application: incompatible operand type

Operator: full_exhaustive_class.full_exhaustive ::
(??'a × (unit ⇒ term) ⇒ (bool × term list) option)
⇒ natural ⇒ (bool × term list) option
Operand:
λ(s, t_s__).
Quickcheck_Random.catch_match
...
::
??'b lifted_globals_scheme × (unit ⇒ term) ⇒ (bool × term list) option

though to me it looks like this should work.

Any help would be much appreciated.

Email Gateway (Aug 19 2022 at 16:34):

From: David Greenaway <david.greenaway@nicta.com.au>
Hi Scott,

On 05/11/14 04:38, Scott Constable wrote:

I’m currently working with AutoCorres v0.999 a good disproof method to
check my Hoare triples.

I have no idea what I am doing with either nitpick nor quickcheck, but
thought I would have a bit of a poke around to see if I could understand
what was going on.

[...] For trivial proofs on words such as

lemma “(x :: 8 word) = y”

quickcheck easily finds a counterexample, but nitpick struggles with
the potentially massive cardinalities of the underlying types. For
example,

lemma "(x :: 8 word) = y"
nitpick[sat_solver = MiniSat_JNI]

outputs:

Nitpicking formula...
Timestamp: 12:28:39.
Too many scopes. Skipping 95000 scopes. (Consider using "mono" or
[...]

It seems like nitpick is generating rather inefficient representations
of words. For example, for the lemma "x = (y :: 2 word)", nitpick
generates:

Free variables:
x = «0»
y = «0»
Types:
nat = {0, 1, 2, 3, 4, 5, 6, 7, …}
int = {- 3, - 2, - 1, 0, 1, 2, 3, 4, …}
1 list =
{[], [n⇩,⇩1], [n⇩,⇩1, n⇩,⇩1], [n⇩,⇩1, n⇩,⇩1, n⇩,⇩1],
[n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1], [n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1],
[n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1],
[n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1], …}
unit = {()}
2 word = {«0», …}
2 = {«0», …}
Constants:
SOME xs. set xs = ⦃True⦄ ∧ distinct xs = [n⇩,⇩1]
distinct =
(λx. ?)
([] := True, [n⇩,⇩1] := True, [n⇩,⇩1, n⇩,⇩1] := False,
...
set = (λx. ?)
([] := {}, [n⇩,⇩1] := {n⇩,⇩1}, [n⇩,⇩1, n⇩,⇩1] := {n⇩,⇩1},
[n⇩,⇩1, n⇩,⇩1, n⇩,⇩1] := {n⇩,⇩1}, [n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1] := {n⇩,⇩1},
...
tl = (λx. ?)
([] := [], [n⇩,⇩1] := [], [n⇩,⇩1, n⇩,⇩1] := [n⇩,⇩1],
[n⇩,⇩1, n⇩,⇩1, n⇩,⇩1] := [n⇩,⇩1, n⇩,⇩1],
...
length =
(λx. ?)
([] := 0, [n⇩,⇩1] := 1, [n⇩,⇩1, n⇩,⇩1] := 2, [n⇩,⇩1, n⇩,⇩1, n⇩,⇩1] := 3,
[n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1] := 4, [n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1, n⇩,⇩1] := 5,
...
op ^ 2 = (λx. ?)(0 := 1, 1 := 2, 2 := 4)

The problem is compounded by nitpick wasting time trying different combinations
of cardinalities for the different intermediate types (nats, ints, lists,
words).

I am sure there would be a way to make nitpick generate more efficient
representations of words, but I don't have the expertise, sorry. I wonder if
Jasmin has any suggestions?

[...] Quickcheck on the other hand does not seem to cooperate with
anything that AutoCorres outputs. I typically receive something like
the following:

Quickchecking...
Type unification failed: No type arity lifted_globals_ext :: full_exhaustive

AutoCorres defines the "lifted_globals" type using the Record package,
which itself will instantiate "full_exhaustive" if it is able to do so.

In particular, if your C program doesn't use any global state (and hence your
lifted_globals datatype is particularly boring) the "lifted_globals" type is
indeed instantiated into the "full_exhaustive" class by the Record
package. For example, in a C program that doesn't use the heap, we can
write:

lemma "(a :: lifted_globals) = b"
quickcheck

and quickcheck produces:

Quickcheck found a counterexample:

a = ⦇phantom_machine_state_' = Suc 0, ghost'state_' = ()⦈
b = ⦇phantom_machine_state_' = 0, ghost'state_' = ()⦈

Things appear to fall apart, however, when either the "lifted_globals"
state or the property being proven become more complex.

On the "lifted_globals" side of things, the moment a program starts to
use the heap, the "ptr" type comes into play. By default, quickcheck
doesn't know how to work with this, so just gives up. We can write the
line:

quickcheck_generator ptr constructors: Ptr

which teaches quickcheck how to work with the "ptr" type, but quickcheck
still falls over for reasons I don't fully understand. I _suspect_ it
may be related to the dummy type variable in the "ptr" type, which
causes quickcheck to fail with the error message:

Error: Type 'a ptr list includes a free type variable

This issue might have come up in the past on the list [1, 2].

[1] : https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2009-July/msg00012.html
[2] : https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2009-September/msg00075.html

On the property-is-too-complex side of things, quickcheck quickly wants
the "lifted_globals" type to be instantiated into the "enum" class;
presumably so it can evaluate quantifiers. As the lifted_globals type is
frequently of infinite size (it has "nat"'s in its representation), this
isn't feasible.

You also seem to have hit some different errors to me, and its not clear
to me if they are related or not.

So, in short, I think the problems are (at least?):

1. Many properties you might want to quickcheck appear to require
"lifted_globals" to be in the "enum" class, which it usually
can't be.

2. The "ptr" type causes problems for "quickcheck" for reasons I don't
fully understand, which means that any property involving
a non-trivial "lifted_globals" will cause errors.

Is there anyone with quickcheck expertise that might be able to help
resolve these issues?

Cheers,
David

The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

Email Gateway (Aug 19 2022 at 16:35):

From: Andreas Lochbihler <andreas.lochbihler@inf.ethz.ch>
Hi David and Scott,

I do have some experience with quickcheck, but I do not know AutoCorres at all.

I nevertheless had a brief look at problem 1. At least for validity Hoare triples, the
enum requirement arises as follows:

AutoCorres defines validity of Hoare triples as follows:
"⦃P⦄ f ⦃Q⦄ ≡ ∀s. P s ⟶ (∀(r,s') ∈ fst (f s). Q r s')"

The unbounded universal quantifier over s causes the problem with the enum type class,
because the code equation for ∀ implements it by testing all values of the universe (which
it obtains from the enum type class). There are two ways to avoid the enum type class:

Make the quantifier bounded by P and generate the states from P. Then, P cannot be a
predicate of type "'s => bool" any more, so this is probably not viable. One could,
however, implement a pass in the preprocessor of the code generator that tries to rewrite
precondition instances into generators for states. Then, quickcheck would only test states
that actually satisfy the precondition, so quickcheck has a reasonable chance of finding
counterexamples. However, this only works if P can really be rewritten into a generator. I
expect that one would need generative versions of all the common functions that occur in
preconditions.
For quickcheck, it suffices to replace the quantifier by some generator for states
(rather than covering all states). To that end, you have to define a copy of the
quantifier and implement is with a code equation that tests the predicate in the points
generated by the generator and raises an exception (with Code.abort) after the generator
of states has been exhausted. This is probably less work in terms of setup than 1 and
works reliably, but it is unclear how good the generated states are. If hardly any of them
satisfy the precondition P, then test coverage is low.

If one wants to invest more work, one can also combine both approaches. If 1 fails, try 2.

I was not able to look at problem 2, because I have not found a small example where the
ptr causes an exception. If you provide one, I can have a look at this, too.

Best,
Andreas

Email Gateway (Aug 19 2022 at 16:35):

From: David Greenaway <david.greenaway@nicta.com.au>
Hi Andreas,

Thanks for your suggestions about writing some custom state generators.
I have some ideas about what might be useful testcases in practice.

Generating custom states by inspecting the precondition sounds fun, but
might be a longer-term project. :)
PtrTypeQuickcheck.thy

Email Gateway (Aug 19 2022 at 16:35):

From: Andreas Lochbihler <andreas.lochbihler@inf.ethz.ch>
Hi David,

Now I see that ML's value restriction causes the error. One of the parameters of the enum
type class (namely enum) is not of function type. Since its definition 'a ptr does not
depend on other type classes' parameters for 'a, the code generator produces a declaration
of the form

val enum_ptr = ...

which of course should be polymorphic in the element type, but this is not allowed for val
declarations in ML. If you restrict the element type to a type class with type parameters,
then the code generator will produce a fun declaration, which may be polymorphic. You can
define your own type class such as

class dummy = fixes DUMMY :: "'a itself"

and declare a corresponding code equation:

lemma enum_ptr_code [code]:
"Enum.enum = (map (Ptr o of_nat) [0 ..< 2 ^ 32] :: 'a :: dummy ptr list)"
by(simp add: enum_ptr_def)

Of course, you have to instantiate this class for all your type constructors that you want
to store in a ptr.

instantiation "fun" :: (type, type) dummy begin
definition "DUMMY = TYPE('a ⇒ 'b)"
instance ..
end

instantiation nat :: dummy begin
definition "DUMMY = TYPE(nat)"
instance ..
end

Alternatively, you can reuse one of Isabelle/HOL's type classes, e.g. typerep (of which
almost every type is an instance of):

lemma enum_ptr_code [code]:
"Enum.enum = (map (Ptr o of_nat) [0 ..< 2 ^ 32] :: 'a :: typerep ptr list)"
by(simp add: enum_ptr_def)

With this setup, the code generator outputs valid ML code.

Best,
Andreas

Email Gateway (Aug 19 2022 at 16:36):

From: Scott Constable <sdconsta@syr.edu>
David and Andreas,

Thanks for the investigation on quickcheck. Meanwhile I have been looking into tweaking nitpick, though the results thus far have been troubling. See below.

lemma word_lem: "(x :: 4 word) = y"
nitpick[verbose, mono, card int = 32, card nat = 16, card "2 list" = 4,
card "1 list" = 4, card "4 word" = 8, card "4" = 4, card "2" = 2,
card "1" = 1, card unit = 1]

Nitpicking formula...
Timestamp: 08:56:02.
The types "4 word", "4", and "2" are considered monotonic. Nitpick might
be able to skip some scopes.
Using SAT solver "Lingeling_JNI". The following solvers are configured:
"Lingeling_JNI", "CryptoMiniSat_JNI", "MiniSat_JNI", "SAT4J",
"SAT4J_Light".
Trying 1 scope:
card int = 32, card nat = 16, card "2 list" = 4, card "1 list" = 4,
card "4 word" = 8, card "4" = 4, card "2" = 2, card "1" = 1, and
card unit = 1.
Nitpick found a potentially spurious counterexample for card int = 32,
card nat = 16, card "2 list" = 4, card "1 list" = 4, card "4 word" = 8,
card "4" = 4, card "2" = 2, card "1" = 1, and card unit = 1:

Free variables:
x = «1»
y = «0»
Nitpick could not find a better counterexample. It checked 1 of 1 scope.
Total time: 3.8 s.

For this contrived example, I was able to get nitpick to focus on one scope, and find the most trivial counterexample, albeit in almost four seconds. Slight perturbations to the cardinalities (i.e. reducing “card int” to 16) may either cause nitpick to find stranger counterexamples, like <<8>> and <<9>>, or not find any counterexamples at all. My experiments have exhausted my understanding of nitpick, and I’m not sure if there is any way of further improving performance.

Though I have a general idea of what you both are suggesting on the quickcheck side, I am not convinced that this solution will scale up to the application which my research group desires: checking Hoare triples over heap abstracted code generated by AutoCorres.

~Scott

Email Gateway (Aug 19 2022 at 16:37):

From: Jasmin Christian Blanchette <jasmin.blanchette@gmail.com>
Dear Scott,

Sorry for the delay in answering. I am just coming back from vacation.

The problem here is that Nitpick has no built in support for words. It then needs to work in terms of the definition

typedef 'a word = "{(0::int) ..< 2 ^ len_of TYPE('a::len0)}"
morphisms uint Abs_word by auto

which is not fun for SAT solving.

Nitpick's underlying engine, Kodkod, has a notion of words, and it would be possible to handle them really efficiently. However, this would require quite a bit of work in Nitpick, which is not one of my main research interests right now. Sorry about that.

Regards,

Jasmin

Last updated: Oct 18 2025 at 04:23 UTC