[isabelle] Failing simproc · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Failing simproc

Email Gateway (Aug 18 2022 at 20:32):

From: Dmitriy Traytel <traytel@in.tum.de>
Hi *,

I stumbled upon some funny behaviour which prevents the simplifier in
proving "True" in Isabelle2012 (also in 8a1ef12f7e6d). The reduced
example is attached.

Dmitriy
Simproc_Fail.thy

Email Gateway (Aug 18 2022 at 20:32):

From: Lars Noschinski <noschinl@in.tum.de>
This exception is thrown by the simproc natless_cancel_sums (in
Nat_Arith.nat_cancel_sums), as the following succeeds:

notepad
begin
ML_prf {* Delsimprocs [nth Nat_Arith.nat_cancel_sums 1] *}
have "(\<And>ys zs. length ys < Suc (length zs + length ys))" by simp
hence True
by simp

The other simprocs in Nat_Arith also seem to be affected.

-- Lars

Email Gateway (Aug 18 2022 at 20:34):

From: Brian Huffman <huffman@in.tum.de>
For the record: This problem was caused by an interaction between
simprocs and schematic type variables. More specifically, some
internal proofs done by Nat_Arith simprocs would fail on terms that
contain schematics. The problem is now fixed in revision 868dc809c8a2.

Authors of simprocs should take note: The simplifier may run simprocs
on terms containing schematics (type and/or term variables) and it is
the responsibility of the simproc to handle this situation properly.

A simproc must never instantiate any schematic variable. Some
guidelines for simproc writers: Internal proofs should use the
simplifier only in "safe" mode, i.e. only using the "safe" solvers,
which avoid instantiating schematics. Also, avoid instantiating rules
with terms from the input and then using those rules with rtac,
because schematic variables in a rule are not preserved when the rule
is applied. (The Nat_Arith simprocs failed to follow either of these
guidelines.)

Brian

Email Gateway (Aug 19 2022 at 07:46):

From: Lars Noschinski <noschinl@in.tum.de>
On 20.07.2012 15:06, Brian Huffman wrote:

This exception is thrown by the simproc natless_cancel_sums (in
Nat_Arith.nat_cancel_sums), as the following succeeds:

notepad
begin
ML_prf {* Delsimprocs [nth Nat_Arith.nat_cancel_sums 1] *}
have "(\<And>ys zs. length ys< Suc (length zs + length ys))" by simp
hence True
by simp

The other simprocs in Nat_Arith also seem to be affected.

For the record: This problem was caused by an interaction between
simprocs and schematic type variables. More specifically, some
internal proofs done by Nat_Arith simprocs would fail on terms that
contain schematics. The problem is now fixed in revision 868dc809c8a2.

Authors of simprocs should take note: The simplifier may run simprocs
on terms containing schematics (type and/or term variables) and it is
the responsibility of the simproc to handle this situation properly.

There are also other simprocs, which produce equations, which are
ignored by the simplifier. I haven't checked closely yet, but these are
most likely due to instantiations of schematic variables:

$ zgrep 'IGNORED result of simproc' * | cut -d\" -f2 | sort | uniq -c
45 Cfun.beta_cfun_proc
86 equal
2 Numeral_Simprocs.inteq_cancel_numerals
941 perm_simproc_fun
24814 record_eq_simp

(src/HOL make all)

A simproc must never instantiate any schematic variable. Some
guidelines for simproc writers: Internal proofs should use the
simplifier only in "safe" mode, i.e. only using the "safe" solvers,
which avoid instantiating schematics. Also, avoid instantiating rules
with terms from the input and then using those rules with rtac,
because schematic variables in a rule are not preserved when the rule
is applied. (The Nat_Arith simprocs failed to follow either of these
guidelines.)

So you decided against the generic approach of importing the schematic
variables? Was it because of efficiency?

-- Lars

Email Gateway (Aug 19 2022 at 07:47):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>

$ zgrep 'IGNORED result of simproc' * | cut -d\" -f2 | sort | uniq -c
86 equal

I'm wondering a little bit about this. The simproc »equal« is part of
the code generatorm, preprocessor, as follows:

setup {*
Code_Preproc.map_pre (fn simpset =>
simpset addsimprocs [Simplifier.simproc_global_i @{theory} "equal"
[@{term HOL.eq}]
(fn thy => fn _ =>
fn Const (_, Type ("fun", [Type _, _])) => SOME @{thm eq_equal}
| _ => NONE)])
*}

I don't see any instantiation of schematic variables here, but I'm more
or less ignorant about the simproc business, so the code above may be
utterly wrong or conceptionally outdated.

Florian
signature.asc

Email Gateway (Aug 19 2022 at 07:47):

From: Lars Noschinski <noschinl@in.tum.de>
On 20.07.2012 17:51, Florian Haftmann wrote:

$ zgrep 'IGNORED result of simproc' * | cut -d\" -f2 | sort | uniq -c
86 equal

Ok, these numbers are flawed. Despite the wording of the warning message

"IGNORED result of simproc " ^ quote name ^ " -- does not match"

the warning message is printed not only for non-matching equalities but
also for anything which can not successfully be used for rewriting. This
explains the big number for record_eq_simp (which generates conditional
rules).

I'm wondering a little bit about this. The simproc »equal« is part of
the code generatorm, preprocessor, as follows:

setup {*
Code_Preproc.map_pre (fn simpset =>
simpset addsimprocs [Simplifier.simproc_global_i @{theory} "equal"
[@{term HOL.eq}]
(fn thy => fn _ =>
fn Const (_, Type ("fun", [Type _, _])) => SOME @{thm eq_equal}
| _ => NONE)])
*}

Are there situations where this rewrite step is expected to fail (maybe
due to sort constraints)? Then this warning would be expected, too.

-- Lars

Email Gateway (Aug 19 2022 at 07:47):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Failure is expected due to types themselves: only instantiated equality
(eq [tyco …]) is rewritten, not fully polymorphic equality (eq ['a]).

Florian
signature.asc

Email Gateway (Aug 19 2022 at 07:48):

From: Brian Huffman <huffman@in.tum.de>
On Fri, Jul 20, 2012 at 5:36 PM, Lars Noschinski <noschinl@in.tum.de> wrote:

On 20.07.2012 15:06, Brian Huffman wrote:

A simproc must never instantiate any schematic variable. Some
guidelines for simproc writers: Internal proofs should use the
simplifier only in "safe" mode, i.e. only using the "safe" solvers,
which avoid instantiating schematics. Also, avoid instantiating rules
with terms from the input and then using those rules with rtac,
because schematic variables in a rule are not preserved when the rule
is applied. (The Nat_Arith simprocs failed to follow either of these
guidelines.)

Perhaps I can explain this better. Here's how the Nat_Arith
cancellation simprocs worked before yesterday's rev. 868dc809c8a2 --
note that this technique is wrong and simproc writers must avoid it!

The idea is to prove e.g. "(a + x + b < c + x + d) = (a + b < c + d)"
by inserting "x" on the right, and then doing ac rewriting. The
simproc does the same steps as this proof script:

schematic_lemma
"(a + size (x::?'b1::size) + b < c + size (x::?'b1::size) + d) = (a

b < c + d)"
apply (rule_tac ?k.1 = "size x" in nat_add_left_cancel_less [THEN subst_equals])
apply (simp only: add_ac)

Stepping through this proof with "show_sorts" enabled reveals the
problem: The inserted "x" has type "?'b3" instead of "?'b1", because
rule_tac automatically renames schematics in the rule to avoid clashes
with schematics in the proof goal.

So you decided against the generic approach of importing the schematic
variables? Was it because of efficiency?

I haven't looked at efficiency yet. The more serious problem is that
simprocs that use this approach now are broken! The simprocs in
Provers/Arith/cancel_numerals.ML already try to use
Variable.import_terms and Variable.export to temporarily fix
schematics. But on Dmitriy's original example, natless_cancel_numerals
returns an equation that cannot be used by the simplifier because its
schematic type variables are instantiated.

I'm not sure, but there might be a bug in Variable.export and related
functions. For example:

ML_val {*
let
val ctxt = @{context}
val ts1 = [Thm.term_of @{cpat "size (x::?'a::size)"}]
val (ts2, ctxt') = Variable.import_terms true ts1 ctxt
val ts3 = Variable.export_terms ctxt' ctxt ts2
in
(ts1, ts2, ts3)
end
*}

val it =
([Const ("Nat.size_class.size", "?'a \<Rightarrow> nat") $ Free
("x", "?'a")],
[Const ("Nat.size_class.size", "'a \<Rightarrow> nat") $ Free ("x", "'a")],
[Const ("Nat.size_class.size", "'a \<Rightarrow> nat") $ Free ("x", "'a")]):
term list * term list * term list

The export_terms step does nothing at all! (This example actually
works correctly if we replace "x" with "?x" in the input term.)

I'm afraid we can't recommend simproc writers to use the
Variable.import/export technique, at least not until we figure this
out.

Brian

Email Gateway (Aug 19 2022 at 07:48):

From: Makarius <makarius@sketis.net>
theory Simproc_Fail
imports Main
begin

notepad
begin
have "(\<And>ys zs. length ys < Suc (length zs + length ys))" by simp
hence True apply simp sorry
end

end

This is a hard crash, not a tactic failure. Generally, goal states with
polymorphic content (schematic type variables) are pathologic; many proof
tools will choke on them. It is of the "don't do it then category":
posing goals with schematic type variables is asking for trouble.

Above this happens above after piping the "this" fact into the True goal
via the "insert" phase of simp. Instead of 'using' polymoprhic facts,
typically those from the background library, one should "add" them somehow
to the automated tools, e.g. (simp add: ...), (auto simp add: ...), (auto
iff: ...) ...

Anyway, did you have a concrete application where unresolved types in
goals were required, or was it just one of the common accidents to get
them via bad luck?

This incident remindes me again that I wanted to work out some refined
color scheme for term variables with unexpected poloypmorphism,
potentially with some extra tool tips apart from the coloring.

Makarius

Email Gateway (Aug 19 2022 at 07:48):

From: Makarius <makarius@sketis.net>
On Sat, 21 Jul 2012, Brian Huffman wrote:

On Fri, Jul 20, 2012 at 5:36 PM, Lars Noschinski <noschinl@in.tum.de> wrote:

On 20.07.2012 15:06, Brian Huffman wrote:

A simproc must never instantiate any schematic variable. Some
guidelines for simproc writers: Internal proofs should use the
simplifier only in "safe" mode, i.e. only using the "safe" solvers,
which avoid instantiating schematics. Also, avoid instantiating rules
with terms from the input and then using those rules with rtac,
because schematic variables in a rule are not preserved when the rule
is applied. (The Nat_Arith simprocs failed to follow either of these
guidelines.)

Perhaps I can explain this better. Here's how the Nat_Arith
cancellation simprocs worked before yesterday's rev. 868dc809c8a2 --
note that this technique is wrong and simproc writers must avoid it!

What is wrong, fixed, broken now? I am confused.

The simproc under consideration stems from 1997, when I was supervising
Stefan Berghofer in a very small student project. According to the
Mercurial history, it has not really been improved since then. I first
thought that it was covered by Brian's general simproc renovation project
for Isabelle2012, but it does not seem the case.

IIRC, the system was more aggressive back then to reject goals with
schematic types outright.

The idea is to prove e.g. "(a + x + b < c + x + d) = (a + b < c + d)"
by inserting "x" on the right, and then doing ac rewriting. The
simproc does the same steps as this proof script:

schematic_lemma
"(a + size (x::?'b1::size) + b < c + size (x::?'b1::size) + d) = (a
+ b < c + d)"
apply (rule_tac ?k.1 = "size x" in nat_add_left_cancel_less [THEN subst_equals])
apply (simp only: add_ac)

Goals with schematic type variables routinely lead to undefined behaviour.
Schematic types for fixed term variables are especially pathologic. Does
this example have any practical relevance?

Anyway, a more profound deficiency of this modest simproc is this: It is
supposed to normalize certain outer algebraic structure, but it retains
the concrete subterms. Thus the normalization might do non-sense with the
accidental substructure, not just with its hidden polymorphism as above.

This was already known in 1997, but we did not have the technology so do
better on the spot. Today in 2012, it should be trivial to use the local
context of the simproc (from the "ss") to introduce fresh fixes for the
terms (ts, us, vs), then prove the result in its generalized form with
fully opaque subterms, finally export the resulting equation to hand it
back to the simplifier. Thus the local goal is fully fixed with fixed
types, and the result as schematic and poylmorphic as required. The
simplifier will apply higher-order matching to recover the concrete
instance.

I'm not sure, but there might be a bug in Variable.export and related
functions. For example:

ML_val {*
let
val ctxt = @{context}
val ts1 = [Thm.term_of @{cpat "size (x::?'a::size)"}]
val (ts2, ctxt') = Variable.import_terms true ts1 ctxt
val ts3 = Variable.export_terms ctxt' ctxt ts2
in
(ts1, ts2, ts3)
end
*}

val it =
([Const ("Nat.size_class.size", "?'a \<Rightarrow> nat") $ Free
("x", "?'a")],
[Const ("Nat.size_class.size", "'a \<Rightarrow> nat") $ Free ("x", "'a")],
[Const ("Nat.size_class.size", "'a \<Rightarrow> nat") $ Free ("x", "'a")]):
term list * term list * term list

The export_terms step does nothing at all! (This example actually
works correctly if we replace "x" with "?x" in the input term.)

I'm afraid we can't recommend simproc writers to use the
Variable.import/export technique, at least not until we figure this out.

A fixed term variable x::?'a with schematic type should normally not occur
during logical inference. It conceptually means that the type-inference
phase did not finish its job before entering the logic. It might also
mean that old and new tools were combined in a odd way, without the
"proper context" or things not declared properly in the context.

Fixed type variables within fixed term variables are still fixed,
according to Hindley-Milner polymorphism. So the above export looks
right. If subterms with their potential hidden polymorphism are made
abstract as sketched above, such problems will not occur.

Makarius

Email Gateway (Aug 19 2022 at 07:49):

From: Brian Huffman <huffman@in.tum.de>
On Sat, Jul 21, 2012 at 8:19 PM, Makarius <makarius@sketis.net> wrote:

On Sat, 21 Jul 2012, Brian Huffman wrote:

On Fri, Jul 20, 2012 at 5:36 PM, Lars Noschinski <noschinl@in.tum.de>

wrote:
>

On 20.07.2012 15:06, Brian Huffman wrote:

A simproc must never instantiate any schematic variable. Some
guidelines for simproc writers: Internal proofs should use the
simplifier only in "safe" mode, i.e. only using the "safe" solvers,
which avoid instantiating schematics. Also, avoid instantiating rules
with terms from the input and then using those rules with rtac,
because schematic variables in a rule are not preserved when the rule
is applied. (The Nat_Arith simprocs failed to follow either of these
guidelines.)

Perhaps I can explain this better. Here's how the Nat_Arith
cancellation simprocs worked before yesterday's rev. 868dc809c8a2 --
note that this technique is wrong and simproc writers must avoid it!

What is wrong, fixed, broken now? I am confused.

What's wrong: The implementation of the simprocs in
HOL/Tools/nat_arith.ML of Isabelle2012. (In particular, the function
gen_uncancel_tac is not robust in the presence of schematics.) Writers
of new simprocs should not emulate this code.

What's fixed: Since rev. 868dc809c8a2, the simprocs in
HOL/Tools/nat_arith.ML work correctly on goals with schematics.

The idea is to prove e.g. "(a + x + b < c + x + d) = (a + b < c + d)"
by inserting "x" on the right, and then doing ac rewriting. The
simproc does the same steps as this proof script:

schematic_lemma
"(a + size (x::?'b1::size) + b < c + size (x::?'b1::size) + d) = (a
+ b < c + d)"
apply (rule_tac ?k.1 = "size x" in nat_add_left_cancel_less [THEN
subst_equals])
apply (simp only: add_ac)

Goals with schematic type variables routinely lead to undefined behaviour.
Schematic types for fixed term variables are especially pathologic. Does
this example have any practical relevance?

Yes, it does.

Sometimes users have proof goals containing schematics. Sometimes they
run the simplifier on such goals. This can result in simprocs being
asked to rewrite terms containing schematics. For example, a simproc
may need to internally prove the equality "(a + size (x::?'b1::size) +
b < c + size (x::?'b1::size) + d) = (a + b < c + d)".

Keep in mind that the proof script above is not a user proof; it
represents the internal proof done by a simproc.

Anyway, a more profound deficiency of this modest simproc is this: It is
supposed to normalize certain outer algebraic structure, but it retains the
concrete subterms. Thus the normalization might do non-sense with the
accidental substructure, not just with its hidden polymorphism as above.

This was already known in 1997, but we did not have the technology so do
better on the spot. Today in 2012, it should be trivial to use the local
context of the simproc (from the "ss") to introduce fresh fixes for the
terms (ts, us, vs), then prove the result in its generalized form with fully
opaque subterms, finally export the resulting equation to hand it back to
the simplifier. Thus the local goal is fully fixed with fixed types, and
the result as schematic and poylmorphic as required. The simplifier will
apply higher-order matching to recover the concrete instance.

I think your idea would work. However it seems like it would be
non-trivial to implement: You would have to determine the high-level
structure of the original term, identify which of the complex subterms
are equal, and then build new, abstract terms with fresh variables in
the right places. Do you think we should recommend this technique as a
standard best-practice for simproc writers?

The technique used by the simprocs in
HOL/Tools/nat_numeral_simprocs.ML (based on an ML functor from
Provers/Arith/cancel_numerals.ML) is much easier to use: Just do a
Variable.import_terms at the beginning to locally fix the schematics,
and then run Variable.export on the resulting equation.

I think the Variable.import/export technique would make a good
official recommendation for simproc writers (something for the
Cookbook, maybe) if it only worked right! This is why I complained of
a possible bug in Variable.export.

In Dmitriy's original example, the simp trace reveals that the
natless_cancel_numerals simproc (which uses the import/export
technique) returns an unusable rule where some of the locally-fixed
type variables have not been properly generalized again.

declare [[show_types]]
declare [[simp_trace]]
declare [[simp_debug]]

notepad
begin
have "(\<And>ys zs. length ys < Suc (length zs + length ys))" by simp
hence True apply simp sorry
end

[1]Trying procedure "Numeral_Simprocs.natless_cancel_numerals" on:
length (:000\<Colon>?'a1 list) < Suc (length (:001\<Colon>?'b1 list) +
length :000)
...
[1]Procedure "Numeral_Simprocs.natless_cancel_numerals" produced rewrite rule:
length (:000\<Colon>'a list) < Suc (length (:001\<Colon>'b list) +
length :000) \<equiv>
(0\<Colon>nat) < Suc (length :001)

A fixed term variable x::?'a with schematic type should normally not occur
during logical inference. It conceptually means that the type-inference
phase did not finish its job before entering the logic. It might also mean
that old and new tools were combined in a odd way, without the "proper
context" or things not declared properly in the context.

Oh, but it can, and it does! This situation is clearly shown in the
simp trace above, which was produced by Dmitriy's completely
reasonable proof script.

Fixed type variables within fixed term variables are still fixed, according
to Hindley-Milner polymorphism. So the above export looks right. If
subterms with their potential hidden polymorphism are made abstract as
sketched above, such problems will not occur.

Of course, if we completely redesigned all of our simprocs, we could
avoid all these problems.

What we need, though, is a simple, practical technique that any
simproc writer can use. The Variable.import/export technique would be
just the thing, but we need to address the bugs in Variable.export
first.

Brian

Last updated: Jul 04 2025 at 01:50 UTC