[isabelle] Sound theorem instantiation rejected in Pure/g... · Mirror: Isabelle Users Mailing List

I am using Isabelle2025 and want to report an unexpected behaviour of a
check in Pure/goal.ML, function "prove_common" and in
Pure/Isar/proof.ML, function "conclude_goal".

Both functions try to check if the proven theorem is an instance of the
originally passed theorem statement using "Unify.matcher". However,
Unify.matcher is incomplete for higher-order matching, e.g. it fails on
the matching problem
"(?f :: prop ⇒ prop) ((?x :: prop ⇒ prop) y) ≡ ?f y"
(one solution: ?x |-> λ_. y).

As a consequence, sound proofs get rejected e.g. when using Goal.prove
and when writing a proof in Isar. Below I append an example theory file
showcasing this, including a minimised version of a lemma that triggered
this behaviour in my development.

Optimally, one would like to have a complete higher-order matcher, but I
am not aware of a simple algorithm for that (though I am by no means an
expert in this area). Checking the history of goal.ML and proof.ML
indicates that both have seen iterations trying to improve the
completeness of the instance check, i.e. this behaviour has been
considered and refined in the past.

Since the check happens in goal.ML and proof.ML, it affects users of
Isabelle/ML and Isabelle/Isar.
(1) I wonder if the check in goal.ML is necessary at all?
(2) Would it be conceivable to temporarily disable the check (e.g. with
a configuration option) in goal.ML and/or proof.ML for users?

theory Scratch
   imports Main
begin

lemma "∃f. set (case xs of [] ⇒ [] | _ ⇒ f xs) = set []"
   apply (rule exI)
   apply simp (*error: Proved a different theorem*)
   oops

ML‹
   val ctxt = @{context}
   val read = Proof_Context.read_term_pattern ctxt
   val [t1, t2] = Syntax.read_terms (Proof_Context.set_mode
Proof_Context.mode_schematic ctxt)
     ["(?f :: prop ⇒ prop) ((?x :: prop ⇒ prop) y)", "?f y"]
   val can_unif = Unify.matcher (Context.Proof ctxt) [t1] [t2] |>
is_some (*false; no matcher found*)
›

ML‹
let
   val ctxt = @{context}
   val goal = read "(?f :: bool ⇒ bool) ((?x :: bool ⇒ bool) y) ≡ ?f (?x y)"
   val insts = [
       ((("x", 0), @{typ "bool ⇒ bool"}), read "λ_ :: bool. (y :: bool)")
     ] |> map (apsnd (Thm.cterm_of ctxt))
in
  (*error: Proved a different theorem*)
   Goal.prove ctxt [] [] goal
     (fn _ => PRIMITIVE (Thm.instantiate (TVars.empty, Vars.make insts))
       THEN HEADGOAL (resolve0_tac @{thms reflexive}))
end
›

schematic_goal "(?f :: bool ⇒ bool) ((?x :: bool ⇒ bool) y) ≡ ?f (?x y)"
apply (tactic ‹
let
   val ctxt = @{context}
   val read = Proof_Context.read_term_pattern ctxt
   val insts = [
       ((("x", 0), @{typ "bool ⇒ bool"}), read "λ_ :: bool. (y :: bool)")
     ] |> map (apsnd (Thm.cterm_of ctxt))
in PRIMITIVE (Thm.instantiate (TVars.empty, Vars.make insts)) end›) (*OK*)
by (rule reflexive) (*error: Proved a different theorem*)

end

Email Gateway (Aug 14 2025 at 10:58):

Higher-order matching is undecidable. The check is necessary because a tactic replaces the proof state by a new proof state and could easily replace the goal by a different one. Matching is only necessary if the original goal contains schematic variables, which is these days extremely rare. Most of the time we only have to check for identity.

I tried your example, but that proof didn't work. I proved the theorem in a couple of different ways and did not get an error.

Email Gateway (Aug 14 2025 at 11:11):

From: Tobias Nipkow <nipkow@in.tum.de>
Higher-Order matchig modulo beta+eta (this is important) is decidable. This was
proved by Stirling in 2007 but is beside the point (partly because of the
complexity). The point that Kevin is making is this: the check has no logical
relevance, it is a sanity check, but it rules out perfectly sound proofs. Thus
there should be some way to say "the incomplete sanity check prevents this
operation, but trust me, I know what I am doing and you don't need to protect me
against doing something silly".

Email Gateway (Aug 14 2025 at 17:08):

From: Dmitriy Traytel <cl-isabelle-users@lists.cam.ac.uk>
What do you mean by “no logical reference”? The check is what rules out the following “proof”:

Email Gateway (Aug 14 2025 at 17:13):

From: Fabian Huch <huch@in.tum.de>
In meta-programming it is not relevant; in an Isar proof, one indeed
expect that the stated term was proven.

But there could be a way to supply the correct instantiation manually:
For instance, via syntax like "qed[where ?x=...]".

Email Gateway (Aug 15 2025 at 06:58):

I’m not sure what you mean by “in meta-programming it is not relevant”. The same (ruled-out) example:

ML ‹
Goal.prove @{context} [] [] (HOLogic.mk_Trueprop @{term False})
(fn _ => PRIMITIVE (fn thm => TrueI RS Drule.protectI))
›

Regarding the qed syntax proposal: Kevin’s original goal has no schematic variables, so there is nothing to instantiate.

Email Gateway (Aug 15 2025 at 08:09):

From: Kevin Kappelmann <kevin.kappelmann@tum.de>
For my original proof (and the one in the shared theory file using
"simp"), I discovered in the meantime that the list to set comprehension
simproc in List.thy creates the instance that does not pass the matching
sanity check. The simproc can be modified to use Goal.prove_internal
(which uses no matching check) instead of Goal.prove. I shared a patch
with Tobias which will be published soon.

Email Gateway (Aug 15 2025 at 11:30):

From: Makarius <makarius@sketis.net>
This thread is already concluded, being mostly based on misunderstandings how
things work.

I merely want to report the situation concerning the regular documentation of
Isabelle/ML, which is a combination of "isabelle doc implementation" + reading
the sources:

* The implementation manual for talks conceptually about a "prove"
operation and then refers concretely to Goal.prove and Goal.prove_common. The
text for "prove" is this:

The ‹prove› operation provides an interface for structured backwards
reasoning under program control, with some explicit sanity checks of the
result. [...]

* The documentation in ML is as follows: src/Pure/goal.ML defines things
clearly from the bottom up. Before introducing Goal.prove_common (and more
"prove variations"), there is this notable explanation:

fun prove_internal ctxt casms cprop tac =
(case SINGLE (tac (map (Assumption.assume ctxt) casms)) (init cprop) of
SOME th => Drule.implies_intr_list casms (finish ctxt th)
| NONE => error "Tactic failed");

A quick "induction over the sources" (via hypersearch in Isabelle/jEdit) shows
that Goal.prove_internal is only used in very few situations. So there is no
need to emphasize it in the LaTeX-based documentation; the ML-based
documentation is sufficient.

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Sound theorem instantiation rejected in Pure/g...