Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Apparent dovetailing/scheduling problem with S...


view this post on Zulip Email Gateway (Mar 08 2023 at 20:57):

From: "Eugene W. Stark" <isabelle-users@starkeffect.com>
Since the release of Isabelle 2022 I have had relatively frequent situations in which I have
been using Sledgehammer, it reports "No proof found", but in fact the fact to be proved
can easily be proved using auto or simp. This tends to cause me to waste time because I end
up entering smaller and smaller proof steps but then I find I could have done the whole thing
with auto or simp in the first place.

I finally found a case that could be turned into a short example:


theory Barf
imports Main
begin

locale X =
fixes zero :: 'a and one :: 'a and two :: 'a and null :: 'a
assumes distinct: "null ≠ zero ∧ null ≠ one ∧ null ≠ two ∧
zero ≠ one ∧ zero ≠ two ∧ one ≠ two"
begin

definition P
where "P x ⟷ (x = zero ∨ x = one ∨ x = two)"

lemma P_iff:
shows "P x ⟷ x = zero ∨ x = one ∨ x = two"
unfolding P_def by blast

lemma
shows "Collect P = {zero, one, two}"
using P_iff
(*
HERE CLICK "Apply" in Sledgehammer panel.
Sledgehammer with "Provers: cvc4 z3 verit", "Isar proofs" unchecked,
"Try methods" checked, reports "No proof found" after about 40 seconds,
but the lemma is provable by auto.
*)
by auto

end


It seems like it must be some kind of a problem with the way "try methods" is scheduled
with respect to the SMT provers, but that is just speculation because I don't have a clue
as to how it really works.

- Gene Stark

view this post on Zulip Email Gateway (Mar 08 2023 at 22:49):

From: "Eugene W. Stark" <isabelle-users@starkeffect.com>
(Sorry for the spam -- I was lazy and inadvertently ended up adding the original posting to a
"Call for Papers" Thread, instead of a new thread.)

Since the release of Isabelle 2022 I have had relatively frequent situations in which I have
been using Sledgehammer, it reports "No proof found", but in fact the fact to be proved
can easily be proved using auto or simp. This tends to cause me to waste time because I end
up entering smaller and smaller proof steps but then I find I could have done the whole thing
with auto or simp in the first place.

I finally found a case that could be turned into a short example:


theory Barf
imports Main
begin

locale X =
fixes zero :: 'a and one :: 'a and two :: 'a and null :: 'a
assumes distinct: "null ≠ zero ∧ null ≠ one ∧ null ≠ two ∧
zero ≠ one ∧ zero ≠ two ∧ one ≠ two"
begin

definition P
where "P x ⟷ (x = zero ∨ x = one ∨ x = two)"

lemma P_iff:
shows "P x ⟷ x = zero ∨ x = one ∨ x = two"
unfolding P_def by blast

lemma
shows "Collect P = {zero, one, two}"
using P_iff
(*
HERE CLICK "Apply" in Sledgehammer panel.
Sledgehammer with "Provers: cvc4 z3 verit", "Isar proofs" unchecked,
"Try methods" checked, reports "No proof found" after about 40 seconds,
but the lemma is provable by auto.
*)
by auto

end


It seems like it must be some kind of a problem with the way "try methods" is scheduled
with respect to the SMT provers, but that is just speculation because I don't have a clue
as to how it really works.

- Gene Stark

view this post on Zulip Email Gateway (Mar 09 2023 at 06:53):

From: Mathias Fleury <mathias.fleury12@gmail.com>
Hi Eugene,

Did you intend to use the command try (that calls Sledgehammer, try0,
nitpick) instead of Sledgehammer only? try0 tries auto/simp/blast/... to
see if any tactic is able to prove the goal, whereas Sledgehammer does
not. On your example try reports very fast that "by blast" is able to
prove the goal.

It is not very surprising that proofs exists that auto can find, but
Sledgehammer does not.

Mathias

view this post on Zulip Email Gateway (Mar 09 2023 at 08:40):

From: "Eugene W. Stark" <isabelle-users@starkeffect.com>
My understanding was that the "Try Methods" checkbox in the Sledgehammer panel was supposed to enable trying
the normal methods (simp, auto, blast, force, etc.) in addition to the SMT solvers. If that is incorrect,
then what is the purpose of that checkbox?

view this post on Zulip Email Gateway (Mar 09 2023 at 08:50):

From: Mathias Fleury <mathias.fleury12@gmail.com>
According to the official documentation
https://isabelle.in.tum.de/doc/sledgehammer.pdf:

try0[=〈bool〉]{true}(neg.:dont_try0)
Specifies whether standard proof methods such asautoandblastshould
be tried as alternatives tometisin Isar proofs. The collection of meth-
ods is roughly the same as for thetry0command.

Let's try it on your example:

lemma P_iff:
     shows "P x ⟷ x = zero ∨ x = one ∨ x = two"

With the option:
   cvc4: Try this: by (simp add: P_def) (0.8 ms)

Without you get only metis steps:
  vampire: Try this: by (metis P_def) (199 ms)
  cvc4: Try this: by (smt (verit) P_def) (1.2 s)

My guess is that this option is mostly there for historical reasons to
reduce the amount of computer CPU power required.

Mathias

view this post on Zulip Email Gateway (Mar 09 2023 at 09:01):

From: Jasmin Blanchette <jasmin.blanchette@ifi.lmu.de>
Indeed. There was a time where this option was new and experimental, but nowadays is almost unthinkable to use Sledgehammer without it. There are nevertheless still a few rare cases where one wants to see exactly which lemmas were used by the prover, and the metis output is then more informative.

Jasmin

view this post on Zulip Email Gateway (Mar 09 2023 at 09:01):

From: "Eugene W. Stark" <isabelle-users@starkeffect.com>
As I stated in my original posting, I get:

With Provers: cvc4 z3 verit", "Isar proofs" unchecked, "Try methods" checked,
"No proof found" reported after about 40 seconds when "Apply" in the Sledgehammer panel
is selected.

Unchecking "Try methods" does not change the behavior.

Yes, entering "try" in the buffer does quickly result in the suggestion "by blast",
but that is not the issue that I was reporting. I am reporting an issue with the
behavior of the Sledgehammer panel. Why is the behavior of the Sledgehammer panel
with "Try methods" checked different than the behavior when "try" is entered into
the editing buffer?

view this post on Zulip Email Gateway (Mar 09 2023 at 09:05):

From: Dominique Unruh <cl-isabelle-users@lists.cam.ac.uk>

Indeed. There was a time where this option was new and experimental,
but nowadays is almost unthinkable to use Sledgehammer without it. There
are nevertheless still a few rare cases where one wants to see exactly
which lemmas were used by the prover, and the metis output is then more
informative.

Maybe it would make sense to remove this option (at least from the UI)
as it is confusing, and instead add an option: "show involved lemmas"
(which would add an extra line after every suggested tactic, or maybe
for compactness after every suggested tactic that isn't anyway a
metis-call)?

I think this would avoid the understandable confusion about the checkbox.

Best wishes,
Dominique.

view this post on Zulip Email Gateway (Mar 09 2023 at 09:12):

From: Mathias Fleury <mathias.fleury12@gmail.com>
On 3/9/23 10:01, Eugene W. Stark wrote:

As I stated in my original posting, I get:

With Provers: cvc4 z3 verit", "Isar proofs" unchecked, "Try methods"
checked,
"No proof found" reported after about 40 seconds when "Apply" in the
Sledgehammer panel
is selected.

Unchecking "Try methods" does not change the behavior.

It does… if a proof a found. It is an option for the reconstruction of
the proofs.

Yes, entering "try" in the buffer does quickly result in the
suggestion "by blast",
but that is not the issue that I was reporting.  I am reporting an
issue with the
behavior of the Sledgehammer panel.  Why is the behavior of the
Sledgehammer panel
with "Try methods" checked different than the behavior when "try" is
entered into
the editing buffer?

I mentioned "try" because it solves your problem. The "try methods"
checkbox has nothing to do with the "try" tactic you can type in the
buffer, except for the name:

"try" is roughly equivalent to "nitpick OR sledgehammer OR try0,
whatever terminates first".

"Try methods" is equivalent to "sledgehammer" (when on) or
"sledgehammer[dont_try0]" (when off).

Mathias

On 3/9/23 03:50, Mathias Fleury wrote:

According to the official documentation
https://isabelle.in.tum.de/doc/sledgehammer.pdf:

try0[=〈bool〉]{true}(neg.:dont_try0)
Specifies whether standard proof methods such asautoandblastshould
be tried as alternatives tometisin Isar proofs. The collection of meth-
ods is roughly the same as for thetry0command.

Let's try it on your example:

lemma P_iff:
      shows "P x ⟷ x = zero ∨ x = one ∨ x = two"

With the option:
    cvc4: Try this: by (simp add: P_def) (0.8 ms)

Without you get only metis steps:
   vampire: Try this: by (metis P_def) (199 ms)
   cvc4: Try this: by (smt (verit) P_def) (1.2 s)

My guess is that this option is mostly there for historical reasons
to reduce the amount of computer CPU power required.

Mathias

On 3/9/23 09:39, Eugene W. Stark wrote:

My understanding was that the "Try Methods" checkbox in the
Sledgehammer panel was supposed to enable trying
the normal methods (simp, auto, blast, force, etc.) in addition to
the SMT solvers.  If that is incorrect,
then what is the purpose of that checkbox?

On 3/9/23 01:52, Mathias Fleury wrote:

Hi Eugene,

Did you intend to use the command try (that calls Sledgehammer,
try0, nitpick) instead of Sledgehammer only? try0 tries
auto/simp/blast/... to see if any tactic is able to prove the goal,
whereas Sledgehammer does not. On your example try reports very
fast that "by blast" is able to prove the goal.

It is not very surprising that proofs exists that auto can find,
but Sledgehammer does not.

Mathias

On 3/8/23 23:48, Eugene W. Stark wrote:

(Sorry for the spam -- I was lazy and inadvertently ended up
adding the original posting to a
"Call for Papers" Thread, instead of a new thread.)

Since the release of Isabelle 2022 I have had relatively frequent
situations in which I have
been using Sledgehammer, it reports "No proof found", but in fact
the fact to be proved
can easily be proved using auto or simp.  This tends to cause me
to waste time because I end
up entering smaller and smaller proof steps but then I find I
could have done the whole thing
with auto or simp in the first place.

I finally found a case that could be turned into a short example:


theory Barf
imports Main
begin

locale X =
    fixes zero :: 'a and one :: 'a and two :: 'a and null :: 'a
  assumes distinct: "null ≠ zero ∧ null ≠ one ∧ null ≠ two ∧
                     zero ≠ one ∧ zero ≠ two ∧ one ≠ two"
  begin

definition P
    where "P x ⟷ (x = zero ∨ x = one ∨ x = two)"

lemma P_iff:
    shows "P x ⟷ x = zero ∨ x = one ∨ x = two"
      unfolding P_def by blast

lemma
    shows "Collect P = {zero, one, two}"
      using P_iff
      (*
         HERE CLICK "Apply" in Sledgehammer panel.
         Sledgehammer with "Provers: cvc4 z3 verit", "Isar proofs"
unchecked,
         "Try methods" checked, reports "No proof found" after
about 40 seconds,
         but the lemma is provable by auto.
       *)
      by auto

end


It seems like it must be some kind of a problem with the way "try
methods" is scheduled
with respect to the SMT provers, but that is just speculation
because I don't have a clue
as to how it really works.

- Gene Stark

view this post on Zulip Email Gateway (Mar 09 2023 at 10:09):

From: Dominique Unruh <cl-isabelle-users@lists.cam.ac.uk>
I would like to additionally mention try0. On my system, try gets the
system to its knees, while running sledgehammer and try0 together works
fine.

I have created a keyboard shortcut that inserts "sledgehammer try0 by -"
(on separate lines), this allows me to already enter and possibly prove
the next subgoal while slegehammer is still running.

(Creating keyboard shortcuts in jEdit:  type "sl Ctrl-;", this opens a
config box to enter the lines. Next time, type "sl Ctrl-;" to insert
those lines. xyz can be whatever you like.)

Best wishes,
Dominique.


Last updated: Apr 26 2024 at 08:19 UTC