[isabelle-dev] Two new experimental features in Sledgehammer · Mirror: Isabelle Development Mailing List

Stream: Mirror: Isabelle Development Mailing List

Topic: [isabelle-dev] Two new experimental features in Sledgehammer

Email Gateway (Mar 07 2023 at 17:00):

From: Jasmin Blanchette <jasmin.blanchette@ifi.lmu.de>
Hi all,

I have recently implemented two new features in Sledgehammer and would be grateful for your feedback before the release. One feature is some form of falsification or counterexample search, the other one is a form of abduction (inference of missing assumptions).

Perhaps the best explanation is an example. Suppose you state the goal

lemma "x - y + y = (x::nat)"

Notice that the condition "y <= x" was forgotten. In the "falsification" mode, which is run in parallel with the regular mode, Sledgehammer will add "!!x y. x - y + y = (x::nat)" to the theory and try to derive False. In this case, it will succeed and print

vampire found a falsification...
vampire: The goal is falsified by these facts: gt_ex, le_add2, order.strict_iff_not

The list of facts is probably of limited use in practice, but at least we're alerted to the goal's likely unprovability. (If the context is consistent but the goal isn't consistent with it, as is the case here, then the goal is not provable.)

On the same example, if the first feature didn't kick in, then the second feature, "abduction", would have:

e: Candidate missing assumption:
y < x

Here, E basically says: "I would be able to prove the goal if you added the assumption y < x." Note that this is not the best assumption possible -- that would be "y <= x" -- but it's enough to wake us up.

As a second example, consider

lemma "set_mset (mset_set A) = A"

This doesn't hold because A might be infinite, and converting it to a (finite) multiset will not preserve its elements. This is literally Sledgehammer's output:

e: Candidate missing assumption:
finite A

Having been bitten by this very example before, I am very excited that it is detected by Sledgehammer.

In conclusion, I would be grateful for your feedback concerning the new features and their user's interface. I hope they are more useful that confusing, but experience will tell. They were both easy additions, and I wouldn't mind disabling them by default if they turn out not to work well in practice.

Best,
Jasmin

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Email Gateway (Mar 08 2023 at 07:28):

From: Tobias Nipkow <nipkow@in.tum.de>
Jasmin,

I am totally excited by the emergence of abduction in Isabelle. I had always
hoped for something like it but had no idea how to rank the probably huge set of
formulas that the ATPs find that would imply the goal. I assume you have some
such ranking? Does the (equally huge) literature on abduction help?

Tobias

PS I already saw one "missing assumption" notification but in that case it did
not help.

smime.p7s

Email Gateway (Mar 08 2023 at 15:00):

From: Jasmin Blanchette <jasmin.blanchette@ifi.lmu.de>
Hi Tobias,

I am totally excited by the emergence of abduction in Isabelle. I had always hoped for something like it but had no idea how to rank the probably huge set of formulas that the ATPs find that would imply the goal. I assume you have some such ranking? Does the (equally huge) literature on abduction help?

For the moment, this is only implemented with E. As you correctly write, saturation provers generate a huge set of clauses that, if we had their negation, we could finish the refutation. However, the vast majority of these clauses are not descendants of the negated conjecture. These are effectively lemmas, like hd [x, y] = x, and there's hardly any point in suggesting to the user to prove hd [x, y] ~= x.

Remain the clauses that are descendants of the negated conjecture. Among these, I leave out all clauses of length more than 1 -- these are likely too complicated, too ad hoc, to be interesting. This leaves the unit clauses that are descendants of the negated conjecture. There are often very few of these -- of the order of 10, sometimes even 0. Currently, I sort them roughly by size, preferring those that talk about the goal's variables. In the future, I will completely filter out those that don't talk about the clauses variables, since arbitrary assumptions about arbitrary things are unlikely to be unprovable (e.g., "0 = 1").

In short, the approach I'm trying out is very simple -- so simple that if I wrote a paper about it, it would have to be a 5-page paper. It's very ad hoc and not based on the vast literature. But maybe it would be an idea to look more seriously at what the literature has to offer. One weakness of my simple approach is that the superposition prover uses a term order to break symmetries in the proof search. Due to that, not all consequences are derived, and sometimes a nice-to-have clause like "~ finite A" (corresponding to the missing assumption "finite A") is simply not generated, even though it is a consequence.

PS I already saw one "missing assumption" notification but in that case it did not help.

Good to know.

Jasmin

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Email Gateway (Mar 08 2023 at 15:54):

From: Tobias Nipkow <nipkow@in.tum.de>
That makes a lot of sense. You could evaluate the effectiveness by running it on
propositions (from the library) that s/h can prove and then leave out premises
to see if abduction generates them. Or even learn that way which unit clauses
are good candiates ...

Tobias

smime.p7s

Email Gateway (Mar 10 2023 at 10:05):

From: Tobias Nipkow <nipkow@in.tum.de>
Jasmin,

Abduction: I just saw

e: Candidate missing assumption:
length xs + 1 = length xs

and decided that there was probably somethig wrong with my goal, and it was. Had
s/h just failed, I might not have drawn that conclusion.

Tobias

smime.p7s

Email Gateway (Mar 10 2023 at 10:08):

From: Jasmin Blanchette <jasmin.blanchette@ifi.lmu.de>
Hi Tobias,

Thanks for sharing the example. I hadn't thought that wrong missing assumptions might be correlated with wrong conjectures!

Best,
Jasmin

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Last updated: Feb 25 2026 at 08:53 UTC