[isabelle-dev] [Spam] NEWS: update of external provers · Mirror: Isabelle Development Mailing List

Stream: Mirror: Isabelle Development Mailing List

Topic: [isabelle-dev] [Spam] NEWS: update of external provers

Email Gateway (Oct 19 2020 at 16:26):

From: Jasmin Blanchette <j.c.blanchette@vu.nl>
Hi Larry,

I was wanting to know exactly the same thing. In particular, will sledgehammer suggest it now? And does it give us a version of the smt method more robust than the existing one?

That's the goal, but for this to happen, a few small changes (as well as some testing) have to take place. Having the new binaries is the first step. We're (Martin, Mathias, and I) are well aware of the 15 Dec. deadline.

See also this paper draft for some empirical data:

https://matryoshka-project.github.io/pubs/verit_isa_paper.pdf <https://matryoshka-project.github.io/pubs/verit_isa_paper.pdf>

The nice thing about veriT is that it has more complete instantiation techniques than Z3 -- in fact, more or less the same as implemented in CVC4. This should boost the reconstruction success rate of proofs founds by CVC4.

Jasmin

Email Gateway (Oct 20 2020 at 10:47):

From: Lawrence Paulson <lp15@cam.ac.uk>
This sounds terrific. With smt at the moment being essentially forbidden anywhere in the distribution, an SMT result from sledgehammer only means “yes it is a theorem”, and having to replace a single smt line by something 10 times as long is always painful.

Larry

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Email Gateway (Oct 20 2020 at 14:27):

From: Jasmin Blanchette <j.c.blanchette@vu.nl>
Let me clarify. veriT is still, just like Z3, an external tool. Whether proofs reconstructed using veriT (via the syntax "by (smt (verit) ...)") should be allowed in the distribution or not, and similarly in the AFP, is a political decision I cannot make. veriT has a much smaller and more stable code base and is developed by close colleagues in Liège and Nancy, but we might still prefer not to rely on it for the distribution. Or we might say that the presence of two solvers with the same input format (Z3 and veriT) confers a certain robustness, although this is not entirely true regarding quantifier instantiation.

On the positive side, veriT outputs detailed proofs (like Z3 but unlike CVC4), which are parsed by Isabelle and yield these "semi-intelligible Isar monsters". Some users find that they can fish out useful bits from them, if not use them as is; others find them too broken or ugly. Improving the situation here is high on my agenda for Sledgehammer, because it's generally the key to proof reconstruction for stronger ATPs (e.g. HO).

Finally, there has been a lot of work by the CVC4 developers lately on proof production. They conveniently adopted the veriT file format, which means we might have Isar proofs for them soon (finally), and perhaps even an integration in the smt method ("by (smt (cvc4) ...)").

We'll try to remember to update the NEWS as we go. For the next release, just having properly tested up-to-date binaries of the latest and greatest ATPs would be a nice target.

Jasmin

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Email Gateway (Oct 20 2020 at 15:08):

From: Lawrence Paulson <lp15@cam.ac.uk>
The main problem with these monsters is that they often don’t work, and related to this, they often “obtain” functions that don’t appear to be necessary. Where monsters do work, they often suggest quite relevant lemmas.
Larry

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Email Gateway (Oct 21 2020 at 04:48):

From: Jasmin Blanchette <j.c.blanchette@vu.nl>

The main problem with these monsters is that they often don’t work, and related to this, they often “obtain” functions that don’t appear to be necessary. Where monsters do work, they often suggest quite relevant lemmas.

Indeed, there's some brokenness with Skolemization which is perhaps not so hard to fix -- given that this used to work much better back in 2016, according to our JAR paper ("Semi-intelligible ..."). Esp. back then we had it working reliably for E, which is no longer the case it seems. Maybe the provers have changed their format a bit and we don't pick them up correctly. It's on our list, but I'm afraid probably not for the next release. Martin, Mathias, and I are doing all of that as a "hobby", and Makarius's offer to build the binaries relieved me from a huge weight. The situation is, thanks to them, still much better than only a few months ago.

Jasmin

isabelle-dev mailing list
isabelle-dev@in.tum.de
https://mailman46.in.tum.de/mailman/listinfo/isabelle-dev

Last updated: Jul 15 2022 at 23:21 UTC