[isabelle] Teaching students about Sledgehammer · Mirror: Isabelle Users Mailing List

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Teaching students about Sledgehammer

Email Gateway (Mar 26 2022 at 01:08):

From: Talia Ringer <tringer@cs.washington.edu>
Hi all, hope you're doing well.

I'm teaching a class on proof automation, and next week we are reading
about Sledgehammer. My class does one day of a reading discussion, followed
by one day of group programming on an artifact.

I'm curious if anyone has any recommendations for artifacts for
Sledgehammer. It can be something like:

- a small collection of proofs that sledgehammer can dispatch, but that
students just learning Isabelle/HOL could feasible do by hand too, so they
can compare;

- a look inside of the implementation; or
- pretty much anything else that gives students a sense of either what
it's like to practically use Isabelle/HOL with sledgehammer, or how it's
implemented.

I figure it is probably best to ask the community. At this point in the
semester, students know Coq and Agda, and have seen and read about HOL and
Isabelle/HOL, but haven't written any proofs in or automation for
Isabelle/HOL. So something relatively simple would be nice.

Thanks!

Talia

Email Gateway (Mar 26 2022 at 12:13):

From: Alex Weisberger <alex.m.weisberger@gmail.com>
Hello!

There's this talk that Martin Kleppmann gave about using Isabelle to prove
properties about a distributed system. In the talk he uses sledgehammer
pretty exclusively to find proofs:
https://martin.kleppmann.com/2019/09/14/strange-loop-isabelle.html.

I would also check out Concrete Semantics: http://concrete-semantics.org/.
It's about language semantics / compiler correctness, but there's a lot of
little exercises that can be done by students. This book doesn't really
talk about sledgehammer, directly but heavily uses the "auto" tactic which
is also very powerful in terms of proof automation. If sledgehammer has to
be used, you can pretty much use it in place of auto for most of the proofs
and it will work fine.

Alex

Email Gateway (Mar 26 2022 at 12:46):

From: Lawrence Paulson <lp15@cam.ac.uk>
Regarding the general objectives underlying Sledgehammer (which have hardly changed since 2005), I suggest looking at early papers:

Automation for Interactive Proof: First Prototype (first 2 pages only)
https://www.cl.cam.ac.uk/~lp15/papers/Automation/info-and-comp.pdf
Three Years of Experience with Sledgehammer
https://www.cl.cam.ac.uk/~lp15/papers/Automation/paar.pdf

For a glimpse of the powerful Mark II version built by Jasmin Blanchette, here’s a good starting point:

Extending Sledgehammer with SMT Solvers (see page 3 for the key ideas)
https://www.cl.cam.ac.uk/~lp15/papers/Automation/cade2011-sledge-smt.pdf

One of the big advantages of this new version is the abolition of unsound translations into first-order logic, eliminating the risk that a sledgehammer proof will fail.

Regarding examples: one great thing about sledgehammer is how well it works with structured proofs. If it can’t prove something, often it’s enough to find an intermediate fact and prove that first. So I was able to cook up this little proof in a few minutes, not copying previous examples, simply by recalling the intuitive proof. Every line was justified by sledgehammer alone. No need to memorise hundreds of tactics. You could invite your students to try to justify some of these lines some other way.

lemma "\<not> (\<exists>q::rat. q^2 = 2)"
proof
assume "\<exists>q::rat. q^2 = 2"
then obtain q::rat where "q^2 = 2"
by blast
obtain m n where "coprime m n" "q = of_int m / of_int n"
by (meson quotient_of_coprime quotient_of_div surj_pair)
then have "of_int m ^ 2 / of_int n ^ 2 = (2::rat)"
by (metis \<open>q\<^sup>2 = 2\<close> power_divide)
then have 2: "of_int m ^ 2 = (2::rat) * of_int n ^ 2"
by (metis division_ring_divide_zero double_eq_0_iff mult_2_right mult_zero_right nonzero_divide_eq_eq)
then have "2 dvd m"
by (metis (mono_tags, lifting) even_mult_iff even_numeral of_int_eq_iff of_int_mult of_int_numeral power2_eq_square)
then obtain r where "m = 2*r"
by blast
then have "2 dvd n"
by (smt (verit, ccfv_threshold) "2" Groups.mult_ac(3) dvd_def even_mult_iff mult_cancel_left of_int_1 of_int_add of_int_eq_of_int_power_cancel_iff of_int_mult one_add_one power2_eq_square)
then show False
using \<open>coprime m n\<close> \<open>m = 2 * r\<close> by auto
qed

Larry Paulson

Email Gateway (Mar 29 2022 at 21:56):

From: Talia Ringer <tringer@cs.washington.edu>
Thanks, all!

Last updated: Jul 15 2022 at 23:21 UTC