[isabelle] chatgpt and Isabelle/HOL4? :) · Mirror: Isabelle Users Mailing List

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] chatgpt and Isabelle/HOL4? :)

Email Gateway (Feb 01 2023 at 14:51):

From: Michael Norrish <cl-isabelle-users@lists.cam.ac.uk>
My student Sean Lamont asked ChatGPT to do some theorem-proving in HOL4:

[cid:image001.png@01D93595.7FAEED70]

It’s not clear to me that this is even correct Isabelle (syntax apart, the step to get f n = ~f n looks dubious), but it certainly isn’t HOL4.

When told repeatedly that this indeed wasn’t HOL4, it could eventually be persuaded to produce:

[cid:image002.png@01D93595.7FAEED70]

which is probably still less helpful :)

For your new year’s delectation.

Michael
image001.png
image002.png

Email Gateway (Feb 01 2023 at 15:16):

From: Makarius <makarius@sketis.net>
So now we have a machine to produce useless spam in high volumes.

The old-fashioned term for that is "bluffer" or "imposter", or something else
we've forgotton long ago (because the problem has become so common-place).

Makarius

Email Gateway (Feb 01 2023 at 17:43):

From: Tobias Nipkow <nipkow@in.tum.de>
Michael, we experimented with it a bit a few weeks ago when chatgpt hit the
headlines. The unsurprising conclusion: if it has seen a reasobaly simple proof
of the thing you ask for, it will give a good approximation. Usually not a
completely well-formed proof, but not a joke either. A good starting point for
student homework solutions. It will not come up with anything original but might
become a good search engine. Here is a simple example:

Q: Where can I find an Isabelle proof of reverse(reverse xs) = xs

A: You can find a proof of the statement "reverse(reverse xs) = xs" in the
standard library of the theorem prover Isabelle. This proof can be found in the
theory file "List.thy", which is part of the standard distribution of Isabelle.

In this theory file, the statement can be proven as a lemma using the following
proof script:

lemma "reverse (reverse xs) = xs"
by (induction xs) auto

The interesting aspect is, that in List.thy "reverse" is actually called "rev"
and the proof looks like this:

lemma rev_rev_ident [simp]: "rev (rev xs) = xs"
by (induct xs) auto

It must have seen enough proofs of this standard lemma to make the connection.

Tobias
smime.p7s

Email Gateway (Feb 01 2023 at 18:04):

From: Peter Lammich <lammich@in.tum.de>
I have (obviously) made similar experiments, with similar outcomes. The
proof somehow contained a few correct ideas, but was mainly garbage that
had to be manually fixed.

It has become better though in the current version, telling me: "I'm
sorry, but I am a text-based AI model and do not have the capability to
sketch a proof in Isabelle/HOL".

If you insist, it will still output a 'template proof', though

Email Gateway (Feb 07 2023 at 08:41):

From: Josef Urban <josef.urban@gmail.com>
For TP people who want to read a honest non-hype account of this
technology, how it emerged, and what may be its limits, I very much
recommend Mikolov's recent interview: http://ii.tudelft.nl/bnvki/?p=1884 .

Josef

image001.png
image002.png
image002.png

Last updated: May 11 2025 at 12:37 UTC