Stream: General

Topic: Artificial intelligence for Isabelle


view this post on Zulip Craig Alan Feinstein (May 25 2025 at 18:52):

Has anyone tried using artificial intelligence like chatgpt to save time writing proofs in Isabelle. I have tried it with ChatGpt and Claude where you pay $20 a month. Has anyone tried any other AI tools?

view this post on Zulip Craig Alan Feinstein (May 25 2025 at 19:34):

I just found this article saying Baldur works well with Isabelle. https://spectrum.ieee.org/amp/ai-debug-software-2667018624

view this post on Zulip Mathias Fleury (May 25 2025 at 19:43):

I have never seen a demo of the system showing that it is working for the end-user

view this post on Zulip Craig Alan Feinstein (May 25 2025 at 23:23):

I didn’t see it either.

view this post on Zulip Yutaka Nagashima (May 26 2025 at 17:09):

The AbductionProver for Isabelle is currently more of a search-based approach, but I still call it an AI prover.
(Also, it runs on a laptop without requiring expensive graphics cards.)

https://youtu.be/rXU-lJxP_GI

https://github.com/data61/PSL

view this post on Zulip Craig Alan Feinstein (May 26 2025 at 20:37):

Thank you Yutaka I will check that out. I was about to give up on Claude yesterday after having discovered that if you stay on the same conversation for a long enough time, it appears to run out of intelligent things to say with respect to your code. However, I discovered that if you transfer to another conversation, it will come up with more intelligent comments about the code.

view this post on Zulip Yutaka Nagashima (May 26 2025 at 21:39):

I haven't tried Claude 4 yet.

However, I discovered that if you transfer to another conversation, it will come up with more intelligent comments about the code.

Overall, this seems like an issue someone else will automatically solve after n months or weeks—if they haven't already. We just need to wait while focusing on something else—unless you're really keen to tackle it yourself. :stuck_out_tongue_wink:

view this post on Zulip Craig Alan Feinstein (May 26 2025 at 23:33):

Yutaka I’m in a spot where I would like to finish doing my proofs I’ve been working on for a long time. At this point if I can do this using ai I’ll use it.

view this post on Zulip Craig Alan Feinstein (May 28 2025 at 13:57):

I figure ai is in many ways like a human. If you talk to much to it, it gets tired and eventually it gets so exhausted that it forgets a lot of what has been said and starts saying stupid stuff

view this post on Zulip Craig Alan Feinstein (Aug 17 2025 at 06:30):

I’ve been experimenting with ChatGPT 5. It is much better than ChatGPT 4 but still has a way to go with Isabelle proofs. It gives code with lots of errors a significant amount of the time. And when I asked it to it to fix the errors I got more errors much of the time. It is definitely improving though. I do think it will eventually hit a point where it will be successful with high probability.

view this post on Zulip Craig Alan Feinstein (Aug 17 2025 at 20:59):

I’m finding that sledgehammer is a good bet whenever it doesn’t work

view this post on Zulip irvin (Aug 18 2025 at 14:23):

I've never got a LLM to generate Isabelle/ML code that works.


Last updated: Aug 23 2025 at 01:39 UTC