using value command result for proofs · General

Stream: General

Topic: using value command result for proofs

Chengsong Tan (Jul 07 2023 at 11:52):

Hi community,

For a complex function f which terminates, sometimes the Isabelle sledgehammer cannot prove properties like f x = y, despite evaluating "f x" using the value command yields the result "y". How do you leverage the result of a value command in the proof?

I have attached the theory file and marked the place to look at using (****)

valueMWE.thy
Screenshot-2023-07-07-at-12.52.02.png
Screenshot-2023-07-07-at-12.52.23.png

Fabian Huch (Jul 07 2023 at 12:46):

You can use the eval method.

Manuel Eberl (Jul 07 2023 at 12:52):

Or code_simp if you don't want to rely on using the ML evaluation oracle. It will be much slower for bigger computations though.

Manuel Eberl (Jul 07 2023 at 12:53):

It might also be a good idea to define your do_transition function with definition instead of fun. It's much faster. And your function isn't recursive anyway.

Manuel Eberl (Jul 07 2023 at 13:00):

Generally, I would say that sledgehammer is not great at doing a lot of ‘stupid’ equational rewriting. This sort of thing is normally done by the simplifier, and my guess would be that you'd simply have to declare a few equations as simp or add them to the simp set manually when you need them and then the simplifier could unfold everything and prove the theorem in question easily. I'm not quite sure what those equations would be though.

Manuel Eberl (Jul 07 2023 at 13:01):

onestep_def is probably one of them, but that's not enough; it still gets stuck somewhere. It's a bit tricky to debug this, to, because the size of the goal blows up so much. If all you want to do here is evaluating these terms then eval and code_simp will probably work great, but if you want to prove things about these functions then setting up a good simp set will be useful regardless.

Chengsong Tan (Jul 07 2023 at 17:13):

Manuel Eberl said:

Or code_simp if you don't want to rely on using the ML evaluation oracle. It will be much slower for bigger computations though.

Is there any danger (like getting the wrong results) relying on the ML evaluation oracle?

Chengsong Tan (Jul 09 2023 at 01:26):

Manuel Eberl said:

It might also be a good idea to define your do_transition function with definition instead of fun. It's much faster. And your function isn't recursive anyway.

Thanks! This speeds up the processing considerably.

Lukas Stevens (Jul 10 2023 at 06:37):

If the ML-Compiler has bugs, the result might be wrong.

Manuel Eberl (Jul 10 2023 at 10:41):

If you use the ML evaluation oracle, your proof doesn't go through the Isabelle kernel anymore but rather the goal gets converted to some ML code by the (unverified) code generator and then evaluated by Poly/ML, and if the result is true then Isabelle accepts it as a theorem.

This effectively greatly increases the trusted code base, and it is easy to achieve inconsistencies (i.e. prove "false") if you manually fiddle with the code generator setup (e.g. reconfigure it with e.g. the code_printing command).

That said, I am not aware of any real-life problems caused by this, i.e. someone accidentally proving something that was wrong by relying on computational reflection (which is what this process is called, i.e. what eval does). I for one tend to avoid using reflection as long as it's feasible to do so, but if it isn't then I'm fine with it. Other people are less scrupulous than me and just always use it when it's applicable.

Manuel Eberl (Jul 10 2023 at 10:43):

Lukas Stevens said:

If the ML-Compiler has bugs, the result might be wrong.

I'd say the code generator is much more of an issue than the ML compiler. After all, Isabelle itself is written in ML and running in the Poly/ML runtime, so we're relying on that being correct anyway.

That said, there was an issue with Poly/ML's handling of "big integer" arithmetic a few years ago that @Fabian Immler uncovered while he was running code generated by the code generator. So it is something to keep in mind. Isabelle itself doesn't really use "big integers" much.

Last updated: Aug 23 2025 at 01:39 UTC