## Stream: General

### Topic: using value command result for proofs

#### Chengsong Tan (Jul 07 2023 at 11:52):

Hi community,

For a complex function f which terminates, sometimes the Isabelle sledgehammer cannot prove properties like f x = y, despite evaluating "f x" using the value command yields the result "y". How do you leverage the result of a value command in the proof?

I have attached the theory file and marked the place to look at using (****)

#### Fabian Huch (Jul 07 2023 at 12:46):

You can use the `eval` method.

#### Manuel Eberl (Jul 07 2023 at 12:52):

Or `code_simp` if you don't want to rely on using the ML evaluation oracle. It will be much slower for bigger computations though.

#### Manuel Eberl (Jul 07 2023 at 12:53):

It might also be a good idea to define your `do_transition` function with `definition` instead of `fun`. It's much faster. And your function isn't recursive anyway.

#### Manuel Eberl (Jul 07 2023 at 13:00):

Generally, I would say that `sledgehammer` is not great at doing a lot of ‘stupid’ equational rewriting. This sort of thing is normally done by the simplifier, and my guess would be that you'd simply have to declare a few equations as `simp` or add them to the simp set manually when you need them and then the simplifier could unfold everything and prove the theorem in question easily. I'm not quite sure what those equations would be though.

#### Manuel Eberl (Jul 07 2023 at 13:01):

`onestep_def` is probably one of them, but that's not enough; it still gets stuck somewhere. It's a bit tricky to debug this, to, because the size of the goal blows up so much. If all you want to do here is evaluating these terms then `eval` and `code_simp` will probably work great, but if you want to prove things about these functions then setting up a good simp set will be useful regardless.

#### Chengsong Tan (Jul 07 2023 at 17:13):

Manuel Eberl said:

Or `code_simp` if you don't want to rely on using the ML evaluation oracle. It will be much slower for bigger computations though.

Is there any danger (like getting the wrong results) relying on the ML evaluation oracle?

#### Chengsong Tan (Jul 09 2023 at 01:26):

Manuel Eberl said:

It might also be a good idea to define your `do_transition` function with `definition` instead of `fun`. It's much faster. And your function isn't recursive anyway.

Thanks! This speeds up the processing considerably.

#### Lukas Stevens (Jul 10 2023 at 06:37):

If the ML-Compiler has bugs, the result might be wrong.

#### Manuel Eberl (Jul 10 2023 at 10:41):

If you use the ML evaluation oracle, your proof doesn't go through the Isabelle kernel anymore but rather the goal gets converted to some ML code by the (unverified) code generator and then evaluated by Poly/ML, and if the result is `true` then Isabelle accepts it as a theorem.

This effectively greatly increases the trusted code base, and it is easy to achieve inconsistencies (i.e. prove "false") if you manually fiddle with the code generator setup (e.g. reconfigure it with e.g. the `code_printing` command).

That said, I am not aware of any real-life problems caused by this, i.e. someone accidentally proving something that was wrong by relying on computational reflection (which is what this process is called, i.e. what `eval` does). I for one tend to avoid using reflection as long as it's feasible to do so, but if it isn't then I'm fine with it. Other people are less scrupulous than me and just always use it when it's applicable.

#### Manuel Eberl (Jul 10 2023 at 10:43):

Lukas Stevens said:

If the ML-Compiler has bugs, the result might be wrong.

I'd say the code generator is much more of an issue than the ML compiler. After all, Isabelle itself is written in ML and running in the Poly/ML runtime, so we're relying on that being correct anyway.

That said, there was an issue with Poly/ML's handling of "big integer" arithmetic a few years ago that @Fabian Immler uncovered while he was running code generated by the code generator. So it is something to keep in mind. Isabelle itself doesn't really use "big integers" much.

Last updated: Sep 09 2024 at 08:25 UTC