Hi community,
For a complex function f which terminates, sometimes the Isabelle sledgehammer cannot prove properties like f x = y, despite evaluating "f x" using the value command yields the result "y". How do you leverage the result of a value command in the proof?
I have attached the theory file and marked the place to look at using (****)
valueMWE.thy
Screenshot-2023-07-07-at-12.52.02.png
Screenshot-2023-07-07-at-12.52.23.png
You can use the eval
method.
Or code_simp
if you don't want to rely on using the ML evaluation oracle. It will be much slower for bigger computations though.
It might also be a good idea to define your do_transition
function with definition
instead of fun
. It's much faster. And your function isn't recursive anyway.
Generally, I would say that sledgehammer
is not great at doing a lot of ‘stupid’ equational rewriting. This sort of thing is normally done by the simplifier, and my guess would be that you'd simply have to declare a few equations as simp
or add them to the simp set manually when you need them and then the simplifier could unfold everything and prove the theorem in question easily. I'm not quite sure what those equations would be though.
onestep_def
is probably one of them, but that's not enough; it still gets stuck somewhere. It's a bit tricky to debug this, to, because the size of the goal blows up so much. If all you want to do here is evaluating these terms then eval
and code_simp
will probably work great, but if you want to prove things about these functions then setting up a good simp set will be useful regardless.
Manuel Eberl said:
Or
code_simp
if you don't want to rely on using the ML evaluation oracle. It will be much slower for bigger computations though.
Is there any danger (like getting the wrong results) relying on the ML evaluation oracle?
Manuel Eberl said:
It might also be a good idea to define your
do_transition
function withdefinition
instead offun
. It's much faster. And your function isn't recursive anyway.
Thanks! This speeds up the processing considerably.
If the ML-Compiler has bugs, the result might be wrong.
If you use the ML evaluation oracle, your proof doesn't go through the Isabelle kernel anymore but rather the goal gets converted to some ML code by the (unverified) code generator and then evaluated by Poly/ML, and if the result is true
then Isabelle accepts it as a theorem.
This effectively greatly increases the trusted code base, and it is easy to achieve inconsistencies (i.e. prove "false") if you manually fiddle with the code generator setup (e.g. reconfigure it with e.g. the code_printing
command).
That said, I am not aware of any real-life problems caused by this, i.e. someone accidentally proving something that was wrong by relying on computational reflection (which is what this process is called, i.e. what eval
does). I for one tend to avoid using reflection as long as it's feasible to do so, but if it isn't then I'm fine with it. Other people are less scrupulous than me and just always use it when it's applicable.
Lukas Stevens said:
If the ML-Compiler has bugs, the result might be wrong.
I'd say the code generator is much more of an issue than the ML compiler. After all, Isabelle itself is written in ML and running in the Poly/ML runtime, so we're relying on that being correct anyway.
That said, there was an issue with Poly/ML's handling of "big integer" arithmetic a few years ago that @Fabian Immler uncovered while he was running code generated by the code generator. So it is something to keep in mind. Isabelle itself doesn't really use "big integers" much.
Last updated: Sep 09 2024 at 08:25 UTC