[isabelle] Some Remarks to Code Generation · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Some Remarks to Code Generation

Email Gateway (Aug 19 2022 at 08:03):

From: Patrick Michel <uni@pbmichel.de>
Hi!

I am currently using the code generation facility (mostly for Scala).
All in all it works very well so far, the generated code compiles, I have tried data type refinement, code equation replacement, etc.

So thank you for this great feature!

There are some minor oddities I stumbled on, however, some of which might be worth a change in the library:

A) I was a bit shocked that after proving every function to be total in Isabelle and using proven code equations for refinement, I could make the generated code loop quite easily.

I don't think this is discussed in the code equation chapter of the code generation tutorial.

I stumbled upon it while giving some (admittedly very stupid) equation for an "equal_class.equal" implementation and first thought it would only be a corner case, as equality is handled in a quite special way.

As it turns out it is quite easy to prove a code equation that obviously leads to non-terminating execution.
I guess it is non-trivial to find out if code equations screw you over, so there is not much that can be done here...

B) Code_Integer uses reference equality on BigInt in Scala. That seems to be a bad idea… I had a hard time figuring out why my code did not behave as expected and it turned out that some zeros were not equal to zero (same with some ones). Replacing the equality in the generated code with "inta.equals(int)" fixed all my problems.

I don't know how to adjust the part in Code_Integer to fix the code generator instead, so any advise there would be helpful.

C) Tail recursion. I got stack overflows (with increased stack size) from many of my own functions and some of the library functions. I solved mine by data type refinement and code equations, which leaves the ones in the library.

I think there was a discussion involving fold variants some time ago and which was more "canonical".
The situation now seems to be that there is fold, foldl and foldr in List.thy:

primrec fold :: "('a => 'b => 'b) => 'a list => 'b => 'b"
where
fold_Nil: "fold f [] = id"
| fold_Cons: "fold f (x # xs) = fold f xs o f x" -- {* natural argument order *}

primrec foldr :: "('a => 'b => 'b) => 'a list => 'b => 'b"
where
foldr_Nil: "foldr f [] = id"
| foldr_Cons: "foldr f (x # xs) = f x o foldr f xs" -- {* natural argument order *}

primrec foldl :: "('b => 'a => 'b) => 'b => 'a list => 'b"
where
foldl_Nil: "foldl f a [] = a"
| foldl_Cons: "foldl f a (x # xs) = foldl f (f a x) xs"

Of these three, only foldl is tail recursive. My generated code had to tackle _extremely large_ parameters, so it was a good benchmark to test for tail recursion…

Most notably the rev function on lists is not tail recursive, as its code equations are based on fold:

lemma rev_conv_fold [code]:
"rev xs = fold Cons xs []"
by (simp add: fold_Cons_rev)

I see no good reason for this as rev is the most basic of all tail recursive functions, so I replaced its code equations:

declare rev_conv_fold [code del]

lemma rev_foldl_general:
"rev m @ n = foldl (λl e. e # l) n m"
by (induct m, auto) (metis fold_Cons_rev foldr_conv_fold foldr_conv_foldl rev_rev_ident)

lemma rev_foldl [code]:
"rev = foldl (λl e. e # l) []"
by (rule ext, fold rev_foldl_general) simp

The parameters needed to be flipped, which is not perfect, but I did no longer get stack overflows from rev.

The other function I noticed was map, which is of course a totally different story. It also produced stack overflows for me, so I had to replace it, yet I don't think this is a good idea in general, as map is just not tail recursive.

fun rev_map where
"rev_map f = foldl (λl e. f e # l) []"

theorem rev_map_eq:
"rev_map f l = rev (map f l)"
by (induct l rule: rev_induct) auto

declare map.simps [code del]

lemma map_tail [code]:
"map f l = rev (rev_map f l)"
by (metis rev_map_eq rev_swap)

Thanks again for an overall amazing and extremely powerful feature!

Patrick Michel
Software Technology Group
University of Kaiserslautern

Email Gateway (Aug 19 2022 at 08:03):

From: "Yannick Duchêne (Hibou57 )" <yannick_duchene@yahoo.fr>
I've not reached that point (still learning Isabelle and HOL), but I'm
already interested in this, as I will generate SML programs from proofs.
Can you please, provide a concrete example of an erroneous proof of a
non‑terminating function?

Email Gateway (Aug 19 2022 at 08:04):

From: Patrick Michel <uni@pbmichel.de>
Hi!

Here is a small example:

theory Scratch imports Main
begin

fun fib :: "nat \<Rightarrow> nat" where
"fib 0 = 0"
| "fib (Suc 0) = Suc 0"
| "fib (Suc (Suc n)) = fib (Suc n) + fib n"

value "map (fib o nat) [0..10]"

theorem fib_loop [code]:
"fib n = fib n"
by simp

export_code fib in Scala file "Fib.scala"

end

Patrick Michel

Email Gateway (Aug 19 2022 at 08:04):

From: Tobias Nipkow <nipkow@in.tum.de>
Am 31/07/2012 18:13, schrieb Patrick Michel:

Hi!

I am currently using the code generation facility (mostly for Scala).
All in all it works very well so far, the generated code compiles, I have tried data type refinement, code equation replacement, etc.

So thank you for this great feature!

There are some minor oddities I stumbled on, however, some of which might be worth a change in the library:

A) I was a bit shocked that after proving every function to be total in Isabelle and using proven code equations for refinement, I could make the generated code loop quite easily.

I don't think this is discussed in the code equation chapter of the code generation tutorial.

I stumbled upon it while giving some (admittedly very stupid) equation for an "equal_class.equal" implementation and first thought it would only be a corner case, as equality is handled in a quite special way.

As it turns out it is quite easy to prove a code equation that obviously leads to non-terminating execution.

Indeed: "f x = f x".

This is not a bug but a feature. Code equations give the user a lot of power,
for example to optimize his/her code, but also to shoot yourself in the foot. To
look at it from the intended positive side: it allows you to generate partial
functions, something that some other provers cannot.

Tobias

I guess it is non-trivial to find out if code equations screw you over, so there is not much that can be done here...

B) Code_Integer uses reference equality on BigInt in Scala. That seems to be a bad idea… I had a hard time figuring out why my code did not behave as expected and it turned out that some zeros were not equal to zero (same with some ones). Replacing the equality in the generated code with "inta.equals(int)" fixed all my problems.

I don't know how to adjust the part in Code_Integer to fix the code generator instead, so any advise there would be helpful.

C) Tail recursion. I got stack overflows (with increased stack size) from many of my own functions and some of the library functions. I solved mine by data type refinement and code equations, which leaves the ones in the library.

I think there was a discussion involving fold variants some time ago and which was more "canonical".
The situation now seems to be that there is fold, foldl and foldr in List.thy:

primrec fold :: "('a => 'b => 'b) => 'a list => 'b => 'b"
where
fold_Nil: "fold f [] = id"
| fold_Cons: "fold f (x # xs) = fold f xs o f x" -- {* natural argument order *}

primrec foldr :: "('a => 'b => 'b) => 'a list => 'b => 'b"
where
foldr_Nil: "foldr f [] = id"
| foldr_Cons: "foldr f (x # xs) = f x o foldr f xs" -- {* natural argument order *}

primrec foldl :: "('b => 'a => 'b) => 'b => 'a list => 'b"
where
foldl_Nil: "foldl f a [] = a"
| foldl_Cons: "foldl f a (x # xs) = foldl f (f a x) xs"

Of these three, only foldl is tail recursive. My generated code had to tackle _extremely large_ parameters, so it was a good benchmark to test for tail recursion…

Most notably the rev function on lists is not tail recursive, as its code equations are based on fold:

lemma rev_conv_fold [code]:
"rev xs = fold Cons xs []"
by (simp add: fold_Cons_rev)

I see no good reason for this as rev is the most basic of all tail recursive functions, so I replaced its code equations:

declare rev_conv_fold [code del]

lemma rev_foldl_general:
"rev m @ n = foldl (λl e. e # l) n m"
by (induct m, auto) (metis fold_Cons_rev foldr_conv_fold foldr_conv_foldl rev_rev_ident)

lemma rev_foldl [code]:
"rev = foldl (λl e. e # l) []"
by (rule ext, fold rev_foldl_general) simp

The parameters needed to be flipped, which is not perfect, but I did no longer get stack overflows from rev.

The other function I noticed was map, which is of course a totally different story. It also produced stack overflows for me, so I had to replace it, yet I don't think this is a good idea in general, as map is just not tail recursive.

fun rev_map where
"rev_map f = foldl (λl e. f e # l) []"

theorem rev_map_eq:
"rev_map f l = rev (map f l)"
by (induct l rule: rev_induct) auto

declare map.simps [code del]

lemma map_tail [code]:
"map f l = rev (rev_map f l)"
by (metis rev_map_eq rev_swap)

Thanks again for an overall amazing and extremely powerful feature!

Patrick Michel
Software Technology Group
University of Kaiserslautern

Email Gateway (Aug 19 2022 at 08:15):

From: Magnus Myreen <mom22@cam.ac.uk>

2) The only guarantee code-generation gives (and that is quite a strong one)
is that the generated code is a sound rewriting engine for the function
defined in the logic (assuming that every tool in the typical chain -- e.g.,
compiler, OS -- is correct).

I think you should add to that list of assumptions that the soundness
of the Isabelle code generator must also be assumed. To my knowledge
(correct me if I'm wrong!), Isabelle's code generator comes with no
formal guarantee of producing code that is correct w.r.t. the
semantics of the generated code. By formal guarantee, I mean a theorem
in Isabelle/HOL's logic relating the given input (equations in
Isabelle/HOL) to the semantics of the generated code (ML, Haskell,
Scala ...).

Scott Owens and I have developed a code generator (for HOL4) which
provides such guarantees, see:

Magnus O. Myreen and Scott Owens.
Proof-Producing Synthesis of ML from Higher-Order Logic.
To appear in ICFP'12.
http://www.cl.cam.ac.uk/~mom22/miniml/hol2miniml.pdf

Our code generator takes normal HOL functions as input, translates
these into ML and proves that the generated ML code terminates and
correctly implements the given input functions according to an
operational semantics of our ML language (called MiniML, which is a
pure subset of ML and Haskell).

Example:

As an example, if we were to translate HOL4's library function for
FOLDL:

|- (forall f e. FOLDL f e [] = e) /\
(forall f e x l. FOLDL f e (x::l) = FOLDL f (f e x) l)

The tool generates some MiniML code (shown here in SML syntax):

datatype ('a) list =
Cons of 'a * ('a) list
| Nil ;

fun FOLDL v3 =
(fn v4 =>
(fn v5 =>
(case v5 of
Nil => v4
| (Cons (v2, v1)) => (((FOLDL v3) ((v3 v4) v2)) v1)))) ;

and our code generator automatically proves that this code correctly
implements the HOL function FOLDL according to an operational
semantics of MiniML (Eval below). Here code_for_FOLDL_in asserts that
the above code is in the environment env, and --> is an infix
"refinement invariant combinator" explained in the paper.

|- code_for_FOLDL_in env ==>
Eval env (Var "FOLDL")
(((b --> a --> b) --> b --> LIST a --> b) FOLDL)

We call such theorems 'certificate theorems'.

End of Example.

I should also mention that we (Ramana Kumar, Scott Owens and I) are
constructing a verified MiniML implementation (including dynamic
compilation to x86), so that the trust story can be extended all the
way down to concrete x86/ARM machine code.

See also "Code Generation via Higher-Order Rewrite Systems" by Haftmann and
Nipkow, where they state:

Interesting. I wasn't aware of this paper. It seems that this paper
has proved that an internal phase (dictionary translation) does the
right thing w.r.t. a model of HRSs.

[...] partial correctness of the generated programs w.r.t. the
original equational theorems is guaranteed.

Note that your "[...]" omission hides the big assumption: "If the code
generator preserves [the semantics of] the equations from the logic
when turning them into programs"

No claims are stated for aspects which have no explicit
representation in the logic, in particular termination or
runtime complexity.

Our code generator proves termination. It does this by using the
induction theorems that arise from function definition (and their
totality proofs).

By the way, the last part of the following sentence from Haftmann and
Nipkow's FLOPS'10 paper is a brave statement with no justification:

"The transformation of an intermediate program to a program in a
full-blown SML or Haskell-like target language is again a mere
syntactic adjustment and does not change the equational semantics."

Cheers,
Magnus

On 1 August 2012 03:12, Christian Sternagel <c-sterna@jaist.ac.jp> wrote:

On 08/01/2012 01:38 AM, Yannick Duchêne (Hibou57) wrote:

Le Tue, 31 Jul 2012 18:13:12 +0200, Patrick Michel <uni@pbmichel.de> a
écrit:

As it turns out it is quite easy to prove a code equation that
obviously leads to non-terminating execution.

I've not reached that point (still learning Isabelle and HOL), but I'm
already interested in this, as I will generate SML programs from proofs.
Can you please, provide a concrete example of an erroneous proof of a
non-terminating function?

Just to clarify. There is nothing "erroneous" going on here. There are just
two different perspectives:

1) In HOL only functions that are provably total are accepted (but being
total does not mean that the same function when executed as a functional
program is terminating, since in HOL there is nothing like an evaluation
strategy). Consider, e.g.,

definition ite where
"ite b x y == if b then x else y"

lemmas ite_cong [fundef_cong] = if_cong [folded ite_def]

fun f :: "nat => nat" where
"f x = ite (x = 0) 1 (f (x - 1) * x)"

The call relation of f is well-founded (since in every recursive call the
argument is decreased by one). However when exporting code in any language
with strict evaluation, the result will loop on any input.

2) The only guarantee code-generation gives (and that is quite a strong one)
is that the generated code is a sound rewriting engine for the function
defined in the logic (assuming that every tool in the typical chain -- e.g.,
compiler, OS -- is correct). Being a sound rewriting engine means that every
equation that can be derived by evaluation in the programming language is
also an equation of HOL.

See also "Code Generation via Higher-Order Rewrite Systems" by Haftmann and
Nipkow, where they state:

[...] partial correctness of the generated programs w.r.t. the
original equational theorems is guaranteed.
No claims are stated for aspects which have no explicit
representation in the logic, in particular termination or
runtime complexity.

cheers

chris

Email Gateway (Aug 19 2022 at 08:23):

From: Christian Sternagel <c-sterna@jaist.ac.jp>
Just to clarify. There is nothing "erroneous" going on here. There are
just two different perspectives:

1) In HOL only functions that are provably total are accepted (but being
total does not mean that the same function when executed as a functional
program is terminating, since in HOL there is nothing like an evaluation
strategy). Consider, e.g.,

definition ite where
"ite b x y == if b then x else y"

lemmas ite_cong [fundef_cong] = if_cong [folded ite_def]

fun f :: "nat => nat" where
"f x = ite (x = 0) 1 (f (x - 1) * x)"

The call relation of f is well-founded (since in every recursive call
the argument is decreased by one). However when exporting code in any
language with strict evaluation, the result will loop on any input.

2) The only guarantee code-generation gives (and that is quite a strong
one) is that the generated code is a sound rewriting engine for the
function defined in the logic (assuming that every tool in the typical
chain -- e.g., compiler, OS -- is correct). Being a sound rewriting
engine means that every equation that can be derived by evaluation in
the programming language is also an equation of HOL.

See also "Code Generation via Higher-Order Rewrite Systems" by Haftmann
and Nipkow, where they state:

[...] partial correctness of the generated programs w.r.t. the
original equational theorems is guaranteed.
No claims are stated for aspects which have no explicit
representation in the logic, in particular termination or
runtime complexity.

cheers

chris

Email Gateway (Aug 19 2022 at 08:28):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Magnus,

By the way, the last part of the following sentence from Haftmann and
Nipkow's FLOPS'10 paper is a brave statement with no justification:

"The transformation of an intermediate program to a program in a
full-blown SML or Haskell-like target language is again a mere
syntactic adjustment and does not change the equational semantics."

here you are comparing apples with bananas. We do not claim anything
beyond preservation of equational semantics, and argue that our abstract
intermediate language is appropriately close to ML or Haskell such that
the final transition »only« involves turning abstract into concrete
syntax, but no transformations introducing new equations etc. This,
naturally, is technically involved, but if I have a look at your
reference, it also admits that a formal treatment of this is beyond scope:

We have not attempted to solve the problem of verified
parsing or pretty printing.

Cheers,
Florian
signature.asc

Email Gateway (Aug 19 2022 at 08:29):

From: Magnus Myreen <mom22@cam.ac.uk>
Hi Florian,

Admittedly our pretty printing from abstract syntax to concrete SML
syntax is ad hoc at the moment. However, we do perform this
translation inside the logic, i.e. by inferences inside the logic, and
hope to prove it correct w.r.t. an SML parser function. You can find
the definition of the "dec_to_sml_string" function in the following
Lem file

https://github.com/mn200/HOL/blob/master/examples/miniML/semantics/print_ast.lem

which is used to generate the HOL4 definition. For evaluation inside
the logic, we use HOL4's standard EVAL conversion, which is not super
fast but fast enough.

My original comment was not so much to do with parsing and printing,
but more with the fact that you claim with no justification that the
semantics of "full-blown SML" matches the equational semantics of your
HRSs.

This is not to say that our work solves that problem completely
either! We worry that our MiniML semantics doesn't perfectly match
"full-blown SML". That is partly our motivation for implementing a
verified runtime for MiniML, so that we can extend this guarantee down
to the machine code (where e.g. concrete syntax is trivial).

Cheers,
Magnus

Last updated: Jul 18 2025 at 20:25 UTC