Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Problems with newline in String.literal


view this post on Zulip Email Gateway (Aug 22 2022 at 18:26):

From: "Thiemann, René" <Rene.Thiemann@uibk.ac.at>
Dear all,

I’m currently changing the error messages in our formalization from the “string” type to
“String.literal”, so that the exported code uses target language
strings in the generated code. Unfortunately, there seems to be a mistake
in the handling of newline:

definition "message = STR ''this is⏎a two line text''"

is translated into Haskell as

message :: String;
message = "this is\\a two line text";

where the \\ should be a \n

Moreover, also the following behavior is weird:

lemma "STR ''⏎'' = STR 0x5C" "integer_of_char (CHR ''⏎'') = 10"
by code_simp+

Here, one observes that the ''⏎'' gets different characters codes,
depending on whether it is put into a CHR or into a STR.
Note that 0x5C is precisely the backslash as in the generated Haskell code.

Cheers,
René

view this post on Zulip Email Gateway (Aug 22 2022 at 18:26):

From: Dominique Unruh <unruh@ut.ee>
As a workaround, you can use an actual newline:

definition "message = STR ''Line
Line''"
thm message_def
export_code message in Haskell

This fragment also shows that the parser and printer are out of sync here.
"thm message_def" prints the newline as ⏎ but if we use ⏎ in the
definition, we get '\\'.

Best wishes,
Dominique

view this post on Zulip Email Gateway (Aug 22 2022 at 18:27):

From: "Thiemann, René" <Rene.Thiemann@uibk.ac.at>
Dear Dominique,

thanks for the hint, but I don’t want replace 709 \<newline> by newlines in my sources.
This will cause trouble in the indentation of the source and will be hard to undo later on.

Therefore, at the moment I just replace all \\ by \n in the generated source using "sed”.

Cheers,
René

view this post on Zulip Email Gateway (Aug 22 2022 at 18:29):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi Rene,

thanks for reporting this.

The reason is that by accident Isabelle2018 accepts funny literals like
»STR "ðäµ"« which consist of a list of xsymbols »\<…>« of which only the
first character »\« is interpreted.

Future releases will follow the same convention as »CHR "…"« and hence
only allow proper ascii characters with the symbolic newline as special
case.

This is a mere syntax issue; the logical foundation is sane.

Cheers,
Florian
signature.asc


Last updated: Apr 24 2024 at 08:20 UTC