Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] unicode tokens in isabelle2011+pg4.0


view this post on Zulip Email Gateway (Aug 18 2022 at 17:07):

From: Clemens Ballarin <ballarin@in.tum.de>
Hi John,

I don't think this will work with PG 4.0. I use
ProofGeneral-4.1pre110131. In order to get rid of boxes displayed
instead of symbols, set the buffer's font to "Apple Symbols". Also
set the symbol font in "Tokens > Set Font > symbol" to this font.

You may choose a different font for the buffer, but then symbols in
antiquotations won't work.

Clemens

Quoting John Wickerson <jpw48@cam.ac.uk>:

view this post on Zulip Email Gateway (Aug 18 2022 at 17:08):

From: Alexander Krauss <krauss@in.tum.de>
Hi John,

I have my keyboard layout set up so that, for instance,
OPTION-SHIFT-A produces the unicode forall (∀) symbol. Previously I
could use this freely in Isabelle proof scripts, but now I get an
"inner lexical error" message when Isabelle is parsing statements
involving these symbols.

The handling of symbols changed drastically from pg 3.7 to 4.

I am not sure if I know a solution, but the problem seems to be that you
are creating a unicode symbol in the buffer, but Isabelle expects the
ascii-encoded symbol \<forall> (which pg displays using the same glyph,
but it's represented differently). I can reproduce this without special
keyboard mapping by copy and pasting the character from elsewhere.

I haven't seen an option in pg that rebinds unicode keysyms to these
special tokens. However, in the worst case you should be able to rebind
the keys yourself. Here is what works in my Emacs:

(global-set-key (kbd "C-A") (lambda () (interactive)
(unicode-tokens-insert-token "forall")))

(possibly replacing the unicode keysym for C-A)
Unfortunately, you have to do this for each symbol then... but it's
better than nothing.

Hope this helps...

Clemens Ballarin wrote:

In order to get rid of boxes displayed instead of symbols...

Do I understand correctly that you do see the symbols correctly...?

Alex

view this post on Zulip Email Gateway (Aug 18 2022 at 17:08):

From: Clemens Ballarin <ballarin@in.tum.de>
Quoting Alexander Krauss <krauss@in.tum.de>:

Do I understand correctly that you do see the symbols correctly...?

And beautifully. But it only works for Apple Symbols, neither
IsabelleText nor the Stix fonts (although the latter work perfectly
well in all other applications). I have not explanation why this is so.

Clemens

view this post on Zulip Email Gateway (Aug 18 2022 at 17:09):

From: Joachim Breitner <mail@joachim-breitner.de>
Hi,

is there a reason why Isabelle would not accept ∀ synonymously to
\<forall> in theory files? This would also be nice for all those
unicode-loving folk who occasionally at .thy files directly (e.g. doing
some mass-substitution with vim, or looking at diffs).

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 18 2022 at 17:09):

From: John Wickerson <jpw48@cam.ac.uk>
Hello,

I've just upgraded from Isabelle2009-2 to Isabelle2011, and from pg3.7 to pg4.0. (I'm running Aquamacs 2.1.)

The upgrade was fine, except I'm now having a problem with unicode tokens.

I have my keyboard layout set up so that, for instance, OPTION-SHIFT-A produces the unicode forall (∀) symbol. Previously I could use this freely in Isabelle proof scripts, but now I get an "inner lexical error" message when Isabelle is parsing statements involving these symbols.

When I turn on "highlight real unicode chars" in the Tokens menu, this forall symbol is highlighted, but when I type "\<forall>" in my proof script, it produces a symbol that looks the same as the unicode character, but isn't highlighted.

How can I get back my ability to write my isabelle scripts in unicode?

Thanks very much for your help.
John

view this post on Zulip Email Gateway (Aug 18 2022 at 17:09):

From: Makarius <makarius@sketis.net>
The deeper reason is that unicode is neither uniform nor universal. By
keeping encodings and assumptions about availability of glyphs out of the
Isabelle sources, they can be used with plain text tools as described
above, because it is plain ASCII by default.

The situation is different for front-ends based on Isabelle/Scala (such as
Isabelle/jEdit). There is a default mapping between Isabelle symbols and
JVM-based unicode, which is used here as a poort-mans rendering engine for
mathematical symbols. Thus \<forall> vs. its corresponding unicode glyph
are treated interchangeably by the user interface. This works well under
the assumption that there is no conflict with independent unicodes in user
sources.

Makarius

view this post on Zulip Email Gateway (Aug 18 2022 at 17:09):

From: John Wickerson <jpw48@cam.ac.uk>
Hi Alex, thanks very much for replying.

The handling of symbols changed drastically from pg 3.7 to 4.

Ah, that would explain it! :)

Unfortunately, you have to do this for each symbol then... but it's better than nothing.

Oh dear! Well, I've reverted to PG 3.7 for the time being, as that handled my unicode keyboard layout very satisfactorily. Perhaps the fault lies not with PG but with Isabelle, for requiring the forall symbol to be inputted as \<forall> not as the unicode character ∀.

Many thanks for your help with this!

John

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Brian Huffman <brianh@cs.pdx.edu>
You can certainly configure Isabelle to allow unicode as an
alternative input syntax. For example, you can define the following
notation for the forall-quantifier, where "∀" is a genuine unicode
symbol instead of the "\<forall>" markup:

notation (input) All (binder "∀" 10)

Now the genuine unicode forall symbol works as expected on input:

term "∀x. P x"

I suppose you could create a theory file that defines similar unicode
notation for all syntax defined in Isabelle/HOL. This might make a
nice addition to HOL/Library.

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Makarius <makarius@sketis.net>
This is a bad idea. From what has been explained before, the Isabelle
treatment of symbols vs. unicode handling assumes that this is not done.
Big confusion is to be expected if this is violated. (E.g. try to load
such a file with Isabelle/jEdit and then save it again.)

See also
http://isabelle.in.tum.de/dist/Isabelle2011/doc/implementation.pdf section
"1.2.1 Strings of Symbols" for some explanations and specifications around
this important Isabelle concept.

Makarius

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Makarius <makarius@sketis.net>
This behaviour is according to the specification of Isabelle symbols. It
has been there for so many years that 5 new unicode encodings have been
invented in the meantime. I have also refined the whole concept of
symbols and potential mappings wrt. external unicode recently for
Isabelle/Scala, with quite detailed specifications in the Isabelle/Isar
implementation manual.

You cannot expect that ancient Proof General 3.7.x fully complies with all
that. I don't know about the exact behaviour of Proof General 4.1 either,
which is still to be released. (This also means you can contribute to its
issue tracker http://proofgeneral.inf.ed.ac.uk/trac/.)

Makarius

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Alexander Krauss <krauss@in.tum.de>
Makarius wrote:
Are you saying that the Unicode code points listed in
$ISABELLE_HOME/etc/symbols may not occur regularly in the sources? The
section in the implementation manual does not make this clear... In fact
what I read there is that any UTF-8 sequence is legal. So then it should
not be a problem to declare the ab

Or is it an additional constraint imposed by the jEdit layer? PG 4 seems
to differentiate between the different classes of symbols...

Alex

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Brian Huffman <brianh@cs.pdx.edu>
In section 1.2.1 of the Implementation manual, it clearly states that
unicode codepoints like "∀" are treated distinctly from symbols like
"\<forall>". ProofGeneral 4.0 and 4.1 apparently respect this
distinction, while Isabelle/jEdit does not.

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Makarius <makarius@sketis.net>
Isabelle symbols co-exist with arbitrary UTF-8 sequences. The encoding
back and forth for "poor-man's rendering" in the front-end is a slightly
different thing. It is not injective, so overlap needs to be avoided.

Also note that $ISABELLE_HOME/etc/symbols is just one default table.
Users may have there own.

When struggling with this unicode mess over the 1-2 years, I could not
imagine that anybody could actualy love unicode. In the next round I will
try harder produce explicit errors on bad sources.

Makarius

view this post on Zulip Email Gateway (Aug 18 2022 at 17:10):

From: Alexander Krauss <krauss@in.tum.de>
Makarius wrote:

Also note that $ISABELLE_HOME/etc/symbols is just one default table.
Users may have there own.

[...]
In the next round I
will try harder produce explicit errors on bad sources.

Hmm... Does that mean that the property of being "bad sources" will be
configuration-dependent? That sounds very problematic for exchanging
theories...

Alex

view this post on Zulip Email Gateway (Aug 18 2022 at 17:11):

From: Makarius <makarius@sketis.net>
Yes it is, and the game is to minimize problems.

Makarius


Last updated: Apr 19 2024 at 04:17 UTC