Can symbols like \<forall> be stored as ∀ instead? · General

Stream: General

Topic: Can symbols like \<forall> be stored as ∀ instead?

Mario Xerxes Castelán Castro (Jan 10 2026 at 01:45):

Will Isabelle work if are stored as the character they are displayed as, instead of using the Isabelle-specific encoding? I know this is handled by the encoding which is UTF-8-Isabelle. I noticed the completion menu does not pop up when using encoding UTF-8.

Kevin Kappelmann (Jan 10 2026 at 10:45):

It will lead to strange behaviours, e.g. Isabelle/jEdit might not complain. but isabelle build will fail.
See this thread #Beginner Questions > Github

Qiyuan Xu (Jan 12 2026 at 16:25):

My REPL supports the conversion between Isabelle's ASCII encoding and Unicode. It reads the system and the user symbol file to parse the ASCII symbols, so almost identical to how the system handles it.

Fabian Huch (Jan 13 2026 at 14:05):

Note on this topic: While there are UTF-8 codepoints for many Isabelle symbols (not for all), converting to UTF-8 only makes sense in specific contexts. For instance, \<^sup> has the unicode codepoint ⇧, but you really want to display the following symbol in superscript instead, unless you're e.g. directly editing the identifier that contains ⇧.

Fabian Huch (Jan 13 2026 at 14:06):

And UTF-8 characters don't generally exist for superscipted/subscripted characters.

Mario Xerxes Castelán Castro (Jan 13 2026 at 23:38):

Fabian Huch ha dicho:

And UTF-8 characters don't generally exist for superscipted/subscripted characters.

Of course, but UTF-8 has subcripts and superscripts for digits, which is the most common use of subscripts and superscripts.

Fabian Huch (Jan 14 2026 at 08:44):

Certainly not in general math -- that should be obvious, but you can just look into the AFP: list of ⇧ usage (within the first three hits, you will find characters not representable by UTF-8, and you'll have to scroll far to find numbers).
The whole reason why UTF-8 does not support arbitrary superscript/subscript is that it is not meant for rich text representation, in particular not for mathematical formulae (proposals to add more sub/superscript characters were dismissed by the consortium for this exact reason).

Fabian Huch (Jan 14 2026 at 08:46):

Another issue, when talking symbols more generally, is that having a symbol encoding is not good enough. There are plenty of superscript symbols that exist in UTF-8, but your font won't actually be able to render them. Let alone the font of some other user. And even if rendered at all, the symbol might be represented differently...

Fabian Huch (Jan 14 2026 at 08:48):

The idea (which I have heard from people using other ITPs) that monospaced UTF-8 would be good enough for math seems crazy to me.

Mario Xerxes Castelán Castro (Jan 15 2026 at 01:13):

I understand having a general notation that is not dependent on non-ASCII Unicode support. Nonetheless, Isabelle should use the existing Unicode characters for logical connectives like ∀, ∃, etc. All serious programming fonts include these characters.

Fabian Huch (Jan 15 2026 at 08:33):

I don't really see how you can demand that it 'should' do that, when unicode is neither sufficient nor necessary here. In any case, monospaced programming fonts are not adequate at all for displaying logical connectives such as a long implication arrow -- you get a tiny => instead of the proper ==> . Not to mention large operators such as a sum sign...

Fabian Huch (Jan 15 2026 at 08:35):

In Isabelle we favor great technology over popular solutions. Use a different ITP if you want the latter, e.g. Lean.

Mario Xerxes Castelán Castro (Jan 16 2026 at 02:40):

Fabian Huch ha dicho:

In Isabelle we favor great technology over popular solutions. Use a different ITP if you want the latter, e.g. Lean.

I like that philosophy and I chose Isabelle over Lean for that reason among others. I mean using Unicode characters for symbols that already exist in Unicode, not to change Isabelle to only use such symbols. The reason is so that when one sees theory files using tools other than Isabelle/jEdit, such symbols will be displayed properly. It is better to see ¬ (∀x. P x) instead of \<not> (\<forall>x. P x). In the case of ==>, Unicode has ⟹ U+27F9 LONG RIGHTWARDS DOUBLE ARROW. Note it displays longer than ⇒ in Zulip.

Fabian Huch (Jan 16 2026 at 08:39):

The Zulip font selection is reasonably solid, it will display most Isabelle symbols (although some are displayed as wrong symbols, e.g. the \<inverse> becomes ¯ but should look something like ⁻¹). And of course the regular text font is not monospaced, only the programming font is: ⟹ becomes ⟹ and ⟶ becomes ⟶.

Using other tools to display Isabelle theories is of course a valid concern, e.g. for VCS diffs. But for that use-case it is much simpler, and technologically much more solid, to build a diff viewer that can display Isabelle symbols properly.
Or what is the application you have in mind?

Fabian Huch (Jan 16 2026 at 08:44):

Huh, that is funny: ⟹ does not actually become much smaller in Zulip code mode, since the monospaced font does not have the symbol, so this is rendered in a different font in code which is not actually monospaced. So apparently mixing fonts like zulip does has its own pitfalls...

Balazs Toth (Jan 16 2026 at 15:36):

Out of curiosity: Couldn't we have both of best worlds, if for example the encoding of ==> in UTF8-Isabelle would be ⟹ instead of \<Longrightarrow> (Whenever something like this is possible). Then we have our nice symbols in Jedit but can also nicely read mercurial/git diffs for example.

Fabian Huch (Jan 19 2026 at 13:21):

Well, but then you couldn't write actual ==> in a theory. This might not be a problem for this character, but I doubt people would want to trade 'more readable diffs with no/bad tooling' with 'Not being able to write equations like (a+b)=c because the system interprets )= as ⊇... ;)

terru (Jan 19 2026 at 15:14):

regarding readable diffs specifically (at least in git; i am less familiar with hg, which I don't use outside of Isabelle itself): you can tell git to consider .thy files as being "binary", and then set a textconv program which git will apply before diffing to make them readable (of course, this means they are no longer usable as patches). Of course .thy files are not actually binary, but that way you can replace the escape sequences with unicode symbols (in the cases where such symbols easily exist, at least)

Mario Xerxes Castelán Castro (Jan 22 2026 at 02:22):

Fabian Huch ha dicho:

Well, but then you couldn't write actual ==> in a theory. This might not be a problem for this character, but I doubt people would want to trade 'more readable diffs with no/bad tooling' with 'Not being able to write equations like (a+b)=c because the system interprets )= as ⊇... ;)

That is rather an argument to use spaces as in (a + b) = c which is good practice anyway.

Mario Xerxes Castelán Castro (Jan 22 2026 at 02:22):

Fabian Huch ha dicho:

Using other tools to display Isabelle theories is of course a valid concern, e.g. for VCS diffs. But for that use-case it is much simpler, and technologically much more solid, to build a diff viewer that can display Isabelle symbols properly.
Or what is the application you have in mind?

Yes, it is for VCS diffs, specifically in Mercurial.

Fabian Huch (Jan 22 2026 at 09:49):

Mario Xerxes Castelán Castro said:

Fabian Huch ha dicho:

Well, but then you couldn't write actual ==> in a theory. This might not be a problem for this character, but I doubt people would want to trade 'more readable diffs with no/bad tooling' with 'Not being able to write equations like (a+b)=c because the system interprets )= as ⊇... ;)

That is rather an argument to use spaces as in (a + b) = c which is good practice anyway.

Right. And it would be kinda cool if terms like (=) can be lexed in three different ways ;)

Fabian Huch (Jan 22 2026 at 09:50):

Mario Xerxes Castelán Castro said:

Fabian Huch ha dicho:

Using other tools to display Isabelle theories is of course a valid concern, e.g. for VCS diffs. But for that use-case it is much simpler, and technologically much more solid, to build a diff viewer that can display Isabelle symbols properly.
Or what is the application you have in mind?

Yes, it is for VCS diffs, specifically in Mercurial.

I see. Yes, I do agree that we need better tooling for that.

Alex Mobius (Feb 24 2026 at 16:45):

@Fabian Huch the "cannot represent general math" point is somewhat moot, isn't it? Isabelle's notation only has a handful of control characters (bold/sub/sup) over UTF-8, it cannot even represent such simple, widely used notations as column vectors, matrices, tall braces for piecewise functions, ... It's far even from Wolfram Mathematica in that regard. Even the multi-character subscripts are often not rendered nicely in Isabelle/jEdit!
"If you want interoperability with other tools, go use Lean instead" might not be the winning message you expect it to be.
(Also, utf-8 doesnt have to use a monospaced font, but that's neither here nor there).

Alex Mobius (Feb 24 2026 at 16:49):

terru said:

regarding readable diffs specifically (at least in git; i am less familiar with hg, which I don't use outside of Isabelle itself): you can tell git to consider .thy files as being "binary", and then set a textconv program which git will apply before diffing to make them readable (of course, this means they are no longer usable as patches). Of course .thy files are not actually binary, but that way you can replace the escape sequences with unicode symbols (in the cases where such symbols easily exist, at least)

Is there an existing textconv script that you know of, to be applied here?

Fabian Huch (Feb 24 2026 at 17:07):

Alex Mobius said:

Fabian Huch the "cannot represent general math" point is somewhat moot, isn't it? Isabelle's notation only has a handful of control characters (bold/sub/sup)

How does this contradict the point at all? Yes of course we can do even better and hopefully will support even more symbols and notation in the future.

over UTF-8

Isabelle is precisely not 'characters over UTF-8'. Neither over 'unicode', of which UTF-8 is just one encoding.

"If you want interoperability with other tools, go use Lean instead" might not be the winning message you expect it to be.

This is a neutral assessment and not some sort of "winning message".

terru (Feb 24 2026 at 17:14):

Alex Mobius said:

Is there an existing textconv script that you know of, to be applied here?

There's my own little isabelle2unicode, which just replaces escape sequences by the characters given for them in (a hardcoded copy of) etc/symbols, so, e.g. the subscript control character becomes a downwards pointing arrow. I should maybe put this in a better place sometime & explain how to use it with git, but so far I never got around to that …

Alex Mobius (Feb 24 2026 at 19:44):

@Fabian Huch fair enough, apologies for misreading your tone then.

Fabian Huch (Feb 25 2026 at 09:19):

terru said:

There's my own little [isabelle2unicode](https://stuebinm.eu/git/isabelle-utils/), which just replaces escape sequences by the characters given for them in (a hardcoded copy of) `etc/symbols`, so, e.g. the subscript control character becomes a downwards pointing arrow. I should maybe put this in a better place sometime & explain how to use it with git, but so far I never got around to that …

FYI, this can be achieved simply by calling Symbol.decode in Isabelle/Scala, i.e., such a command-line tool can be implemented in ~40 lines like this: isabelle-unicode.zip

terru (Feb 25 2026 at 15:46):

Yes, and then (on my laptop) it takes ~4 seconds to do one file, as it has to start a JVM first (well, one could implement some kind of batch-mode perhaps). Not really the best solution for a tool I want git to call on each file that was changed in a diff, and still be near-instant enough that looking at a git log does not feel sluggish …

Last updated: Feb 27 2026 at 20:31 UTC