[isabelle] Basic IsarToLatex markup completed · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Basic IsarToLatex markup completed

Email Gateway (Aug 19 2022 at 09:14):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
Hi,

I don't expect anything back on this, and I'm not trying to convince
anyone of anything, but I send it in anyway. I got the important parts
of my IsarToLatex markup language finished/started, and now I'm ready to
start using it. The goal is, after all, to prove things, not just mess
around with software.

My last email I thought there wouldn't be anything for anyone to say,
but then I got an important proof from Christian Sternagel; though the
proof is trivial, the result is not unimportant, and the discussion is
and was instructive.

My use of that proof would be Example 1.1.8, Theorem 1.1.9, and Proof
1.1.9.1 of the attached PDF, pages 3 and 4.

My nifty script allows you to see from the PDF what line numbers to look
for in the THY, which would be lines 168, 220, and 226.

The basic idea of my humble man's WYSIWYG is:

1) Create markup delimiters by subscripting or superscripting select
ASCII or \<...> characters,
2) in non-standard ways,
3) that make the symbols smaller and more discrete,
4) but yet you can still recognize them and not confuse them for similar
looking characters, even when using a small font.

As to why I want a humble man's WYSIWYG, I won't explain that.

I have it to where there's a close correlation between how the THY
looks, and how the PDF looks. I still have to compile to make sure I
haven't made mistakes, but by not having to put in newlines for arrays,
I should make a few less mistakes.

I have the basic LaTeX commands implemented where I currently don't have
to use \, {, } and \\. The basics commands are inline math, math
symbols, equations and array environments, \label, \ref, \eqref, \cite,
\index, \texttt, and the very important theorem environments, which I
use in conjunction with a verbatim environment.

There are three attachments. My working THY, sTs01.thy, which is the
primary file used to produce sTs.pdf. There's a jEdit screenshot for
people who can't or don't want to load the THY. I printed out sTs01.thy
with a PDF printer driver, but it doesn't do justice to what you see in
jEdit, and the printout was 700K, where the PDF generated from LaTeX is
125K.

The screenshot shows some of my markup characters, and I'm using a size
10 font. The spellchecker is underlining some things.

I got some good ideas from the standard way Isabelle does the LaTeX build.

The import file "i" is not needed in sTs01.thy. The margins for the PDF
are chosen for the way I prefer to read a PDF on a computer.

As to generating other THY files from a working file like sTs01.thy,
such as a THY that ASCII lovers would find acceptable and useful, that's
a future possibility, but not to be done if unnecessary. These days,
most people don't expect HTML pages to be converted to TXT files.

Regards,
GB
jEdit markup.png
sTs.pdf
sTs01.thy

Email Gateway (Aug 19 2022 at 09:16):

From: Makarius <makarius@sketis.net>
On Wed, 28 Nov 2012, Gottfried Barrow wrote:

I got the important parts of my IsarToLatex markup language
finished/started, and now I'm ready to start using it. The goal is,
after all, to prove things, not just mess around with software.

The basic idea of my humble man's WYSIWYG is:

1) Create markup delimiters by subscripting or superscripting select ASCII or
\<...> characters,
2) in non-standard ways,
3) that make the symbols smaller and more discrete,
4) but yet you can still recognize them and not confuse them for similar
looking characters, even when using a small font.

Interesting what can be done with a few control symbols. It appears to be
generally in line with the "markdown" family of tools. Do you have your
own implementation somewhere on the web?

I would have never got the idea to make a subscripted ASCII prime. Such
things depend a lot on the actual font being used, though.

There's a jEdit screenshot for people who can't or don't want to load
the THY.

The screenshot shows some of my markup characters, and I'm using a size
10 font. The spellchecker is underlining some things.

Which spellchecker are you using here? Are you satisfied with it? At some
point I would like to wire-up a reasonably good one with Isabelle/jEdit in
a way that it knows about the formal vs. informal parts of the text. So
it won't hilight the formal option "SAT4J", for example.

As to generating other THY files from a working file like sTs01.thy,
such as a THY that ASCII lovers would find acceptable and useful, that's
a future possibility, but not to be done if unnecessary. These days,
most people don't expect HTML pages to be converted to TXT files.

At some point I would like to see the Isabelle/Isar sources rendered
nicely in HTML, based on similar pseudo-markup symbols as you have here,
or like a variant of Markdown that is using non-ASCII characters.

Luckily Isabelle has an infinite amount of \<foo> symbols to play with,
and they can be assigned to Unicode glyphs on demand.

Makarius

Email Gateway (Aug 19 2022 at 09:25):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
On 11/30/2012 5:03 PM, Makarius wrote:

Interesting what can be done with a few control symbols. It appears
to be generally in line with the "markdown" family of tools. Do you
have your own implementation somewhere on the web?

No. It's just me hacking around with the scripting language of WinEdit,
which is a LaTeX editor with it's own flavor of regular expressions:
http://winedt.com/

Consequently, the scripts aren't suitable for public distribution for a
product that's tied into jEdit. I'd like to switch over to jython, so I
can give Isabelle users any scripts I write, but that's not going to
happen for a long time.

I'm also not motivated at this time to get anyone to do anything that
creates forks off of what's being done using only the official
distribution. And there's plenty of time to wait, because I'll be making
lots of additions and changes as I use what I'm doing more and more. For
example, I only had a long form for \eqref, like Equation <ymdhMMa>, and
I've added a short form, like <Eq`ymdhMMa>, so that only the equation
number gets printed.

This ties into what Christian Sternagel said in the message thread that
preceded this one, that if possible, it's best to get features
implemented officially.

You can have anything you want of what I've done. To give you the
scripts, I would probably have to zip up the WinEdit executable so you
could figure out how the scripting language works, along with using the
help file.

What I don't want to publicly distribute right now is my "Greek theme
layout". I'd like to use it a while before I give people the LaTeX
sources, not that people couldn't easily figure out what I'm doing.

What I'm doing is a 3900 line LaTeX preamble, with a ton of clutter and
commented lines, and a 825 line WinEdit script, which is at least half
commented lines.

I would have never got the idea to make a subscripted ASCII prime.
Such things depend a lot on the actual font being used, though.

It wasn't obvious to me, but you made it available as an identifier
character, and one thing led to another. For markup tags, the
subscripted reverse tick, , is an excellent character, at least with the default jEdit font on Windows 7. A subscripted and subscripted '
are distinguishable, whereas, a subscripted ' looks the same as lots of
other subscripted characters.

But you bring up a good point that hasn't occurred to me, that all my
choices of subscripted characters, though I can distinguish them with my
font, other people might not be able to.

With Isar identifiers, using subscripted ' is not a problem because
there are only a few non-alphabet or non-numeric characters that can be
used, but there are a bunch of different subscripted characters that
make a small little dot, and people should be able to know what
character they're looking at.

So maybe I'll put some thought into trying to get away from using
subscripted ', `, and *.

Which spellchecker are you using here? Are you satisfied with it? At
some point I would like to wire-up a reasonably good one with
Isabelle/jEdit in a way that it knows about the formal vs. informal
parts of the text. So it won't hilight the formal option "SAT4J", for
example.

I use the VoxSpell plugin: http://plugins.jedit.org/plugins/?VoxSpell

It comes with a dictionary. I tried the SpellCheck plugin, which can use
Aspell and several other dictionaries, including VoxSpell, if I remember
right.

VoxSpell is the only one that checks your spelling as you type. Yea, I
like it. You can't edit its dictionary directly, so I couldn't figure
out how to remove a word from its dictionary, after I had added, but if
you've added an underlined word, such as "word", if you right click on
"word", there's an item in the menu which says "reset 'word'".

...but not to be done if unnecessary. These days, most people don't
expect HTML pages to be converted to TXT files.

At some point I would like to see the Isabelle/Isar sources rendered
nicely in HTML, based on similar pseudo-markup symbols as you have
here, or like a variant of Markdown that is using non-ASCII characters.

Luckily Isabelle has an infinite amount of \<foo> symbols to play
with, and they can be assigned to Unicode glyphs on demand.

As far as interface, I try not to make any requests, since you were way
ahead of everyone anyway. I'm sure the HTML feature will be useful, but
it's not like I need to wait on it to get what I need now. The basics of
what I need now is pretty much there, and I didn't do anything.

I will make a pseudo-request, but first, you mention our ability to add
\<foo> commands to our etc/symbols, and that ties into a fundamental
rule I try to follow which is, "Try to never require anything of another
Isabelle user other than the default distribution".

So, if I put new \<foo>s in my etc/symbols that I use for markup tags,
then other people will need to use my etc/symbols file to see what I
see, and I can be pretty sure that no one will want to do that.

This leads into the \<^isub> command. What I want is not just characters
for markup tags, but subscripted characters. Your point is good about
the possibility of subscripted primes not being distinguishable under
certain fonts, but subscripted \<rhd>, \<lhd>, \<lless>, and
\<ggreater>, which I'm using for equations, would probably work under
most fonts.

Here are three things that would make for safer markup:

1) You leave \<^isub> and \<^isup> in Isar.
2) They and the character that follows them get ignored, other than the
fact that the character gets subscripted or superscripted.
3) They're highlighted different than the characters following \<^sub>
and \<^sup>.

Without item three, it might just end up confusing people, if they're
trying to figure out what the markup tags are. Is it \<^isub>' or
\<^sub>'? You wouldn't be able to tell.

That's not a request. It's merely a pseudo-request. I wouldn't know
what's best.

Regards,
GB

Email Gateway (Aug 19 2022 at 09:27):

From: Makarius <makarius@sketis.net>
On Fri, 30 Nov 2012, Gottfried Barrow wrote:

Here are three things that would make for safer markup:

1) You leave \<^isub> and \<^isup> in Isar.

2) They and the character that follows them get ignored, other than the fact
that the character gets subscripted or superscripted.

Does that mean you want to write "foo\<^sub>\<lless>" as identifier?
Or \<^sub>\<lless> anywhere in the Isar outer syntax to be like a blank?

3) They're highlighted different than the characters following \<^sub> and
\<^sup>.

Without item three, it might just end up confusing people, if they're trying
to figure out what the markup tags are. Is it \<^isub>' or \<^sub>'? You
wouldn't be able to tell.

The confusion of \<^isub> version \<^sub> has been there all the time,
starting when \<^isub> was introduced. There were historical reasons to
have subscripts within an identifier different from subscripts as part of
notation, because there was a built-in notation using it (a variant of
application). So there might actually be a chance now to unify identifier
syntax to use \<^sub> only, and drop the special rendering of \<^isub>
\<^isup> in Isabelle/jEdit.

In general, the jEdit text view can render "\<^foo>\<bar>" or "\<^foo>a"
in almost arbitrary ways that fit on a 1-dimensional line of text. Any
affine transformation on the font can be applied. For \<^sub>a the glyph
for \<^sub> is made very thin (and in the next release also with 100%
transparency) and the "a" is scaled/translated to give a nice subscript
(nicer than the official Java subscript style).

One could also change the font family on the spot, for each of the two
symbols separately, but this is presently not done.

Makarius

Email Gateway (Aug 19 2022 at 09:30):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
On 12/1/2012 4:02 AM, Makarius wrote:

On Fri, 30 Nov 2012, Gottfried Barrow wrote:

Here are three things that would make for safer markup:

1) You leave \<^isub> and \<^isup> in Isar.

2) They and the character that follows them get ignored, other than
the fact that the character gets subscripted or superscripted.

Does that mean you want to write "foo\<^sub>\<lless>" as identifier?
Or \<^sub>\<lless> anywhere in the Isar outer syntax to be like a blank?

I answer those questions after a few preliminary remarks.

In place of \<^isub>, I'll call what I want \<^msub>, for user-markup
subscript. I want to have two different subscript commands, the normal
one, \<^sub> (which you've said will be the only one), and this new
command \<^msub>.

I want to write something like this:

--"\<^msub>' Please consider the equation \<^msub>'x+1\<^msub>' in
theorem \<^msub>*my\<^sub>'theorem\<^msub>*."

This above is an Isar comment that will become normal LaTeX text, that
is, text in a text{...}.

1) The --"\<^msub>'..." marks it as normal LaTeX text that's outside of
any other environment.
2) The \<^msub>'....\<^msub>' is a math inline equation.
3) The \<^msub*.....\<^msub>* is typewriter font.
4) The "my\<^sub>'theorem" is a normal Isar identifier with the allowed
subscripted prime, \<^sub>', in the name. I will have possibly copied
and pasted "my\<^sub>'theorem" into the comment. Plus, if \<^sub>' is
formally part of the identifier, then I definitely don't want to use
another character in its place, even if its in a comment.

I used the word "ignored" because I got confused a little. I've only
been talking about using \<^isub> inside a comment, --"...", where I can
already do what I want with it.

I'll expand on describing these new commands, \<^msub> and \<^msup>.

1) They are reserved for user's use. A user can interpret a \<^msub> or
\<^msup> subscripted or superscripted character any way they want.
2) They hide nothing. They subscript or superscript the character that
follows them in a reasonably normal manner.
3) The character that they subscript or superscript is distinguishable
from characters subscripted and superscripted with \<^sub> and \<^sup>.
4) They can be used anywhere in a THY file.

Item 4 would be where you would sometimes need to "ignore" a \<^msub> or
\<^msup>, and the character after them.

I haven't even been wanting to markup anything outside of a --"..."
comment, but I throw out item 4 to make the use of \<^msub> and \<^msup>
as general as possible.

Having said all that, I specifically answer your two questions to make
sure I have.

1) "Does that mean you want to write "foo\<^sub>\<lless>" as identifier?"

No. The rules and available characters for identifiers are fine with me.
Part of the reason I want a \<^msub> is because I heavily use \<^isub>'
in identifier names, so trying to use \<^isub>'.....\<^isub>' as
delimited text can't easily be done. In general I have to be careful
about trying to use \<^isub>' to create pairs of delimiters, and ' is a
great subscripted character because it's subtle and it's easy to type.

2) "Or \<^sub>\<lless> anywhere in the Isar outer syntax to be like a
blank? ".

No. With this simple markup I'm talking about, I don't want anything to
be completely hidden. If I decide to use markup, like

\<^msub>\<lless>a big equation array\<^msub>\<ggreater>,

though I want \<lless> and \<ggreater> to be small, I want to know that
they're there, and recognize what they are.

If you have something fancy in mind to give us, that's fine, but my
system is simply based on typing characters in a normal way. We need to
see what we typed.

In general, the jEdit text view can render "\<^foo>\<bar>" or
"\<^foo>a" in almost arbitrary ways that fit on a 1-dimensional line
of text. Any affine transformation on the font can be applied. For
\<^sub>a the glyph for \<^sub> is made very thin (and in the next
release also with 100% transparency) and the "a" is scaled/translated
to give a nice subscript (nicer than the official Java subscript style).

One could also change the font family on the spot, for each of the two
symbols separately, but this is presently not done.

This brings up a restatement of one of my rules, which would be, "The
THY can only be edited in a way which conforms to the rules".

I can do all the preprocessing I want on a THY before I feed it to
LaTeX, but if someone wants to open up one of my THYs, then I want them
to see what I have been seeing, and I don't them to get any errors.

Basically, even if I could figure out a way to do a lot of preprocessing
on the text in a THY before it got to your prover engine, I wouldn't
want to do it, unless it performed some great magic, but not I'm a
magician in these matters. The only "One" making changes would be you. I
just work within the rules.

Regards,
GB

Email Gateway (Aug 19 2022 at 09:31):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
First, I now use \<^usub> and \<^usup> as the new, hypothetical commands
you will supply in Isar. The "u" is to emphasize "user reserved". It
could be that you want your own \<^psub> and \<^psup> "prover reserved"
markup commands.

(1) To clarify, "ignore" means that \<^usub><ASCII or Unicode character>
will get parsed for correct syntax, then get displayed as a subscripted
character (different looking than the use of \<^sub>), but not get used
in any way by the prover engine, and not get flagged as an error, no
matter where it is in a THY file.

Here, I make a pseudo sales speech. For myself, I don't have to have any
of this. I can modify my script to accommodate how I use Isar.

Suppose you gave us three control characters that behave as described
by, or similar to, (1): \<^usub>, \<^usup>, and \<^uchar>, where
\<^uchar> simply displays the character in a uniquely visible way.

You would then free up every ASCII and Unicode character to be used as a
markup tag in a THY.

The markup we're talking about is more general than HTML or LaTeX
markup. We could call it "preprocessing markup". It would give the user
the ability to use single characters, single subscripted characters, and
single superscripted characters as markup tags for any kind of
preprocessing they would want to do.

Myself, I would be using a lot of single characters tags, and I don't
think I would ever have to use more than two characters to create my
delimiters.

Regards,
GB

Email Gateway (Aug 19 2022 at 09:32):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
It seems like this would all turn out to be something like one character
comments that get implemented like (...), other than having to come up
with a good way to fit in how they would look.

But I would have eventually settled on using \<^sub> for this, since
I've only been using \<^isub>. It pays to get a little information here
and there. It would have been a rude awakening after it occurred to me
that \<^sub> was the perfect solution all along, and after using the
perfect solution for 4 months, the perfect solution then disappeared in
the upgrade in 2013.

You can upgrade my psuedo-request and psuedo-sales speech to the real thing.

Regards,
GB

Email Gateway (Aug 19 2022 at 09:34):

From: Makarius <makarius@sketis.net>
On Sat, 1 Dec 2012, Gottfried Barrow wrote:

In place of \<^isub>, I'll call what I want \<^msub>, for user-markup
subscript. I want to have two different subscript commands, the normal
one, \<^sub> (which you've said will be the only one), and this new
command \<^msub>.

I want to write something like this:

--"\<^msub>' Please consider the equation \<^msub>'x+1\<^msub>' in theorem
\<^msub>*my\<^sub>'theorem\<^msub>*."

This above is an Isar comment that will become normal LaTeX text, that is,
text in a text{...}.

1) The --"\<^msub>'..." marks it as normal LaTeX text that's outside of any
other environment.
2) The \<^msub>'....\<^msub>' is a math inline equation.
3) The \<^msub*.....\<^msub>* is typewriter font.

I'll expand on describing these new commands, \<^msub> and \<^msup>.

1) They are reserved for user's use. A user can interpret a \<^msub> or
\<^msup> subscripted or superscripted character any way they want.
2) They hide nothing. They subscript or superscript the character that
follows them in a reasonably normal manner.
3) The character that they subscript or superscript is distinguishable from
characters subscripted and superscripted with \<^sub> and \<^sup>.

I would call that control symbol \<^mark> to make its logical function
more clear. The rendering engine would merely have to assign it to some
font style + color change, like it does already for \<^sub> in a hardwired
manner. That would have to become a bit more flexible, to allow
etc/symbols changing the behaviour of control symbols.

(This is merely a summary of what I distilled from above, not any promise
that this will come in the next release.)

4) They can be used anywhere in a THY file.

Item 4 would be where you would sometimes need to "ignore" a \<^msub> or
\<^msup>, and the character after them.

I haven't even been wanting to markup anything outside of a --"..."
comment, but I throw out item 4 to make the use of \<^msub> and \<^msup>
as general as possible.

When it is not required there is no point to think about it too much.

Assigning phyisical rendering to control symbols is one thing, but
specifying a certain meaning as part of the syntax is another. The latter
is usually avoided, and the etc/symbols file does not cover that at all.
E.g. you can't say that something should be a "letter", "digit", "blank"
that was not known as such before.

This brings up a restatement of one of my rules, which would be, "The
THY can only be edited in a way which conforms to the rules".

I can do all the preprocessing I want on a THY before I feed it to
LaTeX, but if someone wants to open up one of my THYs, then I want them
to see what I have been seeing, and I don't them to get any errors.

Inventing Isabelle symbols that are not used elsewhere is perfectly within
the rules of the free-form typesetting of Isabelle documents. You can
have your scripts to post-process the symbols in the "text" ranges, and
people have done that before. It merely needs an awarenes that you don't
"declare" use of special symbols formally, so there is always a potential
to clash with other people's scripts and tools. But this is the same in
regular LaTeX.

Makarius

Email Gateway (Aug 19 2022 at 09:35):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
So I was being overly ambitious to help other people out, not myself.

For myself, all I need is this:

(1) What is now called \<^isub>, you rename to \<^sub>.
(2) What is now called \<^sub>, you rename to \<^mark>.

Item (1) is what you've already implemented in the development version
(so I understand).

Item (2) is merely renaming a control symbol that has been freed up.

I wouldn't need anything special to distinguish between the use of
\<^sub> and \<^mark>. Context would tell me which was being used.

If you were to tell me today that item (2) is what's going to be done,
then I would just start using \<^sub> as my markup command. It would
make the search and replace in my script fairly mindless, which is the
way I like it.

Thanks for the hypothetical gift. I appreciate it, hypothetically.

If inclined, you would also rename what is now \<^sup> to make it a
markup control symbol. The result would be that, effectively, no
commands are being taken away from Isar. If \<^mark> was being used, and
it looked like \<^sub>, that just means that when a THY was being marked
up, it would require more experience from the user.

Regards,
GB

Last updated: Jul 12 2025 at 08:27 UTC