Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] What naming will cause potential errors?


view this post on Zulip Email Gateway (Aug 19 2022 at 10:49):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
Hi,

On the developer's list I saw some discussion about problems that naming
can cause.

If possible, I'd like to know what naming conventions I "have to" follow
to prevent future, potential errors due to naming.

I looked at the implementation manual:

http://isabelle.in.tum.de/website-Isabelle2013/dist/Isabelle2013/doc/implementation.pdf

Naming conventions for ML are spelled out some on page 3, "0.1.2 Naming
conventions". I'm not using ML, so I'm only concerned with Isar naming,
and it wasn't obvious to me that any hard rules exists, such as in
section "Names", page 46.

I'll take two specific cases, theory names and theorem names.

I was naming my theory with lowercase first letter. I have now changed
the first letter to uppercase, since it's a minor thing, and I have this
hunch that somewhere, there's a need for fully qualified names to abide
by that.

For theorem names, I name many theorems starting with uppercase, which
seems to be contrary to the convention in the HOL sources.

I have good reason to do this, because I care more about following
mathematical notation conventions than following Isabelle coding
conventions.

However, if someone tells me that my naming scheme will potentially
cause errors, rather than just annoy people, I am more than ready to
give preference to any naming convention that "must be" followed to
prevent potential errors.

But I don't want to change my naming scheme without knowing what the
rules are because there are lots of different objects, in particular, in
addition to theorem names, there are locales, classes, and sublocales,
which I see that I may need now.

Can anyone tell me what Isar naming conventions "must be" followed to
prevent potential errors?

Thanks,
GB

view this post on Zulip Email Gateway (Aug 19 2022 at 10:51):

From: Makarius <makarius@sketis.net>
There is nothing you "have to" follow. Naming conventions are a matter to
keep things simple at very little cost -- they belong to "best practices".

Observing certain customs routinely saves a lot of energy to rethink and
reconsider things over and over again. Sometimes this has to be done,
nonethrless: e.g. for Isabelle/Scala I've re-used many Isabelle/ML
conventions at the cost of violating other conventions on the Java Virtual
Machine, and Scala itself tends towards very mixed ways of writing. I had
to spend extra time to study history, manuals, websites to ensure that no
bad consequences are coming from Graph.is_minimal in both Isabelle/ML and
Isabelle/Scala, and not Graph.is_minimal here and Graph.isMinimal there
(IAmNotGoingIntoDetailsWhyCamelCaseIsNotUsedInOurTradition).

Anyway, the mail thread you mention was initiated by the surprise that
using a lower-case theory name in Isabelle/HOL makes the Haskell code
generator fail due to resulting lower-case module name, which Haskell
rejects.

Theory names in Isabelle have a long-standing naming tradition: capital
words (usually in singular), separated by underscore.

Example (good): Vector_Space.

Counter-examples (bad): Vector_Spaces, VectorSpaces, vectorSpaces,
vectorspace etc.

Many years ago, theory names were als very short, since users had to
repeat typing use_thy invocations a lot on the TTY, but has changed a lot
with the introduction of Proof General in 1998. Today there is nothing
special saying outright what you mean with theory Boolean_Algebra, which
former generations would have squeezed that towards 8 or 5 characters.

The majority of material within a theory body uses lower case names (also
separate by underscore). Some special situation have capitalized or
all-caps names, e.g. "max", "Max", "MAX" for certain families of
operations.

Locale names are within a theory body, and thus lower-case.

Example:

theory Vector_Space
imports Main
begin

locale vector_space = ...
begin

definition "foo = ..."

end

end

This produces fully qualified names like

Vector_Space.vector_space.foo

with name space accesses like

Vector_Space.foo -- "global access"
vector_space.foo -- "local access"

That is the basic observation from 1999, when locales were first
introduced. Further complexity has piled up since then, but the basic
advantage of having access to two important hierary levels is still there
-- by convention, not by special machinery that would introduce extra
complexity.

You do get overlaps in certain more complex cases nonetheless, but the
convention is somewhat better off than doing it just arbitrarily.

Makarius

view this post on Zulip Email Gateway (Aug 19 2022 at 10:51):

From: Makarius <makarius@sketis.net>
On Sun, 7 Apr 2013, Gottfried Barrow wrote:

For theorem names, I name many theorems starting with uppercase, which
seems to be contrary to the convention in the HOL sources.

I have good reason to do this, because I care more about following
mathematical notation conventions than following Isabelle coding
conventions.

In some sense the "coding" conventioned deviated by accident (or on
purpose) from what you often see in informal "prose" text, but I would
consider this an improvement. We are not writing for the New York Times
here.

Isabelle sources are coming from certain free-world Unixoid background
where lowercase is preferred. The ancient empire of IBM had enforced
all-caps variable names, and the advent of proper lowercase in ASCII was a
great relieve (together with the invention of the underscore as formal
separator).

I did not invent these conventions, but I do think that lower-case is
better for the mass of text. Incidently, Coq prefers capital names and
keywords by default, and just last year some hard-core Coq users pointed
this out as historical accident: it causes a lot of pain to press SHIFT so
often, especially in Proof General. The proposed solution was then to use
vi keyboard mapping for Emacs. (If this is madness, there is method in
it.)

Note that I normally don't argue about the ease of writing the sources,
since ease of reading has the prime priority. This is why the
underscore is important to separate words, and lower-case helps the
Western brain to gloss quickly over large bodies of text (this is why the
scribes of Charlemagne invented the miniscules). Uppercase still has its
role in situations of special emphasis, e.g. "max", "Max", "MAX", but it
would loose that benefit if it were used by default.

However, if someone tells me that my naming scheme will potentially cause
errors, rather than just annoy people

Annoyance only starts when people try to combine several independent
theory libraries into bigger coherent applications.

Until recently, hardly anything was done in this respect, apart from the
main libraries that are shipped with Isabelle (main HOL and HOL/Library).
I am pleased to see that in recent years, some AFP articles have started
to aggregate, and build up beginnings of modest dependency structures.

But I don't want to change my naming scheme without knowing what the
rules are because there are lots of different objects, in particular, in
addition to theorem names, there are locales, classes, and sublocales,
which I see that I may need now.

As a general principle in Isabelle, "different objects" don't differ in
the naming convention, but make them coincide as far as sensible, even
with the actual names. Each category of a separate name space, so you
normally you don't have a dog "foo_dog", cat "foo_cat", mouse "foo_mouse",
but dog "foo", cat "foo", mouse "foo".

In practice, this could mean class "nat", type "nat", const "nat", theorem
"nat". Fine points need to be observed when different categories appear in
clusters: lets say some type "nat" and some const "nat" that both produce
some theorem "nat.induct", which would then clash in the common theorem
name space.

Makarius

view this post on Zulip Email Gateway (Aug 19 2022 at 10:52):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
On 4/12/2013 3:38 PM, Makarius wrote:

There is nothing you "have to" follow. Naming conventions are a
matter to keep things simple at very little cost -- they belong to
"best practices".

Thanks for the lengthy reply. I'll take your word that I can name
identifiers in any way that the prover engine will accept when I'm
working in jEdit. Still, I'll remain paranoid about this for years to come.

Observing certain customs routinely saves a lot of energy to rethink
and reconsider things over and over again. Sometimes this has to be
done, nonethrless: e.g. for Isabelle/Scala I've re-used many
Isabelle/ML conventions at the cost of violating other conventions on
the Java Virtual Machine, and Scala itself tends towards very mixed
ways of writing. I had to spend extra time to study history, manuals,
websites to ensure that no bad consequences are coming from
Graph.is_minimal in both Isabelle/ML and Isabelle/Scala, and not
Graph.is_minimal here and Graph.isMinimal there
(IAmNotGoingIntoDetailsWhyCamelCaseIsNotUsedInOurTradition).

Anyway, the mail thread you mention was initiated by the surprise that
using a lower-case theory name in Isabelle/HOL makes the Haskell code
generator fail due to resulting lower-case module name, which Haskell
rejects.

It seems there were several overlapping threads there, but as to naming,
someone eventually sort of asked, "Can you please enforce naming rules
so that we don't end up with error prone code that goes undetected." And
you sort of said, "I don't want to enforce anything I don't have to
enforce." Consequently, that's why I'll stay paranoid for years to come.

At this time, I'm not concerned with any errors that can result from
using external tools, only errors produced by the tools that Isabelle
officially supports. If the purple bar turns grey at the command "end"
in my THY file, and that represents that my naming conventions won't
cause problems with other Isabelle tools, then that's good enough for me.

You do get overlaps in certain more complex cases nonetheless, but the
convention is somewhat better off than doing it just arbitrarily.

I've put some effort into coming with a naming scheme, mainly for the
purposes of readability, name meaning, and consistency of style. My
question here has been to try and not get vested in some naming scheme
that comes back to haunt me.

I imagine a scenario like this. Someone says, "I tried to use your
theory, but it's causing a ton of errors in another Isabelle tool, and
it's because of your naming scheme." And I reply, "I see. It's very
unfortunate that for these last 3 years I never had a need for that
tool, and I now regret again that my simplistic view of the world has
again caused me trouble."

Regards,
GB

view this post on Zulip Email Gateway (Aug 19 2022 at 10:52):

From: Gottfried Barrow <gottfried.barrow@gmx.com>
Your statement, "Note that I normally don't argue about the ease of
writing the sources, since ease of reading has the prime priority",
is usefully quotable. For one to be opinionated about a topic can many
times result in one being perceived as presumptuous, whereas one being
opinionated about the same topic, while also providing a quote from
someone with good, establishment credentials, can, instead, result in
putting others on the defense.

I had already "primarily" adopted your non-use of camel case, but the
other half to readability is the line length of code, and standard
mathematical notation demonstrates how much information you can get on
one line, where, other than the graphical symbols, it's mainly because
of the use of single letters, or single subscripted letters.

I do, though, use a little bit of camel case, so today, I've made me a
rule about that. The rule is that I only use camel case to separate
characters or cryptic sequences of characters, or characters acting as a
prefix or suffix to words.

For example, I'm naming my operators like inP, ssO, eqO, xeO, geU, geI,
and so forth. And then I also have a name for the same operators with
names something like P\<^isub>I\<^isub>n, where part of the reasoning of
both styles of names is to try and prevent global names clashes with the
HOL sources, and what names other people are using. My use of a name
P\<^isub>I\<^isub>n then requires sometimes that a theorem name will
start with P\<^isub>I\<^isub>n.

What I've said about my naming is still tied into my original question
for this thread. First, I want to know what the rules are (which you've
already answered). It's best to stay with community convention if
breaking convention will cause problems down the road.

Regards,
GB

view this post on Zulip Email Gateway (Aug 19 2022 at 10:53):

From: Makarius <makarius@sketis.net>
On Sat, 13 Apr 2013, Gottfried Barrow wrote:

It seems there were several overlapping threads there, but as to naming,
someone eventually sort of asked, "Can you please enforce naming rules
so that we don't end up with error prone code that goes undetected." And
you sort of said, "I don't want to enforce anything I don't have to
enforce." Consequently, that's why I'll stay paranoid for years to come.

I don't think you have to be paranoid, as long as you stay somewhere in
the middle ground what you normally see in public Isabelle applications.

Concerning enforcement of rules, there has been a slight tendency over the
years to become more strict: tell more explicitly via errors or warnings
instead of just writing manuals about the system. It is always a long and
tedious process to do so, since one first needs to pin down the rules
precisely, implement them, update existing applications accordingly, and
finally handle seasoned users who argue differently.

At this time, I'm not concerned with any errors that can result from
using external tools, only errors produced by the tools that Isabelle
officially supports. If the purple bar turns grey at the command "end"
in my THY file, and that represents that my naming conventions won't
cause problems with other Isabelle tools, then that's good enough for
me.

It is hard to tell where "internal" ends and "external" starts. The
Isabelle/HOL code generator for Haskell is part of the standard Isabelle
disctibution, and used routinely in many applications.

Another example is the file-system: Is that internal or external to the
prover? A theory called "\<A>\<^isub>1" won't work for relatively obvious
reasons; one called "Aux" is less obvious -- it used to work in
Isabelle2012 due to Cygwin, but the Scala/JVM build system now insists in
Windows/MS-DOS conformity. Further uncertainty is introduced routinely
due to case-insensitive file-systems.

Note that Isabelle/jEdit has its own way to manage files for the prover,
so there are subtle differences to someone used to Proof General.

I imagine a scenario like this. Someone says, "I tried to use your
theory, but it's causing a ton of errors in another Isabelle tool, and
it's because of your naming scheme." And I reply, "I see. It's very
unfortunate that for these last 3 years I never had a need for that
tool, and I now regret again that my simplistic view of the world has
again caused me trouble."

This can always happen. Theory developments being used by others will
have to be refined and smoothened. A similar smoothening effect happens
to user applications after more than 3 updates of the underlying Isabelle
version -- I think you have noticed this already :-)

Makarius


Last updated: Mar 29 2024 at 12:28 UTC