[isabelle] Bug report: Generated Scala code on case-insen... · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Bug report: Generated Scala code on case-insen...

Email Gateway (Aug 22 2022 at 16:28):

From: Brandon Bohrer <bbohrer@cs.cmu.edu>
Hi All (and probably Andreas in particular),

I've got a bug to report when code-generating to Scala.
One quirk of Scala compilation is that on case-insensitive filesystems
(some Macs, probably Windows), classes (including case classes for datatype
constructors) must never differ only by case, else compilation fails when
the compiler writes multiple .class files with conflicting names.

Some of the classes generated by Isabelle violate this convention.
While it would be optimal to maintain the convention even when the user
introduces potential conflicts (see attached UserError.{thy,scala}), it's
more important to fix "built-in" code, which also has this bug.
See BugReport.scala lines 12 and 14: "class char" and "case class Char" of
"object String" conflict.
The other case I saw was "class seq" and "case class Seq" in "object
Predicate" when using the predicate compiler, see
BugReportPred.{thy,scala}.

The first solutions that come to mind are:

1) Automatically change class/case class names to avoid conflict, even if
the user gave constructors+types unfortunate names. From reading generated
code I'm under the impression that similar functionality already exists?
e.g. I see classes like "Lista" to avoid conflict with "List".

2) Go through relevant files in the standard library and rename
appropriately at the source level, but that sounds awfully invasive and
less robust.

We have worked around it for now by renaming manually after the code is
generated, which is suboptimal for obvious reasons.

Any fix would be greatly appreciated!
-Brandon
BugReport.scala
BugReport.thy
BugReportPred.scala
BugReportPred.thy
UserError.scala
UserError.thy

Email Gateway (Aug 22 2022 at 16:28):

From: Lars Hupel <hupel@in.tum.de>
Dear Brandon,

some three years ago, Florian has introduced a configuration option for
that:

declare [[scala_case_insensitive]]

It doesn't appear to be documented, though.

Cheers
Lars

Email Gateway (Aug 22 2022 at 16:28):

From: Brandon Bohrer <bbohrer@cs.cmu.edu>
Dear Lars,

That's great to hear, and thanks for the quick response!
In my experience since there are so many features that it would likely be
quite difficult to find even if it were documented.
Is there any chance of printing a warning when export_code runs saying "you
probably want to declare this option"?

Email Gateway (Aug 22 2022 at 16:28):

From: Lars Hupel <hupel@in.tum.de>

Is there any chance of printing a warning when export_code runs saying "you
probably want to declare this option"?

That remains unclear.

On the one hand, I maintain that case-insensitive file systems are
fundamentally broken and we shouldn't cater for them by default. On the
other hand, likely the majority of Isabelle users are affected by this.

So maybe the pragmatic approach would be to remove the option and do
mangling by default.

Cheers
Lars

Email Gateway (Aug 22 2022 at 16:28):

From: Jasmin Blanchette <jasmin.blanchette@inria.fr>
Lars wrote:

So maybe the pragmatic approach would be to remove the option and do
mangling by default.

We could also keep the option but change its default value.

Jasmin

Email Gateway (Aug 22 2022 at 16:28):

From: Brandon Bohrer <bbohrer@cs.cmu.edu>
I agree that just making it the default seems a lot easier than having a
message, documentaton, etc. about it.
If there's a concern that the mangling could break anything else or affect
readability, one could always replace the "scala_case_insensitive"
with a "scala_case_sensitive"
option that manually disables the mangling if someone really needs it.

-Brandon

Email Gateway (Aug 22 2022 at 16:29):

From: Makarius <makarius@sketis.net>
I wonder if scalac could directly output into a jar file, without the
dangers of the physical file-system. Maybe by invoking the Scala
compiler within Scala under program control?

On Windows and Mac OS X there is loss of information due to
case-insensibility. On Linux there are occasional problems with
file-name lengths on encrypted file-systems.

Makarius
signature.asc

Email Gateway (Aug 22 2022 at 16:29):

From: Lars Hupel <hupel@in.tum.de>
A brief look at the sources of the Scala compiler yielded this:

/** A list pairing source directories with their output directory.

* This option is not available on the command line, but can be set
by

* other tools (IDEs especially). The command line specifies a single
* output directory that is used for all source files, denoted by a
* '*' in this list.
*/
lazy val outputDirs = new OutputDirs

Additionally, there is an ad-hoc implementation of an in-memory
directory present.

I haven't checked whether all necessary operations are supported, but it
looks doable.

Email Gateway (Aug 22 2022 at 16:30):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Sounds like an interesting idea, but also more involved. I think I'll
first start with my pragmatic proposal and then gather more expertise in
that area.

Florian
signature.asc

Email Gateway (Aug 22 2022 at 16:31):

From: Lars Hupel <hupel@in.tum.de>
On a tangential note, it would be very helpful if Isabelle itself would
support virtual file systems. PIDE could easily do it by switching to
NIO paths from IO files, but the problem remains for I/O operations
performed by the prover.

(Use cases I'm thinking of are performing cross-platform in-memory
builds, or not having to check out the AFP for including a single
session with dependencies.)

Email Gateway (Aug 22 2022 at 16:31):

From: Makarius <makarius@sketis.net>
On 15/12/17 11:51, Lars Hupel wrote:

On a tangential note, it would be very helpful if Isabelle itself would
support virtual file systems. PIDE could easily do it by switching to
NIO paths from IO files, but the problem remains for I/O operations
performed by the prover.

There is already some support for that: the PIDE module resources.scala
and its derivatives for the jEdit and VSCode front-end provide some
abstractions over the file-system. Thus Isabelle/PIDE/jEdit can work
over http to some extent. E.g. try this with current Isabelle/b8f30228a55b:

isabelle jedit -l HOL-Analysis
https://bitbucket.org/isa-afp/afp-devel/raw/6e2a1bb82505/thys/Bernoulli/Bernoulli_Zeta.thy

There are various fine points of uniform URL support still missing: in
jEdit, in Java standard libraries, in Isabelle/Scala etc.

Moreover, Isabelle/Scala provides some competing abstractions of local
vs. remote SSH.System operations (file access and process execution). At
the bottom it is based on jsch. jEdit has its own virtual file-system
for ssh/sftp on top of jsch. Here are also some attempts to combine jsch
with Java NIO: https://github.com/lucastheisen/jsch-nio

Overall, I am not so convinced of the Java platform: things are
generally unfinished and not well thought out. In the past 12 months, I
have even started to think of retargeting Isabelle/Scala onto Node.js as
an alternative to the JVM, by using ScalaJs. It means the abstractions
probably need to be more abstract than standard Java interfaces.

Use cases I'm thinking of are performing cross-platform in-memory
builds

Sorry, I do not understand this idea.

or not having to check out the AFP for including a single
session with dependencies.)

In principle this could work with a slight refinement of the existing
http support, such that the session directory specifications (option -d)
can be URLs as well. Various fine points are preventing this: e.g. users
still write out generated files occasionally, even though I have tried
to work towards strictly read-only Isabelle session sources in the past
20 years.

On the other hand, a full AFP source tree has only 140 MB + 100 MB for
the hg history. That is relatively little compared to the full Isabelle
installation, and the resulting heap images will easily outweigh the
sources.

So an alternative could be just a fancy download tool for AFP subtrees
within the Prover IDE or on the command-line.

Makarius

Email Gateway (Aug 22 2022 at 16:31):

From: Lars Hupel <hupel@in.tum.de>

There is already some support for that: the PIDE module resources.scala
and its derivatives for the jEdit and VSCode front-end provide some
abstractions over the file-system. Thus Isabelle/PIDE/jEdit can work
over http to some extent. E.g. try this with current Isabelle/b8f30228a55b:

isabelle jedit -l HOL-Analysis
https://bitbucket.org/isa-afp/afp-devel/raw/6e2a1bb82505/thys/Bernoulli/Bernoulli_Zeta.thy

There are various fine points of uniform URL support still missing: in
jEdit, in Java standard libraries, in Isabelle/Scala etc.

Uniform URL support in Java is already there: Look at NIO2. There are
many different filesystem providers out there (as you are undoubtedly
aware). Unfortunately the knowledge of JSR 203 is still "fresh" in the
community (despite being released in 2011). For HTTP, this one surfaced
after a quick search: <https://github.com/magicDGS/jsr203-http>

Overall, I am not so convinced of the Java platform: things are
generally unfinished and not well thought out.

A curious thing to say about a battle-tested platform that's been used
in production for a very long time. What precisely is not "well thought
out"? Because NIO definitely is.

In the past 12 months, I
have even started to think of retargeting Isabelle/Scala onto Node.js as
an alternative to the JVM, by using ScalaJs. It means the abstractions
probably need to be more abstract than standard Java interfaces.

Scala.js is certainly a possibility. But rest assured that JavaScript as
a platform comes with its own sets of warts and trade-offs. The idea of
running the same code in the browser and on the server is harder to
achieve than how it's advertised.

Use cases I'm thinking of are performing cross-platform in-memory
builds

Sorry, I do not understand this idea.

On Linux one can just mount a tmpfs and let Isabelle emit heap images
(or whatever it needs) there, but this is not cross-platform.

So an alternative could be just a fancy download tool for AFP subtrees
within the Prover IDE or on the command-line.

This already exists. isabellectl has a flag --afp that will
automatically download the AFP (without history), register it in ROOTS,
and starts a jEdit instance. This utilizes the Maven Central
infrastructure. But it still requires unpacking the archive on disk,
which I'd like to avoid for various reasons.

Email Gateway (Aug 22 2022 at 16:31):

From: Makarius <makarius@sketis.net>
On 15/12/17 17:11, Lars Hupel wrote:

Uniform URL support in Java is already there: Look at NIO2. There are
many different filesystem providers out there (as you are undoubtedly
aware). Unfortunately the knowledge of JSR 203 is still "fresh" in the
community (despite being released in 2011). For HTTP, this one surfaced
after a quick search: <https://github.com/magicDGS/jsr203-http>

I will take a look at this eventually.

Overall, I am not so convinced of the Java platform: things are
generally unfinished and not well thought out.

A curious thing to say about a battle-tested platform that's been used
in production for a very long time. What precisely is not "well thought
out"? Because NIO definitely is.

The "Java is great" propaganda has been with us since Java 1.0. After
struggling myself with the platform for 10 years, I don't believe it
anymore. It usually requires hard work and strange tricks to make
existing Java libraries work properly. (That might be also a business
model of that community, but the mainstream is already moving elsewhere.)

In the past 12 months, I
have even started to think of retargeting Isabelle/Scala onto Node.js as
an alternative to the JVM, by using ScalaJs. It means the abstractions
probably need to be more abstract than standard Java interfaces.

Scala.js is certainly a possibility. But rest assured that JavaScript as
a platform comes with its own sets of warts and trade-offs. The idea of
running the same code in the browser and on the server is harder to
achieve than how it's advertised.

JavaScript and Node.js are definitely crap. As usual the idea is to work
on the intersection of the various platforms, and build a separate
Isabelle edifice on top of the rubble.

Presently this is merely a speculative projection of Isabelle/VSCode
into the future. Isabelle/jEdit and Isabelle/Scala/JVM are not going to
disappear, but the exclusive dependency on the Java universe might come
to an end.

This already exists. isabellectl has a flag --afp that will
automatically download the AFP (without history), register it in ROOTS,
and starts a jEdit instance. This utilizes the Maven Central
infrastructure. But it still requires unpacking the archive on disk,
which I'd like to avoid for various reasons.

I was thinking of direct access to https://bitbucket.org/isa-afp without
Java artifact stores.

What are the various reasons? Isabelle session builds still require
local file-system storage.

Makarius

Email Gateway (Aug 22 2022 at 16:31):

From: Lars Hupel <hupel@in.tum.de>

I was thinking of direct access to https://bitbucket.org/isa-afp without
Java artifact stores.

That (almost) works, too.

What are the various reasons? Isabelle session builds still require
local file-system storage.

This is the reason why I was suggesting earlier on this thread that it
would be convenient if it didn't require the local file system. Mostly
the problem is that supporting platforms like Windows, where symlinking
is hard without administrative privileges, requires a lot of copying
things around.

Email Gateway (Aug 22 2022 at 16:31):

From: Makarius <makarius@sketis.net>
At the bottom of all this are Poly/ML heap images. They do require a
regular file-system, and I don't know how to do it differently.

I have occasionally pondered the question if the heap could be a blob
within SQLite or PostgreSQL (without Java getting in between), but David
Matthews is the only one who can possibly do anything here.

Makarius

Email Gateway (Aug 22 2022 at 16:31):

From: David Matthews <dm@prolingua.co.uk>
When a heap image is exported all the ML code is stopped. That means
that it can't invoke any ML functions to process the bytes. It might be
able to turn the heap image into a byte vector and return that so that
ML code could process it later. The problem I can see there is that
this would require more memory and we've already encountered problems
with shortage of memory during heap exports. Could you not export the
heap to a temporary file, import that into your database and then delete
the temporary file? Loading a heap image would be the reverse.

David

Email Gateway (Aug 22 2022 at 16:32):

From: Makarius <makarius@sketis.net>
Yes, if/when I really need the database of heaps. But there are
presently other priorities and no need to change this aspect of Poly/ML.

Makarius

Email Gateway (Aug 22 2022 at 16:32):

From: David Matthews <dm@prolingua.co.uk>
It doesn't require any change to Poly/ML; you can do it now. The only
complication is with hierarchies of heaps i.e. dependencies such as HOL
on Pure. In order to load a child you will have to export all its
parents into temporary files and then use PolyML.SaveState.loadHierarchy.

David

Email Gateway (Aug 22 2022 at 16:32):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
See now http://isabelle.in.tum.de/repos/isabelle/rev/ad538f6c5d2f

Cheers,
Florian
signature.asc

Email Gateway (Aug 22 2022 at 16:33):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>
Hi all,

the option »scala_case_sensitive« has indeed been introduced quite
ad-hoc three years ago.

After some thinking about I came to the conclusion that it would be far
better to turn it into an explicit argument to the Scala serializer and
document it accordingly.

I am sceptic about making it the default since case distinction is used
prominently in HOL (e.g. gcd / Gcd) and it would diminish readability
significantly.

Cheers,
Florian
signature.asc

Last updated: Jul 05 2025 at 20:22 UTC