Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Isabelle’s getsettings somehow broken


view this post on Zulip Email Gateway (Aug 22 2022 at 12:33):

From: Joachim Breitner <breitner@kit.edu>
Hi,

Luckily we have poor souls running sliding update distributions (Debian
unstable here) who can suffer from the fallout in time for it to be
fixed before the other users are affected.

Anyways, the problem also happens with 2016-rc4, so this is likely
something worth investigating before the release. After further
digging, it turns out that it works if /bin/sh does point to /bin/bash
instead of /bin/dash. Bisecting the version ranges of dash, the problem
started to appear in 0.5.8. Likely contender:
http://git.kernel.org/cgit/utils/dash/dash.git/commit/?id=46d3c1a614f11f0d40a7e73376359618ff07abcd

I notified the dash developers about the problem, but dash 0.5.8 will
eventually be used in plenty of places where we probably want to use
Isabelle out of the box, so I suggest to either ensure that bash
scripts within Isabelle call each other via bash only, or (likely more
reliable) do not rely on exporting bash functions via the environment,
but source their definition in every shell.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:34):

From: Joachim Breitner <breitner@kit.edu>
Dear Makarius,

Am Mittwoch, den 10.02.2016, 17:54 +0100 schrieb Makarius:

On Wed, 10 Feb 2016, Joachim Breitner wrote:

After further digging, it turns out that it works if /bin/sh does point 
to /bin/bash instead of /bin/dash. Bisecting the version ranges of dash, 
the problem started to appear in 0.5.8. Likely contender: 
http://git.kernel.org/cgit/utils/dash/dash.git/commit/?id=46d3c1a614f11f0d40a7e73376359618ff07abcd

That change is from 2012. My impression is that the bash guys have already 
taken measures against such "sanitised environments" some years ago: the 
shell functions are represented as regular assignments of the form A=B, 
e.g. BASH_FUNC_isabelle_scala%%=() ...

I notified the dash developers about the problem, but dash 0.5.8

There must be a new problem in dash 0.5.8. 

..as I said, it’s the commit above. If you look at
http://git.kernel.org/cgit/utils/dash/dash.git/log/
and press next you see that it is contained in the 0.5.8 tag, but not
the 0.5.7 tag dated 2011-07-08.

dash’s upstream (Erik Blake) says this is not a bug and POSIX conform:
http://www.mail-archive.com/dash@vger.kernel.org/msg01147.html

I suggest to either ensure that bash scripts within Isabelle call each 
other via bash only, or (likely more reliable) do not rely on exporting 
bash functions via the environment, but source their definition in every 
shell.

I've spent some years on variations on all that already, refining it over 
and over again.

Isabelle exclusively uses bash wherever possible. Sometimes other system 
function sneak-in unreliable /bin/sh.

I digged deeper, assuming the poisonous 'sh -c "..."' pattern
somewhere, and using strace I saw these four calls going via /bin/sh:
[pid 19336] execve("/bin/sh", ["sh", "-c", "exec \"$EXEC_PROCESS\" '/tmp/isabelle-jojo19273/bash_pid995593' \"exec bash '/tmp/isabelle-jojo19273/bash_script995593' > '/tmp/isabelle-jojo19273/bash_out995593' 2> '/tmp/isabelle-jojo19273/bash_err995593'\""], [/* 187 vars */]) = 0
[pid 19352] execve("/bin/sh", ["sh", "-c", "exec \"$EXEC_PROCESS\" '/tmp/isabelle-jojo19273/bash_pid995594' \"exec bash '/tmp/isabelle-jojo19273/bash_script995594' > '/tmp/isabelle-jojo19273/bash_out995594' 2> '/tmp/isabelle-jojo19273/bash_err995594'\""], [/* 187 vars */]) = 0
[pid 19355] execve("/bin/sh", ["sh", "-c", "exec \"$EXEC_PROCESS\" '/tmp/isabelle-jojo19273/bash_pid995595' \"exec bash '/tmp/isabelle-jojo19273/bash_script995595' > '/tmp/isabelle-jojo19273/bash_out995595' 2> '/tmp/isabelle-jojo19273/bash_err995595'\""], [/* 187 vars */]) = 0
[pid 19358] execve("/bin/sh", ["sh", "-c", "exec \"$EXEC_PROCESS\" '/tmp/isabelle-jojo19273/bash_pid995596' \"exec bash '/tmp/isabelle-jojo19273/bash_script995596' > '/tmp/isabelle-jojo19273/bash_out995596' 2> '/tmp/isabelle-jojo19273/bash_err995596'\""], [/* 187 vars */]) = 0

So this points to these lines

val _ = File.write script_path script;
            val bash_script =
              "exec bash " ^
                File.shell_path script_path ^
                " > " ^ File.shell_path out_path ^
                " 2> " ^ File.shell_path err_path;
            val _ = getenv_strict "EXEC_PROCESS";
            val status =
              OS.Process.system
                ("exec \"$EXEC_PROCESS\" " ^ File.shell_path pid_path ^ " " ^ quote bash_script);

in src/Pure/Concurrent/bash.ML.

In my naivety, I would immediately scold such code: Really, one should
not use the system-family of calls that needlessly go via a shell, with
all the worrying quoting involved. But, judging from a very brief web
search (http://sml-family.org/Basis/os-process.html), it seems that SML
does not provide the safe form taking a list of arguments, instead of a
single string, that I have seen in most other languages. Is there
really no way of executing a program from SML that does not go via the
shell?

One way out would be to give up on lib/scripts/getsettings doing the
caching and configure everything upon every invocation. This way,
functions would not have to be exported.

What is your $SHELL actually?

The $SHELL is /bin/bash, but I don’t expect that to make a difference
here.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:34):

From: Joachim Breitner <breitner@kit.edu>
Hi,

I filed a report at http://bugs.debian.org/814358

but I don’t have much hopes. If dash’s upstream is right that POSIX
does not guarantee the survival of non-standard environment entries,
then we are relying on accidental behavior here, and will have a hard
time convincing the Debian maintainers to diverge from upstream.

Chances would be better if Software maintained by Debian would be
affected, but the odds of that are rather low (few people export bash
functions besides from thier login shell) and Isabelle is unfortunately
not distributed by Debian.

Anyways, I’m sure have sympathy for maintainers who, by improving
matters and refining their code to do things correctly and as
specified, happen to break the setup of users who were using it in an
obsolete or simply wrong way before.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:34):

From: Joachim Breitner <breitner@kit.edu>
Hi,

are we sure there is nothing we can do? In the current state,
Isabelle2016 will not be easily usable on future Debian and Ubuntu
releases without root access and rather low level modifications¹.

One way out might be to follow the advice that I procured here:
http://stackoverflow.com/a/35324612/946226
This way, you can avoid the call to /bin/sh in the problematic part
src/Pure/Concurrent/bash.ML. I’m not sure if this is available on
Windows.

Greetings,
Joachim

¹ The instructions could be something like:
Run
$ sudo dpkg-reconfigure dash
and answer the question with "no". Then ensure that the symbolic
link /bin/sh points to /bin/bash.
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:34):

From: Gerwin Klein <Gerwin.Klein@nicta.com.au>
We’ve just had the exact same problem yesterday on 2016-RC4 + Ubuntu. Not sure what the resolution was, let me find out.

Cheers,
Gerwin


The information in this e-mail may be confidential and subject to legal professional privilege and/or copyright. National ICT Australia Limited accepts no liability for any damage caused by this email or its attachments.

view this post on Zulip Email Gateway (Aug 22 2022 at 12:35):

From: Joachim Breitner <breitner@kit.edu>
Hi,

and it will stay this way, of course. The question is: When will the
first Ubuntu release with dash-0.5.8 be released. I’m not sure if
Ubuntu 16.04 (xenial, the next LTS release) will still see updates from
Debian, but Ubuntu 16.10 will very likely ship with dash-0.5.8.

The next Debian stable release, stretch, will be probably not release
until early 2017, but that is just a guess.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:35):

From: Joachim Breitner <breitner@kit.edu>
Hi,

I wonder how such moves from Debian to Ubuntu really work. Are there 
really imports from Debian-testing into a stable Ubuntu? There are plenty 
of websites about that, but it also requires time to study them. 
https://wiki.ubuntu.com/XenialXerus/ReleaseSchedule has DebianImportFreeze 
for 17-Feb-2016 -- whatever that means.

well spotted, thanks. It means
Prior to this date, new versions of packages will be automatically
imported from Debian where they have not been customized for Ubuntu
( https://wiki.ubuntu.com/DebianImportFreeze )
but dash is customized for Ubuntu, so it will not get merged
automatically. There is still a not too slim chance that it will be
merged manually. So the likelyhood of the next Ubuntu LTS release
shipping dash 0.5.8 is large enough to conservatively assume that to be
the case.

So patching Poly/ML could work, but anybody doing plain Unix programming 
with "system" functions (e.g. in perl) will fall again on his nose with a 
bad /bin/sh.

Only (in the context of the current issue) if this system programming
happens in a process called by isabelle and itself calls isabelle
again. Is that something we should worry about?

Have you considered, as a middle way, to keep the
ISABELLE_SETTINGS_PRESENT-logic in lib/scripts/getsettings that
determines the settings once, but within this if..then..else block only
set genuine environment variables, and after the else block, define all
the bash functions unexported. Some of them are (like spiltarray) are
defined unconditionally and not really settings anyways; only few
functions have definitions that depend on the environmens (namely tar,
jvmpath, isabelle_admin_build). You could define them conditional on
the variables set above. This would give you the desired scoping
behavior for genuine settings, while not relying on treacherous
features such as exporting code between processes.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:35):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>

An alternative is to modify Poly/ML to use /bin/bash or getenv(SHELL)
instead of hardwired /bin/sh -- that would be a typical Debianistic patch.

getenv(SHELL), falling back on /bin/sh by default is a very custom
behaviour and would make sense; nevertheless the user still has the
burden of setting SHELL explicitly in case of problems.

A technical alternative could also to implement a separate encoding of
shell functions into the environment using plain variable names. e.g.

function l {
ls -lAX --file-type --group-directories-first --color=auto "$@"
}

export FUN_L="$(LANG= type l | { read; cat; })"

and later

eval "$(echo $FUN_L)"

Don't know whether this is a good idea.

Cheers,
Florian
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:36):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>

Am Mittwoch, den 10.02.2016, 19:44 +0100 schrieb Makarius:

Maybe it is better to convince Debian maintainers to regain some
sanity?
The testing of dash 0.5.8 has happened only last week:
https://tracker.debian.org/news/744916

I filed a report at http://bugs.debian.org/814358

Quite recently (a few months ago?), the encoding of shell functions in
bash had been changed to a avoid security hole (»shell shock«). I do
not recall the details, but maybe this is also relevant here.

Florian

Thanks.

From what I've seen on the dash tracker and now a bit of Debian, this
all
looks like a very hostile environment. 20 years ago, the OpenSource
world was still idealistic and friendly.

What effectively happens here, is that people who take POSIX to the
letter (dash) fight other people who embrace-and-extend it (bash).

After reading the official (unpatched) sources of bash-4.3 from
https://www.gnu.org/software/bash I've found out that this is not a
problem of dash vs. bash, but of Debian against itself and its users.

In the original GNU bash, a shell function "foo" is stored in the
environment as "foo=() ...". This is well-formed in the sense of other
shells, and bash-0.5.8 will let it through as expected.

The Debian version of bash decorates the name with a suffix "%%" that
renders the name malformed, and thus it gets lost. See the patch
http://sourcesdev.debian.net/patches/bash/4.3-14/bash43-027.diff

This patch changes the encoding bash uses for exported functions to
avoid clashes with shell variables and to avoid depending only on an
environment variable's contents to determine whether or not to interpret
it as a shell function.

Maybe you can attach this information in your own words to the above
Debian tracker item. I did not manage to find access to it, and my
impression is that Debian is a closed society anyway.

Makarius
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:36):

From: Lars Hupel <hupel@in.tum.de>
I think your assessment is correct. See also

<https://en.wikipedia.org/wiki/Shellshock_%28software_bug%29>

Also, it's not just Debian. On my Arch system, I also see the suffix.
For example, isabelle env contains

BASH_FUNC_isabelle_scala%%=() ...

This is bash version 4.3.042. Arch is well-known for rarely patching
upstream sources.

Cheers
Lars

view this post on Zulip Email Gateway (Aug 22 2022 at 12:36):

From: Joachim Breitner <breitner@kit.edu>
Hi,

is it? Isn’t "SHELL" how a user configures his preferred _interactive_
shell? That might be ksh, or fish, or even ghci for all you know...
Likely not something you want to rely on.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:36):

From: Joachim Breitner <breitner@kit.edu>
Hi,

Thanks for digging up the patch, it is a useful bit of information.
I’ll raise it in the bug tracker in a little while.

Your impression on the openess is (IMHO) very wrong. Debian’s bug
tracker is so open that it you probably did not expect it to be so
simple: Just send a mail to 814358@bugs.debian.org (the link in the
title of the bug – admittedly not easily discoverable).

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:39):

From: Florian Haftmann <florian.haftmann@informatik.tu-muenchen.de>

An alternative is to modify Poly/ML to use /bin/bash or
getenv(SHELL)
instead of hardwired /bin/sh -- that would be a typical Debianistic
patch.

getenv(SHELL), falling back on /bin/sh by default is a very custom
behaviour and would make sense; nevertheless the user still has the
burden of setting SHELL explicitly in case of problems.

is it? Isn’t "SHELL" how a user configures his preferred _interactive_
shell? That might be ksh, or fish, or even ghci for all you know...
Likely not something you want to rely on.

E.g. make does interpret SHELL that way; but that seems indeed not to
be representative:

haftmann@scarlatti:/data$ SHELL=/usr/bin/tclsh python -c 'import os; os.system("ps -f")'
UID PID PPID C STIME TTY TIME CMD
haftmann 3286 3030 0 08:46 pts/0 00:00:00 bash
haftmann 5800 3286 0 09:33 pts/0 00:00:00 python -c import os; os.system("ps -f")
haftmann 5801 5800 0 09:33 pts/0 00:00:00 sh -c ps -f
haftmann 5802 5801 0 09:33 pts/0 00:00:00 ps -f

Cheers,
Florian
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:39):

From: Joachim Breitner <breitner@kit.edu>
Hi,

indeed I’m less sure about my interpretation, and the POSIX standard is
not very helpful. I tried to kick of a discussion over at
http://unix.stackexchange.com/questions/263773/is-the-shell-environment-variable-only-for-interactive-shells

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:40):

From: Joachim Breitner <breitner@kit.edu>
Hi Florian,

that is actually not the case:

/tmp/foo $ cat Makefile 
test:
echo bar
/tmp/foo $ SHELL=/bin/true make test
echo bar
bar
/tmp/foo $ make test SHELL=/bin/true
echo bar

it seems you confused make’s Makefile variable SHELL with the
environment variable SHELL.

Greetings,
Joachim
signature.asc

view this post on Zulip Email Gateway (Aug 22 2022 at 12:52):

From: Joachim Breitner <breitner@kit.edu>
Hi,

this is still 2015 (sorry for not helping with beta-testing the next
release at the moment).

I suddenly see this failure building a session with document:

/tmp $ isabelle mkroot -d test
/tmp $ cd test/
/tmp/test $ vim ROOT 

add only one theory, Foo

/tmp/test $ echo 'theory Foo imports Main begin end' > Foo.thy
/tmp/test $ isabelle build -D .
Running test ...

test FAILED
(see also /home/jojo/.isabelle/Isabelle2015/heaps/polyml-5.5.2_x86-linux/log/test)

val it = (): unit
Loading theory "Foo"

theory "Foo"

0.070s elapsed time, 0.140s cpu time, 0.000s GC time

"$ISABELLE_TOOL" document -o 'pdf' -n 'document' -t '' 'output/document' 2>&1
*** /opt/isabelle/Isabelle2015/bin/isabelle: Zeile 52: splitarray: Kommando nicht gefunden.
*** Unknown Isabelle tool: document
*** 
*** Failed to build document "/tmp/test/output/document.pdf"
Unfinished session(s): test
0:00:10 elapsed time, 0:00:16 cpu time, factor 1.60

Here is what I found so far:

splitarray is a bash function defined in lib/scripts/getsettings, which
exports a bunch of variables and functions to the environment, (using
"set -o allexport"). It also exports ISABELLE_SETTINGS_PRESENT and does
nothing if it finds that variable to be set.

It seems that when isabelle build calls isabelle document, it somehow
has ISABELLE_SETTINGS_PRESENT in the environment (thus making the line
source "$ISABELLE_HOME/lib/scripts/getsettings" || exit 2
in bin/isabelle do nothing), but not the bash function splitarray (thus
making the program fail).

It must be a recent change in my system, but my bash package has not
changed since September, and I can’t imagine what other package might
have caused this.

Has anyone else seen this before?

Greetings,
Joachim
signature.asc


Last updated: Apr 25 2024 at 08:20 UTC