[isabelle] Proving Contests · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Proving Contests

Email Gateway (Aug 22 2022 at 18:42):

From: Gidon Ernst <gidon.ernst@unimelb.edu.au>
Hi José,

indeed, judging the solutions in those competitions was always a tremendous effort.
The reason why at these events humans had to be involved in the judging was the diversity of tools and the fact that specifications were given in natural language.

Your idea is quite different and very intriguing! I hope it will be realized eventually :)

Best,
Gidon

Email Gateway (Aug 22 2022 at 18:42):

From: Dominique Unruh <unruh@ut.ee>
Instead of judging by proof by runtime (which does not seem like a very
useful criterion) or by beauty (which needs big human effort), an option
would be percentage of completion. By giving out several problems of
differing hardness one could make sure that basically no-one will get 100%
of the solutions. (And even for a given problem one might have different
options. E.g., prove this for integers vs. for any ring. Or prove this with
the following additional simplifying assumption vs. prove it without
assumption, etc.)

Then checking can be automated. (And to look for cheaters, it will not be
enough to manually scan the source for for "sorry". Instead, the only
reliable way as far as I know is to extract the proof term and replay it in
a separate instance of Isabelle. But the infrastructure for that is mostly
there, so it's mainly a question of filling in some "plumbing".)

Best wishes,
Dominique.

Email Gateway (Aug 22 2022 at 18:42):

From: Simon Wimmer <wimmersimon@gmail.com>
Dear all,

currently we use the following scheme. Each sub-task is worth a certain
number of points. We first rank by highest total score, and then lowest sum
of submission time stamps to break the ties.

Simon

Email Gateway (Aug 22 2022 at 18:43):

From: mailing-list anonymous <mailing.list.anonymous@gmail.com>
Dear All,

I would like to make several comments with regard to the proposal by José
Manuel Rodriguez Caballero (as someone who participated in olympiads while
studying at secondary school).

Of course, the olympiads in informatics can hardly be described as
olympiads that test the participants on their knowledge of the syntax of a
particular programming language. The main challenge is the design of
algorithms and not their implementation. The participants are allowed to
select a programming language. While the choice is relatively narrow (C++,
Pascal and Java), they seem to be working on expanding this choice (until
recently the participants were allowed to use only C, C++ and Pascal).

While Isabelle is one of the most popular proof assistants, it is not the
only available proof assistant. I believe that Coq is equivalently popular
and has nearly identical functionality (for example, there are more
questions about Coq on Stack Overflow than about Isabelle). Of course,
there are other well-known systems for proof verification.

Personally, I prefer Isabelle and I would like to see it succeed. However,
if one was to organise an international olympiad that allows its
participants to use Isabelle, in my opinion, it would need to be an
olympiad in the proof verification/logic/formal methods and not an olympiad
"in Isabelle". From the start, it should allow the participants to use
either Isabelle or Coq. Also, of course, the tasks should reflect that the
participants are being tested on their knowledge of the methods that are
commonly used for the formalisation of mathematics or formal proof of
computer programs and not the knowledge of the 'advanced features' of a
given proof assistant. Thus, the organisers would have to carefully select
the subsets of the standard libraries that one is allowed to use during the
olympiad.

Also, personally, in my view, participation over the internet should not be
allowed, because it favours cheating. Being able to claim that one has a
prize in an olympiad in formal verification would provide an enormous
competitive advantage when one is applying to universities (and beyond).
Thus, cheating must be eliminated at all costs. It may be best (for the
sake of humanity) not to organise such an olympiad at all, rather than
allowing the participation over the internet. Also, of course, there should
be different olympiads for secondary school students and undergraduate
students.

Thank you

Email Gateway (Aug 22 2022 at 18:43):

From: Simon Wimmer <wimmersimon@gmail.com>
Dear all,

for our system, we foresee to also get other theorem provers on board.
For each theorem prover, one needs to find a volunteer who is willing to
implement a prover backend,
and who is willing to formulate the tasks for the theorem prover.

As for most programming competitions, our main goal is to spark motivation.
For this purpose, in my opinion, the
only way is to host the competition online, as only then everyone has the
possibility to participate.

I agree that if you want to hand out an important prize, you will always
need to make sure that
the person you give the prize to will be the one who actually solved the
problems. However,
I believe this is already ten steps ahead of where we stand now.

Simon

Email Gateway (Aug 22 2022 at 18:43):

From: José Manuel Rodriguez Caballero <josephcmac@gmail.com>
Just to point out that Contests using Isabelle do not need to be restricted
to mathematics: they can be about any subject, e.g, chemistry, physics,
biology, and even philosophy. The only requirement is to use axioms. The
fact that the solutions can be automatically checked allow massive
participation from several countries via email. Internet is a new paradigm
for Contests.

A similar massive educational-scientific project are the games about
quantum mechanics, in order to develop the quantum intuition in people
which are not quantum physicists :
https://www.youtube.com/watch?v=vkVnnN0MjIE

José M.

Email Gateway (Aug 22 2022 at 18:49):

From: José Manuel Rodriguez Caballero <josephcmac@gmail.com>
Hi Gidon,

Interesting Proving Contests in your links, thank you ! Nevertheless, I
took a look at the Geographical distribution of the participants (page 3 in
the following link):
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.362.2877&rep=rep1&type=pdf

I think that it is important to open a competition for millions of
participants at the same time, including participants from less
economically developed countries. This is the reason why I suggested a
software in order to do the ordering of the solutions in virtue of the
running time: human cannot handle such a huge data.

In the same way that alphabetization was extended from an elite to the
whole world, proof assistants are a second alphabetization that the world
need in order to get update: https://www.youtube.com/watch?v=7Pq-S557XQU

Kind Regards,
José Manuel Rodriguez Caballero

El mar., 6 nov. 2018 a las 21:26, Gidon Ernst (<gidon.ernst@unimelb.edu.au>)
escribió:

Email Gateway (Aug 22 2022 at 18:49):

From: Peter Lammich <lammich@in.tum.de>
Hi Jose

Considering ranking of the solutions by running time: I am not convinced that
this is a good measure for ranking formal proofs ... it's indeed a practical
one that is simple to automate, but it surely does not discriminate good from
bad proofs ... I even fear that optimizing a proof for running time may make
it harder to read and less elegant. Are there other criteria to rank proofs
that can be checked automatically? Size of the thy file isn't a good one, too,
I fear.

Peter

Email Gateway (Aug 22 2022 at 18:49):

From: Gidon Ernst <gidon.ernst@unimelb.edu.au>
Hi all,
two similar competitions come to mind:

1st and 2nd VSComp, see
The 1st verified software competition: Experience report, FM2011
The 2nd verified software competition: Experience report, COMPARE 2012

VerifyThis (yearly at ETAPS)
http://www.pm.inf.ethz.ch/research/verifythis.html
This year, the Isabelle Team from TUM took the gold medal :)
Huisman, Klebanov, Monahan: On the organisation of program verification competitions, COMPARE 2012
the yearly competition reports, and the STTT 17(6) special issue: https://link.springer.com/journal/10009/17/6/page/1

Hope you find this information useful :)

Best,
Gidon

Nice project. My idea of an International Olympiad in Isabelle are a little
bit different. Indeed, I can summarize my vision as follows:

1) There will be some publicity about the International Olympiad some
months before its date. The publicity should be mainly addressed to
undergraduate and high school students. People interested in the Olympiad
should summit their email addresses in a website. People working with
Isabelle in an official research project, e.g., a Master, a PhD, a
professor or a professional programmer, are not allowed to apply in order
to guarantee more space for beginners.

2) The day of the Olympiad, a thy-file with a theorem followed by sorry
will be sent to each one of the email addresses.

3) People can submit their solutions until 6 hours after their received the
email with the problem (5 hours to solve the problem and 1 hour to make a
break, I guess). Solutions sent after the deadline will be automatically
ignored.

4) An automatic system (software) will discriminate the invalid solutions,
e.g., using sorry somewhere, from the valid solutions. After that, it will
make an ordering according to the running time of each valid solution.

5) Finally, an email with the ranking of his/her solution will be
automatically sent to each participant. The participants having ranking 1,
2, 3 will receive a prize.

When I think about such an international project, I am mainly concerned
with people living in economically less developed countries. The idea is
that a person from such countries, having the dream of winning an
International Olympiad in Isabelle will have a motivation to improve
himself/herself in this field. Of course, classification by countries
should be avoided and people from highly developed countries should be
allowed to participate as well.

This is just a dream, I have not enough free time to do something about it.
My inspiration are the IOI: https://ioinformatics.org/

Kind Regards,
José M.

Last updated: Feb 01 2025 at 20:19 UTC