[isabelle] [Agda] Prior work on proof assistant performan... · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] [Agda] Prior work on proof assistant performan...

Email Gateway (Aug 23 2022 at 09:08):

From: Jason Gross <jasongross9@gmail.com>
Thanks! This is very interesting. (And thanks all for sources.)

I'm still digesting this, but have one question already:

Clearly, we should use NbE/environment machines for evaluation, and
implement conversion checking in the semantic domain.

Does this mean that we can't be agnostic about whether or not we're
supporting HoTT, because we either have to pick set theory/irrelevant
equality proofs or cubical (or some other model of HoTT) when normalizing
proofs of equality?

I'd be very interested in your findings about proof assistant performance.

I'd be happy to share once I have my thesis in a sufficiently ready state!
Broadly, the message of my thesis is that performance of
(dependently-typed) proof assistants / interactive theorem provers is an
interesting area of research sorely in need of systematic study. (My
thesis will also include some of the research work I've done, which can be
rendered as "how to work around some of the asymptotics in Coq without
needing to change the system", but I think that's a bit less broadly
interesting.)

-Jason

On Fri, May 8, 2020 at 4:55 AM András Kovács <kovacsandras@inf.elte.hu>
wrote:

Hi Jason,

You may be interested in my github repos: smalltt
<https://github.com/AndrasKovacs/smalltt>, normalization-bench
<https://github.com/AndrasKovacs/normalization-bench>. I haven't yet
updated the smalltt repo, but there's a simplified implementation
<https://gist.github.com/AndrasKovacs/a0e0938113b193d6b9c1c0620d853784>
of its evaluator, which seems to have roughly the same performance but
which is much simpler to implement.

The basic idea is that in elaboration there are two primary computational
tasks, one is conversion checking and the other is generating solutions for
metavariables. Clearly, we should use NbE/environment machines for
evaluation, and implement conversion checking in the semantic domain.
However, when we want to generate meta solutions, we need to compute
syntactic terms, and vanilla NbE domain only supports quote/readback to
normal forms. Normal forms are way too big and terrible for this purpose.
Hence, we extend vanilla NbE domain with lazy non-deterministic choice
between two or more evaluation strategies. In the simplest case, the point
of divergence is whether we unfold some class of definitions or not. Then,
the conversion checking algorithm can choose to take the full-unfolding
branch, and the quoting operation can choose to take the non-unfolding
branch. At the same time, we have a great deal of shared computation
between the two branches; we avoid recomputing many things if we choose to
look at both branches.

I believe that a feature like this is absolutely necessary for robust
performance. Otherwise, we choke on bad asymptotics, which is surprisingly
common in dependent settings. In Agda and Coq, even something as trivial as
elaborating a length-indexed vector expression has quadratic complexity in
the length of the vector.

It is also extremely important to stick to the spirit of Coquand's
semantic checking algorithm as much as possible. In summary: core syntax
should support no expensive computation: no substitution, shifting,
renaming, or other ad-hoc term massaging. Core syntax should be viewed as
immutable machine code, which supports evaluation into various semantic
domains, from which sometimes we can read syntax back; this also leaves it
open to swap out the representation of core syntax to efficient
alternatives such as bytecode or machine code.

Only after we get the above two basic points right, can we start to think
about more specific and esoteric optimizations. I am skeptical of proposed
solutions which do not include these. Hash consing has been brought up many
times, but it is very unsatisfying compared to non-deterministic NbE,
because of its large constant costs, implementation complexity, and the
failure to handle sharing loss from beta-redexes in any meaningful way
(which is the most important source of sharing loss!). I am also skeptical
of exotic evaluators such as interaction nets and optimal beta reducers;
there is a good reason that all modern functional languages run on
environment machines instead of interaction nets.

If we want to support type classes, then tabled instance resolution
<https://arxiv.org/pdf/2001.04301.pdf> is also a must, otherwise we are
again smothered by bad asymptotics even in modestly complex class
hierarchies. This can be viewed as a specific instance of hash-consing (or
rather "memoization"), so while I think ubiquitous hash-consing is bad,
some focused usage can do good.

Injectivity analysis is another thing which I believe has large potential
impact. By this I mean checking whether functions are injective up to
definitional equality, which is decidable, and can be used to more
precisely optimize unfolding in conversion checking.

I'd be very interested in your findings about proof assistant performance.
This has been a topic that I've been working on on-and-off for several
years. I've recently started to implement a system which I intend to be
eventually "production strength" and also as fast as possible, and
naturally I want to incorporate existing performance know-how.

Jason Gross <jasongross9@gmail.com> ezt írta (időpont: 2020. máj. 8., P,
0:20):

I'm in the process of writing my thesis on proof assistant performance
bottlenecks (with a focus on Coq). I'm having some trouble finding prior
work on performance engineering and/or benchmarking of proof assistants
(other than the paper on the Lean tactic language (
https://leanprover.github.io/papers/tactic.pdf)), and figured I'd reach
out to these lists.

Does anyone have references for prior work on performance analysis or
engineering of proof assistants or dependently typed languages? (Failing
that, I'd also be interested in papers about compile-time performance of
compilers.)

Thanks,
Jason

Agda mailing list
Agda@lists.chalmers.se
https://lists.chalmers.se/mailman/listinfo/agda

Last updated: Oct 08 2025 at 20:22 UTC