Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Efficient Lookups in Lists


view this post on Zulip Email Gateway (Aug 18 2022 at 09:47):

From: Jan Olaf Blech <blech@informatik.uni-kl.de>
Hi,

We use Isabelle/HOL in a compiler runtime verification
environment. The verification process comprises "lookup
operations" in lists, function updates and sets, i.e.
proving lemmata of the following form:

lemma "a mem [...,a,...]"
lemma "f (...,a:= b,...) a = b"
lemma "a : {...,a,...}"

If the lists, function updates, and sets grow large, these
lookups take up almost all of the time of a verification
run. In our current version we instantiate Isabelle's
simplifier with some simple rules like

"x = a | x : A ==> x : insert a A"

to achieve time linear to the size of the underlying
structure the lookup is conducted on.
However a lookup still takes quite long: several minutes
for a list of a few thousand elements. For the
verification of some relatively small programs thousands
of lookups are needed in structures containing thousands
of elements.

Is anyone aware of a technique to do this faster?

What do you think of "bypassing" Isabelle's simplifier to
make such lookups faster? One problem is probably that we
would "spoil" the Isabelle implementation and the "trusted
computing base" would grow larger.

Regards

Jan Olaf Blech

view this post on Zulip Email Gateway (Aug 18 2022 at 09:47):

From: Tobias Nipkow <nipkow@in.tum.de>
There are several possible solutions.

  1. If your application is executable you may be able to generate ML code
    from it and run it completely independently of Isabelle. The speedup
    will be about 100. If it is almost executable, it may be worth making it
    fully executable (eg an implementation of finite sets as lists is
    already in the Library).

  2. You, as you say, bypass the the simplifier by writing an oracle, i.e.
    trusted piece of ML code that checks if an element is in a list/set.
    This is probably the quick an dirty solution (although you need to
    familiarize yourself with oracles).

  3. You explicitly use some efficient data structure rather than a list
    or set, eg a binary tree. This is completely safe, requires no low-level
    coding, but you will need to develop the efficient data structure (the
    theory BinarySearchTree in the AFP may provide much of what you need)
    and will need to use it rather than the abstract type of
    list/function/set (which should not complicate proofs much).

  4. Alternative 0 can also be used "locally" by compiling executable
    goals like "a mem [...]" into ML and executing them there. The
    development version of Isabelle contains a proof method "evaluation"
    which does just this. However, currently it can only be used to solve a
    goal outright and it is not integrated with other proof tools. More
    flexible versions are in preparation.

  5. You can use "reflection" to translate you list/function/set problem
    into some efficient data structure and execute it there. In contrast to
    alternative 2 you only do this reflection at specific points (when you
    need to solve expensive goals) but otherwise work with the usual
    mathematical types of lists/sets/etc. Efficiency is guaranteed as in 3.

Thanks for bringing this up. Efficient executability is an important
issue we are working on at the moment. Challenge problems are welcome.

Tobias

Jan Olaf Blech schrieb:

view this post on Zulip Email Gateway (Aug 18 2022 at 09:47):

From: Brian Huffman <brianh@csee.ogi.edu>
A good solution might be to write a simplification procedure (or simproc) to
solve these kinds of subgoals. A simproc is an ML function that takes a term
t as an argument, and optionally returns a theorem of the form t == t'. You
can learn more about them in the Isabelle reference manual.

For example, you could write some ML code that examines terms of the form x :
A, checking to see whether term x occurs explicitly in the term A. If so, it
could then build theorem of the form "x : A == True" using Eq_TrueI, insertI1
and insertI2 together with some simple theorem combinators. If term x is not
found in the term A, the simproc would just fail and do nothing (hopefully
this is not the common case).

Note that since a simproc builds an actual theorem that proves the equality,
you aren't adding anything to the trusted code base. You should still get a
significant speedup, because the ML code would be much faster than making
hundreds of recursive calls to the simplifier.

Hope this helps,

view this post on Zulip Email Gateway (Aug 18 2022 at 09:47):

From: Steven Obua <obua@in.tum.de>
There is one more solution possibility, you could use the compute oracle
in Pure/Tools/compute.ML.
It is already included in your Isabelle build (so it is somewhere
between the "trusted base" and the rest :-)),
you can invoke it by calling once

Compute.basic_make ....

and then each time you simplify a term by

Compute.rewrite ...

There are two major drawbacks to this method:

  1. You can only simplify terms which contain no type variables.
  2. You have to configure the rewrite by calling Compute.basic_make and
    give it ALL the theorems used for rewriting;
    these theorems must not contain type variables, either;

The no-type-variables limitation in 2) is not so bad if you know what
types you can possibly have in your terms; you can just instantiate the
polymorphic rewrite theorems with all types you need and give these to
Compute.basic_make.

I have used above technique successfully for fast matrix multiplication,
gaining a speed-up-factor comparable to compilation.
It is more trustworthy than compilation, though, because it builds
directly on Pure and makes no additional assumptions.

Steven


Last updated: Jan 04 2025 at 20:18 UTC