Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] Looking for advice regarding typed abbreviatio...


view this post on Zulip Email Gateway (Aug 18 2022 at 16:15):

From: Rafal Kolanski <rafalk@cse.unsw.edu.au>
Dear Isabelle Gurus,

I'm trying to put together a representation of C structures using
records, coupled with a memory model that involves pointers. There is an
operation I'm trying to express elegantly, however, that is giving me
quite a bit of trouble: dereference-and-get-address-of-field.

For example take these two C structures:
struct A {
char z;
};
struct B {
int x;
struct A y;
};
These have correspondingly obvious isabelle record representations (with
names prefixed to prevent collisions).

If we have some variable "sa" holding a pointer to a struct A in memory,
then the dereference-and-get-address-of-field operator (denoted &->)
should look something like this: sa&->''x''
representing the C code: &(sa->x)
This will add the structure offset of field "x" to the address contained
in the pointer "sa" and retype it as a pointer to an int.

Doing the one-step case, as with "x" is possible using overloaded
functions with a manual constraint on the return type, but what I really
want is chaining, for example:
sa&->y&->z should give a char pointer of z within struct A within
struct B.

This is where I'm completely stuck. In my case, a "pointer" is just an
address with a phantom type variable representing the type of value it
points to. I have a function per record that given a string name of the
field, gives its offset within the corresponding structure. I have a
function that given a variable pointing to a structure of some type,
returns the relevant lookup function. The problem is the return value,
since it's type varies based on which field name one requests.

I am aware that this is probably not the neatest way to go about it.
However, I really want to have a nice-looking textual representation for
the user, and ugly details under the carpet if necessary to make that
possible.

The generic problem boils down to (simplified pseudo-Isabelle):

consts (overloaded)
get_lookup_table_for :: "'a itself => string => 'b"
consts (overloaded)
-- "lookup table for structure A"
lookup_A :: "'a ptr => string => 'b ptr"
-- "lookup table for structure B"
lookup_B :: "'a ptr => string => 'b ptr"
then define:
"get_lookup_table_for TYPE(A) = lookup_A"
"get_lookup_table_for TYPE(B) = lookup_B"
"lookup_A p ''y'' = (ptr_add p 4)::(B ptr)"
"lookup_B p ''z'' = (ptr_add p 0)::(char ptr)"

to finally get a function:
"lookup p name = get_lookup_table_for (TYPE(p)) p name"
which can be chained like so:
lookup (lookup sa ''y'') ''z''
and that should give something, which will either have the type "char
ptr" inferred, or, if constrained to type "char ptr" can still be
reasoned about.

I wish to avoid having to constrain the type returned by almost every
single structure field lookup when annotating code.

As such, I welcome any method which will work, no matter how clunky.
If it involves ML hackery, no problem.

Would any of you have some kind of recommendation for me? I saw some
posts a while back about doing syntax translations pre and post type
checking, but I'm unsure how to apply them to this situation seeing as I
need the type of "sa" to be able to infer the type of the lookup result.

Sincerely hopeful,

Rafal Kolanski.

view this post on Zulip Email Gateway (Aug 18 2022 at 16:15):

From: Andreas Lochbihler <andreas.lochbihler@kit.edu>
Dear Rafael,

struct A {
char z;
};
struct B {
int x;
struct A y;
};

[...]

The generic problem boils down to (simplified pseudo-Isabelle):

consts (overloaded)
get_lookup_table_for :: "'a itself => string => 'b"
consts (overloaded)
-- "lookup table for structure A"
lookup_A :: "'a ptr => string => 'b ptr"
-- "lookup table for structure B"
lookup_B :: "'a ptr => string => 'b ptr"
then define:
"get_lookup_table_for TYPE(A) = lookup_A"
"get_lookup_table_for TYPE(B) = lookup_B"
"lookup_A p ''y'' = (ptr_add p 4)::(B ptr)"
"lookup_B p ''z'' = (ptr_add p 0)::(char ptr)"

to finally get a function:
"lookup p name = get_lookup_table_for (TYPE(p)) p name"
which can be chained like so:
lookup (lookup sa ''y'') ''z''
and that should give something, which will either have the type "char
ptr" inferred, or, if constrained to type "char ptr" can still be
reasoned about.

Would any of you have some kind of recommendation for me? I saw some
posts a while back about doing syntax translations pre and post type
checking, but I'm unsure how to apply them to this situation seeing as I
need the type of "sa" to be able to infer the type of the lookup result.
I think syntax translations during type checking might solve your problem.
However, I feel that you need a more general framework than the one by Christian
Sternagel. In his approach, the overloaded constant is replaced during type
checking with another constant whose type is more specific. AFAIK this approach
works as follows when it encounters an overloaded constant c during parsing:

1st) Determine the type constraints on c (from surrounding applications/type
annotations etc.)
2nd) Check whether these constraints unify with one of the registered
translations. If so, replace c appropriately.
3rd) After type checking, raise an error if c still occurs literally in the term.

Your problem looks a bit more difficult because the return type of the
overloaded lookup constant not only depends on the arguments' types, but also on
the value. For exmple, suppose that the type of sa is known as
"B ptr". What should be the type of "lookup sa x" in
"x = ''y'' | x = ''z'' --> lookup sa x = ..."?

However, if you only need to succeed in type checking lookup terms when the
second argument is given as a literal string, a similar approach might work.
Instead of looking at the type constraints, you would have to match for term
pattern

Const ("lookup", _) $
(_, <ML representation of "B ptr">) $
(Const ("Cons", _) $
(Const ("String.char.Char", _) $ ... ))

and replace this with a specialized constant
lookup_B_y defined as

lookup_B_y p = (ptr_add p 4)::(A ptr)"

If this solves your problem, there might be an even simpler solution: Make the
field name parameter part of the lookup function name. Every struct declaration
gets translated to a list of dereference functions. In your example:

definition lookup_A_z :: "A ptr => char ptr" ("_&->''''z''''")
where "lookup_A_z p == ptr_add p 0"

definition lookup_B_x :: "B ptr => int ptr" ("_&->''''x''''")
where "lookup_B_x p == ptr_add p 0"

definition lookup_B_y :: "B ptr => A ptr" ("_&->''''y''''")
where "lookup_B_y p == ptr_add p 4"

Then, "(p&->''y'')&->''z''" should type check and have type "char ptr", although
I have not tested this yet.
Note that if you have multiple structs with the same field name, say

struct C { char x; }

you will get another definition

definition lookup_C_x :: "C ptr => char ptr" ("_&->''''x''''")
where "lookup_C_x == ptr_add p 0"

Now, "p&->''x''" will produce two parse trees, but in most cases, only one will
be type correct. If this happens multiple times, parsing becomes increadibly
slow because the number of parse trees grows exponentially. In that case, you
can have an overloaded constant for every field name and use Christian
Sternagel's solution. You only need to register translations for "_&->''x''" and
"_&->''y''" etc.

Hope this helps,
Andreas

view this post on Zulip Email Gateway (Aug 18 2022 at 16:15):

From: Andreas Lochbihler <andreas.lochbihler@kit.edu>
Dear Rafal,

The approach is the same as the adhoc overloading facility by Christian
Sternagel (in the current development version in
~~/src/Tools/Adhoc_Overloading). This mechanism adds the translation via
Syntax.add_term_check to the type checking phase of terms (and similarly the
inverse translation to the printing phase). For more information on this, see
the thread on the mailing list starting with this post:
https://lists.cam.ac.uk/mailman/htdig/cl-isabelle-users/2010-May/msg00055.html

At this very point in the type checking phase, you'd have to replace the generic
constant "lookup" with specialized versions like "lookup_B_y". To determine the
right specialized version, you will need the type of the first parameter and the
value of the second parameter of lookup. Determining the value of the second
parameter could be done with pattern matching. I don't know exactly what the
syntax tree looks like at this stage, but you should be able to find that out.

Sincerely,
Andreas


Last updated: Apr 18 2024 at 04:17 UTC