Inspired by https://martin.kleppmann.com/2019/09/14/strange-loop-isabelle.html and https://www.microsoft.com/en-us/research/publication/tla-proofs/, I made an attempt to prove a Peterson mutex.

It did not go too far: https://gist.github.com/suhr/c13d89be9110b6c3ac8a0ffdfd422735

I'm really impressed how far you can go by just applying `sledgehammer`

everywhere. But you can only go so far doing so.

My question is: how do you approach proving problems like this? For tiny problems proving is easy: you can always apply substitutions and rules directly if you need to. But in the case of bigger problems, definitions are rather large to substitute them into every lemma.

I would also greatly appreciate any help on this particular problem.

You could try to find statements that help you on your way to your final proof, eg:

```
lemma "show_this_statement"
proof (induct t)
case base -- this is you base case of your data type, eg "case 0" for t::nat
show ?thesis by ...
next
case step --this is your induction step, eg "case (Suc t)" for t::nat
have "some_more_statements" by ...
then show ?thesis by ...
qed
```

You can have as many proof-steps as you like in Isar-style proofs.

Does this help? (I don't know what exactly you want to prove)

The lemma I actually want to prove is the following

```
lemma flag_inv:
assumes sp: "spec m"
and dom: "i ∈ {0,1}"
shows "pc (m t) i ∈ {a2, a3a, a3b, cs} ⟶ flag (m t) i"
```

It is an invariant about all possible behaviors of the specified system. What is basically says is that if a process `i`

is in one of the given states, then the flag for a given process is set.

System behaviors are modeled as functions `nat ⇒ state`

, returning the system state in a given moment of time. Predicate `init`

defines the

state at the moment of time equal to zero, while relation `nxt`

defines how the state `m t`

is related to state `m (Suc t)`

.

Lemmas `same_flag`

, `flag_a2`

and `flag_a3a`

are actually auxiliary lemmas for proving `flag_inv`

. The problem is, I got stuck at `flag_a3a`

.

Proofs generated by `sledgehammer`

for `same_flag`

and `flag_a2`

invoke an SMT solver, even though the definitions are mostly using just logic and equality. I suspect that my way to define things might make things harder than they should be, but I don't know how to diagnose it.

Most of the statements the smt solver and metis use are all your definitions (`*_def`

). Without these definitions, the system cannot know what to reason about. What happens, if you hand in all those definitions by hand?

Either by `using ...`

or you could try what happens if you say something like `apply (auto simp add: ...)`

I figured out two lemmas, that seem to be quite useful. But now I have a different problem:

```
lemma ninv_next:
assumes sp: "spec m"
and refl: "∀s. P s s"
and st: "∀s s'. ∀i ∈ {0,1}. (nall i s s' ⟶ P s s')"
shows "P (m t) (m (Suc t))"
using sp refl st spec_def nxt_def by metis
lemma same_flag:
assumes sp: "spec m"
and dom: "i ∈ {0,1}"
shows "pc (m t) i ∈ {a1,a4} ∨ flag (m t) i = flag(m (Suc t)) i" (is ?thesis)
proof -
let "?P" = "λj s s'. pc s j ∈ {a1,a4} ∨ flag s j = flag s' j"
have refl: "∀s. ?P i s s" by auto
have st: "∀s s'. ∀j ∈ {0,1}. (nall j s s' ⟶ ?P j s s')"
using dom nall_def n0_def n1_def n2_def n3a_def n3b_def n4_def ncs_def
by auto
show ?thesis using sp dom refl st ninv_next sorry
qed
```

Even though `?P i`

satisfies both `refl`

and `st`

, for some reason I can't derive the conclusion that `?P i (m t) (m (Suc t))`

.

In st, it should be `?P i`

instead of `?P j`

right?

Otherwise, from the distance, that looks like finding the right instantiation for `P`

fails. Try `ninv_next[of _ "?P i"]`

instead of `ninv_next`

. At least that could help you going forward. in finding the issue.

Yes, this indeed helps. Thanks!

Is there a way to abbreviate `using nall_def n0_def n1_def n2_def n3a_def n3b_def n4_def ncs_def`

into something like `using defs`

?

```
lemmas defs = nall_def n0_def n1_def n2_def n3a_def n3b_def n4_def ncs_def
```

but it is not clear if this is really a good idea

suhr has marked this topic as resolved.

Last updated: Sep 25 2022 at 23:25 UTC