I've noticed that calculation proofs sometimes behave in weird ways, failing with cryptic errors (which btw aren't listed in the reference manual). Some examples:
The lemma below fails with "Vacuous calculation result":
theory Repro
imports Main "HOL-Data_Structures.Sorting"
begin
fun partition3 :: "('a::linorder) ⇒ 'a list ⇒ 'a list × 'a list × 'a list" where
"partition3 x xs = (
filter (λy. y < x ) xs,
filter (λy. y = x ) xs,
filter (λy. y > x ) xs)"
fun quicksort3 :: "('a::linorder) list ⇒ 'a list" where
"quicksort3 [] = []" |
"quicksort3 (x # xs) = (
let (ls, es, gs) = partition3 x xs in
quicksort3 ls @ x # es @ quicksort3 gs)"
lemma "sorted (quicksort3 xs)"
proof(induction xs rule: quicksort3.induct, simp)
case (2 x xs)
obtain ls es gs where parts: ‹(ls, es, gs) = partition3 x xs› by auto
have ls_smallest: ‹∀a ∈ set (quicksort3 ls). ∀b ∈ set (x # es @ quicksort3 gs). a ≤ b› sorry
have ‹sorted (quicksort3 (x # xs)) = sorted (quicksort3 ls @ x # es @ quicksort3 gs)›
using parts by simp
(* Comment out this line, and the errors disappear: *)
also have ‹… = sorted ((quicksort3 ls) @ (x # es @ quicksort3 gs))› by simp
also (* calculation: sorted (quicksort3 (x # xs)) = sorted (quicksort3 (x # xs)) *)
have ‹… = (sorted (quicksort3 ls) ∧ sorted (x # es @ quicksort3 gs)
∧ (∀a ∈ set (quicksort3 ls). ∀b ∈ set (x # es @ quicksort3 gs). a ≤ b))›
by (simp add: sorted_append)
also (* Error: Vacuous calculation result: *)
have ‹… = (sorted (quicksort3 ls) ∧ sorted (x # es @ quicksort3 gs))›
using ls_smallest by simp
show ?case sorry
qed
end
It seems that the line also have ‹… = sorted ((quicksort3 ls) @ (x # es @ quicksort3 gs))› breaks the calculation proof. This step is redundant, but Isabelle seems to accept redundant steps in other situations:
notepad begin
fix P1 P2 P3 P4 Q :: bool
assume assms: P1 ‹P1 = P2› ‹P2 = P3› ‹P3 = P4›
have ‹P1 = P2› using assms by simp
also have ‹… = P3› using assms by simp
also have ‹… = P3› by simp (* Redudant step *)
also have ‹… = P4› using assms by simp
end
In the next example, Isabelle accepts one calculation step, but not another identical one:
notepad begin
fix P1 P2 P3 P4 Q :: bool
assume assms: P1 ‹P1 = P2› ‹P2 = P3› ‹P3 = P4› Q
have ‹P1 ∧ Q = P2 ∧ Q› using assms by simp
also have ‹… = P3 ∧ Q› using assms by simp
also have ‹… = P4 ∧ Q› using assms by simp (* No matching trans rules for calculation *)
end
Are these known bugs that will be fixed in a future release? Or, if this behavior is expected, is there some other way to write transitive chain calculations that don't have these weird quirks? Namely, if I can prove:
have 1: ‹A = B› sorry
have 2: ‹B = C› sorry
have ‹A = C› using 1 2 by simp
Then I should be also able to write it like this (regardless of what A, B and C look like, for any chain length):
have ‹A = B› sorry
also have ‹… = C› sorry
finally have ‹A = C› by simp
In your first example, what essentially happens is Isabelle sees two statements of the form "A = B" and "B = B" and tries to apply a transitivity rule. There are two transitivity rules for equality, and you may either get "A = B" (so your step was redundant) or "A = A" (this is what happens in your first example; thus you get "(* calculation: sorted (quicksort3 (x # xs)) = sorted (quicksort3 (x # xs)) *)", a useless calculation rule that cannot be applied to the rest of the proof). So redundant steps may either work or break, because Isabelle tries to smartly select the correct transitivity rule and there's not enough information; however, it's fine to just delete them.
Concerning your next example with "P1 /\ Q = P2 /\ Q", it does not say what you think it says, because = binds tighter than logical operators (you can use <--> as an equality that binds less tightly). So your first step is "P1 & ((Q = P2) & Q)" (which is true because all of these are True), your second step is RHS of the first step equal to P3, and Q; so, (((Q = P2) & Q) = P3) & Q". That is also of course true because all of the variables are True. Then on the next line the transitive step ("also") fails because the first line wasn't an equality, and the second line also isn't an equality. What's a little counterintuitive is that it doesn't have anything to do with what comes after "also", the "also" itself fails.
Answering your final question, "simp" is not deterministic and has an upper bound on the complexity of combinations it tries, while a calculation chain can get in theory to any length without impacting the easibility of the proof. In your second snippet, btw, it is better to write finally have ‹A = C› . with a dot instead of by simp, because the result is just restating the calculation, there's no further simplification needed.
I would suggest not despairing and sticking with calculation proofs; they do take a bit of getting used to, but this is often the nicest way to perform e.g. assumption or goal rewrites by equivalencies, not to mention the ordered statements.
Thank you, this is very helpful! I still don't really get why the first example fails though. You say that there are two transitivity rules for equality, but in the print_trans_rules output I only see one transitivity rule ⟦?r = ?s; ?s = ?t⟧ ⟹ ?r = ?t and one symmetry rule ?s = ?t ⟹ ?t = ?s that have an equality in both the premise and the conclusion. I don't see how you can derive B = B from the first rule though. And in any case this is very counter-intuitive IMO, as it breaks the expected chain of calculations from left to right.
Oh, rather there are two possible orders of substitution. When using equality with calculation it performs a substitution in the other term, so rules are of the form ?s = ?t ==> ?P ?t ==> ?P ?s and `?P ?s ==> ?s = ?t ==> ?P ?t". Which means you either substitute B -> B in A = B, or substitute A <- B in B = B. At least that is my understanding.
This is starting to look like a bug. According to the manual, the second and further also should be equivalent to note calculation = trans [OF calculation this]. And this seems to work until you write down a redundant calculation step - then for some reason it breaks:
notepad begin
fix P Q R S T :: bool
assume ‹P = Q› ‹Q = R› ‹R = S› ‹S = T›
have ‹P = Q› using ‹P = Q ›by simp
print_facts
(* First `also`: calculation = this = (P = Q)*)
also have ‹… = R› using ‹Q = R› by simp
print_facts
thm trans[OF calculation this]
(* Next `also`: calculation = thm trans[OF calculation this] = (P = R)*)
also have ‹… = S› using ‹R = S› by simp
print_facts
thm trans[OF calculation this]
(* Next `also`: calculation = thm trans[OF calculation this] = (P = S)*)
also have ‹… = S› using ‹R = S› by simp
print_facts
thm trans[OF calculation this]
(* Next `also`: calculation should equal: thm trans[OF calculation this] = (P = S), but actually equals: P = P *)
also have ‹… = T› using ‹S = T› by simp
print_facts
thm trans[OF calculation this]
finally have ‹P = T› .
end
If I apply the rule trans explicitly, I get a different error:
also(trans) have ‹… = T› using ‹S = T› by simp (* Vacuous calculation result: P = S *)
print_facts
thm trans[OF calculation this]
It seems you were right, and Isabelle picked a different trans rule that resulted in calculation := P = P (though I still have no idea how to debug this and find out which one exactly). Perhaps it initially tried to apply trans, saw it fail, and that's why it went with something else. But the "vacuous calculation" error still doesn't explain what's wrong.
FWIW in here:
notepad begin
fix P1 P2 P3 P4 Q :: bool
assume assms: P1 ‹P1 = P2› ‹P2 = P3› ‹P3 = P4›
have ‹P1 = P2› using assms by simp
also have ‹… = P3› using assms by simp
also have ‹… = P3› by simp (* Redudant step *)
also have ‹… = P4› using assms by simp
end
you have the same behaviour, you're just one line away from triggering the error :)
looks like also has a wider set of rules than just trans, indeed - the manual does say the translation it provides are approximate. I however really don't get why it's important what behvaiour we get here... garbage in, garbage out, just don't write P=P in your proofs :)
The vacuous calculation error just means that it reproduced a result that was already in print_facts somewhere. That's usually not what you want, so it's an error.
It's not a critical issue - it just creates friction. Maybe I just want to parenthesize certain terms for clarity - it shouldn't break the program IMO.
There are a number of issues with the calculation feature in Isar, in my opinion. It is a very versatile tool, but also quite brittle as a result of that. In any case, changing what it does at this point is basically impossible because it would break lots of existing proofs.
Last updated: Mar 04 2026 at 20:34 UTC