[isabelle] Low level simplifier failure · Mirror: Isabelle Users Mailing List

Stream: Mirror: Isabelle Users Mailing List

Topic: [isabelle] Low level simplifier failure

Email Gateway (Jan 21 2026 at 11:39):

From: Lawrence Paulson <lp15@cam.ac.uk>

Consider the following theory:

theory Scratch
imports Main

begin

datatype a = A unit unit and b = B ‹unit ⇒ a›

declare [[simp_trace]]
lemma "x = B (A ())"
apply simp

This produces the following output:

[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML”)

The exception goes away if we eliminate the mutually recursive datatype in favour of separate declarations, but the original of this issue is necessarily mutually recursive. Perhaps there are simpprocs involved with datatypes?

Larry

Email Gateway (Jan 21 2026 at 12:37):

From: Dmitriy Traytel <cl-isabelle-users@lists.cam.ac.uk>

This seems to be indeed coming from Manuel's datatype_no_proper_subterm simproc.

using [[simproc del: datatype_no_proper_subterm]]

removes the low-level exception.

Dmitriy

On 21 Jan 2026, at 12.39, Lawrence Paulson <lp15@cam.ac.uk> wrote:

Consider the following theory:

theory Scratch
imports Main

begin

datatype a = A unit unit and b = B ‹unit ⇒ a›

declare [[simp_trace]]
lemma "x = B (A ())"
apply simp

This produces the following output:

[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML”)

The exception goes away if we eliminate the mutually recursive datatype in favour of separate declarations, but the original of this issue is necessarily mutually recursive. Perhaps there are simpprocs involved with datatypes?

Larry

Email Gateway (Jan 21 2026 at 15:05):

From: Manuel Eberl <manuel@pruvisto.org>

Okay, I see what the problem is. The simproc breaks in the presence of
partially applied constructors. That kind of issue can only arise in
datatypes that have constructors whose arguments are functions returing
datatypes. Such datatypes are out of scope for the simproc anyway, since
they only have a "dummy" size function that always returns 0. If the
simproc didn't break at the "~~" (raising an UnequalLengths), it would
break later on at the "rule_by_tactic" since it would have to show "0 ≠ 0".

I think the following two steps would make sense to make this more robust:

– catch the Unequal_Lengths exception and just abort the simproc

– make the "rule_by_tactic" fail more gracefully in case the "size"
function does not work as intended

It might be even better to have a check at the beginning to make sure
that the datatype expression is in the "supported fragment" to begin
with. But I'm not sure how one would do that. Perhaps it was a bad idea
to create this simproc in the first place. But it does do something
useful, that users might reasonably expect Isabelle's automation to be
able to do. Namely automatically show that something like "xs = Cons a
(Cons b xs)" is false.

Perhaps some of the datatype experts could comment on whether there is a
robust way to check if a datatype has a "proper" size function and not
just a "dummy" one.

Manuel

On 21/01/2026 13:36, Dmitriy Traytel (via cl-isabelle-users Mailing
List) wrote:

This seems to be indeed coming from Manuel's datatype_no_proper_subterm simproc.

using [[simproc del: datatype_no_proper_subterm]]

removes the low-level exception.

Dmitriy

On 21 Jan 2026, at 12.39, Lawrence Paulson <lp15@cam.ac.uk> wrote:

Consider the following theory:

theory Scratch
imports Main

begin

datatype a = A unit unit and b = B ‹unit ⇒ a›

declare [[simp_trace]]
lemma "x = B (A ())"
apply simp

This produces the following output:

[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML”)

The exception goes away if we eliminate the mutually recursive datatype in favour of separate declarations, but the original of this issue is necessarily mutually recursive. Perhaps there are simpprocs involved with datatypes?

Larry

Email Gateway (Jan 21 2026 at 15:48):

From: Manuel Eberl <eberlm@in.tum.de>

Oops. I had a feeling that might be the culprit. I'll investigate.

On 21/01/2026 13:36, Dmitriy Traytel wrote:

This seems to be indeed coming from Manuel's datatype_no_proper_subterm simproc.

using [[simproc del: datatype_no_proper_subterm]]

removes the low-level exception.

Dmitriy

On 21 Jan 2026, at 12.39, Lawrence Paulson <lp15@cam.ac.uk> wrote:

Consider the following theory:

theory Scratch
imports Main

begin

datatype a = A unit unit and b = B ‹unit ⇒ a›

declare [[simp_trace]]
lemma "x = B (A ())"
apply simp

This produces the following output:

[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML”)

The exception goes away if we eliminate the mutually recursive datatype in favour of separate declarations, but the original of this issue is necessarily mutually recursive. Perhaps there are simpprocs involved with datatypes?

Larry

Email Gateway (Jan 21 2026 at 17:32):

From: Dmitriy Traytel <cl-isabelle-users@lists.cam.ac.uk>

Hi Manuel,

Something strange happened with your follow up response, which didn’t arrive in my inbox (nor in my spam filter). I do see it on Zulip’s mailing list mirror.

Okay, I see what the problem is. The simproc breaks in the presence of partially applied constructors. That kind of issue can only arise in datatypes that have constructors whose arguments are functions returing datatypes. Such datatypes are out of scope for the simproc anyway, since they only have a "dummy" size function that always returns 0. If the simproc didn't break at the "~~" (raising an UnequalLengths), it would break later on at the "rule_by_tactic" since it would have to show "0 ≠ 0".

I think the following two steps would make sense to make this more robust:

– catch the Unequal_Lengths exception and just abort the simproc

– make the "rule_by_tactic" fail more gracefully in case the "size" function does not work as intended

It might be even better to have a check at the beginning to make sure that the datatype expression is in the "supported fragment" to begin with. But I'm not sure how one would do that. Perhaps it was a bad idea to create this simproc in the first place. But it does do something useful, that users might reasonably expect Isabelle's automation to be able to do. Namely automatically show that something like "xs = Cons a (Cons b xs)" is false.

Perhaps some of the datatype experts could comment on whether there is a robust way to check if a datatype has a "proper" size function and not just a "dummy" one.

The canonical way would be to query the size database and analyze the returned equations for whatever you precisely mean by being “proper”/“dummy” (I guess "all right-hand-sides = 0" is your definition of dummy):

ML ‹BNF_LFP_Size.size_of @{context} @{type_name b}›

Note that datatypes themselves know nothing about size functions; this is a separate plugin with a separate database. Datatypes do know through which type constructors they nest recursion. For example the following code checks whether recursion is nested through the function type.

ML ‹BNF_FP_Def_Sugar.fp_sugar_of @{context} @{type_name b}
|> the |> #fp_nesting_bnfs |> map (Binding.name_of o BNF_Def.name_of_bnf)
|> Library.exists (fn T => T = @{type_name fun})›

Although “fun” is perhaps just one example where this shows up; I think size functions returning 0 could have other origins, e.g., when codatatypes or other “too large" types are involved in datatype declarations.

Best wishes,
Dmitriy

On 21 Jan 2026, at 13.54, Manuel Eberl <eberlm@in.tum.de> wrote:

Oops. I had a feeling that might be the culprit. I'll investigate.

On 21/01/2026 13:36, Dmitriy Traytel wrote:
This seems to be indeed coming from Manuel's datatype_no_proper_subterm simproc.

using [[simproc del: datatype_no_proper_subterm]]

removes the low-level exception.

Dmitriy

On 21 Jan 2026, at 12.39, Lawrence Paulson <lp15@cam.ac.uk> wrote:

Consider the following theory:

theory Scratch
imports Main

begin

datatype a = A unit unit and b = B ‹unit ⇒ a›

declare [[simp_trace]]
lemma "x = B (A ())"
apply simp

This produces the following output:

[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML”)

Larry

Email Gateway (Jan 21 2026 at 18:28):

From: Makarius <makarius@sketis.net>

On 21/01/2026 12:39, Lawrence Paulson wrote:

datatype a = A unit unit and b = B ‹unit ⇒ a›

declare [[simp_trace]]
lemma "x = B (A ())"
apply simp

This produces the following output:

[1]SIMPLIFIER INVOKED ON THE FOLLOWING TERM:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML”)

Side-remark: this is how to see more directly from where the failure comes from.

declare [[simp_debug]]
lemma "x = B (A ())"
apply simp
(*
...
[1]Trying procedure "BNF_Least_Fixpoint.datatype_no_proper_subterm" on:
x = B (A ())
exception UnequalLengths raised (line 553 of "library.ML")
*)

Thus it becomes clear that this is not a "low level simplifier failure", but a
problem in the user-space library.

Makarius

Email Gateway (Jan 22 2026 at 10:03):

From: Manuel Eberl <manuel@pruvisto.org>

On 21/01/2026 18:31, Dmitriy Traytel (via cl-isabelle-users Mailing
List) wrote:

The canonical way would be to query the size database and analyze the
returned equations for whatever you precisely mean by being
“proper”/“dummy” (I guess "all right-hand-sides = 0" is your
definition of dummy):

Hm, that still looks a bit brittle and ad-hoc. But it might work as a
first sanity check to abort early.

I still need to rework the latter part as well though. Right now it
assumes that the proof with the "size" function will always go through
if a "size" function exists, and throws an exception if it doesn't. That
is probably an overly optimistic assumption considering the complexity
of the datatype/BNF package.

A cleaner solution would be if the datatype package itself could provide
some kind of proper subterm relation and a proof that it is irreflexive.
But I assume that that's not straightforward to do or we would have done
it back then. Plus I think it is probably not a good idea to add even
more bulk to the datatype package.

@Makarius: Thanks for the hint about simp_debug. I did not know about
that option.

Manuel

Last updated: Mar 14 2026 at 08:38 UTC