Tricks for arithmetic · Beginner Questions

Are there any tricks for performing arithmetic? I just want to prove that (x + x^2)^2 <= 4 * x^4 given that x >= 1 but I need several reasoning steps for that. I would expect that this can be solved in one or two steps.

Lukas Stevens (May 19 2022 at 11:43):

x :: nat

Wenda Li (May 19 2022 at 11:52):

The problem is much easier when x is a real number. It might be possible to build an incomplete procedure to first prove the result on reals and then transfer the result to naturals...

Lukas Stevens (May 19 2022 at 11:53):

I also tried using real numbers but didn't manage to solve it in one step.

Manuel Eberl (May 19 2022 at 11:54):

What is "it"?

Manuel Eberl (May 19 2022 at 11:54):

In principle, this is of course decidable, but I would not expect any of our standard proof methods to be able to solve this. It's nonlinear, after all.

Manuel Eberl (May 19 2022 at 11:54):

Sturm's method can almost solve it (it can solve it if you replace the ≥ with >), but that is of course overkill here.

Manuel Eberl (May 19 2022 at 11:56):

It really isn't that simple of a problem, I do think you at least need something like Sturm's theorem to deal with this. In the multivariate case, something like CAD.

Manuel Eberl (May 19 2022 at 11:56):

Not sure why Sturm can't solve this; it was too long ago. But even then, it relies on code reflection, which seems like overkill.

Lukas Stevens (May 19 2022 at 11:57):

Are all of the standard methods + Sturm complete?

Manuel Eberl (May 19 2022 at 11:57):

My usual approach is: show that the difference is nonnegative by factoring it, like this:

lemma
  assumes "x ≥ (1 :: real)"
  shows   "(x + x^2)^2 ≤ 4 * x^4"
  using assms
proof -
  have "4 * x^4 - (x + x^2)^2 = x ^ 2 * (x - 1) * (3 * x + 1)"
    by (simp add: algebra_simps power4_eq_xxxx power2_eq_square)
  also have "… ≥ 0"
    using assms by (intro mult_nonneg_nonneg) auto
  finally show ?thesis
    by simp
qed

Manuel Eberl (May 19 2022 at 11:57):

Lukas Stevens said:

Are all of the standard methods + Sturm complete?

Not sure what you mean by that.

Manuel Eberl (May 19 2022 at 11:58):

In principle, Sturm should be able to prove anything of the form ∀x∈{a..b}. p(x) ≤ q(x) and the same if you replace any of the ≤ with <.

Manuel Eberl (May 19 2022 at 11:59):

Now if my implementation is complete, that is a different question… that was done a very long time ago and nobody ever really used it much (I think Larry uses it at some point)

Lukas Stevens (May 19 2022 at 11:59):

I mean would it help to have something like this: https://arxiv.org/abs/cs/0601134

Lukas Stevens (May 19 2022 at 11:59):

You proposed this as reading material at some point, right?

Manuel Eberl (May 19 2022 at 11:59):

But yeah Sturm's theorem should be complete for the universal case

Manuel Eberl (May 19 2022 at 12:00):

univariate CAD should be complete for universal and existential (I think Wenda did something like this at some point, but I cannot find it in the AFP)

Manuel Eberl (May 19 2022 at 12:00):

then there's multivariate CAD of course, which is an entirely different beast (but also complete)

Manuel Eberl (May 19 2022 at 12:00):

Lukas Stevens said:

You proposed this as reading material at some point, right?

I don't think that was me.

Manuel Eberl (May 19 2022 at 12:03):

Okay I must be missing something

Manuel Eberl (May 19 2022 at 12:03):

Why isn't $T[Q]$ exactly Tarski arithmetic?

Manuel Eberl (May 19 2022 at 12:05):

Ah, probably because you can also have additional functions

Manuel Eberl (May 19 2022 at 12:05):

I see

Manuel Eberl (May 19 2022 at 12:07):

I'm not sure how practically useful this procedure is. For instance, it seems that it cannot do distributivity, which means it won't be able to do your example above either.

Manuel Eberl (May 19 2022 at 12:08):

This paper is cited in the one you posted, perhaps this might be more practical: https://link.springer.com/chapter/10.1007/11532231_22

Mohammad Abdulaziz (May 19 2022 at 12:09):

I think CAD combined clause learning is very practical. The use that all the time in SMT solvers

Mohammad Abdulaziz (May 19 2022 at 12:10):

There is a famous paper by Leo de Moura about that. But implementing that is non-trivial

Manuel Eberl (May 19 2022 at 12:11):

Manuel Eberl said:

This paper is cited in the one you posted, perhaps this might be more practical: https://link.springer.com/chapter/10.1007/11532231_22

I don't think this is CAD though. This is probably more something akin to Tarskis's original procedure, which has abysmal running time.

Lukas Stevens (May 19 2022 at 12:13):

Lukas Stevens said:

You proposed this as reading material at some point, right?

I meant this paper https://arxiv.org/pdf/1404.4410.pdf

Manuel Eberl (May 19 2022 at 12:13):

But yeah for univariate stuff I'm pretty sure Wenda implemented all of that using something like Mathematica to provide certificates that are then replayed in Isabelle.

Manuel Eberl (May 19 2022 at 12:14):

Ah yes, here: https://arxiv.org/abs/1506.08238

Manuel Eberl (May 19 2022 at 12:14):

Lukas Stevens said:

Lukas Stevens said:

You proposed this as reading material at some point, right?

I meant this paper https://arxiv.org/pdf/1404.4410.pdf

Hmm, that vaguely rings a bell, but I can't remember having read or suggested it. But I'll believe you. :grinning:

Manuel Eberl (May 19 2022 at 12:14):

I'm an old man, my memory isn't what it once was.

Jonathan Julian Huerta y Munive (May 19 2022 at 12:18):

@Manuel Eberl what is CAD?

Manuel Eberl (May 19 2022 at 12:18):

Cylindric Algebraic Decomposition

Manuel Eberl (May 19 2022 at 12:18):

It's (as far as I know) the fastest known algorithm for real arithmetic

Manuel Eberl (May 19 2022 at 12:19):

but it's still exponential

Manuel Eberl (May 19 2022 at 12:19):

(in the number of, I forgot… quantifier alternations probably)

Lukas Stevens (May 19 2022 at 12:19):

How difficult is it to implement?

Manuel Eberl (May 19 2022 at 12:19):

Doubly exponential even!

Manuel Eberl (May 19 2022 at 12:20):

Lukas Stevens said:

How difficult is it to implement?

I honestly don't know that much about it. You should ask @Wenda Li .

Manuel Eberl (May 19 2022 at 12:20):

I think he had plans to do multivariate CAD in Isabelle at some point.

Manuel Eberl (May 19 2022 at 12:20):

But it's certainly a big project. Even the univariate one was a decently-sized paper.

Manuel Eberl (May 19 2022 at 12:21):

Assia Mahboubi's PhD thesis was to do Tarskis's procedure (which I think is considerably simpler than CAD, but again, I'm not an expert) in Coq.

Manuel Eberl (May 19 2022 at 12:21):

So that's another indicator that it's probably quite a bit of work… (unless we're willing to believe that everything is much harder to do in Coq)

Lukas Stevens (May 19 2022 at 12:22):

Manuel Eberl said:

But it's certainly a big project. Even the univariate one was a decently-sized paper.

But this is with verification in Isabelle, right?

Jonathan Julian Huerta y Munive (May 19 2022 at 12:23):

@Lukas Stevens we are developing simple tactics for arithmetical reasoning for our hybrid systems verification components in https://github.com/isabelle-utp/Hybrid-Verification. Nothing too complex yet. The lemma you showed can be solved with two of our tactics plus sledgehammer. To call them you would just need the file Real_Arith_Tactics.thy

lemma
  assumes "x ≥ (1 :: real)"
  shows   "(x + x^2)^2 ≤ 4 * x^4"
  apply bin_unfold
  apply mon_pow_simp
  by (smt (verit, ccfv_SIG) assms le_add1 nat_add_1_add_1 numeral_Bit0 numeral_Bit1 power_increasing)

Wenda Li (May 19 2022 at 12:24):

For this problem in particular, our build-in sum-of-squares method suffices (HOL-Library.Sum_of_Squares):

lemma
  assumes "(x::real) ≥1"
  shows "(x + x^2)^2 <= 4 * x^4"
  using assms
  by sos

Univariate CAD should be preferable if the degree of the polynomial is too high. You can try my certificate-based univariate CAD here: https://github.com/Wenda302/RCF_Decision_Procedures. I haven't put it into the AFP as it requires Mathematica installed for finding a certificate.

Multivariate CAD is still on the way, which may take a bit time but I am super excited about it!

Jonathan Julian Huerta y Munive (May 19 2022 at 12:24):

@Manuel Eberl thanks for the background info! Very useful and very interested in this!

Lukas Stevens (May 19 2022 at 12:27):

Wenda Li said:

For this problem in particular, our build-in sum-of-squares method suffices (HOL-Library.Sum_of_Squares):
lemma
  assumes "(x::real) ≥1"
  shows "(x + x^2)^2 <= 4 * x^4"
  using assms
  by sos
Univariate CAD should be preferable if the degree of the polynomial is too high. You can try my certificate-based univariate CAD here: https://github.com/Wenda302/RCF_Decision_Procedures. I haven't put it into the AFP as it requires Mathematica installed for finding a certificate.

Multivariate CAD is still on the way, which may take a bit time but I am super excited about it!

Also based on certificates?

Wenda Li (May 19 2022 at 12:29):

Manuel Eberl said:

Assia Mahboubi's PhD thesis was to do Tarskis's procedure (which I think is considerably simpler than CAD, but again, I'm not an expert) in Coq.

Assia Mahboubi implemented CAD in Coq but as far as I know the proof is yet to finish. Cyril Cohen and Assia Mahboubi implemented and verified Tarski's quantifier-elimination procedure in Coq, which is more of theoretical interest as the procedure has non-elementary complexity.

Wenda Li (May 19 2022 at 12:33):

Lukas Stevens said:

Also based on certificates?

Certificate-based is my short-term solution. In the long term, I am planning to have both certificate-based and non-certificate-based ones, but non-certificate-based ones might be slower as it really takes time to catch up with years of accumulated optimisation in modern CAS.

Lukas Stevens (May 19 2022 at 12:35):

Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica. ~~Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?~~

Wenda Li (May 19 2022 at 12:36):

Manuel Eberl said:

but it's still exponential

Double-exponential actually. For the univariate case, it is polynomial though.

Wenda Li (May 19 2022 at 12:44):

Lukas Stevens said:

Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica. ~~Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?~~

For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.

Yes, I am already outsourcing some of the problems to students, but it still would take some time.

Wenda Li (May 19 2022 at 12:58):

Manuel Eberl said:

Manuel Eberl said:

This paper is cited in the one you posted, perhaps this might be more practical: https://link.springer.com/chapter/10.1007/11532231_22

I don't think this is CAD though. This is probably more something akin to Tarskis's original procedure, which has abysmal running time.

Yes, to me that implementation is more close to Tarski's original quantifier elimination---it has some optimisation but the overall complexity should stay the same (i.e., non-elementary in terms of the number of variables). PVS has this procedure (https://shemesh.larc.nasa.gov/fm/pvs/Tarski/), though they only use it for solving univariate problems. Isabelle also has something very similar: https://www.isa-afp.org/entries/BenOr_Kozen_Reif.html.

Manuel Eberl (May 19 2022 at 13:36):

Wenda Li said:

Lukas Stevens said:

Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica. ~~Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?~~

For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.

Yes, I am already outsourcing some of the problems to students, but it still would take some time.

Non-verified, I assume? Just running in ML?

Manuel Eberl (May 19 2022 at 13:37):

Wenda Li said:

For this problem in particular, our build-in sum-of-squares method suffices (HOL-Library.Sum_of_Squares):
lemma
  assumes "(x::real) ≥1"
  shows "(x + x^2)^2 <= 4 * x^4"
  using assms
  by sos

Oh, right! I thought about sos but didn't realise it also worked with assumptions like x ≥ 1. I wonder how that even works.

Manuel Eberl (May 19 2022 at 13:39):

Lukas Stevens said:

But it's certainly a big project. Even the univariate one was a decently-sized paper.

But this is with verification in Isabelle, right?

Well, a proof-producing method typically has to do some work inside Isabelle and some work in ML to plug the theorems together (plus possibly some external work to compute certificates).

Manuel Eberl (May 19 2022 at 13:39):

I think for CAD there will always be a significant amount of work to be done inside Isabelle to get to the point where you can implement a proof-producing procedure in ML.

Wenda Li (May 19 2022 at 13:44):

Manuel Eberl said:

Wenda Li said:

Lukas Stevens said:

Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica. ~~Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?~~

For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.

Yes, I am already outsourcing some of the problems to students, but it still would take some time.

Non-verified, I assume? Just running in ML?

What do you mean by 'non-verified' here?

Wenda Li (May 19 2022 at 13:46):

Manuel Eberl said:

Oh, right! I thought about sos but didn't realise it also worked with assumptions like x ≥ 1. I wonder how that even works.

SOS is supposed to be complete for univariate (and bivariate, if my memory serves me) problems, considering we factor out the numerical issue in the SDP solver.

Wenda Li (May 19 2022 at 13:47):

Manuel Eberl said:

I think for CAD there will always be a significant amount of work to be done inside Isabelle to get to the point where you can implement a proof-producing procedure in ML.

I am a huge fan of proof-by-reflection. This should be the future :-)

Manuel Eberl (May 19 2022 at 13:53):

Wenda Li said:

Manuel Eberl said:

Wenda Li said:

Lukas Stevens said:

Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica. ~~Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?~~

For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.

Yes, I am already outsourcing some of the problems to students, but it still would take some time.

Non-verified, I assume? Just running in ML?

What do you mean by 'non-verified' here?

Well, you gave the student the task of building a polynomial root isolation procedure, right? For your purposes, that procedure doesn't have to be a verified procedure in HOL, right? It would be enough to have some unverified ML code that just produces the numbers and then you check them in HOL by plugging in the numbers and verifying that they have alternating signs.

Manuel Eberl (May 19 2022 at 13:53):

(Assuming that the polynomials are squarefree, which is easy to guarantee because you can just run a squarefree factorisation)

Manuel Eberl (May 19 2022 at 13:54):

Okay well you probably do need more than that I suppose, some root-counting stuff like Sturm sequences or whatever, but you know more about that than I do.

Wenda Li (May 19 2022 at 14:10):

Ahh, I see your point, Manuel! Yes, that could be an intermediate solution, but I have been planning for a fully verified real root isolation procedure, which could drop the certificate-verification overhead entirely.

Calculating Sturm sequences would be a huge pain when the coefficient size is large.

Wenda Li (May 19 2022 at 14:14):

Also, if I remember correctly, calculating square-free polynomials might not be a cheap operation in many cases...

Last updated: Aug 13 2025 at 08:30 UTC

Stream: Beginner Questions

Topic: Tricks for arithmetic

Lukas Stevens (May 19 2022 at 11:43):

Lukas Stevens (May 19 2022 at 11:43):

Wenda Li (May 19 2022 at 11:52):

Lukas Stevens (May 19 2022 at 11:53):

Manuel Eberl (May 19 2022 at 11:54):

Manuel Eberl (May 19 2022 at 11:54):

Manuel Eberl (May 19 2022 at 11:54):

Manuel Eberl (May 19 2022 at 11:56):

Manuel Eberl (May 19 2022 at 11:56):

Lukas Stevens (May 19 2022 at 11:57):

Manuel Eberl (May 19 2022 at 11:57):

Manuel Eberl (May 19 2022 at 11:57):

Manuel Eberl (May 19 2022 at 11:58):

Manuel Eberl (May 19 2022 at 11:59):

Lukas Stevens (May 19 2022 at 11:59):

Lukas Stevens (May 19 2022 at 11:59):

Manuel Eberl (May 19 2022 at 11:59):

Manuel Eberl (May 19 2022 at 12:00):

Manuel Eberl (May 19 2022 at 12:00):

Manuel Eberl (May 19 2022 at 12:00):

Manuel Eberl (May 19 2022 at 12:03):

Manuel Eberl (May 19 2022 at 12:03):

Manuel Eberl (May 19 2022 at 12:05):

Manuel Eberl (May 19 2022 at 12:05):

Manuel Eberl (May 19 2022 at 12:07):

Manuel Eberl (May 19 2022 at 12:08):

Mohammad Abdulaziz (May 19 2022 at 12:09):

Mohammad Abdulaziz (May 19 2022 at 12:10):

Manuel Eberl (May 19 2022 at 12:11):

Lukas Stevens (May 19 2022 at 12:13):

Manuel Eberl (May 19 2022 at 12:13):

Manuel Eberl (May 19 2022 at 12:14):

Manuel Eberl (May 19 2022 at 12:14):

Manuel Eberl (May 19 2022 at 12:14):

Jonathan Julian Huerta y Munive (May 19 2022 at 12:18):

Manuel Eberl (May 19 2022 at 12:18):

Manuel Eberl (May 19 2022 at 12:18):

Manuel Eberl (May 19 2022 at 12:19):

Manuel Eberl (May 19 2022 at 12:19):

Lukas Stevens (May 19 2022 at 12:19):

Manuel Eberl (May 19 2022 at 12:19):

Manuel Eberl (May 19 2022 at 12:20):

Manuel Eberl (May 19 2022 at 12:20):

Manuel Eberl (May 19 2022 at 12:20):

Manuel Eberl (May 19 2022 at 12:21):

Manuel Eberl (May 19 2022 at 12:21):

Lukas Stevens (May 19 2022 at 12:22):

Jonathan Julian Huerta y Munive (May 19 2022 at 12:23):

Wenda Li (May 19 2022 at 12:24):

Jonathan Julian Huerta y Munive (May 19 2022 at 12:24):

Lukas Stevens (May 19 2022 at 12:27):

Wenda Li (May 19 2022 at 12:29):

Wenda Li (May 19 2022 at 12:33):

Lukas Stevens (May 19 2022 at 12:35):

Wenda Li (May 19 2022 at 12:36):

Wenda Li (May 19 2022 at 12:44):

Wenda Li (May 19 2022 at 12:58):

Manuel Eberl (May 19 2022 at 13:36):

Manuel Eberl (May 19 2022 at 13:37):

Manuel Eberl (May 19 2022 at 13:39):

Manuel Eberl (May 19 2022 at 13:39):

Wenda Li (May 19 2022 at 13:44):

Wenda Li (May 19 2022 at 13:46):

Wenda Li (May 19 2022 at 13:47):

Manuel Eberl (May 19 2022 at 13:53):

Manuel Eberl (May 19 2022 at 13:53):

Manuel Eberl (May 19 2022 at 13:54):

Wenda Li (May 19 2022 at 14:10):

Wenda Li (May 19 2022 at 14:14):