Are there any tricks for performing arithmetic? I just want to prove that (x + x^2)^2 <= 4 * x^4
given that x >= 1
but I need several reasoning steps for that. I would expect that this can be solved in one or two steps.
x :: nat
The problem is much easier when x is a real number. It might be possible to build an incomplete procedure to first prove the result on reals and then transfer the result to naturals...
I also tried using real numbers but didn't manage to solve it in one step.
What is "it"?
In principle, this is of course decidable, but I would not expect any of our standard proof methods to be able to solve this. It's nonlinear, after all.
Sturm's method can almost solve it (it can solve it if you replace the ≥ with >), but that is of course overkill here.
It really isn't that simple of a problem, I do think you at least need something like Sturm's theorem to deal with this. In the multivariate case, something like CAD.
Not sure why Sturm can't solve this; it was too long ago. But even then, it relies on code reflection, which seems like overkill.
Are all of the standard methods + Sturm complete?
My usual approach is: show that the difference is nonnegative by factoring it, like this:
lemma
assumes "x ≥ (1 :: real)"
shows "(x + x^2)^2 ≤ 4 * x^4"
using assms
proof -
have "4 * x^4 - (x + x^2)^2 = x ^ 2 * (x - 1) * (3 * x + 1)"
by (simp add: algebra_simps power4_eq_xxxx power2_eq_square)
also have "… ≥ 0"
using assms by (intro mult_nonneg_nonneg) auto
finally show ?thesis
by simp
qed
Lukas Stevens said:
Are all of the standard methods + Sturm complete?
Not sure what you mean by that.
In principle, Sturm should be able to prove anything of the form ∀x∈{a..b}. p(x) ≤ q(x)
and the same if you replace any of the ≤ with <.
Now if my implementation is complete, that is a different question… that was done a very long time ago and nobody ever really used it much (I think Larry uses it at some point)
I mean would it help to have something like this: https://arxiv.org/abs/cs/0601134
You proposed this as reading material at some point, right?
But yeah Sturm's theorem should be complete for the universal case
univariate CAD should be complete for universal and existential (I think Wenda did something like this at some point, but I cannot find it in the AFP)
then there's multivariate CAD of course, which is an entirely different beast (but also complete)
Lukas Stevens said:
You proposed this as reading material at some point, right?
I don't think that was me.
Okay I must be missing something
Why isn't exactly Tarski arithmetic?
Ah, probably because you can also have additional functions
I see
I'm not sure how practically useful this procedure is. For instance, it seems that it cannot do distributivity, which means it won't be able to do your example above either.
This paper is cited in the one you posted, perhaps this might be more practical: https://link.springer.com/chapter/10.1007/11532231_22
I think CAD combined clause learning is very practical. The use that all the time in SMT solvers
There is a famous paper by Leo de Moura about that. But implementing that is non-trivial
Manuel Eberl said:
This paper is cited in the one you posted, perhaps this might be more practical: https://link.springer.com/chapter/10.1007/11532231_22
I don't think this is CAD though. This is probably more something akin to Tarskis's original procedure, which has abysmal running time.
Lukas Stevens said:
You proposed this as reading material at some point, right?
I meant this paper https://arxiv.org/pdf/1404.4410.pdf
But yeah for univariate stuff I'm pretty sure Wenda implemented all of that using something like Mathematica to provide certificates that are then replayed in Isabelle.
Ah yes, here: https://arxiv.org/abs/1506.08238
Lukas Stevens said:
Lukas Stevens said:
You proposed this as reading material at some point, right?
I meant this paper https://arxiv.org/pdf/1404.4410.pdf
Hmm, that vaguely rings a bell, but I can't remember having read or suggested it. But I'll believe you. :grinning:
I'm an old man, my memory isn't what it once was.
@Manuel Eberl what is CAD?
Cylindric Algebraic Decomposition
It's (as far as I know) the fastest known algorithm for real arithmetic
but it's still exponential
(in the number of, I forgot… quantifier alternations probably)
How difficult is it to implement?
Doubly exponential even!
Lukas Stevens said:
How difficult is it to implement?
I honestly don't know that much about it. You should ask @Wenda Li .
I think he had plans to do multivariate CAD in Isabelle at some point.
But it's certainly a big project. Even the univariate one was a decently-sized paper.
Assia Mahboubi's PhD thesis was to do Tarskis's procedure (which I think is considerably simpler than CAD, but again, I'm not an expert) in Coq.
So that's another indicator that it's probably quite a bit of work… (unless we're willing to believe that everything is much harder to do in Coq)
Manuel Eberl said:
But it's certainly a big project. Even the univariate one was a decently-sized paper.
But this is with verification in Isabelle, right?
@Lukas Stevens we are developing simple tactics for arithmetical reasoning for our hybrid systems verification components in https://github.com/isabelle-utp/Hybrid-Verification. Nothing too complex yet. The lemma you showed can be solved with two of our tactics plus sledgehammer. To call them you would just need the file Real_Arith_Tactics.thy
lemma
assumes "x ≥ (1 :: real)"
shows "(x + x^2)^2 ≤ 4 * x^4"
apply bin_unfold
apply mon_pow_simp
by (smt (verit, ccfv_SIG) assms le_add1 nat_add_1_add_1 numeral_Bit0 numeral_Bit1 power_increasing)
For this problem in particular, our build-in sum-of-squares method suffices (HOL-Library.Sum_of_Squares
):
lemma
assumes "(x::real) ≥1"
shows "(x + x^2)^2 <= 4 * x^4"
using assms
by sos
Univariate CAD should be preferable if the degree of the polynomial is too high. You can try my certificate-based univariate CAD here: https://github.com/Wenda302/RCF_Decision_Procedures. I haven't put it into the AFP as it requires Mathematica installed for finding a certificate.
Multivariate CAD is still on the way, which may take a bit time but I am super excited about it!
@Manuel Eberl thanks for the background info! Very useful and very interested in this!
Wenda Li said:
For this problem in particular, our build-in sum-of-squares method suffices (
HOL-Library.Sum_of_Squares
):lemma assumes "(x::real) ≥1" shows "(x + x^2)^2 <= 4 * x^4" using assms by sos
Univariate CAD should be preferable if the degree of the polynomial is too high. You can try my certificate-based univariate CAD here: https://github.com/Wenda302/RCF_Decision_Procedures. I haven't put it into the AFP as it requires Mathematica installed for finding a certificate.
Multivariate CAD is still on the way, which may take a bit time but I am super excited about it!
Also based on certificates?
Manuel Eberl said:
Assia Mahboubi's PhD thesis was to do Tarskis's procedure (which I think is considerably simpler than CAD, but again, I'm not an expert) in Coq.
Assia Mahboubi implemented CAD in Coq but as far as I know the proof is yet to finish. Cyril Cohen and Assia Mahboubi implemented and verified Tarski's quantifier-elimination procedure in Coq, which is more of theoretical interest as the procedure has non-elementary complexity.
Lukas Stevens said:
Also based on certificates?
Certificate-based is my short-term solution. In the long term, I am planning to have both certificate-based and non-certificate-based ones, but non-certificate-based ones might be slower as it really takes time to catch up with years of accumulated optimisation in modern CAS.
Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica. Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?
Manuel Eberl said:
but it's still exponential
Double-exponential actually. For the univariate case, it is polynomial though.
Lukas Stevens said:
Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica.
Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?
For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.
Yes, I am already outsourcing some of the problems to students, but it still would take some time.
Manuel Eberl said:
Manuel Eberl said:
This paper is cited in the one you posted, perhaps this might be more practical: https://link.springer.com/chapter/10.1007/11532231_22
I don't think this is CAD though. This is probably more something akin to Tarskis's original procedure, which has abysmal running time.
Yes, to me that implementation is more close to Tarski's original quantifier elimination---it has some optimisation but the overall complexity should stay the same (i.e., non-elementary in terms of the number of variables). PVS has this procedure (https://shemesh.larc.nasa.gov/fm/pvs/Tarski/), though they only use it for solving univariate problems. Isabelle also has something very similar: https://www.isa-afp.org/entries/BenOr_Kozen_Reif.html.
Wenda Li said:
Lukas Stevens said:
Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica.
Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.
Yes, I am already outsourcing some of the problems to students, but it still would take some time.
Non-verified, I assume? Just running in ML?
Wenda Li said:
For this problem in particular, our build-in sum-of-squares method suffices (
HOL-Library.Sum_of_Squares
):lemma assumes "(x::real) ≥1" shows "(x + x^2)^2 <= 4 * x^4" using assms by sos
Oh, right! I thought about sos
but didn't realise it also worked with assumptions like x ≥ 1
. I wonder how that even works.
Lukas Stevens said:
But it's certainly a big project. Even the univariate one was a decently-sized paper.
But this is with verification in Isabelle, right?
Well, a proof-producing method typically has to do some work inside Isabelle and some work in ML to plug the theorems together (plus possibly some external work to compute certificates).
I think for CAD there will always be a significant amount of work to be done inside Isabelle to get to the point where you can implement a proof-producing procedure in ML.
Manuel Eberl said:
Wenda Li said:
Lukas Stevens said:
Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica.
Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.
Yes, I am already outsourcing some of the problems to students, but it still would take some time.
Non-verified, I assume? Just running in ML?
What do you mean by 'non-verified' here?
Manuel Eberl said:
Oh, right! I thought about
sos
but didn't realise it also worked with assumptions likex ≥ 1
. I wonder how that even works.
SOS is supposed to be complete for univariate (and bivariate, if my memory serves me) problems, considering we factor out the numerical issue in the SDP solver.
Manuel Eberl said:
I think for CAD there will always be a significant amount of work to be done inside Isabelle to get to the point where you can implement a proof-producing procedure in ML.
I am a huge fan of proof-by-reflection. This should be the future :-)
Wenda Li said:
Manuel Eberl said:
Wenda Li said:
Lukas Stevens said:
Maybe it would be a good idea to hand out porting some implementation of CAD to ML to a student? Then, one can use that as a fallback for Mathematica.
Or is that what you are planning to do anyway, i.e. not verifying the procedure but sticking to the certificated-based approach?For univariate CAD, the crucial part is a highly-efficient univariate polynomial isolation procedure (which itself would be nice paper). I am still working on that procedure. Once it is finished, we can expect a non-certificate-based univariate CAD tactic as well as a standalone extracted program.
Yes, I am already outsourcing some of the problems to students, but it still would take some time.
Non-verified, I assume? Just running in ML?
What do you mean by 'non-verified' here?
Well, you gave the student the task of building a polynomial root isolation procedure, right? For your purposes, that procedure doesn't have to be a verified procedure in HOL, right? It would be enough to have some unverified ML code that just produces the numbers and then you check them in HOL by plugging in the numbers and verifying that they have alternating signs.
(Assuming that the polynomials are squarefree, which is easy to guarantee because you can just run a squarefree factorisation)
Okay well you probably do need more than that I suppose, some root-counting stuff like Sturm sequences or whatever, but you know more about that than I do.
Ahh, I see your point, Manuel! Yes, that could be an intermediate solution, but I have been planning for a fully verified real root isolation procedure, which could drop the certificate-verification overhead entirely.
Calculating Sturm sequences would be a huge pain when the coefficient size is large.
Also, if I remember correctly, calculating square-free polynomials might not be a cheap operation in many cases...
Last updated: Dec 21 2024 at 16:20 UTC