[isabelle] A proof assistant approach to the COVID-19 pub... · Archive Mirror: Isabelle Users Mailing List

Stream: Archive Mirror: Isabelle Users Mailing List

Topic: [isabelle] A proof assistant approach to the COVID-19 pub...

Email Gateway (Aug 23 2022 at 08:32):

From: Ken Kubota <mail@kenkubota.de>
If you want to use mathematics for a social purpose, then not only mathematics, but also the inner logic of social life comes into play.
In this case, it is more important to stop the increase of the number of infections.

Since the COVID-19 pandemic has exponential growth, it is essential to lower the effective reproductive number below 1.

I made some calculations including a translation of the exponential values to linear values by using a logarithmic scale:
http://kenkubota.de/corona-sheet
(If you use full-screen mode, you can make the progress visible by switching the pages.)

Using the current WHO data (March 24 and exactly one week before), the daily factor in the UK is 1.2317234559, which means a daily increase by 23.2 %.
Cases double in 3.33 days, and the full population would be infected in 44 days.
With the current daily progress (average of seven days until March 24), a scenario like in Italy (hospital overload and production shutdown) can be expected
– in the UK in 8 to 11 days,
– in Germany in 1 to 4 days,
– in the U.S. in 4 to 6 days,
– in Canada in 15 to 18 days,
– in Japan in 92 to 106 days,
– in India in 44 to 48 days.
(Local conditions such as quality and number of hospitals are not considered.)

So it seems obvious to me that the measurements of social distancing should be made much tighter quickly, otherwise severe consequences will follow.
China has a good daily factor of 1.0011075962, and since healed persons don't drop out of the number of cases, the real factor will be slightly better.

Some more calculations ("waiting one more day creates 40% more cases") are available here:
https://medium.com/@tomaspueyo/coronavirus-act-today-or-people-will-die-f4d3d9cd99ca

Kind regards,

Ken Kubota

Ken Kubota
https://doi.org/10.4444/100

Email Gateway (Aug 23 2022 at 08:32):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
Dear Ken Kubota,
In your post, you expressed that your priority is to stop the increase in the number of infections. Although this is important, it will not stop the exponential growth of the pandemic. Notice that the number of medical doctors is at most constant (it decrease each time a medical doctor gets ill). Hence, it will be inevitable that the number of patients will be a lot larger than the number of medical doctors available.

My approach to this problem, from a computational perspective, is to provide parallel medical diagnosis and treatment to the population. Which technology is better in order to archive this goal? A proof assistant? Artificial Intelligence? Probabilistic programming? The answer is: better to have something than nothing.

In order to avoid self-medication, the people who distribute the medication could use the software in order to check if someone needs the medication or not.

Of course, if such software or library will be created, it needs to be supervised by a medical doctor.

Kind regards,
José M.

Email Gateway (Aug 23 2022 at 08:33):

From: Makarius <makarius@sketis.net>
Note that exponential growth does not exist in nature. In the real world it is
mainly restricted to the imagination of economists.

The standard model for spreading biological viruses (as well as rumors or
gossip) is the Logistic Function (or Sigmoid): it looks exponential initially,
has a turning point, and converges to a constant value. E.g. see the teaching
material
https://www.khanacademy.org/science/biology/ecology/population-growth-and-regulation/a/exponential-logistic-growth

A very important parameter in this model is the number of unknown/unmeasured
infections: after all Covid-19 has only mild health consequences and goes very
often undetected. This hidden parameter restricts the ultimate limit of the
logistic function (among other side-conditions and assumptions).

Note that for Germany, I presently see less than linear growth of reported
measurements at
https://experience.arcgis.com/experience/478220a4c454480e823b17327b2bf1d4/page/page_1

The authorities behind that are a bit slow in handing data to the central
entity --- the Robert Koch Institute in Berlin. So it often takes a few days
for the values for each day to stabilize. (Other sources claim to have better
data in real-time, but I don't follow that.)

Makarius

Email Gateway (Aug 23 2022 at 08:33):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
Below, I will only provide some calculations, but I am more interested in the applications of Isabelle to medicine than in the statistical data.

Makarius wrote:

Note that exponential growth does not exist in nature. In the real world it is
mainly restricted to the imagination of economists.

I am using exponential growth as an approximation because empirically it is accurate enough for the current situation in Germany and other countries (it is no longer accurate for China, where the crisis almost finished). Right now, there are 392400 medical doctors in Germany:

https://www.statista.com/topics/3305/doctors-in-germany/

I have the following information (total cases of COVID19 in Germany)

https://www.worldometers.info/coronavirus/country/germany/

March 23: 29056
March 24: 32991
March 25: 37323

We see that the growth is almost exponential:

32991/29056 = 1.13542813877
37323/32991 = 1.13130853869

Because the population of Germany is 82.79 million, the number of cases is small with respect to the general population. So, for the short-term predictions, the exponential model should be valid. We can easily estimate that between April 13, 2020, and April 20, 2020, the number of cases will be larger than the number of medical doctors in Germany.

I would like to notice that Germany is not the typical case, but one of the best examples of a country concerning the management of this crisis.

Kind regards,
José M.

Email Gateway (Aug 23 2022 at 08:33):

From: Makarius <makarius@sketis.net>
On 26/03/2020 15:23, Jose Manuel Rodriguez Caballero wrote:

I am using exponential growth as an approximation because empirically it is accurate enough for the current situation in Germany and other countries (it is no longer accurate for China, where the crisis almost finished).

Playing with exponential functions is very dangerous. Mathematicians often
know the difference towards logistic functions (or other even more accurate
models), but the general public and politicians don't.

This then results in very bad approaches to solve quite different problems.
Right now it seems that the leading Western economies are destroying
themselves in order to solve the huge ecological problems that have
accumulated in the past 200 years. I would welcome sorting that out
eventually, but by less drastic means.

Because the population of Germany is 82.79 million, the number of cases is small with respect to the general population. So, for the short-term predictions, the exponential model should be valid.

So what is your model for already infected individuals, where nobody has
noticed nor measured anything? Are these 2 times, 10 times, or 100 times more
than the visible numbers?

I would like to notice that Germany is not the typical case, but one of the best examples of a country concerning the management of this crisis.

Better forget that. I am living in Bavaria, which is front-most in inadequate
and self-destructive policies.

But that is really politics, not mathematics, nor formal mathematics.

Makarius

Email Gateway (Aug 23 2022 at 08:33):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
I agree with Makarius that we should avoid any reference to something outside formal mathematics and mathematics in this debate. So, I will only answer this question:

Because the population of Germany is 82.79 million, the number of cases is small with respect to the general population. So, for the short-term predictions, the exponential model should be valid.

So what is your model for already infected individuals, where nobody has
noticed nor measured anything? Are these 2 times, 10 times, or 100 times more
than the visible numbers?

My estimation is empirical, I have no mathematical model. I assume that the behavior of the number of cases/day is bounded below by an arithmetic progression and I use the data in the past as empirical proof that my law is good enough for the short term predictions.

For example, using the data

March 21: 22364
March 22: 24873
March 23: 29056
March 24: 32991
March 25: 37323

we see the following pattern

24873/22364 = 1.1121892327 > 1.1
29056/24873 = 1.16817432557 > 1.1
32991/29056 = 1.13542813877 > 1.1
37323/32991 = 1.13130853869 > 1.1

So, it is likely that the number of cases on March 26 will be at least

37323*1.1 = 41055.3

Indeed, the information for March 26, 2020, 16:39 GMT is 43646 cases, which agree with my empirical lower bound.

End of the clarification. As I already pointed out, I am more interested in knowing how proof assistants could be used in order to generate medical diagnosis and treatment than in the statistical data of COV19.

Kind regards,
José M.

Email Gateway (Aug 23 2022 at 08:33):

From: Makarius <makarius@sketis.net>
On 26/03/2020 18:10, Jose Manuel Rodriguez Caballero wrote:

So, it is likely that the number of cases on March 26 will be at least

37323*1.1 = 41055.3

Indeed, the information for March 26, 2020, 16:39 GMT is 43646 cases, which agree with my empirical lower bound.

Where did you find this? The Robert Koch Institute in Berlin says 36508 for
26-Mar-2020 midnight, but as usual it will take several days until the numbers
stabilize: there is a subtle difference in the day of measurement compared to
the day of report of the result.

Overall that makes the latest numbers appear lower than they turn out approx.
5 days afterwards. (And I am now unsure about the order of change we are
seeing here.)

Even worse, this numerology is not limited to a single website and source:
there appears to be quite some chaos in methods of measurement, counting of
cases, presentation of data etc. --- and overall numbers are rather low for
proper statistics.

End of the clarification. As I already pointed out, I am more interested in knowing how proof assistants could be used in order to generate medical diagnosis and treatment than in the statistical data of COV19.

If you want to do this seriously, I recommend to ignore COVID-19 for now, and
use Influenza as an example instead: it has the advantage that the methods and
approaches of that community have been refined over some decades --- and there
are significant numbers of "excess mortality" every year.

Makarius

Email Gateway (Aug 23 2022 at 08:34):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
Makarius wrote:
Where did you find this? The Robert Koch Institute in Berlin says 36508 for
26-Mar-2020 midnight, but as usual it will take several days until the numbers
stabilize: there is a subtle difference in the day of measurement compared to
the day of report of the result.
My preference is:

https://www.worldometers.info/coronavirus/country/germany/

which is the same page that was used by my former professor Alexander Shnirelman (I did 3 courses with him when I was living in Montreal and 1 course by youtube now that I am in Estonia). According to Shnirelman, during the time he did this video, the exponential model holds for Germany and USA:

https://youtu.be/MItENLZxcM8

Kind regards,
José M.

Email Gateway (Aug 23 2022 at 08:35):

From: Ken Kubota <mail@kenkubota.de>
[ Continuation of: https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2020-March/msg00110.html ]

Am 26.03.2020 um 13:47 schrieb Makarius <makarius@sketis.net>:

On 26/03/2020 12:47, Jose Manuel Rodriguez Caballero wrote:

In your post, you expressed that your priority is to stop the increase in the number of infections. Although this is important, it will not stop the exponential growth of the pandemic.

Note that exponential growth does not exist in nature. In the real world it is
mainly restricted to the imagination of economists.

The standard model for spreading biological viruses (as well as rumors or
gossip) is the Logistic Function (or Sigmoid): it looks exponential initially,
has a turning point, and converges to a constant value. E.g. see the teaching
material
https://www.khanacademy.org/science/biology/ecology/population-growth-and-regulation/a/exponential-logistic-growth

I find this argument very dangerous, as it amounts to denying the basically exponential character of the pandemic.

Also the article you refer to seems to argue with limited resources only, which in this case is the number of susceptible persons.
If we follow the estimation used by Bach at
https://medium.com/@joschabach/flattening-the-curve-is-a-deadly-delusion-eea324fe9727
for "herd immunity (at 40% to 70%)," and a fatality rate "between 0.6% in South Korea and 4.4% in Iran" mentioned by Pueyo at
https://medium.com/@tomaspueyo/coronavirus-act-today-or-people-will-die-f4d3d9cd99ca
(mainly depending on whether a hospital overload will occur), then the overall share will be
between 0.24% (= 40%0.6%) and 3.08% (= 70%4.4%) of the population that will die if exponential growth isn't stopped.
Since hospital overload can be assumed, rather the latter number is likely to apply.

"Two fundamental strategies are possible: (a) mitigation, which focuses on slowing but not necessarily stopping epidemic spread – reducing peak healthcare demand while protecting those most at risk of severe disease from infection, and (b) suppression, which aims to reverse epidemic growth, reducing case numbers to low levels and maintaining that situation indefinitely. [...] However, the resulting mitigated epidemic would still likely result in hundreds of thousands of deaths and health systems (most notably intensive care units) being overwhelmed many times over. For countries able to achieve it, this leaves suppression as the preferred policy option."
https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-COVID19-NPI-modelling-16-03-2020.pdf

"Going for 'herd immunity' at this point does not seem a viable option, as this will put NHS at an even stronger level of stress, risking many more lives than necessary.
By putting in place social distancing measures now, the growth can be slowed down dramatically, and thousands of lives can be spared. We consider the social distancing measures taken as of today as insufficient, and we believe that additional and more restrictive measures should be taken immediately, as it is already happening in other countries across the world."
http://maths.qmul.ac.uk/~vnicosia/UK_scientists_statement_on_coronavirus_measures.pdf

One can use more sophisticated models such as the SEIR model used in https://doi.org/10.1016/S2468-2667(20)30073-6,
but clearly exponential growth comes nearer to reality than linear growth.

A very important parameter in this model is the number of unknown/unmeasured
infections: after all Covid-19 has only mild health consequences and goes very
often undetected. This hidden parameter restricts the ultimate limit of the
logistic function (among other side-conditions and assumptions).

Note that for Germany, I presently see less than linear growth of reported
measurements at
https://experience.arcgis.com/experience/478220a4c454480e823b17327b2bf1d4/page/page_1

The authorities behind that are a bit slow in handing data to the central
entity --- the Robert Koch Institute in Berlin. So it often takes a few days
for the values for each day to stabilize. (Other sources claim to have better
data in real-time, but I don't follow that.)

The pandemic by nature has exponential growth, as each infected person on average infects a certain number of other persons.
Data may be insufficient and/or some advances may have been achieved in lowering the effective reproductive number, so on the surface the graph shortly might look linear.

But without lowering the effective reproductive number to 1.0 or less, exponential growth can't be stopped.
With an effective reproductive number above 1.0, the curve cannot be "flattened", but only stretched, which doesn't affect the exponential character at all.
To compare the ratio k between two different exponential bases x and y, assume: x^n = y^(k*n)
This yields: k = log_y(x)
or: k = log_10(x)/log_10(y)
For example: log(1.33)/log(1.1) = 2.9921141983 = (approx.) 3
In other words, lowering the daily factor or the effective reproductive number from 33% to 10% will stretch the curve by factor 3 (e.g., hospital overload in 30 instead of 10 days).
For the current values, see: https://kenkubota.de/corona-sheet
Unfortunately, my estimation proved correct, and with 63.4 % at the linear translation the day before yesterday France faced the same hospital overload as Italy with this value.
I would love to be proved wrong about this, but as Germany has passed this threshold, too, and with about four times as much intensive care beds than Italy, and a current doubling period of 5.62 days, Germany might face the same situation in 10 or 14 days.

For Germany, the actual link is: https://corona.rki.de/

More links are provided at: https://kenkubota.de/corona

Ken Kubota
https://doi.org/10.4444/100

Email Gateway (Aug 23 2022 at 08:35):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
Reading other person posts with the same title, I see that the debate that I began concerning the COVID-19 international crisis is becoming a debate about statistics and pandemic mathematical modeling. I think that the Isabelle mailing list is not the most appropriate place for such discussions (logistic vs exponential growth). So, I suggest people interested in these two subjects to go to Terence Tao's polymath proposal "clearinghouse for crowdsourcing COVID-19 data and data cleaning requests" in order to debate there:

https://terrytao.wordpress.com/2020/03/25/polymath-proposal-clearinghouse-for-crowdsourcing-covid-19-data-and-data-cleaning-requests/

My approach to the COVID-19 international crisis is from the point of view of computer science, in an imaginary project explicitly involving Isabelle. I feel that I am writing science fiction, but I do not see any theoretical obstacle for the development of a library in Isabelle/HOL which will be statistically indistinguishable from a medical doctor. I think that if we will continue the debate in this mailing list, we should avoid statistics and pandemic mathematical modeling and to focus on theoretical computer science. My inspiration right now is Alan Turing. What would Alan Turing do in a situation like the COVID-19 international crisis? I think that to do this mental exercise is the first step in order to find a solution to this problem as a community.

Sincerely yours,
José M.

Email Gateway (Aug 23 2022 at 08:37):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
Concerning a recent publication,

https://www.prnewswire.com/news-releases/expert-system-joins-fight-against-covid-19-301031827.html

which I think is a common interest, I would like to ask the following questions:
-Could the Isabelle community contribute in some way to this international collaboration against COVID-19?
-Could Isabelle technology be useful using the approach described in this publication?

Kind regards,
José M.

Email Gateway (Aug 23 2022 at 08:46):

From: Jose Manuel Rodriguez Caballero <jose.manuel.rodriguez.caballero@ut.ee>
Dear members of the Isabelle community,
I would like to share some thoughts about the current COVID-19 public health crisis and how could be approached from the point of view of proof assistants. First, I would like to describe the crisis and then I will explain how proof assistants could be useful in order to solve some of the problems, but not all.

Description of the problem. A highly contagious virus produces an overload in the health system because there are a lot more patients than medical doctors (the number of patients increases in an exponential way, whereas more and more medical doctors become patients). Consequently, there are many patients who have no access to a medical doctor or even if they have access to the medical doctor it is during a brief period of time and the diagnosis/treatment is so quickly that there is a high probability of human mistake. The diagnosis/treatment is not just to decide whether or not a patient has COVID-19, but also to know what to do concerning the interaction between the COVID-19 and the medical history of the patient, among other technicalities.

Proposal for a partial solution: To develop a user-friendly library in Isabelle in order to allow the user to introduce the parameters of his/her physiological data, e.g., temperature, blood pressure, the result of a blood test, in order to be able to make a computer-generated diagnosis. The diagnosis will be a series of lemmas about the probability of some medical condition and the treatment will be a series of lemmas about the probability that some treatment will be correct for this patient. Such a project should be developed by a multidisciplinary team consisting of computer scientists, statisticians and, more important, professional medical doctors (they are the ones that will decide whether or not the library could be used in patients, without them, the project should not be done).

Toy model example.

Parameters:
Temperature = x, blood preassure = y, blood test 1 = u, blood test 2 = v
(* delete the letter and write the values of the parameters * )

lemma diagnosis1: The probability that the patient has the illness A is more than 95%.
sorry (* delete "sorry" and write "sledgehammer" * )

lemma treatment1: The probability that the patient need to take medicament B is more than 95%.
sorry (* delete "sorry" and write "sledgehammer" * )

Personally, I am not interested in being part of such a problem, because my current research takes all my time. Nevertheless, I think that this approach may be helpful for public health in case of situations like the one described above. Even after the end of the COVID-19 pandemic, to have such a library to the disposal of the public may be useful in case another pandemic appears.

Sincerely yours,
José M.

Last updated: Jan 15 2026 at 12:52 UTC