Stream: Beginner Questions

Topic: Real uniform distribution


view this post on Zulip Christoph Madlener (Jun 11 2022 at 09:26):

I'm trying to define a uniform distribution over an interval of reals [a,b][a,b] (where a<ba < b). (If there is an existing definition for this I wasn't able to find I'd also be glad if that is pointed out to me). There are two approaches I've tried (unsuccessfully) so far:

  1. define the density and lift that to a real pmf:
context
  fixes a :: real and b :: real
  assumes a_less_b: "a < b"
begin

lift_definition pmf_of_real_interval :: "real pmf" is "λx. indicator {a..b} x / (b - a)"
proof
   ...

In that proof I didn't manage to prove the goal that the density integrates to 1.

  1. define the cdf and use real_distribution_interval_measure to get a distribution that way. I'm also not quite sure how easy it is from there to get the real pmf I want in the end. I'm also open for suggestions on how to formulate the cdf more elegantly:
definition real_uniform_cdf :: "real ⇒ real ⇒ real ⇒ real" where
  "real_uniform_cdf a b ≡
    λx. if x < Min {a,b}
        then 0
        else if x ∈ {Min{a,b}..Max{a,b}}
        then (x - Min{a,b}) / (Max{a,b} - Min{a,b})
        else 1"

lemma real_uniform_cdf_altdef:
  "a ≤ b ⟹ real_uniform_cdf a b = (λx. if x < a then 0 else if x ∈ {a..b} then (x - a) / (b - a) else 1)"
  unfolding real_uniform_cdf_def
  by auto

lemma "a < b ⟹ real_distribution (interval_measure (real_uniform_cdf a b))"
proof(intro real_distribution_interval_measure, goal_cases)
...

Here, I'm struggling to prove the continuity of the cdf. I have to admit I haven't really wrapped my head around the whole filter thing. If there is any introductory reading (or illustrative examples) someone can recommend, I would also be thankful.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:27):

The thing you are trying to define does not exist. A PMF is a probability mass function, i.e. it describes a distribution by telling you what the probability of each possible result x is. For continuous distributions like the one you are talking about this does not work, since that probability will be 0 for all results.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:28):

You can of course talk about densities (which is presumable what you are trying to do here), but that is not what PMFs are.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:28):

If you want non-discrete distributions, you have to use the measure type, and it will be somewhat more painful than using PMFs.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:30):

Also note that many distributions do not have a probability density, such as e.g. the Dirac distribution (Giry_Monad.return or return_pmf) and more generally any discrete distribution.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:35):

The thing you are looking for is uniform_measure :: 'a measure ⇒ 'a set ⇒ 'a measure.

The second parameter is the set over which you want to randomise, i.e. in your case {a..b}. The first parameter is just providing the underlying measurable space (i.e. a Sigma algebra); in your care you can just use borel.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:36):

No, wait, that last bit was me being stupid. The first parameter is the underlying measure space; it's just just for the sigma algebra.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:36):

The natural one to use for real numbers is lborel (The Lebesgue–Borel measure). There's also lebesgue, which is a beefed-up version of lborel, but you don't need it in most cases.

view this post on Zulip Manuel Eberl (Jun 11 2022 at 10:37):

It helps to know some basic measure theory to use all this stuff, but it's not absolutely necessary. If you have any problems feel free to ask me.

view this post on Zulip Christoph Madlener (Jun 11 2022 at 11:24):

I see. Maybe I can give some more context of what I'm trying to do. I'm trying to formalize a randomized algorithm that draws nn numbers from [0,1][0,1], which are then used to define a linear order (so I actually want to rely on the fact that the probability of drawing the same number twice is 0).
Eventually I want to reason about the expectation of some random variable over the (randomized) output of this algorithm. I have done something similar before, where the distributions were discrete and I could actually use 'a pmf to nicely formulate the randomized algorithm with the provided monad syntax.
To my understanding the first part of reasoning about expectations (and probabilities, etc.) shouldn't be a problem, since they are defined using measure anyways. I'm not sure how to define the randomized algorithm itself though (without the monad of pmf).

view this post on Zulip Maximilian Schaeffeler (Jun 11 2022 at 17:02):

I don't know if I understand your last remark correctly, but the Giry Monad is defined for general (sub-)probability spaces of type 'a measure, not just for 'a pmf.


Last updated: Jul 15 2022 at 23:21 UTC