# Lecture 23 - Useful Distributions for Random Numbers

Text: chapter 28 & 29

## Random Variables

#### Random variables with arbitrary distributions

A random variable is uniquely defined by its CDF, F(x).

• 0 <= F(x) <= 1
• Suppose we want to relate this random variable, X, to a uniformly distributed random variable, Y, through a monotonically increasing transformation Y = g(X).
• P(Y < y) = Fy(y) = P(X < g^-1(y) ) = Fx( g^-1(y) ).
• Choose g(x) = Fx(x). Then P(Y < y) = Fx( Fx^-1(y) ) = y, or fy(y) = 1 for 0 <= y <= 1. Y is uniformly distributed.
• Conversely, given that Y is uniformly distributed, X = g^-1(Y) = Fx^-1(Y) is distributed according to fx(x).
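
As a quick numerical check of the argument above, the sketch below pushes samples of X through their own CDF and looks at whether the result behaves like a uniform variable. The exponential distribution, the helper names `exp_cdf` and `fraction_below`, and the fixed seed are assumptions chosen for illustration; they are not part of the lecture.

```python
import math
import random


def exp_cdf(x, rate=1.0):
    """CDF of an exponential variable: F(x) = 1 - exp(-rate * x)."""
    return 1.0 - math.exp(-rate * x)


def fraction_below(samples, y):
    """Empirical estimate of P(Y < y)."""
    return sum(v < y for v in samples) / len(samples)


rng = random.Random(0)                      # fixed seed so the run is repeatable
xs = [rng.expovariate(1.0) for _ in range(100_000)]
ys = [exp_cdf(x) for x in xs]               # Y = g(X) with g = Fx

# If Y is uniform on [0, 1], roughly a fraction y of the samples lies below y.
for y in (0.1, 0.5, 0.9):
    print(f"P(Y < {y}) is approximately {fraction_below(ys, y):.3f}")
```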

#### To Create a Random Variable with an Arbitrary Distribution, fx(x)

1. Calculate the CDF: Fx(x) = \int_{-\infty}^x fx(x') dx' (the lower limit becomes 0 when x cannot be negative)
2. Get one sample of a uniformly distributed random number, u.
3. Solve the equation u = Fx(x) for x, which amounts to calculating x = Fx^-1(u).

Step 3 is repeated for every sample, so it's important that solving u = Fx(x) is fast.
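
The three steps can be sketched for one concrete case. The exponential distribution is an assumption picked here because its CDF, F(x) = 1 - exp(-rate*x), inverts in closed form; the function name `sample_exponential` is hypothetical.

```python
import math
import random


def sample_exponential(rate, rng=random):
    """Draw one exponential sample by inverting the CDF.

    Step 1 (done by hand): F(x) = 1 - exp(-rate * x) for x >= 0.
    """
    u = rng.random()                     # step 2: u is uniform on [0, 1)
    return -math.log(1.0 - u) / rate     # step 3: x = F^-1(u)


rng = random.Random(1)                   # fixed seed so the run is repeatable
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
print(sum(samples) / len(samples))       # should be close to 1 / rate
```

Because the inverse is a single logarithm, each sample costs one uniform draw and a handful of arithmetic operations.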

## Examples of Discrete Distributions

This section emphasizes "what it's good for" over equations or theorems.

#### 1. Arbitrary

An arbitrary discrete distribution is a map from value to probability

• a set of pairs (xi, pi)

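Sort the pairs by xi and accumulate the pi to calculate the CDF,

• which is a step function.

Solve u = Fx(x) graphically: find the step of the CDF that the horizontal line at height u hits, and read off its xi.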
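
In code, the graphical solution amounts to a search through the cumulative sums. The sketch below is one possible way to do it, assuming the pi sum to 1; `make_discrete_sampler` is a hypothetical name, and the binary search via `bisect` is a choice made here, not something the lecture prescribes.

```python
import bisect
import itertools
import random


def make_discrete_sampler(pairs, rng=random):
    """Build a sampler from (x_i, p_i) pairs whose p_i are assumed to sum to 1."""
    pairs = sorted(pairs)                                   # order on x_i
    values = [x for x, _ in pairs]
    cdf = list(itertools.accumulate(p for _, p in pairs))   # heights of the steps

    def sample():
        u = rng.random()
        i = bisect.bisect_right(cdf, u)          # first step whose height exceeds u
        return values[min(i, len(values) - 1)]   # guard against round-off in the top step

    return sample


sample = make_discrete_sampler([(3, 0.2), (1, 0.5), (2, 0.3)], random.Random(2))
draws = [sample() for _ in range(100_000)]
print({x: draws.count(x) / len(draws) for x in (1, 2, 3)})
```

The table is built once, and each sample then costs one uniform draw plus a binary search over the steps.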

#### 2. Discrete Uniform

M uniformly spaced values, from m to n

• interval is y = ( n-m )/(M-1)
• values of xi are m+i*y, 0<=i<M
• values of pi are 1/M

Choose u,

• x = m + y*floor( u*M )
• for consecutive integers (y = 1, M = n-m+1) this is x = m + floor( u*(n-m+1) )
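
A minimal sketch of this sampler follows. It picks the index i = floor(u*M) and returns m + i*y, which coincides with m + floor(u*(n-m+1)) when the values are consecutive integers. The function name `sample_discrete_uniform` is hypothetical.

```python
import math
import random


def sample_discrete_uniform(m, n, M, rng=random):
    """Return one of M evenly spaced values from m to n inclusive."""
    step = (n - m) / (M - 1) if M > 1 else 0.0   # the interval y
    i = math.floor(rng.random() * M)             # index 0 .. M-1, each with probability 1/M
    return m + i * step


rng = random.Random(3)                           # fixed seed so the run is repeatable
draws = [sample_discrete_uniform(0, 10, 6, rng) for _ in range(60_000)]
print(sorted(set(draws)))
```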

#### 4. Bernoulli

Flip a (biased) coin

• P(heads) = p
• P(tails) = 1-p

In practice we code heads = 1, tails = 0.

Therefore,

1. Get u.
2. If u < 1-p then 0; else 1
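
The two steps translate directly into code. This is only a sketch; the name `sample_bernoulli` is hypothetical.

```python
import random


def sample_bernoulli(p, rng=random):
    """Return 1 (heads) with probability p and 0 (tails) with probability 1 - p."""
    u = rng.random()                  # step 1: get u
    return 0 if u < 1.0 - p else 1    # step 2: compare against the CDF step at 1 - p


rng = random.Random(4)                # fixed seed so the run is repeatable
flips = [sample_bernoulli(0.3, rng) for _ in range(100_000)]
print(sum(flips) / len(flips))        # should be close to p
```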

#### 5. Geometric

Probability of getting it right for the first time on try x, after x-1 failures

• f(x) = p(1 - p)^(x-1), x = 1, 2, ...

CDF is p*\sum_{x'=1}^{x} (1 - p)^(x'-1) = p * (1-(1-p)^x) / (1 - (1 - p)) = 1-(1-p)^x

• u = 1 - (1 - p)^x
• 1 - u = (1 - p)^x
• log(1-u) = x*log(1-p)
• x = ceil( log(1-u) / log(1-p) ), rounding up because x starts at 1
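
A sketch of the closed-form inversion is given below. Since f(x) is defined for x = 1, 2, ..., the smallest integer x with F(x) >= u is obtained by rounding log(1-u)/log(1-p) up, so the code uses a ceiling. The name `sample_geometric`, the use of `math.log1p` for accuracy near zero, and the handling of the edge cases p = 1 and u = 0 are choices made here.

```python
import math
import random


def sample_geometric(p, rng=random):
    """Number of tries up to and including the first success, x = 1, 2, ..."""
    if p >= 1.0:
        return 1                                   # success is certain on the first try
    u = rng.random()
    # log1p(-u) is log(1 - u), computed accurately for small u
    x = math.ceil(math.log1p(-u) / math.log1p(-p))
    return max(x, 1)                               # u == 0.0 would give 0; clamp into the support


rng = random.Random(5)                             # fixed seed so the run is repeatable
tries = [sample_geometric(0.25, rng) for _ in range(100_000)]
print(sum(tries) / len(tries))                     # should be close to 1 / p
```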

Exercise for the reader: redo this using the graphical method.

#### 6. Binomial Distribution

Get exactly x correct in n tries.

• f(x) = C(n,x) p^x * (1-p)^(n-x)

The CDF is a sum with no simple closed form that we can invert, unlike the one for the geometric distribution,

• but this is the way that the sum of n Bernoulli trials is distributed

So, generate n Bernoulli variables and sum them.
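
A minimal sketch of this approach, reusing the Bernoulli rule (1 when u >= 1-p, else 0), is shown here. The name `sample_binomial` is hypothetical.

```python
import random


def sample_binomial(n, p, rng=random):
    """Count the successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() >= 1.0 - p)


rng = random.Random(6)                 # fixed seed so the run is repeatable
counts = [sample_binomial(20, 0.3, rng) for _ in range(20_000)]
print(sum(counts) / len(counts))       # should be close to n * p
```

Each sample consumes n uniform draws, so the cost grows linearly with n.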

• Not so good if n is large.
• If n is large, then