4. Bayesian Inference on Poisson

Poisson Distribution

Parameter: μ
Likelihood: p(y|μ) is discrete
Poisson:

$$p(y\mid\mu)=\frac{\mu^{y}}{y!}\,e^{-\mu},\qquad y=0,1,2,\dots$$

Example 1:

Let Y be the number of accidents occurring in an industrial plant, Y is described by a Poisson process with mean μ accidents every three months. Suppose that 5 possible values of μ are 1/3, 2/3, 1, 4/3, and 5/3. We do not have any reason to give any possible value more weight than any other value, so we give them equal prior weight. During the last three months, NO accidents occur.

a) Find the posterior distribution, i.e., p(μ|data).

| Possible value of μ | Prior | Likelihood $P(y=0\mid\mu)$ | Prior × Likelihood | Posterior |
|---|---|---|---|---|
| $1/3$ | $1/5$ | $e^{-1/3}=0.7165$ | $0.1433$ | $0.3494$ |
| $2/3$ | $1/5$ | $e^{-2/3}=0.5134$ | $0.1027$ | $0.2504$ |
| $1$   | $1/5$ | $e^{-1}=0.3679$   | $0.0736$ | $0.1795$ |
| $4/3$ | $1/5$ | $e^{-4/3}=0.2636$ | $0.0527$ | $0.1285$ |
| $5/3$ | $1/5$ | $e^{-5/3}=0.1889$ | $0.0378$ | $0.0922$ |
|       |       |                   | sum $=0.4101$ | sum $=1$ |

b) Find the posterior mean, i.e., E(μ|data)

$$E[\mu\mid y=0]=\left(\tfrac{1}{3}\right)(0.3494)+\left(\tfrac{2}{3}\right)(0.2504)+\cdots+\left(\tfrac{5}{3}\right)(0.0922)\approx 0.7878$$

c) Find p(μ ≤ 1|data).

$$p[\mu\le 1\mid y=0]\approx 0.3494+0.2504+0.1795\approx 0.7793$$
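The whole table calculation can be reproduced with a short script (a sketch in Python; the variable names are mine, not from the notes):

```python
import math

# Discrete prior over the five candidate values of mu (Example 1),
# updated with a Poisson likelihood for y = 0 accidents.
mus = [1/3, 2/3, 1, 4/3, 5/3]
prior = [1/5] * 5
y = 0

lik = [mu**y * math.exp(-mu) / math.factorial(y) for mu in mus]  # p(y | mu)
joint = [p * l for p, l in zip(prior, lik)]                      # prior x likelihood
posterior = [j / sum(joint) for j in joint]                      # normalize

post_mean = sum(mu * p for mu, p in zip(mus, posterior))
p_le_1 = sum(p for mu, p in zip(mus, posterior) if mu <= 1)

print([round(p, 4) for p in posterior])
print(round(post_mean, 4), round(p_le_1, 4))  # about 0.7878 and 0.7793
```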

Gamma Prior

A random variable $Y$ is said to have a Gamma distribution with parameters $r>0$ and $v>0$ if the density function of $Y$ is

$$g(y)=\begin{cases}\dfrac{v^{r}y^{r-1}}{\Gamma(r)}\,e^{-vy}, & 0\le y<\infty\\[4pt] 0, & \text{otherwise}\end{cases}\qquad \Gamma(r)=\int_{0}^{\infty}y^{r-1}e^{-y}\,dy$$

Original (shape–scale) definition:

$$g(y)=\frac{1}{\Gamma(\alpha)\beta^{\alpha}}\,y^{\alpha-1}e^{-y/\beta},\qquad 0<y<\infty$$

(Here $\alpha=r$ and $\beta=1/v$: the $v$ used in these notes is a rate, the reciprocal of the usual scale $\beta$.)
$$\begin{aligned}
E[y] &= \frac{v^{r}}{\Gamma(r)}\int_{0}^{\infty} y\, y^{r-1}e^{-vy}\,dy
= \frac{v^{r}}{\Gamma(r)}\int_{0}^{\infty} y^{(r+1)-1}e^{-vy}\,dy\\
&= \frac{v^{r}}{\Gamma(r)}\cdot\frac{\Gamma(r+1)}{v^{r+1}}\underbrace{\int_{0}^{\infty}\frac{v^{r+1}}{\Gamma(r+1)}\,y^{(r+1)-1}e^{-vy}\,dy}_{=1}
= \frac{r}{v}
\end{aligned}$$

$$\operatorname{Var}(y)=E(y^{2})-[E(y)]^{2}$$

$$\begin{aligned}
E(y^{2}) &= \int_{0}^{\infty}y^{2}g(y)\,dy
= \frac{v^{r}}{\Gamma(r)}\int_{0}^{\infty}y^{(r+2)-1}e^{-vy}\,dy\\
&= \frac{v^{r}}{\Gamma(r)}\cdot\frac{\Gamma(r+2)}{v^{r+2}}\underbrace{\int_{0}^{\infty}\frac{v^{r+2}}{\Gamma(r+2)}\,y^{(r+2)-1}e^{-vy}\,dy}_{=1}
= \frac{r^{2}+r}{v^{2}}
\end{aligned}$$

$$\operatorname{Var}(y)=\frac{r^{2}+r}{v^{2}}-\frac{r^{2}}{v^{2}}=\frac{r}{v^{2}}$$
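These two moment formulas can be checked numerically (a Python sketch with an arbitrary choice $r=3$, $v=2$; midpoint integration stands in for the exact integrals):

```python
import math

# Gamma(shape r, rate v) density, as defined above
r, v = 3.0, 2.0

def g(y):
    return v**r * y**(r - 1) * math.exp(-v * y) / math.gamma(r)

# Midpoint-rule approximations of E[y] and E[y^2] over (0, 40);
# the tail beyond 40 is negligible for these parameters.
dy = 0.0005
grid = [(i + 0.5) * dy for i in range(80000)]
mean = sum(y * g(y) * dy for y in grid)
second = sum(y * y * g(y) * dy for y in grid)
var = second - mean**2

print(round(mean, 4), r / v)     # should agree with r/v = 1.5
print(round(var, 4), r / v**2)   # should agree with r/v^2 = 0.75
```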

Parameter: μ
Likelihood: p(y|μ) is Poisson
Prior: p(μ) is Gamma(r,v)
Posterior: $p(\mu\mid y)$ is again Gamma (the Gamma prior is conjugate to the Poisson likelihood):


Single Observation:

$$p(y)=\int_{0}^{\infty}p(y\mid\mu)\,p(\mu)\,d\mu
=\int_{0}^{\infty}\frac{\mu^{y}}{y!}e^{-\mu}\cdot\frac{v^{r}}{\Gamma(r)}\mu^{r-1}e^{-v\mu}\,d\mu
=\frac{1}{y!}\frac{v^{r}}{\Gamma(r)}\int_{0}^{\infty}\mu^{(y+r)-1}e^{-(v+1)\mu}\,d\mu$$

$$=\frac{1}{y!}\frac{v^{r}}{\Gamma(r)}\cdot\frac{\Gamma(y+r)}{(v+1)^{y+r}}\underbrace{\int_{0}^{\infty}\frac{(v+1)^{y+r}}{\Gamma(y+r)}\,\mu^{(y+r)-1}e^{-(v+1)\mu}\,d\mu}_{=1}
=\frac{1}{y!}\frac{v^{r}}{\Gamma(r)}\frac{\Gamma(y+r)}{(v+1)^{y+r}}$$

By Bayes' rule,

$$p(\mu\mid y)=\frac{p(y\mid\mu)\,p(\mu)}{p(y)}
=\frac{\dfrac{\mu^{y}}{y!}e^{-\mu}\cdot\dfrac{v^{r}}{\Gamma(r)}\mu^{r-1}e^{-v\mu}}{\dfrac{1}{y!}\dfrac{v^{r}}{\Gamma(r)}\dfrac{\Gamma(y+r)}{(v+1)^{y+r}}}
=\frac{(v+1)^{y+r}}{\Gamma(y+r)}\,\mu^{(y+r)-1}e^{-(v+1)\mu}
\;\sim\;\text{Gamma}(r'=y+r,\;v'=v+1)$$

Multiple observations:

Likelihood:

$$p(y_{1},y_{2},\dots,y_{n}\mid\mu)=\frac{1}{\prod_{i=1}^{n}y_{i}!}\,\mu^{\sum_{i=1}^{n}y_{i}}\,e^{-n\mu}$$

Posterior:

Updating Rules

$$\begin{aligned}
p(\mu\mid y_{1},\dots,y_{n}) &\propto p(\mu)\,p(y_{1},\dots,y_{n}\mid\mu) &&(\text{independence})\\
&= \frac{v^{r}}{\Gamma(r)}\mu^{r-1}e^{-v\mu}\prod_{i=1}^{n}\frac{\mu^{y_{i}}e^{-\mu}}{y_{i}!}\\
&\propto \mu^{\left(r+\sum_{i=1}^{n}y_{i}\right)-1}e^{-(v+n)\mu}\\
&\Rightarrow\; \text{Gamma}\!\left(r'=r+\textstyle\sum y_{i},\; v'=v+n\right) &&(\text{updating rules})
\end{aligned}$$

Normalizing explicitly,

$$\int_{0}^{\infty}\frac{1}{\prod y_{i}!}\frac{v^{r}}{\Gamma(r)}\,\mu^{\left(\sum y_{i}\right)+r-1}e^{-(n+v)\mu}\,d\mu
=\frac{1}{\prod y_{i}!}\frac{v^{r}}{\Gamma(r)}\frac{\Gamma\!\left(\sum y_{i}+r\right)}{(n+v)^{\sum y_{i}+r}}$$

so

$$p(\mu\mid\{y_{i}\}_{1}^{n})=\frac{(n+v)^{\sum y_{i}+r}}{\Gamma\!\left(\sum y_{i}+r\right)}\,\mu^{\left(\sum y_{i}+r\right)-1}e^{-(n+v)\mu}$$
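The updating rules can also be verified numerically: renormalizing prior × likelihood on a grid reproduces the $\text{Gamma}(r+\sum y_i,\,v+n)$ density. A Python sketch (the prior parameters and data here are made up for illustration):

```python
import math

def gamma_pdf(mu, r, v):
    """Gamma density with shape r and rate v."""
    return v**r * mu**(r - 1) * math.exp(-v * mu) / math.gamma(r)

def poisson_update(r, v, data):
    """Conjugate updating rules: r' = r + sum(y_i), v' = v + n."""
    return r + sum(data), v + len(data)

r, v = 2.0, 1.0            # illustrative prior, not from the notes
data = [3, 1, 4, 2]
r2, v2 = poisson_update(r, v, data)   # r' = 12, v' = 5

def unnorm_post(mu):
    lik = math.prod(mu**y * math.exp(-mu) / math.factorial(y) for y in data)
    return gamma_pdf(mu, r, v) * lik

# Normalize prior x likelihood by midpoint integration over (0, 20)
dmu = 0.001
grid = [(i + 0.5) * dmu for i in range(20000)]
Z = sum(unnorm_post(mu) * dmu for mu in grid)

mu0 = 2.5
print(unnorm_post(mu0) / Z, gamma_pdf(mu0, r2, v2))  # the two values agree
```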

Example 2:

Suppose we wish to estimate the number of tree seedlings in a forest. We randomly install
ten square meter plots and count the number of seedlings in each resulting in counts of
51, 47, 55, 51, 57, 55, 44, 41, 53, and 56. Assume that the number of tree seedlings per
plot follows a Poisson distribution.

a) Use a gamma prior for the Poisson parameter λ. Suppose your assessment of the
expected value for λ is 45 per plot and your assessment of the variance for λ is 9 per
plot. Find a gamma prior for λ with this mean and variance. Provide its parameters,
explicitly. Justify your answer.

$$\lambda\sim\text{gamma}(r=?,\,v=?)$$

We want $E[\lambda]=\dfrac{r}{v}=45$ and $\operatorname{var}[\lambda]=\dfrac{r}{v^{2}}=9$, so $r=45v$ and

$$\frac{45v}{v^{2}}=\frac{45}{v}=9\;\Rightarrow\; v=5,\qquad r=45\cdot 5=225$$

b) Find the posterior distribution of λ. Provide its parameters, explicitly:

Here $\sum_{i=1}^{10}y_{i}=510$ and $n=10$, so

$$p(\lambda\mid\{y_{i}\}_{i=1}^{10})\propto \frac{\lambda^{\sum y_{i}}}{\prod y_{i}!}\,e^{-n\lambda}\cdot\frac{v^{r}}{\Gamma(r)}\lambda^{r-1}e^{-v\lambda}
\propto\lambda^{(r+\sum y_{i})-1}e^{-(v+n)\lambda}
=\lambda^{735-1}e^{-15\lambda}$$

$$\lambda\sim\text{gamma}(735,\,15)$$

c) Summarize the posterior distribution by its first two moments (i.e. mean and variance).
If you remember the formulas, write them and use them.

$$E[\lambda]=\frac{735}{15}=49,\qquad \operatorname{var}(\lambda)=\frac{735}{15^{2}}\approx 3.27$$
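Parts (b) and (c) reduce to the updating rules; a quick Python check:

```python
counts = [51, 47, 55, 51, 57, 55, 44, 41, 53, 56]
r, v = 225, 5                    # prior from part (a)

# Conjugate update for Poisson counts
r_post = r + sum(counts)         # 225 + 510 = 735
v_post = v + len(counts)         # 5 + 10 = 15

post_mean = r_post / v_post      # 49.0
post_var = r_post / v_post**2    # about 3.2667

print(r_post, v_post, post_mean, round(post_var, 4))
```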

d) Perform a Bayesian test of the hypothesis H0 : λ ≥ 50 vs Ha : λ < 50 at the 5% level.
Please, show all your work.

$$P[\lambda\ge 50\mid\text{data}]=1-P[\lambda<50\mid\text{data}]
=1-\int_{0}^{50}\frac{15^{735}}{\Gamma(735)}\,\lambda^{735-1}e^{-15\lambda}\,d\lambda$$

Normal approximation:

$$P[\lambda\ge 50]\approx P\!\left[Z\ge\frac{50-49}{\sqrt{3.2667}}\right]=P[Z\ge 0.5532]=1-P[Z<0.5532]\approx 1-P[Z\le 0.55]=0.2912$$

Since $P[\lambda\ge 50\mid\text{data}]\approx 0.2912>0.05$, $H_{0}$ is not rejected at the 5% level.
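The normal approximation can be checked against the Gamma(735, 15) posterior itself by Monte Carlo (a sketch using only the standard library; note `random.gammavariate` takes shape and scale = 1/rate):

```python
import random

# Draw from the Gamma(735, 15) posterior and estimate P(lambda >= 50 | data)
random.seed(0)
draws = [random.gammavariate(735, 1/15) for _ in range(200_000)]
p_h0 = sum(d >= 50 for d in draws) / len(draws)

print(round(p_h0, 3))  # close to the normal approximation 0.2912
# p_h0 > 0.05, so H0 is not rejected at the 5% level
```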

Effective Sample Size

Approach 1:
Write the posterior mean of μ as a weighted average of the sample mean and the prior mean:

$$E(\mu\mid\text{data})=\frac{r'}{v'}=\frac{\sum y_{i}+r}{n+v}
=\frac{\sum y_{i}}{n+v}+\frac{r}{n+v}
=\frac{n}{n+v}\cdot\underbrace{\frac{\sum y_{i}}{n}}_{\text{sample mean}}+\frac{v}{n+v}\cdot\underbrace{\frac{r}{v}}_{\text{prior mean}}$$

Effective sample size =v
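With the Example 2 numbers, the weighted-average identity can be confirmed directly (Python sketch):

```python
counts = [51, 47, 55, 51, 57, 55, 44, 41, 53, 56]
r, v = 225, 5                    # Example 2 prior
n = len(counts)
ybar = sum(counts) / n           # sample mean: 51.0
prior_mean = r / v               # 45.0

direct = (r + sum(counts)) / (v + n)                      # r'/v'
weighted = n/(n + v) * ybar + v/(n + v) * prior_mean      # weighted average

print(direct, round(weighted, 6))  # both 49.0
```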

Approach 2:
Likelihood: $p(\{y_{i}\}_{i=1}^{n}\mid\mu)=\dfrac{1}{\prod_{i=1}^{n}y_{i}!}\,\mu^{\sum y_{i}}\,e^{-n\mu}$
Prior: $\dfrac{v^{r}}{\Gamma(r)}\,\mu^{r-1}e^{-v\mu}$

Comparing the prior with the likelihood, $v$ plays the role of $n$ (the exponent of $e^{-\mu}$) and $r$ plays the role of $\sum y_{i}$, so the prior carries as much information as $v$ extra observations.

effective sample size = v

Non-Informative Prior

Jeffreys' Prior

$$g(\mu)\propto\frac{1}{\sqrt{\mu}},\qquad\text{i.e. Gamma}(r=\tfrac{1}{2},\,v=0)\text{ (improper)}$$

$\text{Gamma}(r=1,\,v=0)$: flat (uniform) prior.

Example 1
$p(\mu)$ is $\text{Gamma}(r=1,\,v=\tfrac{1}{10})$

E(μ)=10,Var(μ)=100

Example 2
$p(\mu)$ is $\text{Gamma}(r=1,\,v=\tfrac{1}{100})$

E(μ)=100,Var(μ)=10,000

The non-informative limit is $\text{Gamma}\!\left(r,\,v=\lim_{n\to\infty}\tfrac{1}{n}\right)=\text{Gamma}(r,\,v=0)$: as $v\to 0$ the density becomes flat.

Posterior Predictive Distribution

$$r'=r+\sum y_{i},\qquad v'=v+n$$

$\tilde y$ is the new observation; $y=(y_{1},\dots,y_{n})$ is the original data.

$$\begin{aligned}
p(\tilde y\mid y)&=\int_{0}^{\infty}p(\tilde y\mid\lambda)\,p(\lambda\mid y)\,d\lambda
=\int_{0}^{\infty}\frac{\lambda^{\tilde y}}{\tilde y!}e^{-\lambda}\cdot\frac{v'^{\,r'}}{\Gamma(r')}\lambda^{r'-1}e^{-v'\lambda}\,d\lambda\\
&=\frac{1}{\tilde y!}\frac{v'^{\,r'}}{\Gamma(r')}\int_{0}^{\infty}\lambda^{(r'+\tilde y)-1}e^{-(v'+1)\lambda}\,d\lambda
=\frac{1}{\tilde y!}\frac{v'^{\,r'}}{\Gamma(r')}\frac{\Gamma(\tilde y+r')}{(v'+1)^{\tilde y+r'}}
\end{aligned}$$

Say $r'=1$ and $v'=1$:

$$p(\tilde y\mid y)=\frac{1}{\tilde y!}\cdot\frac{1}{\Gamma(1)}\cdot\frac{\Gamma(\tilde y+1)}{2^{\tilde y+1}}=\left(\frac{1}{2}\right)^{\tilde y}\left(\frac{1}{2}\right)$$

This is the pmf of a Geometric random variable: the number of failures before the first success, so the first success occurs on trial $\tilde y+1$. Note that the posterior predictive distribution is discrete, even though the posterior for the rate is continuous.
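A quick Python check that the predictive formula with $r'=v'=1$ matches the Geometric(1/2) pmf (the function name is mine):

```python
import math

def predictive(y_new, r, v):
    """Posterior predictive p(y_new | y) for a Gamma(r, v) posterior."""
    return (v**r / math.gamma(r)) * math.gamma(y_new + r) / (
        math.factorial(y_new) * (v + 1)**(y_new + r))

# Compare against the Geometric(1/2) pmf: (1/2)^k * (1/2)
for k in range(5):
    print(k, predictive(k, 1, 1), 0.5**k * 0.5)  # the two columns agree
```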

Example 3:

A geologist wishes to study the incidence of seismic movements in a given region. She then selects m independent but geologically similar observation points and counts the number of movements in a specific time interval. The observational model is Yi ∼ Pois(μ), where Yi , i = 1, 2, · · · , m, is the number of occurrences in the ith observation point and μ is the average rate of seismic movements.
a) From her previous experience, the researcher assumes that E(μ) = 2 movements per time interval and that V(μ) = 0.40, and uses these values to specify a conjugate prior. Find the parameters of the prior distribution and provide them explicitly.

$$\frac{r}{v}=2\;\Rightarrow\; r=2v,\qquad \frac{r}{v^{2}}=0.4\;\Rightarrow\;\frac{2v}{v^{2}}=\frac{2}{v}=0.4\;\Rightarrow\; v=5,\; r=10$$

$$\mu\sim\text{Gamma}(10,\,5)$$

Equivalent sample size =5

b) Assuming that (2, 3, 0, 0, 1, 0, 2, 0, 3, 0, 1, 2) was observed. What is the posterior distribution? Find it and provide its parameters explicitly.

Update rules:
$\text{Gamma}(r'=10+14=24,\; v'=5+12=17)$

c) She wishes to find the probability that the number of seismic
movements in an (m + 1)th site is 2 based on the observations
she had made, i.e., p(X = 2|y1, ..., ym). Find it.

$$p(X=2\mid y_{1},\dots,y_{m})=\int_{0}^{\infty}\frac{\mu^{y_{\text{new}}}}{y_{\text{new}}!}e^{-\mu}\cdot\frac{v'^{\,r'}}{\Gamma(r')}\mu^{r'-1}e^{-v'\mu}\,d\mu
=\frac{1}{y_{\text{new}}!}\frac{v'^{\,r'}}{\Gamma(r')}\int_{0}^{\infty}\mu^{(r'+y_{\text{new}})-1}e^{-(v'+1)\mu}\,d\mu
=\frac{1}{y_{\text{new}}!}\frac{v'^{\,r'}}{\Gamma(r')}\frac{\Gamma(r'+y_{\text{new}})}{(v'+1)^{r'+y_{\text{new}}}}$$

Substitute $r'=24$, $v'=17$, and $y_{\text{new}}=2$, then evaluate.
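The final evaluation takes a couple of lines (a Python sketch of the formula above; the value ≈ 0.235 is my own computation, so verify it):

```python
import math

# p(X = 2 | y) with posterior parameters r' = 24, v' = 17
r, v, y_new = 24, 17, 2
p2 = (v**r / math.gamma(r)) * math.gamma(r + y_new) / (
    math.factorial(y_new) * (v + 1)**(r + y_new))

print(round(p2, 4))  # approximately 0.235
```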