5. Bayesian Inference on Normal Distribution

Normal Likelihood

Likelihood:

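$$
f(y \mid \mu) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}
$$

$$
f(\{y_i\}^n \mid \mu, \sigma^2) = \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^{n} e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\mu)^2}
$$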

$$
\bar{y} \sim N\left(\mu, \frac{\sigma^2}{n}\right)
$$

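$$
\begin{aligned}
\sum_{i=1}^{n}(y_i-\mu)^2 &= \sum (y_i-\bar{y}+\bar{y}-\mu)^2 \\
&= \sum \left((y_i-\bar{y})^2 + (\bar{y}-\mu)^2 + 2(y_i-\bar{y})(\bar{y}-\mu)\right) \\
&= \sum (y_i-\bar{y})^2 + n(\bar{y}-\mu)^2 + 2(\bar{y}-\mu)\underbrace{\sum (y_i-\bar{y})}_{0} \\
&= \sum (y_i-\bar{y})^2 + n(\bar{y}-\mu)^2
\end{aligned}
$$

$$
\begin{aligned}
L &= \left[\frac{1}{2\pi\sigma^2}\right]^{n/2} e^{-\frac{1}{2\sigma^2}\left[\sum (y_i-\bar{y})^2 + n(\bar{y}-\mu)^2\right]} \\
&= \underbrace{\left[\frac{1}{2\pi\sigma^2}\right]^{n/2} e^{-\frac{1}{2\sigma^2}\sum (y_i-\bar{y})^2}}_{\text{constant wrt } \mu}\; e^{-\frac{n}{2\sigma^2}(\bar{y}-\mu)^2}
\end{aligned}
$$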

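Assuming that $\sigma^2$ is known,

$$
f(\{y_i\} \mid \mu) \propto e^{-\frac{n}{2\sigma^2}(\bar{y}-\mu)^2} \propto e^{-\frac{1}{2(\sigma^2/n)}(\bar{y}-\mu)^2}
$$

which, as a function of $\bar{y}$, is proportional to a Normal density with mean $\mu$ and variance $\sigma^2/n$.

Thus,

$$
p(y_1, y_2, \ldots, y_n \mid \mu) \propto p(\bar{y} \mid \mu), \text{ which is Normal.}
$$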

Example

Suppose we take a random sample of four observations from a Normal distribution having mean $\mu$ and known variance $\sigma^2 = 1$.
The observations are 3.2, 2.2, 3.6, and 4.1. The possible values of $\mu$ are 2.0, 2.5, 3.0, 3.5, and 4.0. We will use a prior that gives them all equal weight. We want to use Bayes' Theorem to find our posterior belief about $\mu$ given the whole random sample.

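x <- c(3.2, 2.2, 3.6, 4.1)      # observed sample
mu <- c(2, 2.5, 3, 3.5, 4)      # possible values of mu
mu.prior <- rep(1/5, 5)         # equal prior weight: [1/5,1/5,1/5,1/5,1/5]
# The likelihood depends on the sample only through its mean: ybar ~ N(mu, sigma^2/n)
likelihood <- dnorm(mean(x), mean = mu, sd = 1/sqrt(4))
posterior <- mu.prior * likelihood / sum(mu.prior * likelihood)
posterior
## [1] 0.01579107 0.12266347 0.35052941 0.36850143
## [5] 0.14251462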
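As a check on the claim that the sample enters the likelihood only through $\bar{y}$, the following sketch recomputes the posterior from the joint likelihood of all four observations and compares it with the posterior based on the sample mean. The names `lik_full`, `post_full`, `lik_mean`, and `post_mean` are illustrative and do not appear in the notes.

```r
# Data and candidate values of mu from the example above
x <- c(3.2, 2.2, 3.6, 4.1)
mu <- c(2, 2.5, 3, 3.5, 4)
prior <- rep(1 / length(mu), length(mu))

# Joint likelihood of all observations for each candidate mu (sigma = 1 known)
lik_full <- sapply(mu, function(m) prod(dnorm(x, mean = m, sd = 1)))
post_full <- prior * lik_full / sum(prior * lik_full)

# Likelihood of the sample mean only: ybar ~ N(mu, sigma^2 / n)
lik_mean <- dnorm(mean(x), mean = mu, sd = 1 / sqrt(length(x)))
post_mean <- prior * lik_mean / sum(prior * lik_mean)

# The two posteriors should agree up to floating-point error
print(all.equal(post_full, post_mean))
```

The factor that differs between the two likelihoods does not depend on $\mu$, so it cancels when the posterior is normalized.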

Flat Prior

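Take the limit of a Normal prior as its variance tends to infinity, which gives the flat prior

$$
g(\mu) = 1
$$

This does not integrate to 1, so it is an improper prior. The resulting posterior does integrate to 1, however, so the posterior is proper.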

Posterior

Parameter: μ
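Likelihood: $P(y_1, y_2, \ldots, y_n \mid \mu, \sigma^2)$
Assumption: $\sigma^2$ is known

$$
\begin{aligned}
p(\{y_i\} \mid \mu) &\propto p(\bar{y} \mid \mu) \sim N\left(\mu, \frac{\sigma^2}{n}\right) \\
p(\mu \mid \bar{y}) &\propto p(\bar{y} \mid \mu)\,p(\mu) \\
&\propto e^{-\frac{1}{2(\sigma^2/n)}[\bar{y}-\mu]^2} \\
&\propto e^{-\frac{1}{2(\sigma^2/n)}[\mu-\bar{y}]^2} \\
p(\mu \mid \{y_i\}) &\sim N\left(\bar{y}, \frac{\sigma^2}{n}\right)
\end{aligned}
$$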
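A minimal sketch of the flat-prior result, reusing the four observations from the earlier example and assuming $\sigma = 1$ is known. Under the flat prior the posterior for $\mu$ is Normal with mean $\bar{y}$ and variance $\sigma^2/n$, so a central 95% posterior interval follows directly from `qnorm`. The variable names are illustrative.

```r
# Observations from the earlier example; sigma is assumed known
y <- c(3.2, 2.2, 3.6, 4.1)
sigma <- 1
n <- length(y)

# Flat prior: posterior is N(ybar, sigma^2 / n)
post_mean <- mean(y)
post_sd <- sigma / sqrt(n)

# Central 95% posterior interval for mu
ci <- qnorm(c(0.025, 0.975), mean = post_mean, sd = post_sd)
print(c(mean = post_mean, sd = post_sd))
print(ci)
```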

Normal Prior Single Observation

Parameter: μ
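Likelihood: $P(y_1, y_2, \ldots, y_n \mid \mu, \sigma^2) \propto p(\bar{y} \mid \mu) \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
Assumption: $\sigma^2$ is known
Prior: $p(\mu) \sim N(m, s^2)$, i.e. prior mean $m$ and prior variance $s^2$

$$
f(\mu) = \frac{1}{\sqrt{2\pi s^2}}\, e^{-\frac{1}{2}\left(\frac{\mu-m}{s}\right)^2}
$$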

Posterior

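$$
p(\mu \mid \bar{y}) \propto p(\bar{y} \mid \mu)\,p(\mu) \propto e^{-\frac{1}{2(\sigma^2/n)}[\bar{y}-\mu]^2}\, e^{-\frac{1}{2s^2}(\mu-m)^2}
$$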

One Observation:
Parameter: μ
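Likelihood: $P(y \mid \mu) \sim N(\mu, \sigma^2)$
Assumption: $\sigma^2$ is known
Prior: $p(\mu) \sim N(m, s^2)$

$$
\begin{aligned}
p(\mu \mid y) &\propto p(y \mid \mu)\,p(\mu) \\
&\propto e^{-\frac{1}{2\sigma^2}[y-\mu]^2}\, e^{-\frac{1}{2s^2}(\mu-m)^2} \\
&\propto \exp\left(-\frac{1}{2}\left[\frac{(y-\mu)^2}{\sigma^2} + \frac{(\mu-m)^2}{s^2}\right]\right) \\
&\propto \exp\left(-\frac{1}{2}\,\frac{\mu^2(s^2+\sigma^2) - 2\mu(s^2 y+\sigma^2 m) + (s^2 y^2+\sigma^2 m^2)}{\sigma^2 s^2}\right) \\
&\propto \exp\left(-\frac{1}{2\sigma^2 s^2}\left[\mu^2(s^2+\sigma^2) - 2\mu(s^2 y+\sigma^2 m)\right]\right) \exp\left(-\frac{1}{2\sigma^2 s^2}(s^2 y^2+\sigma^2 m^2)\right) \\
&\propto \exp\left(-\frac{s^2+\sigma^2}{2\sigma^2 s^2}\left(\mu^2 - 2\mu\,\frac{s^2 y+\sigma^2 m}{s^2+\sigma^2}\right)\right)
\end{aligned}
$$

The second exponential in the fifth line does not involve $\mu$, so it is absorbed into the proportionality constant. Completing the square,

$$
\left[\mu - \frac{s^2 y+\sigma^2 m}{s^2+\sigma^2}\right]^2 = \mu^2 - 2\mu\,\frac{s^2 y+\sigma^2 m}{s^2+\sigma^2} + \left[\frac{s^2 y+\sigma^2 m}{s^2+\sigma^2}\right]^2
$$

and the last term is again constant in $\mu$, so

$$
p(\mu \mid y) \propto \exp\left(-\frac{s^2+\sigma^2}{2\sigma^2 s^2}\left[\mu - \frac{s^2 y+\sigma^2 m}{s^2+\sigma^2}\right]^2\right) \propto \exp\left(-\frac{1}{2\,\frac{\sigma^2 s^2}{\sigma^2+s^2}}\left(\mu - \frac{\sigma^2 m+s^2 y}{\sigma^2+s^2}\right)^2\right)
$$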

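This is a Normal density with

posterior mean $m' = \dfrac{s^2 y+\sigma^2 m}{s^2+\sigma^2}$ and posterior variance $(s')^2 = \dfrac{\sigma^2 s^2}{s^2+\sigma^2}$.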
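The closed-form posterior mean and variance for a single observation can be checked numerically. The sketch below evaluates likelihood times prior on a fine grid of $\mu$ values, normalizes, and compares the grid mean and variance with the formulas. The numbers `y = 3`, `sigma = 1`, `m = 0`, `s = 2` are hypothetical values chosen only for illustration.

```r
# Hypothetical values, not taken from the notes
y <- 3; sigma <- 1      # single observation and known standard deviation
m <- 0; s <- 2          # prior mean and prior standard deviation

# Closed-form posterior mean and variance from the derivation
post_mean <- (s^2 * y + sigma^2 * m) / (s^2 + sigma^2)
post_var <- (sigma^2 * s^2) / (s^2 + sigma^2)

# Grid approximation: unnormalized posterior = likelihood * prior
mu_grid <- seq(-10, 15, by = 0.001)
unnorm <- dnorm(y, mean = mu_grid, sd = sigma) * dnorm(mu_grid, mean = m, sd = s)
w <- unnorm / sum(unnorm)
grid_mean <- sum(w * mu_grid)
grid_var <- sum(w * (mu_grid - grid_mean)^2)

print(c(closed = post_mean, grid = grid_mean))
print(c(closed = post_var, grid = grid_var))
```

The grid answer uses no algebra, so agreement with the closed form is an independent check on the completing-the-square step.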

Updating Rules

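Precision = 1/Variance, so the posterior precision is $\dfrac{s^2+\sigma^2}{\sigma^2 s^2}$.

$$
\frac{1}{(s')^2} = \frac{s^2+\sigma^2}{\sigma^2 s^2} = \frac{1}{\sigma^2} + \frac{1}{s^2}
$$

Posterior Precision = Sample Precision + Prior Precision

$$
\begin{aligned}
m' &= \frac{s^2 y}{s^2+\sigma^2} + \frac{\sigma^2 m}{s^2+\sigma^2} \\
&= \frac{\;\frac{s^2}{\sigma^2 s^2}\;}{\frac{s^2+\sigma^2}{\sigma^2 s^2}}\, y + \frac{\;\frac{\sigma^2}{\sigma^2 s^2}\;}{\frac{s^2+\sigma^2}{\sigma^2 s^2}}\, m \\
&= \left[\frac{1/\sigma^2}{1/(s')^2}\right] y + \left[\frac{1/s^2}{1/(s')^2}\right] m
\end{aligned}
$$

Posterior Mean = (Sample Precision / Posterior Precision)(Sample Mean) + (Prior Precision / Posterior Precision)(Prior Mean)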

Normal Prior Multiple Observations

Parameter: μ
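Likelihood: $p(\{y_i\} \mid \mu)$, where each $y_i$ is $N(\mu, \sigma^2)$
Prior: $p(\mu) \sim N(m, s^2)$
Posterior: $p(\mu \mid \{y_i\})$

$$
p(\{y_i\} \mid \mu) \propto p(\bar{y} \mid \mu) \sim N\left(\mu, \frac{\sigma^2}{n}\right)
$$

$$
\frac{1}{(s')^2} = \frac{n}{\sigma^2} + \frac{1}{s^2}
$$

$$
m' = \left[\frac{n/\sigma^2}{1/(s')^2}\right]\bar{y} + \left[\frac{1/s^2}{1/(s')^2}\right] m
$$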
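The updating rules translate directly into a few lines of R. The helper `normal_update` below is a hypothetical name, not something defined in the notes; it takes the sample mean, the sample size, the known $\sigma$, and the prior mean and standard deviation, and returns the posterior mean and variance. The call at the end reuses $\bar{y} = 3.275$ and $n = 4$ from the first example with an assumed $N(3, 1^2)$ prior chosen only for illustration.

```r
# Hypothetical helper: conjugate update for a Normal mean with known sigma
normal_update <- function(ybar, n, sigma, m, s) {
  prior_prec <- 1 / s^2            # prior precision
  sample_prec <- n / sigma^2       # precision of the sample mean
  post_prec <- prior_prec + sample_prec
  # Posterior mean is a precision-weighted average of ybar and m
  post_mean <- (sample_prec / post_prec) * ybar + (prior_prec / post_prec) * m
  list(mean = post_mean, var = 1 / post_prec)
}

# Illustration: ybar and n from the first example, assumed N(3, 1^2) prior
res <- normal_update(ybar = 3.275, n = 4, sigma = 1, m = 3, s = 1)
print(res)
```

Because the two weights sum to 1, the posterior mean always lies between the prior mean and the sample mean.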

Equivalent Sample Size

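Prior Variance $= s^2 = \dfrac{\sigma^2}{n_{\text{equiv}}}$

or

$$
m' = \left[\frac{n\left(\frac{1}{\sigma^2}\right)}{1/(s')^2}\right]\bar{y} + \left[\frac{\frac{\sigma^2}{s^2}\left(\frac{1}{\sigma^2}\right)}{1/(s')^2}\right] m
$$

so the prior carries the same weight as $n_{\text{equiv}} = \sigma^2/s^2$ observations.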

Example

Arnie and Barb are going to estimate the mean length of one-year-old rainbow trout in a stream. Previous studies in other streams have shown the length of yearling rainbow trout to be Normally distributed with known standard deviation of 2 cm. Arnie decides his prior mean is 30 cm. He decides that he doesn't believe it is possible for a yearling rainbow to be less than 18 cm or greater than 42 cm, which puts both limits three standard deviations from his prior mean. Thus his prior standard deviation is 4 cm, and he will use a Normal(30, 4²) prior. Barb doesn't know anything about trout, so she decides to use the "flat" prior.

They take a random sample of 12 yearling trout from the stream and find the sample mean $\bar{y} = 32$ cm. Arnie and Barb find their posterior distributions using the simple updating rules for the Normal conjugate family.

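$\sigma^2 = 2^2 = 4$
$s = 4$ (so $s^2 = 16$) by the empirical rule
$m = 30$
$n = 12$
$\bar{y} = 32$

For Arnie:

$$
\frac{1}{(s')^2} = \frac{12}{4} + \frac{1}{4^2} = \frac{49}{16}, \qquad (s')^2 = \frac{16}{49}
$$

$$
m' = \frac{3}{49/16}\,(32) + \frac{1/16}{49/16}\,(30) = \underbrace{\left[32 \cdot \frac{48}{49}\right] + \left[30 \cdot \frac{1}{49}\right]}_{\text{convex combination}} \approx 31.96
$$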
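The trout calculation can be reproduced in R as a sketch, using the values stated in the example ($\sigma = 2$, $n = 12$, $\bar{y} = 32$, and Arnie's prior mean 30 with prior standard deviation 4). Barb's flat-prior posterior, $N(\bar{y}, \sigma^2/n)$, and the equivalent sample size of Arnie's prior are included for comparison. Variable names are illustrative.

```r
# Values from the trout example
sigma <- 2; n <- 12; ybar <- 32

# Arnie: Normal prior with mean 30 and standard deviation 4
m <- 30; s <- 4
arnie_prec <- n / sigma^2 + 1 / s^2                                # 49/16
arnie_var <- 1 / arnie_prec                                        # 16/49
arnie_mean <- ((n / sigma^2) * ybar + (1 / s^2) * m) / arnie_prec  # 1566/49

# Barb: flat prior, so the posterior is N(ybar, sigma^2 / n)
barb_mean <- ybar
barb_var <- sigma^2 / n

# Equivalent sample size of Arnie's prior: sigma^2 / s^2
n_equiv <- sigma^2 / s^2

print(c(arnie_mean = arnie_mean, arnie_sd = sqrt(arnie_var)))
print(c(barb_mean = barb_mean, barb_sd = sqrt(barb_var)))
print(n_equiv)
```

Arnie's prior is worth only a quarter of an observation, which is why his posterior sits so close to Barb's.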

Example

The standard process for making a polymer has mean yield 35%. A
chemical engineer has developed a modified process. She runs the
process on 10 batches and measures the yield (in percent) for each
batch. They are:
38.7 40.4 37.2 36.6 35.9
34.7 37.6 35.1 37.5 35.6
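Assume that yield is Normal($\mu$, $\sigma^2$) where the standard deviation
$\sigma = 3$ is known.

The sample mean is $\bar{y} = 36.93$ with $n = 10$. The fractions in the update correspond to a prior with mean $m = 30$ and variance $s^2 = 100$:

$$
\frac{1}{(s')^2} = \frac{10}{9} + \frac{1}{100} = \frac{1009}{900}
$$

$$
m' = \frac{1000/900}{1009/900}\,(36.93) + \frac{9/900}{1009/900}\,(30) \approx 36.87
$$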
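A sketch of the polymer calculation in R. The prior mean 30 and prior standard deviation 10 are an assumption read off the fractions in the update (prior precision 9/900 and prior mean 30); the example text itself does not state the prior. The last line computes the posterior probability that the modified process has mean yield above the standard 35%, which is one way to use the result.

```r
# Yields (in percent) from the 10 batches
y <- c(38.7, 40.4, 37.2, 36.6, 35.9, 34.7, 37.6, 35.1, 37.5, 35.6)
sigma <- 3            # known standard deviation
m <- 30; s <- 10      # ASSUMED prior, inferred from the fractions in the notes

n <- length(y)
ybar <- mean(y)                        # 36.93

# Conjugate update: precisions add, mean is precision-weighted
post_prec <- n / sigma^2 + 1 / s^2     # 1009/900
post_var <- 1 / post_prec
post_mean <- ((n / sigma^2) * ybar + (1 / s^2) * m) / post_prec

# Posterior probability that mu exceeds the standard process mean of 35
p_gt_35 <- pnorm(35, mean = post_mean, sd = sqrt(post_var), lower.tail = FALSE)

print(c(ybar = ybar, post_mean = post_mean, post_sd = sqrt(post_var)))
print(p_gt_35)
```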

Example

Of those women who are diagnosed to have early-stage breast cancer, one-third eventually die of the disease. Suppose a community public health department instituted a screening program to provide for the early detection of breast cancer and to increase the survival rate π of those diagnosed to have the disease. A random sample of 27 women was selected from among those who were periodically screened by the program and who were diagnosed to have the disease. Let y represent the number of those in the sample who survive the disease.
Answer each of the following questions. You have to show all your work to get full credit.

a) If you wish to detect whether the community screening program has been effective, state the null hypothesis that should be tested.

b) State the alternative hypothesis.

$$
H_0: \pi \le 2/3, \qquad H_a: \pi > 2/3
$$

c) If 20 women in the sample of 27 survive the disease, find the posterior distribution of π. Use a Beta(2, 1) prior for π. Provide parameters of posterior distribution, explicitly.

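$$
p(\pi \mid \text{Data}) \propto \left(\pi^{20}(1-\pi)^{7}\right)\left(\pi^{2-1}(1-\pi)^{1-1}\right) = \pi^{22-1}(1-\pi)^{8-1}
$$

$$
\pi \mid \text{Data} \sim \text{Beta}(22, 8)
$$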

d) Using parts a), b) and c), can you conclude that the community screening program was
effective? Test at the 5% level of significance in a Bayesian manner. Show all your work
and explain the practical conclusions from your test.

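We want $P[\pi \le 2/3 \mid \text{Data}]$. The Beta(22, 8) posterior has mean $\frac{22}{30}$ and variance $\frac{22 \cdot 8}{30^2 \cdot 31}$, so using the normal approximation we have

$$
P\left[z < \frac{2/3 - \frac{22}{30}}{\sqrt{\frac{22 \cdot 8}{30^2 \cdot 31}}}\right] = 0.2005 > \alpha = 0.05
$$

The posterior probability of $H_0$ exceeds the 5% level, so we fail to reject $H_0$: the data do not let us conclude that the screening program was effective.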
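The normal approximation in part d) can be reproduced in R, and the same probability can also be computed from the Beta(22, 8) posterior directly with `pbeta`. This is a sketch under the Beta(2, 1) prior and the data given in part c); the names `p_approx` and `p_exact` are illustrative.

```r
# Posterior parameters: Beta(2 + 20, 1 + 7)
a <- 2 + 20
b <- 1 + 7

# Mean and variance of a Beta(a, b) distribution
post_mean <- a / (a + b)
post_var <- a * b / ((a + b)^2 * (a + b + 1))

# Normal approximation to P(pi <= 2/3 | Data)
p_approx <- pnorm(2 / 3, mean = post_mean, sd = sqrt(post_var))

# Same probability computed from the Beta posterior itself
p_exact <- pbeta(2 / 3, shape1 = a, shape2 = b)

print(c(approx = p_approx, exact = p_exact))
# Reject H0 at the 5% level only if the posterior probability of H0 is below 0.05
print(p_approx < 0.05)
```

Comparing the two values shows how much the conclusion depends on the normal approximation.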