3. Bayesian Inference on Binomial Proportion

Beta and Binomial Distributions

$$\text{Beta}(a,b):\qquad p(\pi) \;=\; \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\;\underbrace{\pi^{a-1}(1-\pi)^{b-1}}_{\text{kernel}}$$

for $0 \le \pi \le 1$

$$\text{Binomial:}\qquad f(y\,|\,\pi) \;=\; \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}$$

for $y = 0, 1, \dots, n$ and $0 \le \pi \le 1$

$$\binom{n}{y}\int_0^1 \pi^{y}(1-\pi)^{n-y}\,d\pi
= \binom{n}{y}\int_0^1 \pi^{(y+1)-1}(1-\pi)^{(n-y+1)-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\int_0^1 \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\pi^{\alpha-1}(1-\pi)^{\beta-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$$

with $\alpha = y+1$, $\beta = n-y+1$; the last integral equals 1 because its integrand is a $\text{Beta}(\alpha,\beta)$ density.

$$p(\pi\,|\,y) = \frac{p(\pi, y)}{p(y)} = \frac{p(y\,|\,\pi)\,p(\pi)}{\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi}
= \frac{\binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}}{\binom{n}{y}\,\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}}
= \frac{\Gamma(n+2)}{\Gamma(y+1)\,\Gamma(n-y+1)}\,\pi^{(y+1)-1}(1-\pi)^{(n-y+1)-1}$$

for $0 \le \pi \le 1$

note: Uniform(0,1) is a Beta($\alpha = 1$, $\beta = 1$)
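Quick numerical check (not part of the derivation): in R, using for illustration y = 14 successes out of n = 20 (the Example 1 data below), the normalized likelihood under a uniform prior matches the Beta(y + 1, n − y + 1) density.

# Sanity check: normalized binomial likelihood = Beta(y+1, n-y+1) density
n <- 20; y <- 14                                   # illustrative values (Example 1)
lik <- function(p) dbinom(y, n, p)                 # p(y | pi); uniform prior p(pi) = 1
marg <- integrate(lik, 0, 1)$value                 # p(y), the normalizing constant
pi_grid <- seq(0.01, 0.99, by = 0.01)
max(abs(lik(pi_grid) / marg - dbeta(pi_grid, y + 1, n - y + 1)))   # ~ 0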



Beta Expectation

WTS: if $Y \sim \text{Beta}(a,b)$, then $E(Y) = \dfrac{a}{a+b}$

$$E[Y] = \int_0^1 y\,\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,y^{a-1}(1-y)^{b-1}\,dy
= \int_0^1 \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,y^{a}(1-y)^{b-1}\,dy
= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^1 y^{(a+1)-1}(1-y)^{b-1}\,dy$$

$$= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\cdot\frac{\Gamma(a+1)\,\Gamma(b)}{\Gamma(a+b+1)}\int_0^1 \frac{\Gamma(a+b+1)}{\Gamma(a+1)\Gamma(b)}\,y^{(a+1)-1}(1-y)^{b-1}\,dy
\qquad (\text{note } \alpha^{*} = a+1,\ \beta^{*} = b)$$

$$= \frac{\Gamma(a+b)\,\Gamma(a+1)}{\Gamma(a)\,\Gamma(a+b+1)}
= \frac{\Gamma(a+b)\;a\,\Gamma(a)}{\Gamma(a)\;(a+b)\,\Gamma(a+b)}
= \frac{a}{a+b}$$

Beta Variance

WTS: $\operatorname{Var}(Y) = \dfrac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$

$$E[X^{2}] = \int_0^1 x^{2}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,dx
= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 x^{(\alpha+2)-1}(1-x)^{\beta-1}\,dx$$

$$= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{\Gamma(\alpha+2)\,\Gamma(\beta)}{\Gamma(\alpha+\beta+2)}\int_0^1 \frac{\Gamma(\alpha+\beta+2)}{\Gamma(\alpha+2)\Gamma(\beta)}\,x^{(\alpha+2)-1}(1-x)^{\beta-1}\,dx
\qquad (\alpha^{*} = \alpha+2,\ \beta^{*} = \beta)$$

$$= \frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+2)\,\Gamma(\beta)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\alpha+\beta+2)}
= \frac{\Gamma(\alpha+\beta)\,(\alpha+1)\,\alpha\,\Gamma(\alpha)}{\Gamma(\alpha)\,(\alpha+\beta+1)(\alpha+\beta)\,\Gamma(\alpha+\beta)}
= \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}$$

$$\operatorname{Var}(X) = E[X^{2}] - \left(E[X]\right)^{2}
= \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)} - \frac{\alpha^{2}}{(\alpha+\beta)^{2}}
= \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$$
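As a quick check of both formulas (with assumed illustrative values a = 3, b = 5), compare against simulated Beta draws in R:

# Verify the Beta mean/variance formulas by simulation (a, b are arbitrary examples)
a <- 3; b <- 5
draws <- rbeta(1e6, a, b)
c(formula = a / (a + b), simulated = mean(draws))                       # mean
c(formula = a * b / ((a + b)^2 * (a + b + 1)), simulated = var(draws))  # variance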

Example 1

Asked prior to a study of a new chemotherapy, an oncologist said that she would expect
50% of patients to respond. You then obtain a sample of 20 treated patients, of whom 14 respond.
Answer each of the following questions. You have to show all your work to get full credit.

a) Let π denote the unknown proportion of patients that respond to new chemotherapy.
Use a U (0, 1) prior for π. Find the posterior distribution of π given y. Provide its
parameters, explicitly. Justify your answer.

$$g(\pi\,|\,y) = \frac{g(\pi)\,f(y\,|\,\pi)}{\int_0^1 g(\pi)\,f(y\,|\,\pi)\,d\pi}
= \frac{\binom{20}{14}\,\pi^{14}(1-\pi)^{6}}{\binom{20}{14}\int_0^1 \pi^{(14+1)-1}(1-\pi)^{(6+1)-1}\,d\pi}
= \frac{\pi^{14}(1-\pi)^{6}}{\Gamma(15)\Gamma(7)/\Gamma(22)}
= \frac{\Gamma(22)}{\Gamma(15)\,\Gamma(7)}\,\pi^{15-1}(1-\pi)^{7-1}$$

i.e. $\pi\,|\,y \sim \text{Beta}(15, 7)$.

Alternative 1:
Posterior distribution is proportional to (prior distribution)×(likelihood)

$$p(\pi\,|\,y) \propto p(\pi)\,p(y\,|\,\pi)$$
$$p(\pi\,|\,y) \propto \pi^{14}(1-\pi)^{6}\;\underbrace{\pi^{1-1}(1-\pi)^{1-1}}_{\text{uniform}}$$
$$p(\pi\,|\,y) \propto \pi^{15-1}(1-\pi)^{7-1}$$
$$\pi\,|\,y \sim \text{Beta}(\alpha = 15,\ \beta = 7)$$

Alternative 2:

Updating rules for a uniform (Beta(1,1)) prior: $\alpha_{\text{post}} = y + 1 = 15$, $\beta_{\text{post}} = (n - y) + 1 = 7$
(the general conjugate Beta version is given below)

b) Summarize the posterior distribution by its first two moments (i.e. mean and variance).
If you remember the formulas, write them and use them.

$$E[\pi\,|\,y] = \frac{\alpha}{\alpha+\beta} = \frac{15}{15+7} \approx 0.6818
\qquad
\operatorname{Var}(\pi\,|\,y) = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)} = \frac{15\cdot 7}{22^{2}\cdot 23} \approx 0.0094$$

c) Using your posterior distribution, find P[π > 0.7]. Please show all your work.

By hand, using the normal approximation:

$$P[\pi > 0.7] \approx P\!\left[\frac{\pi - 0.6818}{\sqrt{0.0094}} > \frac{0.7 - 0.6818}{\sqrt{0.0094}}\right] = P[Z > 0.19] \approx 0.42$$

Exact, in R: 1 - pbeta(0.7, 15, 7)
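A short R sketch for parts (b) and (c), comparing the normal approximation to the exact Beta tail probability:

# Posterior for Example 1 is Beta(15, 7)
a <- 15; b <- 7
post_mean <- a / (a + b)                               # ~ 0.6818
post_var  <- a * b / ((a + b)^2 * (a + b + 1))         # ~ 0.0094
1 - pnorm(0.7, mean = post_mean, sd = sqrt(post_var))  # normal approximation
1 - pbeta(0.7, a, b)                                   # exact P[pi > 0.7 | y]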

General Beta Prior

Prior is Beta($\alpha$, $\beta$) instead of uniform

$$P(\pi\,|\,y) = \frac{P(\pi, y)}{P(y)} = \frac{P(y\,|\,\pi)\,P(\pi)}{\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi}$$

$$\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi
= \int_0^1 \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\pi^{\alpha-1}(1-\pi)^{\beta-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 \pi^{(y+\alpha)-1}(1-\pi)^{(n-y+\beta)-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{\Gamma(y+\alpha)\,\Gamma(n-y+\beta)}{\Gamma(\alpha+\beta+n)}$$

Dividing $P(y\,|\,\pi)\,P(\pi)$ by this denominator, the binomial coefficient and the prior's normalizing constant cancel, leaving

$$P(\pi\,|\,y) = \frac{\Gamma(\alpha+\beta+n)}{\Gamma(y+\alpha)\,\Gamma(n-y+\beta)}\,\pi^{(y+\alpha)-1}(1-\pi)^{(n-y+\beta)-1}$$

i.e. $\pi\,|\,y \sim \text{Beta}(y+\alpha,\ n-y+\beta)$.

General update rule:

$$\alpha_{\text{post}} = \alpha + y, \qquad \beta_{\text{post}} = \beta + (n - y)$$
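The update rule is simple enough to wrap in a small R helper (the function name is just illustrative):

# Conjugate update: Beta(a, b) prior + y successes in n trials -> Beta posterior
beta_update <- function(a, b, y, n) {
  c(alpha_post = a + y, beta_post = b + (n - y))
}
beta_update(1, 1, 14, 20)   # uniform prior with the Example 1 data: Beta(15, 7)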

Choosing Parameters to Match Prior Beliefs

Strategy 1: Graph some beta densities until you find one that matches your beliefs

Strategy 2: Note that a Beta($\alpha$, $\beta$) prior is equivalent to the information contained in a previously observed data set: based on the update rules, it corresponds to $\alpha - 1$ prior successes and $\beta - 1$ prior failures.
note: U(0,1) = Beta(1,1), so $\alpha - 1 = 0$ and $\beta - 1 = 0$ (no prior data)

Strategy 3: Solve for values of α and β that give:
the desired expectation
the desired equivalent prior sample size, which for a Beta($\alpha$, $\beta$) is $\alpha + \beta - 2$

Strategy 4: choose α and β so that a prior probability interval reflects your beliefs about π
(can look at prior credible intervals)

Strategy 5: Solve for values of α and β that give:
the desired expectation
the desired variance
(under this view the equivalent prior sample size is $\alpha + \beta$; see the R sketch below)
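For Strategy 5 the two equations can be solved in closed form: since $\operatorname{Var}(\pi) = m(1-m)/(\alpha+\beta+1)$ with $m = E[\pi]$, we get $\alpha+\beta = m(1-m)/\operatorname{Var}(\pi) - 1$, then $\alpha = m(\alpha+\beta)$ and $\beta = (1-m)(\alpha+\beta)$. A small R sketch (function name illustrative):

# Solve for Beta(alpha, beta) with a desired mean m and variance v
beta_from_moments <- function(m, v) {
  s <- m * (1 - m) / v - 1            # s = alpha + beta
  c(alpha = m * s, beta = (1 - m) * s)
}
beta_from_moments(4/10, 24/1100)      # gives alpha = 4, beta = 6 (as in Example 2)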

Assumptions:
Parameter: π
Likelihood: $p(y\,|\,\pi)$ is Binomial($n$, $\pi$)
Prior: $p(\pi)$ is Beta($\alpha$, $\beta$)
Posterior: $p(\pi\,|\,y) \propto p(y\,|\,\pi)\,p(\pi) \propto \pi^{y}(1-\pi)^{n-y}\,\pi^{\alpha-1}(1-\pi)^{\beta-1} = \pi^{y+\alpha-1}(1-\pi)^{n-y+\beta-1}$

Expectation Based On Sample

$$E[\pi\,|\,y] = \frac{y+\alpha}{n+\alpha+\beta}
= \frac{y}{n+\alpha+\beta} + \frac{\alpha}{n+\alpha+\beta}
= \frac{n}{n+\alpha+\beta}\cdot\frac{y}{n} + \frac{\alpha+\beta}{n+\alpha+\beta}\cdot\frac{\alpha}{\alpha+\beta}
= \frac{n}{n+\alpha+\beta}\,(\text{sample mean}) + \frac{\alpha+\beta}{n+\alpha+\beta}\,(\text{prior mean})$$

note: as $n \to \infty$ the weight on the prior, $\frac{\alpha+\beta}{n+\alpha+\beta}$, goes to 0, so

$$\lim_{n\to\infty}\left[\frac{n}{n+\alpha+\beta}\cdot\frac{y}{n} + \frac{\alpha+\beta}{n+\alpha+\beta}\cdot\frac{\alpha}{\alpha+\beta}\right] = \frac{y}{n}$$

this is why $\alpha + \beta$ is interpreted as an equivalent prior sample size: it plays the same role for the prior that $n$ plays for the data
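A numerical illustration of the weighted-average form, using the Example 1 numbers (uniform prior, y = 14, n = 20):

# Posterior mean as a weighted average of sample mean and prior mean
a <- 1; b <- 1; y <- 14; n <- 20
w <- n / (n + a + b)                      # weight given to the data
w * (y / n) + (1 - w) * (a / (a + b))     # weighted average ...
(y + a) / (n + a + b)                     # ... equals the posterior mean directly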

Posterior Predictive Distribution

$$n^* = \text{sample size of the "future" sample}, \qquad y^* = \text{number of "successes" in the future sample}$$

$$p(y^*\,|\,y) = \int_0^1 p(y^*\,|\,\pi)\,p(\pi\,|\,y)\,d\pi, \qquad y^* = 0, 1, \dots, n^*$$

Derivation:

$$p(\pi\,|\,y) = \frac{p(\pi, y)}{p(y)} = \frac{p(y\,|\,\pi)\,p(\pi)}{\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi}$$

$$p(y^*, y, \pi) = p(y^*, y\,|\,\pi)\,p(\pi) = p(y^*\,|\,\pi)\,p(y\,|\,\pi)\,p(\pi)
\qquad (y^* \text{ and } y \text{ conditionally independent given } \pi)$$

$$p(y^*\,|\,y) = \frac{p(y^*, y)}{p(y)} = \frac{\int_0^1 p(y^*\,|\,\pi)\,p(y\,|\,\pi)\,p(\pi)\,d\pi}{p(y)}
= \int_0^1 p(y^*\,|\,\pi)\,\frac{p(y\,|\,\pi)\,p(\pi)}{p(y)}\,d\pi
= \int_0^1 p(y^*\,|\,\pi)\,p(\pi\,|\,y)\,d\pi$$

In general, if a Bayesian analysis has been done to estimate π using a Beta($\alpha$, $\beta$) prior and a data set with y successes in a sample of size n, then

$$\pi\,|\,y \sim \text{Beta}(\alpha_{\text{post}},\ \beta_{\text{post}}), \qquad \alpha_{\text{post}} = \alpha + y, \quad \beta_{\text{post}} = \beta + n - y$$

Deriving beta-binomial

Writing Beta($\alpha$, $\beta$) for the posterior $p(\pi\,|\,y)$:

$$p(y^*\,|\,y) = \int_0^1 \binom{n^*}{y^*}\,\pi^{y^*}(1-\pi)^{n^*-y^*}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\pi^{\alpha-1}(1-\pi)^{\beta-1}\,d\pi
= \binom{n^*}{y^*}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 \pi^{y^*+\alpha-1}(1-\pi)^{n^*-y^*+\beta-1}\,d\pi
= \binom{n^*}{y^*}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{\Gamma(y^*+\alpha)\,\Gamma(n^*-y^*+\beta)}{\Gamma(n^*+\alpha+\beta)}$$

for $y^* = 0, 1, 2, \dots, n^*$.
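In R this pmf is easiest to evaluate on the log scale; note that $\frac{\Gamma(\alpha+\beta)\,\Gamma(y^*+\alpha)\,\Gamma(n^*-y^*+\beta)}{\Gamma(\alpha)\Gamma(\beta)\,\Gamma(n^*+\alpha+\beta)} = B(y^*+\alpha,\,n^*-y^*+\beta)/B(\alpha,\beta)$, so lchoose and lbeta suffice (function name illustrative):

# Posterior predictive (beta-binomial) pmf, with Beta(a, b) the posterior for pi
dbetabinom <- function(ystar, nstar, a, b) {
  exp(lchoose(nstar, ystar) + lbeta(ystar + a, nstar - ystar + b) - lbeta(a, b))
}
sum(dbetabinom(0:5, 5, 15, 7))   # pmf sums to 1 (posterior from Example 1, nstar = 5)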

Example 2

Suppose a drug has an unknown true response rate π. Assume a Bernoulli process. Answer each of the following questions. You have to show all your work to get full credit.

a) Suppose that previous experience with similar compounds has suggested a response rate with an expectation around 4/10 and variance 24/1100. Find a Beta prior for π with mean 4/10 and variance 24/1100. Provide the parameters of the distribution explicitly. If you can't find it, use a U(0,1) prior to get partial credit.

πbeta(?,?)

$$E[\pi] = \frac{\alpha}{\alpha+\beta} = \frac{4}{10} \;\Rightarrow\; \beta = \frac{3\alpha}{2}$$

$$\operatorname{Var}(\pi) = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}
= \frac{\alpha\cdot\frac{3\alpha}{2}}{\left(\frac{5\alpha}{2}\right)^{2}\left(\frac{5\alpha}{2}+1\right)}
= \frac{\frac{3}{2}\alpha^{2}\cdot 8}{25\alpha^{2}\,(5\alpha+2)}
= \frac{12}{25(5\alpha+2)}
= \frac{24}{250\alpha+100}
= \frac{24}{1100}$$

$$\Rightarrow\; 250\alpha + 100 = 1100 \;\Rightarrow\; \alpha = 4, \qquad \beta = \frac{3\cdot 4}{2} = 6$$

$$\pi \sim \text{Beta}(4, 6)$$
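Quick check in R that Beta(4, 6) has the required prior moments:

a <- 4; b <- 6
a / (a + b)                            # prior mean = 0.4
a * b / ((a + b)^2 * (a + b + 1))      # prior variance = 24/1100 ~ 0.0218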

b) Suppose that we observe 15 positive responses (successes) out of 20 patients. Find the posterior distribution of π given y. Provide its parameters, explicitly. Justify your answer.

$$p(\pi\,|\,y) \propto \underbrace{\pi^{y}(1-\pi)^{n-y}}_{\text{likelihood}}\;\underbrace{\pi^{4-1}(1-\pi)^{6-1}}_{\text{Beta}(4,6)\text{ prior}}
= \pi^{15}(1-\pi)^{5}\cdot \pi^{3}(1-\pi)^{5}
= \pi^{18}(1-\pi)^{10}
= \pi^{19-1}(1-\pi)^{11-1}$$

so $p(\pi\,|\,y)$ is Beta(19, 11).

c) Summarize the posterior distribution by its first two moments (i.e. mean and variance).

$$E[\pi\,|\,y] = \frac{19}{30} \approx 0.633
\qquad
V[\pi\,|\,y] = \frac{19\cdot 11}{30^{2}\cdot 31} \approx 0.0075$$

d) Compute a 95% credible interval for π using the Normal approximation.

$$E[\pi\,|\,y] \pm z_{\alpha/2}\sqrt{V[\pi\,|\,y]} = \frac{19}{30} \pm 1.96\sqrt{0.0075} \approx 0.633 \pm 0.170, \ \text{i.e. approximately } (0.46,\ 0.80)$$
Exact equal-tail limits in R: qbeta(0.025, 19, 11) and qbeta(0.975, 19, 11)

note: this is the equal-tail posterior credible interval; the highest posterior density (HPD) region is instead the shortest possible interval containing the desired probability, and it need not be symmetric about the posterior mean.
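A sketch comparing the normal-approximation interval from part (d) with the exact equal-tail interval:

# 95% intervals for pi | y ~ Beta(19, 11)
a <- 19; b <- 11
m <- a / (a + b); v <- a * b / ((a + b)^2 * (a + b + 1))
m + c(-1, 1) * 1.96 * sqrt(v)         # normal approximation
qbeta(c(0.025, 0.975), a, b)          # exact equal-tail credible interval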

Example 3

Suppose that a uniform prior is placed on the proportion π (the unknown proportion of patients that respond to a new chemotherapy), and that from a random sample of 10 patients treated, 7 respond. Also suppose that a new group of 5 patients is planning to receive this new chemotherapy. Let y* denote the number in this new sample who respond. Find the posterior predictive probabilities that y* = 4 and that y* = 5.

simple example:
Posterior predictive distribution when $n^* = 1$

by definition:

$$p(y^* = 1\,|\,y) = \int_0^1 p(y^* = 1\,|\,\pi)\,p(\pi\,|\,y)\,d\pi = \int_0^1 \pi\,p(\pi\,|\,y)\,d\pi = E[\pi\,|\,y] \quad (\text{conditional posterior expectation})$$

$$p(y^* = 0\,|\,y) = \int_0^1 (1-\pi)\,p(\pi\,|\,y)\,d\pi = 1 - E[\pi\,|\,y]$$

full example:

Step 1: posterior (update rules or direct calculation)

$$p(\pi\,|\,y) \propto p(y\,|\,\pi)\,p(\pi) = \pi^{7}(1-\pi)^{3}\;\underbrace{\pi^{1-1}(1-\pi)^{1-1}}_{\text{uniform} = \text{Beta}(1,1)} = \pi^{8-1}(1-\pi)^{4-1}$$

so $\pi\,|\,y \sim \text{Beta}(8, 4)$.

Step 2: posterior predictive with $n^* = 5$

$$p(y^* = 4\,|\,y = 7) = \int_0^1 p(y^*\,|\,\pi)\,p(\pi\,|\,y)\,d\pi
= \int_0^1 \binom{5}{4}\,\pi^{4}(1-\pi)\,\frac{\Gamma(12)}{\Gamma(8)\Gamma(4)}\,\pi^{8-1}(1-\pi)^{4-1}\,d\pi
= \binom{5}{4}\,\frac{\Gamma(12)}{\Gamma(8)\Gamma(4)}\int_0^1 \pi^{12-1}(1-\pi)^{5-1}\,d\pi$$

$$= \binom{5}{4}\,\frac{\Gamma(12)}{\Gamma(8)\Gamma(4)}\cdot\frac{\Gamma(12)\,\Gamma(5)}{\Gamma(17)}
= 5\cdot\frac{11!}{7!\,3!}\cdot\frac{11!\,4!}{16!}
= 5\cdot\frac{11\cdot 10\cdot 9\cdot 8}{3!}\cdot\frac{4!}{16\cdot 15\cdot 14\cdot 13\cdot 12}
\approx 0.302$$
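Both requested predictive probabilities can be checked numerically with the beta-binomial form derived above (using base R's choose and beta; the function name is illustrative):

# Posterior from Step 1 is Beta(8, 4); future sample size nstar = 5
a <- 8; b <- 4; nstar <- 5
pred <- function(ystar) choose(nstar, ystar) * beta(ystar + a, nstar - ystar + b) / beta(a, b)
pred(4)   # ~ 0.302
pred(5)   # ~ 0.181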