3. Bayesian Inference on Binomial Proportion

Beta and Binomial Distributions

$$\text{Beta}(a,b):\qquad p(\pi) \;=\; \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\;\underbrace{\pi^{a-1}(1-\pi)^{b-1}}_{\text{kernel}}$$

for $0 \le \pi \le 1$

$$\text{Binomial:}\qquad f(y\,|\,\pi) \;=\; \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}$$

for $y = 0, 1, \dots, n$ and $0 \le \pi \le 1$

$$\binom{n}{y}\int_0^1 \pi^{y}(1-\pi)^{n-y}\,d\pi
= \binom{n}{y}\int_0^1 \pi^{(y+1)-1}(1-\pi)^{(n-y+1)-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}\int_0^1 \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\pi^{\alpha-1}(1-\pi)^{\beta-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$$

with $\alpha = y+1$, $\beta = n-y+1$; the last integral equals 1 because its integrand is a $\text{Beta}(\alpha,\beta)$ density.

$$p(\pi\,|\,y) = \frac{p(\pi, y)}{p(y)} = \frac{p(y\,|\,\pi)\,p(\pi)}{\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi}
= \frac{\binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}}{\binom{n}{y}\,\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}}
= \frac{\Gamma(n+2)}{\Gamma(y+1)\,\Gamma(n-y+1)}\,\pi^{(y+1)-1}(1-\pi)^{(n-y+1)-1}$$

for $0 \le \pi \le 1$

note: Uniform(0,1) is a Beta($\alpha = 1$, $\beta = 1$)
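Quick numerical check (not part of the derivation): in R, using for illustration y = 14 successes out of n = 20 (the Example 1 data below), the normalized likelihood under a uniform prior matches the Beta(y + 1, n − y + 1) density.

# Sanity check: normalized binomial likelihood = Beta(y+1, n-y+1) density
n <- 20; y <- 14                                   # illustrative values (Example 1)
lik <- function(p) dbinom(y, n, p)                 # p(y | pi); uniform prior p(pi) = 1
marg <- integrate(lik, 0, 1)$value                 # p(y), the normalizing constant
pi_grid <- seq(0.01, 0.99, by = 0.01)
max(abs(lik(pi_grid) / marg - dbeta(pi_grid, y + 1, n - y + 1)))   # ~ 0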



Beta Expectation

WTS: if $Y \sim \text{Beta}(a,b)$, then $E(Y) = \dfrac{a}{a+b}$

$$E[Y] = \int_0^1 y\,\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,y^{a-1}(1-y)^{b-1}\,dy
= \int_0^1 \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,y^{a}(1-y)^{b-1}\,dy
= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^1 y^{(a+1)-1}(1-y)^{b-1}\,dy$$

$$= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\cdot\frac{\Gamma(a+1)\,\Gamma(b)}{\Gamma(a+b+1)}\int_0^1 \frac{\Gamma(a+b+1)}{\Gamma(a+1)\Gamma(b)}\,y^{(a+1)-1}(1-y)^{b-1}\,dy
\qquad (\text{note } \alpha^{*} = a+1,\ \beta^{*} = b)$$

$$= \frac{\Gamma(a+b)\,\Gamma(a+1)}{\Gamma(a)\,\Gamma(a+b+1)}
= \frac{\Gamma(a+b)\;a\,\Gamma(a)}{\Gamma(a)\;(a+b)\,\Gamma(a+b)}
= \frac{a}{a+b}$$

Beta Variance

WTS: $\operatorname{Var}(Y) = \dfrac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$

$$E[X^{2}] = \int_0^1 x^{2}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,dx
= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 x^{(\alpha+2)-1}(1-x)^{\beta-1}\,dx$$

$$= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{\Gamma(\alpha+2)\,\Gamma(\beta)}{\Gamma(\alpha+\beta+2)}\int_0^1 \frac{\Gamma(\alpha+\beta+2)}{\Gamma(\alpha+2)\Gamma(\beta)}\,x^{(\alpha+2)-1}(1-x)^{\beta-1}\,dx
\qquad (\alpha^{*} = \alpha+2,\ \beta^{*} = \beta)$$

$$= \frac{\Gamma(\alpha+\beta)\,\Gamma(\alpha+2)\,\Gamma(\beta)}{\Gamma(\alpha)\,\Gamma(\beta)\,\Gamma(\alpha+\beta+2)}
= \frac{\Gamma(\alpha+\beta)\,(\alpha+1)\,\alpha\,\Gamma(\alpha)}{\Gamma(\alpha)\,(\alpha+\beta+1)(\alpha+\beta)\,\Gamma(\alpha+\beta)}
= \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}$$

$$\operatorname{Var}(X) = E[X^{2}] - \left(E[X]\right)^{2}
= \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)} - \frac{\alpha^{2}}{(\alpha+\beta)^{2}}
= \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$$
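As a quick check of both formulas (with assumed illustrative values a = 3, b = 5), compare against simulated Beta draws in R:

# Verify the Beta mean/variance formulas by simulation (a, b are arbitrary examples)
a <- 3; b <- 5
draws <- rbeta(1e6, a, b)
c(formula = a / (a + b), simulated = mean(draws))                       # mean
c(formula = a * b / ((a + b)^2 * (a + b + 1)), simulated = var(draws))  # variance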

Example 1

Asked prior to a study of a new chemotherapy, an oncologist said that she would expect
50% of patients to respond. You then obtain a sample of 20 treated patients, of whom 14 respond.
Answer each of the following questions. You have to show all your work to get full credit.

a) Let π denote the unknown proportion of patients that respond to new chemotherapy.
Use a U (0, 1) prior for π. Find the posterior distribution of π given y. Provide its
parameters, explicitly. Justify your answer.

$$g(\pi\,|\,y) = \frac{g(\pi)\,f(y\,|\,\pi)}{\int_0^1 g(\pi)\,f(y\,|\,\pi)\,d\pi}
= \frac{\binom{20}{14}\,\pi^{14}(1-\pi)^{6}}{\binom{20}{14}\int_0^1 \pi^{(14+1)-1}(1-\pi)^{(6+1)-1}\,d\pi}
= \frac{\pi^{14}(1-\pi)^{6}}{\Gamma(15)\Gamma(7)/\Gamma(22)}
= \frac{\Gamma(22)}{\Gamma(15)\,\Gamma(7)}\,\pi^{15-1}(1-\pi)^{7-1}$$

i.e. $\pi\,|\,y \sim \text{Beta}(15, 7)$.

Alternative 1:
Posterior distribution is proportional to (prior distribution)×(likelihood)

$$p(\pi\,|\,y) \propto p(\pi)\,p(y\,|\,\pi)$$
$$p(\pi\,|\,y) \propto \pi^{14}(1-\pi)^{6}\;\underbrace{\pi^{1-1}(1-\pi)^{1-1}}_{\text{uniform}}$$
$$p(\pi\,|\,y) \propto \pi^{15-1}(1-\pi)^{7-1}$$
$$\pi\,|\,y \sim \text{Beta}(\alpha = 15,\ \beta = 7)$$

Alternative 2:

Updating rules for a uniform (Beta(1,1)) prior: $\alpha_{\text{post}} = y + 1 = 15$, $\beta_{\text{post}} = (n - y) + 1 = 7$
(the general conjugate Beta version is given below)

b) Summarize the posterior distribution by its first two moments (i.e. mean and variance).
If you remember the formulas, write them and use them.

$$E[\pi\,|\,y] = \frac{\alpha}{\alpha+\beta} = \frac{15}{15+7} \approx 0.6818
\qquad
\operatorname{Var}(\pi\,|\,y) = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)} = \frac{15\cdot 7}{22^{2}\cdot 23} \approx 0.0094$$

c) Using your posterior distribution, find P[π > 0.7]. Please show all your work.

By hand, using the normal approximation:

$$P[\pi > 0.7] \approx P\!\left[\frac{\pi - 0.6818}{\sqrt{0.0094}} > \frac{0.7 - 0.6818}{\sqrt{0.0094}}\right] = P[Z > 0.19] \approx 0.42$$

Exact, in R: 1 - pbeta(0.7, 15, 7)
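A short R sketch for parts (b) and (c), comparing the normal approximation to the exact Beta tail probability:

# Posterior for Example 1 is Beta(15, 7)
a <- 15; b <- 7
post_mean <- a / (a + b)                               # ~ 0.6818
post_var  <- a * b / ((a + b)^2 * (a + b + 1))         # ~ 0.0094
1 - pnorm(0.7, mean = post_mean, sd = sqrt(post_var))  # normal approximation
1 - pbeta(0.7, a, b)                                   # exact P[pi > 0.7 | y]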

General Beta Prior

Prior is Beta($\alpha$, $\beta$) instead of uniform

$$P(\pi\,|\,y) = \frac{P(\pi, y)}{P(y)} = \frac{P(y\,|\,\pi)\,P(\pi)}{\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi}$$

$$\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi
= \int_0^1 \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\pi^{\alpha-1}(1-\pi)^{\beta-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 \pi^{(y+\alpha)-1}(1-\pi)^{(n-y+\beta)-1}\,d\pi
= \binom{n}{y}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{\Gamma(y+\alpha)\,\Gamma(n-y+\beta)}{\Gamma(\alpha+\beta+n)}$$

Dividing $P(y\,|\,\pi)\,P(\pi)$ by this denominator, the binomial coefficient and the prior's normalizing constant cancel, leaving

$$P(\pi\,|\,y) = \frac{\Gamma(\alpha+\beta+n)}{\Gamma(y+\alpha)\,\Gamma(n-y+\beta)}\,\pi^{(y+\alpha)-1}(1-\pi)^{(n-y+\beta)-1}$$

i.e. $\pi\,|\,y \sim \text{Beta}(y+\alpha,\ n-y+\beta)$.

General update rule:

$$\alpha_{\text{post}} = \alpha + y, \qquad \beta_{\text{post}} = \beta + (n - y)$$
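The update rule is simple enough to wrap in a small R helper (the function name is just illustrative):

# Conjugate update: Beta(a, b) prior + y successes in n trials -> Beta posterior
beta_update <- function(a, b, y, n) {
  c(alpha_post = a + y, beta_post = b + (n - y))
}
beta_update(1, 1, 14, 20)   # uniform prior with the Example 1 data: Beta(15, 7)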

Choosing Parameters to Match Prior Beliefs

Strategy 1: Graph some beta densities until you find one that matches your beliefs

Strategy 2: Note that a Beta($\alpha$, $\beta$) prior is equivalent to the information contained in a previously observed data set: based on the update rules, it corresponds to $\alpha - 1$ prior successes and $\beta - 1$ prior failures.
note: U(0,1) = Beta(1,1), so $\alpha - 1 = 0$ and $\beta - 1 = 0$ (no prior data)

Strategy 3: Solve for values of α and β that give:
the desired expectation
the desired equivalent prior sample size, which for a Beta($\alpha$, $\beta$) is $\alpha + \beta - 2$

Strategy 4: choose α and β so that a prior probability interval reflects your beliefs about π
(can look at prior credible intervals)

Strategy 5: Solve for values of α and β that give:
the desired expectation
the desired variance
(under this view the equivalent prior sample size is $\alpha + \beta$; see the R sketch below)
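For Strategy 5 the two equations can be solved in closed form: since $\operatorname{Var}(\pi) = m(1-m)/(\alpha+\beta+1)$ with $m = E[\pi]$, we get $\alpha+\beta = m(1-m)/\operatorname{Var}(\pi) - 1$, then $\alpha = m(\alpha+\beta)$ and $\beta = (1-m)(\alpha+\beta)$. A small R sketch (function name illustrative):

# Solve for Beta(alpha, beta) with a desired mean m and variance v
beta_from_moments <- function(m, v) {
  s <- m * (1 - m) / v - 1            # s = alpha + beta
  c(alpha = m * s, beta = (1 - m) * s)
}
beta_from_moments(4/10, 24/1100)      # gives alpha = 4, beta = 6 (as in Example 2)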

Assumptions:
Parameter: π
Likelihood: $p(y\,|\,\pi)$ is Binomial($n$, $\pi$)
Prior: $p(\pi)$ is Beta($\alpha$, $\beta$)
Posterior: $p(\pi\,|\,y) \propto p(y\,|\,\pi)\,p(\pi) \propto \pi^{y}(1-\pi)^{n-y}\,\pi^{\alpha-1}(1-\pi)^{\beta-1} = \pi^{y+\alpha-1}(1-\pi)^{n-y+\beta-1}$

Expectation Based On Sample

$$E[\pi\,|\,y] = \frac{y+\alpha}{n+\alpha+\beta}
= \frac{y}{n+\alpha+\beta} + \frac{\alpha}{n+\alpha+\beta}
= \frac{n}{n+\alpha+\beta}\cdot\frac{y}{n} + \frac{\alpha+\beta}{n+\alpha+\beta}\cdot\frac{\alpha}{\alpha+\beta}
= \frac{n}{n+\alpha+\beta}\,(\text{sample mean}) + \frac{\alpha+\beta}{n+\alpha+\beta}\,(\text{prior mean})$$

note: as $n \to \infty$ the weight on the prior, $\frac{\alpha+\beta}{n+\alpha+\beta}$, goes to 0, so

$$\lim_{n\to\infty}\left[\frac{n}{n+\alpha+\beta}\cdot\frac{y}{n} + \frac{\alpha+\beta}{n+\alpha+\beta}\cdot\frac{\alpha}{\alpha+\beta}\right] = \frac{y}{n}$$

this is why $\alpha + \beta$ is interpreted as an equivalent prior sample size: it plays the same role for the prior that $n$ plays for the data
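A numerical illustration of the weighted-average form, using the Example 1 numbers (uniform prior, y = 14, n = 20):

# Posterior mean as a weighted average of sample mean and prior mean
a <- 1; b <- 1; y <- 14; n <- 20
w <- n / (n + a + b)                      # weight given to the data
w * (y / n) + (1 - w) * (a / (a + b))     # weighted average ...
(y + a) / (n + a + b)                     # ... equals the posterior mean directly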

Posterior Predictive Distribution

$$n^* = \text{sample size of the "future" sample}, \qquad y^* = \text{number of "successes" in the future sample}$$

$$p(y^*\,|\,y) = \int_0^1 p(y^*\,|\,\pi)\,p(\pi\,|\,y)\,d\pi, \qquad y^* = 0, 1, \dots, n^*$$

Derivation:

$$p(\pi\,|\,y) = \frac{p(\pi, y)}{p(y)} = \frac{p(y\,|\,\pi)\,p(\pi)}{\int_0^1 p(y\,|\,\pi)\,p(\pi)\,d\pi}$$

$$p(y^*, y, \pi) = p(y^*, y\,|\,\pi)\,p(\pi) = p(y^*\,|\,\pi)\,p(y\,|\,\pi)\,p(\pi)
\qquad (y^* \text{ and } y \text{ conditionally independent given } \pi)$$

$$p(y^*\,|\,y) = \frac{p(y^*, y)}{p(y)} = \frac{\int_0^1 p(y^*\,|\,\pi)\,p(y\,|\,\pi)\,p(\pi)\,d\pi}{p(y)}
= \int_0^1 p(y^*\,|\,\pi)\,\frac{p(y\,|\,\pi)\,p(\pi)}{p(y)}\,d\pi
= \int_0^1 p(y^*\,|\,\pi)\,p(\pi\,|\,y)\,d\pi$$

In general, if a Bayesian analysis has been done to estimate π using a Beta($\alpha$, $\beta$) prior and a data set with y successes in a sample of size n, then

$$\pi\,|\,y \sim \text{Beta}(\alpha_{\text{post}},\ \beta_{\text{post}}), \qquad \alpha_{\text{post}} = \alpha + y, \quad \beta_{\text{post}} = \beta + n - y$$

Deriving beta-binomial

Writing Beta($\alpha$, $\beta$) for the posterior $p(\pi\,|\,y)$:

$$p(y^*\,|\,y) = \int_0^1 \binom{n^*}{y^*}\,\pi^{y^*}(1-\pi)^{n^*-y^*}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\pi^{\alpha-1}(1-\pi)^{\beta-1}\,d\pi
= \binom{n^*}{y^*}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_0^1 \pi^{y^*+\alpha-1}(1-\pi)^{n^*-y^*+\beta-1}\,d\pi
= \binom{n^*}{y^*}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{\Gamma(y^*+\alpha)\,\Gamma(n^*-y^*+\beta)}{\Gamma(n^*+\alpha+\beta)}$$

for $y^* = 0, 1, 2, \dots, n^*$.
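In R this pmf is easiest to evaluate on the log scale; note that $\frac{\Gamma(\alpha+\beta)\,\Gamma(y^*+\alpha)\,\Gamma(n^*-y^*+\beta)}{\Gamma(\alpha)\Gamma(\beta)\,\Gamma(n^*+\alpha+\beta)} = B(y^*+\alpha,\,n^*-y^*+\beta)/B(\alpha,\beta)$, so lchoose and lbeta suffice (function name illustrative):

# Posterior predictive (beta-binomial) pmf, with Beta(a, b) the posterior for pi
dbetabinom <- function(ystar, nstar, a, b) {
  exp(lchoose(nstar, ystar) + lbeta(ystar + a, nstar - ystar + b) - lbeta(a, b))
}
sum(dbetabinom(0:5, 5, 15, 7))   # pmf sums to 1 (posterior from Example 1, nstar = 5)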

Example 2

Suppose a drug has an unknown true response rate π. Assume a Bernoulli process. Answer each of the following questions. You have to show all your work to get full credit.

a) Suppose that previous experience with similar compounds has suggested a response rate with an expectation around 4/10 and variance 24/1100. Find a Beta prior for π with mean 4/10 and variance 24/1100. Provide the parameters of the distribution explicitly. If you can't find it, use a U(0,1) prior to get partial credit.

πbeta(?,?)

$$E[\pi] = \frac{\alpha}{\alpha+\beta} = \frac{4}{10} \;\Rightarrow\; \beta = \frac{3\alpha}{2}$$

$$\operatorname{Var}(\pi) = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}
= \frac{\alpha\cdot\frac{3\alpha}{2}}{\left(\frac{5\alpha}{2}\right)^{2}\left(\frac{5\alpha}{2}+1\right)}
= \frac{\frac{3}{2}\alpha^{2}\cdot 8}{25\alpha^{2}\,(5\alpha+2)}
= \frac{12}{25(5\alpha+2)}
= \frac{24}{250\alpha+100}
= \frac{24}{1100}$$

$$\Rightarrow\; 250\alpha + 100 = 1100 \;\Rightarrow\; \alpha = 4, \qquad \beta = \frac{3\cdot 4}{2} = 6$$

$$\pi \sim \text{Beta}(4, 6)$$
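Quick check in R that Beta(4, 6) has the required prior moments:

a <- 4; b <- 6
a / (a + b)                            # prior mean = 0.4
a * b / ((a + b)^2 * (a + b + 1))      # prior variance = 24/1100 ~ 0.0218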

b) Suppose that we observe 15 positive responses (successes) out of 20 patients. Find the posterior distribution of π given y. Provide its parameters, explicitly. Justify your answer.

$$p(\pi\,|\,y) \propto \underbrace{\pi^{y}(1-\pi)^{n-y}}_{\text{likelihood}}\;\underbrace{\pi^{4-1}(1-\pi)^{6-1}}_{\text{Beta}(4,6)\text{ prior}}
= \pi^{15}(1-\pi)^{5}\cdot \pi^{3}(1-\pi)^{5}
= \pi^{18}(1-\pi)^{10}
= \pi^{19-1}(1-\pi)^{11-1}$$

so $p(\pi\,|\,y)$ is Beta(19, 11).

c) Summarize the posterior distribution by its first two moments (i.e. mean and variance).

$$E[\pi\,|\,y] = \frac{19}{30} \approx 0.633
\qquad
V[\pi\,|\,y] = \frac{19\cdot 11}{30^{2}\cdot 31} \approx 0.0075$$

d) Compute a 95% credible interval for π using the Normal approximation.

$$E[\pi\,|\,y] \pm z_{\alpha/2}\sqrt{V[\pi\,|\,y]} = \frac{19}{30} \pm 1.96\sqrt{0.0075} \approx 0.633 \pm 0.170, \ \text{i.e. approximately } (0.46,\ 0.80)$$
Exact equal-tail limits in R: qbeta(0.025, 19, 11) and qbeta(0.975, 19, 11)

note: this is the equal-tail posterior credible interval; the highest posterior density (HPD) region is instead the shortest possible interval containing the desired probability, and it need not be symmetric about the posterior mean.
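A sketch comparing the normal-approximation interval from part (d) with the exact equal-tail interval:

# 95% intervals for pi | y ~ Beta(19, 11)
a <- 19; b <- 11
m <- a / (a + b); v <- a * b / ((a + b)^2 * (a + b + 1))
m + c(-1, 1) * 1.96 * sqrt(v)         # normal approximation
qbeta(c(0.025, 0.975), a, b)          # exact equal-tail credible interval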

Example 3

Suppose that a uniform prior is placed on the proportion π (the unknown proportion of patients that respond to a new chemotherapy), and that from a random sample of 10 patients treated, 7 respond. Also suppose that a new group of 5 patients is planning to receive this new chemotherapy. Let y* denote the number in this new sample who respond. Find the posterior predictive probabilities that y* = 4 and that y* = 5.

simple example:
Posterior predictive distribution when $n^* = 1$

by definition:

$$p(y^* = 1\,|\,y) = \int_0^1 p(y^* = 1\,|\,\pi)\,p(\pi\,|\,y)\,d\pi = \int_0^1 \pi\,p(\pi\,|\,y)\,d\pi = E[\pi\,|\,y] \quad (\text{conditional posterior expectation})$$

$$p(y^* = 0\,|\,y) = \int_0^1 (1-\pi)\,p(\pi\,|\,y)\,d\pi = 1 - E[\pi\,|\,y]$$

full example:

Step 1: posterior (update rules or direct calculation)

$$p(\pi\,|\,y) \propto p(y\,|\,\pi)\,p(\pi) = \pi^{7}(1-\pi)^{3}\;\underbrace{\pi^{1-1}(1-\pi)^{1-1}}_{\text{uniform} = \text{Beta}(1,1)} = \pi^{8-1}(1-\pi)^{4-1}$$

so $\pi\,|\,y \sim \text{Beta}(8, 4)$.

Step 2: posterior predictive with $n^* = 5$

$$p(y^* = 4\,|\,y = 7) = \int_0^1 p(y^*\,|\,\pi)\,p(\pi\,|\,y)\,d\pi
= \int_0^1 \binom{5}{4}\,\pi^{4}(1-\pi)\,\frac{\Gamma(12)}{\Gamma(8)\Gamma(4)}\,\pi^{8-1}(1-\pi)^{4-1}\,d\pi
= \binom{5}{4}\,\frac{\Gamma(12)}{\Gamma(8)\Gamma(4)}\int_0^1 \pi^{12-1}(1-\pi)^{5-1}\,d\pi$$

$$= \binom{5}{4}\,\frac{\Gamma(12)}{\Gamma(8)\Gamma(4)}\cdot\frac{\Gamma(12)\,\Gamma(5)}{\Gamma(17)}
= 5\cdot\frac{11!}{7!\,3!}\cdot\frac{11!\,4!}{16!}
= 5\cdot\frac{11\cdot 10\cdot 9\cdot 8}{3!}\cdot\frac{4!}{16\cdot 15\cdot 14\cdot 13\cdot 12}
\approx 0.302$$
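Both requested predictive probabilities can be checked numerically with the beta-binomial form derived above (using base R's choose and beta; the function name is illustrative):

# Posterior from Step 1 is Beta(8, 4); future sample size nstar = 5
a <- 8; b <- 4; nstar <- 5
pred <- function(ystar) choose(nstar, ystar) * beta(ystar + a, nstar - ystar + b) / beta(a, b)
pred(4)   # ~ 0.302
pred(5)   # ~ 0.181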