p.sterzinger@lse.ac.uk
03 February 2025
Theorem 1 (Bayes’ Theorem) \[ \Pr \left( B \mid A \right) = \frac{\Pr(A \mid B) \Pr(B)}{\Pr(A)} \]
Proof. Using the definition of conditional probability,
\[ \Pr(A \mid B) = \frac{\Pr(A \cap B) }{\Pr(B)} \] and \[ \Pr(B \mid A) = \frac{\Pr(A \cap B) }{\Pr(A)} \] so that \[ \Pr(B \mid A) = \frac{\Pr(A \cap B) }{\Pr(A)} = \frac{\Pr(A \mid B) \Pr(B)}{\Pr(A)} \]
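The identity can be checked on a small numerical example. The sketch below uses exact rational arithmetic; the three probabilities are hypothetical values chosen only for illustration and do not come from the assignment.

```python
from fractions import Fraction

# Hypothetical probabilities for two events A and B (assumed values).
p_a_and_b = Fraction(1, 10)   # Pr(A and B)
p_a = Fraction(3, 10)         # Pr(A)
p_b = Fraction(1, 5)          # Pr(B)

# Conditional probabilities from the definition.
p_a_given_b = p_a_and_b / p_b   # Pr(A | B)
p_b_given_a = p_a_and_b / p_a   # Pr(B | A)

# Right-hand side of Bayes' theorem: Pr(A | B) Pr(B) / Pr(A).
bayes_rhs = p_a_given_b * p_b / p_a

print(p_b_given_a, bayes_rhs)
```

Both printed values agree, as the proof requires.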
\(Y = (Y_1, \ldots, Y_n)^\top \,, Y_i \in \mathcal{Y}\)
Likelihood: \(f_{Y \mid \theta}(y)\)
Prior: \(\pi(\theta)\,, \theta \in \Theta\)
Posterior distribution of \(\theta\) given \(y\):
\[ f_{\theta \mid y}(\theta) = \frac{f_{Y \mid \theta}(y) \pi(\theta)}{\int_{\Theta} f_{Y \mid \theta}(y) \pi(\theta) d \theta} \propto f_{Y \mid \theta}(y) \pi(\theta) \]
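When the integral in the denominator has no closed form, the posterior can still be approximated by evaluating likelihood times prior on a grid and normalising numerically. The following is a minimal sketch of that idea, not a recommended general-purpose method. The function name `grid_posterior`, the data `y`, and the choice of an Exponential likelihood with a Gamma(2, 1) prior kernel are all assumptions made for illustration.

```python
import math

def grid_posterior(log_lik, log_prior, grid):
    """Posterior density values on an equally spaced grid over Theta."""
    # Unnormalised log posterior: log f(y | theta) + log pi(theta).
    log_post = [log_lik(t) + log_prior(t) for t in grid]
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_post)
    unnorm = [math.exp(v - top) for v in log_post]
    # Riemann sum standing in for the integral in the denominator.
    step = grid[1] - grid[0]
    total = sum(unnorm) * step
    return [u / total for u in unnorm]

# Hypothetical example: Exponential(theta) data, Gamma(2, 1) prior kernel.
y = [0.5, 1.2, 0.3, 2.0]
step = 0.001
grid = [(i + 0.5) * step for i in range(20_000)]      # theta in (0, 20)
log_lik = lambda t: len(y) * math.log(t) - t * sum(y)
log_prior = lambda t: math.log(t) - t                  # Gamma(2, 1) up to a constant
post = grid_posterior(log_lik, log_prior, grid)

mass = sum(post) * step                                # total mass of the approximation
post_mean = sum(t * p for t, p in zip(grid, post)) * step
print(mass, post_mean)
```

Working on the log scale means the prior and likelihood only need to be known up to multiplicative constants, which mirrors the proportionality sign in the formula.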
Given likelihood \(f_{Y \mid \theta}(y)\), a prior \(\pi(\theta)\) is called a conjugate prior if the posterior distribution \(f_{\theta \mid y}(\theta)\) belongs to the same family of distributions as the prior distribution.
This is convenient because we get a closed-form expression for the posterior.
We can also see directly how our prior beliefs are updated by the observed data.
Let \(x = (x_1, \ldots , x_n)\) be a random sample from an Exponential(\(\lambda\)) distribution. Set the prior for \(\lambda\) to be a Gamma(\(\alpha\), \(\beta\)) and derive its posterior distribution.
\[ f_{X \mid \lambda}(x) = \prod_{i = 1}^n f_{X_i \mid \lambda}(x_i) = \prod_{i = 1}^n \lambda \exp \{-\lambda x_i \} = \lambda^n \exp\left\{- \lambda \sum_{i = 1}^n x_i \right\} \]
\[ \pi(\lambda) \propto \lambda^{\alpha-1}\exp\{-\beta \lambda\} \]
\[ \begin{aligned} f_{\lambda \mid x}(\lambda) &\propto \lambda^n \exp\left\{- \lambda \sum_{i = 1}^n x_i \right\} \lambda^{\alpha-1}\exp\{-\beta \lambda\} \\ &= \lambda^{n + \alpha - 1} \exp\left\{- \lambda \left(\sum_{i = 1}^n x_i + \beta \right)\right\} \\ &\sim \textrm{Gamma}\left(\alpha + n, \beta + \sum_{i = 1}^n x_i\right) \end{aligned} \]
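The update rule derived above can be written as a two-line function. The sketch below, with hypothetical data and prior parameters, also illustrates that updating on all observations at once gives the same posterior as updating one observation at a time. The name `gamma_update` is an assumption, and the Gamma distribution is taken in the shape and rate parametrisation used for the prior above.

```python
def gamma_update(alpha, beta, x):
    """Gamma(alpha, beta) prior -> posterior (shape, rate) after observing x."""
    return alpha + len(x), beta + sum(x)

x = [0.5, 1.2, 0.3, 2.0]    # hypothetical observations
alpha0, beta0 = 2.0, 1.0    # hypothetical prior parameters

# Update on the whole sample at once.
batch = gamma_update(alpha0, beta0, x)

# Update sequentially, using each posterior as the next prior.
seq = (alpha0, beta0)
for xi in x:
    seq = gamma_update(seq[0], seq[1], [xi])

print(batch, seq)
print(batch[0] / batch[1])  # posterior mean, shape / rate
```

The shape grows by one per observation and the rate grows by the observed value, so the posterior mean moves from the prior mean \(\alpha / \beta\) towards \(n / \sum_i x_i\) as data accumulate.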
Let \(x = (x_1, \ldots , x_n)\) be a random sample from a \(N(\theta, \sigma^2)\) distribution with \(\sigma^2\) known.
- Show that the likelihood satisfies \[ f_{X \mid \theta}(x) \propto \exp \left \{- \frac{n (\bar{x} - \theta)^2 + (n - 1) S^2}{2 \sigma^2} \right\}\,,\] where \(\bar{x}\) is the sample mean and \(S^2\) is the sample variance \[ S^2 = \frac{1}{n - 1} \sum_{i = 1}^n (x_i - \bar{x})^2 \,. \] Hence the likelihood simplifies to \[ f_{X \mid \theta}(x) \propto \exp \left \{- \frac{ (\bar{x} - \theta)^2}{2 \sigma^2 / n } \right\} \,.\]
- Set the prior for \(\theta\) to be \(N(\mu, \tau^2)\) and derive its posterior distribution.
\[ \begin{aligned} f_{X \mid \theta}(x) &= \prod_{i = 1}^n f_{X_i \mid \theta}(x_i) \\ &= \prod_{i = 1}^n \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left \{- \frac{(x_i-\theta)^2}{2 \sigma^2} \right\} \\ &= (2 \pi \sigma^2)^{-n / 2} \exp \left \{- \frac{\sum_{i = 1}^n (x_i-\theta)^2}{2 \sigma^2} \right\} \end{aligned} \]
Hence it suffices to show:
\[ \sum_{i = 1}^n (x_i-\theta)^2 = n(\bar{x} - \theta)^2 + (n - 1) S^2 \]
Note that
\[ \begin{aligned} \sum_{i = 1}^n (x_i - \theta)^2 &= \sum_{i = 1}^n \left(x_i^2 - 2 \theta x_i + \theta^2 \right) \\ &= \sum_{i = 1}^n x_i^2 - 2 \theta \sum_{i = 1}^n x_i + n \theta^2 \\ &= \sum_{i = 1}^n x_i^2 - 2 \theta n \bar{x} + n \theta^2 \\ &= \sum_{i = 1}^n x_i^2 \textcolor{red}{- n \bar{x}^2 + n \bar{x}^2} - 2 n \theta \bar{x} + n \theta^2 \\ &= \left(\sum_{i = 1}^n x_i^2 \textcolor{red}{ - n \bar{x}^2}\right) + n\theta^2 - 2n \theta \bar{x} \textcolor{red}{+ n \bar{x}^2} \\ &= \left(\sum_{i = 1}^n x_i^2{ - n \bar{x}^2}\right) + n \left(\bar{x} - \theta\right)^2 \\ &= (n - 1) S^2 + n \left(\bar{x} - \theta\right)^2 \end{aligned} \]
as required.
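The sum-of-squares decomposition is easy to confirm numerically. The sketch below compares both sides for simulated data and a few values of \(\theta\). The sample, the seed, and the function names are assumptions made for illustration.

```python
import random

def sum_sq_direct(x, theta):
    """Left-hand side: sum of (x_i - theta)^2."""
    return sum((xi - theta) ** 2 for xi in x)

def sum_sq_decomposed(x, theta):
    """Right-hand side: n (xbar - theta)^2 + (n - 1) S^2."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)   # sample variance
    return n * (xbar - theta) ** 2 + (n - 1) * s2

random.seed(0)
x = [random.gauss(1.0, 2.0) for _ in range(50)]        # hypothetical sample

for theta in (-1.0, 0.0, 2.5):
    print(theta, sum_sq_direct(x, theta), sum_sq_decomposed(x, theta))
```

The two columns agree up to floating-point rounding for every \(\theta\), since only the term \(n(\bar{x} - \theta)^2\) depends on \(\theta\).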
Thus
\[ \begin{aligned} f_{X \mid \theta}(x) &= (2 \pi \sigma^2)^{-n / 2} \exp \left \{- \frac{(n - 1) S^2 + n \left(\bar{x} - \theta\right)^2}{2 \sigma^2} \right\} \\ &= (2 \pi \sigma^2)^{-n / 2} \exp \left \{- \frac{(n - 1) S^2 }{2 \sigma^2} \right\} \exp \left \{- \frac{ \left(\bar{x} - \theta\right)^2}{2 \sigma^2 / n } \right\} \\ & \propto \exp \left \{- \frac{ \left(\bar{x} - \theta\right)^2}{2 \sigma^2 / n } \right\} \,, \end{aligned} \]
where the last line follows because, with \(\sigma^2\) known, the dropped factors do not depend on \(\theta\).
\[ \pi(\theta) \propto \exp \left\{ -\frac{(\theta - \mu)^2}{2\tau^2}\right\} \]
\[ \begin{aligned} f_{\theta \mid x}(\theta) & \propto \exp \left \{- \frac{ \left(\bar{x} - \theta\right)^2}{2 \sigma^2 / n } \right\} \exp \left\{ -\frac{(\theta - \mu)^2}{2\tau^2}\right\} \\ &= \exp \left \{- \frac{ \left(\bar{x} - \theta\right)^2}{2 \sigma^2 / n } - \frac{(\theta - \mu)^2}{2\tau^2}\right\} \\ &= \exp \left \{- \frac{ \textcolor{red}{\bar{x}^2} - 2 \theta \bar{x} + {\theta^2}}{2 \sigma^2 / n } - \frac{\theta^2 - 2\theta\mu +\textcolor{red}{\mu^2}}{2\tau^2}\right\} \\ &= \exp \left \{- \frac{ \theta^2 - 2 \theta \bar{x}}{2 \sigma^2 / n } - \frac{\theta^2 - 2\theta\mu }{2\tau^2}\right\} \exp\left\{-\frac{ \textcolor{red}{\bar{x}^2}}{2 \sigma^2 / n } - \frac{\textcolor{red}{\mu^2}}{2\tau^2} \right\} \\ &\propto \exp \left \{- \frac{ \theta^2 - 2 \theta \bar{x}}{2 \sigma^2 / n } - \frac{\theta^2 - 2\theta\mu }{2\tau^2}\right\} \end{aligned} \]
\[ a = \bar{x}, \quad b = \sigma^2 / n, \quad c = \mu, \quad d = \tau^2 \]
\[ \begin{aligned} \exp \left \{- \frac{ \theta^2 - 2 \theta \bar{x}}{2 \sigma^2 / n } - \frac{\theta^2 - 2\theta\mu }{2\tau^2}\right\} &= \exp \left \{- \frac{ \theta^2 - 2 \theta a}{2b } - \frac{\theta^2 - 2\theta c }{2d}\right\} \\ &= \exp \left \{- \frac{ d(\theta^2 - 2 \theta a) + b(\theta^2 - 2\theta c )}{2bd}\right\} \\ &= \exp \left \{- \frac{ (b + d) \theta^2 - (da + bc)2 \theta}{2bd}\right\} \\ &= \exp \left \{- \frac{ \theta^2 - 2 \theta (da + bc) / (b + d)}{2bd / (b + d)}\right\} \\ & \sim N \left( \frac{da + bc}{b + d}, \frac{bd}{b + d} \right)\,, \end{aligned} \]
\[ \frac{da + bc}{b + d} = \frac{\sigma^2 \mu / n + \tau^2 \bar{x}}{\tau^2 + \sigma^2 / n}, \quad \frac{bd}{b + d} = \frac{\tau^2 \sigma^2 / n}{\tau^2 + \sigma^2 / n} \]
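The posterior parameters can be computed directly from the formulas above. The sketch below uses hypothetical inputs and a function name, `normal_posterior`, that is an assumption. It also checks two equivalent readings of the result: the posterior mean is a weighted average of \(\bar{x}\) and \(\mu\) with weight \(\tau^2 / (\tau^2 + \sigma^2 / n)\) on \(\bar{x}\), and the posterior precision is \(n / \sigma^2 + 1 / \tau^2\).

```python
def normal_posterior(xbar, n, sigma2, mu, tau2):
    """Posterior (mean, variance) of theta for known sigma2 and a N(mu, tau2) prior."""
    b = sigma2 / n            # b: variance of the sample mean
    d = tau2                  # d: prior variance
    mean = (d * xbar + b * mu) / (b + d)
    var = b * d / (b + d)
    return mean, var

# Hypothetical inputs, chosen only for illustration.
xbar, n, sigma2, mu, tau2 = 2.0, 10, 4.0, 0.0, 1.0
mean, var = normal_posterior(xbar, n, sigma2, mu, tau2)

# Weighted-average form of the posterior mean.
weight = tau2 / (tau2 + sigma2 / n)
print(mean, weight * xbar + (1 - weight) * mu)

# Precision form of the posterior variance.
print(1 / var, n / sigma2 + 1 / tau2)
```

As \(n\) grows the weight on \(\bar{x}\) tends to one, so the data dominate the prior.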
Let \(x = (x_1, \ldots, x_n)\) be a random sample from a \(N(0, \sigma^2)\) distribution. Set the prior for \(\sigma^2\) to be \(\textrm{IGamma}(\alpha, \beta)\) and derive its posterior distribution.
\[ \begin{aligned} f_{X \mid \sigma^2}(x) &= (2 \pi \sigma^2)^{-n / 2} \exp \left \{- \frac{\sum_{i = 1}^n x_i^2}{2 \sigma^2} \right\} \\ &\propto (\sigma^2)^{-n / 2} \exp \left \{- \frac{\sum_{i = 1}^n x_i^2}{2 \sigma^2} \right\} \\ \end{aligned} \]
\[ \pi(\sigma^2) \propto (\sigma^2)^{-\alpha - 1} \exp\left(-\frac{\beta}{\sigma^2}\right) \]
\[ \begin{aligned} f_{\sigma^2 \mid x}(\sigma^2) &\propto (\sigma^2)^{-n / 2} \exp \left \{- \frac{\sum_{i = 1}^n x_i^2}{2 \sigma^2} \right\} (\sigma^2)^{-\alpha - 1} \exp\left(-\frac{\beta}{\sigma^2}\right) \\ &= (\sigma^2)^{-n / 2 - \alpha - 1} \exp \left \{- \frac{\sum_{i = 1}^n x_i^2}{2 \sigma^2} -\frac{\beta}{\sigma^2} \right\} \\ &= (\sigma^2)^{-n / 2 - \alpha - 1} \exp \left \{- \frac{\beta + \sum_{i = 1}^n x_i^2 / 2 }{\sigma^2} \right\} \\ &\sim \textrm{IGamma} \left(n / 2 + \alpha, \beta + \sum_{i = 1}^n x_i^2 / 2 \right) \end{aligned} \]
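The resulting update can be sketched in a few lines. The data, the prior parameters, and the function name `inverse_gamma_update` are assumptions made for illustration. The posterior mean uses the standard fact that an inverse-gamma distribution with shape \(a > 1\) and scale \(b\) has mean \(b / (a - 1)\).

```python
def inverse_gamma_update(alpha, beta, x):
    """IGamma(alpha, beta) prior -> posterior (shape, scale) for sigma^2."""
    shape = alpha + len(x) / 2
    scale = beta + sum(xi ** 2 for xi in x) / 2
    return shape, scale

x = [1.0, -2.0, 0.5, 1.5]    # hypothetical observations from N(0, sigma^2)
alpha0, beta0 = 3.0, 2.0     # hypothetical prior parameters

shape, scale = inverse_gamma_update(alpha0, beta0, x)

# The mean exists only when the shape exceeds one.
post_mean = scale / (shape - 1) if shape > 1 else float("nan")
print(shape, scale, post_mean)
```

Each observation adds one half to the shape and \(x_i^2 / 2\) to the scale, so the data enter the posterior only through \(n\) and \(\sum_i x_i^2\).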
Philipp Sterzinger - ST308 Assignment 1