ST303 Stochastic Simumlation

Class 6


Philipp Sterzinger

p.sterzinger@lse.ac.uk


Office hours: Thursday, COL 8.05



14 November 2025

Questions

Question 1

Suppose that you can easily generate random variables from the density \(g(x)\), \(x > 0\). How would you generate random variables from density \[f(x) = \frac{g(x)}{m (1 + x)} \,,\] where \[ m = \int_0^\infty \frac{g(y)}{1 + y} dy \,. \]

Question 1

Rejection sampling

  • \(\sup_x f(x) / g(x) = 1 / m\)
  • Acceptance region: \(U \leq 1 / (1 + X)\)

Alogrithm:

  1. Sample \(X\) according to \(g(x)\)
  2. Sample \(U \sim \mathcal{U}(0,1)\)
  3. If \(U \leq 1 / (1 + X)\), accept, else reject

Question 2

Generate a sample from the density \[ f(x) = \frac{1}{\log(2)(x + 1)(x + 2)}, \quad x > 0 \] using an acceptance-rejection (AR) method and the density \[ g(x) = \frac{1}{(1 + x)^2} \,. \] A direct inverse transformation method was used in Exercise 3, 2f. Discuss the pros and cons of each method.

Question 2

CDF of proposal distribution

Integration by substitution:

  • \(y = \frac{1}{1 + x}\)
  • \(dy = -(1 + x)^{-2} dx\)
  • \(x = 1 \to y = 1\), \(x = z \to y = 1 / (1 + z)\) \[ \begin{aligned} G_X(z) &= \int_0^z \frac{1}{(1 + x)^2} dx \\ &= - \int_1^{\frac{1}{1 + z}} 1 dy \\ % &= - \left[y \right]_1^{\frac{1}{1 + z}} \\ &= \frac{z}{1 + z} \,. \end{aligned} \]

Inverse CDF of proposal distribution

\[G_X^{-1}(p) = \frac{p}{1 - p}\]

Question 2

Acceptance region

\[ \begin{aligned} c &= \sup_{x > 0} \frac{f(x)}{g(x)} \\ &= \sup_{x > 0} \frac{1}{\log(2)} \frac{(x + 1)^2}{(x + 1) (x + 2)} \\ &= \frac{1}{\log(2)} \sup_{x > 0} \frac{x + 1}{x + 2} \\ &= \frac{1}{\log(2)} \end{aligned} \]

Acceptance region: \(U \leq \frac{X + 1}{X + 2}\)

Question 2

Inverse transform sampling

  • More difficult to derive
  • Easy to sample from
  • \(100\%\) acceptance rate

Rejection sampling

  • Easier to set up
  • computationally more demanding
  • acceptance rate of \(\approx 70\%\)

Question 2

Code
f <- function(x){
    1 / (log(2) * (x + 1) * (x + 2))
}

g <- function(x){
    1 / (1 + x)^2
}

c <- 1 / log(2) 

G_inv <- function(p){
    p / (1 - p)
}

F_inv <- function(p){
    (1 - 2^p) / (2^(p - 1) - 1)
}

N <- 10^6
set.seed(123) 

U_1 <- runif(N) 
U_2 <- runif(N) 
X_1 <- F_inv(U_1) 
X_2 <- G_inv(U_1) 
X_2 <- X_2[U_2 <= f(X_2) / (c * g(X_2))] 

ar <- length(X_2) / N 
th_ar <- 1 / c
print(rbind(th_ar, ar)) 
           [,1]
th_ar 0.6931472
ar    0.6925490
Code
curve(f, from = 0, to = F_inv(.975),  col = "black", lwd = 2, lty = 2)  
lines(density(X_1[X_1 < F_inv(.975)]), col = "red", lwd = 2, xlab = "", main = "") 
lines(density(X_2[X_2 < F_inv(.975)]), col = "blue", lwd = 2) 

legend("topright", 
    legend = c("pdf", "Inv. transform", "Rejection sampling"), 
    col = c("black", "red", "blue"), 
    lty = c(2,  1, 1), 
    lwd = rep(2, 3)) 

Question 3

Let \(X\) be a standard normal random variable. Let \(Y = X − 1\), conditionally on \(X\) being between \(1\) and \(2\). We would like to generate random variables with density the same as \(Y\)

\[ f(y) = \frac{\exp \left\{- \frac{(1 + y)^2}{2} \right\}}{\sqrt{2 \pi} (\Psi(2) - \Psi(1))} \]

The natural thing to do is to sample standard normal random variables and accept only those that lie between \(1\) and \(2\) and return \(X − 1\) for the accepted values. As this is wasteful, an obvious way to double the acceptance rate is to accept the variables that lie between \(−2\) and \(−1\) as well as those between \(1\) and \(2\) and return \(|X| − 1\) for those values. An alternative is using an acceptance-rejection method based on the density

\[ g(y) = \frac{a \exp\{-ay \}}{1 - \exp\{-a\}} \,, \] for \(0 < y < 1.\) Find the best value for \(a\). Is this method (using the optimal \(a\)) better than the previous one?

Question 3

Rejection sampling

  1. Find a goood upper bound \(c(a)\) for \(\sup_{y \in (0, 1)} \, f(y) / g(y)\)
  2. Minimize \(c(a)\) with respect to \(a\)
  3. Set up rejection sampling as per usual

Question 3

Upper bound \(c(a)\)

  1. \[ \begin{aligned} \underset{y \in (0,1)}{\sup} \, \frac{f(y)}{g(y)} &= \underbrace{\frac{1 - \exp\{-a\}}{a \sqrt{2 \pi} (\Psi(2) - \Psi(1))}}_{C} \underset{y \in (0,1)}{\sup} \, \exp \left\{ - \frac{(1 + y)^2}{2} + ay \right\} \\ &\leq C \underset{y \in \Re}{\sup} \, \exp \left\{ - \frac{(1 + y)^2}{2} + ay \right\} \\ \end{aligned} \]

  2. \[ \begin{aligned} \arg\underset{y \in \Re}{ \min} \, \exp \left\{ - \frac{(1 + y)^2}{2} + ay \right\} &= \arg\underset{y \in \Re}{ \min} \, - \frac{(1 + y)^2}{2} + ay \\ &= \arg\underset{y \in \Re}{\textrm{solve}} \, \{a - 1 - y = 0 \} \end{aligned} \]

  3. \[ \underset{y \in (0,1)}{\sup} \, \frac{f(y)}{g(y)} \leq C \exp\left\{\frac{a^2}{2} - a \right\} \equiv c(a) \]

Question 3

Minimising \(c(a)\)

Use R

Code
c_a <- function(a){
    C <- (1 - exp(-a)) / (a * sqrt(2 * pi) * (pnorm(2) - pnorm(1))) 
    
    C * exp(a^2 / 2 - a) 
}

opt_a <- optimize(c_a, lower = 0, upper = 10, maximum = FALSE)$minimum

cat("Optimal a: ", opt_a) 
Optimal a:  1.387895

Question 3

Inverse transform sampling from envelope

\[ \begin{aligned} G(z) &= \int_0^z \frac{a \exp\{-xa\}}{1 - \exp\{-a\}} dx \\ &= -\left[\frac{ \exp\{-xa\}}{1 - \exp\{-a\}} \right]_0^z \\ &= \frac{1 - \exp\{-xa\} }{1 - \exp\{-a\}} \end{aligned} \]

\[ \begin{aligned} G(x)^{-1} &= - \frac{\log \left(1 - p(1 - \exp\{-a\}) \right)}{a} \end{aligned} \]

Question 3

Business as usual

Code
f <- function(y){
    ds <- rep(0, length = length(y))
    inds <- (y > 0) & (y < 1)
    ds[inds] <- dnorm(1 + y[inds]) / (pnorm(2) - pnorm(1)) 
    ds
}

g <- function(y, a){
    ds <- rep(0, length = length(y))
    inds <- (y > 0) & (y < 1)
    ds[inds] <-  a * exp(-a * y[inds]) / (1 - exp(-a)) 
    ds
}

G_inv <- function(p, a){
    - log(1 - p * (1 - exp(-a))) / a 
}

N <- 10^5 
set.seed(123)

# Rejection sampling 
U_1 <- runif(N) 
U_2 <- runif(N)
X <- G_inv(U_1, opt_a) 
Y <- X[U_2 < exp(-(1 + X)^2 / 2 + opt_a * X + opt_a - opt_a^2 / 2)]

# Natural method 1 & 2
X_2 <- rnorm(N) 
Y_2 <- X_2[(X_2 < 2) & (X_2 > 1)] - 1 
X_3 <- abs(X_2)
Y_3 <- X_3[(X_3 < 2) & (X_3 > 1)] - 1 

cat("Rejection sampling AR: ", length(Y) / N, ", Natural method AR: ", length(Y_2) / N, ", Natural method 2 AR: ", length(Y_3) / N)
Rejection sampling AR:  0.96499 , Natural method AR:  0.13852 , Natural method 2 AR:  0.27333
Code
curve(f, from = 0, to = 1,  col = "black", lwd = 2, lty = 2)  
lines(density(Y), col = "red", lwd = 2, xlab = "", main = "") 
lines(density(Y_2), col = "blue", lwd = 2) 
lines(density(Y_3), col = "green", lwd = 2) 

legend("topright", 
    legend = c("pdf", "Rejection sampling", "NM 1", "NM 2"), 
    col = c("black", "red", "blue", "green"), 
    lty = c(2, 1, 1, 1), 
    lwd = rep(2, 4))