ST303 Stochastic Simumlation

Class 7


Philipp Sterzinger

p.sterzinger@lse.ac.uk


Office hours: Tuesday, COL 5.03



19 November 2025

Recap

Variance reduction

Monte Carlo integral approximation

\[ \begin{aligned} I(g) &= \int_0^1 g(x) dx \\ &= \textrm{E}_{\mathcal{U}(0,1)}\left[g(U) \right] \\ &= \lim\limits_{n \to \infty} \frac{1}{n} \sum_{i = 1}^n g(U_i) \quad (\textrm{almost surely}) \\ & \approx \frac{1}{n} \sum_{i = 1}^n g(U_i) \\ & \equiv \hat{I}(g) \end{aligned} \]

Approximation error

\(\textrm{var}\left(\hat{I}(g) - I(g) \right) = \mathcal{O}(n^{-1})\) but may be substantial in small samples

Variance reduction

Suppose we have two random variables \(X,Y\) with the same mean and variance \(\sigma^2\). Then

\[ \textrm{var} \left(\frac{X + Y}{2} \right) = \frac{1}{2} \sigma^2 (1 + \rho) < \sigma^2 \iff \rho < 1 \,, \] where \(\rho \in (-1, 1)\) is the correlation coefficient.

Variance reduction

BUT: We can always double the sample size in our simulation

Hence, we should compare \((X+Y) / 2\) to the mean of two independent samples of \(X\), say \(X_1, X_2\).

To acheive effective variance reduction, we thus need

\[ \textrm{var} \left(\frac{X + Y}{2} \right) = \frac{1}{2} \sigma^2 (1 + \rho) < \textrm{var} \left(\frac{X_1 + X_2}{2} \right) = \frac{1}{2} \sigma^2 \iff \rho < 0 \,, \]

Variance reduction

Antithetic variates

Idea: minimize variance of MC estimator \(\hat{I}(g)\) by cooking up second estimator \(\tilde{I}(g)\) with same mean and variance but negative correlation.

Theorem 1 If \(g(x)\) is a monotone function, and \(U \sim \mathcal{U}(0,1)\) a uniform random variable, then \[ \textrm{cov}(g(U), g(1-U)) \leq 0 \]

Thus, letting \(\tilde{I}(g) = \frac{1}{2n} \sum_{i = 1}^n g(U_i) + \frac{1}{2n} \sum_{i = 1}^n g(1 - U_i)\) we have that

\[ \textrm{var}\left( \tilde{I}(g) \right) < \textrm{var}\left( \hat{I}(g) \right) \]

Questions

Question 1

Improve your estimates for all integrals in Exercise \(3\) of Seminar \(1\) and Exercise \(2\) of Seminar \(2\) by the method of Antithetic Variables. Approximate the new standard errors.

Question 1.a

\[ I = \int_0^1 \exp\{-x^{1.5}\} dx = \textrm{E}_{\mathcal{U}(0,1)} \left[ g(U) \right] \] for \(g(x) = \exp\{-x^{1.5}\}\).

Standard MC

\[ \hat{I} = \frac{1}{n}\sum_{i = 1}^n g(U_i), \quad U_i \overset{\textrm{i.i.d.}}{\sim} \mathcal{U}(0, 1) \]

Question 1.a

Antithetic variates

\[ \tilde{I} = \frac{1}{2n} \sum_{i = 1}^n g(U_i) + \frac{1}{2n} \sum_{i = 1}^n g(1 - U_i), \quad U_i \overset{\textrm{i.i.d.}}{\sim} \mathcal{U}(0, 1) \]

Monotonicity:

Code
f <- function(x){exp(-1.5 * x)} 
plot.new() 
curve(f, from = 0, to = 1, lwd = 2, col = "red", ylab = "g(x)")

Question 1.a

Antithetic variates

Code
g <- function(x){exp(-x^(1.5))} 

N <- 10^5
set.seed(123)

U <- runif(N) 

mc_mean <- mean(g(U)) 
mc_sd <- sd(g(U)) 

av_vals <- 0.5 * g(U) + 0.5 * g(1 - U)
av_mean <- mean(av_vals) 
av_sd <- sd(av_vals) 

reps <- 1000
Us <- matrix(runif(reps * N), nrow = N, ncol = reps) 

av_means <- vapply(1:reps, function(col){U <- Us[, col]; mean(0.5 * g(U) + 0.5 * g(1 - U))}, 0.0)
mc_means <- vapply(1:reps, function(col){U <- Us[, col]; mean(g(U))}, 0.0)

av <- cbind(av_mean, av_sd) 
mc <- cbind(mc_mean, mc_sd) 

comparison <- rbind(mc, av) 
colnames(comparison) <- c("Mean", "Standard deviation") 
rownames(comparison) <- c("Monte Carlo", "Antithetic variates") 

par(mfrow = c(1, 2))

hist(g(U), col = "#5b5be0b0", freq = FALSE, breaks = 100, xlab = "MC/AV summands", ylab = "Distribution of Summands", main = "Histograms of MC summands") 
hist(av_vals, freq=FALSE, breaks=100, col="#e05b5bb0", border="#e05b5bb0", add = TRUE) 
abline(v = 0.699792, col = "green", lty = 2, lwd = 2) 

legend("topright", 
       legend = c("MC", "AV", "truth"), 
       fill = c("#5b5be0b0", "#e05b5bb0", NA), 
       border = NA, 
       lty = c(NA, NA, 2), 
       lwd = c(NA, NA, 2), 
       pch = c(NA, NA, NA), 
       col = c(NA, NA, "green"))

hist(mc_means, col = "#5b5be0b0",  freq = FALSE,  ylab = "Distribution of MC estimates", xlab = "MC means", main = "Histogram of MC estimates") 
hist(av_means, col = "#e05b5bb0",  freq = FALSE,  ylab = "Distribution of MC/AV estimates", xlab = "MC/AV means", main = "Histograms of MC/AV estimates", add = TRUE, border = "#e05b5bb0") 
abline(v = 0.699792, col = "green", lty = 2, lwd = 2) 
legend("topright", 
       legend = c("MC", "AV", "truth"), 
       fill = c("#5b5be0b0", "#e05b5bb0", NA), 
       border = NA, 
       lty = c(NA, NA, 2), 
       lwd = c(NA, NA, 2), 
       pch = c(NA, NA, NA), 
       col = c(NA, NA, "green"))
Code
comparison
                         Mean Standard deviation
Monte Carlo         0.7003113        0.196876091
Antithetic variates 0.6998029        0.004407909

Question 1.b

\[ \begin{aligned} I &= \int_0^\infty \exp\{-x^{1.5}\} dx \\ &= \int_0^1 \frac{1}{y^2}\exp\left\{-\left(\frac{1}{y} - 1 \right)^{1.5}\right\} dy \\ &= \textrm{E}_{\mathcal{U}(0,1)} \left[ h(U) \right] \end{aligned} \] for \[ h(y) = \frac{1}{y^2}\exp\left\{-\left(\frac{1}{y}-1\right)^{1.5}\right\} \]

Question 1.b

Monotonicity of transformation

Code
h <- function(x){exp(- (1 / x - 1)^(3 / 2)) / x^2}  
plot.new() 
curve(h, from = 0, to = 1, lwd = 2, col = "red", ylab = "h(x)")

Question 1.b

Code
N <- 10^5
set.seed(123)

U <- runif(N) 

cat("cov(h(u), h(1-u)): ", cov(h(U), h(1-U)))
cov(h(u), h(1-u)):  -0.1782384
Code
mc_mean <- mean(h(U)) 
mc_sd <- sd(h(U)) 

av_vals <- 0.5 * h(U) + 0.5 * h(1 - U)
av_mean <- mean(av_vals) 
av_sd <- sd(av_vals) 


av <- cbind(av_mean, av_sd) 
mc <- cbind(mc_mean, mc_sd) 

comparison <- rbind(mc, av) 
colnames(comparison) <- c("Mean", "Standard deviation") 
rownames(comparison) <- c("Monte Carlo", "Antithetic variates") 

hist(g(U), 
  col = "#5b5be0b0", 
  freq = FALSE, breaks = 100, 
  xlab = "Estimates", 
  ylab = "Distribution of Estimates", 
  main = "Histograms of estimates", 
  xlim = c(.3, max(c(h(U), av_vals)))) 
hist(av_vals, freq=FALSE, breaks=100, col="#e05b5bb0", add = TRUE) 
abline(v = 0.902745, col = "green", lty = 2, lwd = 2) 

legend("topright", 
       legend = c("MC", "AV", "truth"), 
       fill = c("#5b5be0b0", "#e05b5bb0", NA), 
       border = NA, 
       lty = c(NA, NA, 2), 
       lwd = c(NA, NA, 2), 
       pch = c(NA, NA, NA), 
       col = c(NA, NA, "green"))
Code
comparison
                         Mean Standard deviation
Monte Carlo         0.9015331          0.6264433
Antithetic variates 0.9032674          0.3275354

Question 2

The loss to insurance portfolio \(A\) is of the form \[ S_A = \sum_{i = 1}^{n_A} X_i \,, \] where \(N_A\) is a Poisson random variable with parameter \(10\) and \(X_i\) are i.i.d random variables that are log-normal with parameters \(2\) and \(8\) (\(8\) is the variance of the underlying normal, not the standard deviation). The loss to another (independent of \(A\)) insurance portfolio \(B\) is of the form \[ S_B = \sum_{i = 1}^{n_B} Y_i \,, \] where \(N_B\) is a Poisson random variable with parameter \(10\) and \(Y_i\) are i.i.d random variables that are log-normal with parameter \(0\) and \(12\).

Calculate \(\Pr(S_A > S_B)\) and give a \(95\%\) confidence interval for your estimate.

Question 2

Simulating \(S_A\)

  1. Draw \(n_A\) from the Poisson distribution
  2. Given \(n_A = n\), draw \(n\) independent samples \(X_i\)
  3. Compute \(\tilde{S}_A = \sum_{i = 1}^n X_i\)

\[ \begin{aligned} \Pr \left(\tilde{S}_A \leq t \right) &= \sum_{k = 0}^\infty \Pr\left(\sum_{i = 1}^k X_i \leq t {\Bigg |} n_A = k \right) \Pr(n_A = k) \\ &= \sum_{k = 0}^\infty \Pr\left(\sum_{i = 1}^k X_i \leq t \cap n_A = k \right) \\ &= \Pr\left(\bigcup_{k = 0}^\infty \left\{\sum_{i = 1}^{k} X_i \leq t \cap n_A = k \right\} \right) \\ &= \Pr\left(\sum_{i = 1}^{n_A} X_i \leq t \right) \\ &= \Pr(S_A \leq t) \end{aligned} \]

Question 2

\[ \begin{aligned} \Pr(S_A > S_B) &= \mathrm{E} \left[\mathbb{1} \left\{S_A > S_B \right\} \right] \\ & \overset{a.s.}{=}\lim\limits_{n \to \infty} \frac{1}{n} \sum_{i = 1}^n \mathbb{1} \left\{S_A > S_B \right\} \\ & \approx \frac{1}{n} \sum_{i = 1}^n \mathbb{1} \left\{S_A > S_B \right\} \end{aligned} \]

Algorithm

For \(n\) reps, do:

  1. Draw lengths \(n_A\), \(n_B\) from respective Poisson distributions
  2. Draw \(n_A (n_B)\) samples of \(X_i (Y_i)\)
  3. Compute \(S_A\), \(S_B\)

\(\Pr(S_A > S_B) \approx \frac{1}{n} \sum_{i = 1}^n \mathbb{1} \left\{S_A > S_B \right\}\)

Question 2

Code
## Method 1 
insurance_loss <- function(N, lambdapois = 1, meanlog = 0, sdlog = 1){
  Ns <- rpois(N, lambdapois) 
  vapply(Ns, function(x){sum(exp(rnorm(x, mean = meanlog, sd = sdlog)))}, 0.0) 
}

N <- 10^5 
set.seed(123)

ptm <- proc.time()
S_A <- insurance_loss(N, 10, 2, sqrt(8)) 
S_B <- insurance_loss(N, 10, 0, sqrt(12))   
sim_time_method_1 <- (proc.time() - ptm)[3]

pr_method_1 <- mean(S_A > S_B)

## Method 2 
CP_A <- function(x){
  n <- rpois(1, 10)
  x <- exp(rnorm(n, 2, sqrt(8)))
  s <- sum(x)
  return(s)
}

CP_B <- function(x){
  n <- rpois(1, 10)
  x <- exp(rnorm(n, 0, sqrt(12)))
  s <- sum(x)
  return(s)
}

set.seed(123)
ptm <- proc.time()
x <- 1:10^5 
CP_SA <- sapply(x, CP_A)
CP_SB <- sapply(x, CP_B)
sim_time_method_2 <- (proc.time() - ptm)[3]

pr_method_2 <- mean(CP_SA > CP_SB)


## Method 3 
set.seed(123) 
ptm <- proc.time()
CP_A <- rep(0, N)
CP_B <- rep(0, N)
N_A <- rpois(N, 10)
N_B <- rpois(N, 10)
for(i in 1 : N){
  CP_A[i] <- sum(exp(rnorm(N_A[i], 2, sqrt(8))))
  CP_B[i] <- sum(exp(rnorm(N_B[i], 0, sqrt(12))))
}

sim_time_method_3 <- (proc.time() - ptm)[3]
pr_method_3 <- mean(CP_A > CP_B)

cbind(pr_method_1, pr_method_2, pr_method_3)
     pr_method_1 pr_method_2 pr_method_3
[1,]     0.67875     0.67804     0.67954
Code
cbind(sim_time_method_1, sim_time_method_2, sim_time_method_3)
        sim_time_method_1 sim_time_method_2 sim_time_method_3
elapsed             0.245             0.321             0.178

Question 3

Same as Question 2 but now \(n_A = n_B \sim \textrm{Pois}(10)\).

Code
## Method 1 
insurance_loss <- function(N, lambdapois = 10, meanlog = c(2, 0), sdlog = c(sqrt(8), sqrt(12))){
  Ns <- rpois(N, lambdapois) 
  vapply(Ns, function(x){
     sum(exp(rnorm(x, meanlog[1], sdlog[1])) 
      - exp(rnorm(x, meanlog[2], sdlog[2])))}, 
    0.0) 
}

N <- 10^5 
set.seed(123)

ptm <- proc.time()
ins_loss <- insurance_loss(N) 
sim_time_method_1 <- (proc.time() - ptm)[3]

pr_method_1 <- mean(ins_loss > 0)

## Method 2 
CP_AB <- function(x){
  n <- rpois(1, 10)
  x <- exp(rnorm(n, 2, sqrt(8)))
  y <- exp(rnorm(n, 0, sqrt(12)))
  s <- sum(x - y)
  return(s)
}

set.seed(123)
ptm <- proc.time()
x <- 1:10^5 
CP_SAB <- sapply(x, CP_AB)
sim_time_method_2 <- (proc.time() - ptm)[3]
pr_method_2 <- mean(CP_SAB > 0)


## Method 3 
set.seed(123) 

insurance_loss_loop <- function(N, lambdapois = 10, meanlog = c(2, 0), sdlog = c(sqrt(8), sqrt(12))){
  CP_SAB_loop <- rep(0, N)
  N_AB <- rpois(N, lambdapois)
  for(i in 1 : N){
    CP_SAB_loop[i] <- sum(exp(rnorm(N_AB[i], meanlog[1], sdlog[1])) 
      - exp(rnorm(N_AB[i], meanlog[2], sdlog[2])))
  }
  CP_SAB_loop 
}

ptm <- proc.time()
ins_loss_loop <- insurance_loss_loop(N) 
sim_time_method_3 <- (proc.time() - ptm)[3]
pr_method_3 <- mean(ins_loss_loop > 0)

cbind(pr_method_1, pr_method_2, pr_method_3)
     pr_method_1 pr_method_2 pr_method_3
[1,]     0.69181     0.69373     0.69181
Code
cbind(sim_time_method_1, sim_time_method_2, sim_time_method_3)
        sim_time_method_1 sim_time_method_2 sim_time_method_3
elapsed             0.222             0.232             0.181

Question 4

Improve your estimate in Question 2 and 3 by the method of antithetic variables. Do at least one of them and approximate the new standard errors.

Question 4

Recall that \[ X_i = e^{N_i}, \, N_i \sim \mathrm{N}(2,8) \] and note that \(4 - N_i \sim \mathrm{N}(2,8)\).

Now \[ 4 - N_i \sim \mathrm{N}(2,8) \]

and

\[ \textrm{cov}(N_i, 4 - N_i) = - \textrm{var}(N_i) \implies \rho = -1 \]

Futher \(\exp\{x\}\) is an increasing monotonic transformation, so that

\[ \textrm{cov}(\exp\{4 - N_i\}, \exp\{N_i\}) < 0 \]

Question 4

Hence, we can use \[ \bar{S}_A = \sum_{i = 1}^n \exp\{4 - N_i\} \] as the antithetic variate to \(S_A = \sum_{i = 1}^n \exp\{N_i\}\)

Similarly, for \(S_B\) we use the antithetic variate \[ \bar{S}_B = \sum_{i = 1}^n \exp\{ - M_i\} \] for \(S_B =\sum_{i = 1}^n \exp\{M_i\}\), \(M_i \sim \mathcal{N}(0, 12)\)

Question 4

Estimate \(\Pr(S_A > S_B)\) as

\[ \Pr(S_A > S_B) = \textrm{E} [\mathbb{1}\{S_A > S_B\}] \approx \frac{1}{N} \sum_{i = 1}^N \frac{\mathbb{1}\left\{S_A^i > S_B^i \right\} + \mathbb{1}\left\{\bar{S}_A^i > \bar{S}_B^i \right\}}{2} \]

Code
## Method 1 
pr_insurance_loss <- function(N, lambdapois = 10, meanlog = c(2, 0), sdlog = c(sqrt(8), sqrt(12))){
  Ns <- rpois(N, lambdapois) 
  loss <- vapply(Ns, function(x){
      X <- rnorm(x, meanlog[1], sdlog[1])
      Y <-  rnorm(x, meanlog[2], sdlog[2])
      loss <- sum(exp(X)) - sum(exp(Y))  
      antithetic_loss <-   sum(exp(4 - X)) - sum(exp(-Y))   
      ((loss > 0) + (antithetic_loss > 0)) / 2
  }, 
  0.0) 
  c(mean(loss), sd(loss))
}

set.seed(123)
av_est <- pr_insurance_loss(10^5)

ins_loss <- insurance_loss(10^5) > 0
mc_est <- c(mean(ins_loss), sd(ins_loss)) 

rbind(av_est, mc_est)
          [,1]      [,2]
av_est 0.69204 0.3140726
mc_est 0.69342 0.4610757