p.sterzinger@lse.ac.uk
Office hours: Tuesday, COL 5.03
19 November 2025
\[ \begin{aligned} I(g) &= \int_0^1 g(x) dx \\ &= \textrm{E}_{\mathcal{U}(0,1)}\left[g(U) \right] \\ &= \lim\limits_{n \to \infty} \frac{1}{n} \sum_{i = 1}^n g(U_i) \quad (\textrm{almost surely}) \\ & \approx \frac{1}{n} \sum_{i = 1}^n g(U_i) \\ & \equiv \hat{I}(g) \end{aligned} \]
\(\textrm{var}\left(\hat{I}(g) - I(g) \right) = \mathcal{O}(n^{-1})\) but may be substantial in small samples
Suppose we have two random variables \(X,Y\) with the same mean and variance \(\sigma^2\). Then
\[ \textrm{var} \left(\frac{X + Y}{2} \right) = \frac{1}{2} \sigma^2 (1 + \rho) < \sigma^2 \iff \rho < 1 \,, \] where \(\rho \in (-1, 1)\) is the correlation coefficient.
BUT: We can always double the sample size in our simulation
Hence, we should compare \((X+Y) / 2\) to the mean of two independent samples of \(X\), say \(X_1, X_2\).
To acheive effective variance reduction, we thus need
\[ \textrm{var} \left(\frac{X + Y}{2} \right) = \frac{1}{2} \sigma^2 (1 + \rho) < \textrm{var} \left(\frac{X_1 + X_2}{2} \right) = \frac{1}{2} \sigma^2 \iff \rho < 0 \,, \]
Idea: minimize variance of MC estimator \(\hat{I}(g)\) by cooking up second estimator \(\tilde{I}(g)\) with same mean and variance but negative correlation.
Theorem 1 If \(g(x)\) is a monotone function, and \(U \sim \mathcal{U}(0,1)\) a uniform random variable, then \[ \textrm{cov}(g(U), g(1-U)) \leq 0 \]
Thus, letting \(\tilde{I}(g) = \frac{1}{2n} \sum_{i = 1}^n g(U_i) + \frac{1}{2n} \sum_{i = 1}^n g(1 - U_i)\) we have that
\[ \textrm{var}\left( \tilde{I}(g) \right) < \textrm{var}\left( \hat{I}(g) \right) \]
Improve your estimates for all integrals in Exercise \(3\) of Seminar \(1\) and Exercise \(2\) of Seminar \(2\) by the method of Antithetic Variables. Approximate the new standard errors.
\[ I = \int_0^1 \exp\{-x^{1.5}\} dx = \textrm{E}_{\mathcal{U}(0,1)} \left[ g(U) \right] \] for \(g(x) = \exp\{-x^{1.5}\}\).
\[ \hat{I} = \frac{1}{n}\sum_{i = 1}^n g(U_i), \quad U_i \overset{\textrm{i.i.d.}}{\sim} \mathcal{U}(0, 1) \]
\[ \tilde{I} = \frac{1}{2n} \sum_{i = 1}^n g(U_i) + \frac{1}{2n} \sum_{i = 1}^n g(1 - U_i), \quad U_i \overset{\textrm{i.i.d.}}{\sim} \mathcal{U}(0, 1) \]
Monotonicity:
g <- function(x){exp(-x^(1.5))}
N <- 10^5
set.seed(123)
U <- runif(N)
mc_mean <- mean(g(U))
mc_sd <- sd(g(U))
av_vals <- 0.5 * g(U) + 0.5 * g(1 - U)
av_mean <- mean(av_vals)
av_sd <- sd(av_vals)
reps <- 1000
Us <- matrix(runif(reps * N), nrow = N, ncol = reps)
av_means <- vapply(1:reps, function(col){U <- Us[, col]; mean(0.5 * g(U) + 0.5 * g(1 - U))}, 0.0)
mc_means <- vapply(1:reps, function(col){U <- Us[, col]; mean(g(U))}, 0.0)
av <- cbind(av_mean, av_sd)
mc <- cbind(mc_mean, mc_sd)
comparison <- rbind(mc, av)
colnames(comparison) <- c("Mean", "Standard deviation")
rownames(comparison) <- c("Monte Carlo", "Antithetic variates")
par(mfrow = c(1, 2))
hist(g(U), col = "#5b5be0b0", freq = FALSE, breaks = 100, xlab = "MC/AV summands", ylab = "Distribution of Summands", main = "Histograms of MC summands")
hist(av_vals, freq=FALSE, breaks=100, col="#e05b5bb0", border="#e05b5bb0", add = TRUE)
abline(v = 0.699792, col = "green", lty = 2, lwd = 2)
legend("topright",
legend = c("MC", "AV", "truth"),
fill = c("#5b5be0b0", "#e05b5bb0", NA),
border = NA,
lty = c(NA, NA, 2),
lwd = c(NA, NA, 2),
pch = c(NA, NA, NA),
col = c(NA, NA, "green"))
hist(mc_means, col = "#5b5be0b0", freq = FALSE, ylab = "Distribution of MC estimates", xlab = "MC means", main = "Histogram of MC estimates")
hist(av_means, col = "#e05b5bb0", freq = FALSE, ylab = "Distribution of MC/AV estimates", xlab = "MC/AV means", main = "Histograms of MC/AV estimates", add = TRUE, border = "#e05b5bb0")
abline(v = 0.699792, col = "green", lty = 2, lwd = 2)
legend("topright",
legend = c("MC", "AV", "truth"),
fill = c("#5b5be0b0", "#e05b5bb0", NA),
border = NA,
lty = c(NA, NA, 2),
lwd = c(NA, NA, 2),
pch = c(NA, NA, NA),
col = c(NA, NA, "green")) Mean Standard deviation
Monte Carlo 0.7003113 0.196876091
Antithetic variates 0.6998029 0.004407909
\[ \begin{aligned} I &= \int_0^\infty \exp\{-x^{1.5}\} dx \\ &= \int_0^1 \frac{1}{y^2}\exp\left\{-\left(\frac{1}{y} - 1 \right)^{1.5}\right\} dy \\ &= \textrm{E}_{\mathcal{U}(0,1)} \left[ h(U) \right] \end{aligned} \] for \[ h(y) = \frac{1}{y^2}\exp\left\{-\left(\frac{1}{y}-1\right)^{1.5}\right\} \]
cov(h(u), h(1-u)): -0.1782384
mc_mean <- mean(h(U))
mc_sd <- sd(h(U))
av_vals <- 0.5 * h(U) + 0.5 * h(1 - U)
av_mean <- mean(av_vals)
av_sd <- sd(av_vals)
av <- cbind(av_mean, av_sd)
mc <- cbind(mc_mean, mc_sd)
comparison <- rbind(mc, av)
colnames(comparison) <- c("Mean", "Standard deviation")
rownames(comparison) <- c("Monte Carlo", "Antithetic variates")
hist(g(U),
col = "#5b5be0b0",
freq = FALSE, breaks = 100,
xlab = "Estimates",
ylab = "Distribution of Estimates",
main = "Histograms of estimates",
xlim = c(.3, max(c(h(U), av_vals))))
hist(av_vals, freq=FALSE, breaks=100, col="#e05b5bb0", add = TRUE)
abline(v = 0.902745, col = "green", lty = 2, lwd = 2)
legend("topright",
legend = c("MC", "AV", "truth"),
fill = c("#5b5be0b0", "#e05b5bb0", NA),
border = NA,
lty = c(NA, NA, 2),
lwd = c(NA, NA, 2),
pch = c(NA, NA, NA),
col = c(NA, NA, "green")) Mean Standard deviation
Monte Carlo 0.9015331 0.6264433
Antithetic variates 0.9032674 0.3275354
The loss to insurance portfolio \(A\) is of the form \[ S_A = \sum_{i = 1}^{n_A} X_i \,, \] where \(N_A\) is a Poisson random variable with parameter \(10\) and \(X_i\) are i.i.d random variables that are log-normal with parameters \(2\) and \(8\) (\(8\) is the variance of the underlying normal, not the standard deviation). The loss to another (independent of \(A\)) insurance portfolio \(B\) is of the form \[ S_B = \sum_{i = 1}^{n_B} Y_i \,, \] where \(N_B\) is a Poisson random variable with parameter \(10\) and \(Y_i\) are i.i.d random variables that are log-normal with parameter \(0\) and \(12\).
Calculate \(\Pr(S_A > S_B)\) and give a \(95\%\) confidence interval for your estimate.
\[ \begin{aligned} \Pr \left(\tilde{S}_A \leq t \right) &= \sum_{k = 0}^\infty \Pr\left(\sum_{i = 1}^k X_i \leq t {\Bigg |} n_A = k \right) \Pr(n_A = k) \\ &= \sum_{k = 0}^\infty \Pr\left(\sum_{i = 1}^k X_i \leq t \cap n_A = k \right) \\ &= \Pr\left(\bigcup_{k = 0}^\infty \left\{\sum_{i = 1}^{k} X_i \leq t \cap n_A = k \right\} \right) \\ &= \Pr\left(\sum_{i = 1}^{n_A} X_i \leq t \right) \\ &= \Pr(S_A \leq t) \end{aligned} \]
\[ \begin{aligned} \Pr(S_A > S_B) &= \mathrm{E} \left[\mathbb{1} \left\{S_A > S_B \right\} \right] \\ & \overset{a.s.}{=}\lim\limits_{n \to \infty} \frac{1}{n} \sum_{i = 1}^n \mathbb{1} \left\{S_A > S_B \right\} \\ & \approx \frac{1}{n} \sum_{i = 1}^n \mathbb{1} \left\{S_A > S_B \right\} \end{aligned} \]
For \(n\) reps, do:
\(\Pr(S_A > S_B) \approx \frac{1}{n} \sum_{i = 1}^n \mathbb{1} \left\{S_A > S_B \right\}\)
## Method 1
insurance_loss <- function(N, lambdapois = 1, meanlog = 0, sdlog = 1){
Ns <- rpois(N, lambdapois)
vapply(Ns, function(x){sum(exp(rnorm(x, mean = meanlog, sd = sdlog)))}, 0.0)
}
N <- 10^5
set.seed(123)
ptm <- proc.time()
S_A <- insurance_loss(N, 10, 2, sqrt(8))
S_B <- insurance_loss(N, 10, 0, sqrt(12))
sim_time_method_1 <- (proc.time() - ptm)[3]
pr_method_1 <- mean(S_A > S_B)
## Method 2
CP_A <- function(x){
n <- rpois(1, 10)
x <- exp(rnorm(n, 2, sqrt(8)))
s <- sum(x)
return(s)
}
CP_B <- function(x){
n <- rpois(1, 10)
x <- exp(rnorm(n, 0, sqrt(12)))
s <- sum(x)
return(s)
}
set.seed(123)
ptm <- proc.time()
x <- 1:10^5
CP_SA <- sapply(x, CP_A)
CP_SB <- sapply(x, CP_B)
sim_time_method_2 <- (proc.time() - ptm)[3]
pr_method_2 <- mean(CP_SA > CP_SB)
## Method 3
set.seed(123)
ptm <- proc.time()
CP_A <- rep(0, N)
CP_B <- rep(0, N)
N_A <- rpois(N, 10)
N_B <- rpois(N, 10)
for(i in 1 : N){
CP_A[i] <- sum(exp(rnorm(N_A[i], 2, sqrt(8))))
CP_B[i] <- sum(exp(rnorm(N_B[i], 0, sqrt(12))))
}
sim_time_method_3 <- (proc.time() - ptm)[3]
pr_method_3 <- mean(CP_A > CP_B)
cbind(pr_method_1, pr_method_2, pr_method_3) pr_method_1 pr_method_2 pr_method_3
[1,] 0.67875 0.67804 0.67954
sim_time_method_1 sim_time_method_2 sim_time_method_3
elapsed 0.245 0.321 0.178
Same as Question 2 but now \(n_A = n_B \sim \textrm{Pois}(10)\).
## Method 1
insurance_loss <- function(N, lambdapois = 10, meanlog = c(2, 0), sdlog = c(sqrt(8), sqrt(12))){
Ns <- rpois(N, lambdapois)
vapply(Ns, function(x){
sum(exp(rnorm(x, meanlog[1], sdlog[1]))
- exp(rnorm(x, meanlog[2], sdlog[2])))},
0.0)
}
N <- 10^5
set.seed(123)
ptm <- proc.time()
ins_loss <- insurance_loss(N)
sim_time_method_1 <- (proc.time() - ptm)[3]
pr_method_1 <- mean(ins_loss > 0)
## Method 2
CP_AB <- function(x){
n <- rpois(1, 10)
x <- exp(rnorm(n, 2, sqrt(8)))
y <- exp(rnorm(n, 0, sqrt(12)))
s <- sum(x - y)
return(s)
}
set.seed(123)
ptm <- proc.time()
x <- 1:10^5
CP_SAB <- sapply(x, CP_AB)
sim_time_method_2 <- (proc.time() - ptm)[3]
pr_method_2 <- mean(CP_SAB > 0)
## Method 3
set.seed(123)
insurance_loss_loop <- function(N, lambdapois = 10, meanlog = c(2, 0), sdlog = c(sqrt(8), sqrt(12))){
CP_SAB_loop <- rep(0, N)
N_AB <- rpois(N, lambdapois)
for(i in 1 : N){
CP_SAB_loop[i] <- sum(exp(rnorm(N_AB[i], meanlog[1], sdlog[1]))
- exp(rnorm(N_AB[i], meanlog[2], sdlog[2])))
}
CP_SAB_loop
}
ptm <- proc.time()
ins_loss_loop <- insurance_loss_loop(N)
sim_time_method_3 <- (proc.time() - ptm)[3]
pr_method_3 <- mean(ins_loss_loop > 0)
cbind(pr_method_1, pr_method_2, pr_method_3) pr_method_1 pr_method_2 pr_method_3
[1,] 0.69181 0.69373 0.69181
sim_time_method_1 sim_time_method_2 sim_time_method_3
elapsed 0.222 0.232 0.181
Improve your estimate in Question 2 and 3 by the method of antithetic variables. Do at least one of them and approximate the new standard errors.
Recall that \[ X_i = e^{N_i}, \, N_i \sim \mathrm{N}(2,8) \] and note that \(4 - N_i \sim \mathrm{N}(2,8)\).
Now \[ 4 - N_i \sim \mathrm{N}(2,8) \]
and
\[ \textrm{cov}(N_i, 4 - N_i) = - \textrm{var}(N_i) \implies \rho = -1 \]
Futher \(\exp\{x\}\) is an increasing monotonic transformation, so that
\[ \textrm{cov}(\exp\{4 - N_i\}, \exp\{N_i\}) < 0 \]
Hence, we can use \[ \bar{S}_A = \sum_{i = 1}^n \exp\{4 - N_i\} \] as the antithetic variate to \(S_A = \sum_{i = 1}^n \exp\{N_i\}\)
Similarly, for \(S_B\) we use the antithetic variate \[ \bar{S}_B = \sum_{i = 1}^n \exp\{ - M_i\} \] for \(S_B =\sum_{i = 1}^n \exp\{M_i\}\), \(M_i \sim \mathcal{N}(0, 12)\)
Estimate \(\Pr(S_A > S_B)\) as
\[ \Pr(S_A > S_B) = \textrm{E} [\mathbb{1}\{S_A > S_B\}] \approx \frac{1}{N} \sum_{i = 1}^N \frac{\mathbb{1}\left\{S_A^i > S_B^i \right\} + \mathbb{1}\left\{\bar{S}_A^i > \bar{S}_B^i \right\}}{2} \]
## Method 1
pr_insurance_loss <- function(N, lambdapois = 10, meanlog = c(2, 0), sdlog = c(sqrt(8), sqrt(12))){
Ns <- rpois(N, lambdapois)
loss <- vapply(Ns, function(x){
X <- rnorm(x, meanlog[1], sdlog[1])
Y <- rnorm(x, meanlog[2], sdlog[2])
loss <- sum(exp(X)) - sum(exp(Y))
antithetic_loss <- sum(exp(4 - X)) - sum(exp(-Y))
((loss > 0) + (antithetic_loss > 0)) / 2
},
0.0)
c(mean(loss), sd(loss))
}
set.seed(123)
av_est <- pr_insurance_loss(10^5)
ins_loss <- insurance_loss(10^5) > 0
mc_est <- c(mean(ins_loss), sd(ins_loss))
rbind(av_est, mc_est) [,1] [,2]
av_est 0.69204 0.3140726
mc_est 0.69342 0.4610757
Philipp Sterzinger - ST303 Class 7