ST303 Stochastic Simumlation

Class 1


Philipp Sterzinger

p.sterzinger@lse.ac.uk


Office hours: Thursday, 12-1, COL 8.05



09 October 2025

Recap

Estimating an integral

Want to compute the integral

\[ I(g) = \int_0^1 g(x) dx \,, \] for some function \(g:[0,1] \to \Re\).

Estimating an integral

Theorem 1 (LLN) If \(X_1, \ldots, X_n \in \Re\) are i.i.d. random variables with \(E(|X_i|)< \infty\), then

\[ \lim_\limits{n \to \infty} \frac{1}{n} \sum_{i=1}^n X_i \overset{a.s.}{=} E(X_i) \,. \]

Hence, if \(X_1, \ldots, X_n\) are \(\mathcal{U}(0,1)\), and \(E(|g(X_i)|) < \infty\), then \[ \lim_\limits{n \to \infty} \frac{1}{n} \sum_{i=1}^n g(X_i) \overset{a.s.}{=} E(g(X_i)) = \int_0^1 g(x) dx \,. \]

Monte Carlo approximation

Draw a large sample \(X_1,\ldots, X_n \sim \mathcal{U}(0,1)\) and approximate

\[ \int_0^1 g(x) dx \approx \frac{1}{n} \sum_{i=1}^n g(X_i) \,. \]

Uncertainty quantification

Our approximation \[ \bar{I}(g) = \frac{1}{n} \sum_{i=1}^n g(X_i) \] for \(I(g) = \int_0^1 g(x)dx\) is a random variable. To make statements about the precision of our approximation, we can appeal to the central limit theorem.

Uncertainty quantification

Theorem 2 (CLT) If \(Y_1, \ldots, Y_n \in \Re\) are i.i.d. random variables with mean \(\mu\) and variance \(\sigma < \infty\), then

\[ \sqrt{n} \frac{\bar{Y}_n - \mu}{\sigma} \overset{d}{\longrightarrow} Z \sim \mathcal{N}(0,1) \,, \]

where \(\bar{Y}_n = \sum_{i = 1}^n Y_i / n\).

Hence, for \(Y_i = g(X_i)\), \(X_i \sim \mathcal{U}(0,1)\), we get that

\[ \Pr \left( a < \frac{\bar{I}(g) - I(g)}{\sigma / \sqrt{n}} < b \right) \longrightarrow \Psi(b) - \Psi(a) \quad n \to \infty \,, \] where \(\sigma = \int_0^1 (g(x) - I(g))^2 dx\).

Again using the Monte Carlo method, we can approximate \(\sigma\) as \[ \sigma^2 \approx \frac{1}{n} \sum_{i = 1}^n g(X_i)^2 - \bar{I}(g)^2 \,. \]

Integration by substitution

Want to compute an integral with integration bounds \(a,b \in [-\infty, \infty]\), i.e.  \[\int_a^b g(x) dx \,.\]

Recipe

  • \(y:[a, b] \to [0,1]\) with \(\lim_\limits{x \to a} y(x) = 0\), \(\lim_\limits{x \to b} y(x) = 1\)
  • solve for \(x(y)\)
  • compute \(x'(y) = \frac{dx}{dy}\)
  • substitue \(x(y)\) for \(x\), \(x'(y) dy\) for \(dx\)
  • adjust inegration bounds, e.g. \(a \mapsto \lim_\limits{x \to a} y(x) = 0, \, b \mapsto \lim_\limits{x \to b} y(x) = 1\)

Integration by substitution

Then \[ \int_a^b g(x) dx = \int_{0}^{1} g(x(y)) x'(y) dy \,. \]

Problem Set

Question 1

1.a

Create a vector of \(10^5\) random numbers \(X_i \sim \mathcal{U}(0,1)\)

Code
# Sample size 
N <- 10^5 

# Sampling uniform random variables 
set.seed(123)
X <- runif(N) 

Question 1

1.b

Check the features by plotting a histogram

Code
# Histogram 
hist(X, freq = FALSE) 
abline(c(1,0), col = "red", lty = 2, lwd = 3)

Also plot the pairs \(X_i, X_{i+1}\)

Code
# All but last X 
X1 <- X[-N] 
# All but first X 
X2 <- X[-1] 

# Scatter plot 
plot(X1, X2, pch = 20, cex = 0.1, col = "royalblue1") 
plot_title <- expression("Scatter plot " ~ X[i] * "," ~ X[i + 1])
title(main = plot_title)

Plot the autocorrelation function

Code
# ACF function plots by default 
acf(X) 

Question 1

1.c

Calculate the mean, median and variance of the sample

Code
# mean of X 
cat("mean: ", mean(X)) 
mean:  0.4992992
Code
# median of X 
cat("median: ", median(X))
median:  0.4972005
Code
# variance of X 
cat("variance: ", var(X)) 
variance:  0.08317841

1.d

Estimate the probability \(\Pr(X_i > 0.6)\).

Code
# estimate pr(x > 0.6) 
pr <- mean(X > 0.6)
cat("Pr(X_i > 0.6) ≈ ", pr)
Pr(X_i > 0.6) ≈  0.39867

Question 2

If \(y = \exp\{-x\}\) calculate the integral \[ \int_0^\infty g(x) dx = \int_0^1 \frac{1}{y} g(-\log(y)) dy \]

where \(g(-y) = \exp \left\{-\log(y)^2 / 2 \right\}\). Run the algorithm for \(10^4\) simulations.

Code
# setting seed 
set.seed(123) 
# sample size 
Nsim <- 10^4
# draw sample 
y <- runif(Nsim)
# calculate function within integral 
x <- (1/y)*exp((-log(y)^2)/2)
# take mean
cat("Integral approximation: ", mean(x))
Integral approximation:  1.256539
Code
# approximation error 
cat("Approximation errror: ", mean(x) - sqrt(pi / 2)) 
Approximation errror:  0.003224685

Question 3

Estimate the value of the following integrals. In all cases you have to give a measure of how accurate your approximation is such as the standard error or a \(95\%\) confidence interval.

3.a

\[ \int_0^1 \exp\{-x^{1.5}\} dx \]

Question 3

3.a Integral approximation

\[ \begin{aligned} \int_0^1 \exp\{-x^{1.5}\} dx &= \mathrm{E}_{\mathcal{U}(0,1)} \left[ \exp\{-x^{1.5}\} \right] \\ & \overset{\textrm{LLN}}{\approx}\frac{1}{n} \sum_{i = 1}^n \exp\{-X_i^{1.5}\}\,, \end{aligned} \]
where \(X_i\) are i.i.d. \(\mathcal{U}(0,1)\).

Question 3

3.a 95% confidence interval

  • Central limit theorem: As \(n \to \infty\): \[ \Pr\left( \left| \frac{\bar{I}(g) - I(g)}{\sigma / \sqrt{n}} \right| < z \right) \longrightarrow 2 \Psi(z) - 1\,, \] for \(\sigma = \int_0^1 (g(x) - I(g))^2 dx\).

  • Finding \(z\): \[ 2 \Psi(z) - 1 = 0.95 \iff \Psi(z) = \frac{1 + 0.95}{2} \iff z = \Psi^{-1}(0.975) \]

Therefore \[ \Pr\left( \bar{I}(g) + z \frac{\sigma}{\sqrt{n}}< I(g) < \bar{I}(g) + z \frac{\sigma}{\sqrt{n}} \right) \approx 0.95 \,, \] for large \(n\).

Question 3

3.a

Code
# set seed 
set.seed(1) 
# sample size 
N <- 10^5 
# uniform sample 
X <- runif(N) 
# function within integral 
y <- exp(-X^(1.5)) 
# integral approximation 
int_approx <- mean(y) 
# estimating standard error 
se <- sqrt(mean(y^2) - int_approx^2) / sqrt(N) 
# confidence interval 
conf_level <- 0.95 
z <- qnorm((1 + conf_level) / 2)  
uci <- int_approx + z * se 
lci <- int_approx - z * se 
# real integral value 
int_val = 0.699792 
# approximation error 
approx_error <- int_val - int_approx
print(rbind(int_val, int_approx, approx_error, se, lci, uci))
                      [,1]
int_val       0.6997920000
int_approx    0.6999995477
approx_error -0.0002075477
se            0.0006244706
lci           0.6987756079
uci           0.7012234875

Question 3

3.b.

\[ \int_0^\infty \exp\{-x^{1.5}\} dx\,, \] try various transformation for this to find out which one works best.

Question 3

3.b. Integration by substitution

  • \(y = \frac{1}{1 + x}\)
  • \(x = \frac{1}{y} - 1\)
  • \(\frac{d}{dy} \frac{1}{y} - 1 = - \frac{1}{y^2}\)
  • \(\lim_\limits{x \to 0} \frac{1}{1 + x} = 1\), \(\lim_\limits{x \to \infty} \frac{1}{1 + x} = 0\)

Therefore

\[ \begin{aligned} \int_0^\infty \exp\{-x^{1.5}\} dx &= - \int_1^0 \frac{1}{y^2} \exp\left\{-\left(\frac{1}{y} - 1\right)^{1.5}\right\} dy \\ &= \int_0^1 \frac{1}{y^2} \exp\left\{-\left(\frac{1}{y} - 1\right)^{1.5}\right\} dy \end{aligned} \]

Question 3

3.b. What makes a good transformation?

Code
# plotting transformed integrand with y = 1 / (1 + x) on unit interval 
ys <- seq(from = 0.00001, to = 1, length.out = 1000) 
integrand <- 1 / (ys^2) * exp(-(1 / ys - 1)^(1.5)) 
par(mar = c(5, 4, 4, 12))
plot(ys, integrand, type = "l", lwd = 2, ylim = c(0, 2))

# plot another transoformation y = exp(-x^1.5)
other_integrand <- 2 / 3 * (-log(ys))^(-1/3)
lines(ys, other_integrand, type = "l", lwd = 2, col = "green") 

# plot another transoformation y = exp(-x)
other_integrand2 <- 1 / ys * exp(-(-log(ys))^(1.5)) 
lines(ys, other_integrand2, type = "l", lwd = 2, col = "cyan") 

# add hline at value of integral 
f <- function(x){(1/x^2)*exp(-(1/x-1)^(1.5))}
Integral = integrate(f,0,1)$value 
abline(Integral, 0, lwd = 2, lty = 2, col = "red")

# add legend 
legend("right", 
       legend = c(expression(y == frac(1, 1+x)), 
                expression(y == exp(-x^1.5)), 
                expression(y == exp(-x)),  
                "Integral value"), 
       col = c("black", "green", "cyan", "red"), 
       lty = c(1, 1, 1, 2),          
       lwd = 2, 
       inset = -0.35, 
       xpd = TRUE)

Code
set.seed(123) 
N <- 10^5
X <- runif(N) 
y1 <- 1 / (X^2) * exp(-(1 / X - 1)^(1.5)) 
y2 <- 2 / 3 * (-log(X))^(-1/3)
y3 <- 1 / X * exp(-(-log(X))^(1.5)) 

se1 <- sqrt(mean(y1^2) - mean(y1)^2) / sqrt(N) 
se2 <- sqrt(mean(y2^2) - mean(y2)^2) / sqrt(N) 
se3 <- sqrt(mean(y3^2) - mean(y3)^2) / sqrt(N)

ci1 <- mean(y1) + c(-z * se1, z * se1)
ci2 <- mean(y2) + c(-z * se2, z * se2)
ci3 <- mean(y3) + c(-z * se3, z * se3)

cat("y = 1 / (1 + x) \nEstimate:", mean(y1), "\nError: ", mean(y1) - Integral, "\nCI: ", ci1) 
y = 1 / (1 + x) 
Estimate: 0.9015331 
Error:  -0.001212228 
CI:  0.8976504 0.9054157
Code
cat("y = exp(-x^1.5) \nEstimate:", mean(y2), "\nError: ", mean(y2) - Integral, "\nCI: ", ci2) 
y = exp(-x^1.5) 
Estimate: 0.901386 
Error:  -0.001359323 
CI:  0.8977233 0.9050486
Code
cat("y = exp(-x) \nEstimate:", mean(y3), "\nError: ", mean(y3) - Integral, "\nCI: ", ci3) 
y = exp(-x) 
Estimate: 0.9026241 
Error:  -0.0001211551 
CI:  0.9005703 0.9046779

Question 3

3.c.

\[ \int_{-\infty}^\infty \frac{1}{1 + x^2} dx \] This is known to be \(\pi\)

Question 3

3.c. Integration by substitution

Before thinking of transforms, let us simplify things

  • Symmetry: \[ \int_{-\infty}^\infty \frac{1}{1 + x^2} dx = 2 \int_{0}^\infty \frac{1}{1 + x^2} dx \]

  • Split into parts: \[ \int_{0}^\infty \frac{1}{1 + x^2} dx = \int_{0}^1 \frac{1}{1 + x^2} dx + \int_{1}^\infty \frac{1}{1 + x^2} dx \]

Question 3

3.c. Integration by substitution

Hence, we need to find a transformation for \[\int_{1}^\infty \frac{1}{1 + x^2} dx\]

Let \(y = \frac{1}{x}\):

  • \(x = \frac{1}{y}\)
  • \(\frac{d}{dy} x(y) = - \frac{1}{y^2}\)
  • \(\lim_\limits{x \to 1} y(x) = 1\), \(\lim_\limits{x \to \infty} y(x) = 0\)

Question 3

3.c. Integration by substitution

Thus \[ \begin{aligned} \int_{1}^\infty \frac{1}{1 + x^2} dx &=- \int_{1}^0 \frac{1}{1 + (1/y)^2} \frac{1}{y^2} dy \\ &= \int_{0}^1 \frac{1}{1 + (1/y)^2} \frac{1}{y^2} dy \\ &= \int_{0}^1 \frac{1}{\frac{y^2 + 1}{y^2}}\frac{1}{y^2} dy \\ &= \int_{0}^1 \frac{y^2}{1 + y^2} \frac{1}{y^2} dy \\ &= \int_{0}^1 \frac{1}{1 + y^2} dy \\ \end{aligned} \]

Question 3

3.c. Integration by substitution

Hence, we have found that \[ \begin{aligned} \int_{-\infty}^\infty \frac{1}{1 + x^2} dx &= 2 \int_{0}^\infty \frac{1}{1 + x^2} dx \\ &= 2 \left( \int_{0}^1 \frac{1}{1 + x^2} dx + \int_{1}^\infty \frac{1}{1 + x^2} dx \right) \\ &= 2 \left( \int_{0}^1 \frac{1}{1 + x^2} dx + \int_{0}^1 \frac{1}{1 + y^2} dy \right) \\ &= 4 \int_{0}^1 \frac{1}{1 + y^2} dy \\ &= 4 \textrm{E}_{\mathcal{U}(0,1)} \left[\frac{1}{1 + X^2} \right] \\ & \approx 4 \frac{1}{n} \sum_{i=1}^n \frac{1}{1 + X_i^2}, \quad X_i \overset{i.i.d.}{\sim}\mathcal{U}(0,1) \end{aligned} \]

Question 3

3.c. Geometric solution

Code
x <- seq(0, 1, length.out = 500)
f <- sqrt(1 - x^2)

plot(x, f, type = "l", xlab = "x", ylab = "", xlim = c(0, 1), ylim = c(0, 1), lwd = 2, col = "black")
polygon(c(x, rev(x)), c(f, rep(0, length(f))), col = "lightgrey")

legend("topright", legend = c(expression(sqrt(1 - x^2)), expression(frac(pi, 4))),
       lwd = c(2, NA), pch = c(NA, 15), col = c("black", "lightgrey"))




\[ \begin{aligned} \pi &= 4 \int_0^1 \sqrt{(1-y^2)} dy \\ &= 4 \textrm{E}_{\mathcal{U}(0,1)} \left[\sqrt{1 - X^2} \right] \\ & \approx 4 \frac{1}{n} \sum_{i=1}^n \sqrt{1 - X_i^2}, \quad X_i \overset{i.i.d.}{\sim}\mathcal{U}(0,1) \end{aligned} \]

You can obtain this analytically using \(y = \sqrt{\frac{1}{1 + x^2}}\).

Question 3

3.c.

Code
set.seed(123) 
N <- 10^5 
X <- runif(N) 

y1 <- 4 / (1 + X^2)
y2 <- 4 * sqrt(1 - X^2) 
integral <- pi 

se1 <- sqrt(mean(y1^2) - mean(y1)^2) / sqrt(N) 
se2 <- sqrt(mean(y2^2) - mean(y2)^2) / sqrt(N) 

ci1 <- mean(y1) + c(-z * se1, z * se1)
ci2 <- mean(y2) + c(-z * se2, z * se2)

cat("y = 1 / (1 + x^2) \nEstimate:", mean(y1), "\nError: ", mean(y1) - integral, "\nCI: ", ci1) 
y = 1 / (1 + x^2) 
Estimate: 3.143417 
Error:  0.001823983 
CI:  3.139433 3.1474
Code
cat("y = sqrt(1 - x^2) \nEstimate:", mean(y2), "\nError: ", mean(y2) - integral, "\nCI: ", ci2) 
y = sqrt(1 - x^2) 
Estimate: 3.143952 
Error:  0.002359275 
CI:  3.138422 3.149481

Question 3

3.d.

\[ I(g) = \int_0^1 \int_0^1 \int_0^1 \int_0^1 \exp\{-\sqrt{xywz}\} dx dy dz dw \]

Note that \[ \begin{aligned} I(g) = \textrm{E}_{\mathcal{U}(0,1)^4} \left[ \exp\{-\sqrt{XYWZ}\} \right] \approx \frac{1}{n} \sum_{i = 1}^n \exp\{-\sqrt{X_i Y_i W_i Z_i}\}\,, \end{aligned} \] where \(X_i, Y_i, W_i, Z_i\) are independent, \(\mathcal{U}(0,1)\) random variables

Code
set.seed(123) 
N <- 10^5
X <- runif(N)
Y <- runif(N) 
Z <- runif(N) 
W <- runif(N)  
xywz <- exp(-sqrt(X * Y * Z * W))
int_approx <- mean(xywz) 
se <-sqrt(mean(xywz^2) - int_approx^2) / sqrt(N) 
ci <- mean(int_approx) + c(-z * se, z * se)
cat("Estimate:", int_approx, "\nCI: ", ci) 
Estimate: 0.8299637 
CI:  0.8292357 0.8306917