
Stochastic Differential Equations: A Concise Technical Introduction

As a second-year undergraduate in mathematics, I’ve recently started self-studying stochastic differential equations (SDEs) because they are interesting and applicable to machine learning. Most resources dive quickly into measure-theoretic probability, and I struggled to find concise, technically accurate summaries that bridge undergraduate calculus and probability with the core machinery of Itô calculus. This post is my attempt to fill that gap. It’s written primarily for fellow undergraduates who know real analysis and basic probability but haven’t yet taken a full course in stochastic calculus.

Brownian Motion

Before introducing stochastic differential equations, it is helpful to recall the deterministic case.

An ordinary differential equation (ODE) has the form

\[\frac{df}{dt} = b(f,t), \quad f(t_0) = f_0,\]

or in differential notation,

\[df = b(f,t)\, dt.\]

Here \(b\) is the deterministic drift coefficient. Under suitable conditions on \(b\) (for example, Lipschitz continuity in \(f\)), there exists a unique deterministic solution \(f(t)\).

Ordinary differential equations provide an effective framework for modeling many real-world systems, such as population growth. However, these models are fully deterministic: given the same initial conditions, they always produce the same trajectory.

In reality, most systems are subject to random fluctuations due to environmental noise, measurement errors, or other unpredictable influences. For instance, stock prices are affected by unpredictable market events.

To capture such randomness in a continuous-time setting, we can extend the deterministic model by adding a stochastic noise term:

\[dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dW_t, \quad X_0 = x_0.\]

Here \(\mu\) is the drift coefficient, \(\sigma\) is the diffusion coefficient, and the noise term is \(\sigma(X_t, t)\, dW_t\).

In principle, the driving increment \(dW_t\) need not be Gaussian. For instance, it could be the increment of a compensated Poisson process, which gives jump-type noise. In applications, however, we usually take \(W_t\) to be a standard Brownian motion, because it has several convenient properties that make the resulting Itô calculus tractable and powerful.

These key properties include:

  • Starts at zero: \(W_0 = 0\),
  • Gaussian increments: \(W_t - W_s \sim \mathcal{N}(0, t-s)\) for \(t > s\),
  • Independent increments: for any times \(0 \le t_0 < t_1 < \dots < t_n\), the increments \(W_{t_{k+1}} - W_{t_k}\) are mutually independent,
  • Almost surely continuous paths: \(t \mapsto W_t(\omega)\) is continuous for almost every \(\omega\).
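
These properties suggest a direct way to simulate Brownian paths on a time grid: draw independent Gaussian increments with variance equal to the step size and accumulate them. The following is a minimal sketch, assuming NumPy; the function name `simulate_brownian_paths` and the parameter values are illustrative choices, not part of any standard API.

```python
import numpy as np

def simulate_brownian_paths(n_paths, n_steps, T, seed=0):
    """Return an array of shape (n_paths, n_steps + 1) of Brownian paths on [0, T]."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Independent Gaussian increments with mean 0 and variance dt.
    increments = rng.normal(loc=0.0, scale=np.sqrt(dt), size=(n_paths, n_steps))
    paths = np.zeros((n_paths, n_steps + 1))  # column 0 stays at W_0 = 0
    paths[:, 1:] = np.cumsum(increments, axis=1)
    return paths

paths = simulate_brownian_paths(n_paths=20_000, n_steps=200, T=1.0)

# With 200 steps on [0, 1], column 100 corresponds to t = 0.5.
# The increment W_1 - W_0.5 should have mean near 0 and variance near 0.5.
increment = paths[:, -1] - paths[:, 100]
print(increment.mean(), increment.var())
```

The simulated paths are only sampled on a grid, so this illustrates the distribution of increments rather than path continuity.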

Stochastic Differential Equations

With \(W_t\) a standard Brownian motion, the SDE introduced above reads

\[dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dW_t, \quad X_0 = x_0.\]

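This differential form is shorthand for the integral equation

\[X_t = x_0 + \int_0^t \mu(X_s, s) \, ds + \int_0^t \sigma(X_s, s) \, dW_s.\]

The first integral is an ordinary integral in time. The second is the Itô stochastic integral, defined as a limit of left-endpoint sums \(\sum_k \sigma(X_{t_k}, t_k)\,(W_{t_{k+1}} - W_{t_k})\) as the partition of \([0,t]\) is refined.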
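
Discretizing the integral form on a time grid, with both coefficients evaluated at the left endpoint of each step, gives the Euler-Maruyama scheme. This scheme is not discussed in the text above; it is included here only as a minimal sketch of how the integral equation can be approximated numerically. The function name `euler_maruyama` and the example coefficients are hypothetical choices for illustration.

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T, n_steps, rng):
    """Approximate one path of dX = mu(X, t) dt + sigma(X, t) dW on [0, T]."""
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # Brownian increment over [t_k, t_{k+1}]: Gaussian with variance dt.
        dW = rng.normal(0.0, np.sqrt(dt))
        # Coefficients are evaluated at the left endpoint (Ito convention).
        x[k + 1] = x[k] + mu(x[k], t[k]) * dt + sigma(x[k], t[k]) * dW
    return t, x

# Illustrative coefficients (an assumption, not from the text):
# drift -x pulls the path toward zero, with constant noise intensity 0.3.
rng = np.random.default_rng(42)
t, x = euler_maruyama(lambda x, t: -x, lambda x, t: 0.3,
                      x0=1.0, T=1.0, n_steps=1000, rng=rng)
print(x[-1])
```

Setting the diffusion coefficient to zero reduces the scheme to the forward Euler method for the corresponding ODE, which is a convenient sanity check.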

Itô’s Formula

Let \(f(t,x) \in C^{1,2}([0,\infty) \times \mathbb{R})\). Then for an Itô process \(X_t\),

\[df(t,X_t) = \frac{\partial f}{\partial t}(t,X_t) \, dt + \frac{\partial f}{\partial x}(t,X_t) \, dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}(t,X_t) \, (dX_t)^2,\]

where \((dX_t)^2\) is evaluated using the multiplication table $dt \cdot dt = dt \cdot dW_t = 0$, $dW_t \cdot dW_t = dt$.

Substituting $dX_t = \mu \, dt + \sigma \, dW_t$ yields

\[df(t,X_t) = \left( f_t + \mu f_x + \frac{1}{2} \sigma^2 f_{xx} \right) dt + \sigma f_x \, dW_t.\]
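
The rule \(dW_t \cdot dW_t = dt\) can be checked numerically. Taking \(f(x) = x^2\) and \(X_t = W_t\) (so \(\mu = 0\), \(\sigma = 1\)), Itô’s formula predicts \(W_T^2 = 2\int_0^T W_s \, dW_s + T\). The sketch below, assuming NumPy and an arbitrarily chosen grid size and seed, compares both sides on one simulated path using a left-endpoint sum for the stochastic integral.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_steps = 1.0, 100_000
dt = T / n_steps

# One Brownian path on a uniform grid, starting at W_0 = 0.
dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
W = np.concatenate(([0.0], np.cumsum(dW)))

# Sum of squared increments: the discrete analogue of (dW)^2 = dt summed over [0, T].
quadratic_variation = np.sum(dW**2)

# Left-endpoint sum approximating the Ito integral of W against dW.
ito_sum = np.sum(W[:-1] * dW)

lhs = W[-1] ** 2          # f(W_T) - f(W_0) with f(x) = x^2
rhs = 2.0 * ito_sum + T   # prediction from Ito's formula
print(quadratic_variation, lhs, rhs)
```

On the grid, \(W_T^2 = 2\sum_k W_{t_k} \Delta W_k + \sum_k (\Delta W_k)^2\) holds exactly, so the gap between the two printed sides is precisely the difference between the sum of squared increments and \(T\), which shrinks as the grid is refined.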

Example: Geometric Brownian Motion

Consider the SDE for asset prices

\[dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \quad S_0 > 0.\]

Apply Itô’s formula to $f(s) = \log s$:

\[f_s = \frac{1}{s}, \quad f_{ss} = -\frac{1}{s^2}.\]

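Since \((dS_t)^2 = \sigma^2 S_t^2 \, dt\) by the multiplication table, Itô’s formula gives

\[d(\log S_t) = \frac{1}{S_t} \, dS_t - \frac{1}{2 S_t^2} \, (dS_t)^2 = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma \, dW_t.\]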

Integrating from \(0\) to \(t\) and using \(W_0 = 0\),

\[\log S_t = \log S_0 + \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma W_t.\]

Thus

\[S_t = S_0 \exp\left( \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma W_t \right).\]

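Since \(W_t \sim \mathcal{N}(0, t)\), $S_t$ is log-normally distributed, and the identity \(\mathbb{E}[e^{\sigma W_t}] = e^{\sigma^2 t / 2}\) gives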

\[\mathbb{E}[S_t] = S_0 e^{\mu t}.\]
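
The closed-form solution makes it easy to sample \(S_T\) directly and compare the sample mean with \(S_0 e^{\mu T}\). The following is a minimal Monte Carlo sketch, assuming NumPy; the parameter values (`S0`, `mu`, `sigma`, `T`, the number of paths, and the seed) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
S0, mu, sigma, T = 1.0, 0.05, 0.2, 1.0
n_paths = 200_000

# W_T is Gaussian with mean 0 and variance T.
W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)

# Exact solution of the geometric Brownian motion SDE at time T.
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)

# Sample mean of S_T versus the theoretical mean S0 * exp(mu * T).
print(S_T.mean(), S0 * np.exp(mu * T))

# Sample mean of log S_T versus log S0 + (mu - sigma^2 / 2) * T.
print(np.log(S_T).mean(), np.log(S0) + (mu - 0.5 * sigma**2) * T)
```

Note that the mean of \(\log S_T\) grows at rate \(\mu - \frac{1}{2}\sigma^2\), while the mean of \(S_T\) itself grows at rate \(\mu\); the difference is the Itô correction term.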

Stratonovich Interpretation (Brief Note)

Under the alternative Stratonovich interpretation, the SDE

\[dX_t = \mu(X_t, t) \, dt + \sigma(X_t, t) \circ dW_t\]

converts to Itô form with the same diffusion coefficient \(\sigma\) and the corrected drift

\[\mu^{\text{Itô}} = \mu^{\text{Strat}} + \frac{1}{2} \sigma \frac{\partial \sigma}{\partial x}.\]
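
As a short worked example of this conversion, take a hypothetical Stratonovich SDE with zero drift and linear diffusion coefficient \(\sigma(x) = c\,x\), where \(c > 0\) is a constant chosen for illustration:

\[dX_t = c X_t \circ dW_t.\]

Here \(\frac{\partial \sigma}{\partial x} = c\), so the correction term is \(\frac{1}{2} \sigma \frac{\partial \sigma}{\partial x} = \frac{1}{2} c^2 X_t\), and the equivalent Itô SDE is

\[dX_t = \frac{1}{2} c^2 X_t \, dt + c X_t \, dW_t.\]

This is geometric Brownian motion with drift parameter \(\frac{1}{2} c^2\) and volatility \(c\), so the solution formula from the previous section gives

\[X_t = X_0 \exp\left( \left( \frac{1}{2} c^2 - \frac{1}{2} c^2 \right) t + c W_t \right) = X_0 e^{c W_t},\]

which is what the ordinary chain rule would predict for the Stratonovich equation.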