A rigorous fit of a multivariate Hawkes process to limit order book (LOB) event data, featuring residual analysis via the time-rescaling theorem and high-density technical visualizations.
60-second snapshot of LOB events. Top: Stacked conditional intensities λᵢ(t). Bottom: Actual event tick marks (Market Buys, Market Sells, Limit Additions).
This project implements a 4-dimensional Hawkes process to quantify the self-exciting dynamics of high-frequency trading. By analyzing LOBSTER event logs for GOOG, we model how aggressive trades trigger immediate liquidity replenishment and clustering.
- Strong Self-Excitation: Market orders exhibit extreme clustering ($\alpha_{ii} \approx 0.85$), confirming that "trades beget trades" in high-frequency regimes.
- Near-Critical Excitation: Aggressive buys (MB) trigger strong excitation in Limit Buy additions ($\alpha_{LA_B, MB} > 1.0$). While individual weights exceed 1.0, the system remains stationary with a full-matrix spectral radius $\rho(\alpha) \approx 0.86$.
- Regime Shifts: Rolling 30-minute fits reveal that $\rho(\alpha)$ fluctuates significantly throughout the day, peaking during periods of high volatility.
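The stationarity claim above (a single weight above 1.0 alongside a spectral radius below 1) can be checked with a few lines of code. The following is a minimal sketch, not the repository's implementation: it estimates the spectral radius of a positive matrix by power iteration using only the standard library, and the 2×2 matrix is a hypothetical illustration rather than a fitted result. In practice `numpy.linalg.eigvals` would be the more usual route.

```python
def spectral_radius(matrix, iterations=200):
    """Estimate the spectral radius of a positive square matrix by power iteration."""
    n = len(matrix)
    vec = [1.0] * n  # start from a positive vector with max-norm 1
    estimate = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        estimate = max(abs(v) for v in nxt)  # max-norm of A @ vec
        if estimate == 0.0:
            return 0.0
        vec = [v / estimate for v in nxt]  # renormalize to max-norm 1
    return estimate


# Hypothetical excitation matrix: one off-diagonal weight exceeds 1.0.
example_alpha = [
    [0.5, 0.1],
    [1.2, 0.3],
]

rho = spectral_radius(example_alpha)
is_stationary = rho < 1.0
print(f"rho = {rho:.4f}, stationary = {is_stationary}")
```

Power iteration is only reliable here because every entry of the example matrix is positive; a fitted matrix with zero blocks or complex dominant eigenvalues would call for a full eigenvalue solver.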
Full-day average excitation weights. Note the strong diagonal and the cross-excitation between Market Orders and Limit Additions.
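We use a multivariate Hawkes process with exponential kernels to capture the conditional intensity:

$$\lambda_i(t) = \mu_i + \sum_{j} \alpha_{ij} \sum_{t_k^{j} < t} \beta e^{-\beta (t - t_k^{j})}$$

- $\mu_i$: Baseline intensity (exogenous arrivals).
- $\alpha_{ij}$: Excitation weight (expected number of type-$i$ events triggered directly by one type-$j$ event).
- $\beta$: Shared decay rate (inverse memory time scale; the half-life of an excitation is $\ln 2 / \beta$).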
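As a rough illustration of how these parameters combine, the sketch below evaluates $\lambda_i(t)$ for every event type in a single pass over the event log. It assumes the kernel is normalized as $\alpha_{ij} \beta e^{-\beta t}$ (so that $\alpha_{ij}$ is the expected number of triggered events), and the function name, parameter values, and events are hypothetical.

```python
import math


def hawkes_intensities(events, mu, alpha, beta, t):
    """Return [lambda_0(t), ..., lambda_{d-1}(t)] for an exponential-kernel Hawkes process.

    events: list of (time, type_index) pairs sorted by time.
    alpha[i][j]: excitation of type i by type j; beta: shared decay rate.
    """
    d = len(mu)
    # decayed[j] = sum over past type-j events of beta * exp(-beta * (now - t_k))
    decayed = [0.0] * d
    now = None
    for time, j in events:
        if time >= t:  # only events strictly before t contribute
            break
        if now is not None:
            factor = math.exp(-beta * (time - now))
            decayed = [x * factor for x in decayed]
        decayed[j] += beta  # a fresh event contributes beta * exp(0)
        now = time
    if now is not None:
        factor = math.exp(-beta * (t - now))
        decayed = [x * factor for x in decayed]
    return [mu[i] + sum(alpha[i][j] * decayed[j] for j in range(d)) for i in range(d)]


# Hypothetical two-type example (values are illustrative, not fitted).
mu = [0.1, 0.2]
alpha = [[0.5, 0.0], [1.2, 0.3]]
events = [(0.0, 0), (0.4, 1)]
lam = hawkes_intensities(events, mu, alpha, beta=2.0, t=1.0)
print(lam)
```

Because the decay rate is shared, one running sum per event type is enough, which keeps the evaluation linear in the number of events.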
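Under the true model, the compensator-transformed inter-event times of each event type must be i.i.d. unit-exponential, Exp(1). We check the distribution with a Kolmogorov-Smirnov (KS) test and serial independence with a Ljung-Box (LB) test.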
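The check can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the repository's code: it assumes the kernel $\alpha_{ij} \beta e^{-\beta t}$, an observation window starting at `t0` with no earlier history, and a default of 10 Ljung-Box lags. All function names are hypothetical, and only the test statistics are computed (p-values would additionally need the Kolmogorov and chi-squared distributions, for example from SciPy).

```python
import math


def rescaled_intervals(events, mu, alpha, beta, t0=0.0):
    """Compensator increments between consecutive events of each type.

    events: list of (time, type_index) pairs sorted by time.
    Returns one list of rescaled inter-event times per event type.
    """
    d = len(mu)
    r = [0.0] * d    # r[j] = sum over past type-j events of exp(-beta * (now - t_k))
    acc = [0.0] * d  # compensator accumulated since the last type-i event
    out = [[] for _ in range(d)]
    prev = t0
    for time, k in events:
        dt = time - prev
        decay = math.exp(-beta * dt)
        for i in range(d):
            excite = sum(alpha[i][j] * r[j] for j in range(d))
            # integral of lambda_i over (prev, time]
            acc[i] += mu[i] * dt + excite * (1.0 - decay)
        r = [x * decay for x in r]
        out[k].append(acc[k])
        acc[k] = 0.0
        r[k] += 1.0  # the new event starts contributing from here on
        prev = time
    return out


def ks_statistic_exp1(sample):
    """KS distance between the empirical CDF of the sample and the Exp(1) CDF."""
    xs = sorted(sample)
    n = len(xs)
    dist = 0.0
    for idx, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x)
        dist = max(dist, (idx + 1) / n - cdf, cdf - idx / n)
    return dist


def ljung_box_q(sample, lags=10):
    """Ljung-Box Q statistic over the first `lags` autocorrelations."""
    n = len(sample)
    mean = sum(sample) / n
    dev = [x - mean for x in sample]
    var = sum(v * v for v in dev)
    total = 0.0
    for h in range(1, lags + 1):
        rho_h = sum(dev[t] * dev[t - h] for t in range(h, n)) / var
        total += rho_h * rho_h / (n - h)
    return n * (n + 2) * total
```

A large KS distance points to a wrong marginal distribution of the residuals, while a large Q points to leftover serial dependence; the two can fail independently.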
| Dimension | KS Stat | KS p-val | LB Stat | LB p-val | Verdict |
|---|---|---|---|---|---|
| MB | 0.4705 | < 0.001 | 23.61 | 0.0087 | Fail (D, LB) |
| MS | 0.5226 | < 0.001 | 10.36 | 0.4096 | Fail (D), Pass (LB) |
| LA_B | 0.3421 | < 0.001 | 36.77 | < 0.001 | Fail (D, LB) |
| LA_S | 0.3689 | < 0.001 | 41.33 | < 0.001 | Fail (D, LB) |
The systematic failure of the Kolmogorov-Smirnov (KS) tests highlights the limitations of first-order exponential Hawkes models for LOB data:
- Kernel Misspecification: Real LOB memory often follows a Power Law rather than a simple exponential decay.
- The MS Anomaly: Market Sells (MS) pass the Ljung-Box test but fail the KS test. The rescaled inter-event times show no detectable serial correlation, yet their marginal distribution is not unit-exponential. This pattern suggests that the kernel shape is misspecified while the cross-excitation weights are approximately correct.
- Missing Covariates: The model currently ignores cancellations and mid-price returns, which drive significant intensity variance in real microstructure.
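To make the kernel-misspecification point concrete, the sketch below compares the tail mass of a unit-mass exponential kernel with that of a unit-mass power-law kernel. The Lomax (shifted Pareto) form and the parameter values `beta`, `c`, and `p` are assumptions chosen for illustration; they are not taken from the fitted model.

```python
import math


def exp_kernel(t, beta):
    """Unit-mass exponential kernel: beta * exp(-beta * t)."""
    return beta * math.exp(-beta * t)


def power_law_kernel(t, c, p):
    """Unit-mass Lomax kernel: (p - 1) * c**(p - 1) / (t + c)**p, with p > 1."""
    return (p - 1.0) * c ** (p - 1.0) / (t + c) ** p


def exp_tail(t, beta):
    """Fraction of the exponential kernel's mass located beyond lag t."""
    return math.exp(-beta * t)


def power_law_tail(t, c, p):
    """Fraction of the Lomax kernel's mass located beyond lag t."""
    return (c / (t + c)) ** (p - 1.0)


# Illustrative parameters only.
for lag in (0.1, 1.0, 10.0):
    print(lag, exp_tail(lag, beta=1.0), power_law_tail(lag, c=1.0, p=2.0))
```

At long lags the power-law kernel retains far more mass than the exponential one, which is the kind of slowly decaying memory a single exponential cannot reproduce.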
The repository prioritizes legible, dense, and technical 2D signals that provide actionable insights into market microstructure.
Visualizes Hawkes conditional intensity as a time-frequency spectrogram, identifying bursts of cross-excitation activity across all 4 event types.

A professional-grade liquidity landscape. Volume is log-scaled to reveal hidden depth, with actual trade events (dots) overlaid on the price-time grid.

A 4×4 grid of small multiples surfacing the exact trigger-response dynamics ($G_{ij}(t)$) for every possible pair in the system.

A rolling window animation showing the evolution of the excitation matrix 
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# Critical: Fixes tick library for Python 3.12+ ABI compatibility.
# This only modifies metaclass attribute management and does not affect MLE numerics.
python scripts/patch_tick.py
```

- Download the `GOOG_2012-06-21` (10 levels) sample from LOBSTER.
- Extract the `.csv` files into the `data/` directory.
```bash
# A. Initialize the analysis notebook
python notebooks/create_notebook.py
# B. Run global fit & generate primary visuals
python src/model.py
# C. Run rolling window analysis
python notebooks/02_rolling_fit.py
```

- Kernel Shape: Replacing exponential with Power Law (Pareto) kernels for heavy-tailed memory.
- Asymmetric Decay: Modeling unique decay rates $\beta_i$ for each event type.
- Price Impact: Integrating price changes as a covariate to model the volatility-excitation feedback loop.
Licensed under the Apache License, Version 2.0.