M.Tech AI Thesis | IIT Bhubaneswar | Tariq Husain Khan
ENFORE: Enforcement of Neural-network Formal Runtime Enforcement
Submitted to RV 2026 (Runtime Verification Conference)
ENFORE v2 is a safety-guaranteed autonomous driving framework that combines a cross-attention neural perception controller with a formally verified runtime enforcement shield. The system drives safely in the CARLA simulator using only a front camera (RGB) and a LiDAR summary vector as input.
Two core contributions:
- HydraNet v2 — Multi-task neural controller with cross-attention fusion (RGB × LiDAR)
- VDTA-to-QP Synthesis (Algorithm 1) — Formal algorithm that auto-synthesizes safety constraints from a Valued Discrete Timed Automaton and enforces them at runtime via a Quadratic Program
| Metric | HydraNet v1 (baseline) | HydraNet v2 (ours) |
|---|---|---|
| Steering R² | 0.751 | 0.854 |
| Steering MAE | 0.0412 | 0.0261 |
| Traffic-light accuracy | 87.3% | 91.8% |
| Red-light recall (safety) | 81.2% | 89.6% |
| Distance MAE | 0.0083 | 0.0061 |
| Shield violations | 0 | 0 |
| FGSM ε=0.02 MAE (no shield) | 0.1847 | — |
| FGSM ε=0.02 MAE (+ shield) | 0.0612 | — |
RGB Frame (3×160×320)
│
ResNet-34 backbone
│
Spatial tokens (50×512) ◄──── Cross-Attention ◄──── LiDAR query (1×512)
│ │
Context vector (512) LiDAR Encoder (8→512)
│
Fusion MLP (1024→512)
│
┌────┼────┐
Steer Light Dist
head head head
Cross-attention lets the model attend to the specific spatial regions of the image most relevant to the LiDAR reading — e.g., when LiDAR detects something 5 m ahead, attention highlights the obstacle in the image rather than the sky.
The shield is a formally verified runtime monitor built from a 4-state automaton:
| VDTA State | Condition | D_safe | Behaviour |
|---|---|---|---|
| FAR | D > 20 m | 4 m | Drive freely |
| MEDIUM | 6 m < D ≤ 20 m | 8 m | Begin caution |
| CLOSE | D ≤ 6 m | 15 m | Enforce braking |
| VIOLATED | bad transition | 15 m | Never occurs |
Algorithm 1 (novel): Reads the current VDTA state, extracts bad-transition guards, and automatically assembles the QP constraint matrix A, b. The QP then solves:
min ‖a − â‖² (minimal correction to proposed action â)
s.t. A·a ≤ b (safety constraints from VDTA)
a ∈ [−1,1]³ (action bounds)
Three formal properties (proved):
- Soundness — if the shield outputs a*, then a* satisfies all active VDTA constraints
- Transparency — if â already satisfies all constraints, the shield returns â unchanged
- Completeness — for every reachable VDTA state, Algorithm 1 synthesizes at least one constraint
enfore-v2/
├── vdta_qp_synthesis.py # Core: VDTA + Algorithm 1 + ENFOREShieldV2
├── hydranet_v2.py # HydraNet v2 with cross-attention fusion
├── enfore_v2_train.py # Training script (30 epochs, Adam, cosine LR)
├── enfore_v2_eval.py # Evaluation + FGSM adversarial robustness
├── docs/
│ ├── ENFORE_v2_Algorithm.pdf # Formal algorithm with theorems (for reviewers)
│ └── ENFORE_Thesis_Defense.pdf # Full defense guide with 50 Q&A
└── README.md
git clone https://github.com/<your-username>/enfore-v2.git
cd enfore-v2
pip install torch torchvision numpy pandas tqdm quadprog reportlabCARLA 0.9.14+ required for data collection. See CARLA docs.
python enfore_v2_train.py \
--data /path/to/manifest.csv \
--backbone resnet34 \
--epochs 30 \
--batch 24 \
--out checkpoints_v2/python enfore_v2_eval.py \
--checkpoint checkpoints_v2/best.pt \
--data manifest.csv \
--horizon 2 \
--out_csv results_v2.csvfrom vdta_qp_synthesis import ENFOREShieldV2
import numpy as np
shield = ENFOREShieldV2(predictive_horizon=2)
# Scenario: close obstacle, aggressive throttle proposed
obs = {'D': 4.2, 'speed': 8.0}
a_hat = np.array([0.05, 0.85, 0.0]) # steer, throttle, brake
a_star, info = shield.enforce(a_hat, obs)
print(f"Proposed: {a_hat}")
print(f"Enforced: {a_star}")
print(f"VDTA state: {info['state']}, Intervened: {info['intervened']}")The training script expects a CSV manifest with columns:
| Column | Description |
|---|---|
image_path |
Absolute path to 320×160 RGB JPEG |
lidar_path |
Path to 8-column LiDAR summary CSV |
steer |
Expert steering in [−1, 1] |
light_class |
Traffic-light class {0=none, 1=green, 2=yellow, 3=red} |
distance_norm |
Normalised front distance in [0, 1] (divide by 50 m) |
| # | Paper | Used for |
|---|---|---|
| [1] | Falcone et al., 2011 — Synthesising Correct-by-Construction Runtime Monitors | Soundness & transparency properties |
| [2] | Pinisetty et al., 2017 — Predictive Runtime Enforcement | Predictive horizon k=2 |
| [3] | Alshiekh et al., 2018 — Safe RL via Shielding | Shield architecture |
| [4] | Ames et al., 2019 — Control Barrier Functions | QP structure |
| [5] | Stellato et al., 2020 — OSQP | QP solver motivation |
| [6] | Kendall et al., 2018 — Multi-task learning using uncertainty | Uncertainty-weighted loss |
| [7] | He et al., 2016 — Deep Residual Learning | ResNet-34 backbone |
| [8] | Vaswani et al., 2017 — Attention Is All You Need | Cross-attention mechanism |
Tariq Husain Khan
M.Tech Artificial Intelligence | IIT Bhubaneswar
📧 tariqhusainkhan01@gmail.com
MIT License — free to use with attribution.