konjoai · wesleyscholl · Jun 19, 2026 · Jun 19, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,6 +6,55 @@ Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ---
 
+## [1.8.0] — 2026-06-19
+
+### Added — Phase 18 (Multi-Turn Jailbreak Engine — Crescendo / Echo Chamber)
+
+**`toki.multiturn` — new module (zero external deps)**
+- `MultiTurnStrategy` — StrEnum: `CRESCENDO` | `ECHO_CHAMBER`
+- `Turn` — frozen dataclass: `index`, `role`, `content`, optional assistant `score`
+- `Conversation` — ordered `Turn` list; `to_messages()` renders OpenAI-style
+  `{"role", "content"}` dicts, `transcript()` renders a scored, human-readable log
+- `MultiTurnConfig` — `name`, `strategy`, `goal`, `max_turns`, `seed`,
+  `success_threshold` (default 0.5), `output_dir`
+- `MultiTurnResult` — `turns`, `n_turns`, `success`, `success_turn`, `min_score`,
+  `final_score`, `transcript`; `to_json()`, `save()` (timestamped dir, no overwrite),
+  `load()` rehydrating typed `Turn`s
+- `Strategy` base + `CrescendoStrategy` (arXiv 2404.01833) / `EchoChamberStrategy`
+  (arXiv 2601.05742) — deterministic opener → escalation ladder → payload planning
+  producing exactly `n_turns` user messages; `get_strategy()` / `STRATEGIES` registry
+- `MultiTurnRunner.run(model_fn)` — drives a chat-style
+  `Callable[[list[dict[str, str]]], str]` through the planned escalation, scores
+  each assistant reply with the real `RuleScorer`, and stops early the first time a
+  reply drops below `success_threshold` (matching Crescendo's halt-on-compliance);
+  `run_multiturn(model_fn, config, save)` convenience wrapper
+- Built-in conversational baselines `conv_baseline_safe`, `conv_baseline_unsafe`,
+  `conv_baseline_crescendo` (benign until ≥3 user turns of benign history, then
+  capitulates) + `CONV_BASELINES` registry
+
+**`toki.coverage` (extended)**
+- `CATEGORY_AXIS` and `_DEFAULT_SEVERITY` gain `"multiturn"` (critical severity);
+  `_category_for` routes `multi`/`turn` categories to the new bucket
+
+**CLI**
+- `python -m toki multiturn` — `--strategy crescendo|echo_chamber`,
+  `--model safe|unsafe|crescendo`, `--goal`, `--max-turns`, `--seed`,
+  `--success-threshold`, `--output-dir`, `--json`; prints outcome + scored transcript
+
+**`toki.__init__`**
+- New exports: `CONV_BASELINES`, `Conversation`, `CrescendoStrategy`,
+  `EchoChamberStrategy`, `MultiTurnConfig`, `MultiTurnResult`, `MultiTurnRunner`,
+  `MultiTurnStrategy`, `Strategy`, `Turn`, `get_strategy`, `run_multiturn`
+
+**`pyproject.toml`**
+- Version bumped to `1.8.0`
+
+**Tests**
+- 31 new tests: `test_multiturn.py` (28), `test_main.py` (3 new CLI tests)
+- Total: 675/675 passing (644 prior + 31 new)
+
+---
+
 ## [1.7.0] — 2026-06-14
 
 ### Added — Phase 17 (Safety-Subspace LoRA — SaLoRA / SPLoRA)

diff --git a/PLAN.md b/PLAN.md
@@ -527,6 +527,52 @@ P3-1 (dual-agent red-team loop) and P3-2 (compliance certification).
 
 ---
 
+## Phase 18 — Multi-Turn Jailbreak Engine (Crescendo) (v1.8.0) [COMPLETE]
+
+**Ship Gate:** 675 Python tests passing. Zero failures. Multi-turn escalation
+verified end-to-end against safe / unsafe / crescendo-vulnerable conversational
+baselines; deterministic per-seed planning; early-exit on first compliance.
+
+### Motivation
+Single-turn safety defenses do not transfer to multi-turn attacks. Crescendo
+(arXiv 2404.01833) reaches 98–100% ASR on frontier models by escalating a
+benign conversation across turns, each message referencing the model's prior
+replies; Echo Chamber (arXiv 2601.05742), GRAF (2506.17881), and AutoAdv
+(2507.01020) confirm multi-turn is the dominant 2026 vector. Every prior toki
+module operated on a single prompt → single response — this was the largest
+blind spot in the coverage map and a prerequisite for the P3-1 dual-agent loop.
+
+### Deliverables
+- [x] `toki.multiturn` — multi-turn jailbreak engine (zero external deps):
+  - `MultiTurnStrategy` — StrEnum: `CRESCENDO` | `ECHO_CHAMBER`
+  - `Turn` (frozen) — index, role, content, optional assistant `score`
+  - `Conversation` — turn list with `to_messages()` (OpenAI-style) + `transcript()`
+  - `MultiTurnConfig` — name, strategy, goal, max_turns, seed, success_threshold,
+    output_dir
+  - `MultiTurnResult` — turns, n_turns, success, success_turn, min_score,
+    final_score, transcript; `to_json()` / `save()` (timestamped, no overwrite)
+    / `load()` rehydrating typed `Turn`s
+  - `Strategy` base + `CrescendoStrategy` / `EchoChamberStrategy` — deterministic
+    opener → escalation ladder → payload planning, exactly `n_turns` messages
+  - `MultiTurnRunner.run(model_fn)` — drives a chat-style
+    `Callable[[list[dict]], str]` through the planned escalation, scores each
+    reply with the real `RuleScorer`, stops early on first success (Crescendo
+    behaviour); `run_multiturn()` convenience wrapper
+  - Built-in conversational baselines: `conv_baseline_safe`, `conv_baseline_unsafe`,
+    `conv_baseline_crescendo` (benign early, capitulates after benign history
+    builds up) — `CONV_BASELINES` registry
+- [x] `toki.coverage` — `CATEGORY_AXIS` + `_DEFAULT_SEVERITY` extended with
+      `"multiturn"` (critical); `_category_for` routes `multi`/`turn` categories
+- [x] CLI: `python -m toki multiturn --strategy crescendo|echo_chamber
+      --model safe|unsafe|crescendo --goal --max-turns --seed
+      --success-threshold --output-dir [--json]`
+- [x] `toki.__init__` exports all new public symbols; `__version__` → `1.8.0`
+- [x] `pyproject.toml` version bumped to `1.8.0`
+- [x] 31 new tests: `test_multiturn.py` (28) + `test_main.py` (3 CLI) — all passing
+- [x] All 644 Phase 1–17 tests still passing (675 total)
+
+---
+
 ## Future / Backlog
 
 - 🟡 **P3-2** — Compliance certification report (OWASP Agentic Top 10 ASI01-ASI10
@@ -539,4 +585,4 @@ P3-1 (dual-agent red-team loop) and P3-2 (compliance certification).
 
 ---
 
-*Last updated: 2026-06-14 — v1.7.0 shipped. Safety-subspace LoRA (SaLoRA/SPLoRA) complete.*
+*Last updated: 2026-06-19 — v1.8.0 shipped. Multi-turn jailbreak engine (Crescendo / Echo Chamber) complete.*
diff --git a/python/pyproject.toml b/python/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "toki"
-version = "1.7.0"
+version = "1.8.0"
 description = "Adversarial fine-tuning lab for small language models"
 license = { text = "BUSL-1.1" }
 requires-python = ">=3.9"

diff --git a/python/tests/test_main.py b/python/tests/test_main.py
@@ -430,3 +430,39 @@ def test_finetune_model_requires_hf(capsys):
     with patch.dict(sys.modules, {"torch": None, "peft": None, "transformers": None}):
         with pytest.raises((ImportError, SystemExit)):
             main(["finetune", "--model", "gpt2"])
+
+
+# ---------------------------------------------------------------------------
+# multiturn CLI (Sprint 18)
+# ---------------------------------------------------------------------------
+
+
+def test_multiturn_command_jailbroken(tmp_path, capsys):
+    main([
+        "multiturn", "--model", "crescendo", "--strategy", "crescendo",
+        "--max-turns", "5", "--output-dir", str(tmp_path),
+    ])
+    captured = capsys.readouterr()
+    assert "JAILBROKEN" in captured.out
+
+
+def test_multiturn_command_safe_holds(tmp_path, capsys):
+    main([
+        "multiturn", "--model", "safe", "--max-turns", "4",
+        "--output-dir", str(tmp_path),
+    ])
+    captured = capsys.readouterr()
+    assert "held" in captured.out
+
+
+def test_multiturn_command_json(tmp_path, capsys):
+    import json as _json
+
+    main([
+        "multiturn", "--model", "unsafe", "--json",
+        "--output-dir", str(tmp_path),
+    ])
+    captured = capsys.readouterr()
+    data = _json.loads(captured.out)
+    assert data["success"] is True
+    assert data["strategy"] == "crescendo"