Exploring how language models converge on token sequences — and what adversarial pressure reveals about their training.
```shell
uv run --with mlx-lm mlx_example.py -l 20 -m sequential    # Solo: 2 iterations
uv run --with mlx-lm mlx_example.py -l 20 -m adversarial   # Battle: never converges
```

| Mode | Solo | Adversarial |
|---|---|---|
| Sequential (greedy L→R) | 2 iterations | Never converges (oscillates) |
| Single-token + blocking | ~2.5n iterations | ~3n iterations |
A single model has coherent internal preferences and quickly finds a stable state. Two different models have incompatible preferences and fight forever.
The most interesting finding: adversarial pressure reveals what models are made of.
| Model | Solo Mode | Adversarial Mode |
|---|---|---|
| Llama 3B | Varied: `\n`, `,`, `{}` braces | Collapsed: `!` (100%) |
| Qwen 1.5B | Structured: `\n\n`, `#` headers | Collapsed: spaces, Chinese chars |
1. Adversarial pressure collapses diversity
In solo mode, models use context-appropriate tokens. Under adversarial pressure, they fall back to whatever they're most confident about regardless of context.
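One way to quantify that collapse is the normalized Shannon entropy of each model's insertion distribution (1.0 = maximally varied, 0.0 = a single token). A minimal sketch, not part of the script itself:

```python
import math
from collections import Counter

def insertion_diversity(insertions: list[str]) -> float:
    """Shannon entropy of the insertion distribution, normalized to [0, 1]."""
    counts = Counter(insertions)
    if len(counts) <= 1:
        return 0.0                       # a single repeated token: fully collapsed
    total = len(insertions)
    h = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return h / math.log2(len(counts))    # divide by max entropy for this many tokens

insertion_diversity(["!"] * 43)              # fully collapsed -> 0.0
insertion_diversity(["\n", ",", "{", "}"])   # fully varied -> 1.0
```

Llama's adversarial fingerprint (43× `!`) scores 0.0 by this measure; its varied solo behavior scores near 1.0.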
2. The `!` finding
Llama never uses `!` in solo mode. But under adversarial pressure, it's the only thing it uses. This isn't Llama's "favorite" token — it's Llama's most defensible token. The one it can justify in the widest range of contexts.
3. Chinese tokens only under pressure
Qwen's Chinese tokens (输入 "input", 错误 "error", 背景 "background", 的 possessive "of") don't appear in solo mode. They emerge when Qwen is actively contested — a fallback to vocabulary that Llama can't contest effectively.
4. Training strata revealed
Solo evaluation shows surface behavior. Adversarial pressure reveals composition — the training data that shows through when everything else is stripped away.
| Mode | Description |
|---|---|
| `sequential` | Greedy left-to-right, recompute after each token |
| `single` | One token per iteration + oscillation blocking |
| `batch` | Replace all mismatches at once |
| `adversarial` | Two models, full sequential pass each |
| `adversarial-single` | Two models, one token each + blocking + fingerprinting |
```shell
# Solo modes
python mlx_example.py -l LENGTH -m single|batch|sequential

# Adversarial modes (with fingerprinting)
python mlx_example.py -l LENGTH -m adversarial|adversarial-single \
    --model-a mlx-community/Qwen2.5-1.5B-Instruct-4bit \
    --model-b mlx-community/Llama-3.2-3B-Instruct-4bit

# Options
-l, --length      Number of random tokens (default: 10)
-n, --iterations  Max iterations (default: 100)
-m, --mode        Convergence mode
--model-a         First model
--model-b         Second model (adversarial modes)
```

- Start: Generate random token sequence
- Each iteration: Find highest-entropy position where model's prediction differs from actual
- Replace: Swap actual token with model's argmax prediction
- Track: Record what tokens each model tries to insert (fingerprinting)
- Block: Positions that flip 3+ times get blocked to prevent infinite oscillation
- Repeat: Until convergence or oscillation detected
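The steps above can be sketched model-agnostically. Here `predict` is a hypothetical stand-in for an mlx-lm forward pass that returns each position's (argmax token, entropy) pair — a sketch of the loop, not the script's actual implementation:

```python
from typing import Callable

def converge(tokens: list[int],
             predict: Callable[[list[int]], list[tuple[int, float]]],
             max_iters: int = 100,
             flip_limit: int = 3) -> list[int]:
    """Single-token convergence with oscillation blocking.

    `predict(tokens)` must return, for each position, a
    (argmax_token, entropy) pair — standing in for a real model pass.
    """
    flips = [0] * len(tokens)           # per-position flip counts
    for _ in range(max_iters):
        preds = predict(tokens)
        # Mismatched positions that aren't blocked yet
        candidates = [(ent, i, tok) for i, (tok, ent) in enumerate(preds)
                      if tok != tokens[i] and flips[i] < flip_limit]
        if not candidates:
            return tokens               # converged (or every mismatch is blocked)
        ent, i, tok = max(candidates)   # highest-entropy mismatch wins
        tokens[i] = tok                 # replace with the model's argmax
        flips[i] += 1                   # 3+ flips blocks the position
    return tokens
```

With a toy predictor that always wants token 0, `converge([1, 2, 3], ...)` settles to `[0, 0, 0]` in three iterations — one replacement per pass, highest-entropy position first.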
```
MODEL FINGERPRINTS (tokens each model compulsively inserts)
============================================================
Model A (Qwen) top insertions (44 total):
  27x ' '
   5x '-'
   4x '!'
   1x '的'
   1x '们'

Model B (Llama) top insertions (43 total):
  43x '!'
```
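The tally behind a report like this is a straightforward counter over each model's attempted insertions; a minimal sketch (assuming insertions are collected as decoded token strings):

```python
from collections import Counter

def fingerprint(insertions: list[str], top: int = 5) -> str:
    """Format a model's insertion counts like the report above."""
    counts = Counter(insertions)
    return "\n".join(f"{n}x {tok!r}" for tok, n in counts.most_common(top))

print(fingerprint(["!"] * 43))  # -> 43x '!'
```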
This technique works like a mass spectrometer for models:
- Apply adversarial pressure to strip away surface behavior
- What remains reveals the training composition
- Different models have different "elemental signatures"
Not what models prefer — what they're made of.