Graph Generation with Network Topology Constraints

RL training task: Teach LLMs to generate graphs matching precise network topology metrics (clustering coefficient, average path length, connectivity).

Overview

This task teaches the model to work with network science and graph theory by generating graphs that satisfy strict topological constraints. The model must:

Choose appropriate graph generation algorithms (Erdős-Rényi, Barabási-Albert, Holme-Kim triadic closure, etc.)
Understand how different parameters affect network properties
Use iterative refinement techniques (edge rewiring, pruning) to hit precise targets
Balance multiple competing objectives under resource constraints

Why this matters for ML engineers: Graph generation with constraints is fundamental in many domains:

Synthetic network datasets for GNN training
Network topology optimization for distributed systems
Social network simulation and analysis
Biological network modeling (protein interactions, neural connectivity)

Task Description

Given a target profile for three key graph metrics:

C (Clustering Coefficient): Measures local triangle density, typically 0.08-0.12
L (Average Path Length): Average shortest path in the giant connected component, typically 4.2-6.0
CC (Connected Components): Must be exactly 1 (fully connected graph)

The model must generate a graph of 400 nodes that achieves a score ≥ 80-85% (randomly chosen per run).

Scoring Function

Score = 100 × average(closeness(C), closeness(L), closeness(CC))

where closeness(metric) = 1 - |actual - target| / tolerance

Hard constraint: If ANY metric achieves < 30% closeness, the total score is zero. This prevents the model from ignoring difficult metrics.

What the Model Learns

1. Graph Generation Algorithms

Triadic closure models (Holme-Kim): Best for achieving realistic C values through triangle formation
Barabási-Albert: Power-law degree distribution, but lower clustering
Erdős-Rényi: Uniform random, poor for this task (C too low)
Configuration model: Custom degree sequences

2. Network Topology Intuitions

m parameter: Higher m → denser graph → shorter L, higher C
Triadic probability p: Higher p → more triangles → higher C
Trade-offs: C and L are often anti-correlated (densification reduces distances)

3. Iterative Refinement

Standard rewiring: Preserves degree distribution, adjusts C/L
Triangle-boosting rewiring: Smart swaps that increase clustering
Edge pruning: Carefully remove edges to increase L while maintaining connectivity

4. Constraint Satisfaction Under Budgets

120 rewire operations max
60 metric computation calls max
25 agent steps max
Penalty for excessive metric calls (>25)

5. Early Stopping Strategy

The task includes a "patience" mechanism: if the model shows no improvement for 6 consecutive checks after 10 attempts, it should abandon the current approach and try a different strategy.

Setup Instructions

1. Install `uv` (if not already installed)

Linux/macOS:

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows (PowerShell):

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

2. Clone and Navigate

git clone https://github.com/ElkinStas/graph-generation-rl-task
cd graph-generation-rl-task

3. Set API Key

export ANTHROPIC_API_KEY='your_api_key_here'

4. Run the Task

uv run main.py

uv will automatically:

Create a virtual environment
Install dependencies (anthropic>=0.67.0, networkx>=3.0)
Execute the test suite

Running the Task

The script runs 10 independent trials by default. Each trial:

Randomizes target metrics (C, L) and pass threshold
Gives the model a unique seed for reproducibility
Runs the agent loop with tool access
Grades the final submission
Logs detailed execution to logs/

To change number of runs, edit the last line in main.py:

asyncio.run(main(concurrent=False, num_runs=20))  # Run 20 trials

Execution modes:

concurrent=False: Sequential (easier to debug, recommended)
concurrent=True: Parallel execution (faster)

Grading Criteria

✅ Correct answers are accepted if:

Score ≥ randomly chosen threshold (80-85%)
Graph is connected (CC = 1)
All metrics within tolerance bands

❌ Common failure modes:

Using Erdős-Rényi (C too low, always fails)
Generating disconnected graphs (CC > 1)
Getting stuck in local optima (not exploring enough)
Excessive metric calls (penalty applied after 25 calls)
Early termination due to patience threshold

Task Difficulty & Pass Rate

Expected pass rate: 10-40% (per XOR requirements)

This is challenging because:

Tight constraints: Each metric must be ≥30% close to target
Multi-objective optimization: C and L trade off (harder to satisfy both)
No direct analytical solution: Model must learn heuristics through trial-and-error
Resource constraints: Limited rewire/metric budgets force efficiency
Randomized targets: Each run has different C/L goals (generalization required)

Why models fail:

40%: Wrong graph generator choice (e.g., ER instead of triadic)
30%: Poor parameter selection (m/p values)
20%: Insufficient refinement iterations
10%: Budget exhaustion or early stopping

Why models succeed:

Use triadic closure model (Holme-Kim) with appropriate m/p
Compute metrics immediately after generation
Apply targeted refinement (rewire_triangles for C, prune_edges for L)
Submit as soon as threshold reached (avoid over-optimization)

Tools Available

The model has access to 6 tools:

Tool	Purpose	Budget
`generate_graph(kind, params, seed)`	Create initial graph	Unlimited
`rewire(k)`	Random edge swaps (1-5)	120 total
`rewire_triangles(k)`	Clustering-aware swaps (1-5)	120 total
`prune_edges(k)`	Remove edges safely (1-3)	120 total
`compute_metrics()`	Get C, L, CC + score	60 calls
`submit_graph()`	Final submission	Once

Key design decisions:

Shared rewire budget: All rewire operations (rewire + rewire_triangles + prune_edges) share the 120-call budget
Auto-submission: When score ≥ threshold, graph is auto-submitted (no need to call submit explicitly)
Deterministic connectivity: When CC=1 is required, rewire operations that break connectivity are automatically rejected
Efficient L computation: Uses 22 sampled BFS sources (exact APSP would be O(n³) on 400 nodes)

Logs & Reproducibility

Every run generates:

Detailed log: logs/run_<id>_<timestamp>.log - step-by-step execution
Conversation history: logs/run_<id>_<timestamp>.conversation.json - full transcript (on failure)
Summary files:
- logs/summary_<timestamp>.jsonl - Machine-readable results
- logs/summary_<timestamp>.csv - Excel-friendly format

Seed logging: Each run logs its seed prominently for exact reproducibility.

Example Output

============================================================
Running 10 test iterations sequentially...
Target ranges: C=(0.08, 0.12), L=(4.2, 6.0)
Pass threshold range: (80, 85)
============================================================

Run 1: SUCCESS - score=82.3 (threshold=81.2), best=82.3
   Metrics: C=0.0987, L=5.124, CC=1, GCC_size=400
   Duration: 45.2s
   Log: logs/run_1_20250103_143022.log

Run 2: FAILURE - score=68.4 < 83.7, best=68.4
   Metrics: C=0.0654, L=5.892, CC=1, GCC_size=400
   C<30% (C=0.0654, target≈0.095±0.035, closeness=26.3%)
   Duration: 38.1s
   Log: logs/run_2_20250103_143101.log

[... 8 more runs ...]

============================================================
TEST RESULTS
============================================================
  Passed: 3/10
  Failed: 7/10
  Pass Rate: 30.0%
  Avg Score: 64.2
  Score Range: [0.0, 86.1]

Pass rate in target range (10-40%)

Task Statistics

Lines of code: ~850 (main task logic)
Dependencies: 2 (anthropic, networkx)
Avg inference cost: ~$0.50-1.00 per 10 runs (with Claude Sonnet 4)
Avg runtime: 40-60 seconds per run (sequential)

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitignore		.gitignore
README.md		README.md
main.py		main.py
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Graph Generation with Network Topology Constraints

Overview

Task Description

Scoring Function

What the Model Learns

1. Graph Generation Algorithms

2. Network Topology Intuitions

3. Iterative Refinement

4. Constraint Satisfaction Under Budgets

5. Early Stopping Strategy

Setup Instructions

1. Install `uv` (if not already installed)

2. Clone and Navigate

3. Set API Key

4. Run the Task

Running the Task

Grading Criteria

Task Difficulty & Pass Rate

Tools Available

Logs & Reproducibility

Example Output

Task Statistics

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Graph Generation with Network Topology Constraints

Overview

Task Description

Scoring Function

What the Model Learns

1. Graph Generation Algorithms

2. Network Topology Intuitions

3. Iterative Refinement

4. Constraint Satisfaction Under Budgets

5. Early Stopping Strategy

Setup Instructions

1. Install uv (if not already installed)

2. Clone and Navigate

3. Set API Key

4. Run the Task

Running the Task

Grading Criteria

Task Difficulty & Pass Rate

Tools Available

Logs & Reproducibility

Example Output

Task Statistics

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. Install `uv` (if not already installed)

Packages