Technical Reference: Problem structure, Judge API, and evaluation details for the algorithmic track.
For model evaluation workflow, see SUBMIT.md.
Each problem in problems/{id}/ contains:
problems/{id}/
├── statement.txt # Problem description
├── tag.txt # Category tag
├── config.yaml # Time/memory limits, test count
├── testdata/ # Test cases (public: 1 per problem)
│ ├── 1.in
│ └── 1.ans
└── chk.cc / interactor.cc # Checker or interactor
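A quick way to sanity-check a problem directory against the layout above is sketched below. The helper name `missing_problem_files` is hypothetical and not part of Frontier-CS, and the rule that every `N.in` has a matching `N.ans` is an assumption inferred from the example tree.

```python
from pathlib import Path

REQUIRED_FILES = ("statement.txt", "tag.txt", "config.yaml")
JUDGE_PROGRAMS = ("chk.cc", "interactor.cc")


def missing_problem_files(problem_dir):
    """Return the entries from the expected layout that are absent.

    Hypothetical helper: it only mirrors the tree shown above.
    """
    root = Path(problem_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]

    testdata = root / "testdata"
    if not testdata.is_dir():
        missing.append("testdata/")
    else:
        # ASSUMPTION: each input file N.in is paired with an answer file N.ans.
        for in_file in sorted(testdata.glob("*.in")):
            if not in_file.with_suffix(".ans").is_file():
                missing.append(f"testdata/{in_file.stem}.ans")

    # The layout lists either a checker or an interactor.
    if not any((root / name).is_file() for name in JUDGE_PROGRAMS):
        missing.append("chk.cc or interactor.cc")
    return missing
```

An empty list means the directory matches the layout; anything else names what is missing.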
- Language: C++17 only
- Single file: Submit one `.cpp` file per problem
- Fetch problem statement from judge API
- Generate solution via LLM (C++ code)
- Submit to judge server
- Poll for result
- Score based on test case pass rate
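As a rough illustration of the last step, the simplest reading of "pass rate" is the fraction of test cases that pass. This is only a sketch under that assumption; the judge's actual formula may weight cases or award partial credit, and `pass_rate` is a hypothetical name.

```python
def pass_rate(case_passed):
    """Fraction of passed test cases, between 0.0 and 1.0.

    ASSUMPTION: every test case counts equally; the real judge may differ.
    """
    results = list(case_passed)
    if not results:
        return 0.0
    return sum(1 for ok in results if ok) / len(results)


# Three of four cases passed.
print(pass_rate([True, True, False, True]))  # 0.75
```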
The judge server saves each solution and its detailed judging results under algorithmic/submissions.
| Endpoint | Description |
|---|---|
| GET /problems | List all problems |
| GET /problem/{id}/statement | Get problem statement |
| POST /submit | Submit solution |
| GET /result/{sid} | Get submission result |
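The submit and result endpoints above can be combined into a small submit-and-poll loop, sketched here with the standard library only. The request and response formats are not documented in this reference, so the JSON bodies and the field names `pid`, `code`, `lang`, `sid`, and `status` (with the value `"done"`) are all assumptions; check the judge server source before relying on them.

```python
import json
import time
import urllib.request


def http_json(method, url, payload=None):
    """Send a request and decode a JSON response body (assumed format)."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def submit_and_wait(base_url, problem_id, code, send=http_json,
                    poll_interval=1.0, max_polls=60):
    """Submit a C++ solution, then poll GET /result/{sid} until it finishes."""
    # ASSUMPTION: payload and response field names are hypothetical.
    submission = send("POST", f"{base_url}/submit",
                      {"pid": problem_id, "code": code, "lang": "cpp"})
    sid = submission["sid"]
    for _ in range(max_polls):
        result = send("GET", f"{base_url}/result/{sid}")
        if result.get("status") == "done":
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"submission {sid} did not finish after {max_polls} polls")
```

The `send` parameter exists so the loop can be exercised with a fake transport in tests, without a running judge.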
from frontier_cs import SingleEvaluator
evaluator = SingleEvaluator()
# Evaluate an algorithmic problem
result = evaluator.evaluate("algorithmic", problem_id=1, code=cpp_code)
print(f"Score: {result.score}")
# Get unbounded score (without clipping)
result = evaluator.evaluate("algorithmic", problem_id=1, code=cpp_code, unbounded=True)
print(f"Score: {result.score}") # Uses unbounded when unbounded=True
print(f"Score (unbounded): {result.score_unbounded}")

# Evaluate a solution
frontier eval algorithmic 1 solution.cpp
# Get unbounded score
frontier eval algorithmic 1 solution.cpp --unbounded

For batch evaluation of multiple solutions, see SUBMIT.md.
frontier batch algorithmic # Evaluate all in solutions/
frontier batch algorithmic --backend skypilot # Use cloud go-judge
frontier batch algorithmic --status # Check progress

Note: For the algorithmic track, --clusters is not used. All workers share a single go-judge server (local Docker or SkyPilot).
For environments where Docker privileged mode is unavailable (e.g., gVisor, Cloud Run):
# Auto-launch cloud judge
frontier eval algorithmic 1 solution.cpp --backend skypilot
# Or manually launch
sky launch -c algo-judge algorithmic/sky-judge.yaml --idle-minutes-to-autostop 10
frontier eval algorithmic 1 solution.cpp --judge-url http://$(sky status --ip algo-judge):8081

For contributing problems to Frontier-CS (detailed file formats, CI requirements), see CONTRIBUTING.md.
time_limit: 1000 # ms
memory_limit: 262144 # KB
test_count: 10
checker: chk.cc # or interactor: interactor.cc

The judge server will be auto-started when running frontier eval algorithmic ....
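For the config.yaml fields shown above, the following sketch reads the flat `key: value` lines and converts the limits into seconds and MiB. It is not the loader Frontier-CS uses: the function names are hypothetical, the hand-rolled parser only handles flat files like the example (a real loader would more likely use a YAML library), and it assumes 1 KB means 1024 bytes.

```python
def parse_flat_config(text):
    """Parse flat `key: value  # comment` lines into a dict of strings."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def limits_in_common_units(config):
    """Convert time_limit (ms) and memory_limit (KB) to seconds and MiB."""
    return {
        "time_seconds": int(config["time_limit"]) / 1000,
        # ASSUMPTION: 1 KB = 1024 bytes, so 262144 KB is 256 MiB.
        "memory_mib": int(config["memory_limit"]) / 1024,
        "test_count": int(config["test_count"]),
    }


example = """\
time_limit: 1000 # ms
memory_limit: 262144 # KB
test_count: 10
checker: chk.cc
"""
print(limits_in_common_units(parse_flat_config(example)))
```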
environment:
PORT: "8081" # API port
JUDGE_WORKERS: "8" # Concurrent evaluations
GJ_PARALLELISM: "8" # go-judge parallelism
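A client-side script could read the same variables to work out where a local judge is listening. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the fallback values simply mirror the ones shown above rather than documented server defaults.

```python
import os


def judge_settings(env=None):
    """Read the judge settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {
        # ASSUMPTION: fallbacks copy the values shown in the environment block.
        "port": int(env.get("PORT", "8081")),
        "judge_workers": int(env.get("JUDGE_WORKERS", "8")),
        "gj_parallelism": int(env.get("GJ_PARALLELISM", "8")),
    }


def local_judge_url(settings, host="localhost"):
    """Build the base URL of a judge server running on the given host."""
    return f"http://{host}:{settings['port']}"


print(local_judge_url(judge_settings({})))  # http://localhost:8081
```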