C64 AI Companion

C64 AI Companion is a reproducible fine-tuning project that adapts reasoning-capable Ministral 3 models (8B and 14B profiles) to technical Commodore 64 knowledge.

Project Objective

The objective is to keep strong reasoning behavior while adding accurate, practical C64 technical knowledge for topics such as BASIC, KERNAL, memory map, VIC-II, SID, and 6502/6510 workflows.

What This Project Is and Is Not

  • It is a container-first training and packaging workflow for controlled base-model profiles.
  • It is a reproducible engineering pipeline for dataset preparation, DAPT/SFT fine-tuning, and GGUF export.
  • It enforces a visible reasoning contract: [THINK]...[/THINK] followed by the final answer.
  • It is not a generic multi-model framework.
  • It does not use user-global model caches as authoritative model paths.
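The reasoning contract can be spot-checked mechanically. A minimal sketch, assuming replies arrive as plain strings; the helper name is illustrative, and the authoritative rules live in docs/specs/reasoning_contract.md:

```shell
# Sketch: verify a reply opens with a [THINK] block and carries a
# non-empty final answer after [/THINK]. Helper name is illustrative.
check_contract() {
  case "$1" in
    "[THINK]"*"[/THINK]"*[![:space:]]*) echo "contract ok" ;;
    *) echo "contract violated" ;;
  esac
}
check_contract "[THINK]Recall the SID register map.[/THINK] The SID offers three voices."
```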

Architecture and Reproducibility

  • Training and packaging run in Docker.
  • Canonical training image: rocm/pytorch:rocm7.2_ubuntu24.04_py3.12_pytorch_release_2.9.1.
  • Canonical base model paths:
    • models/Ministral-3-8B-Thinking (--model-profile 8b, default)
    • models/Ministral-3-14B-Thinking (--model-profile 14b)
  • Project-local cache: .cache/huggingface.
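The --model-profile flags map onto those canonical paths. A small sketch of that mapping (the function name is illustrative, not part of the project's scripts):

```shell
# Resolve --model-profile values to the canonical base-model paths
# listed above (function name is illustrative).
model_path_for_profile() {
  case "$1" in
    8b)  echo "models/Ministral-3-8B-Thinking" ;;
    14b) echo "models/Ministral-3-14B-Thinking" ;;
    *)   echo "unsupported profile: $1" >&2; return 1 ;;
  esac
}
model_path_for_profile 8b    # -> models/Ministral-3-8B-Thinking
```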

Strix Halo Runtime Note

This project intentionally documents a split between host runtime details and container runtime details.

  • Host runtime is infrastructure context.
  • Container runtime is the training source of truth for reproducibility.
  • For this workstation profile, container ROCm/HIP 7.x userland is the supported training runtime for Strix Halo compatibility.

See docs/workstation_profile.md for the host/container compatibility matrix.

Quickstart

  1. Export your host user IDs for container file ownership:
export LOCAL_UID=$(id -u)
export LOCAL_GID=$(id -g)
  2. Build image:
docker compose build trainer
  3. Run GPU smoke test:
docker compose run --rm trainer bash scripts/container/gpu_smoke.sh
  4. Build datasets:
docker compose run --rm trainer bash scripts/container/pipeline.sh

For 14B:

docker compose run --rm trainer bash scripts/container/pipeline.sh --model-profile 14b
  5. Train (DAPT + SFT):
docker compose run --rm trainer bash scripts/container/train.sh

For 14B:

docker compose run --rm trainer bash scripts/container/train.sh --model-profile 14b
  6. Export GGUF:
docker compose run --rm trainer bash scripts/container/export_gguf.sh \
  --model-profile 8b \
  --quantization Q4_K_M

For 14B:

docker compose run --rm trainer bash scripts/container/export_gguf.sh \
  --model-profile 14b \
  --quantization Q4_K_M
  7. Optional extra quantizations (Q6_K, Q8_0):
bash scripts/inference/quantize_additional_gguf.sh --model-profile 8b
  8. Validate reasoning contract (single-turn + multi-turn):
bash scripts/inference/validate_reasoning_behavior.sh --model-profile 8b
  9. Run the reproducible GGUF benchmark matrix (container-run, CSV output):
bash scripts/inference/benchmark_gguf_matrix.sh --model-profile 8b

Inference Runtimes

  • Ollama helper:
bash scripts/inference/create_ollama_models.sh --model-profile 8b
  • llama.cpp helper:
bash scripts/inference/run_llama_cpp.sh Q8_0 "Explain SID voices in two concise points." --model-profile 8b
  • llama.cpp server (OpenAI-compatible API / GUI reasoning panel):
python3 scripts/prompt_contract.py --model-profile 8b --print-full > .cache/runtime/c64_system_prompt_8b.txt
./llama-server \
  -hf ibitato/c64-ministral-3-8b-thinking-c64-reasoning-gguf:F16 \
  --host 0.0.0.0 --port 8080 \
  --jinja \
  --reasoning-format deepseek \
  --reasoning-budget -1 \
  --system-prompt-file .cache/runtime/c64_system_prompt_8b.txt \
  --ctx-size 32768 \
  -ngl 99 \
  --temp 0.15 \
  --threads "$(nproc)" \
  --fit on

Use --reasoning-format none to receive raw [THINK]...[/THINK] tags in the response content instead of the GUI's separate reasoning panel.
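Once the server is up, it speaks the standard OpenAI-style chat endpoint. A hedged sketch of a request (the prompt is illustrative, and the port matches the --port 8080 flag above):

```shell
# Build a chat request payload for the llama-server OpenAI-compatible
# endpoint (prompt is illustrative).
cat > /tmp/c64_request.json <<'EOF'
{
  "messages": [
    {"role": "user", "content": "What does POKE 53281,0 do on a C64?"}
  ],
  "temperature": 0.15
}
EOF
python3 -m json.tool /tmp/c64_request.json > /dev/null && echo "payload ok"
# Send it (requires the llama-server instance above to be running):
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @/tmp/c64_request.json || echo "server not reachable"
```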

  • Benchmark all GGUF variants and write results/benchmarks/*.csv:
bash scripts/inference/benchmark_gguf_matrix.sh --model-profile 8b --models "F16 Q4_K_M Q6_K Q8_0"
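The CSV output lends itself to quick post-processing. A hypothetical example; the column names below are illustrative, so check the header of the actual results/benchmarks/*.csv files first:

```shell
# Sample CSV in an assumed shape (quant, tokens/sec); the real
# benchmark columns may differ.
cat > /tmp/sample_benchmark.csv <<'EOF'
quant,tokens_per_sec
F16,12.5
Q4_K_M,31.0
Q6_K,22.4
Q8_0,18.2
EOF
# Report the fastest quantization in the file:
awk -F, 'NR > 1 && $2 + 0 > best { best = $2 + 0; q = $1 } END { print q, best }' \
  /tmp/sample_benchmark.csv
```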

Hugging Face Artifacts

8B profile:

14B profile:

Publication Traceability

This repository and the Hugging Face artifacts are maintained as a bidirectional publication set:

  • GitHub -> Hugging Face: this README links to the LoRA and GGUF model repositories.
  • Hugging Face -> GitHub: both model cards link back to this repository (https://github.com/ibitato/C64_AI_Companion).
  • Reproducibility metadata (trainer states, training summary, model cards) is generated by scripts/release/publish_hf.py from local artifacts.

For operational publishing details, see docs/release_huggingface.md.

Repository Structure

C64_AI_Companion/
|-- c64_docs/                    # Source C64 manuals used for dataset generation
|-- data/                        # Interim and processed datasets
|-- docs/                        # Project documentation and operational manuals
|-- models/                      # Base model, fine-tuned outputs, GGUF exports
|-- scripts/                     # Data pipeline, training, export, runtime automation
|-- tests/                       # GPU and data-pipeline validation tests
|-- Dockerfile.train
|-- docker-compose.yml
`-- requirements*.txt

Documentation Index

Start at docs/index.md.

See docs/specs/reasoning_contract.md for the authoritative reasoning/output contract.

Security and Model Policy

  • Base model path is restricted by policy to profile canonical paths:
    • models/Ministral-3-8B-Thinking (8b)
    • models/Ministral-3-14B-Thinking (14b)
  • Sensitive tokens must be stored in local .env only.
  • Large artifacts and caches remain excluded by .gitignore.
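A hedged sketch of a local .env (LOCAL_UID/LOCAL_GID match the quickstart; the exact token variable the publishing scripts read is an assumption, so confirm it in docs/release_huggingface.md):

```shell
# .env (git-ignored) -- variable names are illustrative
LOCAL_UID=1000
LOCAL_GID=1000
# Hugging Face write token for publishing; actual variable name may differ.
HF_TOKEN=hf_your_token_here
```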

AI Co-Authors

This project includes AI-assisted engineering contributions from:

  • Mistral AI Devstral 2
  • Mistral AI Vibe CLI
  • Codex 5.3
  • Codex CLI

See CREDITS.md for contribution scope and responsibility boundaries.
