# entrenar

**Training Framework for the Sovereign AI Stack**

A pure Rust training framework providing autograd, LoRA/QLoRA fine-tuning, quantization (Int4/Int8), model merging, knowledge distillation, and Compiler-in-the-Loop (CITL) training. Built on trueno for SIMD-accelerated compute and aprender for ML algorithms.
- What is entrenar?
- Installation
- Usage
- Features
- Architecture
- Quality
- Sovereign AI Stack
- Documentation
- Contributing
- License
## What is entrenar?

Entrenar (Spanish: "to train") is a production-grade neural network training library in pure Rust. It provides everything needed to train, fine-tune, quantize, merge, and distill models -- with no Python dependency.
Core capabilities:
- Autograd Engine -- Tape-based reverse-mode automatic differentiation
- Optimizers -- SGD, Adam, AdamW with cosine scheduling and gradient clipping
- LoRA / QLoRA -- Parameter-efficient fine-tuning with 4-bit quantized base weights
- Quantization -- QAT, PTQ, GGUF-compatible Q4_0/Q8_0, NF4 training
- Model Merging -- TIES, DARE, SLERP algorithms
- Knowledge Distillation -- Multi-teacher, progressive layer-wise
- CITL -- Compiler-in-the-Loop training for transpiler optimization
- GPU Training -- WGPU backend (AMD/Intel/cross-platform), CUDA/cuBLAS (NVIDIA)
- Monitoring -- Real-time metrics, drift detection, Andon alerts
Part of the PAIML Sovereign AI Stack.
## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
entrenar = "0.7"
```

Or install the CLI:

```sh
cargo install entrenar
```

Or build from source:

```sh
git clone https://github.com/paiml/entrenar
cd entrenar
cargo install --path .
```

## Usage

### Training loop

```rust
use entrenar::train::{Trainer, TrainConfig, MSELoss, EarlyStopping};
use entrenar::optim::Adam;
use entrenar::Tensor;

// `batches` and `model` are defined elsewhere.
let params = vec![Tensor::zeros(784 * 128, true)];
let optimizer = Adam::new(0.001, 0.9, 0.999, 1e-8);

let mut trainer = Trainer::new(params, Box::new(optimizer), TrainConfig::default());
trainer.set_loss(Box::new(MSELoss));
trainer.add_callback(EarlyStopping::new(5, 0.001));

let result = trainer.train(100, || batches.clone(), |x| model.forward(x));
println!("Final loss: {:.4}", result.final_loss);
```

### Autograd operations

```rust
use entrenar::autograd::{matmul, softmax, layer_norm, attention};

let y = matmul(&x, &w);
let s = softmax(&logits);
let n = layer_norm(&x, &gamma, &beta);
let a = attention(&q, &k, &v);
```

### LoRA / QLoRA

```rust
use entrenar::lora::{LoRALayer, QLoRALayer};

// Standard LoRA
let lora = LoRALayer::new(4096, 4096, 16, 32.0);

// QLoRA: 4-bit base + FP16 adapters (7B model: 28GB -> 3.5GB)
let qlora = QLoRALayer::new(base_weights, 16, 32.0);
```

### Quantization

```rust
use entrenar::quant::{FakeQuantize, PTQCalibrator, GGUFQuantizer};

let fq = FakeQuantize::new(8, true);               // QAT with STE
let calibrator = PTQCalibrator::percentile(0.999); // Post-training
let quantizer = GGUFQuantizer::q4_0();             // GGUF export
```

### Model merging

```rust
use entrenar::merge::{TiesMerge, DareMerge, SlerpMerge};

let merged = TiesMerge::new(0.2).merge(&models, &weights);
let merged = DareMerge::new(0.9).merge(&base, &finetuned);
let merged = SlerpMerge::new().merge(&a, &b, 0.5);
```

### Declarative YAML configuration

```yaml
# train.yaml
model:
  path: base-model.gguf
data:
  train: train.parquet
  batch_size: 8
optimizer:
  name: adamw
  lr: 0.0001
lora:
  rank: 64
  alpha: 16
training:
  epochs: 10
  grad_clip: 1.0
```

Run it with:

```sh
entrenar train train.yaml
```

### CLI

```sh
entrenar train config.yaml --epochs 10
entrenar quantize model.safetensors --bits 4 --output model_q4.json
entrenar merge model1.safetensors model2.safetensors --method ties
entrenar bench config.yaml --warmup 5 --iterations 100
entrenar inspect model.safetensors -v
entrenar audit predictions.parquet --type bias --threshold 0.8
entrenar monitor data.parquet --threshold 0.2
```

## Features

### Autograd Engine

Tape-based reverse-mode automatic differentiation with verified gradients. Supports matmul, softmax, layer normalization, and scaled dot-product attention. All gradients are validated against finite-difference reference implementations.
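The finite-difference validation mentioned above can be sketched independently of entrenar's internals: compute an analytic gradient, estimate the same gradient with a central difference, and assert that the two agree. The function names below are illustrative, not entrenar API.

```rust
// Finite-difference gradient checking on a toy function.
// f(x) = sum(x_i^2); analytic gradient is 2 * x_i.
fn f(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v).sum()
}

fn analytic_grad(x: &[f64]) -> Vec<f64> {
    x.iter().map(|v| 2.0 * v).collect()
}

/// Central difference: (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps).
fn numeric_grad(x: &[f64], eps: f64) -> Vec<f64> {
    (0..x.len())
        .map(|i| {
            let mut plus = x.to_vec();
            let mut minus = x.to_vec();
            plus[i] += eps;
            minus[i] -= eps;
            (f(&plus) - f(&minus)) / (2.0 * eps)
        })
        .collect()
}

fn max_abs_diff(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max)
}

fn main() {
    let x = [0.5, -1.25, 3.0];
    let diff = max_abs_diff(&analytic_grad(&x), &numeric_grad(&x, 1e-5));
    assert!(diff < 1e-6, "gradient check failed: {diff}");
    println!("max |analytic - numeric| = {diff:.2e}");
}
```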
### LoRA / QLoRA

Parameter-efficient fine-tuning with up to 99.75% parameter reduction. QLoRA combines 4-bit NF4-quantized base weights with FP16 low-rank adapters, reducing 7B-model memory from 28 GB to 3.5 GB. PEFT-compatible adapter export for interoperability with HuggingFace tooling.
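As a back-of-the-envelope check on where the reduction comes from: a frozen `d x k` weight gains trainable adapters `B (d x r)` and `A (r x k)`, so the trainable fraction is `r*(d + k) / (d*k)`. The helper below is purely illustrative, not part of entrenar's API.

```rust
// Trainable-parameter fraction for a LoRA-adapted d x k layer at rank r.
fn lora_trainable_fraction(d: usize, k: usize, r: usize) -> f64 {
    (r * (d + k)) as f64 / (d * k) as f64
}

fn main() {
    // The 4096 x 4096, rank-16 layer from the usage example above:
    // only ~0.78% of the layer's parameters are trainable.
    let frac = lora_trainable_fraction(4096, 4096, 16);
    println!("trainable fraction: {:.4}%", frac * 100.0);
}
```

Smaller ranks and larger layers push the fraction lower still, which is where figures like 99.75% reduction come from.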
### Quantization

Three quantization strategies: Quantization-Aware Training (QAT) with straight-through estimator, Post-Training Quantization (PTQ) with percentile calibration, and GGUF-compatible Q4_0/Q8_0 export for llama.cpp interoperability. NF4 training with cuBLAS backward-pass support.
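To make the core idea concrete, here is a generic symmetric 8-bit round-trip: scale by the max magnitude, round to integers, and dequantize. This is a first-principles sketch only; entrenar's GGUF Q4_0/Q8_0 kernels use block-wise scales rather than a single per-tensor scale.

```rust
// Symmetric per-tensor int8 quantization (illustrative sketch).
fn quantize_i8(x: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = x
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}

fn main() {
    let x = [0.1f32, -0.5, 0.25, 1.0];
    let (q, scale) = quantize_i8(&x);
    let y = dequantize_i8(&q, scale);
    // Round-trip error is bounded by half the quantization step.
    let err = x.iter().zip(&y).map(|(a, b)| (a - b).abs()).fold(0.0f32, f32::max);
    assert!(err <= scale / 2.0 + 1e-6);
    println!("max round-trip error: {err}");
}
```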
### Model Merging

Three model merging algorithms for combining fine-tuned checkpoints: TIES (Trim, Elect Sign, Merge) for multi-model consolidation, DARE (Dropout And Rescale) for parameter-efficient merging, and SLERP (Spherical Linear Interpolation) for smooth two-model blending.
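The SLERP idea can be sketched in a few lines: interpolate along the great circle between two weight vectors instead of along the straight line. This is the underlying math, not entrenar's `SlerpMerge` implementation.

```rust
// Spherical linear interpolation between two weight vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn norm(a: &[f32]) -> f32 {
    dot(a, a).sqrt()
}

fn slerp(a: &[f32], b: &[f32], t: f32) -> Vec<f32> {
    let cos_theta = (dot(a, b) / (norm(a) * norm(b))).clamp(-1.0, 1.0);
    let theta = cos_theta.acos();
    if theta.abs() < 1e-6 {
        // Nearly parallel vectors: fall back to linear interpolation.
        return a.iter().zip(b).map(|(x, y)| (1.0 - t) * x + t * y).collect();
    }
    let wa = ((1.0 - t) * theta).sin() / theta.sin();
    let wb = (t * theta).sin() / theta.sin();
    a.iter().zip(b).map(|(x, y)| wa * x + wb * y).collect()
}

fn main() {
    // Halfway between orthogonal unit vectors lands on the quarter circle.
    let mid = slerp(&[1.0, 0.0], &[0.0, 1.0], 0.5);
    println!("{mid:?}");
}
```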
### Knowledge Distillation

Temperature-scaled KD loss with configurable alpha weighting between hard and soft targets. Multi-teacher ensemble distillation with weighted aggregation. Progressive layer-wise distillation for large-to-small model transfer.
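The temperature-scaled loss can be written out from first principles: blend the hard cross-entropy with a `T^2`-scaled cross-entropy between teacher and student soft distributions. The function below is a sketch of the standard formulation; entrenar's actual `distill` API may differ.

```rust
// L = alpha * CE(student, hard_label)
//   + (1 - alpha) * T^2 * CE(softmax(teacher/T), softmax(student/T))
fn softmax_t(logits: &[f64], t: f64) -> Vec<f64> {
    // Subtract the max for numerical stability before exponentiating.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exp: Vec<f64> = logits.iter().map(|l| ((l - max) / t).exp()).collect();
    let sum: f64 = exp.iter().sum();
    exp.iter().map(|e| e / sum).collect()
}

fn kd_loss(student: &[f64], teacher: &[f64], hard_label: usize, t: f64, alpha: f64) -> f64 {
    let hard = -softmax_t(student, 1.0)[hard_label].ln();
    let p_t = softmax_t(teacher, t);
    let p_s = softmax_t(student, t);
    // T^2 keeps soft-target gradient magnitudes comparable across temperatures.
    let soft: f64 = -p_t.iter().zip(&p_s).map(|(pt, ps)| pt * ps.ln()).sum::<f64>();
    alpha * hard + (1.0 - alpha) * t * t * soft
}

fn main() {
    let student = [2.0, 0.5, -1.0];
    let teacher = [2.5, 0.0, -1.5];
    let loss = kd_loss(&student, &teacher, 0, 4.0, 0.5);
    assert!(loss.is_finite() && loss > 0.0);
    println!("KD loss: {loss:.4}");
}
```

With `alpha = 1.0` this reduces to plain cross-entropy on the hard labels; lowering alpha shifts weight toward imitating the teacher's distribution.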
### Compiler-in-the-Loop (CITL)

Training loop that incorporates compiler feedback for transpiler optimization. Uses RAG-based fix suggestions via trueno-rag to guide training toward compilable outputs. Designed for the depyler/bashrs/decy transpilation stack.
### GPU Training

WGPU backend for cross-platform GPU training (AMD, Intel, Apple Silicon). NVIDIA CUDA/cuBLAS backend for dedicated GPU acceleration. NVML integration for real-time GPU monitoring. VRAM ledger with file-based locking for multi-process coordination.
### Monitoring

Toyota Way-inspired quality monitoring with real-time metrics collection, drift detection (z-score based), and Andon alert system for automatic anomaly notification. NaN/Inf detection, gradient explosion guards, and loss divergence tracking.
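The z-score drift check described above amounts to flagging observations that sit too many standard deviations from a baseline window. A minimal sketch (illustrative; entrenar's `monitor` module wires this into Andon alerts):

```rust
// Population mean and standard deviation of a baseline window.
fn mean_std(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    (mean, var.sqrt())
}

/// Flags a value whose z-score against the baseline exceeds `threshold`.
fn is_drift(baseline: &[f64], value: f64, threshold: f64) -> bool {
    let (mean, std) = mean_std(baseline);
    if std == 0.0 {
        // Constant baseline: any deviation at all counts as drift.
        return value != mean;
    }
    ((value - mean) / std).abs() > threshold
}

fn main() {
    let baseline = [0.52, 0.48, 0.50, 0.51, 0.49];
    assert!(!is_drift(&baseline, 0.50, 3.0)); // in-distribution
    assert!(is_drift(&baseline, 0.90, 3.0));  // drifted
    println!("drift checks passed");
}
```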
### Feature Flags

| Flag | Purpose |
|---|---|
| `gpu` | GPU-accelerated training via wgpu |
| `cuda` | NVIDIA CUDA/cuBLAS training |
| `citl` | Compiler-in-the-Loop with trueno-rag |
| `monitor` | Training monitoring with trueno-db persistence |
| `server` | REST/HTTP API server via axum |
| `parquet` | Parquet batch loading via alimentar |
| `hub` | HuggingFace Hub model fetching |
| `wasm` | Browser-compatible WASM build |
| `tracing` | Renacer distributed tracing integration |
| `nvml` | Real GPU monitoring via NVIDIA NVML |
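Flags are enabled through Cargo's standard feature syntax, for example:

```toml
[dependencies]
entrenar = { version = "0.7", features = ["gpu", "monitor"] }
```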
## Architecture

```text
entrenar/
  autograd/   Tape-based automatic differentiation
  optim/      SGD, Adam, AdamW, schedulers
  lora/       LoRA, QLoRA fine-tuning
  quant/      QAT, PTQ, GGUF quantization
  merge/      TIES, DARE, SLERP merging
  distill/    Knowledge distillation
  finetune/   ClassifyPipeline, ClassifyTrainer, evaluation
  eval/       Classification metrics, drift detection, Andon
  train/      Trainer, callbacks, metrics, WGPU transformer trainer
  monitor/    Real-time monitoring, Andon alerts
  config/     Declarative YAML configuration
  io/         Model persistence (SafeTensors, APR)
```
## Quality

| Metric | Value |
|---|---|
| Tests | 7,527+ passing |
| Coverage | 96% |
| TDG Score | A+ (96.8/100) |
| Critical Defects | 0 |
| Property Tests | 200K+ iterations |
| Gradient Checking | Finite-difference validated |
| Mutation Testing | >80% kill rate |
| MSRV | 1.87 |
## Sovereign AI Stack

| Crate | Purpose | Version |
|---|---|---|
| trueno | SIMD/GPU compute primitives | 0.16.x |
| aprender | ML algorithms, APR v2 format | 0.27.x |
| entrenar | Training and optimization | 0.7.x |
| realizar | Inference engine (APR/GGUF/SafeTensors) | 0.8.x |
| repartir | Distributed compute (CPU/GPU/Remote) | 2.0.x |
| whisper-apr | Pure Rust Whisper ASR | 0.2.x |
| simular | Simulation engine | 0.3.x |
| batuta | Stack orchestration | 0.7.x |
## Documentation

- API Reference -- Generated from source
- Book -- Comprehensive guide with examples
- Examples -- Runnable training, merging, and monitoring examples
## Contributing

- Fork the repository
- Create your changes on the `master` branch
- Run quality gates: `make lint && make test`
- Run coverage: `make coverage`
- Submit a pull request

See entrenar-cookbook for examples and recipes.

## License

MIT