CAFT is a research codebase accompanying the paper:
Making Limited Data Count: Constraint-Guided Adaptive Fine-Tuning of Neural Networks for Cost-Aware Defect Classification
Kristina Dachtler, Alexander Schiendorfer
ETFA 2026 — Special Session: Addressing Data Scarcity
In manufacturing environments, labeled data is often scarce, imbalanced, and of uncertain quality. Under these conditions, optimizing for conventional aggregate metrics quickly reaches a plateau — and further improvements do not necessarily translate into better operational outcomes. Defining a suitable cost matrix for cost-sensitive learning is non-trivial, and static cost-sensitive training methods tend to collapse on small, imbalanced datasets.
CAFT addresses these challenges with a two-phase training strategy:
- Phase 1 — Pre-Training: A neural network is trained with standard Cross-Entropy (or Weighted Cross-Entropy) loss to establish stable feature representations.
- Phase 2 — Constraint-Aware Fine-Tuning: Starting from the pre-trained weights, misclassification penalties are incrementally increased until predefined operational decision constraints are satisfied. The method does not require a predefined cost matrix — instead, constraint-relevant cost entries are built up adaptively via Lagrangian multipliers.
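
The adaptive build-up of cost entries in Phase 2 can be illustrated with a minimal, framework-free sketch. It is not the repository's implementation. The function names (`confusion_rate`, `update_multipliers`, `build_cost_matrix`), the constraint format (a maximum rate for confusing one class with another), the step size, and the rule that a multiplier only grows while its constraint is violated are all assumptions made for illustration.

```python
def confusion_rate(y_true, y_pred, true_cls, pred_cls):
    """Fraction of samples with label `true_cls` that were predicted as `pred_cls`."""
    members = [p for t, p in zip(y_true, y_pred) if t == true_cls]
    if not members:
        return 0.0
    return sum(p == pred_cls for p in members) / len(members)


def update_multipliers(multipliers, constraints, y_true, y_pred, step=1.0):
    """Raise each multiplier in proportion to how far its constraint is violated.

    Assumption: multipliers never shrink, so penalties only increase
    until the constraint is met.
    """
    updated = {}
    for (true_cls, pred_cls), max_rate in constraints.items():
        violation = confusion_rate(y_true, y_pred, true_cls, pred_cls) - max_rate
        lam = multipliers.get((true_cls, pred_cls), 0.0)
        updated[(true_cls, pred_cls)] = lam + step * max(0.0, violation)
    return updated


def build_cost_matrix(num_classes, multipliers):
    """Unit cost for every error, plus the multiplier on constrained entries."""
    cost = [[0.0 if i == j else 1.0 for j in range(num_classes)]
            for i in range(num_classes)]
    for (true_cls, pred_cls), lam in multipliers.items():
        cost[true_cls][pred_cls] += lam
    return cost


# Hypothetical constraint: at most 5% of class-2 samples may be predicted as class 0.
constraints = {(2, 0): 0.05}
y_true = [2, 2, 2, 2, 0, 1]
y_pred = [0, 2, 2, 2, 0, 1]  # one of four class-2 samples is predicted as class 0

multipliers = update_multipliers({}, constraints, y_true, y_pred, step=2.0)
cost = build_cost_matrix(3, multipliers)
```

In a fine-tuning loop, a step like this would presumably run on validation predictions after each epoch, with the resulting matrix weighting the loss for the next epoch. Because the constraints are stored in a dictionary, several of them can be tracked at once without changing the code.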
Key results:
- CAFT achieves higher constraint satisfaction rates than static cost-sensitive methods across two datasets, without model collapse.
- CAFT maintains the highest F1-score among all methods that fulfill the decision constraint.
- Compared to post-hoc threshold tuning, CAFT shows more robust generalization from validation to test data.
- The approach scales naturally to multiple simultaneous constraints, a setting where threshold-based alternatives become impractical.
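
The comparison criterion used above (highest F1-score among the methods that fulfill the decision constraint) can be sketched as a small selection helper. The record fields (`method`, `f1`, `violation_rate`), the function name, and all numbers are hypothetical placeholders, not results from the paper or outputs of this repository.

```python
def select_best_feasible(results, max_violation_rate):
    """Return the feasible result with the highest F1, or None if none is feasible."""
    feasible = [r for r in results if r["violation_rate"] <= max_violation_rate]
    if not feasible:
        return None
    return max(feasible, key=lambda r: r["f1"])


# Placeholder numbers for illustration only.
results = [
    {"method": "method_a", "f1": 0.90, "violation_rate": 0.12},
    {"method": "method_b", "f1": 0.85, "violation_rate": 0.04},
    {"method": "method_c", "f1": 0.80, "violation_rate": 0.01},
]

# method_a has the best F1 but violates the constraint, so it is excluded.
best = select_best_feasible(results, max_violation_rate=0.05)
```

Filtering by feasibility first and ranking by F1 second keeps a high-scoring model that violates the constraint from being selected.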
The experiments use two datasets:
- Paint Defects (real-world): Automotive paint defect classification with 2,681 samples, 5 features, and 3 classes (not included due to confidentiality reasons).
- Steel Plates Faults (public benchmark): Adapted from the UCI Steel Plates Faults dataset with 1,941 samples, 27 features, grouped into 3 classes.
Installation:
- Clone the repository:

      git clone https://github.com/AImotion-Bavaria/caft
      cd caft

- Install dependencies (Python >=3.10):

      pip install -e .

  This uses the dependencies specified in pyproject.toml.
Project structure:
- src/: Core modules (models, training, evaluation, utilities)
- config/: Central configuration (paths)
- artifacts/: Data, models (model_weights from pre-training), and results (auto-generated)
- experiments/: Experiment scripts for different datasets
- create_experiment_plots.py: Aggregates and plots experiment results
- create_experiment_summary.py: Summarizes results from text outputs
To run a neural network experiment on the steel plates dataset:

    python experiments/steel_plates/01_torch_NN_training_SP.py

Results are aggregated and plotted after running experiments:

    python create_experiment_plots.py
    python create_experiment_summary.py

This project is licensed under the MIT License.
Kristina Dachtler (kristina.dachtler@thi.de)