An embeddable computer-vision runtime for face & person understanding. C++ core, plug-and-play bindings, runs everywhere — Pi to phone to GPU server.
A single C++ runtime that does face detection, alignment, recognition, liveness, person detection, tracking, and re-identification — exposed through one stable C ABI with first-class Python and Node bindings.
Built so you can:
pip install naina # Python
npm install @jvoltci/naina # Node / TypeScript…and ship the same model to a Raspberry Pi, a phone, an Apple Silicon laptop, or a CUDA server with no code change. Backends auto-select at runtime (ONNX Runtime · NCNN · OpenVINO · CoreML · TensorRT · ExecuTorch).
| OpenCV | InsightFace | face-api.js | naina | |
|---|---|---|---|---|
| Drop-in for Python + Node + C++ | partial | py only | js only | yes |
| Edge-first (Pi5, Jetson, phones) | yes | partial | partial | yes |
| SOTA face recognition models | no | yes | dated | yes |
| Permissive license | Apache-2.0 | non-comm | MIT | MIT/Apache-2.0 |
| Live in-browser demo | — | — | yes | yes |
| Single API across all targets | no | py only | js only | yes |
import naina
import cv2
engine = naina.Engine() # auto-selects best backend
img_a = cv2.imread("alice_1.jpg")[:, :, ::-1] # BGR -> RGB
img_b = cv2.imread("alice_2.jpg")[:, :, ::-1]
faces_a = engine.detect_faces(img_a)
faces_b = engine.detect_faces(img_b)
emb_a = engine.embed_face(img_a, faces_a[0])
emb_b = engine.embed_face(img_b, faces_b[0])
print("similarity:", naina.similarity(emb_a, emb_b)) # 0..1, higher = sameimport { Engine, similarity, loadImage } from '@jvoltci/naina';
const engine = new Engine();
const a = await loadImage('alice_1.jpg');
const b = await loadImage('alice_2.jpg');
const facesA = await engine.detectFaces(a);
const facesB = await engine.detectFaces(b);
const embA = await engine.embedFace(a, facesA[0]);
const embB = await engine.embedFace(b, facesB[0]);
console.log('similarity:', similarity(embA, embB));More examples: examples/ — face_verify in Python and Node, plus notes on threshold selection.
jvoltci.github.io/naina — open it in any modern browser and run live face recognition on your webcam, no install. Detects N faces simultaneously, lets you enrol any of them, then recognises them across frames. Same models as the native lib.
v1.0 — face stack
- Face detection (multi-scale, multi-face)
- Face alignment (5-point similarity transform)
- Face embedding & verification (512-d L2-normalized)
- Liveness / anti-spoofing
v1.1+ — person stack
- Person detection
- Multi-object tracking
- Person re-identification
Generated by benchmarks/runner.py — see
benchmarks/README.md to contribute numbers from
your own hardware.
| Target | Tier | Host | Detect p50 | Detect p95 | Embed p50 | Embed p95 |
|---|---|---|---|---|---|---|
| m3-pro | default | Darwin arm64 | 3.8 ms | 3.9 ms | 16.5 ms | 18.4 ms |
Accuracy on WIDERFACE / IJB-C / LFW requires dataset downloads that aren't yet automated. See benchmarks/README.md.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Python (pip) │ │ Node (npm) │ │ Web (CDN) │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
└──────── C ABI ┴─────────────────┘
│ (WASM build)
┌──────▼───────┐
│ naina-core │ C++20, no exceptions across ABI
│ modules │ face/person/track/reid/liveness
│ backends │ ONNXRT · NCNN · OpenVINO · CoreML · TRT
│ HAL/SIMD │ NEON · AVX2 · AVX-512
└──────────────┘
See docs/ARCHITECTURE.md for the full layered design, locked decisions, and deployment matrix.
Pre-alpha. Architecture spike committed; v1.0 face stack in progress. The web demo runs the same model artifacts the native lib will load — it's a preview, not the final implementation. See docs/ROADMAP.md for what ships when.
Issues and PRs welcome once v0.1 lands. Until then, file design feedback on the architecture doc.
Apache-2.0. Default model weights are permissive-licensed and ship
with the library. Research-tier weights are opt-in and may carry
non-commercial restrictions; see models/registry.yaml per-model.
