VOKNOV Font Match trains a small metric-learning model that embeds rendered font glyphs into a shared vector space. It can be used to compare font styles across glyphs, build a searchable font gallery, and match query text images against that gallery.
This project is maintained by VOKNOV, the technology brand of Hangzhou Xueyizhiyong Technology Co., Ltd. (杭州学以致用科技有限责任公司). Learn more at www.pictech.cc.
Many real design and localization workflows need to preserve typography even when the source and target languages do not share the same font files or glyph coverage.
For example, in image translation, a source image may contain Chinese text set in a Songti-style font. If the text is translated into German, the original Chinese font cannot render the German result, and a literal font replacement often changes the visual tone of the design. VOKNOV Font Match addresses this by learning a style embedding for rendered glyphs. A Chinese source font and candidate Latin fonts can be compared in the same vector space, making it possible to retrieve the German-capable font whose visual style is closest to the original Songti-like source.
This makes VOKNOV Font Match (VFM) useful for:
- cross-language font matching for image translation and design localization,
- choosing fallback fonts when the original font lacks target-language glyphs,
- building searchable font galleries from local or licensed font collections,
- comparing typography styles across scripts, font families, and rendered text samples.
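The cross-language use case above comes down to a nearest-neighbor lookup in the embedding space. The snippet below is a minimal sketch of that idea, not the project's implementation: the function names, the toy 3-dimensional vectors, and the candidate font names are all hypothetical, and cosine similarity is assumed as the comparison measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest_font(source_embedding, candidates):
    """Return (name, score) for the candidate most similar to the source.

    `candidates` maps a font name to its style embedding. The caller is
    expected to pass only fonts that can render the target language.
    """
    scored = (
        (name, cosine_similarity(source_embedding, emb))
        for name, emb in candidates.items()
    )
    return max(scored, key=lambda pair: pair[1])

# Toy embeddings (hypothetical); real ones would come from a trained model.
songti_like_source = [0.9, 0.1, 0.3]
latin_candidates = {
    "serif_candidate": [0.8, 0.2, 0.3],
    "sans_candidate": [0.1, 0.9, 0.2],
}

best_name, best_score = closest_font(songti_like_source, latin_candidates)
print(best_name, round(best_score, 3))
```

Restricting `candidates` to fonts that cover the target script before ranking keeps the coverage check separate from the style comparison.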
The current open-source path focuses on font_net:
- generate a font manifest,
- train a FontNet embedding model,
- build a vector gallery,
- run gallery-based inference on query images.
data/
  fonts/demo/                  Demo open-source font families for smoke tests
  gb2312.txt                   Character pool for Chinese glyph sampling
docs/assets/
  demo_inference_result.jpg    Example gallery-matching visualization
src/font_net/
  generate_font_manifest.py    Scan fonts and write a manifest
  train.py                     Train FontNet embeddings
  build_font_gallery.py        Build a searchable gallery
  inference_from_gallery.py    Match query images against a gallery
  dataset.py                   Dynamic glyph triplet dataset
  model.py                     FontNet model
  loss.py                      Triplet loss
scripts/
  smoke_test.sh                End-to-end local smoke test
Python 3.9+ is recommended.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Install the PyTorch build that matches your platform if the default pip install does not select the right CPU, CUDA, or macOS build.
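To see which PyTorch build pip actually selected, a short check like the one below can help. It is a sketch that is not part of this repository; `describe_torch_build` is a hypothetical helper name, and the MPS lookup is guarded because older PyTorch releases do not expose that backend.

```python
import importlib.util

def describe_torch_build():
    """Summarize the installed PyTorch build, or report that it is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    cuda = torch.cuda.is_available()
    # torch.backends.mps is absent on older releases, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return f"torch {torch.__version__} | cuda={cuda} | mps={mps}"

print(describe_torch_build())
```

If the summary does not match your hardware, reinstall PyTorch using the selector on the official PyTorch site.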
Run the full smoke test:
bash scripts/smoke_test.sh
The script uses a small subset of the bundled demo fonts, trains a tiny model, builds a tiny gallery, renders query images, and runs inference. Outputs are written to:
/tmp/fontnet_smoke
This is a functionality test, not a quality benchmark.
The example in docs/assets/demo_inference_result.jpg shows gallery-based font matching. Each row starts with an input glyph image and then displays the top retrieved candidates from the gallery, including similarity scores. In a cross-language workflow, the same mechanism can retrieve a target-language font whose rendered style is closest to the source font.
python src/font_net/generate_font_manifest.py \
--font-dir data/fonts/demo \
--output data/font_manifest.txt
Manifest rows use:
name|supports_chinese|path
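The row format above is easy to read back for inspection or filtering. The parser below is a minimal sketch: the font names in the sample are hypothetical, and the way `supports_chinese` is encoded (assumed here to be `1`, `true`, or `yes` for true) should be checked against a generated manifest.

```python
from typing import List, NamedTuple

class ManifestRow(NamedTuple):
    name: str
    supports_chinese: bool
    path: str

# Assumed truthy encodings for the supports_chinese column.
TRUTHY = {"1", "true", "yes"}

def parse_manifest_line(line: str) -> ManifestRow:
    # Split on the first two separators only, so a "|" inside the path survives.
    name, flag, path = line.rstrip("\n").split("|", 2)
    return ManifestRow(name.strip(), flag.strip().lower() in TRUTHY, path.strip())

def parse_manifest(text: str) -> List[ManifestRow]:
    # Blank lines are skipped; every other line must have three fields.
    return [parse_manifest_line(line) for line in text.splitlines() if line.strip()]

# Hypothetical sample rows in the documented name|supports_chinese|path layout.
sample = (
    "DemoSerif|1|data/fonts/demo/DemoSerif.ttf\n"
    "DemoSans|0|data/fonts/demo/DemoSans.ttf\n"
)
rows = parse_manifest(sample)
chinese_capable = [row.name for row in rows if row.supports_chinese]
print(chinese_capable)
```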
python src/font_net/train.py \
--manifest data/font_manifest.txt \
--checkpoint-dir checkpoints \
--log-dir logs/font_net \
--arch mobilenet_v3_small \
--epochs 1 \
--batch-size 4 \
--num-workers 0 \
--train-samples 32 \
--val-samples 8 \
--single-process
For real training, increase --epochs, --train-samples, --val-samples, and --batch-size. On Linux with multiple CUDA GPUs, omit --single-process to enable DDP automatically.
By default, torchvision pretrained weights are not downloaded. Add --pretrained if you want ImageNet pretrained weights and your environment can download or cache them.
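FontNet is trained on glyph triplets with a triplet loss. For readers new to metric learning, the snippet below sketches the standard triplet margin formulation in plain Python. It is not the code in loss.py: the Euclidean distance, the margin of 0.2, and the assumption that the anchor and positive come from the same font while the negative comes from a different one are illustrative choices.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy 2-dimensional embeddings (hypothetical values).
anchor = [0.0, 0.0]       # glyph rendered in font A
same_font = [0.1, 0.0]    # another glyph in font A, already close to the anchor
other_font = [1.0, 0.0]   # a glyph in font B, already far from the anchor

# Well-separated triplet: the loss is zero, so it contributes no gradient.
easy = triplet_margin_loss(anchor, same_font, other_font)
# Swapped roles: the "positive" is farther than the "negative", so the loss is positive.
hard = triplet_margin_loss(anchor, other_font, same_font)
print(easy, hard)
```

The loss only penalizes triplets where the different-font glyph is not at least `margin` farther from the anchor than the same-font glyph, which is what pushes same-font glyphs together in the embedding space.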
python src/font_net/build_font_gallery.py \
--font-dir data/fonts/demo \
--model-path checkpoints/best_model.pth \
--output data/font_gallery.pth \
--arch mobilenet_v3_small
For fast tests, use --max-chars 5. For better gallery quality, omit --max-chars so each font is represented by more glyph samples.
data/fonts/demo is scanned recursively. For a quick end-to-end check, prefer bash scripts/smoke_test.sh; it creates a temporary four-font subset so the test stays fast.
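One way to see why more glyph samples per font can help: if a font is summarized by the average of its glyph embeddings, a larger sample makes that average depend less on the shape of any single character. The sketch below illustrates mean pooling followed by L2 normalization. Whether build_font_gallery.py stores averaged vectors or per-glyph vectors is not stated here, so treat this as an assumption; the function and font names are hypothetical.

```python
import math
from collections import defaultdict

def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def build_prototypes(glyph_embeddings):
    """Average the embeddings of each font and normalize the result.

    `glyph_embeddings` is an iterable of (font_name, embedding) pairs,
    one pair per rendered glyph.
    """
    grouped = defaultdict(list)
    for font_name, embedding in glyph_embeddings:
        grouped[font_name].append(embedding)

    prototypes = {}
    for font_name, embeddings in grouped.items():
        mean = [sum(column) / len(embeddings) for column in zip(*embeddings)]
        prototypes[font_name] = l2_normalize(mean)
    return prototypes

# Toy input (hypothetical): two glyphs from font_a, one glyph from font_b.
samples = [
    ("font_a", [1.0, 0.0]),
    ("font_a", [0.0, 1.0]),
    ("font_b", [0.0, 2.0]),
]
gallery = build_prototypes(samples)
print(gallery)
```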
python src/font_net/inference_from_gallery.py \
--model-path checkpoints/best_model.pth \
--gallery-path data/font_gallery.pth \
--image-dir path/to/query_images \
--output inference_result.jpg \
--arch mobilenet_v3_small \
--top-k 3
The output image visualizes each query image, its processed patches, and the top gallery matches.
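Conceptually, the top-k step ranks gallery fonts by similarity to the query. The sketch below shows one plausible way to combine several patches from a single query image: average each font's similarity over the patches, then keep the k best. This is an illustration rather than the logic of inference_from_gallery.py; it assumes unit-length embeddings (so the dot product equals cosine similarity), and every name and value in it is hypothetical.

```python
import heapq

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k_matches(patch_embeddings, gallery, k=3):
    """Average each font's similarity over all patches and return the k best.

    Returns a list of (font_name, mean_similarity), best match first.
    """
    scores = {
        name: sum(dot(patch, emb) for patch in patch_embeddings) / len(patch_embeddings)
        for name, emb in gallery.items()
    }
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Toy unit-length embeddings (hypothetical values).
toy_gallery = {
    "font_a": [1.0, 0.0],
    "font_b": [0.0, 1.0],
    "font_c": [0.6, 0.8],
    "font_d": [-1.0, 0.0],
}
query_patches = [[1.0, 0.0], [0.8, 0.6]]

matches = top_k_matches(query_patches, toy_gallery, k=3)
for name, score in matches:
    print(f"{name}: {score:.2f}")
```

Averaging over patches is only one aggregation choice; majority voting over per-patch winners is another, and the better option depends on how noisy individual patches are.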
This repository includes only a tiny demo font set. It does not include production training fonts, system font directories, commercial font packs, poster datasets, or checkpoints.
See DATA_LICENSE.md before redistributing fonts or training with third-party assets.
- Technology brand: VOKNOV
- Company: Hangzhou Xueyizhiyong Technology Co., Ltd. (杭州学以致用科技有限责任公司)
- Website: www.pictech.cc
Code is released under the MIT License. See LICENSE.
