Privacy-safe document intelligence pipeline. Analyses technical documents (BRD, RFC, proposal, specification) with local AI — sensitive names are masked before any model sees the text, and restored in the final output.
- You list sensitive terms (company names, system names, product names) in a rules file
- docveil replaces them with neutral placeholders: `[SYSTEM_1]`, `[COMPANY_1]`, etc.
- A local AI model extracts structured knowledge and runs your analysis task
- Real names are restored in the final output
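
The mask-and-restore idea in the list above can be sketched in a few lines of Python. This is only an illustration, not docveil's actual code: the `RULES` dict, the example terms, and the function names `build_mapping`, `mask`, and `restore` are all assumptions, and the real rules file format is not shown in this README.

```python
import re

# Hypothetical rules: category -> sensitive terms. The real rules file
# format is not shown here; this dict is an assumption for illustration.
RULES = {
    "COMPANY": ["Acme Corp"],
    "SYSTEM": ["BillingHub", "Portal"],
}


def build_mapping(rules):
    """Assign each term a numbered placeholder such as [SYSTEM_1]."""
    mapping = {}
    for category, terms in rules.items():
        for index, term in enumerate(terms, start=1):
            mapping[term] = f"[{category}_{index}]"
    return mapping


def mask(text, mapping):
    """Replace every sensitive term with its placeholder."""
    if not mapping:
        return text
    # Longest terms first so a short term never clips a longer one.
    terms = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    return pattern.sub(lambda match: mapping[match.group(0)], text)


def restore(text, mapping):
    """Put the real names back in place of the placeholders."""
    for term, placeholder in mapping.items():
        text = text.replace(placeholder, term)
    return text


mapping = build_mapping(RULES)
original = "Acme Corp will migrate BillingHub to the new Portal."
masked = mask(original, mapping)
print(masked)
print(restore(masked, mapping))
```

Because the mapping is built once and reused for both directions, whatever the model writes about `[SYSTEM_1]` is translated back to the right name in the final output.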
No sensitive data ever leaves your machine. After the one-time model download, the pipeline runs entirely offline.
See deploy/README.md for full setup and usage instructions, including Ollama installation, model download, and a ready-to-run example.
```bash
# after setup (see deploy/README.md)
python scripts/run_pipeline.py \
    deploy/examples/customer_portal_brd.md \
    deploy/examples/nebulize_rules.yaml \
    deploy/examples/task_prompt.txt
```

- Python 3.10+
- Ollama with `qwen3:14b` and `qwen3:8b` (~25 GB disk)
- Linux, macOS, or Windows
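
The requirements above can be checked before the first run with a small preflight script. This is a hedged sketch rather than part of docveil: the `preflight` function is hypothetical, and it assumes the models will be stored on the same disk as the current directory, which may not match where Ollama keeps them. The disk check only makes sense before the models are downloaded.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)
MODEL_DISK_GB = 25  # approximate figure taken from the requirements list


def preflight(path="."):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python 3.10+ is required")
    # Only checks that an `ollama` executable is on PATH, not that models exist.
    if shutil.which("ollama") is None:
        problems.append("ollama executable not found on PATH")
    # Assumption: models land on the disk that holds `path`.
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < MODEL_DISK_GB:
        problems.append(
            f"only {free_gb:.0f} GB free; the models need about {MODEL_DISK_GB} GB"
        )
    return problems


for problem in preflight():
    print("WARNING:", problem)
```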
```
document  →  normalise  →  mask       →  extract    →  analyse    →  restore    →  output
             (step 01)     (step 02)     (step 05)     (step 06)     (step 08)
```
Eight steps, three optional. Single command via scripts/run_pipeline.py.
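
To make the ordering concrete, here is a minimal Python sketch of how the five stages named in the diagram could chain together. It is not the code in scripts/run_pipeline.py: every function body is a placeholder, the prompts are invented, and the `model` argument is a stand-in callable so the example runs without Ollama. The point it illustrates is that the model only ever receives masked text.

```python
from typing import Callable

# Hypothetical stage functions named after the diagram. Their bodies are
# placeholders for illustration, not docveil's implementation.


def normalise(text: str) -> str:
    # Assumed behaviour: collapse whitespace so term matching is predictable.
    return " ".join(text.split())


def mask(text: str, mapping: dict[str, str]) -> str:
    for term, placeholder in mapping.items():
        text = text.replace(term, placeholder)
    return text


def restore(text: str, mapping: dict[str, str]) -> str:
    for term, placeholder in mapping.items():
        text = text.replace(placeholder, term)
    return text


def run(document: str, mapping: dict[str, str], task: str,
        model: Callable[[str], str]) -> str:
    masked = mask(normalise(document), mapping)
    knowledge = model(f"Extract the key facts:\n{masked}")  # extract
    answer = model(f"{task}\n{knowledge}")                  # analyse
    return restore(answer, mapping)


seen_prompts: list[str] = []


def echo_model(prompt: str) -> str:
    # Stand-in for the local model: records the prompt, echoes its last line.
    seen_prompts.append(prompt)
    return prompt.splitlines()[-1]


mapping = {"Acme Corp": "[COMPANY_1]", "BillingHub": "[SYSTEM_1]"}
result = run("Acme   Corp owns\nBillingHub.", mapping,
             "Summarise the ownership.", echo_model)
print(result)
print(any("Acme Corp" in prompt for prompt in seen_prompts))
```

Passing the model in as a callable keeps the masking boundary easy to test: a recording stub like `echo_model` can confirm that no real name appears in any prompt.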
License: MIT