| Author | Byron Williams |
| Created | 2026-05-05 |
| Repository | ByronWilliamsCPA/Unify |
Foundry Unify is the foundation library for an OCR orchestration and layout
analysis service that will sit in front of the Foundry RAG pipeline. The OCR
orchestration logic is still on the roadmap; what currently ships in
src/foundry_unify/ is the production scaffolding the orchestrator will be
built on top of:
- FastAPI security middleware (`foundry_unify.middleware.security`): OWASP-aligned security headers, in-memory rate limiting with burst control, CORS, trusted-host checks, and an SSRF prevention middleware that blocks private IPs, cloud metadata endpoints, and dangerous URL schemes.
- Request correlation middleware (`foundry_unify.middleware.correlation`): propagates `X-Correlation-ID` / `X-Request-ID` / `X-Trace-ID` / `X-Span-ID` via `contextvars`, with a structlog processor that adds the IDs to every log record.
- Structured logging (`foundry_unify.utils.logging`): structlog setup with rich console output for development and JSON output for production, plus a `log_performance` helper.
- Pydantic Settings (`foundry_unify.core.config`): environment-driven configuration loaded from `FOUNDRY_UNIFY_*` variables.
- Centralised exception hierarchy (`foundry_unify.core.exceptions`): typed errors (`ValidationError`, `AuthenticationError`, `APIError`, etc.) with `to_dict()` for safe JSON responses.
- Kubernetes health endpoints (`foundry_unify.api.health`): `/health/live`, `/health/ready`, and `/health/startup` on a FastAPI router ready to mount.
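To make the SSRF checks in the list above concrete, here is a minimal standard-library sketch of the kind of URL screening described (private IPs, cloud metadata endpoints, dangerous schemes). It is not the code in `foundry_unify.middleware.security`; the function name `is_url_allowed` and both constant sets are assumptions made for illustration.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: only these schemes are considered safe; the real middleware's list may differ.
ALLOWED_SCHEMES = {"http", "https"}
# Assumption: one well-known metadata hostname. The metadata IP 169.254.169.254
# is already rejected below because it is a link-local address.
METADATA_HOSTS = {"metadata.google.internal"}


def is_url_allowed(url: str) -> bool:
    """Return False for URLs that a basic SSRF filter would reject."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    host = (parts.hostname or "").lower()
    if not host or host == "localhost" or host in METADATA_HOSTS:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A fuller check would resolve DNS and re-validate
        # every returned address; this sketch stops here.
        return True
    return not (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_reserved
        or address.is_unspecified
    )
```

A check like this only inspects the URL text, so hostnames that resolve to internal addresses would still need a resolution step before the request is made.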
- High Quality: 80%+ test coverage enforced via CI
- Type Safe: Full type hints with BasedPyright strict mode
- Well Documented: Clear docstrings and comprehensive guides
- Developer Friendly: Pre-commit hooks, automated formatting, linting
- Security First: Dependency scanning, security analysis, SBOM generation
- ML Ready: Optional ML dependencies with PyTorch support
- Python 3.10+ (tested with 3.12)
- UV for dependency management
Install UV:
# macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# Or with pip/pipx
pip install uv
# or
pipx install uv
Foundry Unify is not yet published to PyPI. Install from source:
# Using uv (recommended)
uv add git+https://github.com/ByronWilliamsCPA/Unify.git
# or, for the FastAPI middleware stack:
uv add "foundry-unify[api] @ git+https://github.com/ByronWilliamsCPA/Unify.git"
# Using pip
pip install git+https://github.com/ByronWilliamsCPA/Unify.git
pip install "foundry-unify[api] @ git+https://github.com/ByronWilliamsCPA/Unify.git"
For local development, clone the repository and install all extras:
git clone https://github.com/ByronWilliamsCPA/Unify.git
cd Unify
# Install dependencies (includes dev tools - REQUIRED for development)
uv sync --all-extras
# Setup pre-commit hooks (required)
uv run pre-commit install
Wire the middleware, logging, and health endpoints into a FastAPI app (requires the [api] extra):
from fastapi import FastAPI
from foundry_unify.api.health import router as health_router
from foundry_unify.core.config import settings
from foundry_unify.middleware import (
    CorrelationMiddleware,
    add_security_middleware,
)
from foundry_unify.utils.logging import get_logger, setup_logging
# Configure structured logging (JSON in production, rich console in dev).
setup_logging(
    level=settings.log_level,
    json_logs=settings.json_logs,
    include_timestamp=settings.include_timestamp,
    include_correlation=True,
)
logger = get_logger(__name__)
app = FastAPI(title="foundry-unify")
# Security headers, CORS, rate limiting, SSRF prevention.
add_security_middleware(
    app,
    enable_https_redirect=False,  # Set True behind TLS-terminating proxies.
    enable_rate_limiting=True,
    enable_ssrf_prevention=True,
    allowed_origins=["https://example.com"],
    allowed_hosts=["api.example.com"],
    rate_limit_rpm=100,
)
# Starlette runs the most recently added middleware first, so add correlation
# last: it then wraps the security middleware, which can log with IDs.
app.add_middleware(CorrelationMiddleware)
# Kubernetes probes at /health/live, /health/ready, /health/startup.
app.include_router(health_router)
@app.get("/")
async def root() -> dict[str, str]:
    logger.info("root_called")
    return {"status": "ok"}
Raise typed exceptions from the centralised hierarchy:
from foundry_unify.core.exceptions import ValidationError
raise ValidationError(
    "Invalid email format",
    field="email",
    value="not-an-email",
)
# ValidationError.to_dict() -> JSON-serialisable error payload.
Settings load from environment variables with the FOUNDRY_UNIFY_ prefix
(see src/foundry_unify/core/config.py). All variables are optional.
| Variable | Type | Default | Description |
|---|---|---|---|
| `FOUNDRY_UNIFY_LOG_LEVEL` | `DEBUG`/`INFO`/`WARNING`/`ERROR`/`CRITICAL` | `INFO` | Application log level. |
| `FOUNDRY_UNIFY_JSON_LOGS` | bool | `false` | Emit JSON logs (production) instead of rich console output. |
| `FOUNDRY_UNIFY_INCLUDE_TIMESTAMP` | bool | `true` | Include ISO-8601 timestamps in log records. |
Settings is a pydantic_settings.BaseSettings subclass and also reads from
a .env file when one is present. Extra unrecognised variables are ignored
(extra="ignore").
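The prefix-and-defaults behaviour described above can be illustrated with a small standard-library sketch. It is only a stand-in for what the pydantic-settings class does and is not taken from `foundry_unify.core.config`; `SketchSettings`, `load_settings`, and the accepted truthy spellings in `TRUE_VALUES` are assumptions.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from dataclasses import dataclass

PREFIX = "FOUNDRY_UNIFY_"
# Assumption: spellings treated as true; the real parser may accept others.
TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in for the three documented settings."""

    log_level: str = "INFO"
    json_logs: bool = False
    include_timestamp: bool = True


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Read a boolean variable, falling back to the default when unset."""
    raw = env.get(PREFIX + name)
    return default if raw is None else raw.strip().lower() in TRUE_VALUES


def load_settings(environ: Mapping[str, str] | None = None) -> SketchSettings:
    """Build settings from prefixed variables; unrelated variables are ignored."""
    env = os.environ if environ is None else environ
    return SketchSettings(
        log_level=env.get(PREFIX + "LOG_LEVEL", "INFO"),
        json_logs=_flag(env, "JSON_LOGS", False),
        include_timestamp=_flag(env, "INCLUDE_TIMESTAMP", True),
    )
```

Passing a plain dictionary instead of reading `os.environ` keeps the sketch easy to exercise in a test without touching the process environment.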
This project implements enterprise-grade supply chain security with a multi-tier package index strategy and centralized secrets management.
┌─────────────────────────────────────────────────────────────────┐
│ Package Index Priority │
├─────────────────────────────────────────────────────────────────┤
│ 1. Google Assured OSS (SLSA Level 3) - Third-party packages │
│ 2. Internal Artifact Registry - Organization packages │
│ 3. PyPI (fallback) - Packages not in tier 1 or 2 │
└─────────────────────────────────────────────────────────────────┘
# Run the setup script
./scripts/setup-supply-chain.sh
# Or manually configure
gcloud auth login
gcloud auth application-default login
pip install keyrings.google-artifactregistry-auth
| Index | SLSA Level | Purpose | Default |
|---|---|---|---|
| PyPI | - | Standard packages | Yes (default) |
| Google Assured OSS | 3 | Verified third-party packages | Opt-in |
| Internal Registry | 2+ | Organization-maintained packages | Opt-in |
How It Works:
By default, all packages resolve from PyPI. After configuring GCP authentication, you can opt individual packages in to Assured OSS by uncommenting their entries in pyproject.toml:
[tool.uv.sources]
numpy = { index = "assured-oss" }
pandas = { index = "assured-oss" }
requests = { index = "assured-oss" }
Why This Matters:
- SLSA Level 3: Build integrity, provenance, and tamper-proof artifacts
- Supply Chain Protection: Reduced risk of dependency confusion attacks
- Compliance: Meets enterprise security and audit requirements
- Graceful Fallback: Works without authentication, opt-in when ready
Secrets are managed via Infisical instead of environment variables or GitHub Secrets.
Local Development:
# Login to Infisical
infisical login
# Initialize project connection
infisical init
# Run commands with secrets injected
infisical run --env=dev -- uv run python main.py
# Or export secrets to local file
infisical export --env=dev > .env.local
CI/CD Integration:
- GitHub Actions use Infisical's Machine Identity authentication
- Secrets are injected at runtime, never stored in repositories
- Environment mapping: `main` → `prod`, `develop` → `staging`, `*` → `dev`
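The branch-to-environment mapping above can be read as an ordered list of patterns where the first match wins. The sketch below shows that reading in Python; the names `BRANCH_ENVIRONMENTS` and `environment_for_branch` are hypothetical, and the actual CI workflow may express the mapping differently.

```python
from fnmatch import fnmatchcase

# Ordered rules: the first matching pattern wins, so the "*" catch-all comes last.
BRANCH_ENVIRONMENTS = [
    ("main", "prod"),
    ("develop", "staging"),
    ("*", "dev"),
]


def environment_for_branch(branch: str) -> str:
    """Return the Infisical environment name for a Git branch."""
    for pattern, environment in BRANCH_ENVIRONMENTS:
        if fnmatchcase(branch, pattern):
            return environment
    # Unreachable while the "*" rule is present; kept for safety if rules change.
    msg = f"no environment rule matches branch {branch!r}"
    raise LookupError(msg)
```

Keeping the catch-all as the final rule means any feature or fix branch resolves to `dev` without needing its own entry.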
Software Bill of Materials (SBOM) is generated on every release:
# Generate SBOM locally
uv run cyclonedx-py environment -o sbom.json
# Verify package attestation
pip-audit --require-hashes
Automated via CI:
- CycloneDX SBOM generated in JSON and XML formats
- Attestation attached to GitHub releases
- Vulnerability scanning with OSV database
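Once an SBOM such as `sbom.json` exists, it can be inspected with a few lines of Python. The sketch below assumes the CycloneDX JSON layout with a top-level `bomFormat` field and a `components` array; the helper name `list_components` and the sample document (including its version numbers) are made up for illustration.

```python
import json


def list_components(sbom_text: str) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs from a CycloneDX JSON document."""
    document = json.loads(sbom_text)
    if document.get("bomFormat") != "CycloneDX":
        msg = "not a CycloneDX document"
        raise ValueError(msg)
    components = document.get("components", [])
    return sorted((item["name"], item.get("version", "")) for item in components)


# Hypothetical sample; a real SBOM carries many more fields per component.
SAMPLE_SBOM = json.dumps(
    {
        "bomFormat": "CycloneDX",
        "components": [
            {"type": "library", "name": "structlog", "version": "1.2.3"},
            {"type": "library", "name": "fastapi", "version": "4.5.6"},
        ],
    }
)
```

For a file on disk, the same function can be called with `Path("sbom.json").read_text()`.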
1. Run the setup script (recommended):
   ./scripts/setup-supply-chain.sh
2. Or configure manually:
   Google Cloud Authentication:
   gcloud auth login
   gcloud auth application-default login
   pip install keyrings.google-artifactregistry-auth
   Infisical Setup:
   # Install Infisical CLI
   # macOS
   brew install infisical/get-cli/infisical
   # Linux
   curl -1sLf 'https://dl.cloudsmith.io/public/infisical/infisical-cli/setup.deb.sh' | sudo -E bash
   sudo apt-get install infisical
   # Connect to project
   infisical login
   infisical init
3. Configure CI/CD secrets in Infisical:
   - `GCP_SA_KEY_BASE64`: Base64-encoded GCP service account key
   - `CODECOV_TOKEN`: Codecov upload token (if using Codecov)
   - `SONAR_TOKEN`: SonarCloud token (if using SonarCloud)
| Role | Purpose |
|---|---|
| `roles/artifactregistry.reader` | Read from Assured OSS and internal registry |
| `roles/artifactregistry.writer` | Publish to internal registry (CI only) |
Q: Packages not found in Assured OSS?
- UV automatically falls back to PyPI - no action needed
- Check available packages: Assured OSS Supported Packages
Q: Authentication errors with Artifact Registry?
- Run `gcloud auth application-default login` to refresh credentials
- Verify the service account has the `Artifact Registry Reader` role
- Check the keyring is installed: `pip install keyrings.google-artifactregistry-auth`
Q: Infisical connection issues?
- Verify `.infisical.json` has the correct `workspaceId`
- Check your Infisical organization permissions
- For CI: Ensure `INFISICAL_CLIENT_ID` and `INFISICAL_CLIENT_SECRET` are set
Q: How to verify supply chain setup?
# Test package index access
./scripts/setup-supply-chain.sh # Re-run to verify all checks pass
# Install all dependencies including dev tools
uv sync --all-extras
# Setup pre-commit hooks
uv run pre-commit install
# Install Qlty CLI for unified code quality checks
curl https://qlty.sh | bash
# Run tests
uv run pytest -v
# Run with coverage
uv run pytest --cov=foundry_unify --cov-report=html
# Run all quality checks (using Qlty)
qlty check
# Or use pre-commit
uv run pre-commit run --all-files
All code must meet these requirements:
- Formatting: Ruff (88 char limit)
- Linting: Ruff with PyStrict-aligned rules (see below)
- Type Checking: BasedPyright strict mode
- Testing: Pytest with 80%+ coverage
- Security: Bandit + dependency scanning
- Documentation: Docstrings on all public APIs
Unified Quality Tool: This project uses Qlty to consolidate all quality checks into a single fast tool. See .qlty/qlty.toml for configuration.
This project uses PyStrict-aligned Ruff rules for stricter code quality enforcement beyond standard Python linting:
| Rule | Category | Purpose |
|---|---|---|
| BLE | Blind except | Prevent bare except: clauses |
| EM | Error messages | Enforce descriptive error messages |
| SLF | Private access | Prevent access to private members |
| INP | Implicit packages | Require explicit __init__.py |
| ISC | Implicit concatenation | Prevent implicit string concatenation |
| PGH | Pygrep hooks | Advanced pattern-based checks |
| RSE | Raise statement | Proper exception raising |
| TID | Tidy imports | Clean import organization |
| YTT | sys.version | Safe version checking |
| FA | Future annotations | Modern annotation syntax |
| T10 | Debugger | No debugger statements in production |
| G | Logging format | Safe logging string formatting |
These rules catch bugs that standard linting misses and enforce production-quality code patterns.
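As an illustration of what three of these rule families ask for (BLE, EM, and G), here is a short function written to satisfy them. The function `parse_port` is a hypothetical example, not code from this repository.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port(raw: str) -> int:
    """Parse a TCP port number from text, raising ValueError when invalid."""
    try:
        port = int(raw)
    except ValueError:
        # BLE: catch the specific exception rather than a blanket `except Exception`.
        # G: pass values as logging arguments instead of pre-formatting the string.
        logger.warning("invalid port value: %r", raw)
        # EM: bind the message to a variable before raising.
        msg = f"port must be an integer, got {raw!r}"
        raise ValueError(msg) from None
    if not 0 < port < 65536:
        msg = f"port out of range: {port}"
        raise ValueError(msg)
    return port
```

Binding the message to `msg` first keeps tracebacks free of the duplicated string literal that an inline f-string in the `raise` statement would produce.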
This project includes standardized Claude Code configuration via git subtree:
Directory Structure:
.claude/
├── claude.md # Project-specific Claude guidelines
└── standard/ # Standard Claude configuration (git subtree)
├── CLAUDE.md # Universal development standards
├── commands/ # Custom slash commands
├── skills/ # Reusable skills
└── agents/ # Specialized agents
Updating Standards:
# Pull latest standards from upstream
./scripts/update-claude-standards.sh
# Or manually
git subtree pull --prefix .claude/standard \
    https://github.com/williaby/.claude.git main --squash
What's Included:
- Universal development best practices
- Response-Aware Development (RAD) system for assumption tagging
- Agent assignment patterns and workflow
- Security requirements and pre-commit standards
- Git workflow and commit conventions
Project-Specific Overrides: Edit .claude/claude.md for project-specific guidelines. See .claude/README.md for details.
# Run all tests
uv run pytest -v
# Run specific test file
uv run pytest tests/unit/test_module.py -v
# Run with coverage report
uv run pytest --cov=foundry_unify --cov-report=term-missing
# Run tests in parallel
uv run pytest -n auto
Recommended: Use Qlty CLI for unified code quality checks.
# Run all quality checks (fast!)
qlty check
# Run checks on only changed files (fastest)
qlty check --filter=diff
# Run specific plugins only
qlty check --plugin ruff --plugin pyright
# Auto-format code
qlty fmt
# View current configuration
qlty config show
Qlty runs all these tools in a single pass:
Python Quality:
- Ruff (linting + formatting)
- BasedPyright (type checking)
- Bandit (security scanning)
Security & Secrets:
- Gitleaks (secrets detection)
- TruffleHog (entropy-based secrets detection)
- OSV Scanner (dependency vulnerabilities)
- Semgrep (advanced SAST)
File & Configuration:
- Markdownlint (markdown linting)
- Yamllint (YAML linting)
- Prettier (JSON, YAML, Markdown formatting)
- Actionlint (GitHub Actions workflows)
- Shellcheck (shell script linting)
Container & Infrastructure (if Docker enabled):
- Hadolint (Dockerfile linting)
- Trivy (container security scanning)
- Checkov (infrastructure as code security)
Code Quality Metrics:
- Complexity analysis (cyclomatic, cognitive)
- Code smells detection
- Maintainability scoring
# Format code
uv run ruff format src tests
# Lint and auto-fix
uv run ruff check --fix src tests
# Type checking
uv run basedpyright src
# Security scanning
uv run bandit -r src
# Dependency vulnerabilities
qlty check --plugin osv_scanner
foundry_unify/
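├── src/foundry_unify/ # Main package
│ ├── __init__.py
│ ├── api/ # Health endpoints (health.py)
│ ├── core/ # Settings (config.py) and exceptions (exceptions.py)
│ ├── middleware/ # Security and correlation middleware
│ └── utils/ # Utility modules (logging.py)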
├── tests/ # Test suite
│ ├── unit/ # Unit tests
│ └── integration/ # Integration tests
├── docs/ # Documentation
│ ├── ADRs/ # Architecture Decision Records
│ ├── planning/ # Project planning docs
│ └── guides/ # User guides
├── pyproject.toml # Dependencies & tool config
├── README.md # This file
├── CONTRIBUTING.md # Contribution guidelines
└── LICENSE # License
- CONTRIBUTING.md: How to contribute to the project
- docs/ADRs/README.md: Architecture Decision Records documentation
- docs/planning/project-plan-template.md: Project planning guide
- Use Markdown for all documentation
- Include code examples for clarity
- Update README.md when adding major features
- Maintain architecture documentation (see docs/ADRs/)
All new functionality must include tests:
- Unit tests: Test individual functions/classes
- Integration tests: Test component interactions
- Coverage: Maintain 80%+ coverage
- Markers: Use pytest markers (`@pytest.mark.unit`, `@pytest.mark.integration`)
# Run all tests
uv run pytest -v
# Run only unit tests
uv run pytest -v -m unit
# Run only integration tests
uv run pytest -v -m integration
# Run with coverage requirements
uv run pytest --cov=foundry_unify --cov-fail-under=80
- Validate all inputs
- Use secure defaults
- Scan dependencies regularly
- Report vulnerabilities responsibly
Please report security vulnerabilities to byronawilliams@gmail.com rather than using the public issue tracker.
See the ByronWilliamsCPA Security Policy for complete disclosure policy and response timelines.
Contributions are welcome! Please see CONTRIBUTING.md for:
- Development setup
- Code quality standards
- Testing requirements
- Git workflow and commit conventions
- Pull request process
- Code follows style guide (Ruff format + lint)
- All tests pass with 80%+ coverage
- BasedPyright type checking passes
- Docstrings added for new public APIs
- CHANGELOG.md updated (if significant change)
- Commits follow conventional commit format
This project uses Semantic Versioning:
- MAJOR version: Incompatible API changes
- MINOR version: Backwards-compatible functionality additions
- PATCH version: Backwards-compatible bug fixes
Current version: 0.1.0
This project uses python-semantic-release for automated versioning based on Conventional Commits.
How it works:
1. Commit messages determine version bumps:
   - `fix:` commits trigger a PATCH release (1.0.0 → 1.0.1)
   - `feat:` commits trigger a MINOR release (1.0.0 → 1.1.0)
   - `BREAKING CHANGE:` in the commit body or `!` after the type triggers a MAJOR release (1.0.0 → 2.0.0)
2. On merge to main:
   - Analyzes commits since last release
   - Determines appropriate version bump
   - Updates version in `pyproject.toml`
   - Generates/updates `CHANGELOG.md`
   - Creates Git tag and GitHub Release
   - Publishes to PyPI (if configured)
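The bump rules above can be sketched as a small function that scans commit messages and picks the largest applicable bump. This is a simplified illustration, not python-semantic-release's actual parser; `bump_for` and `next_version` are hypothetical names, and only the `fix`, `feat`, and breaking-change cases from the list are handled.

```python
from __future__ import annotations

import re

# Conventional Commit header: type, optional (scope), optional "!", then ": ".
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: ")


def bump_for(message: str) -> str | None:
    """Classify one commit message as 'major', 'minor', 'patch', or None."""
    match = HEADER.match(message)
    if match is None:
        return None
    if match["bang"] or "BREAKING CHANGE:" in message:
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match["type"])


def next_version(current: str, messages: list[str]) -> str:
    """Apply the largest bump implied by the messages to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    bumps = {bump_for(message) for message in messages}
    if "major" in bumps:
        return f"{major + 1}.0.0"
    if "minor" in bumps:
        return f"{major}.{minor + 1}.0"
    if "patch" in bumps:
        return f"{major}.{minor}.{patch + 1}"
    return current
```

When several commits land together, only the highest-ranked bump applies, so a `fix:` and a `feat:` merged in one batch produce a single MINOR release.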
Commit message examples:
# Patch release (bug fix)
git commit -m "fix: resolve null pointer in data parser"
# Minor release (new feature)
git commit -m "feat: add CSV export functionality"
# Major release (breaking change)
git commit -m "feat!: redesign API for better ergonomics

BREAKING CHANGE: API has been redesigned for improved usability.
See migration guide in docs/migration/v2.0.0.md"
Configuration: See [tool.semantic_release] in pyproject.toml for settings.
This project was generated from a cookiecutter template and is managed with cruft.
To sync with the latest template changes:
# Preview changes first
cruft diff
# Apply updates (recommended: use the wrapper script)
./scripts/cruft-update.sh
# Or use cruft directly (requires manual cleanup)
cruft update
python scripts/cleanup_conditional_files.py
Cruft only syncs file contents - it does NOT re-run post-generation hooks that clean up conditional files.
When you change feature flags in .cruft.json (e.g., disabling include_api_framework), the corresponding files are NOT automatically removed. You must run the cleanup script:
# Check for orphaned files
python scripts/check_orphaned_files.py
# Remove orphaned files
python scripts/cleanup_conditional_files.py
# Or preview what would be removed
python scripts/cleanup_conditional_files.py --dry-run
Files that may need cleanup when features are disabled:
| Feature | Files to Remove |
|---|---|
| `include_api_framework: no` | `src/*/api/`, `src/*/middleware/` |
| `include_sentry: no` | `src/*/core/sentry.py` |
| `include_background_jobs: no` | `src/*/jobs/` |
| `include_caching: no` | `src/*/core/cache.py` |
| `include_docker: no` | `Dockerfile`, `docker-compose*.yml` |
| `use_mkdocs: no` | `mkdocs.yml`, `docs/` |
The CI pipeline includes automated checks for orphaned files to prevent this issue.
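An orphaned-file check of this kind can be sketched as a lookup from disabled feature flags to glob patterns. The code below is an illustration built from the feature table in this section, not the contents of `scripts/check_orphaned_files.py`; `CONDITIONAL_FILES` and `find_orphans` are hypothetical names, and the flags are assumed to arrive as a plain dictionary of `"yes"`/`"no"` strings.

```python
from pathlib import Path

# Feature flag -> glob patterns, copied from the feature table in this section.
CONDITIONAL_FILES = {
    "include_api_framework": ["src/*/api", "src/*/middleware"],
    "include_sentry": ["src/*/core/sentry.py"],
    "include_background_jobs": ["src/*/jobs"],
    "include_caching": ["src/*/core/cache.py"],
    "include_docker": ["Dockerfile", "docker-compose*.yml"],
    "use_mkdocs": ["mkdocs.yml", "docs"],
}


def find_orphans(root: Path, flags: dict[str, str]) -> list[Path]:
    """Return paths that still exist although their feature flag is set to 'no'."""
    orphans: list[Path] = []
    for flag, patterns in CONDITIONAL_FILES.items():
        if flags.get(flag) != "no":
            continue  # Feature enabled or unspecified: its files are expected.
        for pattern in patterns:
            orphans.extend(root.glob(pattern))
    return sorted(orphans)
```

Returning the paths instead of deleting them mirrors a dry-run mode: the caller can print the list in CI or pass it on to a removal step.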
MIT License - see LICENSE for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: byronawilliams@gmail.com
Thank you to all contributors and the open-source community!
Made by Byron Williams