halvrenofviryel/README.md

Ali Toygar Abak — Founder of Phionyx Research

ORCID · PyPI: phionyx-core · Zenodo DOI · Substack · X: @phionyx_ai

I build deterministic governance infrastructure for AI systems.

Phionyx treats large language model outputs as noisy measurements rather than final answers. The goal is to place a verifiable governance runtime between AI systems and real-world action: safety gates, ethics gates, telemetry, evaluation standards, state evolution, and audit-first control.

Currently shipping: Phionyx Core v0.7.2 is live on PyPI (pip install phionyx-core) alongside 5 open-source companion packages that wire the runtime into MCP hosts, Inspect AI, LangChain / LangGraph, and the OpenAI Agents SDK. Phionyx Evaluation Standard v0.2.0 (released 2026-05-24) ships the Evidence-Oriented Runtime Telemetry Profile — a vendor-neutral JSON schema for governance evidence rows. See phionyx.ai for the runtime narrative and where to start.

Three distinct things, three version lines

Phionyx ships three things that must not be cross-attributed — each has its own version line:

  • Engine — phionyx-core (the SDK, v0.7.2 on PyPI): the deterministic engine — 46-block canonical pipeline (contract v3.8.0), state vector, kill switch, HITL, ethics/safety gates, signed audit chain. It is the reference implementation that scores L3 + D3 on the Evaluation Standard. It is not claim-governance-rated.
  • Gate — phionyx-pipeline-mcp (stable v0.2.0, alpha v0.3.0a1): an MCP server that verifies an agent's own "I fixed / I tested / this changed" claims against git-diff truth. This is the component the Claim-Governance ladder (CG-L0…CG-L5) rates — stable v0.2.0 = CG-L2; alpha v0.3.0a1 = CG-L3 (opt-in / default-off, already on PyPI), with the stable channel remaining CG-L2. The gate is Layer 3 of the 5-layer governance stack. phionyx-mcp-server (v0.1.0) is the outward MCP trust boundary.
  • Standard — phionyx-evaluation-standard (v0.1.1 + v0.2.0 released; v0.3 is a draft layer): a vendor-neutral spec defining L0-L3 (evaluation maturity), D0-D3 (determinism), and CG-L0…CG-L5 (claim-governance, the v0.3 draft layer). L0-L3 / D0-D3 rate any runtime; CG-L0…CG-L5 rates the gate. phionyx-core is the reference implementation scoring L3 + D3.
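The gate idea described above, checking an agent's "I changed this" claims against what the diff actually contains, can be illustrated with a minimal sketch. This is not the phionyx-pipeline-mcp API: the `Claim` type and both function names are hypothetical, and the only external behaviour relied on is the standard `git diff --name-only` command.

```python
import subprocess
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Claim:
    """Hypothetical record of one agent self-claim."""
    text: str  # what the agent said, e.g. "fixed the parser"
    path: str  # the file the agent says it touched


def changed_paths(base: str = "HEAD") -> Set[str]:
    """Return the paths git reports as changed relative to `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def unsupported_claims(claims: Iterable[Claim], changed: Set[str]) -> List[Claim]:
    """Return the claims whose file does not appear in the changed set."""
    return [claim for claim in claims if claim.path not in changed]


if __name__ == "__main__":
    claims = [
        Claim("fixed the parser", "src/parser.py"),
        Claim("added tests", "tests/test_parser.py"),
    ]
    # In a real repository you would pass changed_paths() here.
    for claim in unsupported_claims(claims, {"src/parser.py"}):
        print(f"unsupported: {claim.text} ({claim.path})")
```

Keeping the comparison in a pure function (`unsupported_claims`) separates the deterministic verdict from the I/O that collects the diff, which makes the verdict easy to replay against recorded inputs.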

Where Phionyx fits

The work organises around three audience entry points, mirrored on phionyx.ai:

Bounded Authority — for safety-first AI providers

AI output should not directly become action. Phionyx adds deterministic gates between model output and real-world action.
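As a rough illustration of a deterministic gate between model output and action, the sketch below runs a proposed action through a fixed sequence of pure checks and returns a verdict with reasons. All names here (`ProposedAction`, `Verdict`, the two gates, the allowlist contents, the 200-character limit) are assumptions made for the example, not part of phionyx-core.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ProposedAction:
    """Hypothetical action a model output asks the system to perform."""
    tool: str
    argument: str


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    reasons: Tuple[str, ...]  # empty when the action is allowed


# A gate returns a refusal reason, or None when it has no objection.
Gate = Callable[[ProposedAction], Optional[str]]

ALLOWED_TOOLS = frozenset({"read_file", "search"})  # illustrative allowlist


def allowlist_gate(action: ProposedAction) -> Optional[str]:
    if action.tool not in ALLOWED_TOOLS:
        return f"tool '{action.tool}' is not on the allowlist"
    return None


def argument_length_gate(action: ProposedAction) -> Optional[str]:
    if len(action.argument) > 200:
        return "argument exceeds 200 characters"
    return None


def evaluate(action: ProposedAction, gates: Sequence[Gate]) -> Verdict:
    """Run every gate in order; the same input always yields the same verdict."""
    reasons = tuple(r for r in (gate(action) for gate in gates) if r is not None)
    return Verdict(allowed=not reasons, reasons=reasons)


GATES: Tuple[Gate, ...] = (allowlist_gate, argument_length_gate)

if __name__ == "__main__":
    print(evaluate(ProposedAction("read_file", "notes.txt"), GATES))
    print(evaluate(ProposedAction("delete_file", "notes.txt"), GATES))
```

Because each gate is a pure function of the proposed action, a recorded action can be re-evaluated later and should produce the same verdict, which is the property an audit trail depends on.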

Repos that implement and demonstrate the pattern:

  • phionyx-research — the core runtime (phionyx-core, the engine, v0.7.2); 46-block canonical pipeline, kill switch, HITL queue, ethics gate, audit chain. pip install phionyx-core.
  • phionyx-mcp-server — MCP trust boundary (v0.1.0); descriptor signing, signed envelopes, audit chain over third-party MCP tool calls.
  • phionyx-pipeline-mcp — agent self-claim gate (stable v0.2.0 = CG-L2; alpha v0.3.0a1 = CG-L3); verifies what the agent says it did against the repository's actual diff. This is the component the Claim-Governance ladder rates.
  • hearthos — applied: bounded-authority household AI. Browser-only demo + policy gates. The Governance Trilogy, Book 1.

→ Read the full argument: phionyx.ai/bounded-authority

Narrative Coherence — for game AI, NPC, and storytelling systems

When AI characters drift, the story breaks. Phionyx detects narrative drift, state incoherence, and unsafe output before the scene reaches the player.

  • phionyx-research ships the NPC drift reference trace under examples/physics/ — source-inspectable today; end-to-end runnable on the current phionyx-core v0.7.2 classifier surface.
  • trace.phionyx.ai/school — School RPG demo (external surface) running the same coherence mechanism end-to-end.

→ Read the full argument: phionyx.ai/narrative-coherence

Reviewer Evidence — for researchers and technical reviewers

Every claim should be reproducible. Verify Phionyx through installable packages, tests, evidence rows, and public artefacts.

  • phionyx-evaluation-standard — vendor-independent evaluation standard (v0.1.1 + v0.2.0 released; v0.3 draft layer). Defines L0-L3 (evaluation maturity), D0-D3 (determinism), and CG-L0…CG-L5 (claim-governance, v0.3 draft). v0.2.0 ships the Evidence-Oriented Runtime Telemetry Profile + JSON Schema + worked evidence rows. phionyx-core is the reference implementation scoring L3 + D3; the CG ladder rates the gate phionyx-pipeline-mcp.
  • phionyx-eval-inspect — Inspect AI bridge (v0.1.0). Runtime evidence exported into Inspect .eval evaluation logs. Replayable agent evaluations.
  • phionyx-langchain-langgraph — LangChain + LangGraph adapters (v0.1.0a1). Every chain / tool / LLM event + supervisor handoff becomes a signed, hash-chained envelope.
  • phionyx-openai-agents — OpenAI Agents SDK tracing bridge (v0.1.0a1). Every Trace and Span becomes a signed, hash-chained envelope.

→ Read the full Evidence Matrix: phionyx.ai/evidence
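To make "signed, hash-chained envelope" concrete, here is a minimal sketch using only the Python standard library. It is an assumption-laden illustration, not the Phionyx envelope format: the field names (`prev`, `payload`, `hash`, `sig`) are hypothetical, and HMAC-SHA256 with a shared key stands in for whatever signature scheme the real packages use.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first envelope


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash a canonical JSON encoding of the link to the previous envelope plus the payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_envelope(chain: List[Dict[str, Any]], payload: Dict[str, Any], key: bytes) -> None:
    """Append an envelope whose hash covers the previous envelope's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    digest = _digest(prev_hash, payload)
    signature = hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()
    chain.append({"prev": prev_hash, "payload": payload, "hash": digest, "sig": signature})


def verify_chain(chain: List[Dict[str, Any]], key: bytes) -> bool:
    """Recompute every hash and signature; any edit or reordering breaks the chain."""
    prev_hash = GENESIS
    for envelope in chain:
        digest = _digest(envelope["prev"], envelope["payload"])
        expected = hmac.new(key, digest.encode("utf-8"), hashlib.sha256).hexdigest()
        if envelope["prev"] != prev_hash or envelope["hash"] != digest:
            return False
        if not hmac.compare_digest(envelope["sig"], expected):
            return False
        prev_hash = digest
    return True


if __name__ == "__main__":
    key = b"demo-key-not-for-production"
    chain: List[Dict[str, Any]] = []
    append_envelope(chain, {"event": "tool_start", "tool": "search"}, key)
    append_envelope(chain, {"event": "tool_end", "tool": "search"}, key)
    print(verify_chain(chain, key))
    chain[0]["payload"]["tool"] = "delete_file"  # tamper with an earlier record
    print(verify_chain(chain, key))
```

Each envelope's hash commits to the one before it, so altering or removing an earlier record invalidates every later link. A production system would likely prefer an asymmetric signature so that verifiers do not need the signing key.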

Core principles

  • LLM output is not truth; it is a signal requiring governance.
  • AI systems need runtime control, not only prompt-level safety.
  • Safety, coherence, and telemetry should be structured before response release.
  • Evaluation must include behavioural stability, not only benchmark performance.
  • Human-facing AI should be explainable, auditable, and interruptible.

Latest writing

  • Phionyx Evaluation Standard v0.2.0 — Evidence-Oriented Runtime Telemetry Profile (2026-05-24 · Release)
  • Persistent Worlds Need Deterministic Governance (2026-05-22 · Substack post 5)
  • A model saying "fixed" is not evidence (2026-05-22 · X Article)
  • MCP Connects Tools. Runtime Evidence Keeps Agents Accountable. (2026-05-19 · X Article)
  • The Phionyx Architecture: Treating LLMs as Sensors, Not Oracles (2026-05-09 · Substack post 4)

If you need runtime evidence for agentic AI, watch phionyx-research to get email updates when we ship new experiments.

Pinned

  1. phionyx-research (Public)

    Runtime evidence layer for agentic AI — signed audit chain, deterministic gate verdicts with record-bound audit replay. pip install phionyx-core.

    Python 2 3

  2. phionyx-evaluation-standard (Public)

    Vendor-independent evaluation standard for agentic AI runtimes. JSON-schema signals: reliability, safety, coherence, determinism.

  3. phionyx-eval-inspect (Public)

    Inspect AI bridge — Phionyx runtime evidence exported into Inspect eval logs. Replayable agent evaluations.

    Python

  4. phionyx-mcp-server (Public)

    MCP trust boundary — descriptor signing, signed envelopes, audit chain over third-party MCP tool calls.

    Python

  5. phionyx-langchain-langgraph (Public)

    LangChain + LangGraph adapters for Phionyx runtime evidence — every chain / tool / LLM event + supervisor handoff becomes a signed, hash-chained envelope.

    Python

  6. phionyx-openai-agents (Public)

    OpenAI Agents SDK tracing bridge for Phionyx runtime evidence — every Trace and Span becomes a signed, hash-chained envelope.

    Python