Releases · ClawBio/ClawBio

04 Apr 16:10

v0.5.0

5a10a1e

v0.5.0 — Validation & Benchmark Infrastructure Latest

What's new

ClawBio v0.5.0 introduces the platform's first systematic validation infrastructure. Every genomics skill can now be scored against curated ground truth with objective metrics: not just "did it run" but "did it find the right answer."

Benchmark Infrastructure

Alzheimer's Disease Ground Truth Set: 34 positive genes across 3 evidence tiers (4 Mendelian causal from OMIM, 20 GWAS-replicated from Bellenguez et al. 2022, 10 novel loci), 20 brain-expressed negative controls, 10 lead variants with GRCh38 coordinates, and scoring criteria with minimum acceptable thresholds.
Mock API Server: Deterministic endpoints for Ensembl REST, GWAS Catalog, and ClinPGx. Enables offline CI testing without rate limits or upstream API drift. Threaded HTTP server with context manager for test integration.
Benchmark Scorer: Measures gene recovery rate, false discovery rate, precision, recall, F1, and tier-weighted composite score. CLI and Python API. Generates markdown reports with tier breakdown.
Swappable Fine-Mapping Pipeline: First autoresearch-style benchmark. Runs ABF and SuSiE on the same synthetic locus with known causal signals, scores each on recall, precision, credible set size, and composite score, and picks the winner. Adding a new method (FINEMAP, PolyFun) requires only a single function in the method registry. First result: SuSiE (composite = 0.80) beats ABF (composite = 0.65).
Nightly Sweep Upgrade: Demo sweep now collects gene lists from skill outputs and scores against ground truth. Benchmark metrics appear alongside pass/fail in the sweep summary.
74 benchmark tests, all green.
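
The scorer's metrics can be illustrated with a minimal sketch. This is not ClawBio's implementation: the function name `score_genes`, the tier labels, the tier weights, and the choice to count only known negative controls as false positives are all assumptions made for the example, and the gene names are placeholders.

```python
from dataclasses import dataclass

# Hypothetical tier weights (stronger evidence counts more); not ClawBio's values.
TIER_WEIGHTS = {"mendelian": 3.0, "gwas": 2.0, "novel": 1.0}


@dataclass
class Score:
    precision: float
    recall: float
    f1: float
    fdr: float
    composite: float


def score_genes(predicted, positives_by_tier, negatives):
    """Score a predicted gene list against tiered positives and negative controls."""
    predicted = set(predicted)
    positives = set().union(*positives_by_tier.values())
    tp = len(predicted & positives)
    # Assumption: only curated negative controls count as false positives;
    # genes outside the ground truth set are ignored.
    fp = len(predicted & set(negatives))
    called = tp + fp
    precision = tp / called if called else 0.0
    recall = tp / len(positives) if positives else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fdr = fp / called if called else 0.0
    # Tier-weighted recall: recovered weight divided by total available weight.
    total = sum(TIER_WEIGHTS[t] * len(g) for t, g in positives_by_tier.items())
    hit = sum(TIER_WEIGHTS[t] * len(predicted & set(g)) for t, g in positives_by_tier.items())
    composite = hit / total if total else 0.0
    return Score(precision, recall, f1, fdr, composite)


# Toy ground truth with placeholder gene names.
truth = {"mendelian": {"M1", "M2"}, "gwas": {"G1", "G2"}, "novel": {"N1"}}
controls = {"X1", "X2"}
result = score_genes(["M1", "M2", "G1", "X1", "OTHER"], truth, controls)
print(result)
```

In this toy run, three of five positives are recovered and one negative control is called, so precision is 0.75, recall is 0.6, and the false discovery rate is 0.25.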

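The mock API server idea (a threaded HTTP server wrapped in a context manager, serving deterministic responses) can be sketched with the Python standard library alone. Everything here is illustrative: `mock_api`, `get_json`, and `FIXTURES` are hypothetical names, the request path only mimics the shape of an Ensembl REST lookup, and the payload is a trimmed stand-in rather than a real upstream response.

```python
import http.client
import json
import threading
from contextlib import contextmanager
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical canned responses keyed by request path.
FIXTURES = {
    "/lookup/symbol/homo_sapiens/APOE": {"display_name": "APOE", "seq_region_name": "19"},
}


class MockHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        payload = FIXTURES.get(self.path)
        status = 200 if payload is not None else 404
        body = json.dumps(payload if payload is not None else {"error": "not found"}).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


@contextmanager
def mock_api():
    """Start the mock server on a free local port and yield that port."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), MockHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield server.server_address[1]
    finally:
        server.shutdown()      # stop serve_forever
        server.server_close()  # release the socket
        thread.join(timeout=5)


def get_json(port, path):
    """Fetch a path from the local mock server and decode the JSON body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()


with mock_api() as port:
    status, gene = get_json(port, "/lookup/symbol/homo_sapiens/APOE")
    missing_status, _ = get_json(port, "/lookup/symbol/homo_sapiens/NOPE")
print(status, gene["seq_region_name"], missing_status)
```

Because responses come from a fixed dictionary, a test written against this server gives the same result offline and on every CI run, which is the property the release note describes.
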
Development Standards

Red/green TDD mandate: All skill development must use test-driven development. Write tests first, watch them fail, implement, then watch them pass. Enforced in CLAUDE.md and the contributing workflow.

New Skills (7)

| Skill | Description | Contributor |
| --- | --- | --- |
| struct-predictor | AlphaFold/Boltz protein structure prediction | @camlloyd |
| cell-detection | CellposeSAM cell segmentation from microscopy | @camlloyd |
| bigquery-public | SQL against BigQuery public genomics datasets | @YonghaoZhao722 |
| clinical-variant-reporter | ACMG/AMP variant classification | @RezaJF |
| fine-mapping | SuSiE and ABF statistical fine-mapping | @camlloyd |
| labstep | Labstep ELN bridge (experiments, protocols, inventory) | @camlloyd |
| protocols-io | protocols.io search, retrieval, authentication | @camlloyd |

Community Milestones

Won the biggest prize at the UK AI Agent Hackathon 2026 (Europe's largest AI hackathon)
Genomebook won 3rd place at AI London hackathon; bioRxiv preprint published
Bioinformatics Application Note submitted via ScholarOne
Nature feature interview (Nicola Jones) on vibe coding in science
PHURI Workshop accepted (22 Apr, Queen Mary University of London)
Corpas 30x WGS reference genome integrated with Zenodo DOI, VCF subsets, QC baselines, and 28 tests
5 tutorial tracks with Google Colab notebooks, all tested end-to-end

Security

Structured JSONL audit logging
Token redaction in httpx logs
Filesystem write restriction to PROJECT_ROOT
Conversation history sanitisation
Disclaimer enforcement in all bot messages
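
Three of the controls above (token redaction in logs, write restriction to a project root, and JSONL audit logging) can be sketched as follows. This is a hedged illustration, not ClawBio's code: the secret patterns, the helper names `redact`, `resolve_inside`, and `audit`, and the audit record fields are assumptions. The filter is written for the standard `logging` module and would be attached to the logger named `httpx`, assuming the HTTP client logs through that logger.

```python
import json
import logging
import re
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical patterns for secrets that can show up in logged URLs or headers.
SECRET_PATTERNS = [
    re.compile(r'(?i)(bearer\s+)[A-Za-z0-9._\-]+'),
    re.compile(r'(?i)((?:token|api_key)=)[^&\s"]+'),
]


def redact(text):
    """Mask anything matching a secret pattern, keeping the prefix for context."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(r"\1[REDACTED]", text)
    return text


class RedactingFilter(logging.Filter):
    """Replace each record's message with its redacted, fully formatted form."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already formatted
        return True


# Attach to the logger the HTTP client is assumed to use.
logging.getLogger("httpx").addFilter(RedactingFilter())


def resolve_inside(root, relative):
    """Return an absolute path under root, or raise if it would escape root."""
    root = Path(root).resolve()
    target = (root / relative).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"write outside project root refused: {relative}")
    return target


def audit(log_path, event, **fields):
    """Append one JSON object per line (JSONL) describing an event."""
    record = {"ts": datetime.now(timezone.utc).isoformat(), "event": event, **fields}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


with tempfile.TemporaryDirectory() as project_root:
    log_file = resolve_inside(project_root, "logs/audit.jsonl")
    log_file.parent.mkdir(parents=True, exist_ok=True)
    audit(log_file, "skill_run", skill="fine-mapping", status="ok")
    try:
        resolve_inside(project_root, "../outside.txt")
        escaped = True
    except PermissionError:
        escaped = False
    entries = [json.loads(line) for line in log_file.read_text(encoding="utf-8").splitlines()]

print(redact("GET https://api.example.org/v1?token=s3cr3t&x=1"))
```

Resolving both paths before comparing them means `..` segments and symlinks are normalised first, so a relative path cannot step outside the root by indirection.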

Stats (live, 4 Apr 2026)

587 GitHub stars | 111 forks | 13 contributors
42 skills (21 MVP, 21 planned)
687 tests (74 benchmark + 613 skill/unit tests)
447 commits total

Full changelog: v0.3.1...v0.5.0

Contributors

RezaJF, camlloyd, and YonghaoZhao722

04 Apr 07:15

manuelcorpas

v0.4.0

cbbbbd1

v0.4.0 — Community Skills Wave

What's new

5 new skills from external contributors:

struct-predictor — Boltz-2 protein structure prediction with 3D viewer (@camlloyd) #102
cell-detection — CellposeSAM fluorescence microscopy segmentation (@camlloyd) #101
bigquery-public — BigQuery public genomics data access with read-only SQL enforcement (@YonghaoZhao722) #93
Plus 3 more PRs awaiting rebase: Flow.bio bridge (#76), GWAS pipeline (#92), affinity proteomics (#96)

46 skills (up from 41 in v0.3.1)

Fixes:

Discord invite link updated across all site pages (#103)
Catalog generator YAML block scalar rendering fix (via #93)

Community:

584 stars, 110 forks, 13 contributors
7 open PRs from 4 external contributors
5,770 views / 1,820 unique visitors in last 14 days

Contributors

@camlloyd (54 commits), @jaymoore-research (41), @YonghaoZhao722 (33), @RezaJF, @alexharston, @dalloliogm, @drdaviddelorenzo, @Duvet05, @hg125chinese-sketch, @HDash, @Mandykcl, @zngwee

05 Mar 09:04

manuelcorpas

v0.3.1

5c300d5

v0.3.1 — Agent-Friendly

What's new

Most open-source repos are invisible to AI agents. README files are too long, contribution guides assume a human reader, and there's no machine-readable way to discover what a project does. This release fixes that for ClawBio.

llms.txt — A concise, LLM-optimised project summary following the emerging llms.txt standard. Lists every doc, skill, and entry point in a format that fits an agent's context window.
AGENTS.md — Universal guide for AI coding agents (Codex, Devin, Cursor, Claude Code, Copilot Workspace). Setup, commands, code style, project structure, safety boundaries, and full contribution workflow.
Machine-readable skill catalog — skills/catalog.json auto-generated by scripts/generate_catalog.py. Indexes all 21 skills with name, version, status, dependencies, tags, and trigger keywords.
Standardised SKILL.md files — All 21 skill specifications upgraded to a consistent YAML frontmatter schema with emoji, OS compatibility, install instructions, and structured methodology sections.
Upgraded SKILL-TEMPLATE.md — Best-practice template matching the new standardised format so new contributors (human or AI) start right.
Agent pointers in README + CONTRIBUTING — References to llms.txt, AGENTS.md, and catalog.json added so both human and AI contributors can find agent-specific documentation.
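
How a catalog might be generated from SKILL.md frontmatter can be sketched as below. This is not the contents of `scripts/generate_catalog.py`: the frontmatter field names, the `{"skills": [...]}` output shape, the version string, and the deliberately minimal parser (flat `key: value` pairs and inline lists only) are assumptions for illustration. A real generator would more likely use a YAML library.

```python
import json
import tempfile
from pathlib import Path


def parse_frontmatter(text):
    """Parse a flat `key: value` block between leading `---` fences.

    Handles plain scalars and inline lists such as `[a, b]` only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key: value line
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return meta


def build_catalog(skills_dir):
    """Collect frontmatter from every skills/<name>/SKILL.md into one index."""
    entries = []
    for skill_md in sorted(Path(skills_dir).glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        meta.setdefault("name", skill_md.parent.name)  # fall back to the folder name
        entries.append(meta)
    return {"skills": entries}


# Demo against a throwaway directory with one hypothetical skill file.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "skills" / "pharmgx-reporter"
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(
        "---\n"
        "name: pharmgx-reporter\n"
        "version: 0.1.0\n"
        "status: mvp\n"
        "tags: [pharmacogenomics, reporting]\n"
        "---\n"
        "# PharmGx Reporter\n",
        encoding="utf-8",
    )
    catalog = build_catalog(Path(tmp) / "skills")

print(json.dumps(catalog, indent=2))
```

Keeping the catalog derived from the skill files, rather than edited by hand, means the index cannot drift from the specifications it summarises.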

Why this matters

We're entering the era where AI agents don't just use tools — they contribute to codebases. If an agent can't discover your project, understand its architecture, and follow its conventions, it won't contribute. These six additions make ClawBio one of the first bioinformatics repos designed for agent collaboration from the ground up.

Files added / changed

| File | Status |
| --- | --- |
| `llms.txt` | New |
| `AGENTS.md` | New |
| `skills/catalog.json` | New |
| `scripts/generate_catalog.py` | New |
| `templates/SKILL-TEMPLATE.md` | Updated |
| `skills/*/SKILL.md` (21 files) | Standardised |
| `README.md` | Updated |
| `CONTRIBUTING.md` | Updated |
| `CHANGELOG.md` | New |

Full Changelog: v0.3.0...v0.3.1

01 Mar 14:04

manuelcorpas

v0.3.0

7a1aac9

v0.3.0 — Imperial College AI Agent Hack

Includes a video of ClawBio being introduced to Peter Steinberger at the UK AI Agent Hack, Imperial College London (1 March 2026).

28 Feb 07:51

manuelcorpas

v0.2.0

f4a232a

v0.2.0 — Tests, CI, and ClawHub

What's new

Test suites: 57 tests across PharmGx Reporter (24), Equity Scorer (24), NutriGx Advisor (9)
GitHub Actions CI: runs on Python 3.10, 3.11, 3.12 for every push and PR
ClawHub: 3 skills published (for example, `clawhub install pharmgx-reporter`)
Org migration: repo now at github.com/ClawBio/ClawBio
Community: issue templates, PR template, Discussion seeded, 8 open skill issues

Demo

See the NutriGx Advisor demo video attached below.
