Andrei-WongE/research_knowledge_base

Research Knowledge Base 0.2.1

Status: Second Draft Version | Skills Release: 4th Release | Hooks Release: 3rd Release

A specialized Gemini CLI environment for high-rigor academic research, literature synthesis, and evidence mapping. This repository transforms a standard Obsidian vault into an Active Synthesis Pipeline.

Core Architecture: The Synthesis Loop

This version introduces a 4-stage lifecycle that automates the transformation of raw academic data into structured evidence.

flowchart TD
  %% Stage 1: Ingest
  subgraph Ingest["1. Ingest (Compiler + PDF)"]
    Z["Zotero library"] --> Cmp["wiki-compiler"]
    PDF["Paper PDFs"] --> PR["pdf-reading (Docling)"]
    PR --> Cmp
    Cmp --> Wiki["01_papers/"]
  end

  %% Stage 2: Audit
  subgraph Audit["2. Audit (Linter)"]
    Wiki --> Lint["wiki-linter"]
    Lint --> Wiki
    Topics["02_topics/"] --> Lint
    Lint --> Topics
  end

  %% Stage 3: Index
  subgraph Index["3. Index (Indexer)"]
    Wiki --> Idx["wiki-indexer"]
    Topics --> Idx
    Idx --> QColl["qmd embeddings"]
  end

  %% Stage 4: Synthesize
  subgraph Synthesize["4. Synthesize (Synthesizer)"]
    QColl --> Syn["wiki-synthesizer"]
    Syn --> Out["04_synthesis/"]
    Out --> Drafts["05_outputs (Manuscripts/Slides)"]
  end

  %% Feedback Loop
  Drafts -.-> Ingest
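
One way to read the flowchart is as a dependency graph: when a folder or skill changes, everything downstream of it may need to be re-run. The sketch below transcribes the arrows from the diagram into an edge list and walks them breadth-first. The `edges` array and the `downstream` helper are illustrative names, not part of the repository, and the dotted feedback arrow is omitted on purpose.

```javascript
// Edges transcribed from the flowchart. The dotted feedback arrow
// (05_outputs -> Ingest) is left out so the walk is a simple forward pass.
const edges = [
  ["Zotero library", "wiki-compiler"],
  ["Paper PDFs", "pdf-reading"],
  ["pdf-reading", "wiki-compiler"],
  ["wiki-compiler", "01_papers/"],
  ["01_papers/", "wiki-linter"],
  ["wiki-linter", "01_papers/"],
  ["02_topics/", "wiki-linter"],
  ["wiki-linter", "02_topics/"],
  ["01_papers/", "wiki-indexer"],
  ["02_topics/", "wiki-indexer"],
  ["wiki-indexer", "qmd embeddings"],
  ["qmd embeddings", "wiki-synthesizer"],
  ["wiki-synthesizer", "04_synthesis/"],
  ["04_synthesis/", "05_outputs/"],
];

// Breadth-first walk: every node reachable from `start`, in discovery order.
function downstream(start) {
  const seen = new Set([start]);
  const queue = [start];
  const order = [];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const [from, to] of edges) {
      if (from === node && !seen.has(to)) {
        seen.add(to);
        queue.push(to);
        order.push(to);
      }
    }
  }
  return order;
}

console.log(downstream("wiki-indexer"));
// [ 'qmd embeddings', 'wiki-synthesizer', '04_synthesis/', '05_outputs/' ]
```

The linter edges run in both directions because the diagram shows wiki-linter reading from and writing back to 01_papers/ and 02_topics/; the `seen` set keeps that cycle from looping.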

Included Skills

| Skill | Release | Description |
| --- | --- | --- |
| wiki-compiler | v4.0 | Converts Zotero metadata + PDF artifacts into Literature Notes. |
| wiki-linter | v4.0 | Audits the vault for structural integrity and for semantically related but unlinked clusters. |
| wiki-indexer | v4.0 | Syncs vault contents with qmd embeddings for search. |
| wiki-synthesizer | v2.0 | Bridges the gap between indexing and manuscript drafting. |

Components

  • Gemini CLI - The main orchestrator; runs agents, skills, and hooks.
  • Obsidian CLI - Official Obsidian command-line interface for vault management (preferred).
  • Zotero MCP - Exposes the Zotero library (papers, metadata, annotations) as tools.
  • tobi/qmd - Local BM25/semantic search and indexing over .md collections using the official Go binary.
  • pdf-reading Skill - Docling-powered extraction of LaTeX tables, figures, and structured Markdown from PDFs.
  • wiki-compiler Skill - Converts raw Zotero inputs and PDF artifacts into normalized literature notes using Obsidian CLI.
  • wiki-linter Skill - Audits the vault for structural integrity and identifies semantic gaps (unlinked clusters).
  • wiki-indexer Skill - Maintains tobi/qmd collections and embeddings for papers, topics, and synthesis folders.
  • wiki-synthesizer Skill - Bridges the gap between literature indexing and manuscript writing via evidence mapping.

Why these changes? (Rationale)

These enhancements move the project beyond simple note storage toward a computationally assisted research workflow:

  • Official Tooling: Migration to the official Obsidian CLI for more robust vault interactions and note management.
  • Performance Indexing: Shifting to the Go-based tobi/qmd is intended to keep indexing and search fast and reliable across large research collections.
  • Portability: Standardizing hooks and skills within the .gemini/ directory allows the entire research environment to be vault-local and shared across different machines (Windows/WSL2).
  • Extraction over Transcription: Continued integration of pdf-reading to automate the extraction of complex LaTeX tables and figures.
  • Semantic Rigor: wiki-linter now uses embeddings to find "unlinked clusters": papers that should be connected conceptually but are not.
  • Evidence Mapping: wiki-synthesizer v2.0 lets you query for specific effects and receive mapped summaries of the evidence.
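
To make the "unlinked clusters" idea concrete, here is a minimal sketch of one way such a check could work: compare note embeddings by cosine similarity and report pairs that score above a threshold but have no link in either direction. This is an illustration under stated assumptions, not wiki-linter's actual implementation; the note ids, the 3-dimensional vectors, and the 0.95 threshold are all made up.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pairs of notes that are similar (>= threshold) but share no link
// in either direction.
function unlinkedPairs(notes, threshold) {
  const pairs = [];
  for (let i = 0; i < notes.length; i++) {
    for (let j = i + 1; j < notes.length; j++) {
      const a = notes[i];
      const b = notes[j];
      const linked = a.links.includes(b.id) || b.links.includes(a.id);
      if (!linked && cosine(a.vec, b.vec) >= threshold) {
        pairs.push([a.id, b.id]);
      }
    }
  }
  return pairs;
}

// Toy data: hypothetical note ids and hand-written "embeddings".
const notes = [
  { id: "smith2020", vec: [0.9, 0.1, 0.0], links: ["lee2021"] },
  { id: "lee2021", vec: [0.8, 0.2, 0.1], links: [] },
  { id: "garcia2019", vec: [0.85, 0.15, 0.05], links: [] },
  { id: "chen2022", vec: [0.0, 0.1, 0.9], links: [] },
];

console.log(unlinkedPairs(notes, 0.95));
// [ [ 'smith2020', 'garcia2019' ], [ 'lee2021', 'garcia2019' ] ]
```

The pairwise loop is quadratic in the number of notes, which is fine for a sketch; a real vault would more plausibly query the existing qmd index for nearest neighbours instead.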

Automated Workflows (Hooks)

The repository includes a suite of hooks, located in ./.gemini/hooks/, that automate the research lifecycle. To enable them, add the following to your gemini-extension.json or .gemini/settings.json.

Recommended Configuration:

"hooks": {
  "SessionStart": [
    {
      "name": "wiki-context",
      "type": "command",
      "command": "node ./.gemini/hooks/session-start.js",
      "description": "Injects vault architecture into session context"
    }
  ],
  "BeforeTool": [
    {
      "matcher": "^(obsidian_write_note|obsidian_create_note)$",
      "name": "wiki-guardrails",
      "type": "command",
      "command": "node ./.gemini/hooks/before-tool.js",
      "description": "Enforces vault folder rules and required frontmatter"
    }
  ],
  "BeforeAgent": [
    {
      "name": "wiki-routing",
      "type": "command",
      "command": "node ./.gemini/hooks/routing-hook.js",
      "description": "Classifies intent and provides architectural routing hints"
    }
  ],
  "AfterTool": [
    {
      "matcher": "^(obsidian_write_note|obsidian_create_note|obsidian_patch_note)$",
      "name": "wiki-quality-gate",
      "type": "command",
      "command": "node ./.gemini/hooks/quality-gate.js",
      "description": "Validates note frontmatter and quality immediately after write"
    },
    {
      "matcher": "^(mcp_zotero_search|mcp_zotero_get_item)$",
      "name": "wiki-compiler-prompt",
      "type": "command",
      "command": "node ./.gemini/hooks/after-tool-wiki.js",
      "description": "Prompts agent to run wiki-compiler after Zotero retrieval"
    }
  ],
  "AfterAgent": [
    {
      "name": "wiki-linter-prompt",
      "type": "command",
      "command": "node ./.gemini/hooks/after-agent.js",
      "description": "Prompts wiki-linter at session end if enough notes changed"
    }
  ]
}
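
The `matcher` values in the configuration above are anchored patterns, so each hook fires only for an exact tool name. The sketch below shows which hooks would be selected for a given event and tool, assuming a matcher is treated as a regular expression tested against the tool name. The `hooksFor` helper is a hypothetical illustration, not Gemini CLI's own dispatch code.

```javascript
// Names and matchers copied from the recommended configuration above.
const hooks = {
  BeforeTool: [
    { name: "wiki-guardrails", matcher: "^(obsidian_write_note|obsidian_create_note)$" },
  ],
  AfterTool: [
    {
      name: "wiki-quality-gate",
      matcher: "^(obsidian_write_note|obsidian_create_note|obsidian_patch_note)$",
    },
    { name: "wiki-compiler-prompt", matcher: "^(mcp_zotero_search|mcp_zotero_get_item)$" },
  ],
};

// Assumption: a matcher is a regular expression tested against the tool name.
function hooksFor(event, toolName) {
  return (hooks[event] || [])
    .filter((hook) => new RegExp(hook.matcher).test(toolName))
    .map((hook) => hook.name);
}

console.log(hooksFor("BeforeTool", "obsidian_patch_note")); // []
console.log(hooksFor("AfterTool", "obsidian_patch_note")); // [ 'wiki-quality-gate' ]
console.log(hooksFor("AfterTool", "mcp_zotero_search")); // [ 'wiki-compiler-prompt' ]
```

Under this reading, obsidian_patch_note passes through the quality gate after the write but is not covered by the wiki-guardrails check beforehand, since only the AfterTool matcher lists it.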

Setup & Templates

This repository includes a new set of Academic Templates in assets/templates/:

  • literature-note.md: For normalized paper summaries (Templater-ready).
  • concept-note.md: For theoretical and conceptual definitions.
  • synthesis-note.md: For literature reviews and meta-analyses.

Recommended Folder Structure

/vault
├── .gemini/        # Configuration, hooks, and skills
│   ├── hooks/      # Portable automation scripts
│   └── skills/     # Specialized research agents
├── 01_papers/      # Normalized literature notes (zotero_key required)
├── 02_topics/      # Conceptual and methodological notes
├── 03_raw/         # Drop zone for human-managed inbox
├── 04_synthesis/   # Cross-cutting summaries & Literature reviews
├── 05_outputs/     # Manuscripts, Marp slides, and Lint reports
└── assets/         # Templates, images, and PDF artifacts
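
The "zotero_key required" rule for 01_papers/ could be checked with a few lines of JavaScript. The following is a minimal sketch, assuming paths are relative to the vault root and that frontmatter is a leading `---` block of simple `key: value` lines. `parseFrontmatter` and `checkNote` are hypothetical helpers; the repository's actual hook scripts may work differently.

```javascript
// Extract top-level `key: value` pairs from a leading `---` frontmatter block.
// Deliberately minimal: no nested YAML, lists, or multi-line values.
function parseFrontmatter(content) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  if (!match) return null;
  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx > 0) {
      fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
    }
  }
  return fields;
}

// Hypothetical rule: notes under 01_papers/ need a non-empty zotero_key.
// Backslashes are normalized so Windows-style paths behave like WSL2 ones.
function checkNote(path, content) {
  const normalized = path.replace(/\\/g, "/");
  if (!normalized.startsWith("01_papers/")) return { ok: true };
  const frontmatter = parseFrontmatter(content);
  if (!frontmatter || !frontmatter.zotero_key) {
    return { ok: false, reason: "01_papers/ notes require zotero_key in frontmatter" };
  }
  return { ok: true };
}

// Example with a made-up key and title.
const note = "---\nzotero_key: ABCD1234\ntitle: Example paper\n---\nSummary text.";
console.log(checkNote("01_papers/example.md", note)); // { ok: true }
console.log(checkNote("01_papers/example.md", "No frontmatter here.").ok); // false
```

Notes outside 01_papers/ pass unconditionally in this sketch; rules for the other folders would follow the same pattern.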

Credits

  • Maintainer: Andrei WongE
  • Engine: Gemini CLI + Docling + tobi/qmd
