Research anything on GitHub through natural language. Ask questions in plain English; a LangGraph ReAct agent plans tool calls, queries GitHub via the official MCP server, and streams answers back—backed by Gemini, durable conversation state in Neon Postgres, and optional tracing in Arize AX.
Build an agentic GitHub research assistant that:
- Discovers repos, users, issues, PRs, code, and trends without manual API wrangling
- Uses GitHub Copilot MCP for full API access through dynamically discovered tools
- Keeps multi-turn context in Postgres (not just in the browser)
- Exposes a simple Streamlit chat UI for interactive use
- Supports production observability (OpenTelemetry → Arize AX)
| Layer | Technology |
|---|---|
| UI | Streamlit |
| Agent framework | LangChain + LangGraph (create_agent, ReAct loop) |
| LLM | Google Gemini (gemini-3.5-flash via langchain-google-genai) |
| GitHub access | GitHub Copilot MCP (langchain-mcp-adapters) |
| Conversation state | LangGraph Postgres checkpointer + Neon |
| Observability | Arize AX (arize-otel, openinference-instrumentation-langchain) |
| Runtime | Python 3.11–3.13, uv |
flowchart TB
subgraph client [Client]
User[User]
UI[Streamlit app.py]
end
subgraph app_layer [Application]
Obs[observability.py]
CP[checkpointing.py]
AgentMod[agent.py]
end
subgraph runtime [Agent runtime]
Graph[LangGraph ReAct agent]
Gemini[Gemini 3.5 Flash]
MCP[GitHub MCP tools]
end
subgraph data [Persistence and ops]
Neon[(Neon Postgres checkpoints)]
Arize[Arize AX traces]
end
User --> UI
UI --> Obs
UI --> CP
UI --> AgentMod
AgentMod --> Graph
Graph --> Gemini
Graph --> MCP
CP --> Neon
Graph --> Neon
Obs --> Arize
MCP --> GitHubAPI[GitHub API]
Gemini --> GoogleAI[Google AI API]
| File | Role |
|---|---|
app.py |
Streamlit UI, session cache, async event loop, chat streaming |
agent.py |
Agent factory, MCP client, streaming helpers, system prompt |
checkpointing.py |
Neon connection pool + AsyncPostgresSaver setup |
observability.py |
Arize OTel registration + LangChain instrumentation |
tests/ |
Unit tests (credentials, chunks, observability, checkpointing) |
End-to-end path when a user sends a message in the UI:
- Streamlit appends the user message and calls
stream_agent_response. - Agent is built once per session (cached) with MCP tools and a Postgres checkpointer.
- LangGraph loads prior state for
thread_id, runs the ReAct loop (model ↔ tools). - Gemini decides whether to answer or call GitHub MCP tools.
- MCP executes GitHub API operations; results return to the model.
- Tokens stream to the UI via
astream_eventscallbacks. - Checkpointer writes graph state to Neon after each step.
- Arize (if configured) records spans for the run.
sequenceDiagram
participant User
participant Streamlit as Streamlit_UI
participant Agent as agent.py
participant Graph as LangGraph
participant Gemini
participant MCP as GitHub_MCP
participant Neon as Neon_Postgres
participant Arize as Arize_AX
User->>Streamlit_UI: Enter query
Streamlit_UI->>Streamlit_UI: init observability
Streamlit_UI->>Agent: get_or_create_agent
Agent->>Neon: setup checkpointer pool
Agent->>MCP: get_tools
Agent->>Graph: compile agent with checkpointer
Streamlit_UI->>Agent: stream_agent_response
Agent->>Graph: astream_events with thread_id
Graph->>Neon: load checkpoint
Arize-->>Graph: trace span start
loop ReAct until final answer
Graph->>Gemini: chat completion
Gemini-->>Graph: tool calls or text
alt tool calls
Graph->>MCP: execute GitHub tools
MCP-->>Graph: API results
end
Graph->>Neon: save checkpoint
Graph-->>Streamlit_UI: on_token chunks
end
Streamlit_UI-->>User: streamed markdown reply
Arize-->>Arize: export OTLP spans
- Python 3.11+
- uv
- GitHub PAT (read access is enough for many queries)
- Google AI API key
- Neon database (
DATABASE_URL) - Optional: Arize AX
ARIZE_SPACE_ID+ARIZE_API_KEY
git clone https://github.com/ankurdhuriya/gitHub-intelligence-agent.git
cd gitHub-intelligence-agent
uv sync
cp .env.example .env
# Edit .env with your secretsSee .env.example:
| Variable | Required | Description |
|---|---|---|
GITHUB_PAT |
Yes | GitHub Personal Access Token for MCP |
GOOGLE_API_KEY |
Yes | Gemini API key |
DATABASE_URL |
Yes | Neon Postgres connection string |
ARIZE_SPACE_ID |
No | Arize space ID for tracing |
ARIZE_API_KEY |
No | Arize API key for tracing |
ARIZE_PROJECT_NAME |
No | Project name in Arize (default: github-intelligence-agent) |
uv run streamlit run app.pyOpen the local URL (default http://localhost:8501). Use the chat input or sidebar presets, then watch the assistant stream tool status and the final answer.
uv run pytest
uv run ruff check .
uv run ruff format --check .LangGraph stores conversation state under a thread_id (default: streamlit_session_thread). Tables include checkpoints, checkpoint_blobs, and checkpoint_writes.
Example: list recent checkpoints in the Neon SQL editor:
SELECT checkpoint_id, checkpoint->'updated_channels' AS updated
FROM checkpoints
WHERE thread_id = 'streamlit_session_thread'
ORDER BY checkpoint_id DESC
LIMIT 10;Clearing the UI chat does not delete Postgres history unless you delete rows or use a new thread_id.
When ARIZE_SPACE_ID and ARIZE_API_KEY are set, observability.py registers OTLP export and instruments LangChain before agent imports. View traces in your Arize project (ARIZE_PROJECT_NAME).
- Per-session
thread_id— isolate conversations per Streamlit session instead of one shared thread - Clear Postgres on “Clear conversation” — wipe checkpoint rows for the active thread from the UI
- MCP reconnect — sidebar action to rebuild the agent when the MCP connection goes stale
- Deploy — containerize Streamlit; use Neon pooled URL and secrets manager in production
- Auth — gate the UI or map users to distinct
thread_ids - Evals — golden datasets in Arize Phoenix for regression on GitHub Q&A quality
- README / CI — document required GitHub Actions secrets (
GITHUB_PAT,GOOGLE_API_KEY, optional Arize)
This project is licensed under the Apache License 2.0.
