
Add multi-DB assistant with auto-monitor, analyse, session monitor, SQL tuning & snapshot compare#15

Open
devin-ai-integration[bot] wants to merge 19 commits into main from devin/1775313621-pg-assistant-cli


devin-ai-integration[bot] commented on Apr 4, 2026


Summary

Adds a Python tool under tools/pg-assistant/ that converts natural language questions into SQL queries using a local Ollama LLM and executes them against PostgreSQL or Oracle databases via a Streamlit web UI. Includes automated tablespace monitoring with auto-extend, fully programmatic performance analysis, live session/lock monitoring, an AI-powered SQL tuning advisor, and side-by-side snapshot comparison with Plotly visualizations.

Modules:

  • app.py — Streamlit web UI with sidebar for connection/profile management, tabbed interface (Query, Schema, Auto Monitor, Auto Analyse, Sessions & Locks, SQL Tuning Advisor, Compare Snapshots, History)
  • db_client.py — Abstract BaseDBClient with PostgreSQLClient (psycopg2) and OracleClient (oracledb thin mode) implementations, plus create_db_client() factory
  • llm_client.py — Ollama REST API client (/api/generate)
  • sql_generator.py — Prompt engineering with schema injection, dual-DB system prompts (PostgreSQL/Oracle SQL dialects), SQL extraction, keyword-based safety validation, retry logic
  • profile_manager.py — Save/load/delete connection profiles as JSON (~/.pg-assistant/profiles.json) with db_type and service_name fields
  • auto_monitor.py — TablespaceMonitor class: periodic tablespace usage checks (configurable interval, default 1 hr), auto-extend Oracle datafiles up to 20 GB per file, PostgreSQL storage size reporting
  • auto_analyse.py — PerformanceAnalyser class: live V$/pg_stat_* collection, AWR snap-ID range analysis (Oracle), pgProfile sample-ID range analysis (PostgreSQL), latest pg_stat_statements snapshot, uploaded report file parsing (AWR HTML/text, CSV, pgProfile). Analysis is 100% programmatic — Python code extracts real findings from DB data with specific SQL IDs, table names, query text, and exact fix commands. No LLM involved in analysis (codellama was hallucinating generic advice). Includes best-practice checks for row contention, sequence caching, high elapsed time, full table scans, high execution count, temp usage, and 30+ other sections.
  • session_monitor.py — SessionMonitor class: active sessions, blocking lock tree, lock details, long-running queries, wait events, and kill/cancel session for both Oracle and PostgreSQL
  • sql_tuning_advisor.py — SQLTuningAdvisor class: EXPLAIN PLAN execution, per-table metadata collection (columns, indexes, statistics), and LLM-powered tuning recommendations with specific index/rewrite/maintenance suggestions
  • snapshot_compare.py — SnapshotComparator class: compare two AWR (Oracle) or pgProfile (PostgreSQL) snapshot ranges, compute delta metrics, generate Plotly bar/pie charts for visual comparison, and produce programmatic differential analysis (no LLM)
  • requirements.txt — requests, psycopg2-binary, oracledb, streamlit, pandas, plotly
  • README.md — Architecture, usage, installation docs
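
The delta metrics that snapshot_compare.py computes between two snapshot ranges can be pictured with a small helper. This is only a sketch: the names `percent_delta` and `compare_metrics`, and the choice to return `None` for a zero baseline, are assumptions for illustration and not the PR's actual code.

```python
from typing import Optional


def percent_delta(before: float, after: float) -> Optional[float]:
    """Percentage change from snapshot A (before) to snapshot B (after).

    Returns None when the baseline is zero, because the relative
    change is undefined there (hypothetical convention).
    """
    if before == 0:
        return None
    return (after - before) / abs(before) * 100.0


def compare_metrics(a: dict, b: dict) -> dict:
    """Build {metric: (a_value, b_value, pct_delta)} for metrics present in both."""
    return {
        name: (a[name], b[name], percent_delta(a[name], b[name]))
        for name in a
        if name in b
    }
```

Metrics that appear in only one snapshot are skipped here; a real comparison may prefer to report them as added or removed.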

Key behaviors:

  • On connect, fetches schema metadata (information_schema for PG, ALL_TAB_COLUMNS for Oracle) and injects it into every LLM prompt
  • Blocks dangerous SQL keywords (DROP, DELETE, UPDATE, etc.) and enforces SELECT/WITH-only queries in the natural language path
  • Auto Monitor uses a separate internal code path for administrative DDL (ALTER TABLESPACE, ALTER DATABASE DATAFILE)
  • Retries SQL generation up to 3 times if validation fails; additionally, if a generated query fails at the DB level (e.g. ORA-00933), the error is automatically fed back to the LLM for a corrected re-generation attempt
  • Query results rendered as interactive DataFrames with CSV download
  • Connection profiles persist across sessions via JSON file with db_type field
  • Oracle column names are normalized to lowercase in OracleClient.execute_query() for consistent downstream access
  • All analysis queries include 500-character SQL text snippets and exclude system schemas/queries
  • Performance analysis is fully programmatic: _build_findings_report() covers 30+ data sections — all extracted from real DB data with specific SQL IDs, table names, query text, and exact fix commands
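
The SELECT/WITH-only rule and keyword blocking described above could look roughly like the following. It is a minimal sketch under assumptions: the function name `is_safe_select` and the exact blocklist are hypothetical (the PR only names DROP, DELETE, UPDATE "etc."), and the real check in sql_generator.py may differ.

```python
import re

# Hypothetical blocklist; the PR names DROP, DELETE, UPDATE "etc."
BLOCKED_KEYWORDS = {
    "DROP", "DELETE", "UPDATE", "INSERT",
    "ALTER", "TRUNCATE", "GRANT", "REVOKE",
}


def is_safe_select(sql: str) -> bool:
    """Return True only for a single SELECT/WITH statement with no blocked keyword."""
    stripped = sql.strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input or more than one statement
    # Underscores stay inside a word, so a column like updated_at is not split.
    words = re.findall(r"[A-Za-z_]+", stripped.upper())
    if not words or words[0] not in {"SELECT", "WITH"}:
        return False
    return not any(word in BLOCKED_KEYWORDS for word in words)
```

Keyword matching of this kind is deliberately conservative: a query whose string literal contains a blocked word (for example `WHERE action = 'delete'`) is rejected too.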

Updates since last revision

Restructured analysis output to enterprise DBA / Copilot-quality format:

  • auto_analyse.py: Complete rewrite of _build_findings_report() to produce structured, severity-grouped output:
    • Executive Summary with health rating (CRITICAL / WARNING / ADVISORY / HEALTHY) and key metric headlines
    • Database & Workload Overview table (cache hit, backends, commits, rollbacks, WAL, temp usage)
    • Top Bottlenecks grouped by severity level (SEV-1 Critical, SEV-2 Important, SEV-3 Advisory) — each bottleneck includes specific SQL IDs, table names, exact metrics, and markdown tables
    • Configuration Review from pg_settings / v$parameter with risk flags (e.g. statement_timeout=0, high max_connections)
    • Risk Register table (Risk, Likelihood, Impact)
    • Prioritised Action Plan with Priority 0 (Immediate), Priority 1 (Structural), Priority 2 (Performance Hygiene) groupings
  • New data collection queries added:
    • PostgreSQL: pg_stat_wal (WAL volume, FPI, sync time — PG 14+), pg_total_relation_size with TOAST breakdown, idle-in-transaction sessions, pg_settings configuration parameters, pg_stat_replication lag
    • Oracle: v$parameter configuration, v$session idle sessions (> 5 min)
  • New bottleneck detectors: rollback explosion, idle-in-transaction sessions, WAL pressure, replication lag, table bloat, checkpoint pressure, backend buffer writes, temp file spilling, high redo log switches
  • snapshot_compare.py: Removed dead LLM code (_format_comparison_text() and _get_llm_comparison() methods)
  • app.py: Updated all spinner text and button labels to remove "LLM" references from analysis paths (analysis is fully programmatic, LLM is only used for SQL generation and SQL Tuning Advisor)
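
The severity grouping and health rating described above could be driven by a small, tunable threshold object instead of hardcoded literals. The sketch below is illustrative only: `Thresholds`, `classify`, and `health_rating` are hypothetical names, the two limits reuse values quoted elsewhere in this description, and both the severity assigned to a low cache hit ratio and the severity-to-rating mapping are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Tunable limits; the defaults mirror values quoted in this PR."""
    min_cache_hit_pct: float = 95.0
    max_rollback_pct: float = 10.0


def classify(metrics: dict, limits: Thresholds = Thresholds()) -> list:
    """Return (severity, message) findings for one set of metrics."""
    findings = []
    rollback = metrics.get("rollback_pct")
    if rollback is not None and rollback > limits.max_rollback_pct:
        findings.append(("SEV-1", f"rollback rate {rollback:.1f}%"))
    cache_hit = metrics.get("cache_hit_pct")
    # Assumption: the PR does not say which severity a low cache hit maps to.
    if cache_hit is not None and cache_hit < limits.min_cache_hit_pct:
        findings.append(("SEV-2", f"cache hit ratio {cache_hit:.1f}%"))
    return findings


def health_rating(findings: list) -> str:
    """Map the worst severity present to a rating (assumed mapping)."""
    severities = {severity for severity, _ in findings}
    if "SEV-1" in severities:
        return "CRITICAL"
    if "SEV-2" in severities:
        return "WARNING"
    if "SEV-3" in severities:
        return "ADVISORY"
    return "HEALTHY"
```

Passing a different `Thresholds` instance, for example `Thresholds(max_rollback_pct=20.0)`, would let an environment with a naturally high rollback rate avoid spurious SEV-1 findings.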

Review & Testing Checklist for Human

  • Never tested end-to-end — All code passed ruff lint/format checks only. The Streamlit UI, database connections, all eight tabs, and all Oracle/PostgreSQL query paths have not been run against actual Ollama, PostgreSQL, or Oracle services. This is the highest-risk item.
  • _build_findings_report() is ~850 lines of untested analysis logic — Accesses specific dictionary keys from query results (e.g., row.get("sql_id"), row.get("xact_duration_sec")). If queries return different key names or column structures, bottlenecks will silently be empty. Hardcoded thresholds (cache hit < 95%, rollback rate > 10% = SEV-1, seq scans > 100, etc.) may not suit all environments. Critically verify the output contains real SQL IDs, real table names, real metrics from your database — not empty sections.
  • New queries may fail on specific DB versions — pg_stat_wal is PG 14+ only (version check exists), but pg_stat_replication requires replication privileges, and v$parameter and v$session require DBA-level access on Oracle. Verify all new queries work with your user's grant level.
  • snapshot_compare.py is ~950 lines of untested code — Contains complex SQL queries for Oracle AWR and pgProfile, delta computation logic, and Plotly chart generation. Column names, join conditions, and WHERE filters have not been validated against a real database.
  • Kill session executes immediately with no confirmation dialog — The Sessions & Locks tab has ALTER SYSTEM KILL SESSION (Oracle) and pg_terminate_backend (PostgreSQL) behind a single button click. There is a warning label but no "Are you sure?" confirmation step.
  • SQL Tuning Advisor EXPLAIN ANALYZE actually executes the query — A user could paste a write statement (INSERT/UPDATE/DELETE) and EXPLAIN ANALYZE on PostgreSQL would run it. The UI has a caution warning but no SQL validation on this path.
  • Auto-monitor executes DDL without confirmation — auto_monitor.py runs ALTER DATABASE DATAFILE ... AUTOEXTEND ON and ALTER TABLESPACE ... ADD DATAFILE automatically when thresholds are exceeded.
  • Plaintext password storage in profile_manager.py — database passwords are saved as plaintext JSON in ~/.pg-assistant/profiles.json.
  • pgProfile schema assumptions may be wrong — Both auto_analyse.py and snapshot_compare.py assume pgProfile tables (profile.samples, profile.stmt_list, profile.sample_statements, profile.wait_sampling_total) with specific column names. pgProfile's schema varies by version.
  • No unit tests — zero test coverage for all modules.
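
One possible mitigation for the plaintext password item in the checklist above is to drop the password before writing the profile and to restrict the file's permissions. This is a hedged sketch, not what profile_manager.py currently does: `save_profile` and the `"password"` key name are assumptions, and the user would then need to be prompted for the password at connect time.

```python
import json
import os
from pathlib import Path


def save_profile(path: Path, name: str, profile: dict) -> None:
    """Persist a profile without its password and restrict file permissions."""
    profiles = json.loads(path.read_text()) if path.exists() else {}
    # Assumed key name: "password". Everything else is stored as-is.
    profiles[name] = {k: v for k, v in profile.items() if k != "password"}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(profiles, indent=2))
    os.chmod(path, 0o600)  # owner read/write only; limited effect on Windows
```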

Suggested test plan: Run streamlit run app.py with Ollama (codellama) running and both a reachable PostgreSQL and Oracle instance. Verify:

  1. Connecting to PostgreSQL via sidebar form, saving and loading a profile
  2. Connecting to Oracle via sidebar form (with service_name), saving and loading a profile
  3. Asking a simple question in Query tab for each DB type (e.g. "show top 10 tables by row count")
  4. Verify Oracle queries use ROWNUM syntax, not FETCH FIRST
  5. Intentionally trigger a failing query and verify the auto-retry regenerates corrected SQL
  6. Schema tab loads correctly for each DB type
  7. A dangerous prompt is blocked ("delete all users")
  8. Auto Monitor tab: run a one-time check for Oracle (verify tablespace data appears), start periodic monitoring
  9. Auto Analyse — Live mode: collect data and run full analysis for each DB type; critically verify output has severity-grouped bottlenecks (SEV-1/2/3) with real SQL IDs, real table names, real metrics — not empty sections or generic placeholders
  10. Auto Analyse — AWR Snap ID mode (Oracle): load snapshots, select a range, run analysis
  11. Auto Analyse — pgProfile Snap ID mode (PostgreSQL): load samples, select a range, run analysis
  12. Auto Analyse — Latest pg_stat_statements (PostgreSQL): verify extension check, run analysis
  13. Auto Analyse — Upload mode: upload an AWR HTML report and a pg_stat_statements CSV, verify parsing and summary
  14. Verify analysis output excludes system queries (no SYS/SYSTEM schema SQL, no SET/RESET/BEGIN queries)
  15. Sessions & Locks tab: verify Active Sessions view shows data for each DB type, cycle through all five views
  16. Sessions & Locks tab: verify Blocking Lock Tree and Lock Details render correctly (may need to simulate a blocking lock)
  17. Sessions & Locks tab: test kill/cancel session on a disposable test session (verify Oracle SID/Serial# and PostgreSQL PID inputs work)
  18. SQL Tuning Advisor tab: paste a simple SELECT, run with EXPLAIN only (PostgreSQL), verify plan output and LLM recommendations appear
  19. SQL Tuning Advisor tab: paste a multi-table JOIN query, verify table metadata (columns, indexes, stats) is collected and shown in the expander
  20. SQL Tuning Advisor tab: test with Oracle — verify EXPLAIN PLAN FOR + DBMS_XPLAN.DISPLAY path works
  21. SQL Tuning Advisor tab: test EXPLAIN ANALYZE checkbox (PostgreSQL) — verify the query actually executes and shows actual vs estimated rows
  22. Compare Snapshots tab (Oracle): select two AWR snapshot ranges, run comparison, verify delta table renders, Plotly charts display, and programmatic analysis references real SQL IDs
  23. Compare Snapshots tab (PostgreSQL): select two pgProfile sample ranges, run comparison, verify same outputs
  24. Compare Snapshots tab: verify charts show grouped bars with Snapshot A vs B and percentage deltas are calculated correctly
  25. CSV download works on query results
  26. Switching between PG and Oracle profiles without stale state
  27. Verify the timeout slider works: set to 60s, trigger a large-schema query, confirm timeout behavior

Notes

  • Performance analysis is now fully programmatic — LLM is only used for SQL generation (Query tab) and SQL Tuning Advisor. Auto Analyse and Compare Snapshots tabs produce findings entirely from Python code analyzing real DB data.
  • Analysis output follows enterprise DBA assessment format: Executive Summary → Workload Overview → Severity-grouped Bottlenecks → Configuration Review → Risk Register → Prioritised Action Plan.
  • This tool auto-executes generated SQL without user confirmation. For production use, consider adding a confirmation step.
  • Bare module imports (from db_client import ...) require running from the tools/pg-assistant/ directory. They will break if the tool is invoked from elsewhere or installed as a package.
  • oracledb thin mode does not require Oracle Client installation but may not support all Oracle features (e.g. Advanced Queuing, Continuous Query Notification).
  • psycopg2 connection objects stored in Streamlit session_state may not survive all rerun edge cases; the is_connected property mitigates this with a health-check query but manual reconnection may occasionally be needed.
  • Default Ollama timeout is 300s. The first request after model load is typically the slowest; subsequent requests should be faster. Users can adjust via the sidebar slider.
  • Uploaded report files are truncated to 15,000 characters. Very large AWR or pgProfile reports will lose tail content — the most important sections (top SQL, wait events) are typically near the top, but verify critical data isn't being cut.
  • HTML report parsing uses simple regex-based tag stripping, not a full HTML parser. Complex AWR HTML reports with nested tables may lose formatting context.
  • The README.md architecture diagram does not yet reflect the Session Monitor, SQL Tuning Advisor, or Compare Snapshots modules.
  • Snap-ID queries build SQL with Python string .replace() / .format() placeholders, a pattern open to SQL injection. Values originate from DB query results (not direct user input) but pass through Streamlit selectbox → integer → string interpolation. Similarly, the Oracle kill-session path in session_monitor.py uses .format(sid=sid, serial=serial): values come from st.number_input (integer-constrained), but the module itself doesn't validate types.
  • Session monitor and auto-analyse require elevated privileges — Oracle: DBA/SELECT_CATALOG_ROLE; PostgreSQL: pg_monitor role or superuser. Users with limited grants will get permission errors.
  • Hardcoded analysis thresholds (cache hit < 95%, rollback rate > 10% = SEV-1, elapsed > 5s per exec, exec count > 1000, seq scans > 100 on tables > 10k rows, etc.) are reasonable defaults but may need tuning for specific environments.
  • LLMClient is still imported and passed to PerformanceAnalyser and SnapshotComparator constructors for API compatibility, even though neither class uses it for analysis anymore. Minor tech debt — could be cleaned up in a follow-up.
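
For the string-interpolation note above, one option is to validate snapshot IDs inside the module and pass them as bind parameters rather than formatting them into the SQL. The sketch below is an assumption about how that might look: `snap_range_params` and `ORACLE_SNAP_SQL` are hypothetical names, and the query text is illustrative rather than taken from the PR.

```python
def snap_range_params(begin_snap, end_snap) -> dict:
    """Validate a snapshot range and return bind parameters for the query."""
    for value in (begin_snap, end_snap):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(
                f"snapshot ID must be an int, got {type(value).__name__}"
            )
    if begin_snap < 0 or end_snap < begin_snap:
        raise ValueError("expected 0 <= begin_snap <= end_snap")
    return {"begin_snap": begin_snap, "end_snap": end_snap}


# Hypothetical query text using named binds instead of .format();
# oracledb accepts :name placeholders, psycopg2 uses %(name)s.
ORACLE_SNAP_SQL = (
    "SELECT snap_id FROM dba_hist_snapshot "
    "WHERE snap_id BETWEEN :begin_snap AND :end_snap"
)
```

With a DB-API cursor this would be executed as `cursor.execute(ORACLE_SNAP_SQL, snap_range_params(begin, end))`, so the values never become part of the SQL text.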

Link to Devin session: https://partner-workshops.devinenterprise.com/sessions/75db244b07ca4a3db4c6563dafd2cafc



- app.py: Main CLI loop with rich terminal output, argument parsing
- llm_client.py: Ollama API client for LLM communication
- mcp_client.py: MCP PostgreSQL server client for query execution
- sql_generator.py: Prompt engineering, SQL extraction, and safety validation
- requirements.txt: Python dependencies (requests, rich)
- README.md: Architecture docs, usage examples, installation instructions

Features:
- Natural language to SQL via Ollama (codellama model)
- Schema-aware prompt engineering
- SQL safety enforcement (SELECT-only, blocks dangerous keywords)
- Retry logic for failed SQL generation
- Rich formatted output with timing metrics
- Interactive CLI commands (help, schema, clear, exit)


…ofiles

- Replace CLI (app.py) with Streamlit web UI
- Replace MCP client with direct PostgreSQL connection via psycopg2 (db_client.py)
- Add connection profile manager for save/load DB configs (profile_manager.py)
- Update requirements.txt with streamlit, psycopg2-binary, pandas
- Update README with new architecture and usage docs
- Keep llm_client.py and sql_generator.py unchanged
devin-ai-integration[bot] changed the title from "Add AI-powered PostgreSQL assistant CLI tool (pg-assistant)" to "Add AI-powered PostgreSQL assistant with Streamlit UI (pg-assistant)" on Apr 4, 2026
- Refactor db_client.py with abstract BaseDBClient, PostgreSQLClient, OracleClient
- Add oracledb driver support (thin mode, no Oracle Client needed)
- Add db_type dropdown in profile manager and connection sidebar
- Add auto_monitor.py: periodic tablespace monitoring, auto-extend datafiles (max 20GB/file)
- Add auto_analyse.py: AWR/pg_stat_statements analysis with LLM summary + action plan
- Update sql_generator.py for dual-DB SQL dialects
- Update Streamlit UI with Auto Monitor and Auto Analyse tabs
- Update requirements.txt with oracledb dependency
- Update README.md with new architecture and features
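
The auto-extend behaviour in these commit notes (grow datafiles, capped at 20 GB per file) implies a decision step before any DDL is issued. A rough sketch of that decision follows; `plan_extension`, the action names, and the 85% trigger level are all hypothetical, since the PR does not state the threshold or the exact rule for adding a new datafile.

```python
MAX_FILE_BYTES = 20 * 1024**3  # 20 GB per-file cap stated in this PR


def plan_extension(
    file_sizes: list, used_pct: float, threshold_pct: float = 85.0
) -> str:
    """Decide what to do for one tablespace.

    threshold_pct is a hypothetical default; the PR does not state
    the usage level that triggers an extension.
    """
    if used_pct < threshold_pct:
        return "none"
    if any(size < MAX_FILE_BYTES for size in file_sizes):
        return "extend_datafile"  # grow an existing file, up to the cap
    return "add_datafile"  # every file is already at the cap
```

Separating the decision from the DDL execution would also make it easy to show the planned action to the user before running it.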
devin-ai-integration[bot] changed the title from "Add AI-powered PostgreSQL assistant with Streamlit UI (pg-assistant)" to "Add multi-DB assistant (PostgreSQL + Oracle) with auto-monitor and auto-analyse" on Apr 4, 2026
…n UI

- Default timeout increased from 120s to 300s (first model load is slow)
- Added timeout slider (60-600s) in Ollama Settings sidebar
- Improved timeout error message with troubleshooting hint
- Update Oracle system prompt to use ROWNUM instead of FETCH FIRST/OFFSET
  (compatible with Oracle 11g+, fixes ORA-00933)
- Increase MAX_RETRIES from 2 to 3 for SQL generation
- Add auto-retry in Query tab: when a query fails with a DB error, the
  error is fed back to the LLM to regenerate corrected SQL automatically
- Explicit Oracle syntax guidance: NVL, DUAL, TO_DATE, subquery for ORDER BY + ROWNUM
- Oracle: AWR snap ID range selector (queries DBA_HIST_SNAPSHOT, collects
  DBA_HIST_SQLSTAT/SYSTEM_EVENT/SYSSTAT for selected range)
- PostgreSQL: pgProfile sample ID range selector (queries profile.samples,
  collects profile.stmt_list/wait_sampling_total for selected range)
- PostgreSQL: latest pg_stat_statements one-click analysis with extension check
- Both: file upload for AWR HTML/text, pg_stat_statements CSV, pgProfile reports
- Auto Analyse tab now has radio button mode selector per DB type
- Parsed report text shown in expander when no raw data available
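
The report upload path in these commit notes, combined with the regex tag stripping and 15,000-character truncation mentioned in the Notes, can be approximated as below. The function name `report_to_text` and the exact regular expressions are assumptions; the PR's parser may differ.

```python
import re

MAX_REPORT_CHARS = 15_000  # truncation limit stated in this PR


def report_to_text(raw: str, limit: int = MAX_REPORT_CHARS) -> str:
    """Strip HTML tags with a regex, collapse whitespace, and truncate."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    collapsed = re.sub(r"\s+", " ", no_tags).strip()
    return collapsed[:limit]
```

As the Notes point out, this kind of stripping discards table structure, and anything past the limit is silently lost.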

devin-ai-integration[bot] left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 8 additional findings.


Oracle's oracledb driver returns column names in UPPERCASE by default.
Normalize to lowercase in OracleClient.execute_query() so all downstream
code (AWR snap selector, auto_analyse, etc.) can use lowercase keys
consistently.
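
The normalization described in this commit message amounts to lowercasing the names in the cursor's column description before building row dictionaries. A minimal sketch, assuming a DB-API style `description` whose entries start with the column name (the helper name `rows_to_dicts` is hypothetical):

```python
def rows_to_dicts(description, rows) -> list:
    """Convert DB-API rows to dicts keyed by lowercase column name."""
    # Each description entry is a sequence whose first item is the column name.
    columns = [col[0].lower() for col in description]
    return [dict(zip(columns, row)) for row in rows]
```

With this in place, downstream code can call `row.get("sql_id")` regardless of whether the driver reported `SQL_ID` or `sql_id`.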
- Oracle: collect top CPU SQL (v$sql by cpu_time), full table scans
  (v$sql_plan TABLE ACCESS FULL), existing indexes (all_indexes +
  all_ind_columns with LISTAGG), stale stats (all_tab_statistics),
  and execution plans (v$sql_plan detail for top 5 sql_ids)
- PostgreSQL: collect top CPU queries (pg_stat_statements with
  blk_read_time/temp_blks), seq scan tables (pg_stat_user_tables
  with avg rows per scan), existing indexes (pg_indexes with DDL),
  stale stats/vacuum (dead tuples, last_analyze), lock waits
  (pg_stat_activity)
- Rewrote LLM system prompt to require SQL-ID-specific analysis:
  high-CPU SQL with exact sql_id/queryid, full table scan tables
  with causing sql_id, missing index CREATE statements referencing
  the queryid that benefits, stale stats with ANALYZE/DBMS_STATS
  commands, unused index DROP statements, and numbered action plan
  with exact SQL commands and expected improvement
Session/Lock Monitor (session_monitor.py):
- Active sessions view (v$session / pg_stat_activity)
- Blocking lock tree with recursive hierarchy (CONNECT BY for Oracle,
  recursive CTE for PostgreSQL)
- Lock details (v$lock / pg_locks with object names)
- Long-running queries (>5s threshold)
- Wait event chains
- Kill/cancel session UI (ALTER SYSTEM KILL SESSION for Oracle,
  pg_cancel_backend/pg_terminate_backend for PostgreSQL)
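
The PR builds the blocking hierarchy in SQL (CONNECT BY on Oracle, a recursive CTE on PostgreSQL). To illustrate the shape of the result, here is a Python sketch that renders the same kind of tree from flat (blocked, blocking) pairs; `render_lock_tree` is a hypothetical helper and is not part of session_monitor.py.

```python
from collections import defaultdict


def render_lock_tree(edges: list) -> list:
    """Render (blocked_pid, blocking_pid) pairs as indented tree lines."""
    children = defaultdict(list)
    blocked = set()
    for blocked_pid, blocking_pid in edges:
        children[blocking_pid].append(blocked_pid)
        blocked.add(blocked_pid)
    # Root blockers hold locks but are not themselves waiting on anyone.
    roots = sorted(pid for pid in children if pid not in blocked)

    lines = []

    def walk(pid, depth, seen):
        lines.append("  " * depth + str(pid))
        for child in sorted(children.get(pid, [])):
            if child not in seen:  # guard against deadlock cycles
                walk(child, depth + 1, seen | {child})

    for root in roots:
        walk(root, 0, {root})
    return lines
```

For example, `render_lock_tree([(200, 100), (300, 200), (400, 100)])` shows session 100 blocking 200 and 400, with 300 waiting behind 200. A pure deadlock cycle has no root blocker and produces no lines in this sketch.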

SQL Tuning Advisor (sql_tuning_advisor.py):
- Paste any SQL, runs EXPLAIN PLAN (Oracle) or EXPLAIN (PostgreSQL)
- Extracts tables from plan, collects per-table metadata:
  column stats, existing indexes, table stats, clustering factor
- PostgreSQL: optional EXPLAIN ANALYZE with actual execution stats
- LLM prompt requires step-by-step plan analysis, root cause,
  specific CREATE INDEX statements, SQL rewrite suggestions,
  stats maintenance commands, and numbered action plan

Updated app.py with two new tabs in the UI.
devin-ai-integration[bot] changed the title from "Add multi-DB assistant (PostgreSQL + Oracle) with auto-monitor and auto-analyse" to "Add multi-DB assistant with auto-monitor, auto-analyse, session monitor & SQL tuning" on Apr 6, 2026
…ysis, exclude system queries, 500-char SQL text
devin-ai-integration[bot] changed the title from "Add multi-DB assistant with auto-monitor, auto-analyse, session monitor & SQL tuning" to "Add multi-DB assistant with auto-monitor, analyse, session monitor, SQL tuning & snapshot compare" on Apr 6, 2026
…ead of system prompt (codellama is a completion model, not instruction-following)
… Python code now identifies all issues (high elapsed SQL, full table scans, sequence caching, stale stats, unused indexes, etc.) with real sql_ids, table names, and query text
- LLM only provides a brief supplementary summary of pre-identified findings
- Same hybrid approach applied to snapshot comparison
…LLM summary

- Add top_cpu_queries/top_cpu_sql section (most important - always shows top SQL)
- Add top_queries/top_elapsed_sql section (deduped from CPU section)
- Add database_stats overview (cache hit ratio, connections, temp usage)
- Add connection_stats section (idle connection detection)
- Add Oracle system_stats with cache hit ratio, hard parse ratio, disk sorts
- Add Oracle SGA configuration, tablespace I/O, redo log switches, temp usage
- Add Oracle execution plans display with full scan/hash join detection
- Add Oracle parallel queries section
- Add pgProfile wait events section
- Add table_stats (top tables by activity) section
- Add AWR/pgProfile fallback for top SQL sections
- Remove LLM summary entirely (codellama keeps hallucinating generic advice)
- Update app.py labels: 'Performance Analysis Report' instead of 'AI Analysis'
- All analysis is now 100% programmatic from real DB data