Datatab#198

Merged
dmpantiu merged 11 commits into main from datatab
Mar 3, 2026

@kuivi
Collaborator

@kuivi kuivi commented Mar 2, 2026

FIRST ACCEPT PREVIOUS PR ;)

This PR is based on PR #197 (analysis modes) and must be merged after it.


Data Tab: Downloadable Datasets

Adds a new Data tab to the UI where users can download all datasets generated during a session. Also renames "Additional information" → "Figures".

What's new

  • downloadable_datasets tracking — a new field on AgentState that accumulates dataset entries ({label, path, source}) as they're created throughout the pipeline
  • Climate model CSVs — tracked after write_climate_data_manifest() in data_agent
  • ERA5 climatology JSON — tracked in prepare_predefined_data() after extraction
  • ERA5 time series Zarr — tracked in data_analysis_agent after retrieve_era5_data tool execution
  • DestinE time series Zarr — tracked in data_analysis_agent after retrieve_destine_data tool execution
  • Data tab in UI — lists all tracked datasets with download buttons; Zarr directories are zipped on the fly, JSON/CSV files download directly
  • Tab rename — "Additional information" → "Figures"
  • Data tab always visible — shown regardless of whether figures are available
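
The tracking described in the list above could look roughly like the following sketch. Only the entry keys (`label`, `path`, `source`) and the `downloadable_datasets` field name come from this PR; the dataclass stand-in for `AgentState`, the `track_dataset` helper, the de-duplication by path, and the example path are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Simplified stand-in for the real AgentState (assumption)."""
    downloadable_datasets: list = field(default_factory=list)


def track_dataset(state: AgentState, label: str, path: str, source: str) -> list:
    """Return a new list with the entry appended (hypothetical helper).

    An entry whose path is already tracked is not added a second time.
    """
    datasets = list(state.downloadable_datasets)
    if not any(entry["path"] == path for entry in datasets):
        datasets.append({"label": label, "path": path, "source": source})
    return datasets


state = AgentState()
# The path below is a placeholder, not a path used by the project.
for _ in range(2):
    state.downloadable_datasets = track_dataset(
        state, "ERA5 climatology", "tmp/era5_climatology.json", "ERA5"
    )
```

Calling the helper twice with the same path leaves a single entry, which keeps the Data tab from listing the same file repeatedly if a stage runs more than once.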

Pipeline fix

Each agent node now returns downloadable_datasets in its return dict so LangGraph properly merges state across stages (in-place mutation alone is not enough).
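
A minimal sketch of that pattern, assuming each node receives the current state and returns a dict of updates. The node names mirror agents mentioned in this PR, but their bodies, the `run_pipeline` driver, and the file path are hypothetical; the driver only imitates how returned keys replace state values and is not LangGraph itself.

```python
def data_agent(state: dict) -> dict:
    # Copy, then extend, so the update travels in the returned dict
    # instead of relying on in-place mutation of the incoming state.
    datasets = list(state.get("downloadable_datasets", []))
    datasets.append(
        {"label": "Climate model data", "path": "tmp/climate.csv", "source": "climate model"}
    )
    return {"downloadable_datasets": datasets}


def combine_agent(state: dict) -> dict:
    # Pass the accumulated list through unchanged alongside this node's own output.
    return {
        "final_answer": "combined answer",
        "downloadable_datasets": state.get("downloadable_datasets", []),
    }


def run_pipeline(nodes, state: dict) -> dict:
    """Toy driver (assumption): merge each node's returned keys into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_pipeline([data_agent, combine_agent], {"downloadable_datasets": []})
```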

Files changed

| File | Change |
| --- | --- |
| climsight_classes.py | Add downloadable_datasets: list = [] to AgentState |
| climsight_engine.py | Track datasets in data_agent, prepare_predefined_data, pass through combine_agent |
| data_analysis_agent.py | Track ERA5/DestinE Zarr outputs from tool intermediate steps |
| streamlit_interface.py | Rename tab, add Data tab with download buttons |

Works in all modes

  • fast — climate model CSVs + ERA5 climatology JSON
  • smart — above + ERA5 time series Zarr
  • deep — above + DestinE time series Zarr

- New tool: destine_retrieval_tool.py with two-step workflow:
  1. search_destine_parameters: RAG semantic search over 82 DestinE parameters via Chroma vector store
  2. retrieve_destine_data: download point time series via earthkit.data + polytope
- Authentication via ~/.polytopeapirc token (from desp-authentication.py)
- UI toggle for DestinE data with token file status check
- DestinE test suite (pytest -m destine), skipped by default
- Updated README with DestinE authentication instructions
Move os.chdir(REPO_ROOT) from module level to an autouse fixture that
restores the original cwd after each test, preventing side effects on
other test files that use relative paths.
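
The save-and-restore idea behind this commit can be sketched without pytest as a context manager; in the test suite the same body would sit around the `yield` of an autouse fixture. `working_directory` is a hypothetical name, and a temporary directory stands in for `REPO_ROOT`.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Change into `path`, then restore the previous cwd even if the body fails."""
    original = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(original)


before = os.getcwd()
with tempfile.TemporaryDirectory() as repo_root:  # stand-in for REPO_ROOT
    with working_directory(repo_root):
        inside = os.getcwd()
after = os.getcwd()
```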
… add utility scripts

- Fix lat/lon swap in polytope request (was [lon, lat], now [lat, lon])
- Remove "keep date ranges SHORT" limits — default to full 2020-2039 period
- Simplify intro_agent prompt
- Add standalone DestinE download scripts (simple + parallel yearly)
- Add ERA5 fetch script and test utilities
…ring

- Guide data_analysis_agent to download ERA5/DestinE variables in parallel (all in one response)
- Relax intro_agent exclusion rules to allow analysis instructions (download data, plot time series, compute statistics)
…gets

- ANALYSIS_MODES dict defines presets for tool limits, max_iterations, and toggle defaults
- resolve_analysis_config() merges mode defaults with explicit UI overrides
- Mode radio selector outside form with on_change callback syncs toggles immediately
- Prompt budgets (hard limit, max per response, reflect limit) adapt per mode
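
The merge of mode defaults with explicit overrides could be sketched as follows. The names `ANALYSIS_MODES` and `resolve_analysis_config` come from the commit message; the function signature, the preset keys, and all numeric values are placeholders assumed for illustration, not the project's actual settings.

```python
from typing import Optional

# Placeholder presets (assumption): keys and values are illustrative only.
ANALYSIS_MODES = {
    "fast": {"max_iterations": 5, "use_era5_data": False, "use_destine_data": False},
    "smart": {"max_iterations": 10, "use_era5_data": True, "use_destine_data": False},
    "deep": {"max_iterations": 20, "use_era5_data": True, "use_destine_data": True},
}


def resolve_analysis_config(mode: str, overrides: Optional[dict] = None) -> dict:
    """Start from the mode preset, then apply explicit UI overrides.

    An override of None means "not set in the UI" and keeps the preset value.
    """
    config = dict(ANALYSIS_MODES[mode])  # copy so the presets stay untouched
    for key, value in (overrides or {}).items():
        if value is not None:
            config[key] = value
    return config


config = resolve_analysis_config("fast", {"use_era5_data": True, "max_iterations": None})
```

Copying the preset before applying overrides keeps one user's toggle changes from leaking into the shared defaults.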
…n to Figures

Track all generated datasets (climate model CSVs, ERA5 climatology JSON,
ERA5/DestinE Zarr time series) through the agent pipeline via a new
downloadable_datasets field on AgentState. Each agent node now returns
the accumulated list so LangGraph properly merges state across stages.
The UI gets a new Data tab with per-file download buttons (Zarr dirs
are zipped on the fly).
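
Zipping a Zarr directory in memory, as mentioned above, might look like this sketch. The helper name `zip_directory_to_bytes` and the demo store layout are assumptions; only the standard-library `zipfile`, `io`, and `os` calls are used, and empty subdirectories are not written to the archive.

```python
import io
import os
import tempfile
import zipfile


def zip_directory_to_bytes(dir_path: str) -> bytes:
    """Zip a directory tree into memory, keeping the top-level folder name."""
    buffer = io.BytesIO()
    root_name = os.path.basename(os.path.normpath(dir_path))
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(dir_path):
            for name in files:
                full_path = os.path.join(folder, name)
                relative = os.path.relpath(full_path, dir_path)
                archive.write(full_path, os.path.join(root_name, relative))
    return buffer.getvalue()


# Demo with a made-up store layout (not a real Zarr dataset).
with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "era5_t2m.zarr")
    os.makedirs(os.path.join(store, "t2m"))
    with open(os.path.join(store, "t2m", "0"), "wb") as handle:
        handle.write(b"chunk")
    payload = zip_directory_to_bytes(store)

names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
```

The resulting bytes could then be handed to Streamlit's `st.download_button` through its `data` argument, so no temporary zip file has to be written to disk.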
@kuivi kuivi requested review from dmpantiu and koldunovn March 2, 2026 20:06
# Conflicts:
#	src/climsight/data_analysis_agent.py
#	src/climsight/streamlit_interface.py
@koldunovn
Collaborator

@kuivi Now the conflicts are here :)

# Conflicts:
#	src/climsight/data_analysis_agent.py
@kuivi
Collaborator Author

kuivi commented Mar 3, 2026

> @kuivi Now the conflicts are here :)

Done

Collaborator

@dmpantiu dmpantiu left a comment

Data tab — new UI tab with download buttons for all datasets (CSV, JSON, Zarr) generated during a session
DestinE integration — two new tools: RAG search over 82 Climate DT parameters + data retrieval via polytope API
Analysis modes — fast / smart / deep presets controlling tool budgets and enabled features
Pipeline fix — every agent node now returns downloadable_datasets in its dict for proper LangGraph state merge
Tests & docs — DestinE test suite (auto-skipped by default), CLAUDE.md, climate data architecture docs

@dmpantiu dmpantiu merged commit 3466d9f into main Mar 3, 2026
4 checks passed
@kuivi kuivi deleted the datatab branch March 3, 2026 19:18