Skip to content

nikcholer/Sample-NYCTraffic-Refresh

Repository files navigation

Agent-loop harness: loop-design-build

Sample-NYCTraffic

Sample-NYCTraffic turns New York City motor vehicle collision records into a small operational exploration tool: ingest a raw public CSV, normalize it into a queryable SQLite dataset, expose focused API endpoints, and render the results in a browser map designed for incident review.

Why this exists

The source data is useful but unwieldy in raw CSV form. This project reshapes it into something closer to an analyst or operations workflow:

  • filter crashes by borough and date range
  • review injury and fatality counts without hand-cleaning rows
  • inspect map markers and collision detail in a browser UI
  • test the delivery path from ingestion through presentation

Likely users include safety analysts, operations staff, planners, and incident reviewers who need to move from a flat file toward a repeatable local investigation tool.

Portfolio narrative

This repo is a case study in agent-assisted legacy refresh rather than a claim that NYC traffic data is the destination product. It shows how a bounded agent loop can move from raw public data to a working operational tool through narrow, reviewable slices: ingestion, normalization, persistence, API design, browser UI, validation states and reporting.

The commercial pattern is transferable to older internal tools that live around spreadsheets, CSV exports or manual reporting queues. AI helped structure the delivery process, while deterministic code owns the parser, database, API contracts, validation and UI behavior.

Delivery story

The repository is intentionally built as a sequence of narrow slices:

  1. ingest the raw Motor_Vehicle_Collisions_-_Crashes.csv dataset
  2. parse and normalize records into strict DTOs
  3. persist normalized rows into SQLite through Prisma
  4. expose query endpoints for collisions, borough metadata, and date ranges
  5. ship a browser map UI with filtering, status handling, and collision popups
  6. refine the UX so invalid states, truncation warnings, and metadata remain legible
  7. add a borough trend report so the same filters drive both map exploration and lightweight monthly analysis

That progression is visible both in the source layout and in docs/state/, where each run records what changed and what the next agent should do.

Architecture

Main components live under src/:

  • src/services/ parses CSV input, seeds the database, manages Prisma access, and implements repository queries
  • src/models/ defines shared query, response, and metadata contracts used across layers
  • src/api/ mounts the Express application and query-validation logic
  • src/frontend/ renders the React + Leaflet browser experience, including filter state and UI styling
  • src/scripts/ provides the local entry points for seeding data and starting the server
  • src/server.ts resolves runtime port configuration and starts the HTTP server

What the app does today

  • loads NYC collision records from CSV into a local SQLite database
  • serves collision search, borough list, and available date-range metadata
  • renders an institutional-style browser map for browsing filtered crashes
  • supports URL-synchronized filters, warning states, and inline validation for inverted dates
  • pairs the map with a borough trend report that summarizes monthly collisions, injuries, and fatalities for the active filters
  • highlights fatal collisions as a distinct data signal rather than decorative UI chrome

Local run

Prerequisites

  • Node.js 22+
  • npm

Note: the current Prisma 7 dependency chain emits an engine warning on Node 20 because @prisma/streams-local now declares node >=22.0.0. Treat Node 22+ as the supported interactive/runtime baseline for this repo.

Install

npm install
npx prisma generate

Configure local data

Create a .env file with at least:

DATABASE_URL="file:./dev.db"

Optional environment variables:

  • COLLISIONS_CSV_PATH to override the default CSV path
  • PORT to override the default server port (3000)

Create the SQLite schema

npx prisma db push

Seed the collision data

By default the seeder reads Motor_Vehicle_Collisions_-_Crashes.csv from the repository root.

npm run seed:collisions

You can also pass a CSV path explicitly:

npm run seed:collisions -- path\to\Motor_Vehicle_Collisions_-_Crashes.csv

Start the app

npm start

Open http://localhost:3000 unless you set a different PORT.

Interactive verification

Use this sequence for a fresh local smoke test from an uninitialized checkout:

npm install
npx prisma generate
npx prisma db push
npm run seed:collisions
npm start

If you have not created .env yet, add DATABASE_URL="file:./dev.db" before running the Prisma and seed commands.

The bundled CSV currently seeds a dataset that spans 2012-09-25 through 2025-04-12, so the examples below are grounded in the repository's default local data.

Browser smoke test checklist

  1. Open http://localhost:3000 and confirm the page renders the formal header, filter bar, metadata strip, Leaflet map, and borough trend panel.
  2. Leave the default recent window in place and confirm the initial map load completes without an error notice.
  3. Set Borough = QUEENS, Start date = 2021-09-01, and End date = 2021-09-14, then submit.
    • Expect the metadata strip to reflect the submitted borough and date range.
    • Expect several hundred plotted collisions for that window, which makes marker clustering density easy to inspect.
    • Click a marker and confirm the popup shows the REF #... summary plus structured field rows.
  4. Without changing the borough, set Start date = 2021-09-14 and End date = 2021-09-01.
    • Expect the inline invalid-date notice to appear immediately.
    • Expect the Update Map button to disable until the range is valid again.
  5. Clear the borough filter and use Start date = 2021-05-01 with End date = 2021-05-31, then submit.
    • Expect the truncation notice to appear because the default dataset contains 10,058 citywide collisions in that month, which exceeds the API cap of 5,000 results.
  6. Set Borough = BROOKLYN, Start date = 2021-04-01, and End date = 2021-06-30, then submit.
    • Expect the monthly trend panel to show rows for April, May, and June 2021.
    • Expect non-zero collision, injury, and fatality totals in the report summary and table for that range.

Useful seeded scenarios

  • QUEENS + 2021-09-01 to 2021-09-14: reliable marker and popup-detail exercise with 319 seeded collisions.
  • citywide + 2021-05-01 to 2021-05-31: reliable truncation-warning scenario because the seeded dataset exceeds the API result cap.
  • BROOKLYN + 2021-04-01 to 2021-06-30: reliable trend-report scenario with three populated monthly rows and non-zero fatalities.

Run tests

npm test -- --runInBand

API surface

  • GET /collisions supports borough, date_start, and date_end
  • GET /collisions/:collisionId returns one collision or a not-found error
  • GET /borough-trends returns monthly collision, injury, and fatality totals for the active borough/date filters
  • GET /boroughs returns the available borough list
  • GET /date-range returns the persisted minimum and maximum crash dates

Demo assets

Overview of the collision map filters and borough-focused results

Overview of the institutional browser map with the borough/date filter bar, metadata strip, truncation notice, and Brooklyn monthly trend summary.

Collision popup detail treatment with structured field rows

Popup detail treatment showing the government-style reference number and compact label/value metadata rows for a selected collision.

Invalid date guard preventing an inverted date range submission

Validation state for an inverted date range, including the inline error notice and disabled Update Map action.

Citywide truncation notice when the result set exceeds the API cap

Citywide view demonstrating the dense plotting behavior, capped-result warning, and synchronized monthly trend summary for the active filter window.

Evidence of agent-loop delivery

This repo is also a companion showcase for the local markdown-driven agent loop.

  • inspect docs/state/backlog.md to see planned and completed slices
  • inspect docs/state/progress.md to see completed work recorded run by run
  • inspect docs/state/handover.md to see the baton passed between runs
  • inspect git log --oneline to see narrow, sequential changes over time

One representative pause happened when the loop reached the persistence layer before the stack had been explicitly chosen. The agent stopped, wrote a blocker, and waited for a human decision instead of inventing the database approach.

Archived TBD example: database stack ambiguity resolved by a human

Agent blocker (docs/state/archive/20260409171010_tbd.md)

The agent halted because docs/planning.md and package.json did not yet specify the database or ORM, even though the next backlog item required persistence models and tests. It proposed four concrete paths:

  1. SQLite + Prisma
  2. SQLite + better-sqlite3
  3. PostgreSQL + TypeORM or Prisma
  4. A temporary in-memory abstraction

Human response (docs/state/archive/20260409171010_tbd-response.md)

The operator chose Option A: SQLite with Prisma and instructed the next run to update docs/planning.md and continue the TDD flow from that now-explicit baseline.

Lessons learned

The delivery path was not perfectly linear, and the repo keeps that visible on purpose. A few useful corrections came from forcing the agent to stop instead of guessing:

  • The loop had to escalate when the persistence layer work arrived before the database stack had been explicitly chosen, which prevented an invented ORM direction and turned SQLite + Prisma into a deliberate requirement.
  • The loop also tripped over process ambiguity, not just product ambiguity: two later TBDs were raised because the worktree already contained meaningful edits, and the agent could not safely decide on commit scope by itself.
  • Those pauses improved the final result more than bluffing through them would have. The project ended with a clearer Node 22 runtime baseline, cleaner backlog steering, and a README that now shows both the delivered feature set and the moments where human correction was required.

Deliberate simplifications

This is a focused local project, not a full production deployment. It intentionally omits hosted infrastructure, authentication and authorization, background jobs, rate limiting, observability plumbing, and large-scale map/reporting workflows that would likely be required in an operational city deployment.

About

Case study: refreshing a legacy data app through agent-assisted slices using Node, React, SQLite and the loop-design-build harness.

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors