Agent-loop harness: loop-design-build
Sample-NYCTraffic turns New York City motor vehicle collision records into a small operational exploration tool: ingest a raw public CSV, normalize it into a queryable SQLite dataset, expose focused API endpoints, and render the results in a browser map designed for incident review.
The source data is useful but unwieldy in raw CSV form. This project reshapes it into something closer to an analyst or operations workflow:
- filter crashes by borough and date range
- review injury and fatality counts without hand-cleaning rows
- inspect map markers and collision detail in a browser UI
- test the delivery path from ingestion through presentation
Likely users include safety analysts, operations staff, planners, and incident reviewers who need to move from a flat file toward a repeatable local investigation tool.
This repo is a case study in agent-assisted legacy refresh rather than a claim that NYC traffic data is the destination product. It shows how a bounded agent loop can move from raw public data to a working operational tool through narrow, reviewable slices: ingestion, normalization, persistence, API design, browser UI, validation states and reporting.
The commercial pattern is transferable to older internal tools that live around spreadsheets, CSV exports or manual reporting queues. AI helped structure the delivery process, while deterministic code owns the parser, database, API contracts, validation and UI behavior.
The repository is intentionally built as a sequence of narrow slices:
- ingest the raw
Motor_Vehicle_Collisions_-_Crashes.csvdataset - parse and normalize records into strict DTOs
- persist normalized rows into SQLite through Prisma
- expose query endpoints for collisions, borough metadata, and date ranges
- ship a browser map UI with filtering, status handling, and collision popups
- refine the UX so invalid states, truncation warnings, and metadata remain legible
- add a borough trend report so the same filters drive both map exploration and lightweight monthly analysis
That progression is visible both in the source layout and in docs/state/, where each run records what changed and what the next agent should do.
Main components live under src/:
src/services/parses CSV input, seeds the database, manages Prisma access, and implements repository queriessrc/models/defines shared query, response, and metadata contracts used across layerssrc/api/mounts the Express application and query-validation logicsrc/frontend/renders the React + Leaflet browser experience, including filter state and UI stylingsrc/scripts/provides the local entry points for seeding data and starting the serversrc/server.tsresolves runtime port configuration and starts the HTTP server
- loads NYC collision records from CSV into a local SQLite database
- serves collision search, borough list, and available date-range metadata
- renders an institutional-style browser map for browsing filtered crashes
- supports URL-synchronized filters, warning states, and inline validation for inverted dates
- pairs the map with a borough trend report that summarizes monthly collisions, injuries, and fatalities for the active filters
- highlights fatal collisions as a distinct data signal rather than decorative UI chrome
- Node.js 22+
- npm
Note: the current Prisma 7 dependency chain emits an engine warning on Node 20 because
@prisma/streams-localnow declaresnode >=22.0.0. Treat Node 22+ as the supported interactive/runtime baseline for this repo.
npm install
npx prisma generateCreate a .env file with at least:
DATABASE_URL="file:./dev.db"Optional environment variables:
COLLISIONS_CSV_PATHto override the default CSV pathPORTto override the default server port (3000)
npx prisma db pushBy default the seeder reads Motor_Vehicle_Collisions_-_Crashes.csv from the repository root.
npm run seed:collisionsYou can also pass a CSV path explicitly:
npm run seed:collisions -- path\to\Motor_Vehicle_Collisions_-_Crashes.csvnpm startOpen http://localhost:3000 unless you set a different PORT.
Use this sequence for a fresh local smoke test from an uninitialized checkout:
npm install
npx prisma generate
npx prisma db push
npm run seed:collisions
npm startIf you have not created .env yet, add DATABASE_URL="file:./dev.db" before running the Prisma and seed commands.
The bundled CSV currently seeds a dataset that spans 2012-09-25 through 2025-04-12, so the examples below are grounded in the repository's default local data.
- Open
http://localhost:3000and confirm the page renders the formal header, filter bar, metadata strip, Leaflet map, and borough trend panel. - Leave the default recent window in place and confirm the initial map load completes without an error notice.
- Set
Borough = QUEENS,Start date = 2021-09-01, andEnd date = 2021-09-14, then submit.- Expect the metadata strip to reflect the submitted borough and date range.
- Expect several hundred plotted collisions for that window, which makes marker clustering density easy to inspect.
- Click a marker and confirm the popup shows the
REF #...summary plus structured field rows.
- Without changing the borough, set
Start date = 2021-09-14andEnd date = 2021-09-01.- Expect the inline invalid-date notice to appear immediately.
- Expect the
Update Mapbutton to disable until the range is valid again.
- Clear the borough filter and use
Start date = 2021-05-01withEnd date = 2021-05-31, then submit.- Expect the truncation notice to appear because the default dataset contains
10,058citywide collisions in that month, which exceeds the API cap of5,000results.
- Expect the truncation notice to appear because the default dataset contains
- Set
Borough = BROOKLYN,Start date = 2021-04-01, andEnd date = 2021-06-30, then submit.- Expect the monthly trend panel to show rows for April, May, and June 2021.
- Expect non-zero collision, injury, and fatality totals in the report summary and table for that range.
QUEENS+2021-09-01to2021-09-14: reliable marker and popup-detail exercise with319seeded collisions.- citywide +
2021-05-01to2021-05-31: reliable truncation-warning scenario because the seeded dataset exceeds the API result cap. BROOKLYN+2021-04-01to2021-06-30: reliable trend-report scenario with three populated monthly rows and non-zero fatalities.
npm test -- --runInBandGET /collisionssupportsborough,date_start, anddate_endGET /collisions/:collisionIdreturns one collision or a not-found errorGET /borough-trendsreturns monthly collision, injury, and fatality totals for the active borough/date filtersGET /boroughsreturns the available borough listGET /date-rangereturns the persisted minimum and maximum crash dates
Overview of the institutional browser map with the borough/date filter bar, metadata strip, truncation notice, and Brooklyn monthly trend summary.
Popup detail treatment showing the government-style reference number and compact label/value metadata rows for a selected collision.
Validation state for an inverted date range, including the inline error notice and disabled Update Map action.
Citywide view demonstrating the dense plotting behavior, capped-result warning, and synchronized monthly trend summary for the active filter window.
This repo is also a companion showcase for the local markdown-driven agent loop.
- inspect
docs/state/backlog.mdto see planned and completed slices - inspect
docs/state/progress.mdto see completed work recorded run by run - inspect
docs/state/handover.mdto see the baton passed between runs - inspect
git log --onelineto see narrow, sequential changes over time
One representative pause happened when the loop reached the persistence layer before the stack had been explicitly chosen. The agent stopped, wrote a blocker, and waited for a human decision instead of inventing the database approach.
Archived TBD example: database stack ambiguity resolved by a human
Agent blocker (docs/state/archive/20260409171010_tbd.md)
The agent halted because docs/planning.md and package.json did not yet specify the database or ORM, even though the next backlog item required persistence models and tests. It proposed four concrete paths:
- SQLite + Prisma
- SQLite + better-sqlite3
- PostgreSQL + TypeORM or Prisma
- A temporary in-memory abstraction
Human response (docs/state/archive/20260409171010_tbd-response.md)
The operator chose Option A: SQLite with Prisma and instructed the next run to update docs/planning.md and continue the TDD flow from that now-explicit baseline.
The delivery path was not perfectly linear, and the repo keeps that visible on purpose. A few useful corrections came from forcing the agent to stop instead of guessing:
- The loop had to escalate when the persistence layer work arrived before the database stack had been explicitly chosen, which prevented an invented ORM direction and turned SQLite + Prisma into a deliberate requirement.
- The loop also tripped over process ambiguity, not just product ambiguity: two later TBDs were raised because the worktree already contained meaningful edits, and the agent could not safely decide on commit scope by itself.
- Those pauses improved the final result more than bluffing through them would have. The project ended with a clearer Node 22 runtime baseline, cleaner backlog steering, and a README that now shows both the delivered feature set and the moments where human correction was required.
This is a focused local project, not a full production deployment. It intentionally omits hosted infrastructure, authentication and authorization, background jobs, rate limiting, observability plumbing, and large-scale map/reporting workflows that would likely be required in an operational city deployment.



