feat: optim pass for 1.0.0 #3

Merged
ob-aion merged 12 commits into main from feat/optim
May 20, 2026

ob-aion (Collaborator) commented May 20, 2026

Summary

Take @coroboros/location-timezone from migrated-and-typed to production-grade 1.0.0. No public API signatures change; subpath exports are additive.

Performance

Array.find / Array.filter / Array.includes over the parsed arrays were the dominant cost. Replaced with Map / Set indexes built once at module load. Benchmarks (Apple M1, Node 22.22.2):

| Bench | Pre-optim | Post-optim | Speedup |
| --- | ---: | ---: | ---: |
| findTimezoneByCityName (exact, late) | 292.59 µs | 27.03 ns | 10,825× |
| findLocationsByCountryIso (US) | 70.59 µs | 20.36 ns | 3,468× |
| findCountryByName (exact, late) | 29.59 µs | 12.06 ns | 2,454× |
| findCapitalOfCountryIso (iso2, late) | 4.12 µs | 10.25 ns | 402× |
| findCountryByIso (iso2, late) | 4.19 µs | 10.33 ns | 405× |
| isValidCountryIso (iso2, late) | 599.21 ns | 5.36 ns | 112× |

Partial-match paths and findLocationsByCoordinates stay linear by design. Full numbers and the going-forward >5% regression budget are in bench/baseline.md.
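A minimal sketch of the scan-to-index change described above, assuming a simplified `Country` shape and stand-in data. The `*Linear` and `*Indexed` function names are illustrative and are not the package's exports; the real index keys and normalization may differ.

```typescript
// Hypothetical record shape; the package's real Country type has more fields.
interface Country {
  iso2: string;
  iso3: string;
  name: string;
}

// Stand-in data; the real arrays are parsed from the package's JSON payloads.
const countries: ReadonlyArray<Country> = [
  { iso2: 'FR', iso3: 'FRA', name: 'France' },
  { iso2: 'US', iso3: 'USA', name: 'United States' },
  { iso2: 'ZW', iso3: 'ZWE', name: 'Zimbabwe' },
];

// Pre-optim style: every call walks the array until a match is found (O(n)).
function findCountryByIsoLinear(iso: string): Country | undefined {
  const key = iso.toUpperCase();
  return countries.find((c) => c.iso2 === key || c.iso3 === key);
}

// Post-optim style: build the index once at module load, then each lookup is O(1).
const countryByIso = new Map<string, Country>();
for (const country of countries) {
  countryByIso.set(country.iso2, country);
  countryByIso.set(country.iso3, country);
}

function findCountryByIsoIndexed(iso: string): Country | undefined {
  return countryByIso.get(iso.toUpperCase());
}

// A Set covers pure membership checks such as isValidCountryIso.
const validIsoCodes = new Set<string>(countryByIso.keys());

function isValidCountryIsoIndexed(iso: string): boolean {
  return validIsoCodes.has(iso.toUpperCase());
}
```

The "late" inputs in the benchmark names matter for the linear version only: a match near the end of the array is its worst case, while the indexed version costs the same wherever the record sits.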

Types

  • Internal CapitalWithCountry / CountryWithCapital at the data boundary; 8 non-null assertions removed.
  • Public arrays now declare ReadonlyArray<X> return types — runtime backs the type with Object.freeze on data arrays, bucket arrays, fresh filter results, and nested country.timezones.
  • helpers.ts is() simplified to typeof-based primitive checks.
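The pairing of `ReadonlyArray<X>` return types with `Object.freeze` could look roughly like this. It is a sketch under assumptions: the element type of `getTimezones` is taken to be `string`, the data is a stand-in, and `findTimezonesByPrefix` and `tryMutate` are hypothetical helpers that do not exist in the package.

```typescript
// Stand-in for one of the package's data arrays, frozen once at module load.
const timezones: ReadonlyArray<string> = Object.freeze([
  'Europe/Paris',
  'America/New_York',
  'Africa/Harare',
]);

// The ReadonlyArray return type rejects push/splice at compile time.
function getTimezones(): ReadonlyArray<string> {
  return timezones;
}

// Hypothetical filter helper: fresh results are frozen before return,
// so callers get the same guarantee on every code path.
function findTimezonesByPrefix(prefix: string): ReadonlyArray<string> {
  return Object.freeze(timezones.filter((tz) => tz.startsWith(prefix)));
}

// A caller that casts the readonly type away still cannot mutate the data:
// Array.prototype.push throws a TypeError on a frozen array.
function tryMutate(list: ReadonlyArray<string>): boolean {
  try {
    (list as string[]).push('Mars/Olympus_Mons');
    return true;
  } catch {
    return false;
  }
}
```

The type is the compile-time guard and the freeze is the runtime guard; the second matters for plain JavaScript consumers and for casts.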

Distribution

Subpath exports for fine-grained tree-shaking:

import { findStateAnsiByUspsCode } from '@coroboros/location-timezone/states-ansi'; // ~5 kB
import { findCountryByIso } from '@coroboros/location-timezone/countries';            // ~85 kB
import { findTimezoneByCityName } from '@coroboros/location-timezone/timezones';      // ~870 kB
import { findLocationsByCountryIso } from '@coroboros/location-timezone/locations';   // ~860 kB

tsdown multi-entry auto-emits shared chunks; main entry surface unchanged.

Other

  • tests/location-timezone.property.test.ts — 13 fast-check invariants (ISO round-trips, closure, back-references, idempotency, case insensitivity, reflexivity).
  • scripts/*.js ported to TypeScript ESM (run via tsx).
  • README: swap branch-stable for ci badge, add Why this exists + Subpath exports sections.
  • CLAUDE.md: Git, Publish auth, regression-budget rules; reflect new layout.
  • .github/workflows/ci.yml — bootstrap shape calling the coroboros/ci reusable workflow (forwards NPM_PACKAGE_REGISTRY_TOKEN for the first publish; OIDC switch is a follow-up after 1.0.0 lands on npm).

Test plan

  • CI preflight green (lint + typecheck + test + build) on the PR.
  • pnpm test locally — 109/109 pass.
  • pnpm bench reproduces the post-optim numbers (within the 5% budget).
  • node -e "import('@coroboros/location-timezone/states-ansi').then(m => console.log(m.findStateAnsiByUspsCode('NY')))" — subpath import resolves after install.

ob-aion added 12 commits May 20, 2026 16:59

Add mitata, fast-check, tsx as devDeps. Add bench/location-timezone.bench.mjs
covering the 7 hot functions with late-alphabet inputs to expose linear-scan
worst case. Add bench script (build + run). pnpm-workspace.yaml permits the
esbuild postinstall needed by tsx.
…ypes

Introduce internal CapitalWithCountry / CountryWithCapital in src/data/index.ts
to type the parsed arrays. Public Capital / Country types keep their optional
back-references for consumers. Removes 8 non-null assertions across
src/countries.ts and src/timezones.ts.

Split src/data/index.ts into per-domain modules (countries, locations,
states-ansi, timezones), each building O(1) Map/Set indexes at module
load. Rewrite every find* / getX helper to use the indexes, replacing
Array.find / Array.filter / Array.includes scans. Move isValidCountryIso
from helpers.ts to countries.ts since it depends on country iso Sets.
Freeze returned arrays (top-level data + bucket arrays + nested
country.timezones) as defense-in-depth against mutation.

Benchmarks (Apple M1, Node 22.22.2):
  findTimezoneByCityName     292.59 µs → 28.21 ns   (10,372×)
  findLocationsByCountryIso   70.59 µs → 16.61 ns   ( 4,250×)
  findCountryByName           29.59 µs → 11.82 ns   ( 2,503×)
  findCapitalOfCountryIso      4.12 µs →  9.49 ns   (   434×)
  findCountryByIso (iso2)      4.19 µs →  9.52 ns   (   440×)
  isValidCountryIso (iso2)   599.21 ns →  5.32 ns   (   113×)

Partial-match paths (findLocationsByProvince/State/CountryName with
partialMatch=true) still scan linearly via match() — no Trie-style
index helps substring containment. Province and state empty-string
keys are included in the index to preserve the pre-optim behavior of
findLocationsByProvince('') returning empty-province locations.
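The split between the indexed exact path and the linear partial path can be sketched as follows. This is an illustration only: the `Location` shape, the sample rows, the positional `partialMatch` parameter, and the lower-casing of keys are assumptions, and the package's own `match()` helper is replaced here by a plain `includes` check.

```typescript
// Hypothetical record shape; the real Location type has more fields.
interface Location {
  city: string;
  province: string;
}

// Stand-in rows; the last one has an empty province on purpose.
const locations: ReadonlyArray<Location> = [
  { city: 'Toronto', province: 'Ontario' },
  { city: 'Ottawa', province: 'Ontario' },
  { city: 'Montreal', province: 'Quebec' },
  { city: 'Monaco', province: '' },
];

// Bucket index: lower-cased province -> locations, built once.
// The empty string is indexed like any other key.
const locationsByProvince = new Map<string, Location[]>();
for (const location of locations) {
  const key = location.province.toLowerCase();
  const bucket = locationsByProvince.get(key);
  if (bucket) bucket.push(location);
  else locationsByProvince.set(key, [location]);
}

const EMPTY: ReadonlyArray<Location> = Object.freeze([]);

function findLocationsByProvince(
  province: string,
  partialMatch = false,
): ReadonlyArray<Location> {
  const needle = province.toLowerCase();
  if (!partialMatch) {
    // Exact path: a single hash lookup, including the '' bucket.
    return locationsByProvince.get(needle) ?? EMPTY;
  }
  // Partial path: substring containment has no hash key, so scan linearly.
  return locations.filter((l) => l.province.toLowerCase().includes(needle));
}
```

Skipping empty keys while building the index would make the exact lookup for `''` return nothing, which is the behavior change the commit avoids.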
…rray

is() now branches on the primitive constructor via typeof for String,
Number, Boolean; the prior constructor-equality check carried no signal
the typeof check did not, and reads simpler. hasLen and match also use
typeof directly.
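A typeof-based `is()` might look like the sketch below. The argument order, the `PrimitiveConstructor` alias, and the boolean return type are assumptions; the actual helper in helpers.ts is not shown in this PR text.

```typescript
// Hypothetical alias for the three constructors the commit message names.
type PrimitiveConstructor =
  | StringConstructor
  | NumberConstructor
  | BooleanConstructor;

// Branch on the constructor passed in, then answer with a typeof check.
function is(value: unknown, type: PrimitiveConstructor): boolean {
  switch (type) {
    case String:
      return typeof value === 'string';
    case Number:
      return typeof value === 'number';
    case Boolean:
      return typeof value === 'boolean';
    default:
      return false;
  }
}
```

For example, `is('NY', String)` is true and `is(42, String)` is false; `null` and `undefined` fail every branch without a separate guard because `typeof` never throws on them.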

Every public function returning an array now declares ReadonlyArray<X>:
getCapitals, getCountries, getLocations, getStatesAnsi, getTimezones,
getCountryIso2Codes, getCountryIso3Codes, findLocationsByCoordinates,
findLocationsByCountryIso, findLocationsByCountryName,
findLocationsByProvince, findLocationsByState, findTimezonesByCountryIso,
findTimezonesByCountryName. Runtime backs the type — every returned
array is frozen (data arrays at module load, bucket arrays after
indexing, fresh filter results before return, nested country.timezones).

Add tsdown multi-entry for src/{countries,locations,states-ansi,timezones}.ts;
expose them via package.json "exports" as @coroboros/location-timezone/{countries,
locations,states-ansi,timezones}. tsdown auto-emits shared chunks, so a
consumer importing only one domain pulls only that domain's data.
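As a rough picture of the multi-entry setup, the sketch below builds an entry map and a matching "exports" map from the four subpath names. Everything beyond those names and the `src/` paths is an assumption: the `dist/` output paths, the `.js` extension, and the flat string values are simplifications (a real "exports" map usually carries per-condition entries such as types). In an actual tsdown.config.ts the entry map would be passed to tsdown as its `entry` option.

```typescript
// Subpath names taken from the commit message.
const subpaths = ['countries', 'locations', 'states-ansi', 'timezones'] as const;

// Entry map: one named entry per subpath plus the main index.
const entry: Record<string, string> = { index: 'src/index.ts' };

// Simplified package.json "exports" map (assumed output layout).
const exportsMap: Record<string, string> = { '.': './dist/index.js' };

for (const name of subpaths) {
  entry[name] = `src/${name}.ts`;
  exportsMap[`./${name}`] = `./dist/${name}.js`;
}
```

Deriving both maps from one list keeps the build entries and the published subpaths from drifting apart.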

Effective sizes per subpath (chunked dist):
  /states-ansi  ~12 KB   (was 880 KB — 73x smaller)
  /countries    ~88 KB   (was 880 KB — 10x smaller)
  /timezones   ~860 KB   (intrinsic — needs locations + countries)
  /locations   ~850 KB   (intrinsic — needs countries for ISO validation)
  .            ~870 KB   (merged surface, unchanged)

Main entry surface and the default merged object are preserved; subpaths
expose only their domain's named exports.

13 property tests covering ISO round-trip identity, closure between
get* and find* collections, back-reference consistency on Capital.country
and Location.country, case-insensitive findCountryByIso /
findCapitalOfCountryIso, isValidCountryIso reflexivity for every
alpha-2 / alpha-3 code, referential idempotency, and findCountryByName
results living in getCountries(). Total tests: 96 → 109.
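Three of these invariants (round-trip identity, case insensitivity, reflexivity) can be illustrated without a test library. The actual test file uses fast-check to generate inputs; this dependency-free sketch instead enumerates a small stand-in dataset exhaustively, and its `Country` shape, data, and local lookup functions are assumptions rather than the package's code.

```typescript
// Hypothetical record shape and stand-in data.
interface Country {
  iso2: string;
  iso3: string;
}

const countries: ReadonlyArray<Country> = Object.freeze([
  { iso2: 'FR', iso3: 'FRA' },
  { iso2: 'US', iso3: 'USA' },
  { iso2: 'ZW', iso3: 'ZWE' },
]);

const byIso = new Map<string, Country>();
for (const c of countries) {
  byIso.set(c.iso2, c);
  byIso.set(c.iso3, c);
}

// Local stand-ins named after the package functions they imitate.
const findCountryByIso = (iso: string): Country | undefined =>
  byIso.get(iso.toUpperCase());
const isValidCountryIso = (iso: string): boolean =>
  byIso.has(iso.toUpperCase());

// Returns a label for every failed invariant; an empty list means all hold.
function checkInvariants(): string[] {
  const failures: string[] = [];
  for (const country of countries) {
    for (const code of [country.iso2, country.iso3]) {
      // Round-trip identity: a country's own code resolves to that same object.
      if (findCountryByIso(code) !== country) failures.push(`round-trip ${code}`);
      // Case insensitivity: lower-case input resolves to the same object.
      if (findCountryByIso(code.toLowerCase()) !== country) failures.push(`case ${code}`);
      // Reflexivity: every code present in the data validates.
      if (!isValidCountryIso(code)) failures.push(`valid ${code}`);
    }
  }
  return failures;
}
```

With a generator library the two loops would be replaced by sampling codes from the data, but the assertions inside would stay the same.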
scripts/clean-and-generate.ts, scripts/generate-countries.ts,
scripts/country-iso2-codes.ts, scripts/country-iso3-codes.ts,
scripts/states-ansi.ts replace their .js counterparts. ESM imports,
node: protocol, import.meta.dirname. build:data switches to tsx.

Verified by re-running pnpm build:data and comparing src/data/*.json
hashes vs HEAD — the iso2/iso3 codes, by-iso lookups, and states-ansi
payloads are byte-identical. The locations / countries / capitals /
timezones payloads regenerate to current IANA tzdb (Node 22.22.2) and
differ from the 2024 snapshot; refreshing committed data is a separate
decision, so HEAD's payloads are kept as-is for now.
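The hash comparison used for that check could be sketched like this. It assumes SHA-256 and takes file contents as in-memory strings keyed by a payload name; the function names are hypothetical, and a real script would read the committed and regenerated src/data/*.json files from disk first.

```typescript
import { createHash } from 'node:crypto';

// Hex digest of a payload's content.
function sha256(content: string): string {
  return createHash('sha256').update(content).digest('hex');
}

// Names of payloads that are missing from the regenerated set
// or whose digest differs from the committed one.
function changedPayloads(
  committed: Record<string, string>,
  regenerated: Record<string, string>,
): string[] {
  return Object.keys(committed).filter((name) => {
    const next = regenerated[name];
    return next === undefined || sha256(committed[name]) !== sha256(next);
  });
}
```

An empty result for a payload group means it is byte-identical, which is the outcome reported above for the iso2/iso3 codes, by-iso lookups, and states-ansi payloads.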

scripts/generate-countries-md.ts switched to node:fs + .ts import path
+ for-of (was a forEach against ../src without extension).
…Subpath exports sections

Replace the placeholder branch-stable badge with the canonical ci badge.
Add a Why this exists section between Usage and Data that motivates
consolidation across UN, CIA Factbook, ISO 3166-1, ANSI FIPS, IANA, and
city-coordinates sources. Add a Subpath exports section documenting the
four tree-shakable entries and their effective sizes.

Tightened the intro paragraph (was repeating the tagline) and split the
Why this exists opening sentence (was 36 words; now 19 + colon-list).
…rmat

Add Coordinates to the public API (re-exported from src/index.ts) and
use it as the parameter type of findLocationsByCoordinates. Additive
and non-breaking; existing call sites work unchanged.

Realign every per-function block in the README to the canonical
Coroboros API doc format:
- Summary carries the signature only inside
  <details><summary><code>name(args)</code></summary>. Types live in
  the structured Parameters / Returns sections below.
- Parameters table gains the Default column on every block; required
  params marked *(required)*.
- Every block gets an explicit Returns line stating type and semantics.
- Every block gets 2-4 Examples drawn from the test suite.
- Types section now lists all 5 public interfaces (Capital, Coordinates,
  Country, Location, StateAnsi) with consistent property tables and
  cross-references via [Type](#types).

The Why this exists paragraph now links bench/baseline.md for the
head-to-head numbers vs the pre-optim linear-scan baseline.

Matching docs/api-format-alignment PRs are open on coroboros/uri,
coroboros/clone, and coroboros/sparkline so the four packages converge
on the same per-method block shape.
ob-aion merged commit 32af247 into main May 20, 2026
5 checks passed
ob-aion deleted the feat/optim branch May 20, 2026 11:55