feat: optim pass for 1.0.0 #3
Merged
Add mitata, fast-check, tsx as devDeps. Add bench/location-timezone.bench.mjs covering the 7 hot functions with late-alphabet inputs to expose linear-scan worst case. Add bench script (build + run). pnpm-workspace.yaml permits the esbuild postinstall needed by tsx.
…ypes Introduce internal CapitalWithCountry / CountryWithCapital in src/data/index.ts to type the parsed arrays. Public Capital / Country types keep their optional back-references for consumers. Removes 8 non-null assertions across src/countries.ts and src/timezones.ts.
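A minimal sketch of how a narrowed internal type can make non-null assertions unnecessary. Only the type names and the optional back-references come from the commit message; the `name` and `iso2` fields and the `link` helper are hypothetical, and the real shapes in src/data/index.ts may differ.

```typescript
// Public shapes: back-references stay optional for consumers.
interface Capital {
  name: string;
  country?: Country;
}

interface Country {
  name: string;
  iso2: string; // hypothetical field, for illustration only
  capital?: Capital;
}

// Internal shapes: parsed data always carries the back-reference, so it is
// declared required and no `!` assertion is needed downstream.
type CapitalWithCountry = Capital & { country: CountryWithCapital };
type CountryWithCapital = Country & { capital: CapitalWithCountry };

// Hypothetical helper: link one country/capital pair after parsing.
function link(countryName: string, iso2: string, capitalName: string): CountryWithCapital {
  const country = { name: countryName, iso2 } as CountryWithCapital;
  const capital: CapitalWithCountry = { name: capitalName, country };
  country.capital = capital;
  return country;
}

const france = link('France', 'FR', 'Paris');

// With the public type this would read france.capital!.name;
// with the internal type the field is known to be present.
const capitalName: string = france.capital.name;
```

Internal code works against the `...With...` types, while the public `Capital` / `Country` types stay unchanged for consumers.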
Split src/data/index.ts into per-domain modules (countries, locations,
states-ansi, timezones), each building O(1) Map/Set indexes at module
load. Rewrite every find* / getX helper to use the indexes, replacing
Array.find / Array.filter / Array.includes scans. Move isValidCountryIso
from helpers.ts to countries.ts since it depends on country iso Sets.
Freeze returned arrays (top-level data + bucket arrays + nested
country.timezones) as defense-in-depth against mutation.
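A rough sketch of the index-at-module-load idea, under stated assumptions: the two-row data set, the field names, and the `undefined` return for a miss are illustrative and not taken from the package. Case-insensitive lookup is assumed for both helpers.

```typescript
interface Country {
  name: string;
  iso2: string;
  iso3: string;
}

// Stand-in for the parsed JSON payload; the real data set is much larger.
const countries: ReadonlyArray<Country> = Object.freeze([
  { name: 'France', iso2: 'FR', iso3: 'FRA' },
  { name: 'Zimbabwe', iso2: 'ZW', iso3: 'ZWE' },
]);

// Built once at module load: one Map entry per ISO code.
const countryByIso = new Map<string, Country>();
for (const country of countries) {
  countryByIso.set(country.iso2, country);
  countryByIso.set(country.iso3, country);
}
const isoCodes: ReadonlySet<string> = new Set(countryByIso.keys());

// O(1) replacements for Array.find / Array.includes scans.
function findCountryByIso(iso: string): Country | undefined {
  return countryByIso.get(iso.toUpperCase());
}

function isValidCountryIso(iso: string): boolean {
  return isoCodes.has(iso.toUpperCase());
}
```

A late-alphabet input such as `findCountryByIso('zw')` costs one hash lookup here, where a linear scan would walk almost the whole array.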
Benchmarks (Apple M1, Node 22.22.2):
  findTimezoneByCityName    292.59 µs → 28.21 ns  (10,372×)
  findLocationsByCountryIso  70.59 µs → 16.61 ns  ( 4,250×)
  findCountryByName          29.59 µs → 11.82 ns  ( 2,503×)
  findCapitalOfCountryIso     4.12 µs →  9.49 ns  (   434×)
  findCountryByIso (iso2)     4.19 µs →  9.52 ns  (   440×)
  isValidCountryIso (iso2)  599.21 ns →  5.32 ns  (   113×)
Partial-match paths (findLocationsByProvince/State/CountryName with
partialMatch=true) still scan linearly via match() — no Trie-style
index helps substring containment. Province and state empty-string
keys are included in the index to preserve the pre-optim behavior of
findLocationsByProvince('') returning empty-province locations.
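The split between indexed exact matches and linear partial matches could look roughly like this. It is a sketch only: the positional `partialMatch` parameter, the `Location` fields, the case handling, and the sample rows are assumptions, not the package's actual code.

```typescript
interface Location {
  city: string;
  province: string; // may be the empty string in the source data
}

const locations: ReadonlyArray<Location> = Object.freeze([
  { city: 'Toronto', province: 'Ontario' },
  { city: 'Ottawa', province: 'Ontario' },
  { city: 'Monaco', province: '' },
]);

// Bucket index: province -> locations. The empty-string key is kept on
// purpose so an exact lookup of '' still returns its bucket.
const byProvince = new Map<string, Location[]>();
for (const location of locations) {
  const bucket = byProvince.get(location.province);
  if (bucket) bucket.push(location);
  else byProvince.set(location.province, [location]);
}
for (const bucket of byProvince.values()) Object.freeze(bucket);

const EMPTY: ReadonlyArray<Location> = Object.freeze([]);

function findLocationsByProvince(province: string, partialMatch = false): ReadonlyArray<Location> {
  if (!partialMatch) return byProvince.get(province) ?? EMPTY;
  // Substring containment cannot use the exact-key index: scan linearly.
  const needle = province.toLowerCase();
  return Object.freeze(locations.filter((l) => l.province.toLowerCase().includes(needle)));
}
```

Skipping empty keys while building the index would silently change what `findLocationsByProvince('')` returns, which is why they are indexed like any other key.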
…rray is() now branches on the primitive constructor via typeof for String, Number, Boolean; the prior constructor-equality check carried no signal the typeof check did not, and reads simpler. hasLen and match also use typeof directly. Every public function returning an array now declares ReadonlyArray<X>: getCapitals, getCountries, getLocations, getStatesAnsi, getTimezones, getCountryIso2Codes, getCountryIso3Codes, findLocationsByCoordinates, findLocationsByCountryIso, findLocationsByCountryName, findLocationsByProvince, findLocationsByState, findTimezonesByCountryIso, findTimezonesByCountryName. Runtime backs the type — every returned array is frozen (data arrays at module load, bucket arrays after indexing, fresh filter results before return, nested country.timezones).
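As an illustration of the typeof-based check and of a frozen ReadonlyArray return, here is a small sketch. The `is(type, value)` signature and the `getCodes` helper are guesses made for the example; the actual helpers.ts code may be shaped differently.

```typescript
type PrimitiveCtor = StringConstructor | NumberConstructor | BooleanConstructor;

// Hypothetical signature: the caller passes the primitive constructor and
// the helper answers with a plain typeof comparison.
function is(type: PrimitiveCtor, value: unknown): boolean {
  if (type === String) return typeof value === 'string';
  if (type === Number) return typeof value === 'number';
  return typeof value === 'boolean';
}

// Declared ReadonlyArray and frozen at runtime, so the static type and the
// runtime behaviour agree: a write fails to compile and throws if forced.
function getCodes(): ReadonlyArray<string> {
  return Object.freeze(['FR', 'ZW']);
}
```

One consequence of using typeof is that boxed values such as `new String('x')` do not count as strings in this sketch.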
Add tsdown multi-entry for src/{countries,locations,states-ansi,timezones}.ts;
expose them via package.json "exports" as @coroboros/location-timezone/{countries,
locations,states-ansi,timezones}. tsdown auto-emits shared chunks, so a
consumer importing only one domain pulls only that domain's data.
Effective sizes per subpath (chunked dist):
  /states-ansi  ~12 KB   (was 880 KB, 73x smaller)
  /countries    ~88 KB   (was 880 KB, 10x smaller)
  /timezones    ~860 KB  (intrinsic: needs locations + countries)
  /locations    ~850 KB  (intrinsic: needs countries for ISO validation)
  .             ~870 KB  (merged surface, unchanged)
Main entry surface and the default merged object are preserved; subpaths
expose only their domain's named exports.
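A multi-entry tsdown config along these lines would produce one output per subpath. This is a sketch under assumptions: the entry paths follow the commit message, while the `format` and `dts` values are guesses and the repository's real config may set other options.

```typescript
// tsdown.config.ts (sketch; option values are assumptions)
import { defineConfig } from 'tsdown';

export default defineConfig({
  // One entry per public subpath, plus the merged main entry.
  entry: [
    'src/index.ts',
    'src/countries.ts',
    'src/locations.ts',
    'src/states-ansi.ts',
    'src/timezones.ts',
  ],
  format: ['esm'],
  dts: true,
});
```

Each entry is then mapped to its subpath in the package.json "exports" field, so that an import of @coroboros/location-timezone/states-ansi resolves to the states-ansi output only.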
13 property tests covering ISO round-trip identity, closure between get* and find* collections, back-reference consistency on Capital.country and Location.country, case-insensitive findCountryByIso / findCapitalOfCountryIso, isValidCountryIso reflexivity for every alpha-2 / alpha-3 code, referential idempotency, and findCountryByName results living in getCountries(). Total tests: 96 → 109.
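The invariants listed above can be sketched without any test library. Note the swap: this example enumerates a tiny stand-in data set exhaustively with plain loops instead of using fast-check generators, and the data, field names, and helper bodies are hypothetical.

```typescript
interface Country {
  name: string;
  iso2: string;
  iso3: string;
}

// Tiny stand-in data set and lookups; not the package's code.
const countries: ReadonlyArray<Country> = [
  { name: 'France', iso2: 'FR', iso3: 'FRA' },
  { name: 'Zimbabwe', iso2: 'ZW', iso3: 'ZWE' },
];
const byIso = new Map<string, Country>();
for (const c of countries) {
  byIso.set(c.iso2, c);
  byIso.set(c.iso3, c);
}
const getCountries = (): ReadonlyArray<Country> => countries;
const findCountryByIso = (iso: string): Country | undefined => byIso.get(iso.toUpperCase());

// Returns the inputs that violate an invariant (an empty list means all hold).
function isoInvariantFailures(): string[] {
  const failures: string[] = [];
  for (const country of getCountries()) {
    for (const code of [country.iso2, country.iso3]) {
      // Round-trip identity: code -> country -> same country.
      const found = findCountryByIso(code);
      if (found !== country) failures.push(`${code} (round-trip)`);
      // Case insensitivity: lower-cased input resolves to the same object.
      if (findCountryByIso(code.toLowerCase()) !== found) failures.push(`${code} (case)`);
      // Closure: every find* result lives in the get* collection.
      if (found && !getCountries().includes(found)) failures.push(`${code} (closure)`);
    }
  }
  return failures;
}
```

With fast-check the same predicates would sit inside property callbacks fed by generated inputs; the invariants themselves stay the same.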
scripts/clean-and-generate.ts, scripts/generate-countries.ts, scripts/country-iso2-codes.ts, scripts/country-iso3-codes.ts, scripts/states-ansi.ts replace their .js counterparts. ESM imports, node: protocol, import.meta.dirname. build:data switches to tsx. Verified by re-running pnpm build:data and comparing src/data/*.json hashes vs HEAD — the iso2/iso3 codes, by-iso lookups, and states-ansi payloads are byte-identical. The locations / countries / capitals / timezones payloads regenerate to current IANA tzdb (Node 22.22.2) and differ from the 2024 snapshot; refreshing committed data is a separate decision, so HEAD's payloads are kept as-is for now. scripts/generate-countries-md.ts switched to node:fs + .ts import path + for-of (was a forEach against ../src without extension).
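The hash comparison described above might be done along these lines. This is a sketch: `sha256` and `changedPayloads` are hypothetical helpers, and payloads are passed in as strings here, whereas a real script would read them from src/data with node:fs.

```typescript
import { createHash } from 'node:crypto';

// SHA-256 of a payload, hex-encoded.
function sha256(payload: string): string {
  return createHash('sha256').update(payload).digest('hex');
}

// Names of committed payloads whose regenerated content hashes differently
// or is missing entirely.
function changedPayloads(
  committed: Map<string, string>,
  regenerated: Map<string, string>,
): string[] {
  const changed: string[] = [];
  for (const [file, content] of committed) {
    const next = regenerated.get(file);
    if (next === undefined || sha256(next) !== sha256(content)) changed.push(file);
  }
  return changed;
}
```

An empty result corresponds to the byte-identical case reported for the iso2/iso3 codes, by-iso lookups, and states-ansi payloads.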
…Subpath exports sections Replace the placeholder branch-stable badge with the canonical ci badge. Add a Why this exists section between Usage and Data that motivates consolidation across UN, CIA Factbook, ISO 3166-1, ANSI FIPS, IANA, and city-coordinates sources. Add a Subpath exports section documenting the four tree-shakable entries and their effective sizes. Tightened the intro paragraph (was repeating the tagline) and split the Why this exists opening sentence (was 36 words; now 19 + colon-list).
…rmat Add Coordinates to the public API (re-exported from src/index.ts) and use it as the parameter type of findLocationsByCoordinates. Additive and non-breaking; existing call sites work unchanged.
Realign every per-function block in the README to the canonical Coroboros API doc format:
- Summary carries the signature only inside <details><summary><code>name(args)</code></summary>. Types live in the structured Parameters / Returns sections below.
- Parameters table gains the Default column on every block; required params marked *(required)*.
- Every block gets an explicit Returns line stating type and semantics.
- Every block gets 2-4 Examples drawn from the test suite.
- Types section now lists all 5 public interfaces (Capital, Coordinates, Country, Location, StateAnsi) with consistent property tables and cross-references via [Type](#types).
The Why this exists paragraph now links bench/baseline.md for the head-to-head numbers vs the pre-optim linear-scan baseline.
Matching docs/api-format-alignment PRs are open on coroboros/uri, coroboros/clone, and coroboros/sparkline so the four packages converge on the same per-method block shape.
Summary
Take @coroboros/location-timezone from migrated-and-typed to production-grade 1.0.0. No public API signatures change; subpath exports are additive.
Performance
Array.find / Array.filter / Array.includes over the parsed arrays were the dominant cost. Replaced with Map / Set indexes built once at module load. Benchmarks (Apple M1, Node 22.22.2):
  findTimezoneByCityName (exact, late)  292.59 µs → 28.21 ns  (10,372×)
  findLocationsByCountryIso (US)         70.59 µs → 16.61 ns  ( 4,250×)
  findCountryByName (exact, late)        29.59 µs → 11.82 ns  ( 2,503×)
  findCapitalOfCountryIso (iso2, late)    4.12 µs →  9.49 ns  (   434×)
  findCountryByIso (iso2, late)           4.19 µs →  9.52 ns  (   440×)
  isValidCountryIso (iso2, late)        599.21 ns →  5.32 ns  (   113×)
Partial-match paths and findLocationsByCoordinates stay linear by design; full numbers and going-forward >5% regression budget in bench/baseline.md.
Types
- CapitalWithCountry / CountryWithCapital at the data boundary; 8 non-null assertions removed.
- ReadonlyArray<X> return types: runtime backs the type with Object.freeze on data arrays, bucket arrays, fresh filter results, and nested country.timezones.
- helpers.ts is() simplified to typeof-based primitive checks.
Distribution
Subpath exports for fine-grained tree-shaking:
  /states-ansi  ~12 KB   (was 880 KB, 73x smaller)
  /countries    ~88 KB   (was 880 KB, 10x smaller)
  /timezones    ~860 KB  (intrinsic: needs locations + countries)
  /locations    ~850 KB  (intrinsic: needs countries for ISO validation)
  .             ~870 KB  (merged surface, unchanged)
tsdown multi-entry auto-emits shared chunks; main entry surface unchanged.
Other
- tests/location-timezone.property.test.ts: 13 fast-check invariants (ISO round-trips, closure, back-references, idempotency, case insensitivity, reflexivity).
- scripts/*.js ported to TypeScript ESM (run via tsx).
- Swap the branch-stable badge for the ci badge; add Why this exists + Subpath exports sections.
- .github/workflows/ci.yml: bootstrap shape calling the coroboros/ci reusable workflow (forwards NPM_PACKAGE_REGISTRY_TOKEN for the first publish; OIDC switch is a follow-up after 1.0.0 lands on npm).
Test plan
- pnpm test locally: 109/109 pass.
- pnpm bench reproduces the post-optim numbers (within the 5% budget).
- node -e "import('@coroboros/location-timezone/states-ansi').then(m => console.log(m.findStateAnsiByUspsCode('NY')))": subpath import resolves after install.
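One way to read the 5% budget is as a per-benchmark ceiling relative to the recorded baseline. The helper below is a hypothetical sketch of that reading; bench/baseline.md may define the budget differently.

```typescript
// True when `measuredNs` is at most `budget` (a fraction) slower than the
// recorded baseline. Faster runs always pass.
function withinBudget(baselineNs: number, measuredNs: number, budget = 0.05): boolean {
  return measuredNs <= baselineNs * (1 + budget);
}

// Example with the findTimezoneByCityName baseline of 28.21 ns:
const passes = withinBudget(28.21, 29.0); // ceiling is about 29.62 ns
const regresses = withinBudget(28.21, 30.0);
```

Under this reading, a run at 29.0 ns stays inside the budget and a run at 30.0 ns counts as a regression.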