feat(crawler): SPEC-006 Farfetch + SPEC-007 Shopify multi-brand editorials#7
Merged
…rials

SPEC-PLATFORM-EXPANSION-006 (Farfetch KR/US):
- New src/lib/farfetch-engine.ts (~700 LOC): Playwright channel:'chrome' + DOM-scrape (NOT XHR-interception). Region-parameterized KR + US.
- ToS verbatim Korean §13 embedded at top-of-file (verdict: AMBIGUOUS-ACCEPTED-BY-OWNER 2026-05-07).
- KR multi-nav 8/8 pass at 3-sec pacing, 912 SKU full crawl verified.
- US disabled until US-routable infra (KR-IP forces /kr/ redirect).
- 27 characterization tests + fixture (32 cards from /kr/men/clothing-2).

SPEC-PLATFORM-EXPANSION-007 (Shopify multi-brand editorials):
- 3 new sites: Slam Jam (5,638 SKU), Antonioli (6,933), Browns (24,995 Shopify hard cap). Total +37,566 SKU, +955 unique vendors.
- Currency routing via existing CURRENCY_TO_COUNTRY localization cookie.

Engine improvements (rate-limit hardening):
- Shopify dispatch: parallel Promise.all → sequential + 5s inter-site delay. Resolves HTTP 429 cascade observed empirically.
- Shopify engine: full Chrome 131 UA + sec-ch-ua client hints + Accept-Encoding + Referer (was triggering bot fingerprint).
- fetchWithBackoff helper for 429/503 retry with Retry-After honoring.
- Bulk maxPages 300 across all Shopify/Cafe24 sites.

CLI extensions (src/crawl.ts):
- --type=A,B,C comma-separated multi-type
- --exclude-type=cafe24
- --exclude-site=key1,key2

Misc:
- platforms.ts: 39 → 42 active sites
- structure.md: documents Farfetch + farfetch-engine layering
- .gitignore: .moai/cache/ excluded (analysis artifacts)

🗿 MoAI <email@mo.ai.kr>
Co-Authored-By: Claude <noreply@anthropic.com>
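The rate-limit hardening described in the commit message (sequential dispatch with an inter-site delay, plus 429/503 retry that honors Retry-After) could look roughly like the sketch below. This is not the PR's actual code: only the name fetchWithBackoff and the 5s delay come from the commit message. The signature, the helper names parseRetryAfterMs, backoffDelayMs and crawlSequentially, the MinimalResponse shape, and the retry/backoff defaults are all assumptions made for illustration.

```typescript
// Hypothetical sketch; signatures and defaults are assumptions, not the PR's code.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Parse a Retry-After header (delta-seconds or HTTP-date) into milliseconds.
// Returns null when the header is absent or unparseable.
function parseRetryAfterMs(header: string | null, nowMs: number = Date.now()): number | null {
  if (header === null) return null;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Exponential backoff (base * 2^attempt), capped. A server-provided
// Retry-After value takes precedence when present (also capped).
function backoffDelayMs(
  attempt: number,
  retryAfterMs: number | null,
  baseMs: number = 1000,
  capMs: number = 60000,
): number {
  if (retryAfterMs !== null) return Math.min(retryAfterMs, capMs);
  return Math.min(baseMs * Math.pow(2, attempt), capMs);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry on 429/503 only; any other status (or exhausted retries) is returned as-is.
async function fetchWithBackoff<R extends MinimalResponse>(
  doFetch: () => Promise<R>,
  maxRetries: number = 4,
  sleepFn: (ms: number) => Promise<void> = sleep,
): Promise<R> {
  for (let attempt = 0; ; attempt++) {
    const res = await doFetch();
    const retryable = res.status === 429 || res.status === 503;
    if (!retryable || attempt >= maxRetries) return res;
    const retryAfter = parseRetryAfterMs(res.headers.get("retry-after"));
    await sleepFn(backoffDelayMs(attempt, retryAfter));
  }
}

// Sequential dispatch: one site at a time, pausing between sites
// (5s in the commit message) instead of firing everything via Promise.all.
async function crawlSequentially<T>(
  sites: string[],
  crawlOne: (site: string) => Promise<T>,
  delayMs: number = 5000,
  sleepFn: (ms: number) => Promise<void> = sleep,
): Promise<T[]> {
  const results: T[] = [];
  let first = true;
  for (const site of sites) {
    if (!first) await sleepFn(delayMs);
    first = false;
    results.push(await crawlOne(site));
  }
  return results;
}
```

A caller would wrap the real request, for example `fetchWithBackoff(() => fetch(url, init))`. Taking the request as a closure keeps the helper independent of any particular HTTP client and makes the retry logic easy to exercise with a stub.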
Summary
- src/lib/farfetch-engine.ts (Playwright channel:'chrome' + DOM-scrape). KR active (912 SKU), US disabled (KR-IP redirect). ToS §13 embedded verbatim + AMBIGUOUS-ACCEPTED-BY-OWNER.
- --type=A,B,C comma-separated / --exclude-type=cafe24 / --exclude-site=key.

Verification
- pnpm tsc --noEmit: clean
- pnpm test: 100/100 pass (27 new Farfetch + 73 existing)
- pnpm crawl --probe=farfetch-kr: 723 cards / 24 products at L2

Plan
Platforms: 39 → 42. Farfetch US can be activated immediately by removing disabled: true once US infrastructure is in place.

🗿 MoAI email@mo.ai.kr
Co-Authored-By: Claude noreply@anthropic.com
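
As an illustration of how the --type, --exclude-type and --exclude-site flags might combine with a per-site disabled: true entry, here is a minimal sketch. It is not taken from src/crawl.ts or platforms.ts. The Site and Filters shapes, the function names, and the site keys in the usage note are hypothetical; only the flag names and the disabled field come from the PR text.

```typescript
// Hypothetical sketch; the real src/crawl.ts and platforms.ts may differ.
interface Site {
  key: string;
  type: string;
  disabled?: boolean;
}

interface Filters {
  types: Set<string> | null; // null means "no --type given, allow every type"
  excludeTypes: Set<string>;
  excludeSites: Set<string>;
}

// Split "a, b,,c" into ["a", "b", "c"].
function splitList(value: string): string[] {
  return value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Read --name=value arguments; unrelated arguments are ignored.
function parseFilters(argv: string[]): Filters {
  const filters: Filters = {
    types: null,
    excludeTypes: new Set<string>(),
    excludeSites: new Set<string>(),
  };
  for (const arg of argv) {
    const eq = arg.indexOf("=");
    if (eq === -1) continue;
    const name = arg.slice(0, eq);
    const values = splitList(arg.slice(eq + 1));
    if (name === "--type") {
      filters.types = new Set<string>(values);
    } else if (name === "--exclude-type") {
      values.forEach((v) => filters.excludeTypes.add(v));
    } else if (name === "--exclude-site") {
      values.forEach((v) => filters.excludeSites.add(v));
    }
  }
  return filters;
}

// Disabled sites are always skipped; exclusions win over inclusions.
function selectSites(sites: Site[], f: Filters): Site[] {
  return sites.filter(
    (s) =>
      !s.disabled &&
      (f.types === null || f.types.has(s.type)) &&
      !f.excludeTypes.has(s.type) &&
      !f.excludeSites.has(s.key),
  );
}
```

With a hypothetical registry containing an active `farfetch-kr` entry and a `farfetch-us` entry marked `disabled: true`, `selectSites(registry, parseFilters(["--type=farfetch"]))` would return only the KR site. Removing the `disabled` flag from the US entry would bring it into the selection with no other change, matching the plan described above.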