feat(crawler): SPEC-006 Farfetch + SPEC-007 Shopify multi-brand editorials #7

Merged: bbbang105 merged 1 commit into dev from feat/spec-006-007-platform-expansion on May 7, 2026

@bbbang105
Member

Summary

  • SPEC-006 Farfetch KR/US: new engine src/lib/farfetch-engine.ts (Playwright channel:'chrome' + DOM-scrape). KR is active (912 SKU); US is disabled (requests from a KR IP are redirected). ToS §13 is embedded verbatim, with verdict AMBIGUOUS-ACCEPTED-BY-OWNER.
  • SPEC-007 three Shopify multi-brand editorial shops: Slam Jam (5,638) + Antonioli (6,933) + Browns (24,995, Shopify cap). +37,566 SKU, +955 vendors.
  • Engine hardening: sequential Shopify dispatch + fuller UA/sec-ch-ua headers + 429/503 backoff. Resolves the 429 cascade caused by parallel Promise.all.
  • CLI extensions: --type=A,B,C (comma-separated) / --exclude-type=cafe24 / --exclude-site=key.

Verification

  • pnpm tsc --noEmit: clean
  • pnpm test: 100/100 pass (27 new Farfetch + 73 existing)
  • pnpm crawl --probe=farfetch-kr: 723 cards / 24 products at L2
  • Full crawl verification: 39 sites in about 35 minutes, 83,669 SKU (excluding Cafe24)

Plan

Platforms: 39 → 42. Once US infrastructure is in place, Farfetch US can be activated immediately by removing disabled: true.
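To illustrate how a `disabled: true` flag could gate a site, here is a minimal sketch. The `PlatformEntry` shape, the `farfetch-us` key, and `activePlatforms` are assumptions for illustration; the actual definitions in platforms.ts are not shown in this PR description.

```typescript
// Hypothetical entry shape; platforms.ts may define this differently.
interface PlatformEntry {
  key: string;
  type: string;
  disabled?: boolean;
}

const platforms: PlatformEntry[] = [
  { key: "farfetch-kr", type: "farfetch" },
  // "farfetch-us" is an assumed key. It stays disabled until US-routable infra exists.
  { key: "farfetch-us", type: "farfetch", disabled: true },
];

// Keeps only entries that are not explicitly disabled.
function activePlatforms(all: PlatformEntry[]): PlatformEntry[] {
  return all.filter((p) => p.disabled !== true);
}
```

Under this sketch, deleting the `disabled: true` property is the only change needed for the entry to be picked up by the next crawl.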

🗿 MoAI email@mo.ai.kr
Co-Authored-By: Claude noreply@anthropic.com

…rials

SPEC-PLATFORM-EXPANSION-006 (Farfetch KR/US):
- New src/lib/farfetch-engine.ts (~700 LOC): Playwright channel:'chrome'
  + DOM-scrape (NOT XHR-interception). Region-parameterized KR + US.
- ToS verbatim Korean §13 embedded at top-of-file (verdict:
  AMBIGUOUS-ACCEPTED-BY-OWNER 2026-05-07).
- KR multi-nav 8/8 pass at 3-sec pacing, 912 SKU full crawl verified.
- US disabled until US-routable infra (KR-IP forces /kr/ redirect).
- 27 characterization tests + fixture (32 cards from /kr/men/clothing-2).

SPEC-PLATFORM-EXPANSION-007 (Shopify multi-brand editorials):
- 3 new sites: Slam Jam (5,638 SKU), Antonioli (6,933), Browns (24,995
  Shopify hard cap). Total +37,566 SKU, +955 unique vendors.
- Currency routing via existing CURRENCY_TO_COUNTRY localization cookie.

Engine improvements (rate-limit hardening):
- Shopify dispatch: parallel Promise.all → sequential + 5s inter-site
  delay. Resolves HTTP 429 cascade observed empirically.
- Shopify engine: full Chrome 131 UA + sec-ch-ua client hints +
  Accept-Encoding + Referer (was triggering bot fingerprint).
- fetchWithBackoff helper for 429/503 retry with Retry-After honoring.
- Bulk maxPages 300 across all Shopify/Cafe24 sites.
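A retry helper of the kind described above (429/503 retry that honors Retry-After) might look like the sketch below. It is an assumption-laden illustration rather than the PR's fetchWithBackoff: the signature, the injected `doFetch` and `sleep` parameters, the retry count, and the exponential fallback are all hypothetical.

```typescript
// Minimal response shape so the sketch does not depend on a specific fetch implementation.
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Parses Retry-After as delta-seconds or an HTTP date.
// Returns milliseconds to wait, or null if the header is absent or unparseable.
function parseRetryAfter(value: string | null, nowMs: number): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Hypothetical signature; the helper in this PR may take different arguments.
async function fetchWithBackoff<R extends MinimalResponse>(
  url: string,
  doFetch: (url: string) => Promise<R>,
  opts: {
    maxRetries?: number;
    baseDelayMs?: number;
    sleep?: (ms: number) => Promise<void>;
  } = {},
): Promise<R> {
  const maxRetries = opts.maxRetries ?? 3; // assumed default
  const baseDelayMs = opts.baseDelayMs ?? 1000; // assumed default
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let attempt = 0;
  while (true) {
    const res = await doFetch(url);
    const retryable = res.status === 429 || res.status === 503;
    // Return non-retryable responses, or the last response once retries run out.
    if (!retryable || attempt >= maxRetries) return res;
    const hinted = parseRetryAfter(res.headers.get("retry-after"), Date.now());
    // Prefer the server's hint; otherwise fall back to exponential backoff.
    await sleep(hinted !== null ? hinted : baseDelayMs * Math.pow(2, attempt));
    attempt++;
  }
}
```

Injecting `doFetch` and `sleep` keeps the retry logic testable without network access or real waiting.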

CLI extensions (src/crawl.ts):
- --type=A,B,C   comma-separated multi-type
- --exclude-type=cafe24
- --exclude-site=key1,key2
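The three flags above could be parsed and applied along these lines. This is a sketch under assumptions: `CrawlFilters`, `parseFilters`, and `isSelected` are hypothetical names, the real src/crawl.ts may parse arguments differently, and treating `--exclude-type` as comma-separated is an assumption not stated in the commit message.

```typescript
// Hypothetical filter shape; src/crawl.ts may differ.
interface CrawlFilters {
  types: string[];
  excludeTypes: string[];
  excludeSites: string[];
}

// Splits "A,B,C" into ["A", "B", "C"], dropping empty segments.
function splitList(value: string): string[] {
  return value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

function parseFilters(argv: string[]): CrawlFilters {
  const filters: CrawlFilters = { types: [], excludeTypes: [], excludeSites: [] };
  for (const arg of argv) {
    const eq = arg.indexOf("=");
    if (eq === -1) continue; // ignore flags without a value
    const name = arg.slice(0, eq);
    const values = splitList(arg.slice(eq + 1));
    if (name === "--type") filters.types.push(...values);
    else if (name === "--exclude-type") filters.excludeTypes.push(...values);
    else if (name === "--exclude-site") filters.excludeSites.push(...values);
  }
  return filters;
}

// A site is crawled if it matches --type (when given) and no exclusion applies.
function isSelected(site: { key: string; type: string }, f: CrawlFilters): boolean {
  if (f.types.length > 0 && f.types.indexOf(site.type) === -1) return false;
  if (f.excludeTypes.indexOf(site.type) !== -1) return false;
  if (f.excludeSites.indexOf(site.key) !== -1) return false;
  return true;
}
```

In a Node entry point this would typically be called as `parseFilters(process.argv.slice(2))`.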

Misc:
- platforms.ts: 39 → 42 active sites
- structure.md: documents Farfetch + farfetch-engine layering
- .gitignore: .moai/cache/ excluded (analysis artifacts)

🗿 MoAI <email@mo.ai.kr>
Co-Authored-By: Claude <noreply@anthropic.com>
@bbbang105 bbbang105 merged commit d8667a6 into dev May 7, 2026
3 checks passed
@bbbang105 bbbang105 added ✅ test (test code), 🔧 ci (CI/CD pipeline changes), 🚀 feat (new feature / code additions / code changes other than refactoring / design element changes), 🚨 fix (bug fix / error resolution) labels May 7, 2026
@bbbang105 bbbang105 deleted the feat/spec-006-007-platform-expansion branch May 7, 2026 07:18
Labels

🔧 ci (CI/CD pipeline changes), 🚀 feat (new feature / code additions / code changes other than refactoring / design element changes), 🚨 fix (bug fix / error resolution), ✅ test (test code)
