feat(crawler): SPEC-BRAND-NODE-001 PR-Y — products.brand_node_id FK + representative selection by bbbang105 · Pull Request #10 · endurance-ai/crawler

bbbang105 · 2026-05-14T07:59:48Z

Summary

Adds products.brand_node_id (bigint NULL FK → brand_nodes.id), backfilled on import. Unknown brand strings trigger a trigram fuzzy match (Jaccard ≥ 0.85) against existing brand_nodes; a new brand_nodes row is inserted with node columns NULL, and alias candidates are enqueued to brand_node_review_queue (reason=alias_candidate).
Adds products.is_brand_representative (boolean), the source of truth (SOT) for the brand-VLM 5-image input. Replaces the SPEC-3 random.sample(5) approach.
New CLI select-representatives.ts: a diversity heuristic (category round-robin + in_stock + recency) flags 5 to 10 products per brand and syncs the cache to brand_nodes.representative_image_urls.
Drops mood_tags from the PAI prompt and AnalysisResult (search weight=0 in v6). brand_nodes.primary_node_id (brand-VLM) replaces the product-level mood channel.
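
The diversity heuristic named above (category round-robin + in_stock + recency) could look roughly like the sketch below. The `Product` shape, its field names, and `selectRepresentatives` are hypothetical and chosen for illustration; the actual select-representatives.ts may order or weight these signals differently.

```typescript
// Hypothetical product shape; the real column names in the crawler may differ.
interface Product {
  id: number;
  category: string;
  inStock: boolean;
  createdAt: string; // ISO 8601 timestamp
}

// Pick up to `target` products, cycling through categories so that no single
// category dominates. Within a category, in-stock items come first, then newest.
function selectRepresentatives(products: Product[], target: number): Product[] {
  const byCategory = new Map<string, Product[]>();
  products.forEach((p) => {
    const bucket = byCategory.get(p.category) ?? [];
    bucket.push(p);
    byCategory.set(p.category, bucket);
  });

  const queues = Array.from(byCategory.values());
  queues.forEach((bucket) => {
    bucket.sort(
      (a, b) =>
        Number(b.inStock) - Number(a.inStock) ||
        Date.parse(b.createdAt) - Date.parse(a.createdAt),
    );
  });

  const picked: Product[] = [];
  // Round-robin: take one product per category per pass until target is reached
  // or every category queue is empty.
  while (picked.length < target && queues.some((q) => q.length > 0)) {
    for (let i = 0; i < queues.length && picked.length < target; i++) {
      const next = queues[i].shift();
      if (next) picked.push(next);
    }
  }
  return picked;
}
```

With a brand that has many outerwear items and one item each in two other categories, the first pass takes one product from every category before a second outerwear item is considered.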

Coordinated with app session

Migrations 057_products_brand_node_fk.sql and 058_products_brand_representative.sql live in the app repo (committed via the app PR alongside the endpoint work).
Once app PR-Z merges and INTERNAL_API_KEY is shared, the select-representatives wet-run will be followed by POST /api/internal/classify-brand calls (separate follow-up PR).
The TS-side trigram similarity used in import-products.ts is close to pg_trgm similarity() (within ±0.05), so the 0.85 threshold has enough margin.
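
For reference, a trigram Jaccard similarity in the style of pg_trgm can be sketched as below. This is not the code in import-products.ts; `trigrams` and `trigramSimilarity` are hypothetical names, and the word splitting here is ASCII-only, which is one reason results can drift slightly from the database function.

```typescript
// Build the set of trigrams for a string. Following pg_trgm's convention,
// each word is lowercased and padded with two leading spaces and one trailing
// space. Assumption: only ASCII letters and digits count as word characters.
function trigrams(input: string): Set<string> {
  const grams = new Set<string>();
  const words = input.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  words.forEach((word) => {
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      grams.add(padded.slice(i, i + 3));
    }
  });
  return grams;
}

// Jaccard similarity: shared trigrams divided by the size of the union.
function trigramSimilarity(a: string, b: string): number {
  const ga = trigrams(a);
  const gb = trigrams(b);
  if (ga.size === 0 && gb.size === 0) return 0;
  let shared = 0;
  ga.forEach((g) => {
    if (gb.has(g)) shared++;
  });
  return shared / (ga.size + gb.size - shared);
}

// Threshold from the PR description.
const ALIAS_THRESHOLD = 0.85;
const isAliasCandidate = (a: string, b: string): boolean =>
  trigramSimilarity(a, b) >= ALIAS_THRESHOLD;
```

Under this sketch, strings that differ only in case score 1, while adding a whole extra word lowers the score well below the threshold because every trigram of the new word lands in the union only.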

Test plan

pnpm typecheck clean
pnpm test 100/100 passing
Dry-run after migration: pnpm tsx src/select-representatives.ts --all --dry-run --target 5
Wet-run on subset: pnpm tsx src/select-representatives.ts --brand "AURALEE" --target 5
Verify backfill stats from migration 057 logs (matched / null breakdown)
Sanity check brand_node_review_queue for alias_candidate rows after first import sweep

… representative selection

- products.brand_node_id (bigint NULL FK → brand_nodes.id) backfill on import. Unknown brands trigger trigram fuzzy match (≥0.85 Jaccard) against existing brand_nodes; new brand_nodes row inserted with node columns NULL, alias candidates enqueued to brand_node_review_queue (reason='alias_candidate').
- products.is_brand_representative boolean: brand-VLM 5-image input SOT. Replaces SPEC-3 random.sample(5) approach.
- New CLI select-representatives.ts: diversity heuristic (category round-robin + in_stock + recency) flags 5~10 products per brand, syncs cache to brand_nodes.representative_image_urls.
- Drop mood_tags from PAI prompt + AnalysisResult (search weight=0 in v6). brand_nodes.primary_node_id (brand-VLM) replaces product-level mood channel.

Migrations 057/058 live in app repo (committed in app PR-Z).

🗿 MoAI <email@mo.ai.kr>


- classify-brands.ts: CLI for batch calls to app /api/internal/classify-brand. Token bucket (8s start, auto adjust on success/429), 429/5xx retry with exponential backoff, 4xx no-retry, per-step verbose logging (token / POST / HTTP / response / outcome), failure jsonl log for re-run via --retry-failed. Options: --all, --brand-id, --limit, --force, --concurrency, --interval, --dry-run.
- select-representatives.ts: --limit N added for small-batch testing (works combined with --all, applied after fetch).
- .gitignore: .app-src (transient symlink used for editing app repo from this session).

🗿 MoAI <email@mo.ai.kr>
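
The retry policy described for classify-brands.ts (retry on 429/5xx with exponential backoff, no retry on other 4xx) might be structured like the following sketch. It does not cover the token bucket. `BASE_DELAY_MS`, `MAX_ATTEMPTS`, and every function name here are assumptions for illustration; the commit message does not state the actual base delay or retry cap.

```typescript
// Hypothetical defaults; not taken from the PR.
const BASE_DELAY_MS = 1000;
const MAX_ATTEMPTS = 5;

// 429 (rate limited) and 5xx are retried; other statuses are not.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: base, 2x base, 4x base, ... for attempt 0, 1, 2, ...
function backoffDelayMs(attempt: number, baseMs: number = BASE_DELAY_MS): number {
  return baseMs * Math.pow(2, attempt);
}

// `send` performs one request and `sleep` waits; both are injected so the
// policy stays independent of any particular HTTP client or timer.
async function withRetry(
  send: () => Promise<{ status: number }>,
  sleep: (ms: number) => Promise<void>,
): Promise<{ status: number; attempts: number }> {
  for (let attempt = 0; ; attempt++) {
    const res = await send();
    const lastAttempt = attempt + 1 >= MAX_ATTEMPTS;
    if (!isRetryable(res.status) || lastAttempt) {
      return { status: res.status, attempts: attempt + 1 };
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

Keeping `isRetryable` and `backoffDelayMs` as pure functions makes the policy easy to unit test without real network calls or timers.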

…tch CDN block

Schema rename (app migration 062):
- primary_node_id → primary_style_node_id
- secondary_node_id → secondary_style_node_id
- node_confidence → style_node_confidence
- node_assigned_at → style_node_assigned_at
- node_assigned_model → style_node_assigned_model
- DROP COLUMN style_node (legacy text enum)

Crawler updates:
- classify-brands.ts: primary_node_id IS NULL filter renamed + auto-filter brands with no products
- import-products.ts: loadBrandNodes() drops style_node select + nodeMap removed (legacy text source gone); products.style_node always NULL on new insert (v5 search axis, weight=0 in v6)
- analyze-prompt.ts: doc comment updated
- select-representatives.ts: add BLOCKED_IMAGE_DOMAINS (farfetch-contents.com) to prevent brand-VLM 5-image bundle fail on a single 403 image URL

🗿 MoAI <email@mo.ai.kr>
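
A domain blocklist like BLOCKED_IMAGE_DOMAINS could be applied along these lines. Only the constant name and the farfetch-contents.com entry come from the commit message; `isBlockedImageUrl`, `usableImageUrls`, the subdomain matching, and the handling of unparseable URLs are assumptions.

```typescript
const BLOCKED_IMAGE_DOMAINS = ["farfetch-contents.com"];

// Returns true when the URL's host is a blocked domain or one of its
// subdomains. Assumption: unparseable URLs are also treated as unusable.
function isBlockedImageUrl(url: string): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return true;
  }
  return BLOCKED_IMAGE_DOMAINS.some(
    (domain) => host === domain || host.endsWith(`.${domain}`),
  );
}

// Drop blocked URLs before building the image bundle, so one image that
// would return 403 cannot fail the whole bundle.
function usableImageUrls(urls: string[]): string[] {
  return urls.filter((url) => !isBlockedImageUrl(url));
}
```

Matching on the parsed hostname rather than a substring avoids false positives on look-alike hosts or on paths that merely contain the domain text.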

bbbang105 added the 🚀 feat (add new feature / add some code / modify some code (distinct from refactoring) / modify design elements) label May 14, 2026

style: format import block (4→2 space)

21af9ee

🗿 MoAI <email@mo.ai.kr>

github-actions Bot assigned bbbang105 May 14, 2026

bbbang105 added the 🌱 style (changes that do not affect code meaning: code formatting, typo fixes, variable renames, asset additions) label May 14, 2026

bbbang105 added the ✅ test (test code) label May 14, 2026

bbbang105 added ✂️ remove (delete package, folder, or class), 🎫 rename (rename package, folder, or class), 🔄 refactor (code refactoring) labels May 14, 2026

bbbang105 wants to merge 4 commits into dev from feature/spec-brand-node-crawler
