Feature/hf search#59

Merged
ihabkhaled merged 6 commits into main from feature/hf-search
May 24, 2026

@ihabkhaled ihabkhaled commented May 24, 2026

Summary

Brief description of changes.

Type

  • Feature
  • Bug Fix
  • Refactor
  • Documentation
  • Test

Checklist

  • npm run lint passes
  • npm run build passes
  • npm run test passes
  • Documentation updated (if applicable)
  • No secrets in code

ihabkhaled and others added 6 commits May 24, 2026 03:35
Parking work-in-progress on feature/hf-search so a parallel agent can
keep using main without colliding.

Backend:
- pull-jobs DELETE now dual-purpose: cancel-active or dismiss-terminal.
  PullJobsRepository.deleteById + PullJobsService.cancel returns
  PullJobCancelResult { id, status: CANCELLED | DISMISSED }.
- New PullJobCancelOutcome enum.
- catalog: GET /catalog/hf-search, GET /catalog/hf-models/:author/:name,
  POST /catalog/hf-import (HfDiscoveryManager queries the live
  HuggingFace API for GGUF-tagged models, ranks by trending/likes,
  imports a row into FrontierCatalogEntry on demand).
- huggingface-client: rewrite 401 errors to clarify "repo may not
  exist or is gated" instead of HF's misleading "Invalid username".
- seed-catalog: replace bogus `unsloth/Qwen3-Coder-7B-GGUF` (404 on
  HF) with the canonical `Qwen/Qwen2.5-Coder-7B-Instruct-GGUF`.
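
The dual-purpose DELETE from the first backend bullet could look roughly like the sketch below. Only PullJobCancelResult, PullJobCancelOutcome and its two values come from the commit message; PullJobStatus and its members, the in-memory Map standing in for PullJobsRepository, and the cancelOrDismiss helper are hypothetical.

```typescript
// Sketch only: names other than PullJobCancelOutcome / PullJobCancelResult are assumptions.
enum PullJobCancelOutcome {
  CANCELLED = 'CANCELLED',
  DISMISSED = 'DISMISSED',
}

// Assumed job statuses; the real enum in the service may differ.
enum PullJobStatus {
  QUEUED = 'QUEUED',
  RUNNING = 'RUNNING',
  COMPLETED = 'COMPLETED',
  FAILED = 'FAILED',
  CANCELLED = 'CANCELLED',
}

interface PullJob {
  id: string;
  status: PullJobStatus;
}

interface PullJobCancelResult {
  id: string;
  status: PullJobCancelOutcome;
}

// Statuses treated as terminal, meaning there is nothing left to cancel.
const TERMINAL_STATUSES = new Set<PullJobStatus>([
  PullJobStatus.COMPLETED,
  PullJobStatus.FAILED,
  PullJobStatus.CANCELLED,
]);

// The Map plays the role of the repository; deleting from it mirrors deleteById.
function cancelOrDismiss(jobs: Map<string, PullJob>, id: string): PullJobCancelResult {
  const job = jobs.get(id);
  if (!job) {
    throw new Error(`Pull job ${id} not found`);
  }
  if (TERMINAL_STATUSES.has(job.status)) {
    jobs.delete(id); // dismiss: remove the finished row entirely
    return { id, status: PullJobCancelOutcome.DISMISSED };
  }
  job.status = PullJobStatus.CANCELLED; // cancel: keep the row, mark it cancelled
  return { id, status: PullJobCancelOutcome.CANCELLED };
}
```

Returning the outcome lets the caller tell "the download was stopped" apart from "the row was removed" without a second request.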

Frontend (partial):
- download-job-row: Dismiss button for terminal jobs (calls existing
  cancel endpoint, now smart enough to delete the row).
- types/hf-search.types.ts + repository methods + 2 hooks
  (useHfSearch, useHfDetails). Dialog component not yet wired.

Next: hf-search-dialog component + "Browse HuggingFace" entry point
on /models/local-frontier page.

Backend (claw-llamacpp-service):
- HfAutoSyncManager: pulls the top-N HuggingFace GGUF models by
  trendingScore, downloads, and likes (config-driven caps per sort key,
  default 40 each). Runs once 30s after bootstrap, then on a daily
  cron (04:00 UTC). Imports each model into FrontierCatalogEntry with
  heuristic category/qualityTier from tags; dedupes on (name, tag);
  skips gated/private repos and files >80 GB. Concurrent-run guard,
  reports counts + per-repo outcomes.
- New env vars: HF_AUTO_SYNC_ENABLED (default true), HF_AUTO_SYNC_CRON,
  HF_AUTO_SYNC_TRENDING_LIMIT, HF_AUTO_SYNC_DOWNLOADS_LIMIT,
  HF_AUTO_SYNC_LIKES_LIMIT.
- New enums: HfAutoSyncTrigger and HfAutoSyncOutcomeStatus (added to
  satisfy the string-literal-union ban).
- POST /api/v1/llamacpp/catalog/hf-auto-sync (admin/operator) to
  trigger a manual run.
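
As a rough illustration of the filtering and run-guard behaviour described for HfAutoSyncManager, the sketch below merges the per-sort-key lists, skips gated/private repos and oversized files, dedupes on (name, tag), and refuses overlapping runs. All type and function names here are hypothetical, the dedupe is done in memory rather than against FrontierCatalogEntry, and reading "80 GB" as 80 GiB is an assumption.

```typescript
// Hypothetical candidate shape; the real manager works with HuggingFace API payloads.
interface HfCandidate {
  name: string; // repo id
  tag: string; // quantization tag of the selected file
  gated: boolean;
  isPrivate: boolean;
  fileSizeBytes: number;
}

// Assumption: the 80 GB cap is interpreted as 80 GiB.
const MAX_FILE_BYTES = 80 * 1024 * 1024 * 1024;

// Merge the trending/downloads/likes lists, drop unusable entries, dedupe on (name, tag).
function selectImportable(lists: HfCandidate[][]): HfCandidate[] {
  const seen = new Set<string>();
  const selected: HfCandidate[] = [];
  for (const list of lists) {
    for (const candidate of list) {
      if (candidate.gated || candidate.isPrivate) continue;
      if (candidate.fileSizeBytes > MAX_FILE_BYTES) continue;
      const key = `${candidate.name}\u0000${candidate.tag}`;
      if (seen.has(key)) continue;
      seen.add(key);
      selected.push(candidate);
    }
  }
  return selected;
}

// Concurrent-run guard: a second trigger while a run is in flight is skipped.
class SingleRunGuard {
  private running = false;

  async run<T>(task: () => Promise<T>): Promise<T | undefined> {
    if (this.running) {
      return undefined; // a run is already in progress
    }
    this.running = true;
    try {
      return await task();
    } finally {
      this.running = false;
    }
  }
}
```

With a guard like this, the 30-second bootstrap run, the daily cron, and the manual POST trigger can all call the same entry point without stacking runs.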

Frontend (claw-frontend):
- HfSearchDialog with live HuggingFace search, sortable by
  trending/downloads/likes/lastModified, debounced query. Two-pane
  layout: results list on the left, details + import controls on the
  right. Quantization defaults to the recommended file's quant, with
  fallback to other GGUF quants present in the repo.
- "Browse HuggingFace" button on /models/local-frontier opens the
  dialog. All state lives in the controller hook
  (useHfSearchDialog); the dialog component is pure render.
- Extracted: hf-search.enum.ts (HfSearchSort/HfCategoryChoice/
  HfQualityTierChoice), hf-search.constants.ts (sort/category/tier
  options + debounce + result-limit), hf-format.utility.ts,
  pull-job-cancel.types.ts. hf-results-list.tsx and hf-details-panel.tsx
  are separate components per the "no inline helper components in
  TSX" rule.
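
The quantization default described above (recommended file's quant first, then any other GGUF quant in the repo) might be expressed as a small pure helper. This is a sketch under assumed shapes: HfGgufFile, HfModelDetails, and both function names are hypothetical and not taken from hf-search.types.ts.

```typescript
// Hypothetical shapes for the details payload.
interface HfGgufFile {
  filename: string;
  quantization: string | null; // null when no quant could be parsed
}

interface HfModelDetails {
  recommendedFile: HfGgufFile | null;
  files: HfGgufFile[];
}

// Distinct quantizations present in the repo, in file order.
function availableQuantizations(details: HfModelDetails): string[] {
  const quants: string[] = [];
  for (const file of details.files) {
    if (file.quantization && !quants.includes(file.quantization)) {
      quants.push(file.quantization);
    }
  }
  return quants;
}

// Prefer the recommended file's quant; otherwise fall back to the first quant found.
function pickDefaultQuantization(details: HfModelDetails): string | null {
  const recommended = details.recommendedFile?.quantization;
  if (recommended) {
    return recommended;
  }
  const fallback = availableQuantizations(details);
  return fallback.length > 0 ? fallback[0] : null;
}
```

Keeping this logic outside the component fits the stated split where the controller hook owns state and the dialog only renders.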

Backend + frontend pass typecheck and lint with 0 errors.

…NUAL_MODEL has no model

Root cause of the recurring "⚠️ Connector with id 'ANTHROPIC' not found"
on threads whose routingMode is MANUAL_MODEL but preferredProvider/Model
are null (e.g. legacy threads, or threads where the user picked a model
once but the preference never landed in `preferred*`):

1. UI/curl sends `{ threadId, content }` with no routingMode/provider/model.
2. chat-service resolveRoutingParams inherited MANUAL_MODEL from
   thread.routingMode but resolved forcedProvider=undefined,
   forcedModel=undefined (thread.preferredProvider/Model both null).
3. routing-service handleManualModel silently substituted
   CLOUD_MODEL_DEFAULT ('claude-sonnet-4') and inferred ANTHROPIC.
4. Connector lookup 404'd because the user never configured an Anthropic
   connector → the ASSISTANT message stored the failure string.

Two fixes:

claw-chat-service / chat-messages.service.ts:resolveRoutingParams
- For MANUAL_MODEL, walk a real fallback chain:
  dto.provider → thread.preferredProvider → thread.lastProvider.
  Same for model. So a thread that "remembers" the last model used
  (lastProvider/Model) keeps using it on bare-payload sends.
- If after all that there's still no provider AND no model, downgrade
  to AUTO and log a warn — never let MANUAL_MODEL leave this service
  with both fields undefined.
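
A minimal sketch of the fallback chain as described in this commit, assuming simplified DTO and thread shapes. The interfaces, the firstPresent helper, the injected warn callback, and the warning text are hypothetical; only the chain order and the downgrade to AUTO come from the commit message.

```typescript
enum RoutingMode {
  AUTO = 'AUTO',
  MANUAL_MODEL = 'MANUAL_MODEL',
}

// Assumed shapes; the real DTO and thread entity carry more fields.
interface SendMessageDto {
  routingMode?: RoutingMode;
  provider?: string;
  model?: string;
}

interface ThreadRoutingState {
  routingMode: RoutingMode;
  preferredProvider: string | null;
  preferredModel: string | null;
  lastProvider: string | null;
  lastModel: string | null;
}

interface RoutingParams {
  routingMode: RoutingMode;
  forcedProvider?: string;
  forcedModel?: string;
}

// Returns the first non-empty value in the chain, or undefined.
function firstPresent(...values: Array<string | null | undefined>): string | undefined {
  for (const value of values) {
    if (value) return value;
  }
  return undefined;
}

function resolveRoutingParams(
  dto: SendMessageDto,
  thread: ThreadRoutingState,
  warn: (message: string) => void,
): RoutingParams {
  const routingMode = dto.routingMode ?? thread.routingMode;
  if (routingMode !== RoutingMode.MANUAL_MODEL) {
    return { routingMode }; // simplification: other modes carry no forced fields here
  }
  const forcedProvider = firstPresent(dto.provider, thread.preferredProvider, thread.lastProvider);
  const forcedModel = firstPresent(dto.model, thread.preferredModel, thread.lastModel);
  if (!forcedProvider && !forcedModel) {
    warn('MANUAL_MODEL requested without provider or model; downgrading to AUTO');
    return { routingMode: RoutingMode.AUTO };
  }
  return { routingMode, forcedProvider, forcedModel };
}
```

A later commit in this PR drops the lastProvider/lastModel step; in this sketch that would mean removing the third argument from each firstPresent call.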

claw-routing-service / routing.manager.ts:handleManualModel
- Defense in depth. If forcedModel is missing, do NOT default to
  CLOUD_MODEL_DEFAULT. Fall through to handleAuto so the router
  actually picks something appropriate for the message + connector
  availability. (Made the method async to match handleAuto's signature;
  ModeHandler already accepts Promise | RoutingDecisionResult.)
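
The defense-in-depth change might be sketched as follows. ModeHandler and RoutingDecisionResult are named in the commit message, but their shapes here are assumed, and RoutingRequest, makeManualModelHandler, inferProvider, and the reason string are hypothetical.

```typescript
// Assumed shapes for illustration only.
interface RoutingRequest {
  message: string;
  forcedProvider?: string;
  forcedModel?: string;
}

interface RoutingDecisionResult {
  provider: string;
  model: string;
  reason: string;
}

// Per the commit message, handlers may return a result or a promise of one.
type ModeHandler = (
  request: RoutingRequest,
) => Promise<RoutingDecisionResult> | RoutingDecisionResult;

// Builds a manual-model handler that never invents a default model.
function makeManualModelHandler(
  handleAuto: ModeHandler,
  inferProvider: (model: string) => string,
): ModeHandler {
  return async (request) => {
    if (!request.forcedModel) {
      // No hardcoded cloud default: let AUTO pick based on message + connectors.
      return handleAuto(request);
    }
    return {
      provider: request.forcedProvider ?? inferProvider(request.forcedModel),
      model: request.forcedModel,
      reason: 'manual model selected',
    };
  };
}
```

Because the handler is async, awaiting it works the same whether it answers directly or delegates to the AUTO path.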

This kills the "ANTHROPIC not found" fallback for users without an
Anthropic connector configured.

…astModel reuse

chat-service: MANUAL_MODEL with no DTO/preferred selection now downgrades
to AUTO so the router can decide. Previously it fell back to
thread.lastProvider / lastModel, which reflects "whatever ran last" rather
than user intent, so threads where a specialty model (e.g. medgemma1.5)
had been used once would keep using it for unrelated prompts.

routing-service: detectCategoryRoute no longer early-returns when
isRuntimeHealthy('OLLAMA') is false. The runtime health check trips on a
single Ollama-assisted router-model timeout even while the actual chat
models on the same Ollama are reachable, which used to send all
category-aware routing (coding → LOCAL_CODING, medical → LOCAL_REASONING,
etc.) through to the hardcoded best-effort path that picked
ANTHROPIC/claude-sonnet-4 and failed with "Connector 'ANTHROPIC' not found"
on installs without an Anthropic connector. Category routing is now
attempted; the existing fall-through to heuristic + cloud best-effort
preserves prior behavior when no local model is installed for the role.
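
To make the routing-service change concrete, here is a hedged sketch of category routing that no longer gates on runtime health. Only the coding → LOCAL_CODING and medical → LOCAL_REASONING mappings and the 'OLLAMA' runtime name come from the commit message; the enums, the CategoryRoute shape, and the installed-model lookup are hypothetical.

```typescript
// Enums instead of string-literal unions, in line with the project's stated ban.
enum MessageCategory {
  CODING = 'CODING',
  MEDICAL = 'MEDICAL',
  GENERAL = 'GENERAL',
}

enum LocalModelRole {
  LOCAL_CODING = 'LOCAL_CODING',
  LOCAL_REASONING = 'LOCAL_REASONING',
}

interface CategoryRoute {
  provider: string;
  model: string;
  role: LocalModelRole;
}

const ROLE_BY_CATEGORY = new Map<MessageCategory, LocalModelRole>([
  [MessageCategory.CODING, LocalModelRole.LOCAL_CODING],
  [MessageCategory.MEDICAL, LocalModelRole.LOCAL_REASONING],
]);

// Returns a local route when a model is installed for the category's role.
// A null result means "fall through to heuristic + cloud best-effort".
function detectCategoryRoute(
  category: MessageCategory,
  installedModelByRole: Map<LocalModelRole, string>,
): CategoryRoute | null {
  // Deliberately no isRuntimeHealthy('OLLAMA') early return here: one router-model
  // timeout should not disable category routing while chat models are reachable.
  const role = ROLE_BY_CATEGORY.get(category);
  if (!role) {
    return null;
  }
  const model = installedModelByRole.get(role);
  if (!model) {
    return null;
  }
  return { provider: 'OLLAMA', model, role };
}
```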

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ihabkhaled ihabkhaled merged commit 1a4d626 into main May 24, 2026
59 checks passed
@ihabkhaled ihabkhaled deleted the feature/hf-search branch May 24, 2026 11:46