Feature/hf search #59
Merged
Parking work-in-progress on feature/hf-search so a parallel agent can
keep using main without colliding.
Backend:
- pull-jobs DELETE now dual-purpose: cancel-active or dismiss-terminal.
PullJobsRepository.deleteById + PullJobsService.cancel returns
PullJobCancelResult { id, status: CANCELLED | DISMISSED }.
- New PullJobCancelOutcome enum.
- catalog: GET /catalog/hf-search, GET /catalog/hf-models/:author/:name,
POST /catalog/hf-import (HfDiscoveryManager queries the live
HuggingFace API for GGUF-tagged models, ranks by trending/likes,
imports a row into FrontierCatalogEntry on demand).
- huggingface-client: rewrite 401 errors to clarify "repo may not
exist or is gated" instead of HF's misleading "Invalid username".
- seed-catalog: replace bogus `unsloth/Qwen3-Coder-7B-GGUF` (404 on
HF) with the canonical `Qwen/Qwen2.5-Coder-7B-Instruct-GGUF`.
Frontend (partial):
- download-job-row: Dismiss button for terminal jobs (calls existing
cancel endpoint, now smart enough to delete the row).
- types/hf-search.types.ts + repository methods + 2 hooks
(useHfSearch, useHfDetails). Dialog component not yet wired.
Next: hf-search-dialog component + "Browse HuggingFace" entry point
on /models/local-frontier page.
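
The dual-purpose DELETE described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the PullJobStatus values, the in-memory repository, and the terminal-status set are assumptions made for the example; only PullJobCancelResult, PullJobCancelOutcome, deleteById, and cancel are named in this commit.

```typescript
// Hypothetical job statuses; the commit only names the two cancel outcomes.
enum PullJobStatus {
  QUEUED = "QUEUED",
  RUNNING = "RUNNING",
  COMPLETED = "COMPLETED",
  FAILED = "FAILED",
  CANCELLED = "CANCELLED",
}

enum PullJobCancelOutcome {
  CANCELLED = "CANCELLED",
  DISMISSED = "DISMISSED",
}

interface PullJob {
  id: string;
  status: PullJobStatus;
}

interface PullJobCancelResult {
  id: string;
  status: PullJobCancelOutcome;
}

// Assumption: these are the statuses treated as terminal.
const TERMINAL_STATUSES = new Set<PullJobStatus>([
  PullJobStatus.COMPLETED,
  PullJobStatus.FAILED,
  PullJobStatus.CANCELLED,
]);

// In-memory stand-in for PullJobsRepository, for illustration only.
class InMemoryPullJobsRepository {
  private readonly jobs = new Map<string, PullJob>();

  save(job: PullJob): void {
    this.jobs.set(job.id, job);
  }

  findById(id: string): PullJob | undefined {
    return this.jobs.get(id);
  }

  deleteById(id: string): void {
    this.jobs.delete(id);
  }
}

class PullJobsService {
  private readonly repository: InMemoryPullJobsRepository;

  constructor(repository: InMemoryPullJobsRepository) {
    this.repository = repository;
  }

  // Terminal job: delete the row (dismiss). Active job: mark it cancelled.
  cancel(id: string): PullJobCancelResult {
    const job = this.repository.findById(id);
    if (job === undefined) {
      throw new Error(`Pull job ${id} not found`);
    }
    if (TERMINAL_STATUSES.has(job.status)) {
      this.repository.deleteById(id);
      return { id, status: PullJobCancelOutcome.DISMISSED };
    }
    this.repository.save({ ...job, status: PullJobStatus.CANCELLED });
    return { id, status: PullJobCancelOutcome.CANCELLED };
  }
}
```

Returning the outcome lets the frontend tell "the download was stopped" apart from "the row was removed" without a second request.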
Backend (claw-llamacpp-service):
- HfAutoSyncManager: pulls the top-N HuggingFace GGUF models by
trendingScore, downloads, and likes (config-driven caps per sort key,
default 40 each). Runs once 30s after bootstrap, then on a daily cron
(04:00 UTC). Imports each model into FrontierCatalogEntry with heuristic
category/qualityTier from tags; dedupes on (name, tag); skips
gated/private repos and files >80 GB. Concurrent-run guard, reports
counts + per-repo outcomes.
- New env vars: HF_AUTO_SYNC_ENABLED (default true), HF_AUTO_SYNC_CRON,
HF_AUTO_SYNC_TRENDING_LIMIT, HF_AUTO_SYNC_DOWNLOADS_LIMIT,
HF_AUTO_SYNC_LIKES_LIMIT.
- New enum: HfAutoSyncTrigger + HfAutoSyncOutcomeStatus (kept the
string-literal-union ban happy).
- POST /api/v1/llamacpp/catalog/hf-auto-sync (admin/operator) to trigger
a manual run.
Frontend (claw-frontend):
- HfSearchDialog with live HuggingFace search, sortable by
trending/downloads/likes/lastModified, debounced query. Two-pane layout:
results list on the left, details + import controls on the right.
Quantization defaults to the recommended file's quant, with fallback to
other GGUF quants present in the repo.
- "Browse HuggingFace" button on /models/local-frontier opens the dialog.
All state lives in the controller hook (useHfSearchDialog); the dialog
component is pure render.
- Extracted: hf-search.enum.ts (HfSearchSort/HfCategoryChoice/
HfQualityTierChoice), hf-search.constants.ts (sort/category/tier options
+ debounce + result-limit), hf-format.utility.ts,
pull-job-cancel.types.ts. hf-results-list.tsx and hf-details-panel.tsx
are separate components per the "no inline helper components in TSX"
rule.
Backend + frontend pass typecheck and lint with 0 errors.
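
The skip and dedupe rules of the auto-sync run can be sketched as a pure function. This is an illustrative sketch under stated assumptions, not HfAutoSyncManager itself: the HfCandidate shape, the planImports and catalogKey helpers, and the members of HfAutoSyncOutcomeStatus are hypothetical (the commit names the enum but not its values), and "80 GB" is read as decimal gigabytes.

```typescript
// Hypothetical members; the commit names the enum but not its values.
enum HfAutoSyncOutcomeStatus {
  IMPORTED = "IMPORTED",
  SKIPPED_DUPLICATE = "SKIPPED_DUPLICATE",
  SKIPPED_RESTRICTED = "SKIPPED_RESTRICTED",
  SKIPPED_TOO_LARGE = "SKIPPED_TOO_LARGE",
}

// Hypothetical candidate shape for one GGUF file found in a repo.
interface HfCandidate {
  repoId: string;
  name: string;
  tag: string;
  gated: boolean;
  isPrivate: boolean;
  fileSizeBytes: number;
}

interface HfAutoSyncOutcome {
  repoId: string;
  status: HfAutoSyncOutcomeStatus;
}

// Assumption: the 80 GB cap is decimal gigabytes.
const MAX_FILE_BYTES = 80 * 1_000_000_000;

// Dedupe key for the (name, tag) pair.
const catalogKey = (name: string, tag: string): string =>
  JSON.stringify([name, tag]);

// Decide, per candidate, whether it would be imported or skipped.
// existingKeys holds the keys already present in the catalog.
function planImports(
  candidates: HfCandidate[],
  existingKeys: ReadonlySet<string>,
): HfAutoSyncOutcome[] {
  const seen = new Set<string>(existingKeys);
  return candidates.map((c): HfAutoSyncOutcome => {
    if (c.gated || c.isPrivate) {
      return { repoId: c.repoId, status: HfAutoSyncOutcomeStatus.SKIPPED_RESTRICTED };
    }
    if (c.fileSizeBytes > MAX_FILE_BYTES) {
      return { repoId: c.repoId, status: HfAutoSyncOutcomeStatus.SKIPPED_TOO_LARGE };
    }
    const key = catalogKey(c.name, c.tag);
    if (seen.has(key)) {
      return { repoId: c.repoId, status: HfAutoSyncOutcomeStatus.SKIPPED_DUPLICATE };
    }
    seen.add(key);
    return { repoId: c.repoId, status: HfAutoSyncOutcomeStatus.IMPORTED };
  });
}
```

Because the same repo can appear in the trending, downloads, and likes lists at once, tracking seen keys within a single run matters as much as checking the existing catalog.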
…NUAL_MODEL has no model
Root cause of the recurring "⚠️ Connector with id 'ANTHROPIC' not found"
on threads whose routingMode is MANUAL_MODEL but preferredProvider/Model
are null (e.g. legacy threads, or threads where the user picked a model
once but the preference never landed in `preferred*`):
1. UI/curl sends `{ threadId, content }` with no
routingMode/provider/model.
2. chat-service resolveRoutingParams inherited MANUAL_MODEL from
thread.routingMode but resolved forcedProvider=undefined,
forcedModel=undefined (thread.preferredProvider/Model both null).
3. routing-service handleManualModel silently substituted
CLOUD_MODEL_DEFAULT ('claude-sonnet-4') and inferred ANTHROPIC.
4. Connector lookup 404'd because the user never configured an Anthropic
connector → the ASSISTANT message stored the failure string.
Two fixes:
claw-chat-service / chat-messages.service.ts:resolveRoutingParams
- For MANUAL_MODEL, walk a real fallback chain: dto.provider →
thread.preferredProvider → thread.lastProvider. Same for model. So a
thread that "remembers" the last model used (lastProvider/Model) keeps
using it on bare-payload sends.
- If after all that there's still no provider AND no model, downgrade to
AUTO and log a warn; never let MANUAL_MODEL leave this service with both
fields undefined.
claw-routing-service / routing.manager.ts:handleManualModel
- Defense in depth. If forcedModel is missing, do NOT default to
CLOUD_MODEL_DEFAULT. Fall through to handleAuto so the router actually
picks something appropriate for the message + connector availability.
(Made the method async to match handleAuto's signature; ModeHandler
already accepts Promise | RoutingDecisionResult.)
This kills the "ANTHROPIC not found" fallback for users without an
Anthropic connector configured.
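
The fallback chain in resolveRoutingParams might look roughly like the sketch below. The DTO and thread shapes, the firstDefined helper, and the injectable warn callback are assumptions made for illustration; only the field names, the dto → preferred → last order, and the downgrade to AUTO come from this commit message.

```typescript
enum RoutingMode {
  AUTO = "AUTO",
  MANUAL_MODEL = "MANUAL_MODEL",
}

// Hypothetical shapes, reduced to the fields this commit mentions.
interface SendMessageDto {
  threadId: string;
  content: string;
  routingMode?: RoutingMode;
  provider?: string;
  model?: string;
}

interface ThreadRecord {
  routingMode: RoutingMode;
  preferredProvider: string | null;
  preferredModel: string | null;
  lastProvider: string | null;
  lastModel: string | null;
}

interface RoutingParams {
  routingMode: RoutingMode;
  forcedProvider?: string | undefined;
  forcedModel?: string | undefined;
}

// Return the first value that is neither null, undefined, nor empty.
function firstDefined(
  ...values: Array<string | null | undefined>
): string | undefined {
  for (const value of values) {
    if (value !== null && value !== undefined && value !== "") {
      return value;
    }
  }
  return undefined;
}

function resolveRoutingParams(
  dto: SendMessageDto,
  thread: ThreadRecord,
  warn: (message: string) => void = console.warn,
): RoutingParams {
  const routingMode = dto.routingMode ?? thread.routingMode;
  if (routingMode !== RoutingMode.MANUAL_MODEL) {
    return { routingMode };
  }

  // Fallback chain: dto → thread.preferred* → thread.last*.
  const forcedProvider = firstDefined(
    dto.provider,
    thread.preferredProvider,
    thread.lastProvider,
  );
  const forcedModel = firstDefined(
    dto.model,
    thread.preferredModel,
    thread.lastModel,
  );

  // Never emit MANUAL_MODEL with both fields undefined.
  if (forcedProvider === undefined && forcedModel === undefined) {
    warn(`MANUAL_MODEL without provider/model on thread ${dto.threadId}; downgrading to AUTO`);
    return { routingMode: RoutingMode.AUTO };
  }
  return { routingMode, forcedProvider, forcedModel };
}
```

With this shape, a bare `{ threadId, content }` payload on a legacy thread yields AUTO instead of reaching the routing service with nothing to force.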
…astModel reuse
chat-service: MANUAL_MODEL with no DTO/preferred selection now downgrades
to AUTO so the router can decide. Previously fell back to thread.lastProvider
/ lastModel which is "whatever ran last" — not user intent — so threads
where a specialty model (e.g. medgemma1.5) had been used once would keep
using it for unrelated prompts.
routing-service: detectCategoryRoute no longer early-returns when
isRuntimeHealthy('OLLAMA') is false. The runtime health check trips on a
single Ollama-assisted router-model timeout even while the actual chat
models on the same Ollama are reachable, which used to send all
category-aware routing (coding → LOCAL_CODING, medical → LOCAL_REASONING,
etc.) through to the hardcoded best-effort path that picked
ANTHROPIC/claude-sonnet-4 and failed with "Connector 'ANTHROPIC' not found"
on installs without an Anthropic connector. Category routing is now
attempted; the existing fall-through to heuristic + cloud best-effort
preserves prior behavior when no local model is installed for the role.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
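
The routing-service change can be pictured with the sketch below. It is a simplified illustration under stated assumptions, not the actual detectCategoryRoute: the MessageCategory and LocalRole enums, the category-to-role table, and the installed-model map are hypothetical; only the coding → LOCAL_CODING and medical → LOCAL_REASONING mappings and the removal of the runtime-health early return come from this commit.

```typescript
// Hypothetical enums for illustration.
enum MessageCategory {
  CODING = "CODING",
  MEDICAL = "MEDICAL",
  GENERAL = "GENERAL",
}

enum LocalRole {
  LOCAL_CODING = "LOCAL_CODING",
  LOCAL_REASONING = "LOCAL_REASONING",
}

interface CategoryRoute {
  provider: string;
  model: string;
  role: LocalRole;
}

// Category → local role, per the examples in the commit message.
const CATEGORY_TO_ROLE: Partial<Record<MessageCategory, LocalRole>> = {
  [MessageCategory.CODING]: LocalRole.LOCAL_CODING,
  [MessageCategory.MEDICAL]: LocalRole.LOCAL_REASONING,
};

// Returns a route when a local model is installed for the category's role,
// otherwise undefined so the caller falls through to heuristic + cloud
// best-effort. There is deliberately no runtime-health gate here.
function detectCategoryRoute(
  category: MessageCategory,
  installedModelByRole: ReadonlyMap<LocalRole, string>,
): CategoryRoute | undefined {
  const role = CATEGORY_TO_ROLE[category];
  if (role === undefined) {
    return undefined;
  }
  const model = installedModelByRole.get(role);
  if (model === undefined) {
    return undefined;
  }
  return { provider: "OLLAMA", model, role };
}
```

Returning undefined rather than throwing keeps the existing fall-through path intact for installs with no local model for the role.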
Summary
Brief description of changes.
Type
Checklist
- npm run lint passes
- npm run build passes
- npm run test passes