Phase 1 Production Hardening: E2E tests, RF-7 privacy, perf baselines, Tantivy consistency #53
Merged
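- AGENTS.md: 705→258 lines, keeping the core constraints (environment guidance / key conventions / security principles / architecture red lines / prohibited actions)
- docs/AGENTS-full.md: retains the full version (history / roadmap / detailed discussion)
- docs/production-readiness.md: new production-readiness checklist
  - 6 dimensions: stability / performance / MCP / Agent integration / documentation / release process
  - 4-phase plan: Phase0 (current) → Phase1 (stabilization) → Phase2 (Agent pilot) → Phase3 (v1.0)
  - Every item is objectively verifiable; subjective descriptions are prohibited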
- Add description fields to 12 crates missing them in Cargo.toml
- Remove dead code: sync_skills_to_clarity (Client-Agnostic violation), update_repo_last_synced_at, list_workspaces_by_tier
- Add SAFETY comments to 4 unsafe env var blocks in mcp/tests.rs
- RepairResult caller now logs orphan/missing counts; remove allow(dead_code)
- Remove unused FolderScheduler::new, add NOTE for retained dead_code
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- relation: add test_list_relations, test_find_related_entities_bidirectional, test_save_relation_upsert (was only 1 smoke test)
- health: add test_get_health_batch covering batch query, empty input, and partial miss scenarios
- 5 tested crates now at 88-97% region coverage
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add MANAGED_TAGS constant and RepoEntry::is_managed() in registry.rs
  Document that managed lives in repo_tags (queryable) not metadata
- Replace inline MANAGED_TAGS check in sync/tasks.rs with repo.is_managed()
- Add devbase repo list: shows managed flag, type, tier, path
- Add devbase repo status: batch git health (ahead/behind/dirty/managed) with health cache TTL reuse and --json support
- Enhance sync output: categorize skipped repos by reason with counts
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Promote devkit_hybrid_search from Beta to Stable
- Add docs/reference/stable-tools/{health,project_brief,hybrid_search,
vault_search,session_recall}.md with frozen input/output schemas,
example requests/responses, and error catalogs
- Add stable-tools/README.md with stability guarantee contract
- Update mcp-tools.md cross-links and tier markings
- Add docs/clients/claude/scenarios.md with 5 usage scenarios
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- code-symbols: 9 tests covering query_all, type filter, name filter, file_path filter, combined filters, limit, cross-repo isolation, optional field preservation
- call-graph: 8 tests covering all_edges, callee/caller/file filters, combined filters, limit, cross-repo isolation
- dead-code: 6 tests covering include_pub/exclude_pub, caller exclusion, tests.rs exclusion, limit, empty repo
- All 3 crates raised from 0% to functional coverage
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Scenario validation tests for Claude onboarding and semantic code exploration revealed a production-grade silent-failure bug in devkit_vault_search: VaultNote deserialization from partial JSON failed and was masked by unwrap_or_default(), causing all queries to return empty results.
- Add append_mcp_oplog() NDJSON tracing for tool call latency and error classification
- Add seed_scenario_data() + two scenario integration tests
- Fix vault_search to operate on serde_json::Value directly, eliminating the deserialization trap
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
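The failure mode described in this commit can be illustrated with a small std-only sketch. It is not the devbase code: `Note`, `parse_strict`, and `field` are hypothetical names, and a `key=value;...` record stands in for JSON so the example needs no external crates. It shows how `unwrap_or_default()` turns a parse error into an empty value that looks like a legitimate result, and how reading fields directly from the raw record avoids the trap.

```rust
// Hypothetical stand-in for a note record; field names are illustrative only.
#[derive(Debug, Default, PartialEq)]
struct Note {
    id: String,
    title: String,
}

// Strict parser: every field must be present, mimicking typed deserialization.
fn parse_strict(record: &str) -> Result<Note, String> {
    let mut id = None;
    let mut title = None;
    for pair in record.split(';') {
        match pair.split_once('=') {
            Some(("id", v)) => id = Some(v.to_string()),
            Some(("title", v)) => title = Some(v.to_string()),
            _ => {} // unknown or malformed pairs are ignored
        }
    }
    Ok(Note {
        id: id.ok_or("missing field `id`")?,
        title: title.ok_or("missing field `title`")?,
    })
}

// Lenient access: read one field from the raw record, tolerating missing siblings.
fn field<'a>(record: &'a str, key: &str) -> Option<&'a str> {
    record.split(';').find_map(|pair| {
        let (k, v) = pair.split_once('=')?;
        (k == key).then_some(v)
    })
}
```

With a partial record such as `"title=Design notes"`, `parse_strict(..)` returns `Err`, but `parse_strict(..).unwrap_or_default()` yields an empty `Note` with no sign that anything failed, while `field(.., "title")` still returns the title. Logging or propagating the `Err` instead of defaulting is the other way to keep the failure visible.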
Extend devkit_oplog_query with an `analytics` flag that reads mcp-oplog.ndjson and returns statistical reports:
- tool call frequency and success rate per tool
- latency percentiles (P50, P95, P99)
- error classification breakdown
- time range coverage
Also fixes clippy lint issues in sort_by closures.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
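As a rough illustration of the latency percentile report, the sketch below computes P50/P95/P99 with the nearest-rank method over millisecond samples. The commit does not say which percentile definition devkit_oplog_query uses, so nearest-rank is an assumption, and `percentile`, `summarize`, and `LatencySummary` are hypothetical names.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// `p` is a whole percentage in 1..=100; returns None for empty input or invalid `p`.
fn percentile(sorted_ms: &[u64], p: usize) -> Option<u64> {
    if sorted_ms.is_empty() || p == 0 || p > 100 {
        return None;
    }
    // rank = ceil(p * n / 100), computed in integers to avoid float rounding.
    let rank = (p * sorted_ms.len() + 99) / 100;
    Some(sorted_ms[rank - 1])
}

/// Hypothetical report shape; the real analytics output may differ.
#[derive(Debug, PartialEq)]
struct LatencySummary {
    p50: u64,
    p95: u64,
    p99: u64,
}

/// Sorts the samples and extracts the three percentiles; None if there are no samples.
fn summarize(mut samples_ms: Vec<u64>) -> Option<LatencySummary> {
    samples_ms.sort_unstable();
    Some(LatencySummary {
        p50: percentile(&samples_ms, 50)?,
        p95: percentile(&samples_ms, 95)?,
        p99: percentile(&samples_ms, 99)?,
    })
}
```

With few samples the upper percentiles collapse onto the maximum (for four samples, P95 and P99 are both the largest value), which is worth keeping in mind when reading analytics for rarely called tools.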
…rf baselines, Tantivy consistency
End-to-end workflow tests (Task #16):
- Add MCP-tool-level DAG workflow integration tests:
  - test_workflow_run_dag_success: 3-step Condition chain via DevkitWorkflowRunTool -> DevkitWorkflowStatusTool round-trip
  - test_workflow_run_failure_propagation: verifies ErrorPolicy::Fail status propagation to execution record
Performance regression baseline calibration (Task #17):
- Fix schema bug in perf tests (missing `signature` column caused panic)
- Add profile-aware thresholds via cfg!(debug_assertions): release 1k<200ms/10k<500ms; debug 1k<800ms/10k<2000ms
- Add latency eprintln reporting for CI observability
RF-7 path privacy fix (Task #18):
- Add sanitize_path() helper: replaces home dir prefix with ~, normalizes \ to /
- Apply across project_context output: repo.path, modules.path, symbols.file, calls.caller_file, assets.path
- Add 5 unit tests for path desensitization logic
Tantivy consistency hardening (Task #19):
- Fix AppContext to use actual storage backend's index_path for repair_tantivy_consistency_at and sync_index_to_db_at (was hardcoded to DefaultStorageBackend, breaking TempStorageBackend)
- Fix repair_tantivy_consistency_at early-return bug: now loads SQLite IDs first; on Tantivy read failure reports missing_from_index = sqlite_ids.len() instead of silently 0
- Add tests: fresh_workspace consistency, empty-index+DB-repos detection, AppContext correct index path verification
Formatting:
- Run cargo fmt across workspace to satisfy CI fmt --check
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
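The RF-7 path rewrite in this commit (home prefix to `~`, `\` to `/`) might look roughly like the sketch below. It is not the devbase implementation: the home directory is passed in as a parameter to keep the example std-only (the PR description mentions `dirs::home_dir()`), and the component-boundary check is an added assumption.

```rust
/// Sketch of a path sanitizer: replaces a leading home-directory prefix with `~`
/// and normalizes backslashes to forward slashes.
/// `home` is an explicit parameter here (assumption); pass None when it is unknown.
fn sanitize_path(path: &str, home: Option<&str>) -> String {
    let normalized = path.replace('\\', "/");
    let Some(home) = home else {
        return normalized;
    };
    // Normalize the home directory the same way and drop any trailing slash.
    let home = home.replace('\\', "/");
    let home = home.trim_end_matches('/');
    if home.is_empty() {
        return normalized;
    }
    match normalized.strip_prefix(home) {
        // Only rewrite on a component boundary, so `/home/alice2` is not
        // mistaken for a path under `/home/alice`.
        Some(rest) if rest.is_empty() || rest.starts_with('/') => format!("~{rest}"),
        _ => normalized,
    }
}
```

For example, `sanitize_path("/home/alice/src/repo", Some("/home/alice"))` gives `~/src/repo`, and a Windows-style `C:\Users\alice\repo\src\lib.rs` with home `C:\Users\alice` gives `~/repo/src/lib.rs`. Paths outside the home directory are only slash-normalized. The sketch compares prefixes case-sensitively, which may not match Windows semantics.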
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The G5 RF-6 rule was flagging in as production code because the skip regex did not match (no trailing slash). in test modules is idiomatic Rust and should not be flagged.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The release workflow uses --locked, which requires Cargo.lock to be in sync with Cargo.toml version bumps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
This PR delivers the Phase 1 production-readiness hardening for devbase v0.20.0, spanning end-to-end testing, path privacy (RF-7), performance regression baselines, and Tantivy index consistency.
Changes
Testing (Wave 1-3)
MCP Tooling
- `devkit_oplog_query` gains `analytics` mode with latency percentiles (p50/p95/p99), tool breakdown, error classification, and success-rate metrics
- `devkit_vault_search`: `VaultNote` deserialization replaced with direct `Value` traversal, eliminating silent empty-result failures
- `devkit_health`, `devkit_project_brief`, `devkit_query_repos`, `devkit_hybrid_search`, `devkit_vault_search`
Security / Privacy (RF-7)
- `sanitize_path()` helper: replaces `dirs::home_dir()` prefix with `~`, normalizes `\` → `/`
- Applied across `project_context` output fields: `repo.path`, `modules.path`, `symbols.file`, `calls.caller_file`, `assets.path`
Performance
- Fixed schema bug in `keyword_search_latency_regression_*` tests (missing `signature` column caused panic)
- Profile-aware thresholds via `cfg!(debug_assertions)`: release 1k<200ms/10k<500ms; debug uses relaxed thresholds to avoid false positives
Storage / Index Consistency
- `AppContext::with_storage()` now uses the actual storage backend's `index_path` for `repair_tantivy_consistency_at` and `sync_index_to_db_at`; it was previously hardcoded to `DefaultStorageBackend`, causing `TempStorageBackend` tests to check the wrong directory
- `repair_tantivy_consistency_at` no longer returns early on Tantivy read failure; SQLite IDs are loaded first, and on failure `missing_from_index = sqlite_ids.len()` is reported
Sync / Registry
- `MANAGED_TAGS` constant + `RepoEntry::is_managed()` for sync transparency
- `devkit_repo_status` command with managed/unmanaged/dirty/behind/ahead counts
Test Plan
- `cargo test --lib --tests --bins --examples` passes
- `cargo fmt --check` clean
- `cargo clippy --all-targets -- -W warnings` clean
Risk Assessment
- `sanitize_path()` affects all `project_context` consumers; paths now use `~` prefix and forward slashes.
🤖 Generated with Claude Code