feat(deepseek): prefix cache stability primitives (#145 follow-up) #295

Merged
quangdang46 merged 1 commit into master from feat/deepseek-prefix-cache-stable
May 24, 2026

@quangdang46
Owner

Gap close on #145

Upstream PR 1jehuang/jcode#194 (open, not merged) added a prefix_cache_stable module plus agent integration. Our master had neither, so the earlier verify-close was inaccurate.

This PR ports the detection + preflight primitives as a standalone module. Agent integration (turn loop history folding, tool-result truncation) is a separate follow-up that needs dedicated test coverage.

use jcode::prefix_cache_stable as pcs;

if pcs::is_prefix_cache_stable_mode() {
    let cap = pcs::recommended_tool_result_cap_tokens()
        .unwrap_or(usize::MAX);
    // truncate large tool results to <cap> tokens
}

let ctx_max = pcs::context_tokens_for_model(model);
let decision = pcs::preflight_check_simple(estimate, ctx_max);
if decision.needs_action {
    // user_warn or kick off compaction
}
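
The truncation step in the first branch is left as a comment above. A minimal sketch of one way to do it follows, assuming a rough 4-bytes-per-token heuristic instead of a real tokenizer. The helper names `estimate_tokens` and `truncate_to_cap` are hypothetical and not part of this PR.

```rust
/// Rough token estimate. ASSUMPTION: about 4 bytes per token; this is a
/// heuristic stand-in for a real tokenizer, not what the module does.
fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Hypothetical helper: return a prefix of `text` whose estimated token
/// count does not exceed `cap_tokens`, cutting on a UTF-8 char boundary.
fn truncate_to_cap(text: &str, cap_tokens: usize) -> &str {
    // saturating_mul keeps a cap of usize::MAX (the "no cap" fallback
    // used in the snippet above) from overflowing.
    let max_bytes = cap_tokens.saturating_mul(4);
    if text.len() <= max_bytes {
        return text;
    }
    // Walk back to the nearest char boundary; index 0 is always one.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

With `cap` computed as in the snippet above, a caller would pass each tool result through `truncate_to_cap(&result, cap)` before adding it to history. Because the cap falls back to `usize::MAX`, results pass through unchanged when no cap is recommended.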

Detection signals (any one, case-insensitive)

JCODE_OPENROUTER_CACHE_NAMESPACE=deepseek
JCODE_RUNTIME_PROVIDER=deepseek
JCODE_NAMED_PROVIDER_PROFILE=deepseek
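
The "any one, case-insensitive" rule can be sketched as below. This illustrates the rule as described here, not the ported module's code. The names `SIGNAL_VARS`, `detect_with`, and `detect_from_env` are hypothetical, and injecting the lookup closure is a choice made for testability.

```rust
/// The three environment variables listed above as detection signals.
const SIGNAL_VARS: [&str; 3] = [
    "JCODE_OPENROUTER_CACHE_NAMESPACE",
    "JCODE_RUNTIME_PROVIDER",
    "JCODE_NAMED_PROVIDER_PROFILE",
];

/// Returns true if any signal variable is set to "deepseek", ignoring
/// ASCII case. `lookup` is injected so the rule can be exercised without
/// mutating the process environment.
fn detect_with<F>(lookup: F) -> bool
where
    F: Fn(&str) -> Option<String>,
{
    SIGNAL_VARS.iter().any(|&name| {
        lookup(name)
            .map(|value| value.eq_ignore_ascii_case("deepseek"))
            .unwrap_or(false)
    })
}

/// Convenience wrapper that reads the real process environment.
fn detect_from_env() -> bool {
    detect_with(|name| std::env::var(name).ok())
}
```

Passing the lookup as a closure lets unit tests cover each variable in isolation, which avoids races between tests that would otherwise set and unset shared environment variables.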

Tests

$ cargo test -p jcode --lib prefix_cache_stable
test result: ok. 12 passed; 0 failed

Refs upstream PR 1jehuang/jcode#194, fork #145.

Gap close: upstream PR 1jehuang#194 (open) added a
prefix_cache_stable module + agent integration. Our master had
neither. Earlier verify-close was wrong — upstream itself never
merged that PR.

This PR ports the **detection + preflight primitives** as a clean
standalone module. Agent integration (turn loop history folding,
tool-result truncation hooks) is a separate follow-up that needs
dedicated test coverage.

Constants exported:
  DEEPSEEK_V4_CONTEXT_TOKENS=1_000_000
  DEFAULT_CONTEXT_TOKENS=128_000
  HISTORY_FOLD_THRESHOLD/AGGRESSIVE_THRESHOLD/FORCE_SUMMARY_THRESHOLD
  PREFLIGHT_EMERGENCY_THRESHOLD=0.95
  TURN_END_RESULT_CAP_TOKENS=3000
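
To show how these constants could fit together, here is a rough sketch of context resolution and the preflight check. Only the constant values and the `needs_action` field come from this description. The constant types, the `usage_ratio` field, the model-name matching rule, and the "at or above 95%" comparison are all assumptions made for illustration.

```rust
// Values as listed above; the types are assumed.
const DEEPSEEK_V4_CONTEXT_TOKENS: usize = 1_000_000;
const DEFAULT_CONTEXT_TOKENS: usize = 128_000;
const PREFLIGHT_EMERGENCY_THRESHOLD: f64 = 0.95;

/// Hypothetical decision type. Only `needs_action` appears in the PR
/// text; `usage_ratio` is added here for illustration.
#[derive(Debug, PartialEq)]
struct PreflightDecision {
    usage_ratio: f64,
    needs_action: bool,
}

/// ASSUMED mapping: a model id containing "deepseek-v4" gets the 1M
/// window, everything else gets the default.
fn context_tokens_for_model(model: &str) -> usize {
    if model.to_ascii_lowercase().contains("deepseek-v4") {
        DEEPSEEK_V4_CONTEXT_TOKENS
    } else {
        DEFAULT_CONTEXT_TOKENS
    }
}

/// ASSUMED rule: action is needed once the estimate reaches the
/// emergency threshold share of the context window.
fn preflight_check_simple(estimate: usize, ctx_max: usize) -> PreflightDecision {
    // Treat a zero-sized window as already full to avoid dividing by zero.
    let usage_ratio = if ctx_max == 0 {
        1.0
    } else {
        estimate as f64 / ctx_max as f64
    };
    PreflightDecision {
        usage_ratio,
        needs_action: usage_ratio >= PREFLIGHT_EMERGENCY_THRESHOLD,
    }
}
```

Under these assumptions, an estimate of 122,000 tokens against the 128,000-token default window is about 95.3% usage and would set `needs_action`, while 100,000 tokens (about 78%) would not.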

Tests: 12 unit tests covering detection (5) + context resolution
(2) + preflight thresholds (3) + recommended cap toggle (2).

Refs upstream PR 1jehuang#194, fork #145.
@quangdang46 quangdang46 merged commit eec64bd into master May 24, 2026
@quangdang46 quangdang46 deleted the feat/deepseek-prefix-cache-stable branch May 24, 2026 22:08