Expose ngram speculative decoding #268
Expanded local speculative benchmark pass on macOS arm64 / CPU, using the rebuilt local native dylib from leehack/llamadart-native#23. Models/configs tested:
Important blocker found during the expanded pass:
Upstream references checked:
Correction to the benchmark note above: zsh stripped the inline backtick snippets before posting. Exact parity blocker values:
The conclusion is unchanged: Dart PR #268 should stay draft until the ngram draft=4 deterministic parity issue is fixed, intentionally constrained, or documented as an accepted semantic tradeoff.
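For readers unfamiliar with the term, "deterministic parity" here means that greedy decoding with ngram speculation enabled should emit exactly the same tokens as greedy decoding without it. The sketch below illustrates that property on a toy model. It is not the llamadart or llama.cpp implementation: `toy_next_token`, `ngram_draft`, and `speculative_decode` are hypothetical names invented for this example, and the real runtime verifies the whole draft in one batch rather than one token at a time.

```python
from typing import Callable, List, Sequence

Token = int
NextFn = Callable[[Sequence[Token]], Token]


def toy_next_token(context: Sequence[Token]) -> Token:
    """Hypothetical deterministic 'model' standing in for greedy (argmax) sampling."""
    return (sum(context[-3:]) * 7 + len(context)) % 5


def greedy_decode(next_token: NextFn, prompt: Sequence[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token, no speculation."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out))
    return out[len(prompt):]


def ngram_draft(history: Sequence[Token], n: int, max_draft: int) -> List[Token]:
    """Find the most recent earlier occurrence of the last n tokens and
    propose up to max_draft of the tokens that followed it."""
    if len(history) <= n:
        return []
    key = tuple(history[-n:])
    # Scan backwards, skipping the suffix itself.
    for start in range(len(history) - n - 1, -1, -1):
        if tuple(history[start:start + n]) == key:
            return list(history[start + n:start + n + max_draft])
    return []


def speculative_decode(next_token: NextFn, prompt: Sequence[Token], n_new: int,
                       n: int = 2, max_draft: int = 4) -> List[Token]:
    """Accept draft tokens only while they match what the model would emit."""
    out = list(prompt)
    target_len = len(prompt) + n_new
    while len(out) < target_len:
        for proposed in ngram_draft(out, n, max_draft):
            if len(out) >= target_len or next_token(out) != proposed:
                break  # first mismatch invalidates the rest of the draft
            out.append(proposed)
        if len(out) < target_len:
            # The model's own token after the accepted prefix.
            out.append(next_token(out))
    return out[len(prompt):]


def check_parity(prompt: Sequence[Token], n_new: int, max_draft: int) -> bool:
    """True when speculative and baseline greedy outputs are identical."""
    baseline = greedy_decode(toy_next_token, prompt, n_new)
    speculative = speculative_decode(toy_next_token, prompt, n_new, max_draft=max_draft)
    return baseline == speculative
```

Because every accepted token is checked against the model's own choice, parity holds by construction in this sketch for any draft length. A parity failure at a specific draft size in a real runtime therefore suggests that something other than the acceptance rule differs between the two paths, which is the kind of issue the blocker above refers to.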
|
Updated the issue-190 ngram fix in a new commit.
What changed:
Local validation:
Real-model runtime validation with the local
Native/upstream boundary:
Current state: PR CI is running.
Follow-up: CI is now green. All PR checks passed, including Analyze & Lint, Docs Build Check, Test Linux VM with Coverage, Test Web (Chrome), Test Native on macOS and Windows, Native Prompt Reuse Parity, both companion package jobs, the chat app PR preview, and the LiteRT-LM smoke jobs. The PR remains a draft only because the runtime feature depends on landing and publishing the native wrapper symbols from leehack/llamadart-native#23.
Summary
Validation
Blocking before ready
Related