Pull requests: lightseekorg/tokenspeed
Pull requests list
Fix Qwen3.5 MoE text config initialization
#328
opened Jun 1, 2026 by
lightseek-bot
Contributor
feat(engine): compute_log_probs API for RL sequence scoring (RL-plan M2)
#321
opened May 30, 2026 by
HJSang
Collaborator
fix(cute_dsl): skip CuTe DSL argmax kernel on H20 GPUs
#311
opened May 29, 2026 by
botieking98
fix(dp): fix qwen 3.5 data parallel bug.
#309
opened May 29, 2026 by
tuanzhangCS
Contributor
feat(entrypoints): add HTTP server sidecar alongside smg gateway
#308
opened May 29, 2026 by
qywu
Collaborator
[WIP] refactor + perf(spec-decode): refactor #217 + add prefill scope
#304
opened May 28, 2026 by
rjzhb
Contributor
[Perf] Enable missing FlashInfer MoE autotuning
#290
opened May 28, 2026 by
mmangkad
Contributor
perf(model_loader): multi-threaded safetensors weight loading
#287
opened May 28, 2026 by
yuanqingz
fix(PD): fix PD speculative bootstrap input seeding
#286
opened May 28, 2026 by
XucSh
Contributor
chore: use -O3 -use_fast_math for tokenspeed_kernel compilation
#285
opened May 28, 2026 by
syuoni
Member
Add Triton sampling backends alongside FlashInfer
#280
opened May 27, 2026 by
FlamingoPg
Contributor
feat(scheduler): per-adapter KV prefix-cache namespace + max_loras batch cap
#268
opened May 26, 2026 by
qywu
Collaborator
5 tasks
fix(trtllm-mla): make spec-decode CUDA graph capture causal
#260
opened May 26, 2026 by
mesaleh
perf: TokenSpeed MLA decode kernel optimization for num_heads=16
#255
opened May 26, 2026 by
dishengbin
Contributor
Draft