lightseekorg / tokenspeed Public

Fork 134
Star 1.3k

Pull requests: lightseekorg/tokenspeed

Labels 10 Milestones 0

34 Open 278 Closed

Pull requests list

Fix Qwen3.5 MoE text config initialization

#328 opened Jun 1, 2026 by lightseek-bot Contributor

feat(engine): compute_log_probs API for RL sequence scoring (RL-plan M2)

#321 opened May 30, 2026 by HJSang Collaborator

fix(eagle): avoid mutating drafter sequence lengths

#318 opened May 30, 2026 by XucSh Contributor • Draft

[WIP]perf: add Gluon MoE kernels for GPT-OSS

#314 opened May 29, 2026 by knwng Contributor

[Perf] Optimizes loads in gfx950 fp16 decode kernel

#313 opened May 29, 2026 by Yu-Zhewen

test: extend agentic bench

#312 opened May 29, 2026 by syuoni Member

fix(cute_dsl): skip CuTe DSL argmax kernel on H20 GPUs

#311 opened May 29, 2026 by botieking98

fix(dp): fix qwen 3.5 data parallel bug.

#309 opened May 29, 2026 by tuanzhangCS Contributor

feat(entrypoints): add HTTP server sidecar alongside smg gateway

#308 opened May 29, 2026 by qywu Collaborator

[WIP] refactor + perf(spec-decode): refactor #217 + add prefill scope

#304 opened May 28, 2026 by rjzhb Contributor

(feat) L3 KVStore: prefetch and backup support

#293 opened May 28, 2026 by ehuohz

perf: Optimize inter-iteration small op

#291 opened May 28, 2026 by yweng0828 Contributor

[Perf] Enable missing FlashInfer MoE autotuning

#290 opened May 28, 2026 by mmangkad Contributor

perf(model_loader): multi-threaded safetensors weight loading

#287 opened May 28, 2026 by yuanqingz

fix(PD): fix PD speculative bootstrap input seeding

#286 opened May 28, 2026 by XucSh Contributor

chore: use -O3 -use_fast_math for tokenspeed_kernel compilation

#285 opened May 28, 2026 by syuoni Member

Add Triton sampling backends alongside FlashInfer

#280 opened May 27, 2026 by FlamingoPg Contributor

feat(memory-saver): optional CPU staging for round-trip weight preservation

#275 opened May 27, 2026 by qywu Collaborator • Draft

3 of 5 tasks

feat(memory-saver): wrap CUDA graphs and attention workspaces in saver.region()

#274 opened May 27, 2026 by qywu Collaborator • Draft

3 of 5 tasks

feat(scheduler): per-adapter KV prefix-cache namespace + max_loras batch cap

#268 opened May 26, 2026 by qywu Collaborator

5 tasks

feat(spec-decode): add native DFlash support

#263 opened May 26, 2026 by mesaleh

fix(deepseek): guard missing quant weight_block_size

#261 opened May 26, 2026 by mesaleh

fix(trtllm-mla): make spec-decode CUDA graph capture causal

#260 opened May 26, 2026 by mesaleh

perf: TokenSpeed MLA decode kernel optimization for num_heads=16

#255 opened May 26, 2026 by dishengbin Contributor • Draft

feat(eplb): eplb support high priority

#251 opened May 26, 2026 by XucSh Contributor
