feat: add conservative multi-GPU simulation #113

Open
Andyyyy64 wants to merge 1 commit into main from feature/multi-gpu-fit-simulation

@Andyyyy64

Owner

summary

  • accepts repeated, comma-separated, and count shorthand --gpu specs, including 2x RTX 4090
  • computes a conservative effective VRAM budget for multi-GPU fit checks instead of treating raw VRAM as one perfect pool
  • keeps multi-GPU speed estimates at low confidence and applies a conservative speed factor
  • exposes multi-GPU metadata in JSON output
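
As a rough illustration of the three `--gpu` input forms listed above, here is a minimal sketch of how they could be expanded into one entry per GPU. The helper name `expand_gpu_specs` and the regular expression are hypothetical and are not taken from the whichllm source; the actual parser in this branch may differ, for example in how it handles shorthand without a space after the `x`.

```python
import re

# Hypothetical pattern for count shorthand such as "2x RTX 4090".
# Assumption: a space is required after the "x" so that a model name
# which happens to start with digits and "x" is not misread as a count.
_COUNT_PREFIX = re.compile(r"^(\d+)\s*[xX]\s+(.+)$")


def expand_gpu_specs(values: list[str]) -> list[str]:
    """Expand raw --gpu values into one GPU name per physical device.

    `values` holds one string per --gpu occurrence (repeated flags).
    Each string may contain comma-separated specs, and each spec may
    carry a count prefix.
    """
    gpus: list[str] = []
    for value in values:                # repeated --gpu flags
        for part in value.split(","):   # comma-separated specs
            part = part.strip()
            if not part:
                continue                # tolerate stray commas
            match = _COUNT_PREFIX.match(part)
            if match is None:
                gpus.append(part)       # plain spec, one GPU
                continue
            count = int(match.group(1))
            if count < 1:
                raise ValueError(f"GPU count must be at least 1: {part!r}")
            gpus.extend([match.group(2).strip()] * count)
    return gpus


if __name__ == "__main__":
    print(expand_gpu_specs(["2x RTX 4090"]))
    print(expand_gpu_specs(["RTX 4090, RTX 3060"]))
    print(expand_gpu_specs(["RTX 4090", "RTX 3060"]))
```

Normalizing every form to a flat list keeps the later fit math independent of how the user typed the specs.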

scope

This is the fit simulation pass for #112. It does not try to model exact tensor parallel or data parallel throughput, PCIe lane layout, NVLink, NCCL/RCCL behavior, or backend-specific tensor splits. Those belong in #52.
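
To make "conservative effective VRAM budget" concrete, the sketch below shows one way such a budget could be computed without modeling topology. Everything in it is an assumption for illustration: the function name `effective_vram_gb`, the per-GPU reserve, and the split efficiency are hypothetical placeholders, not the values or formula used in this PR.

```python
def effective_vram_gb(
    vram_gb: list[float],
    per_gpu_reserve_gb: float = 1.0,   # assumed per-device overhead, illustrative only
    split_efficiency: float = 0.9,     # assumed loss from uneven layer splits, illustrative only
) -> float:
    """Return a conservative VRAM budget for a fit check.

    A single GPU keeps its raw VRAM. With several GPUs, each device
    first loses a fixed reserve, and the pooled remainder is scaled
    down instead of being treated as one perfect pool.
    """
    if not vram_gb:
        raise ValueError("at least one GPU is required")
    if len(vram_gb) == 1:
        return vram_gb[0]
    usable = sum(max(v - per_gpu_reserve_gb, 0.0) for v in vram_gb)
    return usable * split_efficiency


if __name__ == "__main__":
    # 2x 24 GB and a mixed 24 GB + 12 GB pair, matching the kinds of
    # setups listed under validation.
    print(effective_vram_gb([24.0, 24.0]))
    print(effective_vram_gb([24.0, 12.0]))
```

Under these assumed parameters the multi-GPU budget is always strictly below the raw sum, which is the property the fit check relies on; the exact throughput and interconnect effects remain out of scope here.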

Fixes #112.
Refs #65, #52, #84, #110.

hardware feedback

@cobra91 @theodufort @Honghe, if you still have access to the multi-GPU systems from #65, #84, or #110, could you try this branch and paste the hardware panel plus the top results? I mainly want to see whether the detected GPU list, effective VRAM warning, and top recommendations look sane on real hardware.

Suggested commands:

uvx --from "git+https://github.com/Andyyyy64/whichllm.git@feature/multi-gpu-fit-simulation" whichllm hardware
uvx --from "git+https://github.com/Andyyyy64/whichllm.git@feature/multi-gpu-fit-simulation" whichllm --status --top 5 --evidence any

@0xDE57, I kept PCIe lane and interconnect modeling out of this PR. The fit math is conservative, but topology-aware deployment strategy should stay in #52.

validation

  • uvx ruff check .
  • uvx ruff format --check .
  • uv run python -m compileall -q src tests
  • uv run pytest (350 passed)
  • manual CLI checks for single GPU, repeated --gpu, comma-separated specs, count shorthand, invalid --vram, JSON output, 2x RTX 4090, mixed RTX 4090 + RTX 3060, and 4x RTX 4090

Andyyyy64 marked this pull request as ready for review on June 14, 2026 06:59
Development

Successfully merging this pull request may close these issues.

Implement maintainer-owned multi-GPU fit simulation