Multi-GPU Deployment Configuration and Optimal Parallel Strategy #52

### Problem

The project appears to support matching the best local LLM only for single-GPU scenarios.

However, I want to use multiple GPUs (e.g., 4 or 8) to run larger models that cannot fit on a single GPU, or to achieve higher throughput via parallel inference. How can I specify the target number of GPUs for deployment matching?

### Proposed Solution

1. Allow users to specify a target GPU count when searching for compatible models.
2. Provide the optimal parallel strategy configuration for model deployment.
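
To make the request concrete, here is a rough sketch of how the two items could fit together. Everything in it is hypothetical and not part of the project: the names `ParallelPlan` and `suggest_plan`, the parameters `gpus_per_node` and `usable_fraction`, and the heuristic itself (shard the weights evenly, keep tensor parallelism within one node, use pipeline stages across nodes). It only counts weight memory and ignores KV cache, activations, and attention-head divisibility, so a real implementation would need more checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParallelPlan:
    """Hypothetical result type: how the target GPUs would be used."""
    tensor_parallel: int    # GPUs that each layer is sharded across
    pipeline_parallel: int  # sequential stages the layers are split into


def suggest_plan(weights_gb: float, gpu_mem_gb: float, num_gpus: int,
                 gpus_per_node: int = 8,
                 usable_fraction: float = 0.9) -> Optional[ParallelPlan]:
    """Return a plan for num_gpus GPUs, or None if the weights do not fit.

    Assumption: weights are split evenly across all GPUs, and only
    usable_fraction of each GPU's memory is available for them.
    """
    if num_gpus < 1 or gpus_per_node < 1:
        raise ValueError("num_gpus and gpus_per_node must be at least 1")

    # Fit check on weights only (no KV cache or activation memory).
    if weights_gb / num_gpus > gpu_mem_gb * usable_fraction:
        return None

    # Largest divisor of num_gpus that still fits inside one node becomes
    # the tensor-parallel degree; the rest becomes pipeline stages.
    limit = min(num_gpus, gpus_per_node)
    tp = max(d for d in range(1, limit + 1) if num_gpus % d == 0)
    return ParallelPlan(tensor_parallel=tp, pipeline_parallel=num_gpus // tp)


if __name__ == "__main__":
    # 140 GB is roughly a 70B-parameter model stored in 16-bit precision.
    for gpus in (1, 4, 16):
        print(gpus, suggest_plan(weights_gb=140, gpu_mem_gb=80, num_gpus=gpus))
```

With an interface along these lines, the user-facing part of the request would reduce to one extra input (the target GPU count), and the matcher could report either a suggested plan or that the model does not fit.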

### Alternatives Considered

_No response_
