A3000 Laptop 6GB: Recommendations very poor #76

### Description

For some months I have been pondering which LLM to run (when I have time). I have watched new models, new ways of running them (MTP), and new distilled versions being announced, and I have read a lot of Reddit posts from people wanting to do similar things with similar hardware. So I have a reasonable idea of what might be best for my hardware.

The output is below, under Hardware Info.

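On the first run it failed to fetch some data and reported errors (a `429 Too Many Requests` on the leaderboard fetch and a failed AA Index fetch, shown at the top of the output). When I ran it again there was no error, but the results were exactly the same.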

A few things in the results that stood out:

* The specific use case matters, but there is no way for me to state that I want e.g. agentic coding.
* It recommends Qwen3.6 27B dense rather than Qwen3.6 35B A3B MoE, which would run much better with hybrid (CPU + GPU) inference.
* No TPS (tokens per second) estimates, which are absolutely essential for evaluating LLMs: 35 TPS vs. 1 TPS is a huge difference.
* Q8 rather than Q5 or Q6? Really?
* No MTP evaluations.
* No distill evaluations.
* No data on which runner should be used with which parameters.
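To illustrate the kind of TPS estimate I mean, here is a rough back-of-envelope sketch. It is not how `whichllm` works. It assumes token generation is purely memory-bandwidth bound, that the weights read per token are split between VRAM and system RAM in the same proportion as the whole model, and it ignores KV cache, context size and compute. The bandwidth figures are hypothetical placeholders, not measurements of my machine, and the bits-per-weight values are the nominal GGUF block sizes (real files mix tensor types).

```python
# Rough, bandwidth-bound tokens-per-second estimate for partial GPU offload.
# All names and numbers here are illustrative assumptions.

# Nominal bits per weight of the GGUF block formats (approximate for real files).
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.5625, "Q5_K": 5.5}


def weights_gb(params_b: float, quant: str) -> float:
    """Approximate weight size in GB for `params_b` billion parameters."""
    return params_b * BITS_PER_WEIGHT[quant] / 8


def offloaded_fraction(total_params_b: float, quant: str, vram_gb: float) -> float:
    """Fraction of the weights that does not fit in VRAM."""
    total_gb = weights_gb(total_params_b, quant)
    return max(0.0, 1.0 - vram_gb / total_gb)


def estimate_tps(total_params_b: float, active_params_b: float, quant: str,
                 vram_gb: float, gpu_bw_gbs: float, cpu_bw_gbs: float) -> float:
    """Estimate tokens/s assuming each token reads all active weights once."""
    cpu_share = offloaded_fraction(total_params_b, quant, vram_gb)
    active_gb = weights_gb(active_params_b, quant)
    # Time per token = bytes read from each memory divided by its bandwidth.
    seconds = (active_gb * (1.0 - cpu_share) / gpu_bw_gbs
               + active_gb * cpu_share / cpu_bw_gbs)
    return 1.0 / seconds


# Hypothetical placeholder bandwidths in GB/s (assumptions, not measured).
GPU_BW, CPU_BW, VRAM = 264.0, 50.0, 6.0

# Dense 27.8B at Q8_0: every parameter is active for every token.
dense_tps = estimate_tps(27.8, 27.8, "Q8_0", VRAM, GPU_BW, CPU_BW)
# 35B MoE with about 3B active parameters at Q5_K.
moe_tps = estimate_tps(35.0, 3.0, "Q5_K", VRAM, GPU_BW, CPU_BW)

print(f"dense offloaded: {offloaded_fraction(27.8, 'Q8_0', VRAM):.0%}")
print(f"dense estimate:  {dense_tps:.1f} tok/s")
print(f"MoE estimate:    {moe_tps:.1f} tok/s")
```

Even with numbers this crude, the dense Q8_0 pick lands in low single digits while the MoE lands an order of magnitude higher, which is the difference I would expect the recommendations to reflect. The offloaded share it computes for the 27.8B Q8_0 model is also close to the ~81% in Warning #1.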

### Steps to Reproduce

Run `whichllm`.

### Hardware Info

```shell
Leaderboard fetch failed: Client error '429 Too Many Requests' for url 'https://datasets-server.huggingface.co/rows?dataset=open-llm-leaderboard%2Fcontents&config=default&split=train&offset=3800&length=100'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429
AA Index fetch failed, will use fallback: __NEXT_DATA__ payload not found

╭──────────────────────────────────────────────────────────────────────────────────────────────── Hardware Info ────────────────────────────────────────────────────────────────────────────────────────────────╮
│ GPU 0: NVIDIA RTX A3000 Laptop GPU — 6.0 GB (CUDA 13.2) — BW: N/A                                                                                                                                             │
│ GPU 1: Intel(R) UHD Graphics — shared memory — BW: N/A                                                                                                                                                        │
│ CPU: Unknown CPU — 8 cores (AVX2)                                                                                                                                                                             │
│ RAM: 31.3 GB                                                                                                                                                                                                  │
│ Disk free: 224.8 GB                                                                                                                                                                                           │
│ OS: windows                                                                                                                                                                                                   │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

                                                 Recommended Models
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━┓
┃   # ┃ Model                                         ┃ Params ┃ Quant  ┃ Published  ┃ Downloads ┃ Score ┃ License  ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━┩
│   1 │ Qwen/Qwen3.6-27B                              │  27.8B │  Q8_0  │ 2026-04-21 │      5.2M │  56.7 │ apache-… │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   2 │ google/gemma-4-31B-it                         │  32.7B │  Q6_K  │ 2026-03-11 │     11.3M │  54.3 │ apache-… │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   3 │ google/gemma-4-26B-A4B-it                     │  26.5B │  Q8_0  │ 2026-03-11 │     11.5M │  47.5 │ apache-… │
│     │                                               │ (3.8B… │        │            │           │       │          │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   4 │ Qwen/Qwen3-30B-A3B                            │  30.5B │  Q6_K  │ 2025-04-27 │      2.1M │  47.5 │ apache-… │
│     │                                               │ (3.0B… │        │            │           │       │          │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   5 │ zai-org/GLM-4.7-Flash                         │  31.2B │  Q6_K  │ 2026-01-19 │      1.1M │  45.8 │ mit      │
│     │                                               │ (12.0… │        │            │           │       │          │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   6 │ Qwen/QwQ-32B                                  │  32.8B │  Q6_K  │ 2025-03-05 │     62.5K │  45.4 │ apache-… │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   7 │ openai/gpt-oss-20b                            │  21.5B │  Q8_0  │ 2025-08-04 │      7.9M │  45.0 │ apache-… │
│     │                                               │ (3.6B… │        │            │           │       │          │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   8 │ deepseek-ai/DeepSeek-R1-Distill-Qwen-32B      │  32.8B │  Q6_K  │ 2025-01-20 │    608.3K │  44.6 │ mit      │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│   9 │ mistralai/Mistral-Small-3.2-24B-Instruct-2506 │  24.0B │  Q8_0  │ 2025-06-19 │    632.7K │  43.9 │ apache-… │
├─────┼───────────────────────────────────────────────┼────────┼────────┼────────────┼───────────┼───────┼──────────┤
│  10 │ Qwen/Qwen3-14B                                │  14.8B │  Q8_0  │ 2025-04-27 │      1.7M │  43.3 │ apache-… │
└─────┴───────────────────────────────────────────────┴────────┴────────┴────────────┴───────────┴───────┴──────────┘
  Top pick confidence: Low (direct benchmark, gap +2.3, partial offload)
  Benchmark reference: 2026-05 curated snapshot; live AA / LiveBench / Aider merged when reachable.
  Speed caution: Low-confidence speed estimates in top ranks: #1, #2, #3
  Warning #1 Qwen3.6-27B: ~81% of layers will be offloaded to CPU RAM
  Warning #2 gemma-4-31B-it: ~79% of layers will be offloaded to CPU RAM
  Warning #3 gemma-4-26B-A4B-it: ~78% of layers will be offloaded to CPU RAM
```

### Python Version

3.14

### Operating System

Windows 11

### whichllm Version

0.5.7
