Model evaluations
Pre-rendered evaluation pages for popular models on common GPUs. Every page is real llm-cal output; the Weight column gives the model's weight footprint at the listed quantization, and Prod GPUs gives the GPU count used for the production configuration.
Qwen/Qwen2.5-72B-Instruct
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| A100-80G | 145.4 GB | BF16 | 8 | → |
| H100 | 145.4 GB | BF16 | 8 | → |
| H800 | 145.4 GB | BF16 | 8 | → |
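The Weight column follows directly from parameter count and quantization width. A minimal sketch of that arithmetic in Python — the helper name and the parameter count are illustrative assumptions, not llm-cal's API, and this ignores KV cache, activations, and runtime overhead that a full sizing tool also accounts for:

```python
# Approximate weight footprint: parameter count × bytes per parameter.
# Illustrative sketch only; real sizing also covers KV cache and activations.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "FP4": 0.5}

def weight_gb(num_params: float, quant: str) -> float:
    """Rough weight size in decimal GB for a given quantization."""
    return num_params * BYTES_PER_PARAM[quant] / 1e9

# Qwen2.5-72B-Instruct: ~72.7B params at 2 bytes each in BF16
print(f"{weight_gb(72.7e9, 'BF16'):.1f} GB")  # → 145.4 GB, matching the table
```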
Qwen/Qwen2.5-7B
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| L40S | 15.2 GB | BF16 | 4 | → |
| RTX4090 | 15.2 GB | BF16 | 7 | → |
Qwen/Qwen3-30B-A3B
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| A100-80G | 61.1 GB | BF16 | 4 | → |
| H100 | 61.1 GB | BF16 | 4 | → |
deepseek-ai/DeepSeek-V3
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| B200 | 688.6 GB | FP8 | 8 | → |
| H100 | 688.6 GB | FP8 | 8 | → |
| H800 | 688.6 GB | FP8 | 8 | → |
deepseek-ai/DeepSeek-V4-Flash
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| 910B4 | 159.6 GB | FP4_FP8_MIXED | 8 | → |
| B200 | 159.6 GB | FP4_FP8_MIXED | 2 | → |
| H100 | 159.6 GB | FP4_FP8_MIXED | 8 | → |
| H800 | 159.6 GB | FP4_FP8_MIXED | 8 | → |
microsoft/Phi-4
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| L40S | 29.3 GB | BF16 | 8 | → |
| RTX4090 | 29.3 GB | BF16 | 8 | → |
mistralai/Mixtral-8x7B-v0.1
| GPU | Weight | Quant | Prod GPUs | Page |
| --- | --- | --- | --- | --- |
| A100-80G | 93.4 GB | BF16 | 8 | → |
| H100 | 93.4 GB | BF16 | 8 | → |