Model Evaluation Reports
Pre-rendered evaluation pages for popular models × mainstream GPUs. Every page is the result of an actual llm-cal run.
Qwen/Qwen2.5-72B-Instruct
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| A100-80G | 145.4 GB | BF16 | 8 | → |
| H100 | 145.4 GB | BF16 | 8 | → |
| H800 | 145.4 GB | BF16 | 8 | → |
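
The 145.4 GB weight figure for Qwen/Qwen2.5-72B-Instruct is consistent with a simple estimate of parameter count times bytes per parameter. The sketch below illustrates that arithmetic only; it is not necessarily how llm-cal derives its numbers. The parameter count (about 72.7 billion), the bytes-per-parameter table, and the helper name `estimate_weight_gb` are assumptions introduced for illustration and do not come from this page.

```python
# Assumed storage cost per parameter, in bytes, for a few common dtypes.
BYTES_PER_PARAM = {"BF16": 2.0, "FP16": 2.0, "FP8": 1.0}


def estimate_weight_gb(num_params: float, dtype: str) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: roughly 72.7 billion parameters (not stated on this page).
print(round(estimate_weight_gb(72.7e9, "BF16"), 1))
```

This estimate covers weights only. The recommended GPU counts in the tables presumably also reflect factors such as KV cache, activations, and parallelism constraints, so they cannot be derived by dividing the weight size by per-GPU memory.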
Qwen/Qwen2.5-7B
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| L40S | 15.2 GB | BF16 | 4 | → |
| RTX4090 | 15.2 GB | BF16 | 7 | → |
Qwen/Qwen3-30B-A3B
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| A100-80G | 61.1 GB | BF16 | 4 | → |
| H100 | 61.1 GB | BF16 | 4 | → |
deepseek-ai/DeepSeek-V3
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| B200 | 688.6 GB | FP8 | 8 | → |
| H100 | 688.6 GB | FP8 | 8 | → |
| H800 | 688.6 GB | FP8 | 8 | → |
deepseek-ai/DeepSeek-V4-Flash
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| 910B4 | 159.6 GB | FP4_FP8_MIXED | 8 | → |
| B200 | 159.6 GB | FP4_FP8_MIXED | 2 | → |
| H100 | 159.6 GB | FP4_FP8_MIXED | 8 | → |
| H800 | 159.6 GB | FP4_FP8_MIXED | 8 | → |
microsoft/Phi-4
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| L40S | 29.3 GB | BF16 | 8 | → |
| RTX4090 | 29.3 GB | BF16 | 8 | → |
mistralai/Mixtral-8x7B-v0.1
| GPU | Weights | Quantization | Recommended GPU count | Link |
| --- | --- | --- | --- | --- |
| A100-80G | 93.4 GB | BF16 | 8 | → |
| H100 | 93.4 GB | BF16 | 8 | → |