
Model evaluations

Pre-rendered evaluation pages for popular models on common GPUs. Every page is real llm-cal output.
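The Weight column in the tables below is consistent with a simple parameter-count times bytes-per-parameter estimate (e.g. Qwen2.5-72B at BF16: roughly 72.7B params × 2 bytes ≈ 145.4 GB). A minimal sketch of that arithmetic — the function name is mine, not part of llm-cal, and FP8/mixed-precision models carry extra overhead (scales, higher-precision layers) that this ignores:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Estimate raw weight memory in decimal GB: params * bytes per param."""
    return n_params * bytes_per_param / 1e9

# Qwen2.5-72B-Instruct: ~72.7B params at BF16 (2 bytes/param)
print(round(weight_memory_gb(72.7e9, 2.0), 1))  # 145.4

# microsoft/Phi-4: ~14.7B params at BF16
print(round(weight_memory_gb(14.65e9, 2.0), 1))  # 29.3
```

Note this is weights only; a production deployment also needs memory for KV cache and activations, which is why the Prod GPUs counts exceed the bare minimum needed to hold the weights.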

Qwen/Qwen2.5-72B-Instruct

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| A100-80G | 145.4 GB | BF16 | 8 |
| H100 | 145.4 GB | BF16 | 8 |
| H800 | 145.4 GB | BF16 | 8 |

Qwen/Qwen2.5-7B

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| L40S | 15.2 GB | BF16 | 4 |
| RTX4090 | 15.2 GB | BF16 | 7 |

Qwen/Qwen3-30B-A3B

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| A100-80G | 61.1 GB | BF16 | 4 |
| H100 | 61.1 GB | BF16 | 4 |

deepseek-ai/DeepSeek-V3

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| B200 | 688.6 GB | FP8 | 8 |
| H100 | 688.6 GB | FP8 | 8 |
| H800 | 688.6 GB | FP8 | 8 |

deepseek-ai/DeepSeek-V4-Flash

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| 910B4 | 159.6 GB | FP4_FP8_MIXED | 8 |
| B200 | 159.6 GB | FP4_FP8_MIXED | 2 |
| H100 | 159.6 GB | FP4_FP8_MIXED | 8 |
| H800 | 159.6 GB | FP4_FP8_MIXED | 8 |

microsoft/Phi-4

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| L40S | 29.3 GB | BF16 | 8 |
| RTX4090 | 29.3 GB | BF16 | 8 |

mistralai/Mixtral-8x7B-v0.1

| GPU | Weight | Quant | Prod GPUs |
| --- | --- | --- | --- |
| A100-80G | 93.4 GB | BF16 | 8 |
| H100 | 93.4 GB | BF16 | 8 |