The performance comparison with the H100 isn't apples to apples. CS3 has a much larger area: the ratio of CS3 area to H100 area is approximately 57. So 57 H100s with perfect scaling would, by area alone, be the fair baseline. It would make more sense to compare against an Nvidia cluster of GPUs with roughly the same total area as CS3.
I suppose the argument is that perfect scaling with H100s is harder to achieve. I still agree that the current approach of comparing 1 H100 to 1 CS3 or CS2 is an unfair / incorrect comparison.
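For anyone who wants to sanity-check the ~57x figure, here is a rough sketch. The die areas are assumptions taken from public spec sheets, not from this thread, and the function names and the scaling-efficiency parameter are made up for illustration.

```python
# Die areas in mm^2. These are ASSUMED values from public spec sheets,
# not numbers given in this thread.
CS3_WAFER_AREA_MM2 = 46_225  # Cerebras WSE-3 wafer (assumed)
H100_DIE_AREA_MM2 = 814      # Nvidia H100 die (assumed)


def area_ratio(big_mm2: float, small_mm2: float) -> float:
    """How many small dies fit in the silicon area of the big one."""
    return big_mm2 / small_mm2


def effective_gpus(n_gpus: float, scaling_efficiency: float) -> float:
    """GPU-equivalents actually delivered by a cluster.

    scaling_efficiency is a hypothetical knob: 1.0 means perfect scaling,
    anything lower models communication overhead between GPUs.
    """
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return n_gpus * scaling_efficiency


ratio = area_ratio(CS3_WAFER_AREA_MM2, H100_DIE_AREA_MM2)
print(f"area ratio: {ratio:.1f}")  # area ratio: 56.8

# With perfect scaling, an area-matched cluster delivers ~57 GPUs' worth.
print(f"perfect scaling: {effective_gpus(ratio, 1.0):.1f}")

# With an illustrative (made-up) 80% efficiency it delivers fewer.
print(f"80% scaling: {effective_gpus(ratio, 0.8):.1f}")
```

The point of the efficiency knob is the reply above: the area-matched cluster is only a fair baseline to the extent that the GPUs actually scale, so the gap between the last two numbers is what a single wafer is competing on.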
Impressive technological achievement. Can't wait to try these 24T-parameter models!
If only you could mix this with NorthPole from IBM.
Going to war with Nvidia.
looks like I can't play solitaire on this hardware