We've had pretty high parallelism in server CPUs for a while, e.g. the UltraSPARC T1 had 32 threads in 2005, and later generations went up to 128 threads (T5). But a high number of threads and cores is a worse solution than being able to sustain big single-threaded speedups, because most programs aren't multithreaded; parallel programming is just too hard.
The T1 was unusual at the time. It did have 32 threads, but provided by 8 VERY anemic cores. It was only competitive in some very niche scenarios. A "real" 8-core SMP Sun box was still pretty much a full rack and 6 figures.
I remember when the big company I was working for started using T1s for databases (I guess driven by Oracle marketing and a change in the licensing model) and asked us "consumers" to start using those systems.
I was skeptical, so I downloaded some SPECint benchmark results ( https://www.spec.org/benchmarks.html ) for T1, POWER, and Xeon, compared them, and thought "hmm, probably a bad idea". I then ended up having to invest quite some time convincing my management to keep using "normal" servers.
On the other hand, I later had a bit of fun hearing stories from colleagues about how slow those T1 machines were once they started running their normal DB workloads on them: after months of everybody complaining about bad performance, everybody went back to normal servers. A lot of time, money (and nerves) spent for nothing.
Yep. We've historically had a lot of CPUs and multi-CPU machines that were good for running "embarrassingly parallel" loads but wasted capacity for general-purpose use, a bit like this one! More hardware threads or cores have traditionally not been good news from a software POV.
This kind of machine is nice to look at from afar, but for most apps, getting anywhere near a 64-fold speedup would mean scrapping the code base and doing one failed rewrite followed by one marginally successful one, eating up ruinous amounts of calendar time and engineering resources.
But of course it's different now than in 2005, because today we can't do any better.
It’s not, though, in the server space. Many server workloads are inherently parallel because they support multiple concurrent users. Think database servers or web servers.
By sustaining single-core speedups, I meant big speedups, like we were observing in the days before mainstream CPUs resorted to multicore. It was rather more than 10%.
And we'd still be able to make processors with lots of slow threads, of course, for applications where that's cost- or power-effective; that's comparatively very easy.
Yes, having one core at 2x clock is definitely better than having two cores at 1x clock, I'd agree with that.
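To put some rough numbers on that intuition, here's a small sketch of Amdahl's law (my own illustration, not from the thread); the parallel fraction `p = 0.9` is an assumed example value:

```python
# Amdahl's law: overall speedup of a program whose parallelizable
# fraction is p, run on n cores, each at clock multiplier c relative
# to the baseline core.
def amdahl(p, n, c=1.0):
    return c / ((1 - p) + p / n)

# One core at 2x clock speeds up everything, serial parts included:
single_fast = amdahl(p=0.9, n=1, c=2.0)   # = 2.0

# Two cores at 1x clock only help the parallel 90%:
dual_slow = amdahl(p=0.9, n=2, c=1.0)     # ≈ 1.82

# Even 64 slow cores are capped hard by the serial 10%:
many_slow = amdahl(p=0.9, n=64, c=1.0)    # ≈ 8.77
```

So even for a program that's 90% parallel, the single fast core wins over two slow ones, and 64 cores never get close to a 64x speedup.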
These days, however, the difference between the fastest-core performance of low-core-count processors and that of high-core-count processors is a lot less than 50% (comparing like for like, e.g. AMD to AMD and Intel to Intel).