The last couple of years in datacenter compute have been dominated by two stories: Arm establishing a foothold in the cloud, and AMD establishing a foothold everywhere. Yes, Intel still owns the vast majority of the datacenter, but it certainly doesn’t feel that way. This feeling is partly because of the amazing story that is AMD EPYC and partly because of Intel’s lack of any significant response. Sure, when AMD first reemerged in 2017, it was understandable for Intel to be a little dismissive. But after two successive generational launches that saw the company eat into Intel’s datacenter share, it almost felt like 2005–2006 all over again, when AMD gained over 20% market share with Opteron.
With 2021 came much change at Intel. Pat Gelsinger left VMware to return to the company as CEO, and we started seeing green shoots as he refocused the business. And I believe what Intel showed at its recent Architecture Day and Hot Chips presentations demonstrates that the company is getting back on track in the datacenter.
Intel has always seemed to “get it”
Before getting into some of the details around Intel’s architecture disclosure, it is essential to point out something about Intel that can sometimes get lost. The company understands something very critical to its long-term success. And that is this – the applications running in our datacenters tomorrow will look vastly different from those running today. And the infrastructure required to run those applications and workloads must look different, too.
Just as Intel supplanted “big iron” in the datacenter a couple of decades back, newer architectures with different performance-power profiles are making their way into today’s datacenter. The company that will win the next generation will be the one with the portfolio to support these emerging workloads – and the confidence of enterprise IT to support what comes next.
The Efficient core
Intel introduced two new core architectures as part of its rollout – the Efficient core (E-core) and the Performance core (P-core). As the names imply, the Efficient core design goals are around scalability and density, whereas the Performance core targets those workloads that require the best performance.
While the Efficient core only seems to show up in Intel’s client (Alder Lake) slides, I can see where this core could be a good fit in different deployment models, such as scale-out cloud environments and some edge instances.
The efficiency gains that Intel demonstrates in its Efficient core are pretty spectacular. Intel says the Efficient core can deliver 40% more performance than a single Skylake core. And four single-threaded Efficient cores can deliver the same performance as two Skylake cores running four threads – at 80% less power. I like how Intel describes its Efficient core – “designed for throughput,” enabling scalable multi-threaded performance for modern multitasking.
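As an illustrative back-of-the-envelope check on those figures – using only the numbers Intel quoted, with the two-Skylake-core baseline normalized to 1.0 for both performance and power – the implied performance-per-watt advantage works out like this:

```python
# Back-of-the-envelope arithmetic on Intel's quoted E-core figures.
# Baseline: two Skylake cores running four threads, normalized to
# performance = 1.0 and power = 1.0 (illustrative units, not measurements).
skylake_perf = 1.0
skylake_power = 1.0

# Intel's claim: four single-threaded E-cores match that throughput
# while drawing 80% less power.
ecore_perf = skylake_perf                  # same throughput
ecore_power = skylake_power * (1 - 0.80)   # 80% less power

perf_per_watt_gain = (ecore_perf / ecore_power) / (skylake_perf / skylake_power)
print(round(perf_per_watt_gain, 2))  # -> 5.0, i.e., a 5x perf-per-watt advantage
```

In other words, if the claim holds, the E-core cluster is delivering roughly five times the work per watt of the Skylake baseline – the kind of ratio that matters in density-constrained cloud deployments.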
Performance is more than IPC
Intel’s Performance core reveal shows what the company is thinking about the performance characteristics of the workloads that will be powering the future datacenter. Specifically, that there will be requirements spanning the scalar, vector, and spatial spaces. And that specific microarchitecture design considerations can enable this wide range of workloads.
Intel’s new Advanced Matrix Extensions (Intel AMX) are a critical component in enabling compute-intensive workloads such as machine learning, greatly expanding the number of multiply-accumulate operations each core can execute per clock cycle. It is precisely this kind of enablement that demonstrates Intel’s understanding that best performance requires more than just strong integer performance. Instead, it’s strong integer performance combined with a compute complex that can support the unique requirements of the workloads growing in importance in the datacenter.
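To make the idea concrete, here is a conceptual sketch (in NumPy, not actual AMX code) of the tile-based operation AMX’s matrix unit performs in hardware: multiply small low-precision tiles and accumulate into a higher-precision result tile. The tile shapes below are illustrative, not the exact AMX register layout:

```python
import numpy as np

# Conceptual model of an AMX-style tile multiply: int8 input tiles,
# int32 accumulator tile (higher precision avoids int8 overflow).
# Tile dimensions here are illustrative assumptions.
M, K, N = 16, 64, 16

rng = np.random.default_rng(0)
a_tile = rng.integers(-128, 128, size=(M, K), dtype=np.int8)
b_tile = rng.integers(-128, 128, size=(K, N), dtype=np.int8)

# Accumulate the tile product into an int32 result tile
c_tile = np.zeros((M, N), dtype=np.int32)
c_tile += a_tile.astype(np.int32) @ b_tile.astype(np.int32)

# One such tile operation performs M*K*N multiply-accumulates
print(M * K * N)  # -> 16384 MACs per tile operation
```

The point of dedicated matrix hardware is that a single tile instruction retires thousands of these multiply-accumulates, rather than the handful per cycle a scalar or vector pipeline can manage.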
With the Performance core, Intel is making some bold claims. The company claims a 19% performance improvement over the 11th Generation Intel Core at the same frequency.
Sapphire Rapids – the datacenter SoC
Sapphire Rapids is the code name for the CPU that will follow Ice Lake in the Scalable Xeon roadmap. While we established that Intel’s Performance core would be a well-designed foundational part of Sapphire Rapids, there’s much more to this.
As you can see in the above graphic, Sapphire Rapids comprises four compute tiles connected via a high-speed interconnect. Each tile has a complete complex of compute (cores, acceleration engines), I/O, and memory. This design should give Intel greater packaging flexibility and support better performance and performance-per-watt in the datacenter.
One can look at the Sapphire Rapids complex and think it looks similar to other chiplet designs. And from a very high level, it does. What stands out about Intel is what it builds into the core and around the complex to deliver optimal performance. For instance, Sapphire Rapids will include two acceleration engines to offload common functions (data streaming, and cryptography and data de/compression). These engines offload considerable compute overhead from the cores, allowing for faster and more balanced workload performance. No application modifications are required, and no special architecting.
Two other improvements of note in Sapphire Rapids are in the areas of AI and container support. AMX (previously described) should drive substantial performance improvements for AI workloads, with native support for the major frameworks and libraries.
Intel claims up to a 69% improvement in container performance over Cascade Lake for Kubernetes workloads. The company attributes this to enhanced instructions, improved telemetry, and Sapphire Rapids’ use of acceleration engines.
At the end of the proverbial day, I don’t believe many IT professionals will consider whether AMX provides 8x the performance of AVX-512, or whether Sapphire Rapids Quick Assist Technology (QAT) will deliver a 98% reduction in core utilization for cryptographic functions. But they will notice three things if Sapphire Rapids lives up to Intel’s positioning:
- Better performance for datacenter workloads
- More consistent performance for those very same workloads
- Support for a greater variety of workloads
Intel entered the datacenter as a disruptor. Starting as a processor relegated to lightweight tasks such as directory services and file/print – the x86 architecture seemingly took over the enterprise datacenter overnight (any NetWare fans out there?). The cloudification of the datacenter has caused IT architects to concern themselves less with CPU architectures, and more with workload-tailored compute platforms. In this era of cloud-native and runtime environments – x86 v Arm v something else? It doesn’t matter. What matters is performance, consistency of performance, power, price, and security.
I write this because Intel’s Architecture Day seems to signal its understanding that it needs to return to its innovative roots to win in this CPU market. The core designs revealed, along with Sapphire Rapids, should position Intel quite competitively. I’m looking forward to the next reveal, when we can learn more about speeds, feeds, and time to market. Stay tuned.
Note: Moor Insights & Strategy writers and editors may have contributed to this article.