INVESTOR CORNER: ARM-based Microservers are Dead; Long Live ARM-based Servers

By Patrick Moorhead - January 6, 2014
Moor Insights & Strategy “Investor Corner” is content written specifically for the professional investment community. This analysis was written by Steven Eliscu as part of Investor Corner, L-sq Advisors, January 5, 2013. See important disclosures and disclaimers below.
In the world of server processors, especially the ~$2.5bn (2013) opportunity for x86 processors used in the cloud, the potential for disruption over the next 2-3 years is as great as it has been at any time since the original Intel Pentium Pro debuted in 1995. At that time, Intel undercut proprietary RISC-based server approaches and, by leveraging its dominance in PCs, became the leading server/workstation architecture on a unit basis over the next few years. In a similar manner, ARM is leveraging its dominance in mobile processors and seeking to do to Intel in server processors what Intel did to RISC-based approaches: offer a disruptive, lower-cost approach to server workload computation.
As Intel is well-schooled in the threats from lower-cost emerging technologies described in “The Innovator’s Dilemma” by Clayton Christensen, it has sought to preempt an attack from ARM in part by repurposing its Atom processors, used in mobile computing, for the server market. In the 2010 time frame, it worked with then-independent SeaMicro, which defined the category of servers built from a sea of tightly interconnected lightweight CPUs; such a server could deliver equivalent functionality while consuming one-fourth the power in one-sixth the floor space vs. traditional rack servers for a class of data center applications such as web serving. Intel dubbed this class of box a “microserver” and has forecast that the portion based on Atom or ARM will grow to 3% of the total server market by 2015. With a potential market penetration of 3% for microserver-class processors, why has there been so much fuss about ARM invading Intel’s space?
The recent closure of microserver processor pioneer Calxeda only reinforces Intel’s view that the market potential is small. Indeed, applications for microservers are likely to remain limited to dedicated web hosting and other very specific so-called “wimpy-core” tasks. Although one could argue that the potential for microservers is materially larger than 3% of the total server market, it is still likely to be a niche, as over 80% of Intel’s unit sales are Xeon E5-class, what Intel often refers to as its “volume” server chips. This is the sweet spot of the market and is what powers most data centers. Intel has argued that both the heterogeneity of datacenter workloads and customers’ desire for a homogeneous deployment environment require a processor such as the Xeon E5, which has: 1) excellent single-thread performance, 2) large caches and support for copious amounts of external memory (i.e., hundreds of gigabytes), and 3) optimal power efficiency. However, even as Intel’s ability to make Xeons well suited for general-purpose server workloads in high volume is a key strength, it may also be its Achilles heel, as we discuss below.
The ARM-based server has been largely synonymous with the term microserver (especially as AMD now owns SeaMicro and is tailoring its own ARM-based server processors to power future generations of SeaMicro machines); however, the more important challenge from the ARM-based approaches is to the Xeon E5. In particular, Cavium has set the bar at outperforming Xeon E5 by 5-10x on a performance/watt/dollar metric: think 2x the performance at half the power and half the cost. Is this possible for a broad enough class of workloads?
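As a quick check on the target itself, the "2x the performance at half the power and half the cost" example can be worked through in a few lines. This is only an illustrative sketch: the function name is hypothetical, the Xeon E5 is normalized to 1.0 on every axis, and the metric is assumed to be performance divided by the product of power and cost, which is one plausible reading of "performance/watt/dollar" rather than a definition published by Cavium or Intel.

```python
def perf_per_watt_per_dollar(performance, power, cost):
    """Composite metric: performance divided by (power * cost).

    All inputs are relative to a baseline chip normalized to 1.0,
    so the result is a multiple of the baseline's score.
    """
    if power <= 0 or cost <= 0:
        raise ValueError("power and cost must be positive")
    return performance / (power * cost)


# Baseline: Xeon E5 normalized to 1.0 on every axis (an assumption for illustration).
xeon_e5 = perf_per_watt_per_dollar(performance=1.0, power=1.0, cost=1.0)

# The article's example: 2x the performance at half the power and half the cost.
challenger = perf_per_watt_per_dollar(performance=2.0, power=0.5, cost=0.5)

advantage = challenger / xeon_e5
print(f"Advantage over baseline: {advantage:.1f}x")
```

Under these assumptions the ratio comes out to 8x, which sits inside the quoted 5-10x range. Because the three factors multiply, a moderate gain on each axis compounds into a large composite advantage.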
The answer is likely yes, although possible does not imply probable. Here are the details as to how this advantage could be achieved:
1) >>2x the cores: Xeon E5 in the 2014 time frame will be based on the 22nm Haswell microarchitecture and, according to leaks in the trade press, will have up to 12 cores per processor. In contrast, Cavium is already delivering 48-core processors with its 28nm MIPS-based Octeon III chip. Using this as a baseline for its future ARM-based server processor, Cavium could deliver 4x as many cores as Intel (the primary difference between Cavium’s ARM- and MIPS-based cores will be the instruction decode pipeline, with the other sections of the core being very similar). To complement these cores, Cavium will employ hardware accelerators that should provide a performance boost for the specific data center workloads it has targeted. Broadcom (through its NetLogic acquisition) is likely to take a similar approach, although it probably won’t be in production until 2016, a year or more after Cavium, as it is using a much more advanced 16nm FinFET process. Of course, single-thread performance is still very important, but if those cores could realize 70% of the single-thread performance of a Xeon E5 and are designed to scale well across cores for their target workloads, they could still achieve >2x the performance of Xeon at the chip level (4x the cores at 0.7x the per-core performance is 2.8x under ideal scaling). In contrast, AMD and Applied Micro may both find themselves at a major disadvantage if they don’t quickly boost core count well beyond their initial 8-core devices.
2) Lower power consumption: power consumption is dictated primarily by the laws of physics (i.e., the number of transistors and the process technology), not by instruction set architecture (ISA), so debating the merits of the ARM ISA vs. x86 largely misses the point. Using the ARM ISA does not magically make a chip implementation lower-power than x86. However, smaller (ARM) cores with smaller caches, intelligently complemented with hardware accelerators, will use less power per core than bigger cores (whether they be ARM or x86), even with less advanced process technology.
3) Cost: Intel’s Xeon E5 average selling price is likely in the $600 range, and its gross margin on Xeon E5 is likely in the 90% range. Even if its competitors priced their chips to yield 70% gross margin and had 50% higher cost because of the process node disadvantage (28nm vs. Intel at 22nm in 2014), they could still undercut Intel by 50% on processor price (a roughly $60 unit cost for Intel implies about $90 for a competitor, which at 70% gross margin is a $300 price). Of course, the server processor price is only one component of the overall server cost, but for the mega-datacenters deploying hundreds of thousands of servers per year, the savings would still be meaningful.
While this analysis suggests the ARM-based server processor entrants, especially Cavium and Broadcom, have the potential to be disruptive, Intel is clearly not standing still. Although Intel’s Atom-based server processors have limitations, Intel will also introduce in 2H14 a system-on-a-chip device based on its 14nm Broadwell architecture that could offer Xeon-class performance with a high level of integration and single-digit-watt power consumption. Thus, even with Calxeda fading from view, the confluence of upcoming ARM and x86 server processor introductions in 2014 is likely to rekindle the debate over whether ARM can penetrate the x86 juggernaut. Within this debate, the discussion will likely shift from microservers to mainstream server processors, a business that is a key driver of Intel’s profitability and one that it will defend vigorously.
DISCLOSURES
I do not own a stock position in any company whose stock is mentioned in this article. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it.
L-sq Advisors is a consulting firm that may have in the past, present or future solicited and/or generated consulting services from any company mentioned in this article.
Patrick founded the firm based on his real-world technology experiences and his understanding of what he wasn't getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of "power" (ARInsights) and in "press citations" (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors.