Nvidia’s Mobile Custom 64-bit ARM CPU: It’s Sooner Than You May Think

Since Apple released the iPhone 5s, there has been a slew of articles on the merits of 64-bit computing in a mobile device. I participated in the punditry because, a decade ago, I spent years knee-deep in 64-bit, bringing it first to the Windows PC and mainstream server platform. I have a bit of experience on the matter. After the Apple iPhone 5s launched with the A7 chip, Samsung quickly reacted, saying its next smartphones would be 64-bit. Qualcomm was then cited by the press, accurately or inaccurately, as saying 64-bit was a “gimmick.” What about Nvidia’s Tegra? Nvidia has confirmed that the follow-on to its Logan chip will be 64-bit and will use a custom core, like Apple’s Cyclone (the name of Apple’s 64-bit custom CPU). Why does this make sense? Read on.

This all goes back to CES 2011, when Nvidia announced an ARM Cortex-A15 core license and, more importantly, an architecture license for ARM’s 64-bit “future processor architecture,” which was subsequently called “v8”. Nvidia called the custom SoC “Project Denver”. As Nvidia wasn’t about to disclose specifics on the targeted application at that point, speculation ensued on the type of chip. Would it be a supercomputer chip? For all PCs? For notebooks? It could be all of those, but what Nvidia has confirmed to me is that the Logan SoC follow-on for mobile devices does have a custom 64-bit processor core, based on Denver. Doing a back-of-the-envelope calculation, if Nvidia started development in late 2010, before its CES 2011 announcement, it should be able to sample in early 2014 and have tablets, and possibly smartphones, in market six months after that. This puts Nvidia on pace to possibly deliver the first custom 64-bit mobile ARM processor for Android devices.

What is so special about a custom, 64-bit CPU core? As we have seen from the performance scores of Apple’s A7, a well-designed and well-executed custom SoC based on ARM’s v8 processor architecture can really scream. As Apple and Qualcomm have shown, you can get a lot out of having an ARM architecture license. As I have said many times, the CPU is one of many mobile SoC “engines” (GPU, DSP, connectivity, multimedia) that must be optimized, but it is an important one.

Nvidia’s Tegra has been a mixed bag when it comes to execution. Tegra 2 and 3 were deemed commercial successes, Tegra 1 was not, and while more than 10 devices powered by Tegra 4 have been announced, the jury is still out. So what does it take to execute on a custom CPU core?

Let’s look at Nvidia’s processor team. It has been in existence since 2006 and has hardened multiple “off-the-shelf” ARM cores. Unknown to most, Nvidia’s engineers have been working on the 64-bit ARM Denver core for at least three years, since before the CES 2011 announcement. Based in Portland, Oregon, the team consists of former CPU jockeys from Intel, AMD, HP, Sun, and Transmeta with experience in superscalar and OoO (out-of-order) execution design, microcode, VLIW, hyper-threading, and multi-core. Does this experience and background guarantee success? No, but it provides the opportunity to succeed, and succeed big if you look at what others have accomplished.

With its Logan follow-on, Nvidia has the opportunity to put itself in the same small company as Apple and Qualcomm, mobile ARM architecture licensees that ship their own custom cores. Is it risky? Sure it is, but then again, high risk and high return are what you should expect from Nvidia.