Lately there seems to be an unlimited thirst for GPU performance at the right power efficiency. Whether it is deep learning, object recognition, artificial intelligence, simulations, VR or AR, the industry desperately needs GPU improvements. Many within the graphics industry would agree that a new era of graphics performance and efficiency is upon us. This new era is partly thanks to the fact that the entire industry is finally making the transition from the 28nm process node down to the 14nm and 16nm FinFET (3D transistor) process nodes, a full process node shrink. It’s part process, but it’s also part architecture. Polaris, the newest GPU architecture from Advanced Micro Devices’ (AMD) Radeon Technologies Group (RTG), is specifically designed with FinFET in mind, enabling higher levels of performance while using significantly less power and a smaller chip. This should translate to some extremely impressive power and performance improvements that could ultimately drive improved value to consumers while also improving yields and, hopefully, margins for the companies that make those GPUs.
The FinFET shrink to 14nm and 16nm has been a long time coming. The last shrink, to 28nm, happened in 2011, which is 4 years ago, or an eternity in the semiconductor industry. Back in 2011, when AMD’s Radeon HD 7900 series, the first 28nm GPUs, was announced, the most popular smartphones were the Galaxy S2 and iPhone 4. That’s how long it has been. So you can imagine why the entire industry is excited to make the transition to 14/16nm. It has probably felt like an eternity for them; I know it has for me. There was originally some promise that 20nm could serve as a half-node shrink between 28nm and 14/16nm FinFET, but the reality was that the process node was not good for multi-billion transistor GPUs, and both AMD and NVIDIA were forced to skip 20nm. So here we are in 2015, still using 28nm, with both AMD and NVIDIA on their 3rd and 4th generations of 28nm GPUs and constantly looking for ways to improve performance without increasing power consumption.
This leads us to where AMD is now. AMD’s Polaris architecture is built with FinFET transistors from TSMC and Globalfoundries. The reason AMD hasn’t clearly stated whether its GPUs are 14nm or 16nm is that it has been working with both TSMC and Globalfoundries, whose FinFET processes are labeled differently. Globalfoundries’ FinFET process is based on Samsung’s 14nm process and combines Samsung’s low power expertise with Globalfoundries’ high performance expertise to hopefully deliver a very efficient and powerful GPU. TSMC has traditionally been the fab of choice for AMD and, before it, ATI. TSMC’s FinFET process is a 16nm process, but for all intents and purposes 14nm and 16nm FinFET are pretty much interchangeable in terms of power savings when compared to 28nm. We’ll have to wait for real products to see if that’s really the case.
No two companies’ process nodes are identical, so the reality is that we could see TSMC’s transistors deliver slightly better power efficiency while Globalfoundries’ could deliver a physically smaller chip with slightly more performance, as was seen when Apple dual sourced its A9 from TSMC and Samsung. AMD is stating that FinFET offers a 20-35% performance improvement over 28A planar, which is a huge boost, along with much less variation than planar, meaning much more reliable yields and performance. This performance improvement comes in conjunction with a power reduction of up to 50-60% compared to 28A planar, which is also massive when you consider what that can mean for GPUs in mobile form factors and servers. All of this results in AMD making the biggest performance per watt jump in the history of AMD GPUs, which includes all of ATI’s history as well.
AMD RTG’s new Polaris architecture, which AMD is calling an SoC architecture, has a new GPU macro architecture which features the 4th generation of its GCN (Graphics Core Next) compute units as well as support for HDMI 2.0a, DisplayPort 1.3 and 4K H.265 encode and decode. AMD claims that its Polaris GPU is really an SoC architecture because “GPUs are more than just Graphics IP.” It claims the SoC definition because of the collection of different cores and engines, which include multimedia, display, caches, memory controllers and power management. Whatever the label, there are a lot of components inside of a GPU, and we can all agree that they are only becoming more complex and capable with every generation. Part of that comes from all of the improvements that are made over time. With Polaris, these improvements are being made almost across the board, with a new geometry processor, new command processor, new multimedia cores, new L2 cache, new memory controller, new display engine and, as stated earlier, the new 4th generation of GCN compute units.
All of these improvements should translate to some pretty impressive performance and power figures, as mentioned earlier. Because this new architecture is coming with a massive node shrink, the end result is the biggest performance per watt improvement in the company’s history. The company actually showed this off in a side by side demo back in December, when it invited selected press and analysts to Sonoma to talk about the future of RTG. There, it showed a Polaris GPU playing Star Wars Battlefront at 1080p versus a competitor’s GPU.
In the demo, AMD capped both GPUs at 60 FPS and compared their power consumption, showing that the competitor’s part, which AMD didn’t expressly name, drew over 60% more power than its own. The Polaris GPU ran at 86W while the competitor part ran at 140W, which is a huge difference and a massive improvement. However, because AMD did not disclose exactly which competitor GPU it was comparing against, we can’t know how recent the comparison part is. Additionally, the competitor’s part is a 28nm GPU, so it will very likely be more power hungry in almost any scenario. Even so, 86W for 60 FPS 1080p gaming in a game as beautiful as Star Wars Battlefront is absolutely amazing. The key here is not a competitive analysis, but the improvements from the architecture, new process, and improved transistor geometry.
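To put the demo figures side by side, here is a minimal Python sketch of the arithmetic. It assumes the 86W and 140W numbers are taken at face value and that both GPUs held the 60 FPS cap for the whole run; the constant and function names are purely illustrative and are not anything AMD published.

```python
# Figures quoted from AMD's demo: both GPUs capped at 60 FPS at 1080p.
FPS_CAP = 60.0
POLARIS_WATTS = 86.0       # power reported for the Polaris part
COMPETITOR_WATTS = 140.0   # power reported for the unnamed competitor part


def frames_per_joule(fps, watts):
    """Frames rendered per joule of energy, i.e. FPS divided by watts."""
    return fps / watts


# How many times more power the competitor part drew at the same frame rate.
power_ratio = COMPETITOR_WATTS / POLARIS_WATTS

# Power saved by the Polaris part, as a percentage of the competitor's draw.
power_saving_pct = (COMPETITOR_WATTS - POLARIS_WATTS) / COMPETITOR_WATTS * 100

print(f"Competitor draws {power_ratio:.2f}x the power of Polaris")
print(f"Polaris uses {power_saving_pct:.1f}% less power at the same frame rate")
print(f"Polaris:    {frames_per_joule(FPS_CAP, POLARIS_WATTS):.3f} frames per joule")
print(f"Competitor: {frames_per_joule(FPS_CAP, COMPETITOR_WATTS):.3f} frames per joule")
```

By this arithmetic the competitor part draws about 1.63 times the power of the Polaris part, or, put the other way, the Polaris part uses roughly 39% less power for the same capped frame rate. Because both parts were held to the same 60 FPS, the performance per watt advantage in this one test is the same 1.63x ratio.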
AMD isn’t positioning the Polaris architecture at launch as an architecture that will be the fastest on earth, which is interesting. It is a change of strategy and positioning on the part of the company, and I believe it is a good one. Constantly trying to prove that you have the fastest graphics card on earth isn’t going to actually move graphics cards in volume. The reality is that the middle of the product stack is what moves GPUs, and if your focus is on the middle and upper middle of the market at first, and on delivering the best value there, you have the best opportunity to increase sales, profitability and market share. On the other hand, AMD doesn’t have anything to fall back on if it misses the mark in the middle, and that’s a risk.
AMD’s RTG is saying that it expects Polaris to enable new kinds of products and ship in thin and light notebooks, small form factor desktops and discrete graphics cards with fewer power connectors. As we always see with new nodes, it’s easier to launch the smaller, less complex GPUs that deliver better value and performance per watt first, and then follow up with the bigger, more powerful GPUs. Additionally, the real killer applications for Polaris, I believe, are in the notebook form factors where power is always a concern and where discrete GPUs are still needed as resolutions continue to climb. Notebooks are so power and thermally constrained that such massive reductions in power consumption will translate to MUCH more powerful notebook GPUs, which could widen the gap between tablet and notebook performance. Products based on AMD’s new Polaris architecture are expected to be available in mid-2016, roughly when we could also see a potentially competitive product from NVIDIA.