Google’s announcement last week that they developed a custom chip for Deep Learning has generated a lot of press and left many questions unanswered. At the Google I/O developers conference, the company shared that they have been using an internally developed processor, called a Tensor Processing Unit (TPU), for over a year to accelerate Deep Learning applications, from Google Street View to their much-heralded win at the game of Go. Rumors have swirled for years that Google may develop their own processors, potentially based on ARM Holdings’ ARMv8 architecture and/or IBM OpenPOWER, to displace Intel Xeon server processors. This announcement shows that Google may be more interested in chips that are tailored to accelerate specific workloads, especially for Artificial Intelligence.
Google’s TensorFlow Accelerator for AI (Source: Google)
In any event, the market for Deep Learning computation is a tide that is likely to lift all ships and chips, including CPUs, GPUs, FPGAs and even custom processors like the Google TPU. However, developing custom ASICs for deep learning likely makes financial sense only for the largest internet companies and service providers. Semiconductor vendors including Intel, NVIDIA and others will continue to provide more flexible solutions for the large market of startups and enterprises exploring how to leverage Deep Learning in their products and businesses.
What did Google (not) announce?
While Google claimed that their TPU beats an “other” alternative by an order of magnitude in performance per watt, the company did not disclose performance data or any technical details about what lies under the large heat sink seen in their photo of the disk-sized accelerator. We reached out to Google last week, but the company has not yet responded to any of our questions about the TPU. The Moor Insights & Strategy team has decades of chip experience, and details are everything in the chip business.
Google compares their performance to “Other” (Source: Patrick Moorhead from Google I/O)
After all, if this is primarily about making Google AI capabilities superior to their competition—and I think that’s exactly what this is about—why would they divulge their secrets? Google is famously mum about their internal computing infrastructure. Google was also not clear whether the TPU is being used for training neural networks, potentially displacing GPUs, or drawing inferences from a trained network, potentially displacing CPUs, or both. While they indicated that the TPU would be available in the Google Cloud Platform, they did not disclose pricing or availability.
The company said the chip was designed to run TensorFlow but did not say whether it could also support programmers using other popular DNN frameworks such as Theano, Caffe and Torch. Google didn’t say whether TensorFlow algorithms were put in hardware or if the chip was really just an accelerator for low precision matrix operations, which dominate Deep Learning compute cycles regardless of which software platform you choose.
In short, Google did not provide the information needed for a quantified comparison with existing technologies from NVIDIA and Intel. And they are probably justified in keeping this close to the vest, at least for now.
What does this mean for the Deep Learning ecosystem?
The implications of this announcement may be many, but they cannot be fully discerned without more data from Google. Google is not saying whether the TPU is being used for training, for inference, or for both. We believe the TPU is used for inference, unlike NVIDIA’s P100, which is used for training. First, if Google believes that this development represents an opportunity to differentiate the Google Cloud Platform from Amazon Web Services and Microsoft Azure, both of which deliver DNN training services in their clouds today, then they will need to disclose more details to attract customers. If a cloud of TPUs can really deliver the claimed 10x performance-per-watt advantage in training or executing neural networks over GPUs or CPUs, respectively, then one could expect a boost for TensorFlow and the Google Cloud Platform among enterprises and startups building their AI applications and neural networks.
Second, it means that NVIDIA may have a new challenger in inference but not in training, at least where AI is being computed in the cloud. Note that Google does not plan to become a merchant silicon provider, meaning that the TPU can be used only a) in Google’s cloud and b) with Google TensorFlow software. It will take far more than that to challenge NVIDIA’s pervasive position in the Deep Learning ecosystem of hardware, software and scientists that are leading the AI movement. And if NVIDIA decides to build a derivative of their new Pascal-based GPU that would be used only for AI inference, they certainly could do so if and when they see a sufficiently large market opportunity and/or competitive threat. We are also hearing that NVIDIA’s Pascal delivers 25x the inference performance of Kepler. It’s also very important to recognize that Google hasn’t disclosed whether their workloads are “pre-taught”; if they are, then the TPU isn’t a threat to NVIDIA’s P100, which is a teaching monster. Unless Google tells us otherwise, we are assuming the TPU is being used for inference, not teaching.
Finally, this announcement adds some weight to the argument that startups like Nervana, Movidius and Knupath could change the way neural networks are deployed, using chips that just do one thing but do it very, very well. Similarly, this move could add more credibility to the arguments Intel and Microsoft have made regarding the potential of FPGAs in this space; mass customization with FPGAs can provide dramatic increases in performance for specific computationally intensive applications that have repetitive calculations.
Where do we go from here?
One of the fundamental tenets of AI is that the power goes to those who have the most data; Machine Learning is a means of turning all that data into profitable products and services. Google probably has the world’s largest and perhaps most prestigious AI research team and has massive stores of data about their users’ shopping and browsing habits. So it should not be a surprise that Google’s CEO Sundar Pichai focused many of his comments on AI-enabled applications and devices, or even that the company has developed their own AI chip. We can expect a lot more “Machine Intelligence”, as they prefer to call it, from Google in the future.
We don’t know at this time whether Google’s chip is better than specific alternatives, since Google isn’t answering our questions, but Google at least believes they are well served by developing their own AI technology.