Amazon Web Services hosted its annual conference last week. Over 50,000 attendees converged on Las Vegas to partake in the massive 4-day agenda of all things cloudy. In the realm of Artificial Intelligence (AI), AWS announced over a dozen completely new services for building and running smart applications, adding significantly to what was already a rich portfolio. “We want to help all of our customers embrace machine learning, no matter their size, budget, experience, or skill level,” said Swami Sivasubramanian, Vice President of Amazon Machine Learning. By simplifying use and lowering the cost of a complex development process, AWS hopes to extend its lead in the hosting market for enterprises and AI startups.
AWS says that direct customer engagements helped define over 90% of its AI portfolio. Based on its latest features, I believe AWS is now better positioned to challenge Google GCP and Microsoft Azure for AI development and hosting in key areas:
AWS added reinforcement learning (RL) to SageMaker, its end-to-end AI developer platform. Google added RL to its TensorFlow Deep Learning (DL) framework last April, and Microsoft is investing heavily in researching this emerging approach, which learns from reward signals rather than from labeled data.
AWS is building a custom ASIC for AI inference, called Inferentia, which it claims can scale from hundreds to thousands of TOPS (trillions of operations per second) and reduce the cost of cloud-based inference by an order of magnitude. Google has its own Tensor Processing Unit (TPU), and Microsoft is using Intel and Xilinx FPGAs to accelerate inference processing.
AWS SageMaker, the company’s AI development platform, continues to blossom: it received over 90 enhancements in the last year and now offers over 150 models and algorithms. Adobe, an AWS partner and user, said SageMaker helped cut its AI development time by as much as 90%, an impressive claim.
Let’s take a closer look at what AWS announced, and the implications for the cloud leader and its users.
What did AWS announce?
Amazon’s cloud AI services are already used extensively by startups and enterprises alike to research, develop, and deploy smart services and products, primarily in the area of deep neural networks. Amazon said at its re:Invent conference last week that more public cloud ML happens on its service than on any other. As the world’s largest cloud services provider, AWS supports practically all AI DNN frameworks, as well as the latest Intel, AMD, and now Arm CPUs, NVIDIA GPUs, and FPGAs from Xilinx. While the company has always been the go-to cloud provider for Infrastructure-as-a-Service, the most recent announcements significantly expand its footprint in the ML-as-a-Service (MLaaS) market.
AI services: expanding into higher-level features
While Amazon’s previous AI services focused on more rudimentary voice, text, image, and video processing, the portfolio now includes more advanced services and specialized knowledge domains. The new AI services include a forecasting tool that the company says requires no previous AI experience, a suite of AI-enabled personalization services based on the same technologies Amazon uses in its storefront, a sophisticated new optical character recognition (OCR) tool that can process tables as well as prose, and a pre-trained neural network that can help medical professionals improve diagnoses and treatment plans. Amazon also announced a “Custom Terminology” feature for AWS Translate. This significantly improves the value of AWS to enterprise customers, since Microsoft has touted a comparable capability for a while.
ML services: Reinforcement Learning and "Ground Truth" data preparation
The new AWS Machine Learning development services focus on expanding the utility of the SageMaker development platform. This includes new support for Reinforcement Learning and a “Ground Truth” service to help customers curate and tag their datasets—a tedious and time-consuming process today. Amazon also introduced a new AWS ML Marketplace that provides click-and-go web access to over 150 models and algorithms. Many of these were probably developed by Amazon for internal use in its quest to improve customer interaction and services.
Reinforcement Learning is the latest trend in AI. Microsoft Research is a major contributor to RL research, as is Google DeepMind. Much of the early research in RL focused on gaming, such as Google’s efforts to build its famous AlphaGo champion. Basically, RL randomly explores different actions and rewards, or reinforces, those that produce desired results, such as winning a video game. RL requires a massive amount of computation but far less training data. Because of this, in theory, it can be more easily adopted and can generate significant cloud service revenues.
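For readers who want to see the explore-and-reinforce idea in miniature, here is a rough Python sketch. It is not SageMaker RL or anything AWS showed; it is a toy "multi-armed bandit" using the textbook epsilon-greedy rule, and the function name, the reward probabilities, and the parameters are all made up for illustration.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner (illustrative only, not an AWS API).

    reward_probs[i] is the hypothetical chance that action i pays off.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n      # how often each action has been tried
    values = [0.0] * n    # running average reward observed per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n)
        else:
            # Exploit: pick the action with the best average so far.
            action = max(range(n), key=lambda a: values[a])
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Reinforce: move the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return counts, values

counts, values = run_bandit([0.2, 0.5, 0.8])
best = max(range(len(values)), key=lambda a: values[a])
print("preferred action:", best, "tries per action:", counts)
```

Notice that the learner is never handed labeled examples. It only sees rewards, which is why RL trades training data for a lot of trial-and-error compute.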
To demonstrate RL in action and provide an easy-to-use developer sim, AWS also announced AWS DeepRacer, a 1/18th scale race car that can be trained with RL (now on pre-order for $299). Like AWS DeepLens before it, DeepRacer helps developers have fun while working to understand this new deep learning technology. DeepRacer will be available early next year, and AWS is planning to hold racing events throughout the year, culminating in a grand championship race at next year’s re:Invent in Las Vegas.
Frameworks and infrastructure: a new inference instance and a new chip
AWS also introduced new IaaS offerings to support the development and execution of DL models. For developers, AWS introduced a new, larger GPU EC2 P3dn instance, which supports eight NVIDIA Volta V100 GPUs, and a newly optimized TensorFlow AMI (Amazon Machine Image). The company says that this new instance improves scaling efficiency from 65% to a very impressive 90% with 256 NVIDIA GPUs. Using such large clusters is increasingly common. While the performance of an NVIDIA GPU has increased by a factor of 10 over the last few years, the size and depth of deep learning networks have grown as well, driving up compute demand exponentially. To support this level of scaling, AWS also introduced the Dynamic Training version of Apache MXNet, highlighting a project at Sony that scaled to 2,176 NVIDIA Tesla V100 GPUs and demonstrated training ResNet-50 on ImageNet in 224 seconds.
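To put the scaling claim in perspective, here is a quick back-of-the-envelope calculation in Python. I am assuming the usual definition of scaling efficiency (achieved throughput divided by the number of GPUs times single-GPU throughput); AWS did not spell out its definition, so treat the output as illustrative. The helper name is my own.

```python
def effective_gpus(num_gpus, scaling_efficiency):
    """GPU-equivalents of useful work, assuming efficiency means
    achieved throughput / (num_gpus * single-GPU throughput)."""
    return num_gpus * scaling_efficiency

before = effective_gpus(256, 0.65)   # efficiency AWS cited as the baseline
after = effective_gpus(256, 0.90)    # efficiency AWS cited for the new setup
gain = after / before                # relative throughput on the same GPU count

print(f"{before:.1f} -> {after:.1f} GPU-equivalents ({gain:.2f}x)")
```

Under that assumption, the same 256 GPUs do the work of roughly 230 ideal GPUs instead of roughly 166, a gain of about 38% in throughput without adding any hardware.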
While the training market enjoys AI rock star status and has garnered much of the industry’s attention over the last few years, the market for inference is poised to grow rapidly, likely dwarfing the training market by a factor of ten over the next decade. Amazon agrees with this prediction and announced new hardware instances to support its customers’ use of DL networks.
First, AWS announced a new Elastic Inference EC2 Instance, designed to reduce inference costs by 75% and provide elastic scaling from 1 to 32 TOPS per NVIDIA GPU accelerator (initially supporting TensorFlow and Apache MXNet). Prior to this instance type, inference jobs on EC2 were pretty much a do-it-yourself exercise. Now you can use SageMaker to both develop and run your Deep Neural Network in the Amazon cloud and on IoT devices.
In a major announcement that many had been expecting, AWS also brought out its own AI chip, called "Inferentia", that the company claims will reduce costs by an additional order of magnitude. The device will be available in the Elastic Inference instances and, of course, will be supported by SageMaker. The company said a single chip would provide “100s of TOPS” with int8 and fp16 mixed precision, which would put it squarely in or above NVIDIA’s performance class. AWS also said it would scale to “1,000s” of TOPS, likely via an on-die interconnect, and would initially support MXNet and PyTorch. When combined with the 75% cost reduction of the new Elastic Inference instance, this could be a game-changer that dims the prospects of many startups. Like Google’s first Tensor Processing Unit, the Amazon device will only be useful for inference. This leaves the training capability in AWS to NVIDIA, at least for now.
AWS also announced “SageMaker Neo,” which allows the developer to deploy their trained network model on a wide variety of hardware, including CPUs, GPUs, FPGAs, and the new Inferentia ASIC, when it becomes available. AWS says that ML models are optimized by Neo to run up to two times faster and consume less than a hundredth of the resources compared to typical models. This provides AWS customers incredible value for heterogeneous edge computing environments.
AWS’s latest barrage of announcements significantly increases the utility of its MLaaS offerings. SageMaker continues to emerge as one of the industry’s most comprehensive development platforms for AI, with support for nearly all types of neural networks and frameworks, and a write-once, deploy-anywhere workflow. Moving AWS to the forefront of inference processing is also a smart move, as many AI projects are entering the deployment phase and searching for low-cost hosting platforms. The addition of specific solutions to business problems, like forecasting, personalization, and OCR processing, should be well received by enterprise customers looking for easy on-ramps to AI-enabled features. AWS Comprehend Medical is an interesting first industry-specific Comprehend solution, and more solutions should be expected.
The Elastic Inference instance will be a welcome addition to EC2, especially when coupled with the promised performance of the new Inferentia chip. I will be looking for application benchmarks soon so we can compare Inferentia to Google’s TPU and other ASICs. Inference processing is a hot topic, and dozens of startups like Habana Labs and Graphcore (not to mention Google) are all armed with impressive benchmarks that will help customers decide where to host their workloads. A head-to-head comparison with TPU will be very interesting, and I hope AWS works with a third party to ensure it is as clean as a whistle.
The last thing I’ll say about re:Invent 2018 is that I was impressed with how many customers and partners AWS was able to reference by name. That’s always a tricky ask, but the company succeeded. There was a lot to cover here, so thanks for reading this all the way through to the end. Have a happy Holiday Season!
Note: This blog contains contributions from Patrick Moorhead, President and Principal Analyst, Moor Insights & Strategy.