“Everyone knows” the cloud is the least expensive way to host AI development and production, right? Well, it turns out that the best solution may depend on where you are on your AI journey, how intensively you will be building out your AI capabilities, and what your end-game looks like. Today I wanted to share some of my learnings and observations. For those looking for more details, you can read the paper Moor Insights & Strategy published on the topic here.
Why cloud computing is so attractive for AI
Cloud service providers (CSPs) have extensive portfolios of development tools and pre-trained deep neural networks for voice, text, image, and translation processing. Much of this work stems from the internal development of AI for in-house applications, so it is pretty robust. Microsoft Azure, for example, offers around 30 pre-trained networks and tools that your cloud-hosted application can access as APIs. Many models can even be customized with users’ own data, such as specific vocabulary or images. Amazon SageMaker provides cradle-to-production AI development tools, and AWS offers easy chatbot, image recognition, and translation extensions to AWS-hosted applications. Google has a pretty amazing slew of tools as well. Most notable, perhaps, is AutoML, which builds deep learning neural networks auto-magically, saving weeks or months of labor in some cases.
All of these tools have a few things in common. First, they make building AI applications seem enticingly easy. Since most companies struggle to find the right skills to staff an AI project, this is very attractive. Second, they offer ease of use, promising click-and-go simplicity in a field full of relatively obscure technology. Lastly, all these services have a catch—for the most part, they require that the application you develop in their cloud runs in their cloud.
The fur-lined trap
These services are therefore tremendously sticky. If you use Amazon Lex to develop chatbots, for example, you can never move that app off of AWS. If you use Microsoft’s pre-trained DNNs for image processing, you cannot easily run the resulting application on your own servers. You will probably never see a Google TPU in a non-Google data center, nor be able to use the Google AutoML tool if you later decide to self-host the development process.
Now, stickiness, in and of itself, is not necessarily a bad thing—right? After all, elastic cloud services can offer a flexible hardware infrastructure for AI, complete with state-of-the-art GPUs or FPGAs to accelerate the training process and handle the flood of inference processing (where the trained neural network is actually used for real work or play) that you hope your new AI will attract. You don’t have to deal with complex hardware configuration and purchase decisions, and the AI software stacks and development frameworks are all ready-to-go. For these reasons, many AI startups begin their development work in the cloud, and then move to their own infrastructure for production.
Here’s the catch: a lot of AI development, especially training deep neural networks, eventually demands massive computation. Furthermore, you don’t stop training a (useful) network; you need to keep it fresh with new data and features, or perhaps build a completely new network to improve accuracy as new algorithms come out. The publicly available research I have seen says that this level of compute can become pretty expensive in the cloud, costing 2-3 times as much as building your own private cloud to train and run the neural nets. Note that you can reserve dedicated GPUs for longer periods in the cloud, significantly lowering the cost, but from what I’ve seen, owning your own hardware remains the cheapest option for ongoing, GPU-heavy workloads. As I will explore in the next section, there are other reasons beyond cost to choose the cloud or self-hosting.
Factors to consider beyond cost
Starting an AI project can take a lot of time, effort, and expense. Cloud AI services can greatly reduce the pain of getting started, and some hardware vendors offer bundles of hardware and software to level the playing field with the CSPs. Dell EMC’s “AI Ready Solutions” for Deep and Machine Learning, for example, come complete with GPUs and integrated software stacks, all designed to smooth the on-ramp to building AIs.
Some industries are tightly regulated and require on-premises infrastructure. Others, such as financial services firms, simply deem it too risky to put sensitive information in a public cloud.
Data location is the most important factor for some organizations. Simply put, if your data is in the cloud, you should build your AIs and host your apps there as well. If your data is on-premises, the hassle and cost of data transfers can be onerous, especially considering the massive size of neural network training datasets. For that reason, it usually makes sense to build your AI on-prem as well.
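To see why data gravity matters, here is a rough back-of-the-envelope sketch of what moving a training dataset out of a cloud can cost in time and egress fees. The function names, link speed, and per-gigabyte rate are all hypothetical placeholders; substitute your own provider's numbers.

```python
# Rough sketch of data-gravity costs: the time and egress fees involved in
# moving a training dataset out of a cloud. All rates are hypothetical.

def transfer_hours(dataset_gb: float, link_gbps: float) -> float:
    """Hours to move the dataset over a sustained network link."""
    gigabits = dataset_gb * 8
    return gigabits / link_gbps / 3600

def egress_cost(dataset_gb: float, rate_per_gb: float) -> float:
    """Egress fee charged by the cloud provider to move data out."""
    return dataset_gb * rate_per_gb

# Example: a 10 TB dataset, a sustained 1 Gbps link, and a $0.09/GB
# egress rate (placeholder figures, not any vendor's actual pricing).
print(f"{transfer_hours(10_000, 1.0):.1f} hours")        # ~22.2 hours
print(f"${egress_cost(10_000, 0.09):,.0f} in egress fees")
```

Even at these modest assumptions, moving the data takes most of a day and carries a real bill, and training pipelines tend to move data repeatedly, not once.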
Where you locate the infrastructure for training and running neural networks is a big decision that should be made with a holistic view of requirements and economics. The cost side of the equation may warrant your own careful TCO analysis. Many hardware and cloud vendors have TCO models they can share, but, of course, they all have an axe to grind—be prepared to do the homework yourself. The challenge is that it is hard to size the required infrastructure (number of servers, number of GPUs, type of storage, etc.) until you are pretty far down the development path. A good hardware vendor can help here with services, some of which can be quite affordable if you are planning a substantial investment.
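To make that homework concrete, here is a minimal break-even sketch: how many months of sustained GPU usage it takes before owned hardware beats cloud rental. The function name and every dollar figure are hypothetical placeholders, not numbers from any vendor's TCO model.

```python
# Illustrative break-even sketch: cloud GPU rental vs. purchased hardware.
# All prices are hypothetical placeholders -- substitute your own quotes.

def months_to_break_even(cloud_hourly_rate: float,
                         gpu_hours_per_month: float,
                         hardware_cost: float,
                         monthly_ops_cost: float) -> float:
    """Months of sustained usage after which owned hardware is cheaper."""
    monthly_cloud_cost = cloud_hourly_rate * gpu_hours_per_month
    monthly_savings = monthly_cloud_cost - monthly_ops_cost
    if monthly_savings <= 0:
        return float("inf")  # cloud stays cheaper at this utilization
    return hardware_cost / monthly_savings

# Example: $3/hr cloud GPU, 500 GPU-hours/month, a $30,000 server, and
# $500/month of power, cooling, and admin overhead (all placeholders).
months = months_to_break_even(3.0, 500, 30_000, 500)
print(f"Break-even after {months:.0f} months")  # 30,000 / (1,500 - 500) = 30
```

The point of the sketch is the shape of the calculation, not the numbers: at low utilization the break-even horizon stretches to infinity, which is exactly why starting in the cloud makes sense before your GPU hours ramp up.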
A common option is to start your model experimentation and early development in a public cloud, with an exit ramp defined up front: pre-determined triggers that tell you if and when you should move the work home. That plan should include understanding the benefits of the CSP’s machine learning services, and how you would replace them if you decide to move everything to your own hardware.
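As one sketch of what such a trigger might look like in practice, the check below flags an exit when recent cloud bills have consistently exceeded an estimated amortized monthly cost for equivalent on-prem gear. The function, the look-back window, and the figures are all hypothetical assumptions, not a prescription.

```python
# Hypothetical exit-ramp trigger: flag when sustained cloud spend exceeds
# the estimated monthly cost of equivalent owned hardware. Thresholds and
# figures are examples only.

def should_plan_exit(monthly_cloud_bills: list[float],
                     amortized_onprem_monthly: float,
                     lookback_months: int = 6) -> bool:
    """True if every recent month's cloud bill exceeded the estimated
    amortized monthly cost of equivalent on-prem infrastructure."""
    recent = monthly_cloud_bills[-lookback_months:]
    if len(recent) < lookback_months:
        return False  # not enough billing history yet to decide
    return all(bill > amortized_onprem_monthly for bill in recent)

# Example: growing monthly bills vs. an estimated $1,500/month on-prem cost.
bills = [800, 1200, 1600, 1900, 2100, 2400, 2600, 2900]
print(should_plan_exit(bills, amortized_onprem_monthly=1500))  # True
```

A real trigger set would also cover the non-cost factors above, such as regulatory requirements and where the data lives, but the mechanism is the same: decide the thresholds before you build, not after the bills arrive.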