AI in various forms has existed for decades, but the subset of generative AI (GAI)—spearheaded recently by ChatGPT—has propelled AI into the public view like never before. Consequently, I would wager that by this point just about every CEO and board of directors has asked, “What is our generative AI story?”
Since so many companies use VMware, the company has no doubt fielded plenty of customer requests to hear its GAI story and capabilities, too. At VMware Explore 2023 last week, we received the first chapter of that story, which is what I’ll unpack in this article. I was on the ground in Las Vegas and talked to company executives about these capabilities.
Generative AI poses challenges for enterprises
Before we get to VMware’s specific angle on GAI, let’s think about the broader challenges in its deployment. While generative AI can transform business by generating content, designs and creative outputs, there are multiple potential pitfalls in using public large language models (LLMs) to build GAI applications.
The large datasets that GAI uses to learn and generate outputs may require special efforts to ensure compliance with data protection regulations. For example, if a healthcare organization wants to use generative AI to predict patient outcomes based on medical records, HIPAA regulations would require scrubbing patient identities out of the dataset before training the model.
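A minimal sketch of that kind of scrubbing step, assuming patient records arrive as Python dictionaries; the identifier field names here are hypothetical examples, and a real HIPAA Safe Harbor pass must cover all 18 identifier categories:

```python
# Hypothetical direct-identifier fields to remove before training.
PHI_FIELDS = {"name", "ssn", "address", "phone", "email", "mrn"}

def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in PHI_FIELDS}

records = [
    {"name": "Jane Doe", "ssn": "123-45-6789", "age": 54, "outcome": "recovered"},
]
# Only de-identified records reach the training pipeline.
training_data = [deidentify(r) for r in records]
print(training_data)  # [{'age': 54, 'outcome': 'recovered'}]
```

In practice this filter would sit at the boundary between the records system and the training pipeline, so that identifiable data never lands in the model's corpus.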
Generative AI models are only as good as the data that feeds them. If the training data contains biases, those biases will show up in the results. Suppose a lender uses generative AI to analyze loan applications based on training data that includes historical lending decisions that were racially biased. In that case, the new output would perpetuate the historical biases. Even setting aside concerns about bias, using training sources of uncertain provenance can make it difficult to understand how the model arrives at specific conclusions, which can be problematic when transparency and accountability are critical.
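One simple guardrail is to audit the historical decisions before they ever become training data. A toy sketch, assuming loan records are Python dicts with a hypothetical group attribute and an approval flag:

```python
from collections import defaultdict

def approval_rates(records: list[dict]) -> dict:
    """Compute the approval rate per group in historical lending data."""
    totals, approved = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        approved[r["group"]] += r["approved"]
    return {g: approved[g] / totals[g] for g in totals}

# Toy history: group A approved 2 of 2, group B approved 1 of 2.
history = [
    {"group": "A", "approved": 1}, {"group": "A", "approved": 1},
    {"group": "B", "approved": 1}, {"group": "B", "approved": 0},
]
rates = approval_rates(history)
print(rates)  # {'A': 1.0, 'B': 0.5} -- a gap worth investigating before training
```

A large gap between groups does not prove bias on its own, but it flags data that deserves scrutiny before a model learns from it.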
Some enterprises can take a hybrid approach, leveraging public LLMs and layering on smaller models or using other techniques to incorporate an organization’s proprietary data and targeted use cases. But, for most enterprises, the low-risk option is to develop LLMs based only on internal data.
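One such layering technique is retrieval augmentation: rather than fine-tuning the public LLM on proprietary data, the application retrieves relevant internal documents at query time and passes them to the model as context. A toy sketch, with naive keyword-overlap ranking standing in for a real retriever and the final LLM call left as a placeholder:

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank internal documents by crude keyword overlap with the query."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

internal_docs = [
    "Q3 refund policy: refunds are processed within 14 days.",
    "Office hours are 9am to 5pm on weekdays.",
]
query = "How long do refunds take?"
context = retrieve(query, internal_docs)
prompt = f"Context: {context[0]}\nQuestion: {query}"
# 'prompt' would now go to the public LLM; the proprietary document
# never enters the model's training set, only this one request.
print(prompt)
```

Production systems replace the keyword match with embedding similarity over a vector store, but the division of labor is the same: the public model supplies the language ability, while the organization's data stays in its own retrieval layer.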
A single stack to run generative AI applications
VMware AI Labs developed VMware Private AI specifically to provide a prescriptive way for enterprises to build LLMs and GAI applications internally. VMware Private AI Foundation with Nvidia includes the software to customize LLMs and run GAI applications, such as chatbots, virtual assistants and content search and summarization—and, of course, it is tied to Nvidia hardware and software. The single-stack solution provides enterprises with the software and compute capacity to fine-tune LLM models and run generative AI applications using proprietary data in VMware’s hybrid cloud infrastructure.
The Private AI Foundation architecture is built on VMware Cloud Foundation, an integrated software stack bundling NSX for networking, vSphere for computing and vSAN for storage. VMware Cloud Foundation is a hybrid cloud platform that supports native Kubernetes workloads and management in addition to VM-based workloads.
The stack includes Nvidia AI Enterprise, a software suite of AI tools, frameworks and pre-trained models that helps enterprises develop and deploy AI workloads. The suite includes the Nvidia NeMo framework for building, customizing and deploying generative AI models. NeMo combines customization frameworks, guardrail toolkits, data curation tools and pre-trained models to enable enterprises to adopt generative AI.
The stack will also include new software from both companies. Nvidia AI Workbench is a unified tool that enables the creation, testing and deployment of AI models in GPU-powered workstations, data centers and the cloud. VMware has also promised new software to help in AI projects, including a vector database and deep learning virtual machines.
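A vector database stores embeddings and answers nearest-neighbor queries, with cosine similarity as the usual measure. A stdlib-only sketch of the core lookup; the three-dimensional vectors are toy stand-ins for real embedding output, which runs to hundreds of dimensions:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy "vector database": document id -> embedding.
index = {
    "invoice-faq": [0.9, 0.1, 0.0],
    "gpu-sizing":  [0.1, 0.8, 0.3],
}

def nearest(query_vec: list[float], index: dict) -> str:
    """Return the stored id whose embedding is most similar to the query."""
    return max(index, key=lambda doc_id: cosine(query_vec, index[doc_id]))

print(nearest([0.85, 0.2, 0.05], index))  # invoice-faq
```

A real vector database adds approximate-nearest-neighbor indexing so this lookup stays fast across millions of embeddings, which is what makes it useful as the retrieval layer for generative AI applications.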
VMware and Nvidia are working with server manufacturers Dell Technologies, Hewlett Packard Enterprise (HPE) and Lenovo to build validated reference architectures that have VMware Private AI Foundation with Nvidia built in. The server configurations will include the new Nvidia L40S GPU, which Nvidia claims has 1.2 times more generative AI inference performance and 1.7 times more training performance than the Nvidia A100 Tensor Core GPU.
Supporting best-in-class open-source software (OSS) technologies
Enterprise customers that want to build and feed data to open-source software running VMware Cloud Foundation can utilize the VMware Private AI Reference Architecture for Open Source. VMware has collaborated with several partners to provide validated reference architectures for building and serving OSS models on top of VMware Cloud.
VMware is publishing a reference architecture with code samples for SafeCoder from Hugging Face, a tools developer for machine learning applications. SafeCoder enables customers to build LLMs, fine-tuned on a proprietary codebase, using state-of-the-art open models and libraries without sharing code with Hugging Face or any other third party.
The VMware reference architectures will also include Anyscale, which uses the widely adopted open-source Ray unified compute framework. Ray can be used on VMware Cloud Foundation to scale AI and Python workloads.
In a parallel development, VMware, Domino Data Lab and Nvidia have also partnered to deliver a unified analytics, data science and infrastructure platform that is optimized, validated, supported and purpose-built for AI/ML deployments in the financial services industry.
By this point, I have sat in hundreds of generative AI briefings and attended 25 events, but I have to give VMware credit for clearly articulating the value of generative AI for the enterprise. Many moving parts still need to come together, making me skeptical about the company’s somewhat vague “early 2024” launch date.
While VMware Private AI Foundation with Nvidia will be available as a single-SKU product from VMware, the underlying infrastructure and software stack are complex. I predict that most customers will wait for the more compelling supported pre-integrated systems from Dell, HPE and Lenovo. Depending on when VMware started working with the server vendors, the availability could be a schedule risk. I also wonder if the traditional hardware OEMs can get enough Nvidia silicon to roll out these capabilities, even if it is “early 2024.”
While VMware had to start with Nvidia for its more turnkey AI Foundation, companies can use the reference architecture for both Intel and AMD. As the market wants choice, I am hopeful we will see an AI Foundation for Intel, AMD and maybe even Qualcomm.
I expect the first VMware Private AI users will be existing VMware customers. That said, there is a choice of competing offerings from cloud providers AWS, Microsoft Azure and Google Cloud, each of which has generative AI enterprise systems publicly available today. I have also written about IBM’s full-stack generative AI offering, Red Hat OpenShift AI, a direct competitor to VMware’s offering. Red Hat with IBM is already GA on many generative AI capabilities, and I will be interested in assessing that horse race. With enterprises everywhere champing at the bit to integrate generative AI into the business, choice and competition are always good things.