Blaize AI Emerges From Stealth

Throughout 2020, a wave of AI hardware startups will launch their companies and products. Cerebras started this wave with its wafer-scale engine last September. This week, Intel announced its AI chips from Nervana, Groq (founded by the inventors of the Google TPU) announced its quadrillion-ops-per-second TSP, and Graphcore announced that its chip is available on Microsoft Azure and Dell servers. Last week, a startup named “Blaize,” previously named “Thinci,” emerged from stealth, having already reached key milestones in four areas: innovative hardware, a comprehensive software stack, a staff of over 325 employees, and most importantly, 15 pilot projects underway in the USA, Europe and Asia. Let’s take a closer look at what Blaize is doing. I will cover Intel, Groq, and Graphcore in subsequent blogs, so stay tuned.

The Blaize Graph Streaming Processor

Architectural innovation forms the core of every AI hardware startup. Simply adding more multiply/accumulate registers or on-die memory will be inadequate for most high-performance applications. Blaize’s team built a general-purpose graph processor that can natively process graph-based applications, including, but not limited to, the deep neural networks that lie at the heart of most modern AI work. While the company claims this architecture can deliver massive gains in efficiency, we will need to await production-ready silicon next year to evaluate how well it performs against other engines that are coming to market.

Figure 1: The Blaize Graph Streaming Processor creates a connected graph of processing elements for each task, using on-die buffers to transfer activations from one layer of the network to the next, reducing the overhead associated with using DRAM or HBM. 
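To make the streaming idea in Figure 1 concrete, here is a minimal Python sketch. It is not Blaize code and does not model the GSP; the function names, the tile size, and the two toy "layers" are all hypothetical. It only illustrates the general principle of passing small tiles of activations directly from one graph node to the next, instead of writing a full intermediate tensor out to memory between layers.

```python
from typing import Callable, Iterable, Iterator, List

Tile = List[float]


def split_into_tiles(data: List[float], tile_size: int) -> Iterator[Tile]:
    """Cut the input into small tiles (stand-ins for on-die buffer contents)."""
    for start in range(0, len(data), tile_size):
        yield data[start:start + tile_size]


def stage(fn: Callable[[float], float], tiles: Iterable[Tile]) -> Iterator[Tile]:
    """Hypothetical graph node: process each tile as soon as it arrives."""
    for tile in tiles:
        yield [fn(x) for x in tile]


def run_streaming(data: List[float], tile_size: int = 2) -> List[float]:
    """Chain two toy layers so only one tile is in flight between them."""
    tiles = split_into_tiles(data, tile_size)
    tiles = stage(lambda x: 2.0 * x, tiles)      # toy "layer 1": scale
    tiles = stage(lambda x: max(0.0, x), tiles)  # toy "layer 2": ReLU
    out: List[float] = []
    for tile in tiles:
        out.extend(tile)
    return out


def run_layer_by_layer(data: List[float]) -> List[float]:
    """Baseline: materialize the full intermediate result between layers."""
    layer1 = [2.0 * x for x in data]  # whole intermediate tensor held at once
    return [max(0.0, x) for x in layer1]
```

Both functions compute the same result; the difference is that the streaming version never holds more than one tile of intermediate activations at a time, which is the kind of saving the on-die buffers are meant to provide.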

Inference processing is rapidly becoming quite complex, requiring multiple models to deliver accurate results. One of the key differentiators Blaize hopes to cultivate is the ability to simultaneously deploy and stream multiple tasks onto the GSP, consequently accelerating the entire application. NVIDIA supports this by providing various engines on an SOC, as seen on the NVIDIA Drive AGX Xavier SOC, and Xilinx is heading in a similar direction with its flexible Versal ACAP. Time will tell if Blaize’s approach of task-level parallelism can deliver superior performance and power efficiency.

Figure 2: Targeting the automotive ADAS market, Blaize points to the need to accelerate multiple neural networks simultaneously on the GSP. 

Blaize initially targets three markets: autonomous transportation, enterprise AI applications and smart vision devices. Within each, Blaize is pursuing multiple applications, in the hope that its generalized graph processing capability can broaden its opportunity in these markets. For example, in automotive, Blaize is tackling in-cabin monitoring, infotainment, intelligent telematics and vision pre- and post-processing, in addition to ADAS and safety. In smart vision, Blaize clients are evaluating the GSP in detection/classification, smart retail, smart city, factory automation and robotics.

Blaize Picasso software

An area worth noting is the Blaize software platform, called Picasso. To help optimize a neural network for deployment efficiency on the GSP, the Netdeploy tool can prune the net, quantize the layers of the network for optimal precision and generate the streaming data flow graph programs for execution. Most startups have to develop these tools, which can require as many software engineers as the hardware engineers who design the chip itself.

Figure 3: The Blaize Picasso software platform includes all the prerequisite tools and frameworks needed to build AI applications on the GSP. Of particular interest are the Netdeploy package and library of pre-built DNN models. 
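For readers unfamiliar with pruning and quantization, the following Python sketch shows what those two steps mean in general terms. It is not the Netdeploy API, which is not described here; the function names, the magnitude-based pruning rule, and the symmetric 8-bit scheme are assumptions chosen for illustration only.

```python
from typing import List, Tuple


def prune_by_magnitude(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the given fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the quantized integers."""
    return [v * scale for v in q]


if __name__ == "__main__":
    layer = [0.05, -1.27, 0.6, -0.01]          # made-up example weights
    pruned = prune_by_magnitude(layer, sparsity=0.5)
    q, scale = quantize_int8(pruned)
    print(pruned, q, scale, dequantize(q, scale))
```

Pruning reduces the number of nonzero weights that must be stored and multiplied, while quantization shrinks each remaining weight to a single byte at the cost of a small rounding error. A production tool would weigh both against model accuracy.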

Blaize customer traction

Early customer engagement is typically a weak point for many AI startups until they are close to having a working production platform. Blaize, on the other hand, has been working together with marquee clients in its target markets since the early conception of the firm. This has two significant benefits: co-design of the hardware and software based on specific customer needs, and early traction when hardware is production ready. Some of these early engagements produced strategic capital investments as well as early prototyping, from the likes of Samsung, Denso, and Daimler. I see a lot of startups’ pitches, but very few if any can match the scope of Blaize’s early client engagement.

Figure 4: Blaize shared this image, which summarizes its current market reach as enabled by its development kit.


I don’t normally blog about pre-silicon startups, preferring to wait (and wait, and wait, and wait) for production silicon and benchmark tests. However, in Blaize I see a startup that has invested as much in software and in early customer co-design engagements as it has in innovative hardware design. This approach sets it apart from most of the industry. Of course, it still needs to tape out its chip and support its extensive customer pilot program with the attention and field engineering needed to turn those pilots into design wins. That said, Blaize’s approach could help it stand out in the crowd and pave the way for success.

Patrick Moorhead

Patrick founded the firm based on his real-world technology experiences with the understanding of what he wasn’t getting from analysts and consultants. Ten years later, Patrick is ranked #1 among technology industry analysts in terms of “power” (ARInsights) and in “press citations” (Apollo Research). Moorhead is a contributor at Forbes and frequently appears on CNBC. He is a broad-based analyst covering a wide variety of topics including the cloud, enterprise SaaS, collaboration, client computing, and semiconductors. He has 30 years of experience including 15 years of executive experience at high tech companies (NCR, AT&T, Compaq, now HP, and AMD) leading strategy, product management, product marketing, and corporate marketing, including three industry board appointments.