Qualcomm, AMD, and Intel are reshaping personal computing with their own Neural Processing Units (NPUs), specialized chips designed to significantly enhance AI capabilities in modern laptops. Qualcomm’s Snapdragon X chip and AMD’s Ryzen AI Max are the frontrunners in the race to fuel AI-driven applications, and competitive offerings from all three companies deepen the integration of Microsoft’s powerful new Copilot+ features.
The Snapdragon X chip is at the forefront of that trend, equipped with an efficient NPU that accelerates data processing to complete AI tasks at high performance. AMD’s Ryzen AI Max boasts a formidable NPU rated at up to 50 TOPS, or trillions of operations per second, a figure that should put it among the best in the segment as AMD pushes its NPU performance from the 40-TOPS range toward 50.
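A TOPS rating is a theoretical ceiling, but it does let you sketch what such a chip could mean for on-device AI. The back-of-envelope calculation below is illustrative only: the model size, the two-operations-per-weight rule of thumb, and the assumption of full NPU utilization are all my own simplifications, not vendor figures.

```python
# Back-of-envelope: how an NPU's TOPS rating bounds token generation for a
# local language model. Assumptions (illustrative, not from the article):
# a 3-billion-parameter model, ~2 operations (multiply + accumulate) per
# parameter per generated token, and perfect NPU utilization.

def peak_tokens_per_second(tops: float, params_billions: float) -> float:
    ops_per_token = 2 * params_billions * 1e9
    return (tops * 1e12) / ops_per_token

for tops in (40, 50):
    rate = peak_tokens_per_second(tops, 3)
    print(f"{tops} TOPS -> ~{rate:,.0f} tokens/s (theoretical ceiling)")
```

Real-world throughput is far lower, since memory bandwidth and utilization dominate, but the arithmetic shows why each added TOPS matters.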
Qualcomm Takes the Lead
Qualcomm can lay an impressive claim to being the first to bring NPUs to Windows-powered laptops, setting a precedent that others are now following. The Snapdragon X chip accelerates performance while reducing power consumption, enabling more fluid engagement with increasingly complex AI models. The chip’s design is oriented around processing tensors, which are multidimensional arrays of numbers, allowing it to process more tokens per second than previous models.
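Since the design is oriented around tensors, it helps to see what that data type looks like. The plain-Python sketch below (no libraries, purely illustrative) builds a rank-3 tensor, a batch of matrices, and runs it through the kind of batched matrix multiply that NPUs are built to accelerate:

```python
# A tensor is just a multidimensional array of numbers. An NPU's core
# workload is dense linear algebra over such arrays; here is the same
# operation in (slow) pure Python.

def matmul(a, b):
    """Multiply two 2-D lists: (m x k) @ (k x n) -> (m x n)."""
    k, n = len(b), len(b[0])
    return [[sum(row[i] * b[i][j] for i in range(k)) for j in range(n)]
            for row in a]

# A rank-3 tensor: a batch of two 2x2 matrices.
batch = [[[1, 2], [3, 4]],
         [[5, 6], [7, 8]]]
weights = [[1, 0], [0, 1]]   # identity weights, so outputs equal inputs

out = [matmul(m, weights) for m in batch]
print(out)
```

An NPU performs trillions of these multiply-accumulate steps per second in hardware rather than one at a time in a loop, which is exactly the specialization Bathiche describes below.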
Steven Bathiche, the Microsoft technical fellow closely involved in bringing NPU-accelerated experiences to Windows, noted, “With the NPU, the entire structure is really designed around the data type of tensors.” This specialization is the key to stringing together multiple sophisticated AI tasks seamlessly and naturally. Bathiche further elaborated on the advantage of NPUs over traditional CPUs, stating, “NPUs are much more specialized for that workload. And so we go from a CPU that can handle three trillion operations per second (TOPS), to an NPU.”
Qualcomm’s Centriq AI 100 NPU delivers up to 100 TOPS with high efficiency. This mighty chip makes short work of even the most processor-intensive AI applications on PCs. The Windows ML runtime executes these AI tasks with excellent power efficiency, routing each operation to the most appropriate hardware, whether that’s a CPU, GPU, NPU, or something else in the mix.
AMD and Intel Join the Race
AMD is taking big steps with its Ryzen AI Max architecture, which packs general-purpose CPU cores, Radeon-branded graphics cores, and a neural processing unit onto the same piece of silicon. This integration removes CPU-GPU bottlenecks and improves the performance, energy efficiency, and flexibility of Windows laptops. As recently as 2023, the few AMD chips that shipped with NPUs typically delivered only single-digit TOPS; AMD is now intent on driving far better performance.
Intel has been making moves to position itself as a player in this space. The company’s recent offerings include NPUs that rival those of Qualcomm and AMD, with performance in the range of 40 to 50 TOPS. Both companies are pushing the envelope with their innovations, aiming to drive NPU performance even higher to keep up with the growing need for on-device AI processing.
Mike Clark from AMD echoed this sentiment, stressing the need to maintain traditional computing strengths in parallel with new AI tasks: “We need to be great at low latency, at dealing with smaller data types, at branching code—this is the compute staple workloads. We can’t cede leadership on that front, but we don’t want to cede leadership in AI and machine learning either.” This balancing act reflects a broader industry trend, as manufacturers work to improve AI performance without compromising everything else a processor must do.
Future Prospects for NPUs
Looking ahead, industry experts predict that NPUs capable of thousands of TOPS will become widely available within a couple of years. The need for that kind of performance emerges from the rapid mainstreaming of AI into general purpose computing workloads. Rakesh Anigundi highlighted this growing need by stating, “You’ll want to be running this for a longer period of time, such as an AI personal assistant, which could be always active and listening for your command.”
This move to more integrated NPUs directly addresses several long-standing drawbacks of PC memory architecture. Joe Macri outlined the challenges faced when using discrete GPUs: “When I have a discrete GPU, I have a separate memory subsystem hanging off it.” He elaborated on the complexity of data sharing between CPU and GPU, stating, “When I want to share data… I’ve got to take the data out of my memory, slide it across the PCI Express bus, put it in the GPU memory… then move it all back.”
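The cost Macri describes is easy to put rough numbers on. The sketch below uses a nominal PCIe 4.0 x16 bandwidth and an invented 2 GB payload; both figures are my illustrative assumptions, not measurements from the article.

```python
# Rough cost model for the copy-across-the-bus pattern Macri describes:
# move a tensor to discrete-GPU memory and back over PCI Express.
# 32 GB/s is the nominal PCIe 4.0 x16 rate (assumption, not measured).

PCIE_BYTES_PER_SEC = 32e9

def round_trip_ms(tensor_bytes: float) -> float:
    """Time to copy the data to GPU memory and move it all back."""
    return 2 * tensor_bytes / PCIE_BYTES_PER_SEC * 1e3

size = 2e9   # hypothetical 2 GB of model tensors
print(f"discrete GPU round trip: ~{round_trip_ms(size):.0f} ms")
print("unified memory: no copy needed; CPU, GPU, and NPU share the bytes")
```

Even at nominal bandwidth, a single round trip costs on the order of a tenth of a second, which is why eliminating the copy entirely, as the integrated designs below do, matters so much.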
In contrast, having CPUs and NPUs share a common thermal head makes this laborious process much simpler. Mahesh Subramony noted, “By bringing it all under a single thermal head, the entire power envelope becomes something that we can manage.” This integration not only creates a more efficient data processing pipeline but also improves energy efficiency, a key factor for mobile computing devices.
Vinesh Sukumar expressed a vision for future advancements in AI technology: “I want a complete artificial general intelligence running on Qualcomm devices.” This ambition is a clear continuation of the industry’s sustained progress and its drive to push the limits of what NPUs can do.

