Neural Processing Units (NPUs) have changed the PC architecture landscape significantly. Chipmakers such as Qualcomm, AMD, and Intel are racing to build artificial intelligence (AI) capability into PCs. As recently as 2023, NPUs were a rarity across AMD chips; since then, Qualcomm’s Snapdragon X chip has become the benchmark for the technology. These powerful, specialized processors accelerate AI tasks, and as a consequence users get faster, more responsive interactions with AI models.
Today’s PC NPUs deliver tens of TOPS (trillions of operations per second); going into 2023, an NPU was often lucky to reach 10 TOPS. Even so, that remains modest next to a discrete GPU such as Nvidia’s GeForce RTX 5090, which claims a peak of 3,352 TOPS. The gap underscores the current arms race among chipmakers as they rush to create faster, stronger NPUs.
With demand for AI applications exploding, the reality today is that most PCs still ship without an NPU at all. That gap points to room for far wider adoption of these specialized processors in the devices we use day to day. Companies are also starting to realize that they can’t rely on NPUs alone: the chips must be integrated into a larger hardware and software ecosystem to unlock their full potential.
The Current NPU Landscape
The current climate for NPUs is one of great creativity and competition. Qualcomm’s Snapdragon X chip is a game-changer for the industry: it includes what Qualcomm describes as the industry’s first 128-bit NPU, which more than triples AI processing capability. The chip is a good illustration of the workloads NPUs handle better than CPUs, above all dense matrix computations.
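The matrix computations mentioned above are the core of nearly every neural-network layer. A minimal sketch of that workload, using a naive pure-Python multiply (sizes and values are illustrative only; an NPU runs the same arithmetic in hardware across thousands of lanes):

```python
def matmul(a, b):
    """Naive dense matrix multiply: a is m x k, b is k x n (lists of lists).
    This triple loop is exactly the operation NPUs are built to accelerate."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            s = 0.0
            for p in range(k):
                s += a[i][p] * b[p][j]
            out[i][j] = s
    return out

# A tiny 2x3 @ 3x2 example; a real model layer might be 4096 x 4096.
a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(matmul(a, b))  # [[58.0, 64.0], [139.0, 154.0]]
```

A 4096 × 4096 multiply needs roughly 137 billion multiply-adds, which is why dedicated tensor hardware, rather than a general-purpose CPU core, is the natural home for this work.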
Though Qualcomm has a head start, AMD is moving in the same direction. The Ryzen AI Max features an especially robust NPU rated at 50 TOPS, placing it squarely in competition with Snapdragon’s high-end offering. As recently as early 2023, AMD chips with NPUs were an extreme novelty; that scarcity is now fading fast, a sign that the market is maturing.
Intel is not sitting still either: it is developing NPUs that aim to beat Qualcomm at its own game. Even as these companies scale up their efforts, they are working to pair higher performance with better efficiency. The race for ever-larger TOPS figures has become the defining feature of the current chip market.
“With the NPU, the entire structure is really designed around the data type of tensors [a multidimensional array of numbers],” – Steven Bathiche
The Role of NPUs in AI Processing
The introduction of NPUs is a key enabling technology for higher-performance AI applications in a personal-computer form factor. These processors are optimized for AI workloads, giving them a clear advantage over general-purpose CPUs.
Steven Bathiche stresses this specialization, explaining how NPUs take the heavy lifting off CPUs. A CPU can still manage a few trillion operations per second, but sustained tensor math is the forte of dedicated processing engines. This shift lays the groundwork for entirely new classes of more advanced, resource-hungry AI applications to run smoothly and efficiently.
NPUs speed things up by letting systems process more tokens per second. That capability quickly becomes a major differentiator in user experience, especially with cutting-edge AI models that demand significant underlying computing power. Faster, more efficient NPUs let applications deliver richer experiences and capabilities, with increasingly more of the workflow running locally.
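The link between TOPS and tokens per second can be sketched with back-of-the-envelope arithmetic. This is an illustrative model under stated assumptions (roughly 2 operations per parameter per token, and a made-up sustained utilization figure), not a vendor benchmark; real systems are often limited by memory bandwidth rather than raw compute:

```python
def tokens_per_second(params_billion, tops, utilization=0.5):
    """Rough compute-bound ceiling on token generation rate.
    Assumes ~2 operations per model parameter per generated token
    and a sustained fraction of peak TOPS (both illustrative)."""
    ops_per_token = 2 * params_billion * 1e9   # ~2 ops per parameter
    ops_available = tops * 1e12 * utilization  # sustained ops per second
    return ops_available / ops_per_token

# A hypothetical 7-billion-parameter model on a 50 TOPS NPU vs a 10 TOPS one.
print(round(tokens_per_second(7, 50)))  # 1786 tokens/s (compute bound only)
print(round(tokens_per_second(7, 10)))  # 357 tokens/s
```

The point is the scaling, not the absolute numbers: five times the TOPS buys roughly five times the compute ceiling for the same model, which is why the TOPS race translates directly into user-visible responsiveness.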
“NPUs are much more specialized for that workload. And so we go from a CPU that can handle three [trillion] operations per second (TOPS), to an NPU” – Steven Bathiche
The Future of PC Architecture
The adoption of NPUs in personal computing devices heralds a dramatic shift in PC architecture. As these processors become more widely adopted, they’ll start to shape the design and construction of future systems.
NPUs perform best when included as a component of a shared memory architecture. Mahesh Subramony explains why this integration is so important: it allows designers to optimize power envelopes while keeping the performance bar high. Unifying the memory system enables efficient data transfer between CPUs and NPUs, increasing throughput and minimizing latency, and accelerating computation across the board.
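A small sketch of why shared memory matters, using assumed, illustrative numbers: with separate memory pools, every tensor handed from CPU to NPU pays a bulk copy over an interconnect, while with unified memory the NPU simply reads it in place:

```python
def handoff_time_ms(tensor_mb, link_gb_s=None):
    """Time to hand a tensor from CPU to NPU.
    link_gb_s=None models zero-copy shared memory (a pointer is passed);
    otherwise the tensor crosses a link of the given bandwidth.
    Bandwidth figure below is hypothetical, for illustration only."""
    if link_gb_s is None:
        return 0.0  # unified memory: no bulk copy
    return tensor_mb / (link_gb_s * 1000) * 1000  # MB / (MB per ms)

# 512 MB of model weights over a hypothetical 32 GB/s link vs shared memory.
print(handoff_time_ms(512, link_gb_s=32))  # 16.0 ms per handoff
print(handoff_time_ms(512))                # 0.0 ms -- read in place
```

Sixteen milliseconds per handoff sounds small, but it recurs on every exchange between processors; eliminating it is what Subramony means by unifying memory to raise throughput and cut latency.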
The ongoing arms race among Qualcomm, AMD, and Intel to produce superior NPUs reflects the growing significance of AI in everyday computing. These companies are working hard to develop more holistic solutions, knowing that designing systems to take advantage of every processor on the chip is critical.
“It’s about being smart. It’s about using all the [processors] at hand, being efficient, and prioritizing workloads across the CPU, the NPU, and so on. There’s a lot of opportunity and runway to improve,” – Steven Bathiche