At the Consumer Electronics Show (CES) taking place in Las Vegas this week, Nvidia announced its Vera Rubin architecture. The new architecture is meant to accelerate computing for current and future workloads, and it is expected to reach customers late this year. Vera Rubin is intended to maximize efficiency and reduce costs. Nvidia says it will deliver a tenfold reduction in inference costs and a fourfold reduction in the number of GPUs needed to train certain models.
The Vera Rubin architecture is a big step beyond Blackwell, Nvidia's own previous-generation architecture, which was announced in 2024. The new architecture increases computational power while adding the flexibility to handle a variety of workloads and numerical types. Organizations are increasingly seeking out advanced, efficient solutions for their most challenging computing workloads, and with Vera Rubin, Nvidia aims to offer them a new option.
Performance Enhancements
Nvidia’s Vera Rubin architecture delivers 50 petaFLOPS, or 50 quadrillion floating-point operations per second. That figure applies specifically to 4-bit compute. It is up to five times the performance of the Blackwell architecture, especially for transformer-based inference workloads. The system is built around several components that increase its throughput. The announced networking components include the ConnectX networking interface card, the BlueField-4 data processing unit, and the Spectrum-6 Ethernet switch.
These improvements enable the Vera Rubin architecture to complete a wider variety of computations in a more energy-efficient manner. By using co-packaged optics, the architecture speeds up data transmission between racks while lowering latency. As Gilad Shainer, a senior vice president at Nvidia, noted, “The same unit connected in a different way will deliver a completely different level of performance.” The statement highlights the importance of connectivity in realizing the full potential of the new architecture.
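The headline numbers can be checked with a few lines of arithmetic. The sketch below is only a unit conversion plus a division: it assumes the 50 petaFLOPS figure and the "up to 5x" multiplier refer to the same 4-bit workload, and the variable names are illustrative. The result is the Blackwell baseline implied by those two quoted numbers, not a published Blackwell specification.

```python
# One petaFLOPS is 10**15 (one quadrillion) floating-point operations per second.
PETA = 10**15


def petaflops_to_flops(petaflops: float) -> float:
    """Convert a petaFLOPS figure to raw operations per second."""
    return petaflops * PETA


# Figures quoted in the article (4-bit compute).
rubin_petaflops = 50.0
max_speedup_over_blackwell = 5.0  # "up to 5x", so treat as an upper bound

rubin_flops = petaflops_to_flops(rubin_petaflops)

# Assumption: if the full 5x applies, the implied baseline is 50 / 5 petaFLOPS.
implied_blackwell_petaflops = rubin_petaflops / max_speedup_over_blackwell

print(f"Rubin: {rubin_flops:.1e} operations per second")
print(f"Implied Blackwell baseline: {implied_blackwell_petaflops} petaFLOPS")
```

Because the multiplier is stated as "up to" five times, the implied baseline is the figure that makes the full 5x hold; for workloads that see a smaller speedup, the comparison would differ.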
Architectural Innovations
The Vera Rubin architecture includes six new chips designed to work in tandem: the Vera CPU, the Rubin GPU, and four distinct networking chips. The combined arrangement makes for a more integrated system, focused on moving and processing data at high speed and large scale. Because the components were co-designed, they are meant to operate together rather than as separately optimized parts.
“Right now, inferencing is becoming distributed, and it’s not just in a rack. It’s going to go across racks,” Shainer explained. This shift in how inferencing is conducted highlights the adaptability of the Vera Rubin architecture in addressing emerging computing demands. The ability to scale effectively across multiple systems ensures that organizations can meet their growing needs without compromising on performance.
Strategic Advantages
Improving the state of the art for data centers and high-performance computing has long been a core focus for Nvidia, and the release of the Vera Rubin architecture illustrates that commitment. Nvidia's pitch is that its systems need fewer GPUs to do the same work, which can mean substantial cost savings for users. Beyond raw performance, that efficiency is meant to give businesses a competitive edge by letting them run their operations at lower cost.
“It doesn’t stop here, because we are seeing the demands to increase the number of GPUs in a data center,” Shainer stated. The remark reflects the reality that as workloads keep changing, so must the technologies that run them. The Vera Rubin architecture is designed with that in mind, so that it is ready as the computing landscape faces new challenges.

