At the Consumer Electronics Show (CES) in Las Vegas this week, Nvidia unveiled its Vera Rubin architecture, a design the company says could reshape artificial intelligence and big data. The new architecture promises a substantial step up in performance and efficiency for deep learning workloads, helping organizations meet the growing demand for more powerful computing.
The Vera Rubin architecture
The Vera Rubin architecture is a notable advance over the Blackwell architecture, comprising a full suite of components designed to minimize inference costs and hardware requirements.
Key Features of Vera Rubin Architecture
The architecture delivers equally impressive performance improvements, including a ten-fold reduction in inference cost. That efficiency allows organizations to deliver greater impact while cutting bottom-line costs. It also enables a more-than-four-fold reduction in the number of GPUs required to train certain models, which in turn translates into simpler, more efficient data center operations.
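The claimed savings can be put into rough numbers. The sketch below applies the ten-fold inference-cost reduction and four-fold GPU reduction cited above to a baseline; the baseline figures themselves are hypothetical placeholders, not numbers from Nvidia.

```python
# Rough arithmetic on the efficiency claims: 10x lower inference cost
# and 4x fewer GPUs for training. Baselines are illustrative assumptions.

baseline_cost_per_million_tokens = 1.00  # hypothetical baseline cost ($)
baseline_training_gpus = 4096            # hypothetical cluster size

rubin_cost = baseline_cost_per_million_tokens / 10  # ten-fold reduction
rubin_gpus = baseline_training_gpus / 4             # four-fold reduction

print(f"Inference cost: ${rubin_cost:.2f} per million tokens")
print(f"Training GPUs needed: {rubin_gpus:.0f}")
```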
At the heart of the architecture are the Vera central processing unit and the Rubin graphics processing unit. Tuned specifically for today's AI workloads, the Rubin GPU delivers striking performance: in 4-bit computation it can perform 50 quadrillion floating-point operations per second. That makes the architecture an exceptional fit for transformer-based inference workloads, such as those behind large language models.
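The "4-bit computation" behind that throughput figure refers to ultra-low-precision floating-point formats. The sketch below illustrates the idea with a simplified FP4 (E2M1) value grid and per-tensor scaling; the grid and scaling scheme are common conventions for 4-bit formats, not a description of Nvidia's exact implementation.

```python
# Illustrative sketch of 4-bit floating-point (FP4) quantization.
# E2M1 = 2 exponent bits, 1 mantissa bit; a simplified assumption here.

# Representable magnitudes in an E2M1-style FP4 format
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(values):
    """Round floats to the nearest FP4 value after per-tensor scaling."""
    # Scale so the largest magnitude maps to the top of the FP4 range
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / 6.0
    out = []
    for v in values:
        # Pick the closest representable magnitude, then restore sign/scale
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(scale * mag * (1 if v >= 0 else -1))
    return out

weights = [0.12, -0.03, 0.55, -0.91, 0.02]
print(quantize_fp4(weights))
```

Storing and multiplying 4-bit values instead of 16- or 32-bit ones is what allows a GPU to pack far more operations into the same silicon and memory bandwidth.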
To improve communication and data transfer within data centers, the architecture features four separate networking chips. These chips are essential to support the growing need for distributed inferencing across racks.
Networking Innovations
Beyond its core processors, the Vera Rubin architecture incorporates state-of-the-art networking components. Notable among them is ConnectX-9, a new network interface card designed for the coming era of exabits-per-second data transfers. The BlueField-4 data processing unit further improves the overall efficiency of data processing within the architecture.
Another distinguishing feature is the Spectrum-6 Ethernet switch, which uses co-packaged optics to transmit data more efficiently between racks. Nvidia is leveraging these networking advances to push data center performance to new heights and keep pace with the ever-growing demands of AI-powered applications.
Gilad Shainer, a senior leader at Nvidia, emphasized the shift in inferencing practices:
“Two years back, inferencing was mainly run on a single GPU, a single box, a single server.”
Today, he said, the trend is for inferencing to be increasingly distributed across racks, requiring more advanced networking technologies.
“Right now, inferencing is becoming distributed, and it’s not just in a rack. It’s going to go across racks.”
This evolution underscores the need to factor advanced networking capabilities into the design of modern computing architectures.
A New Era of Computing
The Vera Rubin architecture readily accommodates these complex workloads and numerical formats, a versatility that suits it to a wide range of applications. This flexibility enables organizations to choose the optimal configuration for their unique needs, improving performance and efficiency across the board.
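One concrete reason format flexibility matters is memory footprint. The back-of-the-envelope sketch below shows how much memory the weights of a hypothetical 70-billion-parameter model would need at different precisions; the parameter count is an illustrative assumption, not a figure from Nvidia.

```python
# Weight memory footprint of a hypothetical model at various precisions.

PARAMS = 70e9  # hypothetical 70B-parameter model (assumption)

bits_per_format = {"FP32": 32, "FP16": 16, "FP8": 8, "FP4": 4}

for fmt, bits in bits_per_format.items():
    gigabytes = PARAMS * bits / 8 / 1e9  # bits -> bytes -> GB
    print(f"{fmt}: {gigabytes:.0f} GB")
```

Dropping from 32-bit to 4-bit weights cuts the footprint eight-fold, which is why choosing the right numerical format per workload directly affects how many GPUs a deployment needs.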
This approach focuses on optimizing not just individual components but their interactions within an entire system to create a cohesive and high-performing solution.
“The same unit connected in a different way will deliver a completely different level of performance.”
Shainer summed up this system-level design philosophy:
“That’s why we call it extreme co-design.”
As demands for processing power continue to rise, Shainer remarked on the necessity for more GPUs within data centers:
“It doesn’t stop here, because we are seeing the demands to increase the number of GPUs in a data center.”
By addressing these challenges with the Vera Rubin architecture, Nvidia aims to position itself at the forefront of the industry as organizations seek solutions that can effectively manage increasingly complex AI workloads.

