NVIDIA’s Triton Inference Server contains critical remote code execution vulnerabilities that could allow unauthenticated attackers to execute code remotely and potentially take control of AI servers. Three vulnerabilities have been disclosed: CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, with Common Vulnerability Scoring System (CVSS) scores of 8.1, 7.5, and 5.9 respectively. The flaws reside in the server’s Python backend and affect installations on both Windows and Linux.
Researchers warn that these flaws can be chained into severe exploits. Individually, successful exploitation leads to information disclosure or remote code execution (RCE); combined, the chain can escalate an information leak into a complete system takeover, leaving any organization running a vulnerable server exposed.
Details of the Vulnerabilities
CVE-2025-23319 is an out-of-bounds write vulnerability in the Triton Inference Server’s Python backend. At a high level, it lets attackers manipulate memory outside intended bounds, which may allow them to read sensitive data or take over the server.
CVE-2025-23320 allows attackers to bypass shared-memory limits, offering another path to exploitation. It is especially damaging because it lets attackers circumvent protections that rely on those memory limits.
CVE-2025-23334 is an out-of-bounds read in the Python backend. It can leak data, compounding the risk posed by the other two vulnerabilities.
“When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE),” wrote Wiz researchers Ronen Shustin and Nir Ohfeld.
Implications for Organizations
These vulnerabilities pose a critical risk for organizations running AI and machine-learning workloads on NVIDIA’s Triton. A successful exploit could let attackers steal multimillion-dollar AI models, leak sensitive data, or bias the outputs of machine-learning systems. It could also give attackers a foothold from which to move deeper into an organization’s network.
Will Vandevanter, of Trail of Bits, walked the community through the technical details of one vulnerability. He noted that the standard ‘alloca’ function allocates memory on the stack based on runtime parameters. This is dangerous when those parameters are untrusted, as it can lead to stack overflows and memory corruption.
NVIDIA’s Response
NVIDIA has released version 25.07 of its Triton Inference Server. This security update addresses CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334. Beyond those three flaws, other serious bugs such as CVE-2025-23310, CVE-2025-23311, and CVE-2025-23317 were also addressed in NVIDIA’s August bulletin.
“This poses a critical risk to organizations using Triton for AI/ML,” warned Wiz researchers. “A successful attack could lead to the theft of valuable AI models, exposure of sensitive data, manipulating the AI model’s responses, and a foothold for attackers to move deeper into a network.”