Lucian Constantin
CSO Senior Writer

Critical PyTorch flaw puts sensitive AI data at risk

News
10 Jun 2024 | 3 mins
PyTorch | Vulnerabilities

Researchers have discovered a critical flaw in PyTorch’s distributed RPC framework that allows attackers to execute arbitrary commands on the underlying OS and steal sensitive AI training data.


PyTorch, the popular machine learning framework, has fixed a critical vulnerability that could allow attackers to execute arbitrary commands on master nodes, potentially leading to the theft of sensitive AI-related data.

The vulnerability, tracked as CVE-2024-5480, was reported through Huntr.com, a bug bounty platform for AI/ML applications. The flaw was patched in PyTorch 2.2.2 and carries the maximum severity score of 10 under the Common Vulnerability Scoring System (CVSS).

PyTorch is a framework for developing machine learning applications in the fields of computer vision and natural language processing. It is a continuation of the older Torch library, which is no longer maintained. PyTorch was developed by the artificial intelligence laboratory at Meta but is now maintained as an open-source project by the PyTorch Foundation under the Linux Foundation’s umbrella.

Unsanitized RPC function calls

The vulnerability is located in PyTorch’s distributed Remote Procedure Call (RPC) component, torch.distributed.rpc. The component facilitates inter-process communication between the nodes involved in distributed training scenarios, in which a task is split across multiple deployments that function as workers and is coordinated from a master node.
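For readers unfamiliar with the component, the following is a minimal sketch of a two-process RPC setup on one machine; the node names, the localhost rendezvous settings, and the add() payload are illustrative assumptions, not details from the advisory.

```python
# Minimal sketch of torch.distributed.rpc (illustrative; node names,
# port, and the add() payload are assumptions, not from the advisory).
import os
import torch.distributed.rpc as rpc
import torch.multiprocessing as mp

def add(a, b):
    # A user-defined function that one node asks another to run remotely.
    return a + b

def run(rank, world_size):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    name = "master" if rank == 0 else f"worker{rank}"
    rpc.init_rpc(name, rank=rank, world_size=world_size)
    if rank != 0:
        # The caller serializes the function and its arguments; the
        # receiving node deserializes and executes them.
        print(rpc.rpc_sync("master", add, args=(2, 3)))  # prints 5
    rpc.shutdown()  # blocks until outstanding RPCs have drained

if __name__ == "__main__":
    mp.spawn(run, args=(2,), nprocs=2)
```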

When using RPC, workers can serialize PythonUDFs (user-defined functions) and send them to the master node, which then deserializes and runs them. The problem is that PyTorch versions older than 2.2.2 place no restrictions on calling built-in Python functions such as eval during deserialization, which in turn allows attackers to execute arbitrary commands on the underlying operating system.
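PyTorch’s internal code isn’t reproduced here, but the flaw class is easy to illustrate with a hypothetical sketch: a deserializer that resolves and invokes a callable from an untrusted payload, with no allowlist, hands the attacker the full power of the Python runtime.

```python
# Generic illustration of the flaw class (hypothetical code, not
# PyTorch's implementation): resolving callables from an untrusted
# payload without an allowlist turns deserialization into code execution.
import builtins

def unsafe_run_udf(payload: dict):
    # If the payload is {"func": "eval", "args": ["__import__('os').getcwd()"]},
    # the deserializer ends up executing attacker-supplied Python.
    func = getattr(builtins, payload["func"])  # no restriction on built-ins
    return func(*payload["args"])

def safer_run_udf(payload: dict, allowlist: dict):
    # The fix pattern: only resolve functions from an explicit allowlist.
    func = allowlist.get(payload["func"])
    if func is None:
        raise ValueError(f"{payload['func']!r} is not a permitted UDF")
    return func(*payload["args"])
```

The 2.2.2 fix presumably follows the latter pattern, restricting which functions a deserialized payload may invoke.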

“An attacker can exploit this vulnerability to remotely attack master nodes that are starting distributed training,” the researchers who reported the vulnerability wrote in their report. “Through RCE [remote code execution], the master node is compromised, so as to further steal the sensitive data related to AI.”

Increasing vulnerabilities in AI tools

With adoption of machine learning models and other AI applications on the rise across all industries, organizations are increasingly turning to specialized AI frameworks such as PyTorch. These tools can have vulnerabilities like any other software program, and some, being fairly new, haven’t been as heavily scrutinized by the security research community.

It’s important to include AI frameworks in your vulnerability management programs, as their compromise can have a big impact. Machine learning models are usually trained on sensitive business data and are very costly to produce, since training them requires a lot of computing power and time. Attackers can exploit vulnerabilities to steal training data or to poison it. Depending on the model’s use case, stealing it to figure out how it makes decisions can also be valuable to certain types of attackers.
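As a concrete starting point, a dependency audit can flag vulnerable installs. A minimal sketch, assuming only the 2.2.2 patch floor reported above; everything else is illustrative:

```python
# Minimal sketch of a dependency audit for this CVE; the 2.2.2 floor is
# from the advisory, the rest is illustrative.
from importlib.metadata import version, PackageNotFoundError
from packaging.version import Version

PATCHED = Version("2.2.2")

try:
    installed = Version(version("torch"))
except PackageNotFoundError:
    print("torch is not installed")
else:
    if installed < PATCHED:
        print(f"torch {installed} is affected by CVE-2024-5480; upgrade to >= {PATCHED}")
    else:
        print(f"torch {installed} includes the CVE-2024-5480 fix")
```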
