The National Institute of Standards and Technology (NIST) has identified four major types of cyberattack capable of manipulating the behavior of AI systems. The guidance, titled “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations,” explains how adversaries exploit AI vulnerabilities by introducing untrustworthy data that causes systems to malfunction. One prominent attack type is the “evasion” attack, in which adversaries craft inputs that confuse an AI’s decision-making, for instance, misleading road markings that could cause a driverless car to veer into oncoming traffic. The publication is part of NIST’s broader effort to foster trustworthy AI development.
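To make the evasion idea concrete, here is a minimal sketch, not taken from the NIST publication, of a fast-gradient-sign-style attack on a toy linear classifier. Everything in it (the weights `w`, the step size `epsilon`) is an illustrative assumption; real evasion attacks target far more complex models, but the mechanics, nudging each input feature a small step in the direction that most degrades the model's decision, are the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear classifier: predict class 1 when w . x + b > 0.
# The weights stand in for a trained model's parameters.
w = rng.normal(size=16)
b = 0.1

def predict(x):
    return int(w @ x + b > 0)

# Pick a clean input the classifier assigns to class 1.
x = rng.normal(size=16)
if predict(x) == 0:
    x = -x  # flip the sample so the starting prediction is class 1

# Fast-gradient-sign-style evasion: move every feature a small step
# epsilon in the direction that lowers the class-1 score. For a linear
# model, the gradient of the score with respect to x is simply w.
epsilon = 0.4
x_adv = x - epsilon * np.sign(w)

print("clean prediction:      ", predict(x))      # 1
print("adversarial prediction:", predict(x_adv))  # usually flips to 0
print("largest feature change:", np.abs(x_adv - x).max())  # == epsilon
```

The point of the sign trick is that each individual feature changes by at most epsilon, small enough to look innocuous, while the combined effect on the model's score is large.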
AI systems are integral to modern society, powering autonomous vehicles, medical diagnostic tools, and online chatbots, and they are particularly susceptible to adversarial attacks. The large datasets used to train them often draw on unreliable sources, such as public interactions and websites, giving threat actors opportunities to manipulate the data and induce undesirable behavior, for instance, a chatbot adopting abusive language. NIST computer scientist Apostol Vassilev, one of the paper’s authors, stresses the need for better defenses, noting that existing measures lack robust assurances that they fully mitigate the risks of adversarial attacks on AI systems.
In addition to evasion attacks, the NIST report identifies three other major threat classes: poisoning, privacy, and abuse. Poisoning attacks occur during training, introducing corrupted data that shapes the model’s future behavior; privacy attacks aim to extract sensitive information about the AI or the data it was trained on; and abuse attacks insert false information into legitimate sources the AI consumes, repurposing the system’s intended function. The researchers caution against overconfidence in AI security, noting that many of these attacks can be mounted with minimal knowledge of the target system and limited adversarial capabilities, and that they are difficult to undo once executed.
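As an illustration of the poisoning category, the following sketch, again an assumption-laden toy rather than anything from the report, shows an injection attack against a simple nearest-centroid classifier: the attacker slips mislabeled points into the training set, and the retrained model's accuracy on clean data drops.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated clusters stand in for a clean training set.
n = 200
X = np.vstack([rng.normal(-2.0, 1.0, (n, 2)), rng.normal(+2.0, 1.0, (n, 2))])
y = np.array([0] * n + [1] * n)

def train_centroids(X, y):
    """Nearest-centroid 'model': one mean vector per class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def accuracy(model, X, y):
    preds = [min(model, key=lambda c: np.linalg.norm(x - model[c])) for x in X]
    return float(np.mean(np.array(preds) == y))

clean_model = train_centroids(X, y)

# Poisoning by injection: the attacker slips 100 crafted points, placed
# deep inside class 0's region but labeled as class 1, into the training
# data. They drag the class-1 centroid across the true boundary.
X_poison = rng.normal(-8.0, 0.5, (100, 2))
y_poison = np.ones(100, dtype=int)
X_bad = np.vstack([X, X_poison])
y_bad = np.concatenate([y, y_poison])

poisoned_model = train_centroids(X_bad, y_bad)

# Evaluate both models on the clean data only.
print("clean model accuracy:   ", accuracy(clean_model, X, y))    # ~1.00
print("poisoned model accuracy:", accuracy(poisoned_model, X, y)) # noticeably lower
```

The corrupted points make up only a fifth of the poisoned training set, which echoes the report's warning: an attacker who controls even a modest slice of the training data can steer the resulting model's behavior.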
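Privacy attacks can likewise be sketched in a few lines. The loss-threshold membership-inference attack below is a standard textbook technique, not a method described in the NIST document, and all its names and parameters are illustrative. It exploits the fact that an overfit model is suspiciously confident on the exact records it was trained on, letting an attacker guess whether a given record was in the training set.

```python
import numpy as np

rng = np.random.default_rng(2)

# High-dimensional data with few examples makes memorization easy.
d = 64
X_members = rng.normal(size=(30, d))   # the model's training set
X_outside = rng.normal(size=(30, d))   # fresh records never seen in training
y_members = rng.integers(0, 2, 30)
y_outside = rng.integers(0, 2, 30)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

# Overfit a plain logistic-regression model on the member records.
w = np.zeros(d)
for _ in range(2000):
    p = sigmoid(X_members @ w)
    w -= 0.5 * X_members.T @ (p - y_members) / len(y_members)

def per_example_loss(X, y):
    """Cross-entropy per record: near zero on memorized training points."""
    p = np.clip(sigmoid(X @ w), 1e-9, 1.0 - 1e-9)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Loss-threshold membership inference: guess "member" whenever the
# model's loss on a record is suspiciously low.
threshold = 0.1
losses = np.concatenate([per_example_loss(X_members, y_members),
                         per_example_loss(X_outside, y_outside)])
guesses = losses < threshold
truth = np.array([True] * 30 + [False] * 30)

# Accuracy well above the 0.5 chance baseline signals a privacy leak.
print("membership-inference accuracy:", (guesses == truth).mean())
```

When the records in question are medical histories or financial details, even this crude yes/no leak is sensitive information, which is why the report treats privacy attacks as a class of their own.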