Researchers have identified continuous integration and continuous delivery (CI/CD) misconfigurations in the open-source TensorFlow machine learning framework that could have been exploited to mount supply chain attacks. Researchers from Praetorian found that an attacker could compromise TensorFlow releases on GitHub and PyPI by taking control of TensorFlow’s build agents through a malicious pull request. Successful exploitation would allow an external attacker to upload malicious releases, gain remote code execution on GitHub runners, and obtain a GitHub Personal Access Token (PAT) for the tensorflow-jenkins user. TensorFlow uses GitHub Actions for automation, and some of its jobs ran on self-hosted runners, which allowed contributors to execute arbitrary code by submitting a malicious pull request.
The misconfigurations raised concerns about the security of TensorFlow's GitHub Actions workflows, particularly those executed on self-hosted runners. The researchers noted that anyone who forks a public repository can potentially run dangerous code on its self-hosted runners by opening a pull request that executes code in a workflow. In TensorFlow's case, this meant an adversary could execute arbitrary code on a self-hosted runner simply by submitting a malicious pull request. Further investigation revealed that the self-hosted runners were non-ephemeral (they persisted between jobs rather than being rebuilt for each one) and that the GITHUB_TOKEN available to their workflows carried extensive permissions, enabling an attacker to upload releases and push code directly to the TensorFlow repository.
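The risky combination described above, a workflow that is triggered by pull requests and runs on a self-hosted runner, can be checked for mechanically. The following is a minimal sketch, not Praetorian's tooling: it assumes the workflow file has already been parsed into a Python dictionary, and the helper names (`triggers_of`, `uses_self_hosted`, `risky_jobs`) are hypothetical. It is also only a heuristic, since runners selected by custom labels or runner groups may be self-hosted without carrying the `self-hosted` label.

```python
# Hypothetical audit helpers; names and the example workflow are illustrative.

PR_TRIGGERS = {"pull_request", "pull_request_target"}


def triggers_of(workflow):
    """Return the set of event names that start the workflow."""
    # YAML 1.1 parsers such as PyYAML may load the bare key `on` as boolean True,
    # so both spellings of the key are checked.
    on = workflow.get("on", workflow.get(True, []))
    if isinstance(on, str):
        return {on}
    return set(on)  # works for a list of events or a mapping keyed by event


def uses_self_hosted(job):
    """True if the job's runs-on value includes the 'self-hosted' label."""
    runs_on = job.get("runs-on", [])
    if isinstance(runs_on, dict):  # the group/labels mapping form
        runs_on = runs_on.get("labels", [])
    if isinstance(runs_on, str):
        runs_on = [runs_on]
    return "self-hosted" in runs_on


def risky_jobs(workflow):
    """List job ids that run on self-hosted runners in a PR-triggered workflow."""
    if not (triggers_of(workflow) & PR_TRIGGERS):
        return []
    jobs = workflow.get("jobs", {})
    return [job_id for job_id, job in jobs.items() if uses_self_hosted(job)]


# Illustrative workflow, not taken from the TensorFlow repository.
example = {
    "on": ["pull_request"],
    "jobs": {
        "build": {"runs-on": ["self-hosted", "linux"]},
        "lint": {"runs-on": "ubuntu-latest"},
    },
}
print(risky_jobs(example))  # ['build']
```

A flagged job is not necessarily exploitable, because repository settings such as fork pull request approval also matter, but it marks a workflow that deserves review.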
Following responsible disclosure on August 1, 2023, TensorFlow maintainers addressed the shortcomings by requiring approval for workflows submitted from all fork pull requests, including those from previous contributors. In addition, GITHUB_TOKEN permissions were changed to read-only for workflows running on self-hosted runners. Together, these measures reduce the risk of unauthorized access and malicious code injection. The incident highlights the growing threat of CI/CD attacks, particularly for organizations that rely on self-hosted runners to supply significant compute power for their workflows.
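The read-only token mitigation can also be verified in workflow files that declare a `permissions` key. The sketch below is an assumption-laden illustration: the source does not say whether TensorFlow applied the change in workflow files or in repository settings, the helper names (`is_read_only`, `jobs_with_writable_token`) are hypothetical, and the workflow is assumed to be already parsed into a Python dictionary. A job with no declared permissions is treated as writable, a deliberately conservative choice, because it then falls back to the repository default.

```python
# Hypothetical helpers for checking GITHUB_TOKEN permissions in a parsed workflow.


def is_read_only(permissions):
    """True if a `permissions` value grants no write access."""
    if permissions is None:
        # Nothing declared: the repository default applies and may allow writes.
        return False
    if isinstance(permissions, str):
        # Shorthand forms are "read-all" and "write-all".
        return permissions == "read-all"
    # Mapping of scope -> "read" | "write" | "none"; an empty mapping grants nothing.
    return all(level != "write" for level in permissions.values())


def jobs_with_writable_token(workflow):
    """List job ids whose effective token permissions are not read-only."""
    default = workflow.get("permissions")
    flagged = []
    for job_id, job in workflow.get("jobs", {}).items():
        # A job-level `permissions` key overrides the workflow-level one.
        effective = job.get("permissions", default)
        if not is_read_only(effective):
            flagged.append(job_id)
    return flagged


# Illustrative workflow, not taken from the TensorFlow repository.
workflow = {
    "permissions": "read-all",
    "jobs": {
        "build": {"runs-on": "self-hosted"},
        "release": {
            "runs-on": "self-hosted",
            "permissions": {"contents": "write"},
        },
    },
}
print(jobs_with_writable_token(workflow))  # ['release']
```

Here `build` inherits the workflow-level `read-all` setting, while `release` is flagged because its job-level override grants write access to repository contents.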