When Your CI/CD AI Agent Becomes the Attacker: Microsoft Reveals How a Prompt Injection in Claude Code's GitHub Action Exposed Your Pipeline Secrets

Development teams using GitHub Actions with artificial intelligence agents have a new risk vector to understand. On 5 June 2026, Microsoft Threat Intelligence published on its security blog its full analysis of a vulnerability that its researchers discovered and responsibly disclosed in Anthropic's official Claude Code GitHub Action. Through prompt injection, an external attacker could instruct the AI agent to read the CI/CD runner's environment, including API keys, cloud credentials and GitHub tokens, and exfiltrate them through workflow logs, a repository comment or a web request. All it took was a simple HTML comment, invisible to the human reviewer but readable by the model. The patch has been available since 5 May in Claude Code 2.1.128. But the lesson of this incident goes far beyond one specific GitHub Action: any AI agent that processes untrusted GitHub content while having access to secrets and the ability to act externally is a first-order security risk.

What did Microsoft discover and how did the attack work?

Facts documented by Microsoft Threat Intelligence (Microsoft Security Blog, 5 June 2026), GBHackers, CybersecurityNews and Decrypt:

The vulnerability: Claude Code's GitHub Action included a tool called Read for the AI agent to read repository files. Unlike the Bash tool (which was properly sandboxed), the Read tool was not subject to the same isolation environment and could access system files on the runner, including /proc/self/environ — the file containing all process environment variables, where GitHub Actions runners store API keys, tokens and credentials injected as secrets.
The prompt injection vector: an attacker with no special repository access, only the ability to open an issue or submit a pull request, could insert a malicious instruction inside an HTML comment (<!-- ... -->). The HTML comment is completely invisible when the issue renders in the browser, but the AI model reads the raw markdown and sees the instructions.
The payload and evasion technique: Microsoft's team built a payload that asked the model to perform a “compliance review” (to avoid triggering Claude's safety filters, which block direct requests to print API keys), read /proc/self/environ, and then “trim the first 7 characters of the result”. This extra step was key: seven characters is exactly the length of the sk-ant- prefix with which Anthropic API key values begin, so the trimmed output no longer matched a known key pattern. That allowed it to evade both the model's filters and the GitHub Secret Scanner, which detects known API key patterns but not trimmed fragments.
Exfiltration: with the secret in hand, the agent could send it via a web fetch to an external server, write it to workflow logs (publicly visible in public repositories) or include it in a GitHub comment. The attacker recovers the API key without any direct access to the repository or secrets.
Scope of secrets at risk: ANTHROPIC_API_KEY was the documented example, but GitHub Actions runners can contain GITHUB_TOKEN, cloud credentials (AWS, Azure, GCP), third-party tokens (npm, Docker, Terraform) and any other secret injected into the workflow.
Responsible disclosure: Microsoft reported the vulnerability to Anthropic via HackerOne on 29 April 2026. Anthropic published the fix in Claude Code 2.1.128 on 5 May, blocking Read tool access to sensitive /proc/ files.
Second related vulnerability: independently, researcher RyotaK of GMO Flatt Security documented another vulnerability in Claude Code GitHub Actions (patched in v1.0.94) that allowed an unauthenticated external attacker to exfiltrate secrets, steal OIDC tokens and push malicious code to downstream repositories by combining a flaw in checkWritePermissions with prompt injection techniques.
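
The isolation gap described above, a file-reading tool that can reach /proc/self/environ, can be illustrated with a small path guard. This is a minimal sketch under stated assumptions, not Anthropic's actual fix: the function name is_read_allowed, the BLOCKED_ROOTS deny-list and the workspace path in the usage note are all hypothetical. It resolves the requested path first, so that ".." segments and symlinks cannot step outside the workspace, and it rejects process and kernel pseudo-filesystems as a second layer.

```python
from pathlib import Path

# Hypothetical deny-list of pseudo-filesystems that expose process or kernel state.
BLOCKED_ROOTS = (Path("/proc"), Path("/sys"), Path("/dev"))


def is_read_allowed(requested: str, workspace: str) -> bool:
    """Return True only if the requested path resolves inside the workspace."""
    root = Path(workspace).resolve()
    # resolve() collapses ".." segments and follows symlinks before any check.
    # An absolute `requested` replaces `root` in the join, so it is checked as-is.
    target = (root / requested).resolve()
    # Defence in depth: refuse pseudo-filesystems even if the workspace check is misconfigured.
    for blocked in BLOCKED_ROOTS:
        if target == blocked or blocked in target.parents:
            return False
    return target == root or root in target.parents
```

With a hypothetical workspace of /home/runner/work/repo, a request for src/main.py would be allowed, while /proc/self/environ or a ../ traversal to /etc/passwd would be refused. A check like this complements a real sandbox rather than replacing it.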

Why AI agents in CI/CD are a new-order attack surface

The Claude Code GitHub Action incident is not an isolated anomaly: it is a publicly documented example of a class of vulnerabilities that can affect any team integrating AI agents into its development pipelines. Microsoft makes it explicit: “we began this research after observing prompt injection attempts in public repositories using AI-assisted GitHub workflows across multiple vendors.” This is not only a Claude Code problem. It is a problem with how AI agents are being deployed in high-trust CI/CD environments.

CI/CD runners are high-trust environments with privileged access. A GitHub Actions runner can have access to production cloud credentials, deploy tokens, SSH keys, and access to the complete infrastructure the pipeline manages. When an AI agent with access to these environments processes untrusted content (issues, PRs, comments from any user), the trust boundary collapses.
Prompt injection is the defining attack vector against AI agents. Unlike traditional software exploits, it does not need a flaw in the code that parses the input: it exploits the model's own ability to understand and follow natural language instructions. Any agent that processes untrusted text can be steered toward malicious actions by the right instructions.
Evasion mechanisms are simple and effective. The invisible HTML comment technique is the simplest: the human reviewer approves the issue or PR without seeing the instructions, but the model executes them. More sophisticated are instructions disguised as legitimate tasks (“compliance review”) that avoid the model's own safety filters.
The proliferation of AI agents in CI/CD is fast and outpaces security practices. Claude Code, GitHub Copilot, Cursor, Devin and other AI development agents are being integrated into pipelines without the same security controls we would apply to any other production-access system. It is the same pattern seen with VS Code extensions (TeamPCP/Nx Console) last month: trust in the tool precedes security auditing.
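
The gap between what a reviewer sees and what the model reads can be made concrete with a small pre-processing step. The following is a minimal sketch, assuming the workflow has the raw markdown body of the issue or PR as a string; the function names hidden_comments and strip_hidden_comments are hypothetical. It also treats an unterminated comment as running to the end of the text. Stripping comments narrows one hiding place only and is not a complete defence against prompt injection.

```python
import re

# Matches HTML comments in raw markdown, including multi-line ones.
# An unterminated "<!--" is treated as extending to the end of the text.
HTML_COMMENT = re.compile(r"<!--(.*?)(?:-->|\Z)", re.DOTALL)


def hidden_comments(raw_markdown: str) -> list[str]:
    """Return the text of every HTML comment found in the raw markdown."""
    return [body.strip() for body in HTML_COMMENT.findall(raw_markdown)]


def strip_hidden_comments(raw_markdown: str) -> str:
    """Remove HTML comments so the agent sees roughly what a reviewer sees."""
    return HTML_COMMENT.sub("", raw_markdown)
```

A triage workflow could log the result of hidden_comments for human review and pass only the output of strip_hidden_comments to the agent.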

How the attack works: from GitHub issue to exfiltrated secret

The attacker creates an issue or PR with a hidden payload. Inside the issue body, in an HTML comment (<!-- ... -->), the attacker inserts natural language instructions designed for the AI model to execute. The visible text of the issue looks completely normal.
The CI/CD workflow processes the issue with the agent. Many teams configure GitHub Actions for the AI agent to review issues, classify bugs, generate automatic responses or run triage tasks when a new issue or PR is opened.
The agent reads the hidden instructions and executes them. The model sees the raw markdown, including the HTML comment content. The instructions ask it to do something that seems reasonable (a review, a check) but actually includes reading /proc/self/environ.
The Read tool accesses the runner environment. Without proper sandboxing, the agent's Read tool can access /proc/self/environ and read all process environment variables, including secrets injected by GitHub Actions.
The model launders the output to evade filters. The instructions include an output cleanup step (trimming characters, reformatting) that prevents GitHub Secret Scanner from detecting the API key pattern.
The secrets reach the attacker. The agent exfiltrates the value through a public comment on the issue, workflow logs or a call to an external server. The attacker recovers the API key without having had direct access to the repository secrets at any point.
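
The output-laundering step is worth seeing in miniature, because it shows why pattern-based scanning alone is not enough. The sketch below rests on assumptions: KEY_PATTERN is a simplified, hypothetical stand-in for a scanner rule (real scanners use their own rules), the sk-ant- prefix is assumed as the key format, and leaks_secret is a hypothetical value-based check that a pipeline could run over agent output, since the runner itself knows the secret values.

```python
import re

# Hypothetical, simplified rule for a prefixed API key; not a real scanner rule.
KEY_PATTERN = re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}")


def pattern_scan(text: str) -> bool:
    """Return True if the text contains something shaped like a complete key."""
    return KEY_PATTERN.search(text) is not None


def leaks_secret(text: str, secrets: list[str], min_len: int = 12) -> bool:
    """Return True if any min_len-character slice of a known secret appears in text."""
    for secret in secrets:
        if len(secret) < min_len:
            continue  # too short to match without many false positives
        for start in range(len(secret) - min_len + 1):
            if secret[start:start + min_len] in text:
                return True
    return False
```

For a made-up key consisting of the sk-ant- prefix followed by 24 alphanumeric characters, pattern_scan flags the full value but not the value with its first 7 characters removed, whereas leaks_secret still flags the trimmed fragment because it compares against the secret's actual contents.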

Key lessons: Microsoft's Agents Rule of Two and the hardening checklist

Microsoft Threat Intelligence's analysis sets out a security design principle for AI workflows that DevSecOps teams should adopt immediately: the “Agents Rule of Two.” An AI workflow should hold at most two of the following three capabilities at the same time, never all three:

1. Processing untrusted input (issues, PRs, GitHub comments, any content generated by external users).
2. Access to sensitive secrets (API keys, cloud credentials, deploy tokens, SSH keys).
3. Ability to act externally or modify state (tools like Bash, WebFetch, GitHub MCP, file writes, repository pushes).

If a workflow has two of these three capabilities, the risk is manageable. If it has all three simultaneously, the trust boundary has collapsed and the agent can be weaponised.
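
The rule reduces to a count over three flags, which makes it easy to encode as a review aid. This is a minimal sketch; the AgentWorkflow type and its field names are hypothetical, and deciding whether a given workflow really has each capability still requires human judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentWorkflow:
    """Hypothetical description of one AI-assisted workflow."""
    name: str
    untrusted_input: bool   # reads issues, PRs or comments from external users
    secret_access: bool     # holds API keys, cloud credentials or deploy tokens
    external_actions: bool  # can run shell commands, fetch URLs, push or comment


def capability_count(wf: AgentWorkflow) -> int:
    """Count how many of the three capabilities the workflow holds."""
    return sum((wf.untrusted_input, wf.secret_access, wf.external_actions))


def violates_rule_of_two(wf: AgentWorkflow) -> bool:
    """A workflow breaks the rule only when it holds all three capabilities."""
    return capability_count(wf) == 3
```

An issue-triage workflow that reads external issues and can post comments, but holds no production secrets, counts two capabilities and passes; adding cloud credentials to the same job would make it a violation.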

Immediate hardening checklist for teams with AI agents in GitHub Actions:

Update Claude Code GitHub Action to version 1.0.94 or higher (the fix for the vulnerability RyotaK reported) and use Claude Code 2.1.128 or higher (the fix for the vulnerability Microsoft reported). Verify the action version referenced in the workflow files under .github/workflows/ in the repository.
Apply the Agents Rule of Two: if the workflow processes issues or PRs from external users, remove access to production secrets or disable tools that allow external actions.
Least-privilege principle on every token: each API key and workflow credential must be scoped to the minimum permissions that specific workflow needs. An agent classifying issues does not need cloud credentials.
Separate triage workflows from privileged-access workflows: use separate repositories or separate jobs for the part processing user content and the part with production access.
Monitor anomalous agent behaviour: configure alerts if the agent attempts to access files outside the working directory, makes web calls to unexpected domains, or generates unusually long outputs in triage workflows.
Audit all GitHub Actions workflows using AI agents — not just Claude Code, but also GitHub Copilot, Cursor, Devin and any other — against the Agents Rule of Two.
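
A first pass over the audit item in this checklist can be automated with coarse text heuristics. The sketch below makes several assumptions: it works on the raw text of a workflow file rather than parsing YAML, the function name audit_workflow_text is hypothetical, and a plain word match on event names can produce false positives (for example, an "issues" entry under permissions). Its output is a prompt for human review, not a verdict.

```python
import re

# GitHub Actions events whose payloads can contain text written by outside users.
UNTRUSTED_EVENTS = ("issues", "issue_comment", "pull_request", "pull_request_target")

# Matches expressions such as ${{ secrets.NAME }} and captures NAME.
SECRET_REF = re.compile(r"\$\{\{\s*secrets\.(\w+)\s*\}\}")


def audit_workflow_text(text: str) -> dict:
    """Flag workflows that mention both an untrusted event and a secret."""
    events = [e for e in UNTRUSTED_EVENTS if re.search(rf"\b{e}\b", text)]
    secrets = sorted(set(SECRET_REF.findall(text)))
    return {
        "untrusted_events": events,
        "secrets": secrets,
        "needs_review": bool(events) and bool(secrets),
    }
```

Run over each file in .github/workflows/, this would surface the workflows that combine user-authored triggers with secret references, which are the ones to check by hand against the Agents Rule of Two.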

Cybersecurity as a strategic priority

The Claude Code GitHub Action incident arrives at a moment when AI agent adoption in development pipelines is accelerating rapidly. Engineering teams are integrating AI agents to review PRs, classify bugs, generate documentation and automate maintenance tasks, and in many cases they are doing so in the same workflows that have production access. Microsoft's research does not say AI agents should not be used in CI/CD: it says their integration requires the same security controls we would apply to any other privileged-access system. The pattern is the same as with VS Code extensions (TeamPCP/Nx Console), compromised npm packages (TanStack) and developer tool vulnerabilities documented this month: the speed of developer tool adoption systematically outpaces the speed of security auditing.

Apolo Cybersecurity: AI agent security and CI/CD pipeline hardening

At Apolo Cybersecurity we help development teams audit the security of their GitHub Actions workflows with AI agents: workflow assessment against the Agents Rule of Two, token and CI/CD access permission review, prompt injection vector detection in workflows processing user content, runner environment hardening, and version review of Claude Code GitHub Action, GitHub Copilot and other AI agents integrated into production pipelines.

If your team uses Claude Code, Copilot or any other AI agent in your GitHub Actions and those workflows have access to production secrets, this Monday morning is the time to review whether the Agents Rule of Two holds in your architecture.
