Coding Agent Security: Admin Rights, Prompt Injection, MCP

Here is an uncomfortable truth about modern software development. The most privileged identity in your engineering organization is no longer a person. It is an AI coding agent that reads your source code, runs commands in your terminal, holds your cloud credentials, and connects to whatever tools a developer wires up on a Tuesday afternoon.

Claude Code, Cursor, GitHub Copilot, OpenAI Codex, and Devin have quietly become production infrastructure. Most security programs are blind to their capability to run amok.

Banning the tools slows innovation. The productivity gains are real. Organizations must secure them the way we secure any other powerful, autonomous, internet-connected system: with a healthy dose of distrust.

The new attack surface is the coding agent, not just the app

Traditional application security asks whether the code you ship is safe. Coding agent security asks a different and newer question:

“Is the agent writing that code being manipulated while it works?”

Those are not the same problem, and the second one has major blind spots.

These exploits are not hypothetical. Security researchers have already documented and catalogued cases where a single piece of injected text, arriving through a connected tool or a repository file, rewrites an agent’s configuration and executes attacker-controlled commands on a developer’s machine. In controlled testing, getting these agents to run malicious instructions succeeds far more often than anyone will admit.

The pattern is consistent: feed the agent poisoned input, and its considerable privileges silently become the attacker’s windfall. Stolen credentials, supply chain injections, dead-man switches. All real, all a threat to your business.

The mechanism behind most of these is indirect prompt injection. A coding agent reads a GitHub readme file, an email or poisoned RAG document, a Teams or Slack message, and buried in that content are instructions the agent dutifully follows. It cannot reliably tell the difference between approved actions and novel attack patterns.

MCP is the connective tissue, and the soft underbelly

The Model Context Protocol, the open standard that lets agents plug into external tools and data, is what makes these agents genuinely useful. It is also where a lot of the risk lives. These connections get set up by developers, not security teams, which means most organizations have not built the practices to inventory, approve, and constantly retest for compromise. Supply chain attacks delivering credential-stealing malware through cloned connectors have already happened.

You cannot apply least privilege to a tool you did not know existed. Discovery and scanning of MCP servers, their privileges, and the systems they connect to is not a nice-to-have. It is a critical requirement for organizations embracing AI-assisted coding regardless of scale.

Zero Trust for AI Agents is the answer

Anthropic’s recent eBook, Zero Trust for AI Agents, makes the case better than any vendor pitch.

Its three principles will sound familiar to anyone who has done network zero trust: never trust and always verify, assume breach has already occurred, and enforce least privilege.

The paper adds a sharp new wrinkle it calls least agency: where least privilege limits what an agent can access, least agency limits what each agent tool can actually do, how often, and where.

Where AI runtime guardrails come in

AI runtime guardrails sit at the coding agent’s boundaries, integrated via hooks or a proxy, and inspect every tool call the agent makes before it executes. The controls that matter run two detection engines in parallel on each call. A rule-based engine matches commands, arguments, and file references against curated patterns to catch known attacks fast and deterministically. An LLM-based engine adds semantic analysis to catch what patterns miss: novel attack techniques, obfuscated payloads, and context-dependent threats where the same command is benign in one session and malicious in another. Together they evaluate every prompt, response, and tool call for injection, credential and data exfiltration, and tool abuse. Just as important, they constrain the overly privileged actions that turn a single compromised call into a breach, blocking the offending tool call rather than killing the session.

Granular enforcement that blocks or rewrites the offending tool call rather than killing the whole session, so a single flagged action does not halt legitimate work.
Continuous red teaming that attacks your agents the way a real adversary would, in CI/CD before they reach production, mapped to frameworks like the OWASP Top 10 for Agentic Applications.
MCP discovery and scanning that inventories every agent and connected server in your environment and flags the poisoned, abandoned, or over-permissioned ones.

Together these controls add up to a zero-trust posture for coding agents. Verify every call continuously with both engines, you assume the agent can be tricked, and you constrain what each tool is allowed to do and how far a single call can reach.

Guardrails supplement good DevSecOps

AI runtime guardrails secure the coding agent and AI Agent Actions. They do not secure the shipped code the agent helps you build. You still need good DevSecOps procedures for your DevOps pipelines, unit tests, code scanning, etc.

Static Application Security Testing (SAST) analyzes your source code and binaries without running them, catching flaws like injection and insecure cryptography early in the development lifecycle. Dynamic Application Security Testing (DAST) tests the running application from the outside, finding the issues that only appear under execution.

Neither one can see what your coding agent is doing in the IDE, and runtime guardrails cannot prove your shipped code is free of SQL injection. Defense in depth is never optional. The coding agents just added a layer.

Let us pressure-test your coding agents together

In Balance IT Solutions is co-presenting a Summer Security Series with Straiker focused on securing AI agents in the enterprise, and coding agent security is front and center. If your developers are running Claude Code, Cursor, Copilot, Codex, or Devin, and your security team does not yet have an inventory of the MCP servers attached to them, that gap is worth a conversation.

Schedule time with In Balance IT Solutions to walk through your current coding agent security plans, and register for the Summer Security Series to see runtime guardrails, agent red teaming, and MCP scanning in action. Reach out to your In Balance account team or visit our site to claim a seat. Bring your hardest questions. We will bring the red team.

About This Series

This post is the third in the Adaptive Defense series. Each article addresses a specific domain where traditional frameworks fall short of today’s agentic AI threat landscape.

Post 1 — Why NIST, ISO 27001 & COBIT Can’t Keep Up With AI Threats

Post 2 — Agentic Adoption, the New Pattern for Cybersecurity

Post 3 — Non-Human Identity Security: An Attack Surface You Can’t See

Post 5 — Fighting Fire with Fire: The Case for an Agentic SOC