Heimgard | Security Series

    The Agent Attack Playbook (2025)

    6 Common Attack Paths, and How to Defend Against Them

    Oct 5 2025 | 5 min read

    TL;DR

    • Model Context Protocol (MCP) is fast becoming the “tool bus” for agentic systems, and attackers are adapting just as fast.
    • We summarize real incidents, the six most common attack patterns, and the concrete controls that actually work.
    • A vetted MCP registry, strong authentication and input validation, and continuous monitoring are the core defenses.

    The Urgent Need for Agent Security

    In June 2025, Asana discovered a flaw in its MCP server that potentially exposed data from roughly 1,000 enterprise customers. The bug could let MCP users in one organization view files, notes, and metadata belonging to another, and nothing in the normal workflow flagged the exposure.

    The fallout from data breaches is significant. IBM's Cost of a Data Breach Report 2025 puts the global average cost of a breach at USD 4.4M, with the average breach taking 241 days to identify and contain.

    Meanwhile, adoption of agentic AI is outpacing the security controls around it. In the same IBM report, 97% of organizations that experienced an AI‑related breach lacked proper AI access controls, and 63% of breached organizations had no AI governance policy or were still developing one.


    What Is MCP?

    Model Context Protocol (MCP) is an open standard that lets AI agents connect to external tools and data sources. Think of it as a USB port between agents and the applications they use.

    Clients like IDEs or desktop assistants can link to one or more servers, where each tool is defined in a manifest and called with structured inputs and outputs.

    It’s a flexible, plug-and-play system, but every tool you connect becomes part of your trust boundary.
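
    To make that shape concrete, here is a minimal sketch of a tool definition and the request that calls it. The `name`, `description`, and `inputSchema` fields and the `tools/call` method follow the MCP specification; the `read_note` tool and its `note_id` argument are hypothetical and exist only for illustration.

```python
import json

# Hypothetical tool definition, as a server might advertise it.
# Field names follow the MCP spec; the tool itself is made up.
READ_NOTE_TOOL = {
    "name": "read_note",
    "description": "Return the text of a note by ID.",
    "inputSchema": {
        "type": "object",
        "properties": {"note_id": {"type": "string"}},
        "required": ["note_id"],
    },
}


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that invokes one tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })
```

    Note that the description and schema are text the agent reads and acts on, which is why several of the attacks below target them.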


    6 Common Attack Methods via MCP

    Below are patterns we see repeatedly across incidents, advisories, and research.

    1. Data exfiltration

    A tool steals sensitive data (emails, tokens, files) by hiding exfiltration in legitimate responses or background calls.

    Case in point:

    An unofficial postmark-mcp npm package silently blind‑copied users’ emails to an attacker-controlled address for weeks before being discovered.
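
    One partial countermeasure is to scan outbound payloads for credential-shaped strings before they leave your network. The sketch below is illustrative only: the three patterns cover a few well-known formats, the `find_secrets` helper is hypothetical, and a pattern list like this would not have caught the blind-copy trick above by itself.

```python
import re

# Illustrative patterns for a few well-known credential formats.
# A real deployment would use a maintained ruleset, not this short list.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(payload: str) -> list[str]:
    """Return the names of secret types found in an outbound payload."""
    return [name for name, pattern in SECRET_PATTERNS.items()
            if pattern.search(payload)]
```

    A gateway could block or quarantine any tool response or outbound request for which `find_secrets` returns a non-empty list.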

    2. Privilege escalation

    Over‑broad permissions or chained tools allow attackers to step outside intended scope (e.g., read/write beyond an approved workspace).

    Related research:

    Researchers at Oligo Security found a critical flaw in Anthropic’s MCP Inspector: its local proxy accepted unauthenticated requests, so a malicious web page could run arbitrary commands on a developer’s machine and fully compromise the host.
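
    A simple way to limit over-broad permissions is to check every call against the scopes a tool was explicitly granted. This is a minimal sketch, assuming a hypothetical policy table (`GRANTED_SCOPES`) and scope names such as `notes:read`; none of these identifiers come from MCP itself.

```python
# Hypothetical policy: the scopes each tool was approved for at review time.
GRANTED_SCOPES = {
    "notes_reader": {"notes:read"},
    "notes_editor": {"notes:read", "notes:write"},
}


class ScopeError(PermissionError):
    """Raised when a tool call needs a scope the tool was never granted."""


def authorize(tool: str, required: set[str]) -> None:
    """Allow the call only if every required scope was granted to the tool."""
    granted = GRANTED_SCOPES.get(tool, set())  # unknown tools get no scopes
    missing = required - granted
    if missing:
        raise ScopeError(f"{tool} lacks scopes: {sorted(missing)}")
```

    Calling `authorize("notes_reader", {"notes:write"})` raises `ScopeError`, so a read-only tool cannot be chained into a write.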

    3. Tool poisoning

    Malicious prompts, metadata, or schema definitions can manipulate an agent into performing unintended actions.

    Case in point:

    A poisoned GitHub README hijacked the Cursor coding assistant, tricking it into running blocked commands and exfiltrating API keys.
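
    Text that an agent will read (tool descriptions, READMEs, schema comments) can be screened for common injection markers before it reaches the model. The sketch below is a heuristic under stated assumptions: the patterns and the `flag_suspicious_text` helper are hypothetical, and a determined attacker can evade a fixed list, so treat a match as a reason for human review and not as proof either way.

```python
import re

# Illustrative patterns only; real attacks vary and will evade simple lists.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"do not (tell|mention|inform).{0,40}user", re.I),
    re.compile(r"<!--.*?-->", re.S),            # hidden HTML comments
    re.compile(r"[\u200b\u200c\u200d\u2060]"),  # zero-width characters
]


def flag_suspicious_text(text: str) -> list[str]:
    """Return the patterns that matched, so a human can review the text."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]
```

    An empty result means only that none of these specific markers appeared.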

    4. Command & SQL injection

    Crafted inputs inject OS commands or DB queries through MCP tool calls.

    Case in point:

    A Datadog Security Labs case study documented SQL injection in Anthropic’s PostgreSQL MCP server: crafted input could break out of the server’s read-only transaction and run arbitrary statements, including writes.
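
    For tools that build queries from arguments, bound parameters keep input as data. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the `users` table and the `find_user` helper are hypothetical, and this is not the code of the PostgreSQL server discussed above.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, email: str) -> list[tuple]:
    """Look up a user with a bound parameter, never string formatting."""
    # The driver passes `email` as data, so quotes and semicolons in it
    # cannot change the structure of the statement.
    cur = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cur.fetchall()


def demo() -> tuple[list[tuple], list[tuple]]:
    """Run one normal lookup and one classic injection attempt."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    normal = find_user(conn, "a@example.com")
    attack = find_user(conn, "x' OR '1'='1")  # treated as a literal string
    conn.close()
    return normal, attack
```

    Parameter binding does not help when a tool accepts free-form SQL by design. In that case a database role with read-only privileges is likely the more dependable control, since it does not rely on parsing the query.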

    5. Rug pulls

    A tool looks safe when you install it, then its code/manifest is updated and the behavior changes without user re‑consent.

    Evidence:

    Researchers showed MCP manifests can be updated after install, turning a once-benign tool into a data-exfiltration vector without alerting users.
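
    One way to catch a silent update is to pin a digest of the manifest at approval time and compare it on every load. This is a minimal sketch, assuming the manifest is available as a JSON-compatible dict; the helper names and the canonical-JSON encoding are choices made for this example and are not part of MCP.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """Hash a canonical JSON encoding so key order does not matter."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def manifest_changed(manifest: dict, pinned_digest: str) -> bool:
    """True when the manifest no longer matches the approved digest."""
    return manifest_digest(manifest) != pinned_digest
```

    When `manifest_changed` returns True, the client can suspend the tool and ask the user to review the difference before any further calls.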

    6. Path traversal

    Poor path validation lets attackers escape a workspace and read sensitive system files (e.g., ../../etc/passwd).

    Case in point:

    Researchers discovered a directory traversal flaw in Anthropic’s Filesystem MCP Server (@modelcontextprotocol/server-filesystem): its path-prefix check was too naive, so crafted paths could pass validation and escape the intended workspace.
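
    A more robust check resolves the path first and then compares whole path components instead of string prefixes. The sketch below uses Python's standard `pathlib`; the `resolve_in_workspace` helper is hypothetical and is not the fix shipped for the server above.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, user_path: str) -> Path:
    """Resolve user_path under workspace, or raise ValueError if it escapes."""
    root = Path(workspace).resolve()
    # resolve() collapses ".." and follows symlinks before the check runs.
    candidate = (root / user_path).resolve()
    # is_relative_to compares whole components, so "/srv/ws-evil" does not
    # count as inside "/srv/ws" the way a str.startswith() check would.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes workspace: {user_path}")
    return candidate
```

    Absolute inputs such as `/etc/passwd` are rejected too, because joining an absolute path replaces the workspace root before the containment check. A file could still be swapped for a symlink between the check and the read, so opening the resolved path promptly (or sandboxing the process) remains advisable.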


    The Defender’s Playbook

    Here are practical steps you can take immediately to mitigate risk.

    1. Before you connect — secure the supply chain

    Start clean. Only use tools and servers from trusted sources.

    • Install from vetted registries and verify signatures.
    • Pin exact versions and review permissions before approval.
    • Apply least-privilege access — no tool should touch more than it needs.
    • Rotate credentials often and keep tokens scoped to a single tool.
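
    Version pinning can be checked mechanically. The sketch below flags dependency specs that are ranges or tags (npm-style `^1.0.0` or `latest`), which allow a newer release to be installed without review. The package names are hypothetical, and the regular expression accepts only a plain `MAJOR.MINOR.PATCH` version with an optional pre-release suffix.

```python
import re

# Matches only a full version such as "1.0.15" or "2.3.0-beta.1".
EXACT_VERSION = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")


def unpinned_dependencies(deps: dict[str, str]) -> list[str]:
    """Return names of packages whose version spec is a range or a tag."""
    return sorted(name for name, spec in deps.items()
                  if not EXACT_VERSION.match(spec))
```

    Running this in CI against the dependency list gives a quick gate. A lockfile with integrity hashes is a stronger guarantee, since an exact version number alone does not prove the package contents are unchanged.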

    2. At runtime — contain and control

    Assume every agent connection is a potential breach path.

    • Enforce strong authentication and isolate each tool in its own sandbox.
    • Validate inputs and outputs strictly — reject on error, don’t warn.
    • Restrict network egress and file system access to known safe targets.
    • Require human approval for destructive or external actions.
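
    Egress restriction can start as an exact-match host allow-list applied to every outbound URL. This is a minimal sketch using Python's standard `urllib.parse`; the hosts in `ALLOWED_HOSTS` are placeholders, and a production control would normally sit at the network layer as well, not only in application code.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; a real one would come from reviewed policy.
ALLOWED_HOSTS = {"api.example.com", "files.example.com"}


def egress_allowed(url: str) -> bool:
    """Permit only HTTPS requests to an exactly matching allow-listed host."""
    parts = urlsplit(url)
    # hostname is lowercased and excludes userinfo and port, so
    # "https://api.example.com@evil.example/" yields "evil.example".
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

    Exact matching is deliberate: suffix or substring checks would accept look-alike hosts such as `api.example.com.evil.example`.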

    3. Detect and respond — watch for drift

    Visibility is half the battle.

    • Log every tool call, parameter, and outbound request.
    • Monitor for behavioral changes — new domains, new scopes, or altered manifests.
    • Run regular red-team tests for prompt injection, SQLi, and sandbox escapes.
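
    Logging and drift detection can share one code path. The sketch below records each tool call as a JSON line and reports when a tool contacts a host outside its baseline. The `ToolAuditLog` class, its field names, and the in-memory storage are all assumptions made for this example; a real system would ship records to a log pipeline and redact sensitive parameters first.

```python
import json
import time
from typing import Optional
from urllib.parse import urlsplit


class ToolAuditLog:
    """Record tool calls and flag outbound hosts not seen before per tool."""

    def __init__(self, baseline: Optional[dict[str, set[str]]] = None) -> None:
        self.records: list[str] = []
        # Copy the baseline so later updates do not mutate the caller's sets.
        self.known_hosts = {t: set(h) for t, h in (baseline or {}).items()}

    def record_call(self, tool: str, params: dict, outbound_url: str) -> bool:
        """Log one call; return True if the tool contacted a new host."""
        host = urlsplit(outbound_url).hostname or ""
        seen = self.known_hosts.setdefault(tool, set())
        is_new = host not in seen
        seen.add(host)
        self.records.append(json.dumps({
            "ts": time.time(), "tool": tool, "params": params,
            "host": host, "new_host": is_new,
        }))
        return is_new
```

    A True return value is the drift signal: it marks the first time a given tool reaches a given host, which is the kind of change worth alerting on.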

    Heimgard’s Approach to Agent Security

    Heimgard focuses on securing MCP servers and curating a trusted registry so teams can ship agentic features without inheriting hidden risks.

    • Heimgard Registry: signature verification, digest pinning, publisher reputation, and automated manifest diffs to stop rug‑pulls before they land.
    • MCP Gateway: mutual‑TLS, identity‑aware policy (per‑tool scopes), output DLP/secret scanning, and egress allow‑lists—so exfiltration attempts are blocked in real time.
    • Observability: per‑tool telemetry with drift detection (new domains, new scopes, unusual file paths), plus ready‑to‑use SIEM dashboards.

    References & Further Reading

    1. UpGuard. Asana Discloses Data Exposure Bug in MCP Server, 2025
    2. IBM. Cost of a Data Breach Report 2025
    3. HiddenLayer. How Hidden Prompt Injections Can Hijack AI Code Assistants Like Cursor, 2025
    4. Oligo Security. Critical RCE Vulnerability in Anthropic MCP Inspector, 2025
    5. BleepingComputer. Unofficial Postmark MCP Package Stole Emails, 2025
    6. Datadog Security Labs. MCP Vulnerability Case Study: SQL Injection in the Postgres MCP Server, 2025
    7. Simon Willison. Model Context Protocol has Prompt Injection Security Problems, 2025
    8. Snyk. Directory Traversal, 2025