A developer clones a repo on Monday morning and opens it with an AI coding agent.
Nothing dramatic happens. No suspicious installer flashes by. The repo looks like a normal JavaScript project with the usual clutter: .vscode/, .cursor/, package.json, a GitHub helper, maybe a few model-specific settings files.
Then the agent starts its work.
It reads project rules. It loads editor instructions. It sees task definitions. It may start a local MCP server. It may run npm test because that is how the project tells contributors to validate a change.
The dangerous part is not the clone. The dangerous part is the moment the repo becomes the agent's workspace.
SafeDep put the lesson plainly in its Miasma analysis: "Cloning the repo is safe. Opening it is not."
For teams adopting coding agents, that sentence should change the review habit. Repository contents are no longer just source code and build files. They are part prompt, part runtime, and part policy surface.
What Miasma changed about the repo threat model
On June 3, 2026, SafeDep reported that Miasma hit npm packages and also used a parallel GitHub source-repo route. The source route matters because the attacker pushed commits directly to GitHub repositories instead of relying only on package publishing.
In the commit SafeDep analyzed, the attacker added or modified files across several developer surfaces:
- .claude/settings.json
- .gemini/settings.json
- .cursor/rules/setup.mdc
- .vscode/tasks.json
- .github/setup.js
- package.json, including a test-script change
SafeDep says the payload was wired to run through five surfaces: Claude Code, Gemini CLI, Cursor, VS Code, and npm test.
That is a different shape from the classic "malicious dependency runs during install" story. It targets the workbench around the code. Files that look like editor preferences, agent instructions, setup helpers, and validation scripts can become execution paths.
SafeDep also found the same pattern across more than 120 repositories and described stolen-token propagation patterns. That makes this more than a clever repo trick. It is a supply-chain pattern aimed at the places developers and agents trust by default.
For small teams, the operational lesson is simple: if an agent can read a file as instruction, run it as setup, or treat it as project policy, then changes to that file deserve review.
Why agent and editor config files belong in code review
Most teams already review application code. Fewer teams review .cursor/rules/setup.mdc with the same suspicion they bring to a production API handler.
That habit made sense before coding agents. Editor settings were often local convenience. Project rules were onboarding material. Task files helped developers run builds faster.
Agents blur those categories.
A .claude/settings.json file can shape how Claude Code behaves inside the repo. A .gemini/settings.json file can do the same for Gemini CLI. A .cursor/rules/setup.mdc file can tell Cursor how to interpret the project. A .vscode/tasks.json file can define commands that look like standard project tasks. A package.json script can turn "run tests" into "run anything."
Those files may be plain text, but they are not harmless.
This is why repository config security belongs in ordinary pull-request review. A reviewer should ask:
- Did this change add new agent instructions?
- Did it change what the agent is allowed to run?
- Did it add a local startup command?
- Did it change test, lint, prepare, postinstall, or setup scripts?
- Did it introduce an MCP server or modify how one starts?
- Did it request access to files, shell commands, network calls, or credentials?
The review does not need malware expertise. It needs a new label: agent-facing config.
When a repo tells the agent "always run this task before editing" or "use this local MCP server," that instruction should be treated closer to code than documentation.
This also explains why prompt files alone are a weak control plane. We wrote about that problem in "System prompts are not an agent control plane": instructions help, but they do not replace runtime boundaries, approvals, and evidence.

What serious agent security looks like now
The more useful direction in AI coding agent security is not "write a stronger prompt." It is closer to normal engineering discipline: threat models, verification, sandboxed execution, approval gates, and logs.
Anthropic's Defending Code Reference Harness is a good signal here. It is an open-source reference implementation for autonomous vulnerability discovery and remediation with Claude. It includes Claude Code skills for /quickstart, /threat-model, /vuln-scan, /triage, /patch, and /customize.
The pipeline is explicit: recon, find, verify, report, patch.
The safety boundary is explicit too. Anthropic warns that the autonomous reference pipeline executes target code and refuses to run outside a gVisor sandbox unless explicitly overridden.
That is the posture teams should copy. If the agent might run target code, isolate it. If the agent might change files, review the diff. If the agent might touch credentials, reduce what it can reach. If the agent might call tools, log the calls.
MCP security raises the same issue from another angle.
The official MCP security best practices call out confused deputy problems, token passthrough, SSRF, session hijacking, and local MCP server compromise. The local MCP server section is blunt: downloaded or executed local MCP servers can run with the user's privileges and create risks including arbitrary code execution, data exfiltration, data loss, and low-visibility command execution.
That matters because many agent workflows make MCP feel like plumbing. A repo says "start this local server," the client connects, and suddenly the agent has a new tool surface.
The MCP guidance says clients should show exact commands before execution, require explicit consent, highlight dangerous command patterns, sandbox where possible, and restrict file and network access. The token passthrough guidance points to another operational problem: if servers cannot distinguish which client is acting, accountability gets muddy.
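As a rough illustration of "show the exact command, highlight dangerous patterns, require explicit consent," here is a minimal Python sketch. The function names, the shape of the launch description, and the two regex patterns are assumptions made for this example; they are not taken from the MCP specification or from any particular client.

```python
import re
import shlex

# Illustrative patterns only; a real client would maintain a broader, tested list.
DANGEROUS_PATTERNS = {
    "pipes a download into a shell": re.compile(r"\b(curl|wget)\b[^|]*\|\s*(ba|z)?sh\b"),
    "runs with sudo": re.compile(r"\bsudo\b"),
}


def describe_launch(name: str, command: str, args: list[str]) -> dict:
    """Build what a human should see before a local MCP server starts."""
    exact = shlex.join([command, *args])  # the exact command line, shell-quoted
    warnings = [label for label, rx in DANGEROUS_PATTERNS.items() if rx.search(exact)]
    return {"server": name, "exact_command": exact, "warnings": warnings}


def consent_given(answer: str) -> bool:
    """Default deny: only an explicit 'yes' counts as consent."""
    return answer.strip().lower() == "yes"
```

The design choice worth copying is the default: anything other than an explicit "yes" leaves the server unstarted.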
This is where the market is moving. Archestra's seed announcement says multiple Fortune 500 companies are running its agent infrastructure in production, connecting agents to corporate data, and hitting security and compliance walls around scale. The useful takeaway is not the vendor news. It is the category signal: agent security infrastructure is becoming part of production operations.
A practical pre-open checklist for teams
Teams do not need a full security program before using coding agents. They do need a pre-open habit.
Use this when an agent is about to work in an unfamiliar repo, a fork, a customer repo, an open-source dependency, or a project that recently accepted outside contributions.
Use it as a six-part repo-open scorecard before the agent opens the workspace:
| Scorecard area | What to inspect before opening the repo in an agent | Pass signal |
|---|---|---|
| Agent-facing files | .claude/, .gemini/, .cursor/, .vscode/tasks.json, project rules, and local MCP config | The team can name which files can instruct, configure, or launch agent behavior. |
| Risky config diffs | New shell commands, encoded payloads, lifecycle scripts, external fetches, and changed test commands | Tooling changes receive the same review attention as application code. |
| Sandbox boundary | Disposable filesystem, no host SSH keys, no cloud credentials, restricted file/network access | Unfamiliar code can run without touching the developer's normal environment. |
| Credential scope | Tokens, .env files, browser sessions, and deploy permissions reachable from the workspace | The agent receives only the credentials required for this task, if any. |
| Approval queue | Setup scripts, MCP startup, dependency installs, deploy/publish steps, and outbound data movement | Risky actions wait for an explicit human decision with command, reason, and expected result. |
| Receipt log | Files read/changed, commands run, tools called, services contacted, approvals requested, and blocked attempts | Operators can reconstruct what the agent touched after the job finishes. |
That is the concrete artifact behind the scorecard CTA: a shared way to decide whether a repo is safe enough for an AI coding agent to open, what needs human approval first, and what evidence the team should keep afterward.
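For teams that want the scorecard in a form a script can check, one possible encoding follows. The area identifiers mirror the six table rows, but the rule that every area must pass before the agent opens the repo is an assumption of this sketch; a team may reasonably choose a softer threshold.

```python
from dataclasses import dataclass

# One identifier per scorecard row, in table order.
AREAS = (
    "agent_facing_files",
    "risky_config_diffs",
    "sandbox_boundary",
    "credential_scope",
    "approval_queue",
    "receipt_log",
)


@dataclass(frozen=True)
class Scorecard:
    passed: frozenset  # names of the areas that showed their pass signal

    def __post_init__(self):
        unknown = set(self.passed) - set(AREAS)
        if unknown:
            raise ValueError(f"unknown scorecard areas: {sorted(unknown)}")

    def failing(self) -> list:
        """Areas without a pass signal, in table order."""
        return [area for area in AREAS if area not in self.passed]

    def decision(self) -> str:
        """Assumed policy: open only when all six areas pass."""
        failing = self.failing()
        return "open" if not failing else "hold: " + ", ".join(failing)
```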
1. Inspect agent-facing files before opening the repo in an agent
Before giving an agent the workspace, scan for files that can shape behavior:
- .claude/
- .gemini/
- .cursor/
- .vscode/tasks.json
- .github/ setup helpers
- package.json scripts
- Makefile
- justfile
- Taskfile.yml
- local MCP server config
- devcontainer and workspace bootstrap files
The question is not "does this file look like malware?" The question is "can this file instruct, launch, configure, or redirect the agent?"
SafeDep's Miasma guidance is useful here: treat unexpected .claude/, .gemini/, .cursor/, .vscode/, and script changes as supply-chain signals, not editor noise.
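A pre-open scan along these lines can be sketched in a few lines of Python. The directory and file names come from the list above; the matching rules, the skipped directories, and the .mcp.json filename (used here as a stand-in for "local MCP server config") are assumptions that should be adjusted to the tools a team actually uses.

```python
from pathlib import Path

# Names come from the list above; the matching rules are assumptions.
AGENT_DIRS = {".claude", ".gemini", ".cursor", ".devcontainer"}
# ".mcp.json" is an assumed filename for project-level MCP config.
AGENT_FILES = {"package.json", "Makefile", "justfile", "Taskfile.yml", ".mcp.json"}
SKIP_DIRS = {".git", "node_modules"}


def is_agent_facing(rel: Path) -> bool:
    """True if a repo-relative path can instruct, launch, or configure an agent."""
    parts = rel.parts
    if parts[0] in AGENT_DIRS:
        return True
    if rel.as_posix() == ".vscode/tasks.json":
        return True
    if parts[0] == ".github" and rel.name.startswith("setup"):
        return True
    return rel.name in AGENT_FILES


def find_agent_facing_files(repo_root: str) -> list:
    """Walk the clone and return agent-facing files as sorted relative paths."""
    root = Path(repo_root)
    hits = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and not SKIP_DIRS.intersection(rel.parts):
            if is_agent_facing(rel):
                hits.append(rel.as_posix())
    return sorted(hits)
```

The scan only reads file names, so it can run against a fresh clone before any editor or agent has opened the workspace.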
2. Diff config changes with the same attention as code
A pull request that changes .cursor/rules/setup.mdc or .vscode/tasks.json should not slide through as "developer tooling."
Look for new shell commands, encoded payloads, curl-to-shell patterns, external fetches, credential reads, token exports, base64 decoding, eval-style execution, and changed lifecycle scripts.
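A first-pass filter for those signals might look like the following sketch, which checks only the lines a unified diff adds. The regular expressions are illustrative assumptions, not a vetted detection ruleset; they will miss obfuscated payloads and will sometimes flag harmless lines, so treat a hit as a prompt for human review rather than a verdict.

```python
import re

# Illustrative patterns; assumptions for this sketch, not a complete ruleset.
RISKY_PATTERNS = {
    "curl-to-shell": re.compile(r"\b(curl|wget)\b[^|\n]*\|\s*(ba|z)?sh\b"),
    "base64 decoding": re.compile(r"base64\s+(-d|--decode)\b|\batob\("),
    "eval-style execution": re.compile(r"\beval\s*\(|new Function\s*\("),
    "credential read": re.compile(r"\.ssh/|\.aws/credentials|\.npmrc|(GITHUB|NPM)_TOKEN"),
}


def flag_added_lines(unified_diff: str) -> list:
    """Return (label, line) pairs for added lines that match a risky pattern."""
    findings = []
    for line in unified_diff.splitlines():
        # Keep added lines, skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(added):
                findings.append((label, added.strip()))
    return findings
```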
Pay special attention to npm test. Developers and agents run tests reflexively. That makes test scripts attractive execution points.
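Because a changed test script rarely matches an obviously malicious pattern, it can help to surface script changes directly. The sketch below compares the scripts section of two package.json versions; the function names and the watched-script list are assumptions chosen for illustration.

```python
import json

# Scripts that developers and agents tend to run without thinking (assumed list).
WATCHED = ("test", "lint", "prepare", "postinstall", "preinstall", "install")


def changed_scripts(old_package_json: str, new_package_json: str) -> dict:
    """Map each added, removed, or edited script name to its (old, new) command."""
    old = json.loads(old_package_json).get("scripts", {})
    new = json.loads(new_package_json).get("scripts", {})
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


def watched_changes(changes: dict) -> list:
    """Names of changed scripts that run reflexively and deserve a closer look."""
    return [name for name in changes if name in WATCHED]
```

For example, a change from "jest" to "node .github/setup.js && jest" shows up as a single entry under "test", which is exactly the kind of edit that is easy to miss in a large diff.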
3. Run unfamiliar repos in a sandbox
If the agent may execute project code, run it in a disposable environment.
A good baseline:
- no host SSH keys
- no cloud credentials
- no production .env
- no persistent browser sessions
- no write access outside the workspace
- restricted network access when possible
- disposable filesystem after the job
Anthropic's gVisor boundary in the Defending Code reference harness is the right mental model. Target code is untrusted until proven otherwise.
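One way to approximate that baseline with a container is sketched below. It only builds a docker run argument list and does not execute anything. The helper name, the /workspace mount point, and the example image are assumptions; the Docker flags are standard ones, and passing runtime="runsc" assumes gVisor is installed and registered as a Docker runtime on the host.

```python
from typing import Optional


def sandboxed_command(
    workspace_copy: str,
    image: str,
    command: list,
    allow_network: bool = False,
    runtime: Optional[str] = None,
) -> list:
    """Build a `docker run` argv for running untrusted project code.

    workspace_copy should be a throwaway copy of the repo: it is the only
    host path mounted, so host SSH keys and dotfiles are not visible inside.
    """
    argv = [
        "docker", "run", "--rm",                # container is deleted after the job
        "--read-only",                          # root filesystem is not writable
        "--tmpfs", "/tmp",                      # scratch space that vanishes on exit
        "--cap-drop", "ALL",                    # drop Linux capabilities
        "--security-opt", "no-new-privileges",
        "--volume", f"{workspace_copy}:/workspace",
        "--workdir", "/workspace",
    ]
    if not allow_network:
        argv += ["--network", "none"]           # no outbound network by default
    if runtime:
        argv += ["--runtime", runtime]          # e.g. "runsc" for gVisor, if installed
    return argv + [image, *command]
```

Host environment variables are not forwarded into the container unless they are passed explicitly, which keeps cloud credentials out by default.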
4. Limit credentials by task
Do not give the agent your normal developer environment and hope it behaves.
If the job is a code review, it probably does not need write tokens. If the job is a test run, it probably does not need production API keys. If the job is a docs edit, it probably does not need cloud deploy access.
This is a data security problem as much as a tooling problem. Ask what private data or credential this workflow could reach if the prompt, repo, or tool layer turned hostile.
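Scoping by task can start with something as small as an environment allowlist. In this sketch the task names, the variable names, and the choice of PATH and LANG as a base set are all hypothetical; the point is that the agent process starts from an empty environment and receives only what the task is known to need.

```python
# Hypothetical task names and variable names, for illustration only.
BASE_ENV = {"PATH", "LANG"}
TASK_ENV = {
    "code_review": set(),                      # read-only work: no tokens
    "test_run": {"CI"},                        # no production API keys
    "docs_edit": set(),                        # no deploy access
    "preview_deploy": {"PREVIEW_DEPLOY_TOKEN"},
}


def scoped_env(task: str, full_env: dict) -> dict:
    """Return only the variables the task is allowed to see.

    Unknown tasks fall back to the base set, so a typo in the task name
    cannot widen access.
    """
    allowed = BASE_ENV | TASK_ENV.get(task, set())
    return {name: value for name, value in full_env.items() if name in allowed}
```

The result can be passed as the env argument to subprocess.run when launching the agent or its tools, instead of inheriting the developer's full shell environment.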
5. Put risky actions behind an approval queue
Agents should not self-approve actions that can publish, deploy, exfiltrate, delete, or grant access.
Use an approval queue for actions such as:
- running repo-provided setup scripts
- starting local MCP servers
- installing dependencies from unfamiliar sources
- changing CI, secrets, permissions, or deploy config
- pushing commits or opening pull requests
- sending customer data to external tools
- running commands that touch files outside the repo
We have written about this pattern in "Build the approval queue before you build the agent," and the implementation details matter. The reviewer needs to see the command, the files involved, the reason the agent requested it, and the expected result.
For a more concrete starting point, see BaristaLabs' AI approval queue guide.
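To make the reviewer's view concrete, here is a minimal in-memory sketch of an approval queue. The class and field names are assumptions made for this example and do not describe any particular product; a real queue would persist requests and authenticate reviewers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class ApprovalRequest:
    command: str            # exactly what the agent wants to run
    files: list             # files the action would touch
    reason: str             # why the agent asked
    expected_result: str    # what should happen if approved
    status: Status = Status.PENDING
    reviewer: Optional[str] = None


class ApprovalQueue:
    def __init__(self) -> None:
        self._requests = []

    def submit(self, request: ApprovalRequest) -> int:
        """Queue a request and return its id; nothing runs yet."""
        self._requests.append(request)
        return len(self._requests) - 1

    def decide(self, request_id: int, approved: bool, reviewer: str) -> None:
        """Record a human decision; each request can be decided only once."""
        request = self._requests[request_id]
        if not reviewer:
            raise ValueError("a named human reviewer is required")
        if request.status is not Status.PENDING:
            raise ValueError("request already decided")
        request.status = Status.APPROVED if approved else Status.DENIED
        request.reviewer = reviewer

    def may_run(self, request_id: int) -> bool:
        """The agent checks this before executing; pending counts as no."""
        return self._requests[request_id].status is Status.APPROVED
```

The agent never calls decide itself. It submits, waits, and runs the action only once may_run returns True.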
6. Keep a receipt log
After the agent finishes, the team should be able to answer:
- What files did it read?
- What files did it change?
- What commands did it run?
- What tools did it call?
- What external services did it contact?
- What did it ask a human to approve?
- What did it attempt but get blocked from doing?
This is not bureaucracy. It is how operators debug agent behavior after something goes wrong.
We call this an agent receipt log. The point is evidence: a readable record of what the agent touched, changed, ran, or attempted. We covered the operating model in "Agent receipts: what to log before AI touches customer work."
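A receipt log does not need special infrastructure to start. The sketch below appends one JSON object per line to any writable text stream. The event kinds map to the questions above, but their names and the record layout are assumptions of this example.

```python
import json
import time
from typing import Iterable, TextIO

# One kind per question in the list above (names are assumptions).
EVENT_KINDS = {
    "file_read", "file_write", "command", "tool_call",
    "network", "approval_request", "blocked",
}


def record(log: TextIO, kind: str, detail: dict) -> None:
    """Append one receipt as a JSON line."""
    if kind not in EVENT_KINDS:
        raise ValueError(f"unknown event kind: {kind}")
    entry = {"ts": time.time(), "kind": kind, "detail": detail}
    log.write(json.dumps(entry, sort_keys=True) + "\n")


def summarize(lines: Iterable) -> dict:
    """Count receipts per kind, for a quick after-the-job review."""
    counts = {}
    for line in lines:
        kind = json.loads(line)["kind"]
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Writing the log outside the agent's workspace, to a location the agent cannot modify, keeps the record trustworthy if the job goes wrong.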
Where BaristaLabs would draw the first boundary
If a team came to BaristaLabs after Miasma and asked where to start, we would not begin with a giant policy document.
We would draw the first boundary around repo-open behavior.
That means:
- Detect agent-facing config before the agent starts.
- Show the human which files can shape the agent.
- Block automatic execution of repo-provided tasks until approved.
- Run unfamiliar code in a sandbox.
- Keep credentials out of the workspace unless the task needs them.
- Log the agent's reads, writes, commands, tool calls, and approval requests.
That boundary is small enough for a real team to adopt. It also catches the class of problem Miasma exposed.
The more advanced version can come later: per-repo trust levels, policy-as-code for agent tools, MCP allowlists, network egress rules, signed internal config, and automated detection for suspicious editor or agent config changes.
But the first step is cultural. Stop treating .claude/, .gemini/, .cursor/, .vscode/, local MCP servers, and package.json scripts as harmless project noise.
For agent-assisted teams, the repo is already talking to the agent.
Review what it says before the agent listens.
If you are designing internal AI coding workflows and want help putting approval queues, sandboxing, credential boundaries, and receipt logs around them, BaristaLabs can help through AI consulting.
Agent repo security scorecard
Score the repo before the agent opens it.
Bring one AI coding workflow and use the six-part repo-open scorecard to map agent-facing files, risky startup paths, sandbox boundaries, credential scope, approval gates, and receipt logging before the agent gets a real workspace.
Built for AI coding agents, MCP servers, internal repos, and unfamiliar dependency work. No credentials or private source code needed.
