Scan the skill before the agent reads it

Don't ask an AI if you're audit-ready. Put it in a read-only room.

Your agent audit log needs a rehearsal, not a promise

Your SBOM stops before the agent starts

Article-specific next step

Bring one skill to the session

Bring the most-shared skill on your team. We will run it through the bench, separate the real findings from the false positives, and decide what gets to auto-load.

Pressure-test one skill

The fields are free to copy. The session is for teams who want a second read before a skill loads into every session.

Sensitive systems

Stalled infrastructure work can be scoped without exposing private details.

For an anonymized certification board, BaristaLabs completed an AKS upgrade in 1 week with zero downtime and restored a vendor-supported Kubernetes version path.

0
application downtime: 4x
more subnet IP capacity

Anonymized case study for regulated technical work.

Client and infrastructure details stay confidential.

Read case study

Share this post

Don't ask an AI if you're audit-ready. Put it in a read-only room.

Your agent audit log needs a rehearsal, not a promise

Your SBOM stops before the agent starts

Set up the bench with BaristaLabs

Keep Reading

Industry Insights

Scan the skill before the agent reads it

Sean McLellan

Lead Architect & Founder

June 22, 20269 min read

That is the moment worth slowing down. Not because the folder is malicious — it almost certainly is not. Because of what the folder becomes the instant your agent decides it is relevant.

Why a skill is not a README

That design is genuinely good. It is why skills are portable and easy to share. It is also why the trust question is sharper than it looks.

A scanner that treats skills as untrusted code

Prompt injection — a skill that tells the model to ignore prior guidelines, wipe its context, adopt a new persona, or treat fake [SYSTEM] tokens as real.
Exfiltration — reading secrets, SSH keys, or environment variables and shipping them out over curl, DNS, or a GitHub API call.
Command execution — eval, exec, shell=True, os.system, child_process, and the rest of the family that turns a "helper script" into arbitrary code.
Persistence — writing to crontab, appending to ~/.bashrc, installing a systemd unit: ways an attack survives past the session it arrived in.
Privilege escalation, supply-chain risk, filesystem abuse, obfuscation, and model-specific jailbreaks.

Why a quarantine bench can't just grep for bad words

If that were the whole story, you could scan a skill with a clever search-and-replace and call it done. It is not, and the reason is the most important part of the teardown.

The skill quarantine bench

Scroll sideways to see all 2 columns.

Field	What you are deciding
Source and maintainer	Who wrote this, and do you trust them the way you would trust a dependency author? A name in a Slack thread is not provenance.
Scope of the skill	What is it supposed to do? Write the one-sentence job down now, so every capability you find later can be checked against it.
Files included	What is actually in the folder beyond `SKILL.md`? Scripts, templates, assets, "other files"? Each non-Markdown file is a thing the agent can run, not just read.
Dynamic context, shell commands, and scripts	Does the body run commands, inject external output, or point at bundled scripts? This is where documentation turns into execution.
Network and secret-touching patterns	Does anything read environment variables, credentials, SSH or cloud keys, or reach out over the network? A skill that both reads secrets and makes requests is the classic exfiltration shape.
Encoded or obfuscated content	Is there base64, hex, or URL-encoded content? Decode it and judge what is inside. Unread encoded blobs are an automatic hold.
Runtime hooks and persistence attempts	Does it try to write cron jobs, shell startup files, systemd units, or agent config that loads on every future startup? A skill should not outlive the task.
Suppression and baseline decision	If you are ignoring a finding, which one, and why? Record it. SkillsGuard supports config ignore patterns, a baseline snapshot, and inline `// skillsguard-ignore` comments — each one is a decision someone should be able to defend later.
Reviewer and re-scan date	Who signed off, and when does this expire? Skills get updated upstream. A clean scan in June is not a clean scan in September.

Three of these tend to surprise people the first time through.

A bench is a bench, not a guarantee

It would be dishonest to hand you this and imply it closes the problem. It does not, and SkillsGuard's own docs are refreshingly upfront about why.

NEXT STEP

Build a skill quarantine bench before you share skills

The fields are free to copy. The session is for teams who want a second read before a skill loads into every session.

Practical AI Workflow Notes

Want more practical AI operations ideas?

Get short notes on applying AI inside real small-business workflows — from document handling and customer follow-up to internal reporting, compliance, and automation guardrails.

Turn this idea into a pilot

Which workflow should go first?

Use the readiness check to compare impact, effort, risk, owner, and next step before booking a call.

3-5 minutes
Deterministic score
No sensitive data

Check workflow readiness

Share this post

Don't ask an AI if you're audit-ready. Put it in a read-only room.

Your agent audit log needs a rehearsal, not a promise

Your SBOM stops before the agent starts

Article-specific next step

Bring one skill to the session

Bring the most-shared skill on your team. We will run it through the bench, separate the real findings from the false positives, and decide what gets to auto-load.

Pressure-test one skill

The fields are free to copy. The session is for teams who want a second read before a skill loads into every session.

Sensitive systems

Stalled infrastructure work can be scoped without exposing private details.

For an anonymized certification board, BaristaLabs completed an AKS upgrade in 1 week with zero downtime and restored a vendor-supported Kubernetes version path.

0
application downtime: 4x
more subnet IP capacity

Anonymized case study for regulated technical work.

Client and infrastructure details stay confidential.

Read case study

Share this post

Don't ask an AI if you're audit-ready. Put it in a read-only room.

Your agent audit log needs a rehearsal, not a promise

Your SBOM stops before the agent starts