Over 90% of non-security ML practitioners surveyed said they perceived no risk of arbitrary code execution when loading a Keras model with safe_mode=True. A team of security researchers just demonstrated six zero-day vulnerabilities proving otherwise — including the first CVEs ever assigned to that flag.
The paper, "On the (In)Security of Loading Machine Learning Models," examined the attack surface that opens the moment you call a framework's load() function. Not during training. Not during inference. During the load step itself — the step most teams treat as a file read.
What safe_mode actually blocked
Keras introduced safe_mode to restrict deserialization of arbitrary Python objects during model loading. The idea was sound: prevent pickle-style code execution by limiting what the loader would instantiate. In practice, the researchers found multiple bypass paths. The specifics that earned CVE numbers involved crafting model files that passed safe_mode's allowlist checks while still triggering code execution through object instantiation chains the flag was not designed to catch.
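The shape of that flaw can be shown with a toy loader. This is a hedged sketch, not Keras internals — every class and function name here (Dense, Initializer, safe_load) is hypothetical — but it illustrates how an allowlist check can pass while a "safe" class's constructor still executes attacker-controlled input:

```python
# Toy sketch (NOT Keras internals; all names hypothetical): an allowlist
# loader can still run attacker-chosen code through the constructor of a
# class it considers safe -- an instantiation chain the check never sees.

class Dense:
    def __init__(self, units):
        self.units = units

class Initializer:
    # A "legitimate" allowlisted class whose constructor evaluates a
    # config string. The allowlist check passes; eval still runs.
    def __init__(self, expression):
        self.value = eval(expression)

ALLOWED = {"Dense": Dense, "Initializer": Initializer}

def safe_load(config):
    # Instantiate only allowlisted classes from a model-config dict.
    cls = ALLOWED.get(config["class_name"])
    if cls is None:
        raise ValueError("blocked: class not on the allowlist")
    return cls(**config["config"])

# Passes the allowlist, yet executes attacker input during loading:
payload = {"class_name": "Initializer",
           "config": {"expression": "__import__('os').getpid()"}}
obj = safe_load(payload)
print(type(obj).__name__)  # → Initializer
```

The allowlist asked "is this class permitted?" — it never asked "what does this class do with these arguments?" That second question is where the real CVEs live.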
Six zero-days across the model loading surface. Some in Keras directly. The paper notes these are the first CVEs ever filed against safe_mode — meaning that until this work, the security community had not formally challenged the assumption baked into that boolean.
The Hugging Face scanner gap
Hugging Face runs integrated security scanners on uploaded models. The researchers tested whether those scanners caught their framework-level exploits. The answer: not reliably. The scanners are pattern-based and tuned for known payload signatures — pickle exploits, obvious shell calls, known-bad serialization patterns. Framework-level bypasses that exploit legitimate loading paths do not match those signatures.
This is the gap that matters operationally. A team pulling a fine-tuned model from the Hub sees a green checkmark from the scanner. The scanner checked what it knows to check. It did not check what the Keras researchers found.
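A minimal sketch of why signature matching misses this class of exploit — the signature list and payload here are illustrative assumptions, not Hugging Face's actual scanner logic:

```python
import os
import pickle

# Hypothetical known-bad patterns, as a signature-based scanner might hold.
SIGNATURES = [b"os.system", b"subprocess", b"eval("]

def naive_scan(blob: bytes) -> bool:
    # Flags a blob only when a known-bad byte pattern appears verbatim.
    return any(sig in blob for sig in SIGNATURES)

class Payload:
    # pickle calls the function returned by __reduce__ at load time.
    # os.getenv is a harmless stand-in for an attacker's code path that
    # matches no signature in the scanner's database.
    def __reduce__(self):
        return (os.getenv, ("HOME",))

blob = pickle.dumps(Payload())
print(naive_scan(blob))      # → False: no signature matches
result = pickle.loads(blob)  # ...but loading still executed os.getenv
```

The scan returns clean; the load still runs a function the file's author chose. Swap the harmless stand-in for a real exploit path and the green checkmark is unchanged.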
The survey number that reframes the risk
The 90% figure comes from a practitioner survey the authors ran alongside the technical work. They asked ML engineers — specifically non-security-focused ones — whether loading a model with safe_mode=True could result in arbitrary code execution. Over nine in ten said no. That gap between perceived safety and actual safety is where incidents start.
Most ML deployment pipelines treat model loading as deterministic I/O: read the weights file, hydrate the graph, run inference. The paper's core argument is that loading is execution. Every framework that deserializes objects during load — Keras, PyTorch (via pickle), TensorFlow SavedModel — runs code paths the user did not write and may not audit.
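The pickle case makes "loading is execution" concrete. The class name below is invented for illustration, but the mechanism — `__reduce__` telling pickle to call a function during deserialization — is exactly how pickle-format model files can carry code:

```python
import pickle

class NotJustWeights:
    # __reduce__ tells pickle how to rebuild this object: call a function
    # with arguments. pickle.loads performs that call -- load is execution.
    def __reduce__(self):
        return (eval, ("2 + 2",))  # stand-in for attacker-chosen code

blob = pickle.dumps(NotJustWeights())  # looks like an inert byte string
result = pickle.loads(blob)            # eval("2 + 2") runs here
print(result)  # → 4
```

Nothing in `blob` announces itself as code. The execution is a documented feature of the serialization format, which is why "just reading the weights file" is never just a file read.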
One loading audit worth running this week
For an ops lead managing a model pipeline at a 20–50 person firm — maybe pulling LoRA adapters or fine-tuned checkpoints from the Hub into a FastAPI serving stack — the exposure is specific:
- Inventory your load paths. Which framework's load() or from_pretrained() are you calling? If it is Keras with safe_mode=True, you now know that flag has documented bypass routes.
- Do not treat Hub scanner results as a security boundary. The green check means the scanner's known-pattern database did not trigger. It does not mean the model file is inert.
- Sandbox model loading. Run the load step in an isolated container or VM with no network access and minimal filesystem permissions. If the load step needs nothing beyond reading weights into memory, give it nothing beyond that.
- Pin framework versions. The CVEs were disclosed responsibly, meaning patches exist or are in progress. Running an unpatched Keras version with safe_mode=True against untrusted model files is running unpatched code with a false safety label.
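The sandbox step can start smaller than a container change. A hedged sketch of in-process isolation as a first layer — the loader script and timeout value are assumptions, and a real deployment should still add the container or VM boundary described above:

```python
import subprocess
import sys

# Minimal sketch: run the untrusted load step in a child process with a
# stripped environment and a hard timeout. Real isolation should add a
# container/VM boundary, no network, and a read-only filesystem. The
# loader body here is a placeholder standing in for a framework load call.
LOADER = """
print("load ok")
"""

proc = subprocess.run(
    [sys.executable, "-I", "-c", LOADER],  # -I: isolated mode, no user site
    env={},               # no inherited secrets or credentials
    capture_output=True,
    text=True,
    timeout=30,           # a hung or resource-burning payload gets killed
)
print(proc.stdout.strip())
```

Even this thin layer means a malicious load step executes without your API keys in its environment and dies after thirty seconds instead of persisting inside the serving process.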
The cost of sandboxing a model load step in a containerized pipeline is near zero — a Dockerfile change and a restricted network policy. The cost of not doing it is an arbitrary code execution path inside your inference stack, blessed by a flag named safe_mode that 90% of your team assumed was sufficient.
Skip the scanner checkmark. Sandbox the load.
