Before an agent remembers for you, make it forget on purpose

AI agents need a state ledger before they need more context

Agent receipts: what to log before AI touches customer work

Agent evals should test workflow receipts, not just model answers

Map one memory misfire drill

Article-specific next step

Run a memory misfire drill before an agent remembers for you

Best fit for support, operations, agency, and internal automation teams piloting persistent-memory agents on customer or project context.

Share this post

AI agents need a state ledger before they need more context

Agent receipts: what to log before AI touches customer work

Agent evals should test workflow receipts, not just model answers

Map one memory misfire drill Review the security worksheet

Keep Reading

Industry Insights

Before an agent remembers for you, make it forget on purpose

Sean McLellan

Lead Architect & Founder

June 29, 20268 min read

The support agent tells the customer, helpfully and with a citation, that the card on file is the Amex ending 4022.

What a memory bounty is actually testing

The reason this scene is worth dwelling on right now is that "give your agent persistent memory" stopped being a research project and became a package you install.

"Recall worked" answers the wrong question

Watch how those come apart in the bounty thread, because the submissions are basically a field guide to memory misfires:

Stale-but-valid-yesterday. One submission (PR #823, reported in the thread) describes a time-travel query, search_as_of, dropping memories that expired today even when they were valid at the historical timestamp the caller actually asked about. Freshness logic is subtle: "expired now" and "expired as of the date in question" are different facts, and conflating them is how you either leak a dead fact or hide a live one.
Low-confidence, dressed as fact. Another (PR #801, reported in the thread) reports that numeric min_confidence thresholds were being translated into categorical keywords the stored memories did not actually use — so memories the caller meant to filter out by confidence could slip back in. A confidence threshold that does not bite is not a threshold.
Cross-agent and session bleed. The boundary between whose memory is whose is its own failure surface. The merged PR #820, "Fix session mismatch status codes," shows at least one session-and-authorization behavior moving through maintainer review. Separately, a batch of security submissions in the thread (this comment) tested path traversal, unauthenticated UI endpoints, reflected-origin CORS, and arbitrary file write — the plumbing failures that decide whether a memory store can be read or written by something that should never touch it.

The memory misfire drill

Run one fact down this column. Each row is a way the fact can be true and still wrong.

Scroll sideways to see all 3 columns.

Drill row	The misfire it catches	What to write down before you trust the recall
Remembered fact	The thing itself, stated plainly	The exact fact as the agent would surface it to a customer or a colleague, in its own words. If you can't write it in one sentence, the agent shouldn't be storing it.
Source	A fact with no traceable origin	Where this came from — a specific message, document, or human edit — captured as provenance, not as a vibe. A fact you can't trace is a fact you can't defend or delete cleanly.
Scope	Wrong customer, wrong project, wrong account	Whose fact this is: the customer, account, or project it belongs to, and an explicit rule that it never surfaces outside that scope. One namespace per blast radius.
Freshness	True yesterday, dangerous today	A valid-until date or a superseded-by rule, and the behavior when a fact is asked for "as of" a past date versus now. Decide what happens to a fact that was valid then and is expired now.
Confidence	A guess wearing a fact's clothes	The minimum confidence required for this fact to be recalled at all, and proof the threshold actually filters — that a low-confidence extraction stays out instead of slipping back in.
Contradiction	Two true-looking facts that disagree	A worked example of the conflicting fact (cancelled card vs. card on file) and the rule for which one wins. Newest? Highest-confidence? Human-edited? Pick, and write it down.
Boundary	One agent reading another's memory	The cross-agent and cross-session line: which agent or session may read this, which may not, and what the system returns when the wrong one asks.
Edit and delete	A wrong memory you can't take back	The exact route to correct or remove this fact — the `edit` and `forget` equivalent for your stack — and who can run it under pressure, fast, without a redeploy.
Pass or fail	"It recalled" mistaken for "it's right"	The single rule that decides whether this fact is safe to use: not "the agent returned something," but "the agent returned the right fact, in scope, fresh, above threshold, with no live contradiction."

The two rows that catch people are Freshness and Contradiction.

Why this is now an operations primitive

What changed is not that agents got smarter. It is that their memory got durable, and durable memory is an operational surface, not a feature.

Your next move

Memory drill help

Run a memory misfire drill before an agent remembers for you

Best fit for support, operations, agency, and internal automation teams piloting persistent-memory agents on customer or project context.

Practical AI Workflow Notes

Want more practical AI operations ideas?

Get short notes on applying AI inside real small-business workflows — from document handling and customer follow-up to internal reporting, compliance, and automation guardrails.

Turn this idea into a pilot

Which workflow should go first?

Use the readiness check to compare impact, effort, risk, owner, and next step before booking a call.

3-5 minutes
Deterministic score
No sensitive data

Check workflow readiness

Share this post

AI agents need a state ledger before they need more context

Agent receipts: what to log before AI touches customer work

Agent evals should test workflow receipts, not just model answers

Map one memory misfire drill

Article-specific next step

Run a memory misfire drill before an agent remembers for you

Best fit for support, operations, agency, and internal automation teams piloting persistent-memory agents on customer or project context.

Share this post

AI agents need a state ledger before they need more context

Agent receipts: what to log before AI touches customer work

Agent evals should test workflow receipts, not just model answers