Industry Insights

The transaction memo is part of your AI attack surface now

A tiny transfer memo became a prompt-delivery path. Before an AI assistant reads payments, tickets, emails, or PDFs, map which fields are data and which actions they can influence.

Sean McLellan

Lead Architect & Founder

June 10, 20266 min read

A banking customer asks a normal question: "Show me my recent transactions."

The app retrieves amounts, dates, merchants, and descriptions. One description was written by someone else. It arrived through a tiny transfer.

That is the uncomfortable lesson in Blue41's case study about bunq's financial AI assistant. Blue41 says its proof of concept used a EUR0.02 transfer. The attacker controlled the transaction description. When the banking assistant later pulled recent transactions into the model context, that description was no longer just a payment memo. It was text sitting beside the instructions that shaped the answer.

Blue41 says it helped bunq, described in the post as Europe's second-largest digital bank with more than 20 million customers, secure the assistant against spearphishing risk. The banking detail is the hook. The operating lesson is wider: any field your system retrieves for an AI answer can become a prompt-delivery channel.

For most companies, the equivalent field is not a transaction memo. It is a support ticket, a CRM note, a vendor email, an uploaded PDF, a web page, a survey response, or a chat transcript.

The question is not "can we write a better prompt?" The better question is: which fields are allowed to influence which outputs?

The attack traveled through a trusted-looking field

Prompt injection usually gets explained as a user typing hostile instructions into a chatbot. That mental model misses the Blue41 scenario.

In the bunq proof of concept, the attacker did not need to interact with the victim's assistant. The attacker only needed to put text somewhere the assistant would later read.

The path looked like this:

Send a small transfer.
Put hostile text in the transfer description.
Wait for the victim to ask a routine question about recent transactions.
Let the assistant retrieve the transaction data.
Let the model confuse attacker-written data with instructions.

A Hacker News commenter compressed the issue well: "It was never about the prompt, it is about the prompt delivery."

That framing matters because it moves the security review from model personality to data plumbing. If a field can be written by an outsider, imported from a partner, copied from the web, or pasted by a customer, it should not quietly enter the same decision space as trusted application instructions.

OWASP's LLM01 Prompt Injection guidance defines prompt injection as prompts altering an LLM's behavior or output in unintended ways. It also describes indirect prompt injection, where content from external sources such as websites or files changes model behavior when the model accepts that content as input.

That is the key distinction. The dangerous text does not have to be typed by the person using the assistant. It only has to be retrieved.

A field inventory beats another policy paragraph

Teams often respond to prompt-injection risk by adding one more instruction: ignore malicious content, follow the system prompt, do not reveal secrets, do not click links.

Those lines can help, but they are too vague to be the control surface.

The useful artifact is a field inventory for one workflow. Not a grand AI governance framework. One page. One workflow. Every text field the assistant reads.

For a payment or account-support workflow, the inventory should answer seven questions:

What is the field?
Who can write it?
Does the assistant need it for this task?
What may the assistant do after reading it?
What outputs or actions are blocked?
What evidence gets logged?
Who owns exceptions?

Untrusted context map for a financial AI assistant — Map untrusted fields before an AI assistant reads customer or transaction data.

Field	Writer	Needed for this task?	Allowed influence	Blocked influence	Evidence to keep	Owner
Transaction description	External sender, merchant, customer	Only for transaction summaries	Neutral summary of the record	Links, credential requests, security advice	Record ID, field used, blocked reason	Fraud or support lead
Support ticket body	Customer or impersonator	Yes	Draft response for review	Credential reset, account changes, refund promise	Ticket ID, retrieved sources, proposed action	Support manager
CRM note	Internal staff or imported system	Sometimes	Account-history summary	Treat note as policy or permission override	Note ID, field used, output category	Account owner
Uploaded PDF	Customer, vendor, applicant	Depends	Extract structured facts	Follow document instructions, open URLs, forward files	File ID, extracted fields, blocked instruction	Operations owner
Vendor email	External vendor or compromised mailbox	Sometimes	Classify request and urgency	Approve payment, change bank details, share sensitive data	Email ID, sender domain, requested action	Finance lead
Web page	Public internet	Rarely for internal actions	Summarize with citation	Execute page instructions or copy hidden text into actions	URL, fetch time, output category	Workflow owner

The table is deliberately plain. It forces a sentence like this:

"A transaction description may help summarize spending, but it may not cause the assistant to generate a reauthentication link."

That sentence is much more enforceable than "be safe."

The same shape shows up outside banking

The Blue41 article is about financial services, but the pattern is already sitting inside ordinary business systems.

A customer support team may feed ticket bodies into a drafting assistant. A property manager may feed lease PDFs into a summarizer. A finance team may feed vendor emails into an AP workflow. A sales team may feed CRM notes into an account assistant. A marketing team may feed public webpages into a research assistant.

Each source has a different trust level.

A signed internal policy is not the same as a customer message. A merchant-supplied payment reference is not the same as a bank-generated transaction ID. A vendor email is not the same as an approved supplier record. A scraped webpage is not the same as your own product catalog.

When those fields reach an LLM, the model does not automatically preserve those distinctions. The application has to preserve them.

That is why the action after reading matters so much. A low-risk summary can tolerate more messy context. A system that can send emails, create tickets, request credentials, update records, approve refunds, or initiate payment changes needs tighter field rules.

This is the same boundary problem behind our support bot credential reset guidance. A bot can gather facts and prepare a handoff. That does not mean it should complete the reset.

Guardrails need a narrower job

Blue41's mitigation section is practical: minimize unnecessary context, treat retrieved data as untrusted, constrain sensitive outputs and actions, and monitor runtime behavior.

The OWASP LLM Prompt Injection Prevention Cheat Sheet points in the same direction with structured prompts, clear separation between instructions and data, output monitoring, human-in-the-loop controls, least privilege, and comprehensive monitoring.

The common thread is specificity.

A broad assistant forces the guardrail to decide too much at runtime. It sees messy text, retrieves many records, and can produce many kinds of output. The control has to infer intent from a huge surface area.

A narrow workflow gives the control something concrete to enforce:

Transaction descriptions can be quoted or summarized, but cannot produce links.
Support tickets can produce draft replies, but cannot reset credentials.
Vendor emails can classify urgency, but cannot change bank details.
Uploaded PDFs can populate structured fields, but cannot pass hidden instructions into tool calls.
Public web pages can be cited, but cannot instruct the agent to execute a downstream action.

That is where an AI approval queue becomes useful. It is not just a person clicking yes or no. It is the review surface where the system shows which fields were read, what trust level they had, which action was proposed, and which rule triggered the stop.

The queue is stronger when the field rules are already clear.

RAG and fine tuning do not solve the memo problem

Retrieval augmented generation can make an assistant more useful. Fine tuning can shape style and domain behavior. Neither changes the fact that retrieved text can carry instructions.

OWASP's LLM01 guidance says RAG and fine tuning do not fully mitigate prompt injection vulnerabilities. The Blue41 case shows why. The vulnerable moment appears when external content enters the prompt path and the model has to decide how to treat it.

A stronger model may refuse more obvious attacks. A better classifier may catch suspicious wording. A better prompt may reduce the number of failures.

Those layers are worth having. They do not answer the design question: should this field be allowed to affect this action?

If the answer is no, the control should not depend on the model noticing danger. The application should remove the field, mark it as untrusted, transform it into a safer structured value, block sensitive output types, or route the request to review.

For teams evaluating vendors, ask these questions before you ask for benchmark slides:

Which retrieved fields are treated as untrusted?
Can customers, vendors, or public sources write any of those fields?
Are data fields separated from instructions in the prompt structure?
Which outputs are impossible from untrusted fields?
Which tool calls require a human decision?
Can logs show the source fields behind a blocked or escalated answer?

If the answer is only "our model is robust," keep going.

Runtime evidence is the recovery plan

Preventive controls will miss things. Attackers adapt wording. Business context changes. A field that looked harmless in a read-only pilot may become risky once the assistant gets a new action.

Runtime evidence is what lets a team reconstruct the event without guessing.

For a higher-risk assistant, the log should show at least:

the user request category
the retrieved source fields
each field's trust level
the response category
blocked links, phrases, or actions
proposed tool calls
final action taken
reviewer decision, if a person reviewed it

That is not surveillance theater. It is incident response for AI-mediated work.

It also connects directly to workflow receipts for agent evals. The final answer is not enough. The system should prove which data it read, which action it proposed, which boundary it respected, and what changed afterward.

For higher-risk workflows, define the agent operating envelope before launch. The envelope should say what the assistant may read, what it may write, what it may recommend, what it may never do, and which exceptions go to review.

The practical test: pick one field this week

The bunq case is easy to file away as a banking story. That would miss the point.

The relevant object is the transaction description: a mundane field that became powerful only because an assistant retrieved it and generated a response from it.

Every company has its own version of that field.

Pick one AI-assisted workflow and choose one external text field inside it. Then write three rules:

When does the assistant need this field?
What can the field influence?
What can the field never influence?

If the field can influence money movement, credential recovery, customer promises, compliance decisions, legal language, medical advice, employment decisions, or sensitive data sharing, add a review path before launch.

BaristaLabs helps teams turn that kind of boundary into practical implementation work: field inventories, approval queues, operating envelopes, and smaller automations that leave a receipt. If you are evaluating an assistant that reads customer records, payment notes, tickets, documents, or emails, start with the field map before the prompt.

For teams that want help turning this into a safe rollout plan, BaristaLabs offers AI consulting and process automation support focused on useful workflows with clear review boundaries.

AI Pilot Readiness Checklist

Turn the idea into a pilot you can defend.

AI agent articles are easy to bookmark and hard to operationalize. Use the readiness questions as a shared way to decide whether a workflow is specific enough, safe enough, and measurable enough to pilot. If they surface a strong candidate, BaristaLabs can review it with you and help shape a first version that fits your systems, approval process, and risk tolerance.

Turn this into a pilot plan Talk through a pilot candidate with BaristaLabs

Please do not submit PHI, customer records, credentials, or confidential workflow exports.

Practical AI Workflow Notes

Want more practical AI operations ideas?

Get short notes on applying AI inside real small-business workflows — from document handling and customer follow-up to internal reporting, compliance, and automation guardrails.

Share this post

Share on X Share on LinkedIn Share on Bluesky

The AI code scan is not the control. The remediation receipt is.

July 9, 2026

The phishing email looks perfect now

July 6, 2026

When the workflow is known, don't let the agent invent the route

July 6, 2026

Industry Insights

The transaction memo is part of your AI attack surface now

A tiny transfer memo became a prompt-delivery path. Before an AI assistant reads payments, tickets, emails, or PDFs, map which fields are data and which actions they can influence.

Sean McLellan

Lead Architect & Founder

June 10, 20266 min read

A banking customer asks a normal question: "Show me my recent transactions."

The app retrieves amounts, dates, merchants, and descriptions. One description was written by someone else. It arrived through a tiny transfer.

For most companies, the equivalent field is not a transaction memo. It is a support ticket, a CRM note, a vendor email, an uploaded PDF, a web page, a survey response, or a chat transcript.

The question is not "can we write a better prompt?" The better question is: which fields are allowed to influence which outputs?

The attack traveled through a trusted-looking field

Prompt injection usually gets explained as a user typing hostile instructions into a chatbot. That mental model misses the Blue41 scenario.

In the bunq proof of concept, the attacker did not need to interact with the victim's assistant. The attacker only needed to put text somewhere the assistant would later read.

The path looked like this:

Send a small transfer.
Put hostile text in the transfer description.
Wait for the victim to ask a routine question about recent transactions.
Let the assistant retrieve the transaction data.
Let the model confuse attacker-written data with instructions.

A Hacker News commenter compressed the issue well: "It was never about the prompt, it is about the prompt delivery."

That is the key distinction. The dangerous text does not have to be typed by the person using the assistant. It only has to be retrieved.

A field inventory beats another policy paragraph

Teams often respond to prompt-injection risk by adding one more instruction: ignore malicious content, follow the system prompt, do not reveal secrets, do not click links.

Those lines can help, but they are too vague to be the control surface.

The useful artifact is a field inventory for one workflow. Not a grand AI governance framework. One page. One workflow. Every text field the assistant reads.

For a payment or account-support workflow, the inventory should answer seven questions:

What is the field?
Who can write it?
Does the assistant need it for this task?
What may the assistant do after reading it?
What outputs or actions are blocked?
What evidence gets logged?
Who owns exceptions?

Field	Writer	Needed for this task?	Allowed influence	Blocked influence	Evidence to keep	Owner
Transaction description	External sender, merchant, customer	Only for transaction summaries	Neutral summary of the record	Links, credential requests, security advice	Record ID, field used, blocked reason	Fraud or support lead
Support ticket body	Customer or impersonator	Yes	Draft response for review	Credential reset, account changes, refund promise	Ticket ID, retrieved sources, proposed action	Support manager
CRM note	Internal staff or imported system	Sometimes	Account-history summary	Treat note as policy or permission override	Note ID, field used, output category	Account owner
Uploaded PDF	Customer, vendor, applicant	Depends	Extract structured facts	Follow document instructions, open URLs, forward files	File ID, extracted fields, blocked instruction	Operations owner
Vendor email	External vendor or compromised mailbox	Sometimes	Classify request and urgency	Approve payment, change bank details, share sensitive data	Email ID, sender domain, requested action	Finance lead
Web page	Public internet	Rarely for internal actions	Summarize with citation	Execute page instructions or copy hidden text into actions	URL, fetch time, output category	Workflow owner

The table is deliberately plain. It forces a sentence like this:

"A transaction description may help summarize spending, but it may not cause the assistant to generate a reauthentication link."

That sentence is much more enforceable than "be safe."

The same shape shows up outside banking

The Blue41 article is about financial services, but the pattern is already sitting inside ordinary business systems.

Each source has a different trust level.

When those fields reach an LLM, the model does not automatically preserve those distinctions. The application has to preserve them.

This is the same boundary problem behind our support bot credential reset guidance. A bot can gather facts and prepare a handoff. That does not mean it should complete the reset.

Guardrails need a narrower job

Blue41's mitigation section is practical: minimize unnecessary context, treat retrieved data as untrusted, constrain sensitive outputs and actions, and monitor runtime behavior.

The common thread is specificity.

A narrow workflow gives the control something concrete to enforce:

Transaction descriptions can be quoted or summarized, but cannot produce links.
Support tickets can produce draft replies, but cannot reset credentials.
Vendor emails can classify urgency, but cannot change bank details.
Uploaded PDFs can populate structured fields, but cannot pass hidden instructions into tool calls.
Public web pages can be cited, but cannot instruct the agent to execute a downstream action.

The queue is stronger when the field rules are already clear.

RAG and fine tuning do not solve the memo problem

Retrieval augmented generation can make an assistant more useful. Fine tuning can shape style and domain behavior. Neither changes the fact that retrieved text can carry instructions.

A stronger model may refuse more obvious attacks. A better classifier may catch suspicious wording. A better prompt may reduce the number of failures.

Those layers are worth having. They do not answer the design question: should this field be allowed to affect this action?

For teams evaluating vendors, ask these questions before you ask for benchmark slides:

Which retrieved fields are treated as untrusted?
Can customers, vendors, or public sources write any of those fields?
Are data fields separated from instructions in the prompt structure?
Which outputs are impossible from untrusted fields?
Which tool calls require a human decision?
Can logs show the source fields behind a blocked or escalated answer?

If the answer is only "our model is robust," keep going.

Runtime evidence is the recovery plan

Preventive controls will miss things. Attackers adapt wording. Business context changes. A field that looked harmless in a read-only pilot may become risky once the assistant gets a new action.

Runtime evidence is what lets a team reconstruct the event without guessing.

For a higher-risk assistant, the log should show at least:

the user request category
the retrieved source fields
each field's trust level
the response category
blocked links, phrases, or actions
proposed tool calls
final action taken
reviewer decision, if a person reviewed it

That is not surveillance theater. It is incident response for AI-mediated work.

The practical test: pick one field this week

The bunq case is easy to file away as a banking story. That would miss the point.

The relevant object is the transaction description: a mundane field that became powerful only because an assistant retrieved it and generated a response from it.

Every company has its own version of that field.

Pick one AI-assisted workflow and choose one external text field inside it. Then write three rules:

When does the assistant need this field?
What can the field influence?
What can the field never influence?

For teams that want help turning this into a safe rollout plan, BaristaLabs offers AI consulting and process automation support focused on useful workflows with clear review boundaries.

AI Pilot Readiness Checklist

Turn the idea into a pilot you can defend.

Turn this into a pilot plan Talk through a pilot candidate with BaristaLabs

Please do not submit PHI, customer records, credentials, or confidential workflow exports.

Practical AI Workflow Notes

Want more practical AI operations ideas?

Get short notes on applying AI inside real small-business workflows — from document handling and customer follow-up to internal reporting, compliance, and automation guardrails.

Share this post

Share on X Share on LinkedIn Share on Bluesky

The AI code scan is not the control. The remediation receipt is.

July 9, 2026

The phishing email looks perfect now

July 6, 2026

When the workflow is known, don't let the agent invent the route

July 6, 2026