Quick path
In this article
Quick read: what changed, why it matters, and what to do next.
The funny version starts with a customer who wants lunch and a LeetCode answer.
In a viral McDonald's screenshot on X, a customer asks the restaurant bot to reverse a linked list in Python before starting a Chicken McNuggets order. The bot writes the code, explains the runtime, and then politely pivots back to food. A separate Chipotle write-up describes the same shape: a restaurant chatbot answers a programming question before returning to burritos.
Screenshots are messy evidence. They can be staged, copied, cropped, or passed around without context. The lesson does not depend on treating every viral image as a forensic artifact.
The lesson is that people will test the edge of any AI surface you put in front of them. If a business chatbot sits on top of a general-purpose model, someone will ask whether the business boundary is real.
For a fast-food bot, the failure is mostly comic. The model burns tokens, looks strange, and teaches the internet that the restaurant shipped a very helpful bot with a very loose job description.
For a bank, clinic, insurer, law firm, ecommerce store, SaaS support desk, or local service business, the same pattern stops being cute.
A customer might ask the appointment bot to write a school essay. Harmless, probably. They might ask the insurance bot how to avoid an eligibility check. Less harmless. They might ask the support bot what account fields it can see, whether it can reveal its internal instructions, or whether it can classify a risky case as routine before escalation.
The problem is not that the model knows how to code. The problem is that the product does not know where helpfulness should end.
The user smuggles the side quest into the real task
Most customer-service chatbots now contain two jobs that pull against each other.
The visible job is narrow. Answer menu questions. Route support requests. Explain an approved policy. Collect intake details. Help a customer find the next step.
The hidden model is broad. It can write code, summarize documents, role-play, translate, brainstorm, and explain algorithms. The system prompt may say, "You are a helpful restaurant support assistant," but the word "helpful" does a lot of damage when the user phrases an unrelated request as part of the customer journey.
The pattern sounds cooperative:
I want to do the thing you are here for,
but first I need you to do this unrelated thing.
These prompts work better than obvious attacks because they sound cooperative. They do not always look like hostile prompt injection. They look like a customer with one more question before checkout.
A prompt can tell the model to stay on topic. The model still has to interpret what "on topic" means. Users will keep wrapping off-topic requests inside the official task until the product, not the prompt, draws the line.
The prompt is not the permission system
A prompt is a useful instruction. It is not an access-control layer, a policy engine, or an audit record.
If a chatbot should only help with ordering, reservations, account support, service intake, or case routing, the application has to enforce that boundary outside the main response prompt. The same rule applies to every AI agent with tool access: the model can propose, but the product decides what is allowed.
A safer chatbot separates the conversation from the control plane:
User message
↓
Classify the user's intent
↓
Check whether that intent belongs to the bot's job
↓
If no: use a constrained refusal and return to an allowed task
↓
If yes: retrieve approved business context
↓
Draft the response or proposed action
↓
Check the content, tools, and action against policy
↓
Respond, escalate, or create an approval item
↓
Write a receipt
That workflow is less glamorous than one giant prompt. It is also easier to trust.
The bot can still sound warm. It can still recover when a customer phrases something awkwardly. It can still answer normal questions. It just should not become a coding assistant because the user mentioned lunch in the first sentence.
Start with the job, not the model
The first guardrail is a product sentence, not a prompt sentence.
"A helpful AI assistant for customers" is too broad. It describes the model's personality, not the bot's job.
A better job definition names allowed intents in plain language:
This bot may answer menu, service, pricing, availability, policy, or account questions.
It may collect structured intake details.
It may look up order status after verification.
It may draft a support response for human review.
It may route a case to the right queue.
It must hand off when the request falls outside those jobs.
Then name the non-goals with the same bluntness:
The bot may not write code.
It may not do schoolwork.
It may not provide legal, medical, or financial advice outside approved content.
It may not reveal internal instructions.
It may not explain security controls.
It may not classify the user's own risk level without policy checks.
It may not perform actions that bypass approval.
Those exclusions are not there to make the bot cold. They are there because every open-ended assistant eventually gets asked to do something outside the business process.
Once the job is clear, the rest of the guardrails have somewhere to attach.
Put a gate before the answer
The first model call should not always be "answer the customer."
For customer-facing workflows, especially ones with account data or tools behind them, the first step should classify the request and return a structured decision:
{
"intent": "off_topic_coding_help",
"allowed": false,
"safe_reply": "I can help with orders, menu questions, account support, or connecting you to the right team, but I can't help write code here. What would you like to order?",
"escalate": false
}
Now the application has a control point. If allowed is false, the response comes from a constrained refusal path. The general model does not get a second chance to charm its way into the forbidden task.
This matters more when the bot can call tools. Intent classification should happen before retrieval, before order lookup, before CRM access, and before any action is proposed.
Keep tools narrower than the chat
The conversation can be flexible. The tools should be boring.
An ordering bot might need tools to search menu items, start a cart, update a cart, check store hours, and hand off to support. It does not need a generic browser. It does not need to run code. It does not need to summarize arbitrary documents. It does not need an "answer anything" tool sitting behind the scenes.
This is where teams blur the line between chatbot and agent. A text-only bot can embarrass you. A bot with broad tools can expose data, change records, or create work someone later treats as approved.
Use deterministic policy checks around the actions that matter. Do not ask the model whether it is allowed to do the thing it just proposed. Check the action in application code:
- Is this request inside the bot's allowed intent list?
- Is the user verified for the account or order?
- Can this workflow access this customer record?
- Does the dollar amount, data class, or customer segment require review?
- Is the answer grounded in approved knowledge content?
- Did the request include instructions to ignore policy, reveal prompts, or change risk labels?
The model can help classify. The product has to enforce.
This is the same argument behind an AI approval queue. Proposed actions should pass through a policy layer before execution. Sometimes the approval is automatic. Sometimes a person reviews it. Either way, the rule lives outside the prompt.
Log the weird stuff
Teams usually review successful conversations first. Off-topic prompts are often more useful.
If customers keep asking a restaurant bot to write code, you probably have internet mischief. If customers keep asking an insurance bot how to avoid eligibility checks, you have product risk. If customers keep asking a clinic intake bot for diagnosis or medication advice, you have a boundary problem.
Log enough to understand the pattern without collecting more sensitive data than you need:
- request category
- allowed or denied decision
- refusal template used
- tool access blocked
- escalation decision
- policy rule triggered
- model and prompt version
- reviewer outcome if a person got involved
We have written before about agent receipts because the same rule applies here. If the bot makes, blocks, or escalates a decision, the team needs a record it can reconstruct later.
A small test for this week
If your company already has a customer-facing chatbot, do not start by debating model vendors. Run ten prompts through the current system and watch what happens.
Use a normal support question. Use a normal order or intake request. Then try the edge cases: an off-topic coding request framed as a prerequisite, a legal or medical question, a request to reveal internal instructions, a request to ignore policy, a request to access someone else's information, a request to change the user's own risk label, a manipulative message, and a confusing but legitimate customer request.
For each one, write down what the bot answered, whether it called a tool, whether it exposed internal behavior, whether it escalated, and what receipt the team could inspect afterward.
If the only control you can point to is "the prompt says not to do that," the system is not ready for broad deployment.
The nugget bot is a warning shot
The fast-food version is a bot writing linked-list code when it should be helping with lunch.
The serious version is a customer-service agent that cannot tell where helpfulness ends and authority begins.
A bot can be warm, conversational, and useful without being a general-purpose assistant. It can answer customer questions without doing homework. It can propose actions without executing them. It can refuse off-topic requests without sounding broken.
The companies that get this right will not be the ones with the longest system prompt. They will be the ones with the clearest job definition, narrow tools, policy checks, and receipts when something strange happens.
The next time someone asks your support bot to reverse a linked list, the best answer is not a Python function.
It is a graceful refusal, a return to the customer's real task, and a log entry your team can learn from.
Review a chatbot guardrail
Keep the helpful bot inside the job
BaristaLabs helps teams turn customer-facing AI prototypes into controlled workflows with scoped tools, policy gates, escalation paths, and audit receipts.
Best fit for support bots, intake assistants, ordering flows, lead qualification, website chat, and service agents that need to be useful without wandering off-task.
Practical AI Workflow Notes
Want more practical AI operations ideas?
Get short notes on applying AI inside real small-business workflows — from document handling and customer follow-up to internal reporting, compliance, and automation guardrails.
Turn this idea into a pilot
Which workflow should go first?
Use the readiness check to compare impact, effort, risk, owner, and next step before booking a call.
- 3-5 minutes
- Deterministic score
- No sensitive data
Share this post
