Six months ago the feature was one Gemini call: summarize the support ticket, suggest a reply. Somebody added a step that checks the customer's order status. Somebody else wired in a search tool so the model could pull the current return policy instead of a stale one baked into the prompt. Then it started generating a formatted PDF for the agent to send. The whole thing now takes ninety seconds, calls two tools, produces a file, and occasionally needs to keep working after the customer has closed the tab.
Nobody redesigned this feature on purpose. It grew one reasonable addition at a time, and nobody stopped to draw the diagram until something needed explaining. The code still calls it "the Gemini call." It hasn't been a call for a while. It's a job now, with steps, a clock, and a result you have to retrieve instead of just receiving.
Google just gave that shift a name and an API to match.
What Google actually announced
On June 22, Google said the Interactions API has reached general availability and is "now our primary API for interacting with Gemini models and agents." It had been in public beta since December 2025; GA means the schema is stable and, per Google, documentation across Google AI Studio and the Gemini API now defaults to it, with the company working with ecosystem partners to make it the default in third-party SDKs too.
The feature list attached to GA is long: Managed Agents, background execution, built-in tool mixing, a "From Roles to Steps" schema simplification, Flex and Priority service tiers, field-level errors, and the ability to retrieve past interactions for up to 55 days on paid tiers. We've already covered Managed Agents and hosted AI sandboxes on their own: one API call that provisions a remote Linux sandbox where an agent can reason, run code, browse, and manage files. Keep the scope straight here. Managed Agents is one feature inside this shift, not the whole shift. What matters more is what the API itself assumes about the shape of the work now.
Google's own documentation is blunt about the hierarchy: "generateContent is now considered legacy but remains fully supported." Nothing breaks. But the same docs also call the Interactions API "recommended for all new projects," and the GA announcement is sharper still: expect frontier capabilities for long-running models and agents to increasingly land on the Interactions API only, not on both.
From a call to a resource
The mechanical difference matters, because it's what actually changes how you build.
A generateContent call is stateless. You send everything relevant every time, you get one response back, and the conversation "history" is just you re-sending the transcript on the next request. That model works well for exactly what it sounds like: single-turn or lightly-threaded text generation where the client owns all the memory.
The Interactions API centers on something different: an Interaction, a resource with an ID, representing a full turn that contains a chronological list of steps — model thoughts, tool calls, tool results, and finally a model output. You can hand a completed interaction's ID back as previous_interaction_id on the next call, and the server carries the state forward, which the docs also note tends to improve implicit caching. You can set background=true and the interaction runs asynchronously server-side instead of holding a connection open for ninety seconds. You can retrieve a finished interaction later by ID, inspect its steps for debugging, or render them in a UI so a human can see what the model actually did, not just what it said.
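To make that shape concrete, here is a minimal local model of the resource. It is a sketch for reasoning about the structure, not the SDK's own types: the class names, the `kind` strings, and the helper methods are all assumptions, chosen to mirror the four step kinds listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical step kinds, named after the four the docs describe.
STEP_KINDS = ("thought", "tool_call", "tool_result", "model_output")


@dataclass
class Step:
    kind: str      # one of STEP_KINDS
    content: str

    def __post_init__(self) -> None:
        if self.kind not in STEP_KINDS:
            raise ValueError(f"unknown step kind: {self.kind}")


@dataclass
class Interaction:
    id: str
    steps: List[Step] = field(default_factory=list)  # chronological order
    previous_interaction_id: Optional[str] = None

    def final_output(self) -> Optional[str]:
        """Return the last model output, or None if the turn has none yet."""
        for step in reversed(self.steps):
            if step.kind == "model_output":
                return step.content
        return None

    def tool_trace(self) -> List[str]:
        """What the model did (tool calls and results), not what it said."""
        return [s.content for s in self.steps
                if s.kind in ("tool_call", "tool_result")]


# Illustrative data only; IDs and tool names are made up.
ticket = Interaction(
    id="int_001",
    steps=[
        Step("thought", "Need order status before drafting a reply."),
        Step("tool_call", "get_order_status(order_id='A-1')"),
        Step("tool_result", "shipped"),
        Step("model_output", "Your order has shipped."),
    ],
)
```

The point of separating `final_output()` from `tool_trace()` is the same one the API makes: the answer and the record of how it was reached are different things, and a debugging view or audit UI wants the second.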
That's the real shift. Gemini work stops being "send a prompt, get text back" and starts being "start something, get a durable record, check on it, retrieve it, maybe let it keep running while you do something else."
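The "check on it" part usually ends up as a polling loop somewhere in your code. The sketch below keeps the API out of it entirely: `fetch` is a hypothetical callable you would wire to whatever retrieves an interaction by ID, and the status strings are assumptions, so check the API reference for the real values before relying on them.

```python
import time
from typing import Callable, Dict

# Assumed terminal statuses; not confirmed against the API reference.
TERMINAL = {"completed", "failed", "cancelled"}


def wait_for_interaction(
    interaction_id: str,
    fetch: Callable[[str], Dict],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll `fetch` until the record is terminal or the time budget runs out."""
    waited = 0.0
    while True:
        record = fetch(interaction_id)
        if record["status"] in TERMINAL:
            return record
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"{interaction_id} still {record['status']} after {waited:.0f}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for a real retrieval call: two pending reads, then a result.
_fake_reads = iter([
    {"status": "in_progress"},
    {"status": "in_progress"},
    {"status": "completed", "output": "reply drafted"},
])
result = wait_for_interaction(
    "int_001",
    fetch=lambda _id: next(_fake_reads),
    sleep=lambda _seconds: None,  # skip real waiting in the demo
)
```

Injecting `fetch` and `sleep` is a deliberate choice: the loop stays testable without a network, and the timeout forces someone to decide how long a "job" is allowed to run before it counts as stuck.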

There's a catch worth flagging before anyone gets excited about "state, finally": tools, system_instruction, and generation_config are scoped to a single interaction, not carried forward automatically. If you want a tool available on turn four, you specify it again on turn four, even with previous_interaction_id set. State means the conversation record persists. It does not mean your configuration follows along for free. That's an easy assumption to get wrong once, in production, at 2 a.m.
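One way to avoid that mistake is to never build a turn by hand. The helper below is a minimal sketch, assuming the request is a plain dictionary: `tools`, `system_instruction`, `generation_config`, and `previous_interaction_id` are the field names discussed above, while `model`, `input`, the tool definition, and the model ID are placeholders for illustration.

```python
from typing import Any, Dict, List, Optional


def build_turn(
    user_input: str,
    *,
    model: str,
    tools: List[Dict[str, Any]],
    system_instruction: str,
    generation_config: Dict[str, Any],
    previous_interaction_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one turn's request body, always re-attaching per-turn config."""
    body: Dict[str, Any] = {
        "model": model,
        "input": user_input,
        # Scoped to a single interaction, so they are sent on every turn.
        "tools": tools,
        "system_instruction": system_instruction,
        "generation_config": generation_config,
    }
    if previous_interaction_id is not None:
        # Carries the conversation record forward, not the config above.
        body["previous_interaction_id"] = previous_interaction_id
    return body


# Hypothetical shared config for the support-ticket feature.
SUPPORT_CONFIG = dict(
    model="your-model-id",
    tools=[{"type": "function", "name": "get_order_status"}],
    system_instruction="You draft replies for support agents.",
    generation_config={"temperature": 0.2},
)

turn_one = build_turn("Summarize ticket 4521.", **SUPPORT_CONFIG)
turn_four = build_turn(
    "Now check the order status.",
    previous_interaction_id="int_003",
    **SUPPORT_CONFIG,
)
```

Because the required arguments are keyword-only with no defaults, forgetting the tool list on turn four fails at call time in development rather than silently at 2 a.m.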
Retention has its own boundary too. By default, store=true, and paid tiers keep interactions for 55 days; the free tier keeps them for one day. Setting store=false is incompatible with background execution, and it breaks previous_interaction_id chaining for later turns. So the moment you need either of those two things, you've also opted into a 55-day (or one-day) retention window for whatever went into that interaction, unless you delete it yourself by ID.
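Those rules are small enough to encode as a pre-flight check. This is a sketch of the constraints as described here, not behavior taken from the SDK: the function, the `StoragePlan` type, and the tier labels are hypothetical, and the retention figures are the ones quoted above.

```python
from dataclasses import dataclass

# Retention windows as stated above: 55 days paid, 1 day free.
RETENTION_DAYS = {"paid": 55, "free": 1}


@dataclass(frozen=True)
class StoragePlan:
    retention_days: int         # 0 means nothing is kept server-side
    can_be_chained_from: bool   # usable as previous_interaction_id later


def plan_storage(store: bool, background: bool, tier: str) -> StoragePlan:
    """Return what a store/background combination implies, or reject it."""
    if tier not in RETENTION_DAYS:
        raise ValueError(f"unknown tier: {tier}")
    if not store and background:
        raise ValueError("store=false cannot be combined with background execution")
    if not store:
        # Nothing retained, so later turns cannot chain from this one.
        return StoragePlan(retention_days=0, can_be_chained_from=False)
    return StoragePlan(retention_days=RETENTION_DAYS[tier], can_be_chained_from=True)


background_job = plan_storage(store=True, background=True, tier="paid")
one_shot = plan_storage(store=False, background=False, tier="free")
```

Running a check like this where the request is built makes the retention trade-off visible at the moment someone turns on background execution, rather than during a later data review.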
What actually moves now, and what stays put
This is not a "rewrite everything by Friday" moment, and Google isn't asking for one. generateContent remains fully supported and will keep receiving new mainline Gemini models for the foreseeable future. The honest migration test is simpler than a full audit: does this feature need state, tools, background time, or a retrievable record?
Stay on generateContent if the feature is a single round trip with no memory requirement: classification, extraction, a one-off rewrite, a summary that doesn't need to reference what happened last time. Nothing about GA makes that architecture worse. It's still the cheaper, simpler path for genuinely simple work, and Google's own docs frame it that way rather than as a deprecated dead end.
Move to the Interactions API when a feature already has, or is about to grow, any of these: a multi-turn thread that needs to resume; a mix of built-in tools (Google says Search and Maps can now sit alongside custom functions in the same request, with tool results able to return images as well as text); a task long enough that a human shouldn't have to wait on an open connection; or a requirement that someone can inspect what the model did, step by step, after the fact, not just what it said. The support-ticket feature from the opening scene checks every one of those boxes now. It just hasn't been told yet.
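The migration test is simple enough to write down as a checklist in code. The sketch below is one way to do it, with hypothetical names throughout; the four flags are the four needs from the test above (state, tools, background time, a retrievable record).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    name: str
    needs_state: bool = False               # multi-turn thread that must resume
    uses_tools: bool = False                # built-in tools or custom functions
    needs_background_time: bool = False     # too long to hold a connection open
    needs_retrievable_record: bool = False  # someone inspects the steps later


def migration_reasons(feature: Feature) -> List[str]:
    """List which of the four needs apply; an empty list means stay put."""
    checks = {
        "state": feature.needs_state,
        "tools": feature.uses_tools,
        "background time": feature.needs_background_time,
        "retrievable record": feature.needs_retrievable_record,
    }
    return [reason for reason, applies in checks.items() if applies]


def should_move(feature: Feature) -> bool:
    return bool(migration_reasons(feature))


classifier = Feature(name="ticket-classifier")
support_ticket = Feature(
    name="support-ticket-reply",
    needs_state=True,
    uses_tools=True,
    needs_background_time=True,
    needs_retrievable_record=True,
)
```

Returning the reasons instead of a bare yes or no keeps the conversation honest: a feature that moves for one reason needs less operational planning than one that moves for all four.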
If you're touching the Gemini 3 family specifically, know the current gaps before you plan a sprint around it. The docs note that remote MCP isn't supported there yet, and neither are video_metadata, the Batch API, automatic function calling for Python, or explicit caching, all still pending on the Interactions API side as of the last documentation update in late June. None of that blocks the migration decision above. It just means "move everything" isn't a same-week option even for a team that wants to.
Migration mechanics, if you get to that stage, are in Google's migration guide and the May 2026 breaking-changes notes covering the "From Roles to Steps" schema change, along with SDK support in google-genai 2.3.0+ for Python and @google/genai 2.3.0+ for JavaScript. Developer chatter on this so far reads as early and practical — people migrating a provider integration, filing an issue about transport support — not a groundswell. Treat it as engineers doing the work quietly, not as a signal that everyone has already moved.
The lifecycle map
Before a feature moves to background execution, tool mixing, or Managed Agents, it helps to answer five questions about it. None of them require new tooling. They're just easy to skip when a feature grew one step at a time instead of being designed all at once.
Start: what is actually doing the work? Name the model ID or agent ID, not just "Gemini," so the feature has a real execution target.
State: can this be resumed, inspected, or continued later? If the answer depends on an interaction ID, previous_interaction_id, or retrievable step history, the feature is no longer just a prompt call.
Work: is this inference, or delegated work with a clock running? background=true, tools, a sandbox, or custom functions all move the feature toward a job that needs operational ownership.
Control: who can stop it, and what is it allowed to touch this turn? Permissions, data sources, tool lists, retries, cancellation, system_instruction, and generation settings need to be named again where they apply. The field most teams skip is Control. That's the one that turns into an incident.
Exit: where does the record live, for how long, and who can pull it? Outputs, citations, generated artifacts, retention windows, and deletion paths belong in the same conversation as the model choice.
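The five questions above fit in a record small enough to live next to the feature's code. This is a minimal sketch with hypothetical field contents; the only logic is reporting which questions still have no answer.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class LifecycleMap:
    start: Optional[str] = None    # model ID or agent ID doing the work
    state: Optional[str] = None    # how the feature is resumed or inspected
    work: Optional[str] = None     # inference, or delegated work with a clock
    control: Optional[str] = None  # who can stop it, what it may touch
    exit: Optional[str] = None     # where the record lives, and for how long

    def unanswered(self) -> List[str]:
        """Names of the questions that are missing or blank."""
        return [
            f.name for f in fields(self)
            if not (getattr(self, f.name) or "").strip()
        ]


# Illustrative answers for the support-ticket feature; Control is left out.
support_ticket_map = LifecycleMap(
    start="model: your-model-id",
    state="chained with previous_interaction_id, steps retrievable by ID",
    work="background=true, two tools, generates a PDF",
    exit="stored 55 days on the paid tier, deleted by ID on ticket close",
)
missing = support_ticket_map.unanswered()
```

A review step that refuses to ship while `unanswered()` is non-empty is a blunt gate, but it puts the skipped question in front of someone before an incident does.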
If a Gemini feature can't get a clean answer to all five questions, that's not a reason to avoid the Interactions API. It's a reason to answer them before flipping background=true on. We've written before about comparing Bedrock models: model choice is one layer of this decision. It stopped being the only layer the day the feature picked up its second tool call.
This is also a different problem from the one we covered when multiple agents start calling each other — that piece is about the graph between agents. This map is about the lifecycle of a single interaction, before it ever needs to talk to anything else.
The practical next step
Nothing here requires urgency. generateContent isn't going away, and Google has said as much directly. What's worth doing this week is smaller: pick the one Gemini feature in your product that has quietly picked up a tool call, a background task, or a file it generates, and run it through the five questions above. Most teams find the gap in the same place — Control, the question nobody answered because the feature grew a step at a time and nobody stopped to ask who could cancel it.
If your AI feature has tools, files, background time, or a point where someone needs to approve what happened, map it as an interaction before you pick the model. BaristaLabs can help turn one Gemini-powered workflow into a lifecycle map with clear owners, controls, and a retention answer, before it becomes infrastructure nobody remembers building.
