The automation cassette is the missing artifact for web agents

By two in the afternoon, the coordinator has done the same thing ten times. Open the event portal. Tab into the registration form. Copy a name from row 11 of a 500-row spreadsheet, paste it, copy an email, paste it, pick a date from a dropdown that loads half a second late, submit, wait for the green confirmation, go back, do row 12. It is not hard work. That is the problem. It is the kind of work a person does worse the longer they do it, and it eats an afternoon that was supposed to go to something else.

This is the moment everyone wants to hand to an AI agent. And it is also the moment where most attempts quietly fall apart, because the team gives the agent the wrong thing to start from.

A small project that points at the right idea

On June 18, 2026, a tool called AutomatiQ showed up on Hacker News with a plain description: a tool that watches you browse, then writes HTTP-based automation scripts. It is early. The README calls it alpha, and the thread is quiet, so treat it as a signal rather than a verdict. But the design is worth borrowing even if you never run the tool.

Here is what it actually does. You record one pass through a workflow in your own browser. AutomatiQ captures the session through the Chrome DevTools Protocol, which is the same low-level interface a browser's own dev tools use to see network traffic. It compiles per-action video clips of what you clicked and typed, alongside the real request and response data sitting behind each of those actions. Then it hands that bundle to an LLM working in an isolated Python environment, and the agent reverse-engineers the session into a standalone script. Its VISION.md opens on exactly the scene above: a teacher copy-pasting from a 500-row sheet into online forms.

The demo is the flashy part. Watch me browse, get a script. The useful part is quieter, and it is the reason I want to talk about this at all: the recording is the artifact. Not the script. The recording.

I am going to give that recording a name, because it behaves like one specific object once you start treating it that way. Call it an automation cassette.

Why a prompt is too thin and raw logs are too blind

Most people try to automate a web task one of two ways, and both have a hole in the middle.

The first way is to describe it. You write an agent a paragraph: "Go to the event portal, log in, register each person from this spreadsheet." The agent now has to guess everything the paragraph left out, which is most of it. Which field is the email. Whether the date picker needs a click or accepts typed text. What "registered" actually looks like on screen. A prompt is a memory of the task, and human memory of a routine task is famously bad. You forget the small step you do without thinking, and that is usually the step that breaks.

The second way is to capture the network traffic and hand the agent the logs. This is closer, because underneath every click the website is just sending HTTP requests, and those requests are the thing you actually want to reproduce. AutomatiQ's own argument is that UI automation is brittle: buttons move, pages load slowly, a layout quirk breaks a script that worked yesterday. Go for the requests instead and you skip the flaky surface.

But raw requests alone are blind in a different way. A recorded session might fire forty requests on one page. The agent can see all forty and have no idea which one mattered, or why it happened. Was that POST the registration, or a background analytics ping? Did that token come from the login step or a cookie set three pages ago? The network log knows what happened. It does not know what you meant.

That gap is the whole point. AutomatiQ closes it by recording action-correlated clips around your clicks and keystrokes, so the agent gets intent next to evidence: this is the request that fired when the human pressed submit. That pairing of intent and evidence is what makes the recording worth keeping. And once you are keeping it deliberately, you may as well capture the few things a tool will not capture for you.
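One way to picture that pairing is a time-window join between what the human did and what the browser sent. The sketch below is mine, not AutomatiQ's: the `Action` and `Request` records, the `correlate` function, and the 1.5-second window are all illustrative assumptions, and a real recorder may correlate in a different way entirely.

```python
from dataclasses import dataclass


@dataclass
class Action:
    label: str  # what the human did, e.g. "click submit" (assumed unique)
    t: float    # seconds since the recording started


@dataclass
class Request:
    method: str
    url: str
    t: float    # seconds since the recording started


def correlate(actions, requests, window=1.5):
    """Attach each request to the latest action that preceded it by at most
    `window` seconds. Requests with no such action are returned separately."""
    paired = {a.label: [] for a in actions}
    unattributed = []
    for r in sorted(requests, key=lambda r: r.t):
        candidates = [a for a in actions if 0 <= r.t - a.t <= window]
        if candidates:
            nearest = max(candidates, key=lambda a: a.t)
            paired[nearest.label].append(r)
        else:
            unattributed.append(r)
    return paired, unattributed


# Hypothetical recording: one background ping, then a submit click.
actions = [Action("type email", 2.0), Action("click submit", 5.0)]
requests = [
    Request("GET", "/analytics/ping", 0.4),
    Request("POST", "/api/register", 5.2),
    Request("GET", "/confirmation", 5.9),
]
paired, background = correlate(actions, requests)
```

Here the POST lands under "click submit" and the analytics ping falls into the unattributed pile, which is the distinction a bare network log cannot make on its own.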

The cassette fields

A cassette is not the script. It is the recorded pass your team can replay, inspect, compile, and eventually retire. The tool records some of these fields for you. The rest are judgment, and judgment is the part you do not want an agent guessing at. Eight fields:

Field	What it holds
Authorized task	The specific workflow you have permission to automate, on a system you own or are allowed to use.
Human path	The exact sequence a person took: pages, clicks, fields, the order it happened in.
Action clip	The short recording around each click and keystroke, so intent sits next to the traffic.
Request and response evidence	The real HTTP calls behind each action: headers, payload, response, and the data that came back.
Data boundary	What data the workflow touches, how sensitive it is, and where it is allowed to go.
Success marker	The observable signal that the task worked. The green confirmation, the new record, the returned ID.
Fallback route	When the request-only path is not safe to assume, and a real browser step is required instead.
Repair rule	Who owns the fix, and what tells you the cassette is stale and needs re-recording.

Figure: The cassette holds the human path, the action clips, and the network traces in one inspectable object.

Notice that only the middle four fields come off the recorder. The first one and the last three are yours. A tool can show you the traffic behind a click. It cannot tell you that you are allowed to make that click ten thousand times, or who gets paged when the site changes its login flow next quarter.
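If it helps to see the eight fields as one object, here is a minimal sketch. The `Cassette` class, its attribute names, and the `unfilled` helper are hypothetical names of my own and do not reflect any on-disk format AutomatiQ uses. The only thing it does is refuse to let an empty field hide.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Cassette:
    # Yours to fill in.
    authorized_task: str = ""
    # The fields the text above says come off the recorder.
    human_path: list = field(default_factory=list)
    action_clips: list = field(default_factory=list)
    request_evidence: list = field(default_factory=list)
    data_boundary: str = ""
    # Yours again.
    success_marker: str = ""
    fallback_route: str = ""
    repair_rule: str = ""


def unfilled(cassette):
    """Names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(cassette) if not getattr(cassette, f.name)]


draft = Cassette(
    authorized_task="Register our own attendees on our event portal",
    human_path=["open portal", "fill form", "submit"],
)
```

Calling `unfilled(draft)` lists everything still missing, so a half-finished cassette cannot be handed to an agent by accident.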

What the agent is actually good at

Once the cassette exists, an agent is genuinely good at three jobs, and they are the jobs you would least want to do by hand.

Compiling is the obvious one. Turning a folder of clips and requests into a working script is tedious pattern work: figure out which call is the real one, thread the auth token through, parameterize the parts that change per row. AutomatiQ rebuilt its agent around IPython for this, on the reasoning that LLMs handle a Jupyter-style environment better than an unfamiliar shell, and it paginates output so the agent does not drown in its own logs. The detail matters less than the shape: a structured workspace the agent can poke at, test a hypothesis in, and correct itself against.
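To make "parameterize the parts that change per row" concrete, here is a rough sketch of the shape a compiled step might take. Everything in it is assumed for illustration: the `recorded` dictionary layout, the `portal.example` URL, the bearer-token header, and the `build_request` helper. It builds a request description and sends nothing.

```python
import json

# Hypothetical captured call from one recorded pass.
recorded = {
    "method": "POST",
    "url": "https://portal.example/api/register",
    "headers": {
        "Authorization": "Bearer RECORDED_TOKEN",
        "Content-Type": "application/json",
    },
    "body": {"name": "Ada Lovelace", "email": "ada@example.org", "date": "2026-07-01"},
}


def build_request(recorded, row, token):
    """Reuse the recorded call's shape, but take every body value from `row`
    and the auth token from the current session. A row that lacks a field
    raises KeyError rather than silently reusing the recorded person's data."""
    body = {key: row[key] for key in recorded["body"]}
    headers = dict(recorded["headers"], Authorization=f"Bearer {token}")
    return {
        "method": recorded["method"],
        "url": recorded["url"],
        "headers": headers,
        "body": json.dumps(body),
    }
```

The design choice worth noting is the hard failure on a missing field: the recorded values are evidence of the shape, never a default to fall back on.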

Testing hypotheses is the second. The agent can try a request, read what comes back, and adjust, instead of you eyeballing a network capture at 4pm. It can notice that the registration call needs a header you did not think to mention.
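One hypothesis an agent might test is "which headers does this call actually need?" A minimal sketch of that loop follows. The `required_headers` function and the `fake_send` stand-in are hypothetical. In practice `send` would be a real, rate-limited call against a system you are authorized to test.

```python
def required_headers(headers, send):
    """Drop one header at a time and keep the names whose removal makes
    `send` report failure. `send` takes a header dict and returns a bool."""
    needed = []
    for name in headers:
        trimmed = {k: v for k, v in headers.items() if k != name}
        if not send(trimmed):
            needed.append(name)
    return needed


def fake_send(headers):
    # Stand-in server: accepts the call only with a session cookie and a CSRF token.
    return "Cookie" in headers and "X-CSRF-Token" in headers


captured = {"Cookie": "session=abc", "X-CSRF-Token": "t0k", "Accept-Language": "en"}
needed = required_headers(captured, fake_send)
```

With this stand-in, `needed` comes back as the cookie and the CSRF token, and the language header is shown to be optional.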

Maintenance is the third, and it is the one teams underrate. When the site changes and the script breaks, you do not start over from a blank prompt. You replay the cassette, re-record the pass, and hand the agent a fresh bundle. That is exactly the kind of repeatable workflow that process automation is supposed to keep alive instead of rebuilding from scratch each quarter. AutomatiQ also notes that the recorded folder has no vendor lock-in: any other LLM or plain script can open it, so the cassette outlives whatever tool made it.
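A staleness check can be as plain as comparing the shape of the old requests with the shape of a fresh recording. This is a sketch under my own assumptions: the request dictionaries, the `shape` and `drift` helpers, and the idea that method, path, and body keys are enough of a fingerprint are all illustrative.

```python
from urllib.parse import urlsplit


def shape(request):
    """Fingerprint a request by method, URL path, and sorted body field names."""
    body = request.get("body") or {}
    return (request["method"], urlsplit(request["url"]).path, tuple(sorted(body)))


def drift(cassette_requests, fresh_requests):
    """Shapes that vanished from, or appeared in, the fresh recording."""
    old = {shape(r) for r in cassette_requests}
    new = {shape(r) for r in fresh_requests}
    return {"missing": sorted(old - new), "added": sorted(new - old)}
```

An empty result on both sides suggests the cassette still matches the site. Anything else is a signal to re-record, and a concrete diff to hand to whoever owns the repair.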

Where a human still owns the boundary

Here is where I want to be careful, because the same design that makes this useful makes it easy to point at the wrong target.

A cassette assumes an authorized task on a system you own or have permission to use. The word "authorized" decides everything. The fact that a recorder can see the real traffic behind a click does not mean every site is a fair target. Automation creates load, and load carries abuse risk: the OWASP guidance on denial of service is a reminder that hammering a service, even by accident, is a way to break it. Scope the work, rate-limit it, and stay inside the vendor's terms. If you do not own the system and you do not have permission, the cassette ends before it starts.
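Rate limiting does not need to be clever. A minimal sketch, assuming a fixed minimum gap between calls is acceptable to the system's owner: the `Throttle` class is hypothetical, and the right interval is whatever the vendor's terms allow, not a number I can give you.

```python
import time


class Throttle:
    """Enforce a minimum number of seconds between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None     # time of the previous call, None before the first

    def wait(self):
        """Block until at least `min_interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each request in the per-row loop keeps a 500-row run from arriving as a burst.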

The data boundary is also yours, not the agent's. The cassette records what data flows through the workflow. A human has to decide whether that data is allowed to flow into a script, a log, or a third-party model in the first place. This is the same discipline behind a set of AI workflow controls: decide the limits before the automation touches a customer or partner system, not after.
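Once a human has decided which fields may not leave the building, enforcing it is mechanical. The sketch below assumes the decision is expressed as a set of key names. The `SENSITIVE` set and the `redact` helper are placeholders of mine, and the real list has to come from whoever owns the data.

```python
# Hypothetical boundary: key names a human has ruled out of logs and model prompts.
SENSITIVE = {"email", "phone", "password", "authorization", "cookie"}


def redact(value, sensitive=SENSITIVE):
    """Return a copy of nested dicts and lists with sensitive keys masked.
    Key matching is case-insensitive. Other values pass through unchanged."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in sensitive else redact(val, sensitive)
            for key, val in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    return value
```

Matching on key names is a blunt instrument: it will miss an email address sitting inside a free-text field, so treat it as a floor rather than a guarantee.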

The fallback route is a judgment call the agent will happily skip. AutomatiQ's own roadmap keeps a real browser around for the steps that need one, and that is the honest version. Some steps are safe to replay as a bare request. Some are not, because they depend on session state, a human-in-the-loop check, or a step the site expects a real browser to perform. Marking which is which is not the agent's job. It is the part you keep.
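One way to keep that decision from being skipped is to make "undecided" a state the plan cannot run in. This is a sketch with hypothetical names (`Route`, `Step`, `plan`), not a description of how AutomatiQ marks steps.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    REQUEST = "request"  # judged safe to replay as a bare HTTP call
    BROWSER = "browser"  # needs a real browser step


@dataclass(frozen=True)
class Step:
    name: str
    route: Optional[Route] = None  # None means no human has decided yet


def plan(steps):
    """Return (name, route) pairs, refusing to proceed if any step is undecided."""
    undecided = [s.name for s in steps if s.route is None]
    if undecided:
        raise ValueError(f"no fallback decision for: {', '.join(undecided)}")
    return [(s.name, s.route.value) for s in steps]
```

The default of `None` is the point: an agent can propose a route, but a step only becomes runnable once someone has written one down.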

This is also why the cassette pairs well with two habits we have written about before. Pick the right thing to record in the first place, because some workflows should not be automated at all. And watch the real work before you wire anything up, the way you would run a shadow week, so the pass you record is the pass that actually happens and not the tidy version you imagine. Skip those and you get the thing nobody wants: an overbuilt automation that gets ripped out a quarter later, with an invoice attached.

Try it on one task

The next time someone on your team is on the tenth pass of the same web form, do not hand an agent a paragraph and hope. Record one clean, authorized pass and fill in the eight fields. You will probably find that two of them, the data boundary and the fallback route, are the ones nobody had thought about, and they are the ones that decide whether this is a fifteen-minute win or a mess you maintain forever.

Bring one repetitive authorized web workflow and map its cassette with us before you automate it. The recording is the part worth getting right. The script is the easy part.
