Browser Agents, Agent Layer, NuClide Stack

What it is

Browser agents pair an LLM with a headless browser (usually Chromium, driven through an automation library such as Playwright or Puppeteer) so the model can see a webpage, reason about it, and act on it. It’s the natural answer to a real problem: most of the world’s data lives behind JavaScript, and most of the world’s tools live behind UIs that have no API. Frameworks like browser-use, Stagehand, and the Anthropic Computer-Use harness all share this shape: a screenshot, a model decision, an action.
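The screenshot, decision, action shape can be sketched in a few lines. This is an illustrative loop only, not the API of browser-use, Stagehand, or the Computer-Use harness: the `Action` type, the `Browser` protocol, the `decide` callback, and the "done" action kind are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


@dataclass
class Action:
    kind: str          # illustrative vocabulary, e.g. "click", "type", "done"
    target: str = ""   # hypothetical selector or text payload


class Browser(Protocol):
    """Hypothetical minimal browser surface: observe and act."""

    def screenshot(self) -> bytes: ...
    def perform(self, action: Action) -> None: ...


def run_agent(
    browser: Browser,
    decide: Callable[[bytes, str], Action],
    goal: str,
    max_steps: int = 10,
) -> List[Action]:
    """Loop: take a screenshot, ask the model for an action, execute it."""
    taken: List[Action] = []
    for _ in range(max_steps):
        frame = browser.screenshot()      # 1. a screenshot
        action = decide(frame, goal)      # 2. a model decision
        if action.kind == "done":         # the model signals completion
            break
        browser.perform(action)           # 3. an action
        taken.append(action)
    return taken
```

The `max_steps` cap is a design choice in this sketch so that a confused model cannot drive the browser indefinitely.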

What goes wrong

The agent process is an entire web browser running with the operator’s full session context: cookies, saved logins, residential IP, sometimes payment methods. When the agent’s control plane (the API that accepts “go do X”) is exposed to the public internet without authentication, anyone can drive that browser under the operator’s identity. We’ve also found stale instances where the browser kept a session alive for hours after the agent’s last task; an attacker who finds the open port inherits whatever the human last logged into.
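Both failure modes map to small checks an operator could place in front of the control plane. The following is a minimal sketch under stated assumptions, not any framework's built-in mechanism: it assumes a bearer-token scheme, a headers mapping with an exact-case `Authorization` key, and a 15-minute idle budget, and the function names are hypothetical.

```python
import hmac
import time
from typing import Mapping, Optional

MAX_IDLE_SECONDS = 15 * 60  # assumed idle budget; tune per deployment


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Return True only if the request carries the expected bearer token."""
    auth = headers.get("Authorization", "")
    scheme, _, presented = auth.partition(" ")
    if scheme.lower() != "bearer" or not presented or not expected_token:
        return False
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


def session_is_stale(
    last_task_finished_at: float,
    now: Optional[float] = None,
    max_idle: float = MAX_IDLE_SECONDS,
) -> bool:
    """Return True once the browser has sat idle longer than max_idle seconds."""
    current = time.time() if now is None else now
    return (current - last_task_finished_at) > max_idle
```

A control plane would reject any request for which `is_authorized` is false, and tear down the browser profile once `session_is_stale` returns true, so an idle instance holds no logged-in session to inherit.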

How we test

We probe for known framework footprints (the WebSocket port browser-use opens, the screenshot endpoint of the Computer-Use sample server, the Stagehand remote-control API) and confirm reachability with a benign read of agent state: current URL, session cookie count, recent action history. From there we map the agent’s identity by inspecting which site it last interacted with, which is sufficient to identify the operator without ever issuing an action.
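A benign read of this kind can be reduced to a summary that never retains secrets. The sketch below assumes a hypothetical JSON state payload with `current_url`, `cookies`, and `history` fields; real frameworks expose state in their own formats, so these field names are illustrative rather than any product's schema.

```python
import json
from urllib.parse import urlsplit


def summarize_agent_state(raw_json: str) -> dict:
    """Reduce a (hypothetical) agent-state payload to non-sensitive facts."""
    state = json.loads(raw_json)
    url = state.get("current_url") or ""
    cookies = state.get("cookies") or []
    history = state.get("history") or []
    return {
        # Keep only the hostname: paths and query strings may carry tokens.
        "last_site": urlsplit(url).hostname,
        # Count cookies; never copy their names or values.
        "cookie_count": len(cookies),
        # Record the kind of the last few actions, not their arguments.
        "recent_actions": [step.get("action") for step in history[-3:]],
    }
```

Keeping only the hostname, a count, and action kinds is enough to attribute an exposed instance while ensuring the record itself contains nothing an attacker could replay.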

Receipts

Research

Every survey, case study, and disclosure we've published that touches this layer of the stack. Counts on the cells above tally these directly.

Cross-cloud surveys (1)

Survey May 1, 2026

Browser Automation / Agent Backends: Cross-Cloud Survey (2026-05)

Browser-automation backends (Browserless, Playwright server, Puppeteer remote, Selenium Grid, Skyvern) underpin AI agent stacks: the agent navigates websites, scrapes content, fills forms, and harvest…
