What it is
Browser agents pair an LLM with a headless browser (usually Chromium, driven through an automation library such as Playwright or Puppeteer) so the model can see a webpage, reason about it, and click. It’s the natural answer to a real problem: most of the world’s data lives behind JavaScript, and most of the world’s tools live behind UIs that have no API. Frameworks like browser-use, Stagehand, and the Anthropic Computer-Use harness all share this shape: a screenshot, a model decision, an action.
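The screenshot, decision, action shape can be sketched in a few lines. This is a minimal illustration, not the loop of any framework named above: `Action`, `run_agent`, and the three callables are hypothetical names, and the browser and model are passed in as plain functions so the control flow stays visible.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str         # hypothetical vocabulary, e.g. "click", "type", "done"
    target: str = ""  # selector or coordinates; the format is up to the harness


def run_agent(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, decide, act, and repeat until the model says "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        shot = take_screenshot()        # observe: capture the current page
        action = decide(shot, history)  # decide: the model picks the next step
        history.append(action)
        if action.kind == "done":
            break
        perform(action)                 # act: drive the browser
    return history
```

Everything `perform` does happens inside the browser’s existing session, which is why the question of who is allowed to start this loop matters so much.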
What goes wrong
The agent process is an entire web browser running with the operator’s full session context: cookies, saved logins, residential IP, sometimes payment methods. When the agent’s control plane (the API that accepts “go do X”) is exposed to the public internet without auth, anyone can drive that browser under the operator’s identity. We’ve also found stale instances where the browser kept a logged-in session alive for hours after the agent’s last task; an attacker who finds the open port inherits whatever the human last logged into.
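Both failure modes have simple guards on the operator’s side. The sketch below is an assumption about how a control plane could enforce them, not a description of what any of the frameworks above ships: `is_authorized`, `session_is_stale`, and the five-minute default are hypothetical, and the header lookup assumes the server hands over headers under their canonical names.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Accept a request only if it carries the expected bearer token."""
    if not expected_token:
        # An unset token must never mean "open to everyone".
        return False
    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


def session_is_stale(last_task_at: float, now: float,
                     max_idle_seconds: float = 300.0) -> bool:
    """True once the browser has sat idle longer than the allowed window."""
    return (now - last_task_at) > max_idle_seconds
```

A control plane would call `is_authorized` before accepting any task, and a supervisor would close the browser context once `session_is_stale` returns true, so an idle instance no longer holds a logged-in session.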
How we test
We probe for known framework footprints (the WebSocket port browser-use opens, the screenshot endpoint of the Computer-Use sample server, the Stagehand remote-control API) and confirm reachability with a benign read of agent state: current URL, session cookie count, recent action history. From there we map the agent’s identity by inspecting which site it last interacted with, which is sufficient to identify the operator without ever issuing an action.
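The benign read can be reduced to a summary that records counts and a hostname rather than the contents of the session. This is a sketch under stated assumptions: the state is taken to be a JSON object that has already been fetched, and the field names `current_url`, `cookies`, and `history` are hypothetical, since each framework exposes its state differently.

```python
from typing import Any, Dict
from urllib.parse import urlsplit


def summarize_state(state: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce agent state to the minimum needed to confirm exposure."""
    url = state.get("current_url") or ""
    return {
        # Keep only the hostname; paths and query strings can carry secrets.
        "host": urlsplit(url).hostname,
        # Count cookies and actions without retaining their values.
        "cookie_count": len(state.get("cookies") or []),
        "action_count": len(state.get("history") or []),
    }
```

Keeping only the hostname and the two counts is a deliberate design choice: it is enough to show that the state is readable and to point at the operator, while cookie values and page contents never enter the report.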