Corpus index built 2026-06-07 23:58 UTC

§ THE STACK / AGENT LAYER

Code Agents

Aider, OpenHands, Continue, SWE-agent

How LLMs reach out and take action: call APIs, browse the web, drive workflows.

What it is

Code agents pair an LLM with a development environment. The model reads the codebase, edits files, runs tests, and submits PRs. Aider (Paul Gauthier) is the terminal-native pair-programming agent. OpenHands (formerly OpenDevin) is the all-in-one autonomous-developer platform. Continue.dev is the IDE plugin that runs the model inside VS Code or JetBrains. Cline and Roo Code sit in the same IDE-extension niche. SWE-agent (Princeton) is the research-grade benchmark agent. Together they are how the “AI writes the code” workflow actually ships.

What goes wrong

A code agent runs as the operator inside a development environment, with the operator’s git credentials, SSH keys, cloud SDK config, and shell history. When the agent’s web UI or REST control plane is exposed without auth, an attacker drives the same shell. They can read the codebase (including secrets in any .env files the agent can read), commit and push to the operator’s repos, deploy via the operator’s CI hooks, and pivot via any SSH credential the agent’s environment carries. Most installs assume “this is on my laptop”, and that assumption is rarely revisited when the operator moves the agent to a remote workstation.
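The “this is on my laptop” assumption can be written down as a check. The following is a minimal sketch, not any agent’s actual configuration logic: the function names and the three risk labels are hypothetical, and it assumes you already know the address the control plane binds to and whether auth is enabled.

```python
import ipaddress


def is_loopback_only(bind_host: str) -> bool:
    """Return True if bind_host only accepts connections from the local machine."""
    if bind_host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(bind_host)
    except ValueError:
        # Not an IP literal (e.g. a DNS name): treat as exposed rather than guess.
        return False
    return addr.is_loopback


def exposure_risk(bind_host: str, auth_enabled: bool) -> str:
    """Classify a control-plane listener using hypothetical risk labels."""
    if is_loopback_only(bind_host):
        return "local-only"
    return "exposed-with-auth" if auth_enabled else "exposed-unauthenticated"
```

A listener bound to 127.0.0.1 or ::1 comes back as local-only. The same agent started with a 0.0.0.0 bind on a remote workstation and no auth falls into the last category, which is the case described above.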

How we test

We probe the agent’s web UI for the framework signature (OpenHands, the Continue server, and Aider’s --browser mode all have distinct asset bundles) and read the workspace path from the status endpoint. The workspace path is enough to characterise the operator (corporate hostname, repo name, often the user’s home directory). We never invoke the agent. Workspace metadata is sufficient attribution evidence.
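The two passive steps above can be sketched as pure functions over data that has already been fetched. This is an illustration under stated assumptions, not the probe itself: the marker strings in SIGNATURES are hypothetical placeholders (real asset-bundle signatures would come from observed responses), and the function names and returned fields are invented for the example.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical marker strings, for illustration only.
SIGNATURES = {
    "openhands": ("openhands",),
    "continue": ("continue.dev",),
    "aider": ("aider",),
}


def fingerprint(html: str) -> Optional[str]:
    """Return the first framework whose marker appears in the page body."""
    body = html.lower()
    for framework, markers in SIGNATURES.items():
        if any(marker in body for marker in markers):
            return framework
    return None


def characterise_workspace(path: str) -> dict:
    """Extract a likely user name and repo name from a POSIX workspace path."""
    parts = PurePosixPath(path).parts
    user = None
    # /home/<user>/... (Linux) or /Users/<user>/... (macOS)
    if len(parts) >= 3 and parts[1] in ("home", "Users"):
        user = parts[2]
    # Take the last component as the repo name, unless it is the home directory itself.
    min_parts = 3 if user else 1
    repo = parts[-1] if len(parts) > min_parts else None
    return {"user": user, "repo": repo}
```

For example, characterise_workspace("/home/alice/src/billing-api") yields the user alice and the repo billing-api without sending anything to the agent.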