AI Agents Need Their Own Infrastructure

March 10, 2026 · Allan Karanja

While working on OpenHands, an open source agent framework, I came across this issue: Agent created a duplicate PR because it had too many permissions. The agent bypassed using the tools it had been provided and directly created the PR. This by itself isn’t a big issue, just a duplicate PR. But the architecture that allows agents to take arbitrary actions is a huge security risk.

Usually opening PRs is handled by an agent tool call to the resolver. The resolver is a deterministic module that interacts with github(and other services as well). It opens PRs, posts comments etc. The resolver is basically two files:

    resolve_issue.py - spins up the agent, gives it the issue, lets it write code
    send_pull_request.py - creates the branch, opens the PR, posts comments

Usually the resolver passes in its token into the agent’s environment. The agent tried calling the create pr function, it failed, so it wrote Python code to call the GitHub API directly using the token in its environment. The issue is that the resolver’s token is broadly scoped, meaning full access to enable it to create PRs. The agent noticed this and wrote the code. With a broadly scoped token, the agent could do more than open a duplicate PR. It could push directly to main, delete branches, or modify workflow files. An attacker who compromises the agent’s environment inherits all of those capabilities. The smarter models get, the more creatively they’ll exploit whatever access they have.

Software engineers currently work on the principle of least privilege, users permissions are scoped tightly to only what they need. Junior devs don’t get write access to prod. The solution is to have two tokens. The agent gets Contents: write + Metadata: read only, resolver keeps broad permissions.

Although this covers only a single piece of the puzzle, as agents become more autonomous, we need infrastructure-level guardrails (tokens, network policies, scoped credentials) not just prompt-level instructions (“please don’t create PRs”). Prompts which models sometimes treat as suggestions.