The network is the open side of the sandbox

Sandboxing the filesystem and processes is the easy part. The network is the surface most stacks leave open, and the one that matters most as agents start touching real systems. Here is how we are closing it in Condukt, and the missing piece we are watching for.

We have been thinking about a particular hole in the way we run agents, and we keep arriving at the same place. We sandbox the filesystem. We sandbox process execution. We feel reasonably safe because the agent cannot reach into the host machine, cannot read arbitrary files, cannot fork off and start mining crypto on our laptop. When the agent runs ls, we are comfortable. When the agent runs rm, we have ways to be comfortable too. And then the agent runs curl. Or npm install. Or it issues an HTTP request from some library buried six layers deep in a tool call, and in that moment the sandbox quietly stops being a sandbox. The thing we built to feel safe was built around a model of "the host", and the network is exactly the place where "the host" stops being a useful boundary.

A coding agent in 2026 is, in practice, an LLM driving an arbitrary set of tools. Most of those tools are mundane: read a file, write a file, run a build, open a pull request. Useful, contained, easy to reason about. A few of them are not. They reach out to the network. And once they do, the assumption that the agent is "in a box" stops holding, because the blast radius of a single misjudged tool call is no longer bounded by the pod's filesystem. It extends out to whichever endpoint accepts a request, whichever registry resolves a package, whichever webhook accepts a POST.

Consider what an agent's outbound network access can actually do, even with the best filesystem isolation in the world.

It can read a value from an environment variable and POST it to an attacker-controlled host, with the encrypted side of the wire unreadable to anyone watching from the outside, which is exfiltration through a side channel that the host has no good way to spot.

It can resolve npm install some-package to a registry that is not the one you thought you were using, pull a tampered tarball, and execute the install hook before anyone notices. That is the supply-chain attack vector that has been quietly compromising CI pipelines for years, and it becomes more interesting when the thing pulling the dependency is an agent that decided it needed a library on the fly.

It can follow an MCP server reference to a host that looks fine on paper and have that host smuggle instructions back into the agent's context, which is prompt injection through the very transport the agent depends on to do useful work.

It can fetch a remote script and pipe it into a shell, because something somewhere in the prompt told it to.

None of these are exotic. All of them are within the routine repertoire of any tool that touches the network. And none of them are addressed by a sandbox that defines itself in terms of files and processes.

What the providers give you

You can lean on whoever runs your sandbox to handle this for you, and some of them do, to a point. There are agent providers that ship a network policy you can configure as a list of allowed domains, and that is a real answer for a real subset of the problem. It is also, by construction, a static answer: you declare what is safe at deploy time, and anything outside that list either fails closed or quietly slips through, depending on how the provider implemented it and how recently they thought about this layer. The list gets stale the moment your agent's job description changes, and you end up either over-permitting because the cost of curating the list is high, or under-permitting and watching the agent fail in places where it should have succeeded. Either failure mode is a sign that the layer is asking the wrong question. The question is not "is this hostname on a list", it is "does this request make sense for what this session is doing", and a list cannot answer that. This is also the part of the stack where you are most reliant on whoever runs the sandbox. If you are running an agent inside someone else's environment, you mostly get what they decided to build, and the gap between providers who have invested here and providers who have not is widening as agentic workloads grow. As an operator picking where to run agentic workloads, that is now part of the decision in a way it was not a year ago.

The pattern worth borrowing

There is one motif we keep coming back to, and it comes from how Amazon has been talking about agent security publicly. The shape of the idea is that the right way to gate sensitive access is not a static allowlist but another model. A specialised one whose job is to look at the request the agent is about to make, look at the context it is making the request from, and decide whether to let it through. The intuition is simple enough that it almost seems obvious once you have heard it. Whether a network request is safe is rarely a property of the URL alone. It is a property of the URL plus what the agent is trying to do, what it has been doing, and what it has been told. A request to a public API can be perfectly safe in one context and a red flag in another, and only the context tells you which. A list of hostnames cannot reason about the context. A model that has the situational context can at least try. The shape is also recursive in a way we like, because the thing being gated is an agent, and the thing doing the gating is an agent of its own with a much narrower job, which means we can reuse the same machinery we already had for running models and tools. That framing is what motivated us to explore something similar in Condukt.

What we built

Condukt has a few sandbox backends. Sandbox.Local runs against the host. Sandbox.Virtual runs against an in-memory filesystem backed by a Rust NIF. Sandbox.Kubernetes runs each session inside a dedicated pod. The first two are mostly about filesystem and process isolation, where what matters is which files the agent can touch and which subprocesses it can spawn. The Kubernetes one is the one where we have the room to do something useful about the network, because it is the only one where we own the network namespace and the manifests, and where we can decide what the pod is allowed to do before the agent ever starts running. The plumbing we need does not exist on the host, but it does exist in Kubernetes, and that is where we put the new layer.

We added a piece called Sandbox.NetworkPolicy that runs alongside the agent's pod and intercepts every outbound TCP connection on ports 80 and 443. It works because we set the rules of the pod ourselves. An init container writes iptables rules into the pod's network namespace that redirect outbound 80/443 traffic to a sidecar process.

The sidecar holds a per-session ephemeral certificate authority that the workspace ends up trusting through two complementary mechanisms, neither of which asks the operator to rebuild anything. The first is environment variables. The pod spec sets NODE_EXTRA_CA_CERTS, REQUESTS_CA_BUNDLE, SSL_CERT_FILE, PIP_CERT, CURL_CA_BUNDLE, and GIT_SSL_CAINFO on the workspace container, all pointing at the mounted per-session CA, and Node, Python, pip, curl, git, Ruby Net::HTTP, and the rest of the language toolchain honour those without any cooperation from the image.

The second is the system trust store itself. We ship a snapshot of the Mozilla public root list inside Condukt, splice the per-session CA onto the end of it, and mount the resulting bundle via Kubernetes subPath mounts at /etc/ssl/certs/ca-certificates.crt and /etc/ssl/cert.pem, the two paths most mainstream Linux distros and distroless images use. Tools that ignore env vars and read the system bundle directly (static Go binaries, openssl CLI, anything falling back to the OS) see Mozilla's roots plus our session CA at the location they already look.

The upshot is that untouched base images like node:20-bookworm, python:3.13-slim, gcr.io/distroless/cc-debian12, or any internal runtime an operator already has, cooperate with the MITM with zero preparation. There is no prepare script and nothing to bake at build time.

A Kubernetes NetworkPolicy scoped to the session pod makes the sidecar the only thing that can reach the outside world, which is the structural piece that keeps the agent from quietly opening a TCP connection on a port we did not redirect or to a destination we did not intend.

The image, the CA, the bundle, the iptables rules, and the policy file all come from the same per-session manifest, so an operator can reason about what the agent can do by reading what the pod is allowed to do. What the agent sees is an ordinary HTTPS request that succeeds or fails like any other. What we see is a structured event with method, host, path, headers, and timing, plus a decision point in the middle, and it is the decision point that we are most interested in.
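To make the two trust mechanisms concrete, here is a minimal sketch of how the env entries, the spliced bundle, and the subPath mounts could be assembled. The module name TrustSketch, its function names, and the volume and key arguments are hypothetical and are not Condukt's API; only the env var names and the two system paths come from the description above.

```elixir
defmodule TrustSketch do
  @moduledoc "Hypothetical helper for illustration; not Condukt's actual API."

  # Env vars named in the post; each points a TLS client at the session CA.
  @ca_env_vars ~w(
    NODE_EXTRA_CA_CERTS REQUESTS_CA_BUNDLE SSL_CERT_FILE
    PIP_CERT CURL_CA_BUNDLE GIT_SSL_CAINFO
  )

  # System trust paths named in the post, overlaid via subPath mounts.
  @system_paths ["/etc/ssl/certs/ca-certificates.crt", "/etc/ssl/cert.pem"]

  @doc "Env entries in the name/value shape a Kubernetes container spec uses."
  def env_entries(ca_path) do
    Enum.map(@ca_env_vars, fn name -> %{"name" => name, "value" => ca_path} end)
  end

  @doc "Appends the session CA PEM to the end of the public root bundle."
  def splice_bundle(public_roots_pem, session_ca_pem) do
    String.trim_trailing(public_roots_pem) <> "\n" <> String.trim(session_ca_pem) <> "\n"
  end

  @doc "One read-only subPath mount per system trust path."
  def bundle_mounts(volume_name, key) do
    Enum.map(@system_paths, fn path ->
      %{"name" => volume_name, "mountPath" => path, "subPath" => key, "readOnly" => true}
    end)
  end
end
```

Mounting a single file with subPath overlays only that path, which is why the rest of /etc/ssl in the base image stays untouched.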

A pipeline of rules

The policy is shaped as an ordered keyword list of rules, in the same spirit as a Plug pipeline. For every outbound request the runtime walks the list top to bottom and stops at the first rule that matches. If no rule matches, the policy's :default action fires. We default to :deny so the policy fails closed.

    sandbox: {
      Condukt.Sandbox.Kubernetes,
      network_policy: %Condukt.Sandbox.NetworkPolicy{
        rules: [
          deny: ["*.internal.example.com"],
          allow: ["api.github.com", "*.openai.com"],
          decide: {Condukt.Sandbox.NetworkPolicy.AgentDecider, agent: MyApp.NetGuard}
        ],
        default: :deny
      }
    }
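The first-match walk can be sketched in a few lines. This is an illustration under stated assumptions, not Condukt's implementation: the module PolicySketch and its functions are hypothetical, a :decide rule is assumed to match every host that reaches it, and glob matching is assumed to work label by label, with * consuming exactly one DNS label and ** consuming one or more.

```elixir
defmodule PolicySketch do
  @moduledoc "Hypothetical first-match evaluator; not Condukt's implementation."

  # Walks the ordered rules and returns the action of the first match:
  # :allow, :deny, or {:decide, decider}. Falls back to `default`.
  def evaluate(rules, default, host) do
    Enum.find_value(rules, default, fn
      {action, patterns} when action in [:allow, :deny] ->
        # `if` without else returns nil, so the walk continues on no match.
        if Enum.any?(patterns, &glob_match?(&1, host)), do: action

      {:decide, decider} ->
        # Assumption: a :decide rule matches any host that reaches it.
        {:decide, decider}
    end)
  end

  # Matches a host against a glob pattern, label by label, case-insensitively.
  def glob_match?(pattern, host) do
    match_labels(
      String.split(String.downcase(pattern), "."),
      String.split(String.downcase(host), ".")
    )
  end

  defp match_labels([], []), do: true
  # "*" consumes exactly one label.
  defp match_labels(["*" | ps], [_ | hs]), do: match_labels(ps, hs)
  # "**" consumes one label, then either stops or keeps consuming.
  defp match_labels(["**" | ps] = all, [_ | hs]),
    do: match_labels(ps, hs) or match_labels(all, hs)
  # A literal label must be equal on both sides.
  defp match_labels([p | ps], [p | hs]), do: match_labels(ps, hs)
  defp match_labels(_, _), do: false
end
```

With the rules from the configuration above, db.internal.example.com hits the :deny rule first, api.openai.com hits the :allow rule, and anything else falls through to the :decide rule.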

Three rule kinds ship out of the box, and order is the way you express priority. :allow and :deny both match against host glob patterns: * for a single DNS label, ** for one or more labels. Putting a :deny for an internal subdomain before an :allow for the parent domain denies the one you care about and allows the rest. Swap the order and the allow wins everywhere, because the walk stops at the first match and never reaches the deny. The static rules are the fast path, evaluated entirely inside the sidecar without a round trip, because spending a model call on a request to your own API is a waste.

The interesting rule is :decide, and it accepts four shapes that all collapse to the same runtime contract. The first is a plain function: given a session context snapshot and the request, return :allow or {:deny, reason}, which is the place to put rules you can express in code, like denying anything to an internal hostname unless the session metadata says the session is internal. The second is an {module, function} tuple, the same shape but referenceable from configuration, which is the form you reach for when you want the rule to live in a module other people can read. The third is a module alone, which calls module.decide(ctx, req, []) and is the natural shape for behaviour-backed deciders that don't need configuration. The fourth is the one that nudged the whole design. You can pass {module, opts} and the module can be a Condukt-defined agent through the shipped AgentDecider wrapper, which means a model gets to make the call:

    defmodule MyApp.NetGuard do
      use Condukt

      @impl true
      def system_prompt do
        """
        You gate outbound network requests for an AI coding agent. You
        receive the request and recent session context as JSON. Allow
        well-known reputable API hosts the task plausibly needs; deny
        everything else.
        """
      end
    end

The prompt only describes the policy, never a wire format. AgentDecider injects the decision contract as the agent's structured output schema, so the model returns a validated {"decision": "allow" | "deny", "reason": "..."} answer without being told to. The schema lives in the wrapper, not in every prompt that uses it.
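The way four decider shapes can collapse to one contract is easier to see in code. The sketch below is hypothetical: DeciderSketch and ExampleDecider are illustrative names, and telling {module, function} apart from {module, opts} by checking for an atom versus a list is an assumption about how such a dispatch might work, not a description of Condukt's internals. Anything that is not :allow or {:deny, reason} is treated as a deny so the contract fails closed.

```elixir
defmodule DeciderSketch do
  @moduledoc "Hypothetical normaliser for the four :decide shapes."

  # Shape 1: plain function taking the context snapshot and the request.
  def run(fun, ctx, req) when is_function(fun, 2), do: normalise(fun.(ctx, req))

  # Shape 2: {module, function} tuple, referenceable from configuration.
  def run({mod, fun}, ctx, req) when is_atom(mod) and is_atom(fun),
    do: normalise(apply(mod, fun, [ctx, req]))

  # Shape 4: {module, opts}, forwarded to module.decide/3.
  def run({mod, opts}, ctx, req) when is_atom(mod) and is_list(opts),
    do: normalise(mod.decide(ctx, req, opts))

  # Shape 3: bare module, called with empty opts.
  def run(mod, ctx, req) when is_atom(mod), do: normalise(mod.decide(ctx, req, []))

  defp normalise(:allow), do: :allow
  defp normalise({:deny, reason}) when is_binary(reason), do: {:deny, reason}
  # Fail closed on anything outside the contract.
  defp normalise(other), do: {:deny, "invalid decider result: " <> inspect(other)}
end

defmodule ExampleDecider do
  # Illustrative decider: allows only hosts listed under the :hosts option.
  def decide(_ctx, %{host: host}, opts) do
    if host in Keyword.get(opts, :hosts, []), do: :allow, else: {:deny, "host not listed"}
  end

  # Two-argument entry point for the {module, function} shape.
  def check(ctx, req), do: decide(ctx, req, hosts: ["api.github.com"])
end
```

Calling DeciderSketch.run({ExampleDecider, hosts: ["api.github.com"]}, ctx, req) and DeciderSketch.run({ExampleDecider, :check}, ctx, req) go through different clauses but end in the same :allow or {:deny, reason} result.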

When the workspace agent makes a request and the sidecar reaches a :decide rule, it holds the connection open, emits a decision_request over a bidirectional NDJSON control channel back to the BEAM, and waits for the answer. On the BEAM side, a ControlBridge receives the frame, builds a context snapshot from the live session that includes the last few messages and any caller-supplied metadata, runs the decider, and writes back a decision. The sidecar either lets the request through with proper MITM TLS termination so that body capture is meaningful, or it RSTs the connection and emits a denial event that the rest of the system can read.

A timeout counts as a deny, because the failure mode should always favour the user, and the wrong direction of failure is the one nobody catches until something embarrassing has already left the building. The decider agent runs as a sub-agent so its own network traffic does not recurse through the gate it is helping to enforce, which would be a delightful kind of bug to chase but not one we want in the hot path. Decisions are cached per-session per-host, so once the model has said "no" to evil.com, the next attempt does not pay another model call to hear the same answer.
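The deny-on-timeout and per-session, per-host caching behaviour can be sketched with a task and a plain map. Everything here is an assumption for illustration: GateSketch is a hypothetical module, the 2-second default timeout is arbitrary, and the choice not to cache timeouts or crashes (so a slow model does not block a host for the whole session) is ours, not something the design above specifies.

```elixir
defmodule GateSketch do
  @moduledoc "Hypothetical timeout-and-cache wrapper around a decider call."

  # `cache` is a plain map keyed by {session_id, host}.
  # Returns {decision, updated_cache}.
  def decide(cache, session_id, host, decider_fun, timeout_ms \\ 2_000) do
    key = {session_id, host}

    case Map.fetch(cache, key) do
      {:ok, decision} ->
        # Cached answer: the decider is not invoked again.
        {decision, cache}

      :error ->
        case call_with_timeout(decider_fun, timeout_ms) do
          {:ok, decision} -> {decision, Map.put(cache, key, decision)}
          # Assumption: timeouts and crashes deny this request but are not cached.
          :error -> {{:deny, "decider timed out or failed"}, cache}
        end
    end
  end

  defp call_with_timeout(decider_fun, timeout_ms) do
    # Catch failures inside the task so a crashing decider cannot take
    # down the linked caller.
    task =
      Task.async(fn ->
        try do
          decider_fun.()
        catch
          _kind, _value -> :crashed
        end
      end)

    # yield waits up to timeout_ms; shutdown kills a task that is still running.
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, :allow} -> {:ok, :allow}
      {:ok, {:deny, reason}} -> {:ok, {:deny, reason}}
      _ -> :error
    end
  end
end
```

A real bridge would more likely keep the cache in the session process state or an ETS table; the map keeps the sketch self-contained.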

The honest part

For the Kubernetes sandbox, we can do all of this because we own the network namespace and the manifests. We can inject containers, write iptables rules, mount secrets, scope egress with a NetworkPolicy. The control surface is there, and we can be explicit about what the agent's pod is allowed to do, which is what makes the whole layer work.

For Sandbox.Local, we cannot do any of this in a way that would actually contain a determined agent. The host's network stack is the host's, and without elevated privileges and a per-process network namespace, an agent that wanted to bypass our intentions could always find a way around them. We could lean on environment variables like HTTP_PROXY and hope every library inside the agent honours them, but "hope" is the right word, and a sandbox built on hope is not the kind of primitive you would put under a multi-tenant product. We are honest about that. If you run a local sandbox, you are in cooperative-with-yourself mode, which is fine for development against your own machine and a real cost to acknowledge when you start trusting agents to do work you would not want a stranger to do.

For Sandbox.Virtual, the in-memory bashkit interpreter does not have a network surface today, and when it does, we will hook into the same network policy layer at the Rust boundary where every outbound call originates from code we own. That will be the cleanest of the three, because there is no escape hatch at the OS level when the entire execution lives inside a NIF, and the gate is no longer a redirect we hope nothing slips around but a function call that nothing can avoid.

For any sandbox you do not control, including the ones that hosted agent platforms offer as part of their environments, you are reliant on whatever they decided to build. Some of them are thoughtful about this. Some of them treat it as a tomorrow problem. As an operator deciding where to run agentic workloads, that gap is increasingly part of the cost, and it is one of the reasons we keep ending up at "let us own the pod" as the place where the interesting decisions get made.

Where we land

The part we keep coming back to is that the bottleneck is no longer the plumbing. The plumbing was real work, and it took a while to find the right shape. We had to build a per-session CA that the workspace can trust without an operator manually distributing certificates. We had to build an iptables-based redirect that catches every outbound connection on the ports we care about without forcing the workspace image to know about us. We had to build a TLS terminator that speaks both HTTP/1.1 and HTTP/2, because anything modern is going to negotiate h2 and we did not want to ship a feature that quietly disabled itself on anything reasonable. We had to build a bidirectional control channel that connects the sidecar to the BEAM. The sidecar listens on a control port and the BEAM reaches it over a pods/portforward websocket, the Kubernetes-native way to talk to a port inside a pod, supervised so a dropped channel re-dials with backoff instead of taking the session down. We had to build an agent runtime that can stand in for the decider and produce structured output reliable enough to put on a hot path. None of that was a research project, but it was many weeks of careful work, and it is now done.
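The re-dial behaviour of the control channel can be illustrated with a capped exponential backoff. This is a sketch under assumptions: the post only says the channel re-dials with backoff, so the exponential schedule, the 200 ms base, the 10 s cap, and the BackoffSketch module itself are all hypothetical. The dial function stands in for whatever opens the port-forward connection.

```elixir
defmodule BackoffSketch do
  @moduledoc "Hypothetical re-dial schedule; the post does not specify one."

  # Delay in ms before the next re-dial after failed attempt number
  # `attempt` (0-based): base * 2^attempt, capped at cap_ms.
  def delay_ms(attempt, base_ms \\ 200, cap_ms \\ 10_000) when attempt >= 0 do
    min(base_ms * Integer.pow(2, attempt), cap_ms)
  end

  # Calls `dial` until it returns {:ok, conn} or attempts run out,
  # sleeping between failures. `sleep` is injectable so tests need not wait.
  def redial(dial, max_attempts, sleep \\ &Process.sleep/1) when max_attempts >= 1 do
    Enum.reduce_while(0..(max_attempts - 1), {:error, :not_attempted}, fn attempt, _acc ->
      case dial.() do
        {:ok, conn} ->
          {:halt, {:ok, conn}}

        {:error, reason} ->
          sleep.(delay_ms(attempt))
          {:cont, {:error, reason}}
      end
    end)
  end
end
```

In a supervised process the same schedule would more naturally be driven by Process.send_after/3 than by a blocking sleep, so the session process stays responsive while the channel is down.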

We also avoided forcing operators to build a separate image. The pod spec sets the canonical TLS-client env vars on the workspace container, and we overlay a Mozilla-plus-session bundle at the two system trust paths most Linux distros read. Between the two mechanisms there is no image-rebuild story to maintain. The only stack still not addressed is the JVM, whose keystore format the bundle approach cannot satisfy, and that is the one place we leave the image-side work to the operator.

The remaining bottleneck is the model that sits in the middle. The decisions are only as good as the gatekeeper is at making them, and a general-purpose model is not the right tool for that job. It is too expensive to put in the hot path of every request, because at scale a model call per outbound connection adds up to a budget that competes with the actual product. It is too slow to keep latency tolerable, because a 500ms-to-2s round trip in front of every request is something users will absolutely notice the first time they wait for a curl to come back. And it is too easily talked into things by the agent's own context, which is the part that worries us most: the gatekeeper is exposed to the same prompt-injection surface as the agent it is gating, and a clever prompt that survives long enough to reach the decider can flip a deny into an allow with no signal that anything went wrong. A small, narrow model trained on a narrow task is a much better candidate to be hardened against that, because the contract is smaller and the surface to defend is smaller.

What we would love to see, and what we suspect is on the near horizon, is a class of small, fast, specialised models trained explicitly to look at a request and the context it came from and decide whether the request fits the session's purpose. Cheap enough to put in the hot path. Fast enough not to feel it. Hardened against the kind of prompt injection that flows through the agent's own context, in the way a focused model can be hardened more cleanly than a general one. That is the missing piece in this layer. When it shows up, building safer agentic workflows will stop feeling like a research project and start feeling like a deployment choice, the same way TLS termination eventually stopped feeling like a research project and started feeling like something every load balancer simply does.

The companies that put a real model in the middle, rather than a list of hostnames, will be the ones whose agents customers will trust with something interesting, and that is the part of the stack we are most excited about right now.

