Launch HN: Runtime (YC P26) – Sandboxed coding agents for everyone on a team

www.runtm.com

22 points by gustrigos 1 hour ago

Hey HN, we're Gus and Carlos from Runtime (https://runtm.com). We're building infra that lets your whole team (including non-engineers) ship with Claude Code, Codex, and other agents without engineering having to hand-hold every session.

After Mentum (YC S21) was acquired, I personally shipped 4 full-stack products in 3 months using coding agents. When I tried to roll the same workflow out to the rest of the team, it fell apart:

- Most PRs were unmergeable slop.
- Every repo required an engineer doing one-off local setup.
- Skills and context lived in one person's head.
- There was no safe way for a PM to touch a real codebase without risking a bad deploy or a secrets leak.

Carlos comes from building agentic reconciliation systems at Modern Treasury and had a similar experience when he let his support team use Devin.

We ended up building internal background agent infra, but it quickly became a nightmare to maintain and develop. We built Runtime so you don't have to build and maintain this yourself.

Here's how Runtime works. Engineering defines the context once: system instructions, skills, and scoped integrations installable via CLI, mise, npm, or any package manager. Then Runtime snapshots your full running environment, including multi-service Docker Compose setups, Kafka, Redis, and seeded DBs, so it comes up in milliseconds with every server already running.

We orchestrate across sandbox providers like E2B, Daytona, EC2 or self-hosted K8s depending on your setup. Secrets are injected through our managed proxy so they never touch the agent directly, and guardrails run at the infrastructure level: command allow/deny lists, network egress controls, and RBAC scoped per human and per agent. Every session also gets a shareable preview URL, so internal builds go from sandbox to the rest of the team without needing production access.
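To make the guardrail idea concrete, here is a minimal sketch of a command allow/deny list plus an egress host check. This is not Runtime's implementation: `Policy`, `command_allowed`, and `egress_allowed` are hypothetical names, and the example policy values are made up for illustration. It rejects shell metacharacters outright because a string-level check cannot safely reason about chained commands.

```python
import shlex
from dataclasses import dataclass
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Characters that let one command line smuggle in a second command.
SHELL_META = set(";|&$`<>\n")


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-agent policy; field names are assumptions."""
    allowed_commands: frozenset  # executables the agent may invoke
    denied_patterns: tuple       # glob patterns that always lose
    allowed_hosts: tuple         # glob patterns for egress hostnames


def command_allowed(policy: Policy, command_line: str) -> bool:
    """Deny wins over allow; anything unparseable is denied."""
    if any(ch in SHELL_META for ch in command_line):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not argv:
        return False
    normalized = " ".join(argv)
    if any(fnmatchcase(normalized, p) for p in policy.denied_patterns):
        return False
    return argv[0] in policy.allowed_commands


def egress_allowed(policy: Policy, url: str) -> bool:
    """Allow a request only if its hostname matches an allowed pattern."""
    host = urlparse(url).hostname or ""
    return any(fnmatchcase(host, p) for p in policy.allowed_hosts)


# Example policy (illustrative values only).
policy = Policy(
    allowed_commands=frozenset({"git", "npm", "pytest"}),
    denied_patterns=("git push *--force*",),
    allowed_hosts=("github.com", "*.npmjs.org"),
)
```

With this policy, `command_allowed(policy, "git status")` passes while `git push --force origin main` and `git status; curl example.com` are rejected. A check like this only illustrates the policy shape; enforcing it at the infrastructure level, as described above, is what keeps an agent from simply bypassing it.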

Runtime works with whichever agent your team already uses: Claude Code, Codex, Cursor, Copilot, Gemini, Devin. You can trigger sandboxes from our web app, CLI, Slack, Linear, GitHub, or API.

One of our customers built an on-call inspector that wires PagerDuty, Sentry, and their repo so when an alert fires, the agent finds the cause and opens a PR with a unit test before anyone gets paged. Another runs a finance agent in a private Slack channel pulling from Stripe, NetSuite, and Snowflake to run reconciliations in minutes with source rows attached.

A fintech unicorn and several YC scaleups are live on Runtime, including a few teams who had built similar infrastructure internally and handed it to us to take over.

The core is open source at https://github.com/runtm-ai/runtm. Hosted version is live at https://app.runtm.com, free tier included. We're charging a flat platform fee plus compute, no token markup.

Check our demo: https://www.youtube.com/watch?v=wLwj__aEEh4

We'd love to hear how you're thinking about the infra for letting more people across your org use coding agents without creating chaos!

killerstorm 21 minutes ago

I have a suggestion - an assistant which can help to set up all these agents, perhaps based on templates. You already covered various use cases, but it's not clear if it's something concrete.

I think a lot of people who might be interested in this product might be interested in an easy set-up process. Even if it doesn't really save time for an experienced ops person, a lot of people would rather talk to a bot than fill a form.

gustrigos 9 minutes ago

Good point. We recently launched our CLI for exactly this. It comes with skills, so you can use your own local setup (Claude Code, Cursor, Codex) to build templates, spin up sessions, and set things up. You can scope what agents can and can't do. I wouldn't recommend using an agent to set up guardrails, though. There should be some human oversight for this.

nilirl 38 minutes ago

Hi, this looks really powerful, in that it seems to have many use cases.

One question I had:

Does every sandbox change end (when ready for production) in a pull request? If marketing sends me a pull request and I hate the code, what's the flow like for me to fix it?

gustrigos 12 minutes ago

Thanks! Lots of use cases. For the workflow you mentioned, the idea is marketing sends you a PR with the live preview. You can see the UI changes and if you need to change anything, you can open the session, which will let you modify, backtrack, or continue their work inside the same sandbox they used.

mritchie712 57 minutes ago

I wonder how this would fare under the ever-changing rules of Claude Code.

If someone from Anthropic sees this, would love to know if I can use my Max plan here.

gustrigos 13 minutes ago

All of our customers are using Anthropic APIs for programmatic use. Codex and other providers let you use OAuth. But inside a sandbox, you can technically use a Max plan, since it is the same as using Claude locally.