Angelo Saraceno

Jun 24, 2026

Skill Issue, Writing /skills to solve agent failures

If you want to make your agent an expert at operating Railway, all you need to do is run the following command:

curl -fsSL agents.railway.com | sh

This will install the Railway CLI and configure agent support. If you want more details about what’s happening under the hood and how we got here, the rest of this blog post is for you.

Agents didn’t know how to hold Railway

Around six months ago, we found that a good portion of our user base had become “agent-pilled”. Instead of doing things via the Railway dashboard or typing commands into the CLI, they'd tell their agent what they wanted and walk away.

"Deploy my app on Railway"
"My app is slow. Here's the deployment, can you figure out why?"
"I'm on Railway. Spin up a Postgres database and wire it into my app"

It was great in theory, but then the support threads started rolling in.

Why did Claude send my replica to Tahiti? Why did Cursor say Railway storage was 2 cents?

…and the dreaded… Why did my agent delete my whole project!?!

It didn't take us too long to work out what was missing. If an agent was going to drive Railway, it needed two things:

Tools. A way to actually operate Railway, whether that's an API, a CLI, or an MCP server.
Education. The knowledge of how to use those tools, plus a real mental model of the platform's primitives and features.

We initially focused our efforts on tooling and published an official MCP server for Railway. Then two problems showed up.

The first problem was awareness. Many users didn’t know the MCP server existed in the first place, or that it was a way to make Railway work better with agents. When you asked the agent to use Railway without anything set up, it would install the Railway CLI, start firing off bash commands, guess the command flags, and assume features we'd never built.

The second problem was worse. The official MCP had around 30 perfectly good tools, but you had to tell the agent to call them. The desired state of telling the agent what you wanted and walking away wasn’t possible. We started to wonder if this was some failure baked in during post-training, that the models simply hadn't been taught how to hold a product.

Then Agent skills happened.

You have to give credit where it’s due: Anthropic’s ability to lexicographically own a mental model is maybe their number one superpower. Railway dubbed our version “corpus” - well, our term for it was a walking corpse after that video and SKILL.md.

It turns out the single most effective way to do real-time RL on a model was… a text document. We were skeptical. We were also wrong. It's super effective.

Experimenting with Agent Skills internally

We started where it was safe, on ourselves. Internal docs and runbooks went into the grinder and came out the other side as agent skills.

By Jan 7th, Jake Cooper had personally merged a PR with a 10,000+ line diff of almost everything we had in Notion and made a /build-skill skill. The mandate was clear: everything was to be a skill, and if it couldn’t be, it needed to be in an MCP so we could turn it into one.

What followed was a mad dash to make everything tractable to the agent. The work was unglamorous, but it was funny to see the spike in markdown commits in our repo. We listed out all of the verbs that we would handle in runbook land and ported them over.

Railway already had a standard called “3AM proof,” where the instructions for performing a task needed to be clear. We would then associate each skill with the tools it needed, so that the skill pointed the model in the right direction.

name: reboot-metal-host
description: This skill should be used when the user asks to “reboot a metal host”
allowed-tools: Bash, mcp___get, mcp___sensors, mcp__datadog-mcp__get_datadog_metric, mcp__datadog-mcp__search_datadog_logs
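To make the skill-to-tools association concrete, here is a minimal Python sketch of how a harness might read that frontmatter and check a tool against the skill's allowlist. The function names (parse_frontmatter, allowed_tools, is_allowed) are hypothetical and the parsing rules are an assumption; this is not how Railway's harnesses or any particular agent actually implement it.

```python
# Frontmatter copied from the example above. The parsing rules below are an
# assumption for illustration, not a real harness implementation.
FRONTMATTER = """\
name: reboot-metal-host
description: This skill should be used when the user asks to “reboot a metal host”
allowed-tools: Bash, mcp___get, mcp___sensors, mcp__datadog-mcp__get_datadog_metric, mcp__datadog-mcp__search_datadog_logs
"""


def parse_frontmatter(text):
    """Split each `key: value` line on its first colon only."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def allowed_tools(fields):
    """Turn the comma-separated allowed-tools value into a clean list."""
    raw = fields.get("allowed-tools", "")
    return [tool.strip() for tool in raw.split(",") if tool.strip()]


def is_allowed(tool, fields):
    """Hypothetical harness check: is this tool on the skill's allowlist?"""
    return tool in allowed_tools(fields)


skill = parse_frontmatter(FRONTMATTER)
print(skill["name"])
print(allowed_tools(skill))
```

The point of the allowlist is that the skill carries its own tool scope, so the model is steered toward the right tools without the user having to name them.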

We would then just run water through these skills, exposing them in the proper agent contexts. We have three harnesses: support, agent, and internal. We were very lucky that we had this company brain well before that innovation from Anthropic. We then took that expertise to you.

Poaster Roon said it best:

…the thing that makes a model good at your product is not a better model. It’s a document.

…and it was a document that solved tool calling, go figure.

Once we'd convinced ourselves the thing worked internally, the next move was obvious: give users the same experience when their agent drives Railway. Enter the official Railway skills.

Writing the Railway Agent skills

Five months ago, we published the first version of the Railway Agent skills. We had twelve skills, packaged in a single repo, with a bash script that would install each one for whatever agents you had:

railway-docs, service, central-station, status, deployment, deploy, database, projects, environment, metrics, new, domain

The thinking behind twelve skills was that the agent would load and unload them on demand based on what the user asked for. The thinking was wrong, and agents turned out to be pretty bad at that. So we collapsed the whole set into a single use-railway skill with a different shape entirely:

use-railway/
├── SKILL.md
├── references/
│   ├── analyze-db-mongo.md
│   ├── analyze-db-mysql.md
│   ├── analyze-db-postgres.md
│   ├── analyze-db-redis.md
│   ├── analyze-db.md
│   ├── configure.md
│   ├── deploy.md
│   ├── operate.md
│   ├── request.md
│   └── setup.md
└── scripts/
    ├── analyze-mongo.py
    ├── analyze-mysql.py
    ├── analyze-postgres.py
    ├── analyze-redis.py
    ├── dal.py
    ├── enable-pg-stats.py
    ├── pg-extensions.py
    └── railway-api.sh

The important idea is that SKILL.md is the router. It defines when to use Railway, what tools are allowed, the Railway resource model, and how to choose between Remote MCP, local CLI MCP, or plain railway CLI.

The references/ folder keeps SKILL.md from turning into one giant instruction blob. The router's job is to load only the reference that matches the user's intent, usually one file, two at most. A deploy request lands on deploy.md. Debugging goes to operate.md. Configuration goes to configure.md. Database introspection starts at analyze-db.md and fans out from there.
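The routing described above can be pictured as a small lookup table. In the real skill the routing is done by the model following SKILL.md, not by code, so treat this Python sketch as an illustration only: the file names come from the references/ tree, while the intent labels and the pick_references function are hypothetical.

```python
# Hypothetical routing table. File names come from the references/ tree;
# the intent labels are made up for illustration.
ROUTES = {
    "deploy": ["deploy.md"],
    "debug": ["operate.md"],
    "configure": ["configure.md"],
    "analyze-db": ["analyze-db.md"],
}

# Engine-specific references that analyze-db.md fans out to.
DB_REFERENCES = {
    "mongo": "analyze-db-mongo.md",
    "mysql": "analyze-db-mysql.md",
    "postgres": "analyze-db-postgres.md",
    "redis": "analyze-db-redis.md",
}


def pick_references(intent, db_engine=None):
    """Return the reference files to load: usually one, two at most."""
    files = list(ROUTES.get(intent, []))
    if intent == "analyze-db" and db_engine in DB_REFERENCES:
        files.append(DB_REFERENCES[db_engine])
    return files


print(pick_references("deploy"))
print(pick_references("analyze-db", "postgres"))
```

The design choice this mirrors is that nothing else gets loaded: a deploy request never pays the context cost of the database analysis references.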

The scripts/ folder contains the code-backed parts of the skill: database analyzers, Postgres helpers, and a Railway GraphQL wrapper. These are not the primary interface; they support cases where structured analysis or lower-level API access is needed.

This shape fixed most of what had been going wrong: when Railway got mentioned in a prompt, the agent would load the skill properly.

Unfortunately, an agent is only as sharp as what it knows about the platform in front of it, and anyone who didn't have the skill installed was still letting their agent fly blind. So we went back to the setup experience.

Refining the Railway Agent skills setup experience

We added railway skills install to the CLI, one focused command for getting Railway's skills onto your machine. It pulls them from the railway-skills repo and drops them into the universal ~/.agents/skills directory, plus any tool-specific skills directories it detects: Claude Code, Cursor, Codex, OpenCode, GitHub Copilot, and Factory Droid.

Want just one tool? Target it with --agent <agent-name>, or pass the flag more than once to set up several. Valid values are universal, claude-code, codex, opencode, cursor, copilot, and factory-droid.
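If you script your machine setup, the repeated --agent flag is easy to assemble programmatically. The sketch below is a hypothetical Python helper (build_install_command is not part of the Railway CLI) that validates agent names against the values listed above and builds the argument list for railway skills install. It only constructs the command and does not run it.

```python
# Valid --agent values as listed in this post.
VALID_AGENTS = (
    "universal",
    "claude-code",
    "codex",
    "opencode",
    "cursor",
    "copilot",
    "factory-droid",
)


def build_install_command(agents=()):
    """Build the argv for `railway skills install`, repeating --agent per target.

    Hypothetical helper for illustration. With no agents given, it returns the
    bare command and leaves target detection to the CLI.
    """
    unknown = [agent for agent in agents if agent not in VALID_AGENTS]
    if unknown:
        raise ValueError("unknown agent(s): " + ", ".join(unknown))
    command = ["railway", "skills", "install"]
    for agent in agents:
        command += ["--agent", agent]
    return command


print(" ".join(build_install_command(["claude-code", "cursor"])))
```

Once the Railway CLI is installed, the returned list can be handed to subprocess.run(command, check=True).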

If you've got CLI auto-update enabled, the installer checks the skills repo in the background and applies new revisions as they land, so your agent stays on the latest and greatest without you thinking about it.

Final thoughts

In the end, it was a skill issue. The fix was literal. Anyway, with the Railway Agent Skills, you can have your agent fluent in Railway from the first prompt.

If you have any thoughts or feedback about using Railway with agents, reach out to us on Central Station. We’d love to hear from you. You can also @ mention Railway on X or LinkedIn and we’ll get back to you.