Mahmoud Abdelwahab

Deploy Full-Stack TypeScript Apps: Architectures, Execution Models, and Deployment Choices

TypeScript has become the default language for many developers building full-stack applications. When frontend and backend code share a single type system, the boundaries between components become easier to reason about. Data structures stay consistent across client utilities, API handlers, background workers, and database access layers. Errors that would normally surface only at runtime are caught during development.

This article outlines the major architectural patterns for deploying full-stack JavaScript applications, the tradeoffs behind each, and the factors that influence deployment platform choice.

The table below outlines the key differences between serverless platforms and long-running servers. Use it as a quick reference before diving into the architectural details that follow.

| Dimension | Serverless (Cloudflare, Vercel) | Long-Running Servers (Railway) |
|---|---|---|
| Execution model | Short-lived invocations; instances spin up on demand and scale to zero when idle | Persistent processes that remain active for the lifetime of the service |
| State management | Stateless by design; global variables are instance-scoped and unreliable across requests | Stable in-memory state persists until the process restarts; predictable behavior for counters, caches, and session data |
| Resource limits | Strict ceilings: Workers cap at 128 MB memory and 5 min CPU time per request; Vercel has similar function-level constraints | Up to 32 vCPU and 32 GB RAM per instance (Pro Plan); no artificial timeouts or invocation limits |
| Scaling | Automatic horizontal scaling; new instances created per request as traffic increases | Vertical scaling within plan limits; horizontal scaling via manual replica deployment across regions |
| Regions | Cloudflare: edge-first across 300+ global locations; Vercel: select a primary region on AWS | Four dedicated regions: US West, US East, EU West, Southeast Asia; deploy replicas globally with automatic routing |
| Supported runtimes | Cloudflare: V8-based runtime (TypeScript, Python, Rust); Vercel: Node.js, Bun, Go, Python, Ruby, Edge (V8) | Any runtime that runs in a container: Node.js, Bun, Go, Python, Rust, Java, custom binaries, multi-process systems |
| Persistent connections | Not supported; connections terminate when the invocation ends | Full support for WebSockets, SSE, long-polling, and any connection that needs to stay open |
| Protocols | Primarily HTTP; limited support for non-HTTP traffic | HTTP, WebSockets, gRPC, raw TCP, custom binary protocols |
| Storage | Platform-specific primitives. Cloudflare: D1 (SQLite, with limits on dataset size, query complexity, and write volume), R2 (S3-compatible object storage), KV, and Durable Objects. Vercel: Edge Config and Blob, with no built-in relational database or key-value store, so you need to bring in additional providers | Full databases (Postgres, MySQL, Redis, MongoDB, ClickHouse, etc.) with persistent volumes; no limits on dataset size, query complexity, or write volume |
| Private networking | Limited; services communicate over the public internet or vendor-specific bindings | Built-in private network (*.railway.internal) for secure service-to-service communication without public exposure |
| Billing model | Usage-based per invocation, CPU time, and/or request count | Usage-based on active CPU and memory at one-minute resolution; idle services cost nearly nothing |
| Infrastructure management | Fully abstracted: no servers, patches, or certificates to manage | Fully abstracted: Railway handles builds, deploys, TLS, routing, and observability |
| Best suited for | Request–response APIs, SSR, edge logic, lightweight preprocessing, globally distributed reads | Stateful backends, background jobs, real-time systems, AI/ML inference, data processing, persistent connections |
| Not well suited for | Long-running workloads or workloads that require a persistent connection, e.g. ETL jobs, media processing, WebSocket servers, chat systems, live dashboards | Workloads that benefit from sub-10ms global latency at the edge |

Teams building JavaScript applications typically choose between two structural patterns.

The first is the unified full-stack framework. Frameworks such as Next.js, TanStack Start, React Router, Nuxt, and SvelteKit bundle routing, rendering, and server logic into a single codebase.

Next.js app example architecture

fullstack-app/
├─ public/
├─ app/
│  ├─ layout.tsx
│  ├─ page.tsx
│  └─ api/
│     └─ users/
│        └─ route.ts
├─ components/
│  └─ ui/
├─ lib/
│  └─ db.ts
├─ package.json
└─ tsconfig.json

This reduces the number of interfaces a team must maintain and keeps UI and API changes aligned. When application logic, data-fetching patterns, and component structure evolve together, inconsistencies between browser and server show up far less often. A single pull request can update the UI, adjust the API shape, and revise validation logic in one place. The deployment process also stays uniform, since any change results in a new build of the entire application.
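To make this concrete, here is a minimal sketch of a route handler and a Server Component sharing one type. The getUsers helper and User type in lib/db.ts are hypothetical stand-ins:

// lib/db.ts (hypothetical helper)
export interface User {
  id: string;
  name: string;
}

export async function getUsers(): Promise<User[]> {
  // query the database here
  return [{ id: "1", name: "Ada" }];
}

// app/api/users/route.ts
import { NextResponse } from "next/server";
import { getUsers } from "@/lib/db";

export async function GET() {
  return NextResponse.json(await getUsers());
}

// app/page.tsx (Server Component)
import { getUsers } from "@/lib/db";

export default async function Page() {
  const users = await getUsers(); // typed as User[] at compile time
  return (
    <ul>
      {users.map((u) => (
        <li key={u.id}>{u.name}</li>
      ))}
    </ul>
  );
}

If the API response shape changes, every consumer that imports the same type fails to compile until it is updated in the same pull request.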

These advantages come with constraints. Full-stack JS frameworks rarely ship with the full set of tools required for a production-grade backend. Teams often add validation libraries, ORMs, queue systems, and background processing infrastructure. Their execution model is optimized for request–response work and server-side rendering, not for long-running processes or heavy task runners. Workloads that fall outside the framework’s assumptions require dedicated services or custom integration. Every change triggers a full rebuild and redeployment, limiting how independently components can evolve.

The second pattern is independently deployed services, where each component defines its own runtime behavior. Most teams using this model maintain a monorepo (e.g. using Turborepo) so that frontend code, backend services, workers, and shared TypeScript utilities remain in one place while still deploying independently.

repo/
├─ apps/
│  ├─ frontend/
│  │  ├─ src/
│  │  ├─ public/
│  │  └─ package.json
│  ├─ api/
│  │  ├─ src/
│  │  └─ package.json
│  └─ worker/
│     ├─ src/
│     └─ package.json
├─ packages/
│  └─ shared/
│     ├─ src/
│     └─ package.json
├─ package.json
└─ tsconfig.json

This structure preserves the benefits of shared types without coupling all logic to a single execution environment.
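As a rough sketch, a type exported from packages/shared can be consumed by both the API and the worker without either knowing how the other is deployed (package and function names here are hypothetical):

// packages/shared/src/index.ts
export interface Job {
  id: string;
  kind: "email" | "export";
  payload: Record<string, unknown>;
}

// apps/api/src/enqueue.ts
import type { Job } from "@acme/shared";

export async function enqueue(job: Job): Promise<void> {
  // push the job onto a queue (Redis, Postgres, etc.) for the worker
}

// apps/worker/src/process.ts
import type { Job } from "@acme/shared";

export async function processJob(job: Job): Promise<void> {
  // the worker receives exactly the shape the API produced
}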

This model handles workloads that full-stack frameworks do not cover directly. Background jobs, event processors, and long-running tasks can run as separate services with their own scaling characteristics. Backend changes deploy independently from the frontend, reducing the scope of each release. The tradeoff is a need for additional coordination. API changes must remain compatible with existing UI versions, and teams need predictable internal networking, consistent observability, and clear configuration boundaries to keep the system manageable.

Modern platforms fall into two broad execution models: serverless and long-running servers.

Understanding the differences matters because each model sets clear limits on how your application runs, how it scales, and what kinds of workloads it can support.

Serverless environments run code inside short-lived execution contexts. A request triggers a function, the platform allocates resources, executes the handler, and releases those resources once the request completes or the instance becomes idle. When traffic increases, additional instances are created automatically, which provides horizontal scaling without configuration.

The two most widely used serverless platforms are Cloudflare and Vercel. They share several characteristics that define the developer experience:

  • Both provide an integrated workflow that includes:
    • Logging, metrics, and tracing
    • Environment variables and secrets management
    • Git-based deployments with instant rollbacks
    • Preview environments for every pull request
  • Neither requires direct interaction with servers, OS patches, SSL certificates, or network appliances. Infrastructure management remains hidden.
  • Compute is billed on usage, so you pay only for active CPU and memory instead of pre-reserving capacity.
  • Each platform includes a CDN for hosting and serving static assets.

Despite these similarities, the platforms differ significantly in how they execute workloads and structure their infrastructure.

Cloudflare uses an edge-first architecture built around a globally distributed network. Workers execute short-lived functions across a global network of machines that Cloudflare operates in its own datacenters. Incoming traffic is routed automatically to the nearest location, which reduces latency and distributes load across the network.

Cloudflare’s global network locations

Each machine runs workerd, a V8-based runtime that exposes Web Platform APIs such as fetch, Streams, and Web Crypto. The API surface is broad but incomplete, so applications must operate within the supported capabilities. Workers support several languages, including TypeScript, Python, and Rust. A Worker runs inside a single-threaded event loop. One instance may process multiple requests concurrently, interleaving whenever the runtime awaits asynchronous operations.
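As a minimal illustration of that model, the handler below hashes a request body with Web Crypto, one of the Web Platform APIs workerd exposes:

export default {
  async fetch(request: Request): Promise<Response> {
    // Web Crypto and the Fetch API are available globally in workerd
    const body = new TextEncoder().encode(await request.text());
    const digest = await crypto.subtle.digest("SHA-256", body);
    const hex = [...new Uint8Array(digest)]
      .map((b) => b.toString(16).padStart(2, "0"))
      .join("");
    return new Response(hex);
  }
};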

Vercel follows the same general serverless model but operates atop AWS rather than its own datacenter network. Instances start on demand, support concurrent execution (see Fluid Compute), and eventually scale down to zero.

Vercel Fluid Compute

You also need to specify where your deployed functions run by picking a region.

Vercel supports multiple runtimes—Node.js, Bun, Go, Python, Ruby, and an Edge runtime built on V8—but all runtimes inherit the short-lived invocation model.

The main benefit of serverless remains the elimination of infrastructure management. You deploy code, and the platform determines how many instances to run. This reduces operational decision-making and simplifies the deployment pipeline.

That said, due to how these platforms are architected, you quickly run into limits once your application needs more than short-lived, stateless execution.

Because the runtime is tuned for lightweight request–response behavior, platforms impose strict ceilings on memory, execution time, bundle size, and instance lifetime.

In the case of Workers, there is no guarantee that two requests from the same user will reach the same instance, and no expectation of isolation across users. Global variables belong to the instance, not the request, so they behave only as best-effort caches. Consider the following:

// Module-level state: scoped to this Worker instance, not to a request or user
let counter = 0;

export default {
  async fetch() {
    counter++;
    return new Response(counter.toString());
  }
}

The value of counter depends on which instance handles the request and whether that instance is reused or replaced. Because Workers have no lifecycle guarantees, in-memory state is temporary and cannot be relied upon for correctness.

Workers provide no filesystem and no reliable mechanism to preserve state across invocations, so applications must rely on external systems or Cloudflare-specific primitives like Durable Objects, D1, R2, Workers Analytics Engine, Cloudflare Realtime, or Cloudflare Images.
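For example, the counter from the earlier example can be stored in Workers KV rather than a global variable. A minimal sketch, assuming a KV namespace bound as COUNTER (a hypothetical binding name); KV is eventually consistent, so a Durable Object is the better fit when exact counts matter:

interface Env {
  COUNTER: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Read and write through KV so the value survives instance churn
    const current = Number((await env.COUNTER.get("count")) ?? "0");
    const next = current + 1;
    await env.COUNTER.put("count", String(next));
    return new Response(String(next));
  }
};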

Workers also impose tight resource limits. Memory is capped at 128 MB. CPU time is limited to five minutes for an HTTP request and fifteen minutes for a Cron Trigger. Wall-clock time is not restricted, meaning a Worker can remain open as long as the client stays connected. This enables streaming, long-lived responses, and background work via event.waitUntil, but once the CPU budget for an invocation is exhausted, the request terminates immediately.

For example, the following fails even if the client remains connected:

export default {
  async fetch() {
    let sum = 0;
    // long-running task not fit for running inside a Worker
    for (let i = 0; i < 5_000_000_000; i++) {
      sum += i;
    }
    return new Response(sum.toString());
  }
}
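By contrast, work that only needs to outlive the response, rather than consume sustained CPU, can be deferred with ctx.waitUntil. A minimal sketch, assuming a hypothetical logging endpoint:

export default {
  async fetch(request: Request, env: unknown, ctx: ExecutionContext): Promise<Response> {
    // The response is returned immediately; the log request finishes afterwards
    ctx.waitUntil(
      fetch("https://logs.example.com/ingest", {
        method: "POST",
        body: JSON.stringify({ path: new URL(request.url).pathname }),
      })
    );
    return new Response("ok");
  }
};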

Additionally, applications that require persistent connections fall outside what serverless environments can support.

Together, these limits rule out entire categories of workloads, including:

  • Data Processing: ETL jobs, large file imports and exports, analytics aggregation
  • Media Processing: Audio/video transcoding, image manipulation
  • Report Generation: Large PDFs, financial summaries, bulk exports
  • Infrastructure Tasks: Backups, CI/CD steps, provisioning workflows
  • Billing & Finance: Usage calculation, invoice generation, payment retries
  • User Operations: Account deletion, data merging, statistical recalculations

Likewise, any workload that depends on a steady, long-lived connection cannot be sustained:

  • Chat and messaging systems
  • Live dashboards, analytics panels, and tickers
  • Collaborative editing and presence systems
  • Delivery and device tracking
  • Push notification pipelines
  • Signaling paths for voice or video calls

Vercel imposes similar limits on the resources a function can use and on how long it can run.

These constraints shape both what serverless platforms can run and how applications must be structured. When a workload requires stable locality, sustained compute, or persistent connections, the only practical alternative is the other execution model: long-running servers.

Long-running servers represent the other major execution model. Instead of creating short-lived invocation contexts, a process starts once and remains active for as long as the service runs. The runtime is stable, memory persists across requests, and applications can maintain open connections, background tasks, and internal state without interruption.

This model supports any language or runtime that can run inside a process. Applications that rely on event loops, threading, background workers, schedulers, binary dependencies, or custom networking protocols behave as expected because the environment does not enforce function-level execution boundaries. State remains in memory until the process restarts, and long-running operations execute without arbitrary time limits.

That flexibility comes with its own trade-offs. Traditional server-based systems typically require sizing instances, selecting CPU and memory allocations, and planning for scaling. Costs follow provisioned capacity rather than actual usage, meaning you pay for the instance whether it’s busy or idle. Horizontal scaling requires explicit orchestration and coordination across replicas, and managing multiple environments often introduces operational overhead.

These factors make long-running servers well-suited for stateful systems, background workloads, and applications that expect continuous processing, but they also create friction when teams want the simplicity associated with serverless.

This is the gap a platform like Railway addresses.

Railway provides a managed platform for deploying applications without requiring direct interaction with traditional infrastructure. You get a unified developer workflow that includes logging, metrics, secret management, Git-based deployments with instant rollbacks, and automatic preview environments for every pull request. You do not handle servers, apply OS patches, or manage virtual machines. Low-level operational work—maintaining SSL certificates, rotating hardware, or configuring firewalls—is abstracted away.

Railway’s key distinction from serverless platforms lies in its execution model. You get long-running servers with usage-based billing, so you retain the flexibility of a persistent process without paying for idle capacity.

Instead of short-lived invocation contexts, a Railway service behaves like a normal long-running process that stays online, maintains in-memory state, and handles connections for as long as the service runs. Any runtime that can run inside a container is supported: Go, Python, Rust, Node.js, Bun, Java, custom binaries, or multi-process systems.

Railway uses a custom builder that consumes either your source code or a Dockerfile and produces an OCI-compliant container without additional configuration. Because your application runs as a full Linux process, it does not inherit the ceilings imposed by serverless runtimes. There are no function timeouts, invocation-level lifecycle rules, or artificial memory caps beyond what you allocate. State persists until the process restarts, and long-running tasks execute without interruption.

For example, in-memory state behaves predictably:

// This works as expected on Railway.
// The process is long-lived, so state is stable.

import { serve } from "bun";

let counter = 0;

serve({
  port: 3000,
  fetch(req) {
    counter++;
    return new Response(counter.toString());
  }
});

This stability matters for workloads such as real-time systems, socket-based applications, schedulers, and background workers that depend on state that should not disappear between requests.
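As an illustration, a broadcast WebSocket server that tracks connected clients in memory is straightforward on a long-running process. A minimal sketch using Bun’s built-in WebSocket support:

import { serve, type ServerWebSocket } from "bun";

// In-memory set of connected clients; valid because the process is long-lived
const clients = new Set<ServerWebSocket<undefined>>();

serve({
  port: 3000,
  fetch(req, server) {
    // Upgrade incoming HTTP requests to WebSocket connections
    if (server.upgrade(req)) return;
    return new Response("Expected a WebSocket connection", { status: 400 });
  },
  websocket: {
    open(ws) {
      clients.add(ws);
    },
    message(ws, message) {
      // Broadcast every message to all connected clients
      for (const client of clients) client.send(message);
    },
    close(ws) {
      clients.delete(ws);
    },
  },
});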

Railway operates its own global datacenters, giving it control over hardware, network routing, and reliability. Each service runs in a chosen region, providing consistent latency and predictable performance. Railway currently offers the following regions:

  • US West, California
  • US East, Virginia
  • EU West, Amsterdam
  • Southeast Asia, Singapore

Railway regions

Services scale vertically up to the limits of your plan. On the Pro Plan, a single instance can access up to 32 vCPUs and 32 GB RAM. If an application needs more capacity, it can scale horizontally by deploying multiple replicas. Each replica gets the full resource allocation of your plan and can be placed in any region. Railway’s routing layer balances public traffic across replicas within each region.

Deploy replicas to scale your service. You can also deploy replicas across multiple regions for multi-region deployments.

To run globally, you deploy the same service across multiple regions. Railway routes incoming requests to the nearest healthy replica automatically.

This model gives you straightforward scaling without the usual cluster or autoscaler overhead.

Railway has first-class support for databases. You can one-click deploy any open-source database:

  • Relational: Postgres, MySQL
  • Analytical: ClickHouse, Timescale
  • Key-value: Redis, Dragonfly
  • Vector: Chroma, Weaviate
  • Document: MongoDB

Check out all of the different storage solutions you can deploy.

One-click deploy databases on Railway

Railway takes a general-purpose approach to storage rather than a collection of narrow serverless primitives. You have access to full databases and persistent disk that behave like traditional long-running systems.

You get strong consistency, predictable query performance, proper indexing, connection pooling, extensions, and full control over schema design. There are no artificial constraints on dataset size, table count, query complexity, or write volume. You can tune configuration files, mount custom volumes, and run additional processes alongside them. The platform does not impose database-specific limits.
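For example, a service can keep a standard connection pool open to a Railway Postgres instance for its entire lifetime. A minimal sketch using the pg driver, assuming DATABASE_URL is set in the service’s environment and a hypothetical orders table exists:

import { Pool } from "pg";

// A long-lived pool: connections are reused across requests instead of
// being re-established on every invocation
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 10,
});

export async function getOrderCount(): Promise<number> {
  const { rows } = await pool.query("SELECT count(*)::int AS count FROM orders");
  return rows[0].count;
}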

Because storage is deployed in the same infrastructure and communicates over the same network as your services, high-throughput, I/O-heavy workloads become viable. It also allows you to design processing pipelines or indexing systems that depend on predictable filesystem behavior.

Railway provides two networking modes: public and private.

  • Public networking exposes a service to the internet. Railway automatically provisions TLS certificates and supports custom domains via a simple CNAME or ALIAS record.
  • Private networking creates an internal network connecting services within the same project. APIs, workers, caches, queues, and databases can communicate without being publicly accessible. Each service receives a hostname under railway.internal (see the sketch below).
Public and Private Networking
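Over the private network, one service calls another by its internal hostname. A minimal sketch, assuming a service named api listening on port 3000:

// Runs inside another service in the same Railway project
const response = await fetch("http://api.railway.internal:3000/health");

if (!response.ok) {
  throw new Error(`API health check failed: ${response.status}`);
}

console.log(await response.json());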

Railway also provides a TCP proxy for workloads that rely on non-HTTP protocols. A service can expose HTTP and TCP simultaneously when needed. This makes it possible to run:

  • WebSockets and real-time systems
  • gRPC servers
  • Custom binary protocols
  • Message brokers or queue processors

Railway does not alter or filter traffic. Your container receives requests exactly as they were sent.

Because Railway charges for active CPU and memory, long-running workloads behave like servers but are billed like serverless. CPU and memory are metered at one-minute resolution. When your service is idle and not consuming CPU, you pay essentially nothing. When it spikes, you pay only for the additional usage.

Railway’s usage-based pricing

You get the execution characteristics of a persistent server with the economic characteristics of on-demand compute.

Railway and Cloudflare solve different parts of the stack, and many applications benefit from using both. Railway provides long-lived containers, persistent storage, and stateful compute. Cloudflare provides global ingress, caching, DNS, and programmable logic at the edge. When paired, you get stable regional backends with reliable global reach.

A common pattern is to run the core of your application on Railway while putting Cloudflare in front as the global entry point. Railway handles APIs, background workers, databases, and any workload that needs sustained compute or state. Cloudflare handles everything that needs to run close to users: DNS, CDN, edge routing, caching, simple request logic, security filtering, and path rewriting.

Workers are a good fit for lightweight pre-processing. They can validate requests, apply rate limits, add or remove headers, handle redirects, filter traffic, or respond from cache. When the logic becomes heavy, involves state, or requires sustained CPU or memory, the request is forwarded to a Railway service where the application has full control of runtime and storage.
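A minimal sketch of that hand-off, assuming the Railway backend is reachable at api.example.com (a hypothetical domain):

export default {
  async fetch(request: Request): Promise<Response> {
    // Cheap edge checks before the request ever reaches the origin
    if (!request.headers.get("Authorization")) {
      return new Response("Unauthorized", { status: 401 });
    }

    // Rewrite the hostname and forward everything else unchanged to Railway
    const url = new URL(request.url);
    url.hostname = "api.example.com";
    return fetch(new Request(url.toString(), request));
  }
};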

If Workers need to talk directly to a Railway-hosted database, Hyperdrive can help reduce cross-region latency. Hyperdrive is not a database, but a routing layer that accelerates connections from the edge to your existing Postgres instance. This keeps your database on Railway while giving Workers faster access for read-heavy or latency-sensitive paths.
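A rough sketch of that path, assuming a Hyperdrive binding named HYPERDRIVE that points at the Railway Postgres instance and the postgres.js driver:

import postgres from "postgres";

interface Env {
  HYPERDRIVE: Hyperdrive;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Hyperdrive hands the Worker a pooled connection string to the origin database
    const sql = postgres(env.HYPERDRIVE.connectionString);
    const [row] = await sql`SELECT now() AS current_time`;
    return Response.json(row);
  }
};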

This split keeps each platform focused on its strengths. Cloudflare handles global entry points. Railway handles the application logic.

A straightforward arrangement looks like this:

  • Static assets hosted on Cloudflare
  • The main backend running on Railway
  • Workers acting as an edge layer for routing or validation
  • AI, ML inference, or batch compute running inside Railway containers
  • Event ingestion through Cloudflare Queues, processed by a Railway worker service

This pairing keeps global traffic fast and resilient while keeping your backend simple and stateful. Cloudflare gives you worldwide reach and protection. Railway gives you predictable compute, persistent storage, and the freedom to design your application without the constraints of a serverless execution environment.

Serverless platforms and long-running servers represent two fundamentally distinct ways to run applications. Serverless removes infrastructure from the developer’s view and reduces operational responsibility, but the model imposes limits on execution time, memory, state management, and connection lifetime. For workloads that follow a request–response pattern, or for logic that benefits from being close to users, serverless is an efficient option. Once an application requires continuous processing, stable in-memory state, or heavier compute, the model becomes restrictive.

Long-running servers provide a stable process that stays alive, holds state, and supports any runtime or protocol that fits inside a container. This makes them suitable for backends, schedulers, real-time systems, AI inference, data processing, and anything that depends on predictable performance. The cost has traditionally been operational overhead and the need to make capacity decisions early.

Railway removes much of that overhead. It provides the flexibility of long-running compute while preserving a managed, usage-based experience. You deploy your application, and Railway handles building, running, scaling, routing, and providing observability. You retain the ability to run stateful services, background workers, persistent storage, and custom protocols, while paying only for the resources your application actually uses.

The right choice ultimately depends on your application’s needs. If the workloads fit within the constraints of serverless, platforms like Cloudflare Workers and Vercel work well. When the workloads require durable state, long-running computation, or tight control over the execution environment, long-running servers become the natural choice.

Railway provides a practical middle path that avoids the constraints of serverless without reintroducing the operational burden that usually comes with servers. It supports the architecture your application requires rather than imposing an architecture your application must work around. You can also combine Railway with a platform like Cloudflare to get the best of both worlds.