Connection Pooling for SSE Servers

Part of Backend Stream Generation & Connection Management.

SSE streams are long-lived: a single browser tab can hold a connection open for hours. At a few dozen concurrent users that is trivial. At ten thousand it means ten thousand simultaneous open TCP sockets, each consuming a file descriptor, a kernel socket buffer (~87 KB receive + ~87 KB send by default on Linux), and some application-layer state. Naively treating each request as an independent, short-lived transaction collapses the server in three predictable ways: OS file-descriptor exhaustion, thread-pool starvation, and repeated TLS handshake cost. Connection pooling for SSE is the discipline of sharing and reusing transport-layer resources across logical event streams so that none of these ceilings are hit in production.

[Figure: SSE Connection Pool Architecture. Three layers: clients on the left connect through a reverse proxy (nginx / Caddy, TLS termination, proxy buffering off for SSE) to a connection pool manager in the middle, which maps logical streams to a limited set of pooled TCP sockets on the right connected to the event bus (Redis Pub/Sub, Kafka consumer, or in-process channel).]
N client connections fan into a pool manager that enforces FD budgets, heartbeats, and circuit breaking before fanning out to M upstream event-bus sockets (M << N).

How Connection Pooling Works for SSE

A standard HTTP/1.1 SSE connection occupies one TCP socket per client for its entire lifetime. The OS represents each socket as a file descriptor; the default ulimit -n on Linux is 1024 per process. Raise it and you still need per-socket kernel buffers, per-socket epoll registrations, and—if you are using a threaded model—one thread or green thread per connection.

Connection pooling addresses this at two distinct levels:

Transport pooling (proxy → application): The reverse proxy (nginx, Caddy, HAProxy) maintains a pool of persistent HTTP/1.1 or HTTP/2 connections to your application process. Each client gets its own downstream TCP socket to the proxy. Over HTTP/1.1, every active SSE stream still occupies one upstream connection for as long as it is open, so the keep-alive pool mainly saves connection setup when clients reconnect. Where the proxy supports HTTP/2 to the upstream, many logical streams can share a single TCP connection.

Application-level connection registry: Inside your application, you maintain a registry—often a Map<clientId, WritableStream>—that maps each logical subscriber to its response handle. The underlying I/O event loop (Node.js libuv, Go’s netpoll, Python’s asyncio selector) already handles many sockets on a single OS thread. The “pool” here is the controlled, observable set of active handles with enforced limits.

The key invariant: the number of upstream event-source connections (to Redis, Kafka, or your own service) must remain constant regardless of how many clients connect. Issuing a separate Redis SUBSCRIBE for every new client costs one Redis connection per client. Instead, a single subscriber per channel fans out to all registered clients: the Redis Pub/Sub fan-out pattern.
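The fan-out invariant can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's implementation: `FanOut`, `subscribeUpstream`, and the in-memory `bus` are hypothetical names, and a real deployment would pass a function that wraps a single shared Redis subscriber.

```javascript
// Hypothetical sketch: one upstream subscription per channel, shared by all clients.
class FanOut {
  #channels = new Map();            // channel -> { listeners: Set<fn>, unsubscribe: fn }
  #subscribeUpstream;

  // subscribeUpstream(channel, onMessage) must return an unsubscribe function.
  constructor(subscribeUpstream) {
    this.#subscribeUpstream = subscribeUpstream;
  }

  /** Attach a client listener; returns a function that detaches it. */
  attach(channel, listener) {
    let entry = this.#channels.get(channel);
    if (!entry) {
      // First client on this channel: open the single upstream subscription.
      const listeners = new Set();
      const unsubscribe = this.#subscribeUpstream(channel, (msg) => {
        for (const fn of listeners) fn(msg);
      });
      entry = { listeners, unsubscribe };
      this.#channels.set(channel, entry);
    }
    entry.listeners.add(listener);

    let detached = false;
    return () => {
      if (detached) return;         // make detach idempotent
      detached = true;
      entry.listeners.delete(listener);
      if (entry.listeners.size === 0) {
        // Last client left: release the upstream subscription.
        entry.unsubscribe();
        this.#channels.delete(channel);
      }
    };
  }
}

// In-memory stand-in for the event bus that counts upstream subscriptions.
const bus = { handlers: new Map(), subscriptions: 0 };
const fan = new FanOut((channel, onMessage) => {
  bus.subscriptions += 1;
  bus.handlers.set(channel, onMessage);
  return () => { bus.subscriptions -= 1; bus.handlers.delete(channel); };
});

const received = [];
const detachA = fan.attach('prices', (m) => received.push(`A:${m}`));
const detachB = fan.attach('prices', (m) => received.push(`B:${m}`));
bus.handlers.get('prices')('tick');   // simulate one message arriving from the bus
console.log(bus.subscriptions, received);
```

Two clients share one upstream subscription here, and the subscription is released only when the last client detaches. Reference counting per channel is one possible design; keeping subscriptions open permanently is simpler when the set of channels is small and fixed.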

Wire-Level Anatomy

Every SSE connection begins with a standard HTTP request (Last-Event-ID appears only when the client is reconnecting):

GET /api/events HTTP/1.1
Host: api.example.com
Accept: text/event-stream
Cache-Control: no-cache
Last-Event-ID: 1718900000042
Connection: keep-alive

The server responds with 200 OK and never closes the body:

HTTP/1.1 200 OK
Content-Type: text/event-stream; charset=utf-8
Cache-Control: no-cache
X-Accel-Buffering: no
Connection: keep-alive
Transfer-Encoding: chunked

: keepalive\n\n
id: 1718900000043\n
event: update\n
data: {"price":142.07}\n\n

The connection stays in ESTABLISHED state indefinitely. The pool manager’s job is to keep track of every such socket, enforce a per-process ceiling, and detect silent drops before the client’s EventSource reconnect timer fires.
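To make the framing above concrete, here is a small parser sketch that can be used in tests to check what a server actually wrote. `parseSSE` is a hypothetical helper, it assumes `\n` line endings and a complete input string, and it returns comment-only blocks as frames so heartbeats stay visible (a browser's EventSource would not dispatch an event for them).

```javascript
// Minimal SSE frame parser for tests: handles comments, id, event, and data fields.
function parseSSE(text) {
  const parsed = [];
  for (const block of text.split('\n\n')) {
    if (block === '') continue;
    const frame = { id: null, event: 'message', data: [], comments: [] };
    for (const line of block.split('\n')) {
      if (line.startsWith(':')) {
        frame.comments.push(line.slice(1).trim());
        continue;
      }
      const sep = line.indexOf(':');
      const field = sep === -1 ? line : line.slice(0, sep);
      let value = sep === -1 ? '' : line.slice(sep + 1);
      // A single leading space after the colon is not part of the value.
      if (value.startsWith(' ')) value = value.slice(1);
      if (field === 'id') frame.id = value;
      else if (field === 'event') frame.event = value;
      else if (field === 'data') frame.data.push(value);
    }
    // Multiple data lines are joined with newlines.
    parsed.push({ ...frame, data: frame.data.join('\n') });
  }
  return parsed;
}

const wire = ': keepalive\n\nid: 1718900000043\nevent: update\ndata: {"price":142.07}\n\n';
const frames = parseSSE(wire);
console.log(frames);
```

The first frame carries only the `keepalive` comment; the second carries the `id`, `event`, and `data` fields from the response shown above.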

Server-Side Implementation: Node.js Connection Registry

Node.js is a common SSE runtime because its event loop is non-blocking by design. The pattern below implements a bounded connection registry with heartbeats, Last-Event-ID tracking, and graceful drain.

// sse-pool.js — production connection registry for Node.js SSE
import { randomUUID } from 'crypto';

const MAX_CONNECTIONS = parseInt(process.env.SSE_MAX_CONN ?? '8000', 10);
const HEARTBEAT_MS    = 20_000;  // 20 s — under most proxy idle timeouts
const DRAIN_TIMEOUT   = 5_000;   // 5 s to flush on shutdown

class SSEPool {
  #clients = new Map();          // clientId → { res, channel, lastEventId }
  #heartbeatTimer = null;

  constructor() {
    this.#heartbeatTimer = setInterval(() => this.#sendHeartbeats(), HEARTBEAT_MS);
    // Prevent the timer from blocking process exit
    this.#heartbeatTimer.unref();
  }

  /** Register a new SSE client; returns clientId or throws if pool is full. */
  add(res, { channel = 'default', lastEventId = null } = {}) {
    if (this.#clients.size >= MAX_CONNECTIONS) {
      res.writeHead(503, {
        'Retry-After': '30',
        'Content-Type': 'text/plain',
      });
      res.end('Connection pool exhausted. Retry in 30s.');
      throw new Error('SSE pool full');
    }

    const id = randomUUID();

    // Standard SSE headers — disable every caching layer
    res.writeHead(200, {
      'Content-Type':      'text/event-stream; charset=utf-8',
      'Cache-Control':     'no-cache',
      'X-Accel-Buffering': 'no',        // nginx: disable proxy_buffering for this response
      'Connection':        'keep-alive',
    });
    res.flushHeaders();                  // send headers immediately; don't wait for body

    // Disable Nagle — SSE frames are tiny, latency matters
    res.socket?.setNoDelay(true);

    this.#clients.set(id, { res, channel, lastEventId });

    // Remove client on disconnect (TCP RST, tab close, network drop)
    res.on('close', () => this.remove(id));
    res.on('error', () => this.remove(id));

    return id;
  }

  /** Remove a client and destroy its response handle. */
  remove(id) {
    const client = this.#clients.get(id);
    if (!client) return;
    this.#clients.delete(id);
    try { client.res.destroy(); } catch (_) { /* already gone */ }
  }

  /** Broadcast a formatted SSE event to all clients on a channel. */
  broadcast(channel, { id, event, data }) {
    const payload = this.#format({ id, event, data });
    for (const [clientId, client] of this.#clients) {
      if (client.channel !== channel) continue;
      const ok = client.res.write(payload);
      if (!ok) {
        // Back-pressure: Node's write buffer for this socket is past its
        // high-water mark. Evict the slow consumer rather than buffering without bound.
        this.remove(clientId);
      } else if (id) {
        client.lastEventId = id;         // track for reconnect
      }
    }
  }

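  #format({ id, event, data }) {
    const text = typeof data === 'string' ? data : JSON.stringify(data ?? null);
    let msg = '';
    if (id)    msg += `id: ${id}\n`;
    if (event) msg += `event: ${event}\n`;
    // Each line of a multi-line payload needs its own "data:" prefix,
    // otherwise an embedded newline would end the frame early.
    for (const line of text.split('\n')) msg += `data: ${line}\n`;
    return msg + '\n';
  }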

  #sendHeartbeats() {
    const comment = ': keepalive\n\n';
    for (const [id, { res }] of this.#clients) {
      const ok = res.write(comment);     // `res` is the destructured response handle
      if (!ok) this.remove(id);
    }
  }

  /** Graceful shutdown: notify clients, then drain within timeout. */
  async drain() {
    const goodbye = 'event: close\ndata: server_shutdown\n\n';
    for (const { res } of this.#clients.values()) {
      try { res.write(goodbye); } catch (_) { /* ignore */ }
    }
    await new Promise(resolve => setTimeout(resolve, DRAIN_TIMEOUT));
    for (const id of this.#clients.keys()) this.remove(id);
    clearInterval(this.#heartbeatTimer);
  }

  get size() { return this.#clients.size; }
}

export const pool = new SSEPool();

Wire it into an Express route:

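// routes/events.js
import express from 'express';
import { pool } from './sse-pool.js';
import { redisSubscriber } from './redis.js';   // single shared subscriber

const app = express();

app.get('/api/events', (req, res) => {
  const channel     = req.query.channel ?? 'default';
  const lastEventId = req.headers['last-event-id'] ?? null;

  try {
    pool.add(res, { channel, lastEventId });
  } catch (err) {
    // pool.add() has already sent the 503 response; nothing more to do
    return;
  }
  // No further work in this handler: events arrive via redisSubscriber.on('message')
});

// One shared subscription feeds every client
// (assumes an ioredis-style 'message' event with (channel, message) arguments)
redisSubscriber.on('message', (channel, message) => {
  pool.broadcast(channel, { data: message });
});

const server = app.listen(3000);

// Separate process.on('SIGTERM') handler
process.on('SIGTERM', async () => {
  server.close();          // stop accepting new connections
  await pool.drain();      // flush in-flight SSE frames
  process.exit(0);
});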

Metric Exposure

Expose pool depth to your observability stack:

app.get('/metrics', (req, res) => {
  res.set('Content-Type', 'text/plain');
  res.send([
    `# HELP sse_active_connections Number of live SSE clients`,
    `# TYPE sse_active_connections gauge`,
    `sse_active_connections ${pool.size}`,
  ].join('\n') + '\n');   // the exposition format expects a trailing newline
});

Server-Side Implementation: Go Connection Registry

Go’s net/http http.Flusher pattern maps naturally to a pooled registry protected by a sync.RWMutex. See Go Streaming Patterns for SSE for full context; below is a pool-focused excerpt:

// pool.go
package sse

import (
    "fmt"
    "net/http"
    "sync"
    "time"
)

const (
    maxConnections = 10_000
    heartbeatEvery = 20 * time.Second
)

type client struct {
    w         http.ResponseWriter
    flusher   http.Flusher
    channel   string
    done      chan struct{}
}

type Pool struct {
    mu      sync.RWMutex
    clients map[string]*client  // uuid → client
}

func NewPool() *Pool {
    p := &Pool{clients: make(map[string]*client)}
    go p.heartbeatLoop()
    return p
}

// Add registers a client. The caller must hold the HTTP handler goroutine
// open (block on <-client.done) so the connection stays alive.
func (p *Pool) Add(id string, w http.ResponseWriter, channel string) (*client, error) {
    p.mu.Lock()
    defer p.mu.Unlock()

    if len(p.clients) >= maxConnections {
        // Headers must be set before http.Error writes the status line
        w.Header().Set("Retry-After", "30")
        http.Error(w, "pool full", http.StatusServiceUnavailable)
        return nil, fmt.Errorf("pool full")
    }

    flusher, ok := w.(http.Flusher)
    if !ok {
        return nil, fmt.Errorf("streaming not supported")
    }

    // Mandatory SSE headers
    h := w.Header()
    h.Set("Content-Type",      "text/event-stream; charset=utf-8")
    h.Set("Cache-Control",     "no-cache")
    h.Set("X-Accel-Buffering", "no")
    h.Set("Connection",        "keep-alive")
    w.WriteHeader(http.StatusOK)
    flusher.Flush()

    c := &client{w: w, flusher: flusher, channel: channel, done: make(chan struct{})}
    p.clients[id] = c
    return c, nil
}

func (p *Pool) Remove(id string) {
    p.mu.Lock()
    c, ok := p.clients[id]
    if ok {
        delete(p.clients, id)
        close(c.done)
    }
    p.mu.Unlock()
}

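// Broadcast writes payload to every client on channel; an empty channel
// addresses all clients. It takes the write lock because an
// http.ResponseWriter is not safe for concurrent writes.
func (p *Pool) Broadcast(channel, payload string) {
    p.mu.Lock()
    defer p.mu.Unlock()
    for _, c := range p.clients {
        if channel != "" && c.channel != channel {
            continue
        }
        fmt.Fprint(c.w, payload)
        c.flusher.Flush()
    }
}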

func (p *Pool) heartbeatLoop() {
    ticker := time.NewTicker(heartbeatEvery)
    defer ticker.Stop()
    for range ticker.C {
        p.Broadcast("", ": keepalive\n\n") // empty channel = all clients
    }
}

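The handler blocks in a select on req.Context().Done() and c.done. When the client disconnects, the request context is cancelled and the handler calls pool.Remove(id), which closes c.done and removes the client from the registry before the handler returns.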

Edge Cases and Network Interference

SSE connections traverse multiple network hops, each of which can silently destroy a long-lived stream.

| Layer | Failure mode | Mitigation |
| --- | --- | --- |
| Corporate HTTP proxy | Strips Connection: keep-alive, enforces 60 s idle timeout | Heartbeat every 20 s; retry: directive ≤ 5000 ms |
| nginx (default config) | proxy_buffering on holds body until buffer fills | proxy_buffering off; proxy_cache off; |
| AWS ALB | 60 s idle timeout (configurable to 4000 s) | Raise to 3600 s in the load balancer attributes; heartbeat < 60 s |
| Cloudflare (free) | 100 s HTTP response timeout | Upgrade to Pro/Business; or use Durable Objects streaming |
| CDN edge cache | May cache 200 text/event-stream responses | Cache-Control: no-store, no-cache; Surrogate-Control: no-store |
| IPv6 NAT64 gateway | May re-sequence TCP segments, breaking chunked boundaries | Enforce Transfer-Encoding: chunked at the app layer; verify with curl |
| TLS termination proxy | X-Forwarded-Proto mismatch causes mixed-content block | Terminate TLS at proxy, set Strict-Transport-Security |

nginx Configuration for SSE Pools

upstream sse_backend {
    keepalive 64;          # cache up to 64 idle keep-alive connections to the app per worker
    server 127.0.0.1:3000;
}

server {
    listen 443 ssl http2;

    location /api/events {
        proxy_pass         http://sse_backend;
        proxy_http_version 1.1;                # required for keep-alive
        proxy_set_header   Connection "";       # clear hop-by-hop for upstream pool
        proxy_set_header   Host $host;

        # Disable ALL buffering for SSE
        proxy_buffering    off;
        proxy_cache        off;
        proxy_read_timeout 3600s;              # match your SSE session length
        chunked_transfer_encoding on;

        # Tell any downstream nginx layer not to buffer
        # (X-Accel-Buffering is an nginx-specific header; other proxies and CDNs may ignore it)
        add_header X-Accel-Buffering no always;
    }
}

File-Descriptor Budget

SSE servers exhaust file descriptors long before they exhaust CPU. Each connection consumes 1 FD minimum; with TLS and logging pipes, expect 3–5 FDs per client. The detailed process is covered in Tuning File-Descriptor Limits for SSE Connection Pools, but the fast version:

# Raise soft and hard limits for the SSE process user
echo "www-data soft nofile 65535" >> /etc/security/limits.conf
echo "www-data hard nofile 65535" >> /etc/security/limits.conf

# For systemd services, override in the unit file:
# [Service]
# LimitNOFILE=65535

# Verify at runtime:
grep 'open files' /proc/$(pgrep -of 'node server.js')/limits   # -o picks the oldest matching PID

And the matching kernel parameters:

# /etc/sysctl.d/99-sse.conf (comments kept on their own lines; trailing comments may not parse)
# system-wide FD ceiling
fs.file-max = 2097152
# listen backlog
net.core.somaxconn = 65535
# reuse TIME_WAIT sockets for new outbound connections (e.g. proxy → app)
net.ipv4.tcp_tw_reuse = 1
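The FD budget is simple arithmetic, sketched below so it can be checked before a deploy. `requiredNofile` and `fitsLimit` are hypothetical helpers; the default of 3 FDs per client is the low end of the 3–5 range quoted above, and the `baselineFds` allowance for log files, listeners, and upstream sockets is an assumption.

```javascript
// Hypothetical FD budget check: clients × FDs-per-client plus a fixed baseline.
function requiredNofile({ peakClients, fdsPerClient = 3, baselineFds = 256 }) {
  return peakClients * fdsPerClient + baselineFds;
}

// True when the budget fits under the configured nofile limit.
function fitsLimit(params, limitNofile) {
  return requiredNofile(params) <= limitNofile;
}

const need = requiredNofile({ peakClients: 10000 });
const fits = fitsLimit({ peakClients: 10000 }, 65535);
const tooTight = fitsLimit({ peakClients: 20000, fdsPerClient: 5 }, 65535);
console.log({ need, fits, tooTight });
```

With these assumptions, 10,000 clients need 30,256 descriptors and fit under a 65535 limit, while 20,000 clients at 5 FDs each do not.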

Performance and Scale Considerations

Connection Count vs. Memory

A Node.js process idle-holding 10,000 SSE connections uses roughly:

| Resource | Per-connection cost | 10,000 connections |
| --- | --- | --- |
| OS socket buffers (default) | ~174 KB (87 KB rx + 87 KB tx) | ~1.7 GB kernel memory |
| Node.js net.Socket object | ~4–8 KB heap | ~40–80 MB heap |
| TLS session state (if applicable) | ~8–16 KB | ~80–160 MB |
| Application registry entry | ~0.5–1 KB | ~5–10 MB |
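The totals in the table follow from multiplying each per-connection cost by the connection count. The sketch below reproduces that arithmetic using the upper-bound figures and decimal units (1 MB = 1000 KB); `PER_CONN_KB` and `estimateMemoryMB` are hypothetical names, and real usage depends on kernel settings and workload.

```javascript
// Upper-bound per-connection costs in KB, taken from the table above.
const PER_CONN_KB = { socketBuffers: 174, netSocket: 8, tlsState: 16, registry: 1 };

// Returns an estimate in MB per resource, plus a total.
function estimateMemoryMB(connections, costs = PER_CONN_KB) {
  const estimate = {};
  let total = 0;
  for (const [name, kb] of Object.entries(costs)) {
    estimate[name] = (kb * connections) / 1000;
    total += estimate[name];
  }
  estimate.total = total;
  return estimate;
}

const at10k = estimateMemoryMB(10000);
console.log(at10k);
```

At 10,000 connections the socket buffers dominate the estimate, which is why the buffer tuning that follows has the largest effect.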

Reduce kernel buffer consumption:

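# Halve default socket buffers; fine for SSE (mostly server→client)
sysctl -w net.core.rmem_default=43690
sysctl -w net.core.wmem_default=43690
# Note: TCP sockets take their defaults from the middle value of
# net.ipv4.tcp_rmem / net.ipv4.tcp_wmem, so check and tune those as well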

Event-Loop Saturation (Node.js)

Broadcasting to 10,000 clients in a single synchronous loop occupies the event loop for the duration of all res.write() calls. Split the broadcast into batches and yield to the event loop between them with setImmediate:

async function broadcastBatched(clients, payload, batchSize = 500) {
  const entries = [...clients.entries()];
  for (let i = 0; i < entries.length; i += batchSize) {
    const batch = entries.slice(i, i + batchSize);
    for (const [id, { res }] of batch) {
      if (res.destroyed) continue;               // client left since the snapshot was taken
      if (!res.write(payload)) pool.remove(id);  // back-pressure eviction
    }
    // Yield to the event loop between batches
    await new Promise(resolve => setImmediate(resolve));
  }
}

Back-Pressure and Slow Consumers

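When res.write() returns false, Node's internal write buffer for that socket has passed its high-water mark. This usually means the kernel send buffer is full because the client is consuming events more slowly than they arrive. Options, in order of preference: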

  1. Evict immediately — simplest; the EventSource reconnect on the client handles recovery. Best for non-critical streams.
  2. Drop events with a skip counter — emit event: skip\ndata: {"n":42}\n\n and continue. Good for dashboard metrics.
  3. Buffer with a bounded queue — hold up to N events per client; evict if queue exceeds the limit. Adds memory pressure.
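Options 2 and 3 can be combined in a small per-client queue, sketched below. `BoundedEventQueue` is a hypothetical class, and dropping the oldest frame when the queue is full is an assumption; evicting the client at that point, as option 3 describes, is an equally valid policy.

```javascript
// Hypothetical per-client queue: bounded depth plus a skip counter.
class BoundedEventQueue {
  #items = [];
  #skipped = 0;

  constructor(maxDepth = 100) {
    this.maxDepth = maxDepth;
  }

  /** Queue a formatted frame; when full, drop the oldest frame and count it. */
  push(frame) {
    if (this.#items.length >= this.maxDepth) {
      this.#items.shift();
      this.#skipped += 1;
    }
    this.#items.push(frame);
  }

  /** Return the frames to write, led by a skip notice if any were dropped. */
  flush() {
    const out = [];
    if (this.#skipped > 0) {
      out.push(`event: skip\ndata: {"n":${this.#skipped}}\n\n`);
      this.#skipped = 0;
    }
    out.push(...this.#items);
    this.#items = [];
    return out;
  }

  get depth() {
    return this.#items.length;
  }
}

const queue = new BoundedEventQueue(2);
queue.push('data: 1\n\n');
queue.push('data: 2\n\n');
queue.push('data: 3\n\n');        // queue is full: "data: 1" is dropped and counted
const flushed = queue.flush();
console.log(flushed);
```

In a pool, `push()` would be called while `res.write()` reports back-pressure, and `flush()` from the response's `'drain'` event handler. The skip notice tells the client how many frames it missed so a dashboard can refetch state if needed.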

The interaction between pool-level back-pressure and token-bucket rate limiting is covered in Rate Limiting & Backpressure Handling.

Worker / Thread Scaling

For CPU-bound workloads or very high connection counts, run multiple Node.js worker processes (the cluster module below) or partition the registry across goroutines in Go, with each worker owning a shard of the connection registry:

// cluster.js — shard pool by worker ID
import cluster from 'cluster';
import os from 'os';

const WORKERS = os.cpus().length;   // or a fixed number like 4

if (cluster.isPrimary) {
  for (let i = 0; i < WORKERS; i++) cluster.fork();
  cluster.on('exit', (worker) => {
    console.error(`Worker ${worker.process.pid} died; restarting`);
    cluster.fork();
  });
} else {
  // Each worker runs an independent SSE pool shard
  import('./server.js');
}

With Node.js cluster, each worker holds its own OS connections in a shared-nothing model. For fan-out across workers and nodes, use Redis Pub/Sub as the event bus so all shards receive the same events.

Validation and Debugging

Load Test the Pool Ceiling

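# Hold 1000 concurrent SSE connections open for 60 s with wrk.
# wrk is not event-stream aware: SSE responses never complete, so ignore the
# request and latency figures and use this only to keep sockets open.
wrk -t4 -c1000 -d60s --latency \
  -H "Accept: text/event-stream" \
  http://localhost:3000/api/events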

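# Simpler: bash loop with curl (count FDs after)
for i in $(seq 1 200); do
  curl -sN -H "Accept: text/event-stream" http://localhost:3000/api/events > /dev/null &
done
sleep 2                                    # let the connections establish
ls /proc/$(pgrep -of 'node')/fd | wc -l     # -o picks the oldest matching process
kill $(jobs -p)                            # close the 200 test streams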

Confirm Heartbeats Arrive

# One-liner: watch for keepalive comments in the stream
curl -sN -H "Accept: text/event-stream" http://localhost:3000/api/events \
  | grep --line-buffered '^:'
# Expected output every ~20 s:
# : keepalive

Check Proxy Buffering is Disabled

# If the first event is delayed, proxy buffering is active.
# Test by timing to first byte. --max-time is needed because the stream
# never ends on its own; curl still prints the -w output when it times out.
curl -sN --max-time 5 -o /dev/null -w "TTFB: %{time_starttransfer}s\n" \
  -H "Accept: text/event-stream" https://api.example.com/api/events
# Should be < 500 ms if headers flush immediately

Structured Logging for Pool Events

// Emit JSON log lines that your log aggregator can count/alert on
const logPoolEvent = (event, clientId, extra = {}) => {
  process.stdout.write(JSON.stringify({
    ts:       new Date().toISOString(),
    event,                              // 'connect' | 'disconnect' | 'evict' | 'full'
    clientId,
    poolSize: pool.size,
    ...extra,
  }) + '\n');
};

Alert thresholds to configure in your APM:

  • poolSize > 0.90 * MAX_CONNECTIONS → warn (circuit breaker will fire at 100%)
  • evict rate > 10/min → slow consumer problem; investigate back-pressure handling
  • any full events, at any rate → either scale horizontally or raise SSE_MAX_CONN

DevTools Verification

  1. Open Chrome DevTools → Network tab → filter by EventStream.
  2. Select the SSE request; the EventStream sub-tab shows each id, event, data field.
  3. Check Timing → “Waiting (TTFB)” should be < 500 ms; “Content Download” should grow steadily (not spike).
  4. In Firefox, about:networking#http shows active persistent connections; count them to verify proxy pooling is working.

⚡ Production Directives

  • Set SSE_MAX_CONN and respond with 503 + Retry-After: 30 when the pool is full — never let the server silently drop frames under overload.
  • Send a heartbeat comment (: keepalive\n\n) every 20 s and set retry: 3000\n\n so clients reconnect in 3 s after a silent drop.
  • Disable proxy buffering at every layer: proxy_buffering off in nginx, X-Accel-Buffering: no response header, and raise ALB/Cloudflare idle timeouts to at least 3600 s.
  • Tune OS FD limits to at least 3× your expected peak connection count before deploying (LimitNOFILE=65535 in the systemd unit).
  • Emit structured pool metrics (poolSize, evict rate) and alert at 90% capacity to give time for horizontal scaling before the circuit breaker fires.

Frequently Asked Questions

What is the practical maximum number of SSE connections per Node.js process?

With default kernel socket buffers (~174 KB each) and a 16 GB server, socket buffers alone would consume all available memory at around 90,000 connections (90,000 × 174 KB ≈ 15.7 GB). In practice, Node.js heap and V8 GC pressure typically cap you at 20,000–30,000 active connections per process before latency degrades. Run Node.js with --max-old-space-size=4096 and use Node's cluster module to shard across CPU cores. Each core can own ~5,000–10,000 connections cleanly.

Do I need a connection pool if I'm using HTTP/2?

HTTP/2 multiplexes many logical streams over a single TCP connection between client and server, which helps at the transport level. However, your application still needs a connection registry to track active subscribers, enforce per-tenant limits, send heartbeats, and handle graceful drain. The FD pressure is reduced (one TCP socket may carry many streams), but the application-layer pool remains necessary.
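As one example of what the application-layer registry still has to do under HTTP/2, a per-tenant cap can be sketched as below. `TenantLimiter` and the tenant ID `'acme'` are hypothetical; how a tenant is derived from a request (API key, JWT claim, subdomain) is left out.

```javascript
// Hypothetical per-tenant connection cap, layered on top of the global pool limit.
class TenantLimiter {
  #counts = new Map();              // tenantId -> open connection count

  constructor(maxPerTenant) {
    this.maxPerTenant = maxPerTenant;
  }

  /** Reserve a slot; returns false when the tenant is already at its cap. */
  acquire(tenantId) {
    const n = this.#counts.get(tenantId) ?? 0;
    if (n >= this.maxPerTenant) return false;
    this.#counts.set(tenantId, n + 1);
    return true;
  }

  /** Release a slot when the stream closes. */
  release(tenantId) {
    const n = this.#counts.get(tenantId) ?? 0;
    if (n <= 1) this.#counts.delete(tenantId);
    else this.#counts.set(tenantId, n - 1);
  }

  count(tenantId) {
    return this.#counts.get(tenantId) ?? 0;
  }
}

const limiter = new TenantLimiter(2);
const attempts = [limiter.acquire('acme'), limiter.acquire('acme'), limiter.acquire('acme')];
limiter.release('acme');
const afterRelease = limiter.acquire('acme');
console.log(attempts, afterRelease);
```

A route would call `acquire()` before registering the stream and `release()` from the response's `'close'` handler, rejecting with 429 or 503 when `acquire()` returns false.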

Why does my SSE stream break silently after exactly 60 seconds behind AWS ALB?

AWS ALB's default idle timeout is 60 seconds. If no bytes flow on the connection for 60 s, the ALB closes it. Your SSE server may not notice right away and can keep writing to a dead socket until a write fails. Fix: raise the ALB idle timeout to 3600 s in the load balancer attributes, and send a heartbeat comment (: keepalive\n\n) every 20–30 s to keep the connection active regardless.

How do I handle the Last-Event-ID across horizontal scaling?

When a client reconnects it sends Last-Event-ID in the request headers. If the reconnect lands on a different server node, that node must be able to replay events from that ID. Store events in a Redis stream or a time-series database indexed by event ID. With a Redis stream, XRANGE mystream (<lastId> + returns the missed events; the ( prefix makes the start exclusive so the last delivered event is not sent twice (exclusive ranges require Redis 6.2 or later). The pool manager reads the header, replays missed events, then switches to live fan-out. This is covered in detail in the Idempotent Event ID Generation guide.
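The replay step can be sketched with an in-memory log. This is an illustration only: `ReplayLog` is a hypothetical class, it assumes numeric, monotonically increasing event IDs, and a per-process array does not solve the cross-node case, which needs a shared store as described above.

```javascript
// Hypothetical bounded replay log keyed by ascending numeric event ID.
class ReplayLog {
  #events = [];

  constructor(capacity = 1000) {
    this.capacity = capacity;
  }

  /** Record an event; the oldest event is dropped once capacity is exceeded. */
  append(event) {
    this.#events.push(event);
    if (this.#events.length > this.capacity) this.#events.shift();
  }

  /** Events strictly newer than lastEventId; nothing for a first-time client. */
  since(lastEventId) {
    if (lastEventId === null) return [];
    const last = Number(lastEventId);
    return this.#events.filter((e) => Number(e.id) > last);
  }
}

const replayLog = new ReplayLog(3);
for (const id of [41, 42, 43, 44]) {
  replayLog.append({ id: String(id), data: `tick ${id}` });
}
const missed = replayLog.since('42');   // client last saw event 42
const firstTime = replayLog.since(null);
console.log(missed.map((e) => e.id), firstTime.length);
```

The comparison is exclusive, so the event the client already received is not sent again. If the client's ID is older than the oldest retained event, some events are gone for good; a production version would need to detect that and tell the client to resynchronize.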

Should I evict or queue slow SSE consumers?

For most use cases (live dashboards, price tickers, notification feeds) immediate eviction is correct: the EventSource reconnects, sends Last-Event-ID, and replays from the last known-good position. Queuing adds unbounded memory risk—a client paused behind a corporate proxy may queue millions of events. Only queue when the data cannot be replayed (e.g., one-time tokens) and the queue is strictly bounded with an explicit max depth and eviction policy.
