Ports, Processes, Threads & File Descriptors — Engineering Summary#
1. File Descriptors (FDs)#
A File Descriptor is a handle the OS gives a process to represent an open resource — a file, socket, or pipe.
Every open network connection = 1 FD consumed.
FD Limits — Two Levels#
| Level | What it is | Typical default | Changeable? |
|---|---|---|---|
| Per-process soft limit | Max FDs one process can open | 1,024 (distros/containers often raise it, e.g. to 65,535) | Yes, at runtime |
| Per-process hard limit | Ceiling the soft limit can be raised to | ~524,288–1,048,576 | Yes, via config |
| System-wide limit | Total FDs across all processes on the machine | Effectively unlimited on modern Linux | Rarely a concern |
# Check current soft limit
ulimit -n
# Check hard limit
ulimit -Hn
# Raise soft limit to 1M for the current shell session
# (an unprivileged user can only raise it up to the hard limit)
ulimit -n 1000000
# Permanently raise it in /etc/security/limits.conf (takes effect on next login)
* soft nofile 1000000
* hard nofile 1000000
# Check system-wide limit
cat /proc/sys/fs/file-max
# Output: 9223372036854775807 — effectively unlimited
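The same limits can be inspected and raised from inside a process. A minimal sketch using Python's standard resource module (an unprivileged process can only raise its soft limit up to the hard limit):

import resource

# Current (soft, hard) values of RLIMIT_NOFILE for this process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft}, hard={hard}")

# Raise the soft limit to the hard limit; no root needed for this step
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print("new soft limit:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])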
Key Insight#
The oft-quoted ~65k is just a per-process soft limit (a configurable default), not a kernel ceiling. Raise it per process to 1M+. The system-wide limit on modern Linux is practically unbounded.
2. How a Server Listens on a Port#
When you start a web server, it:
- Creates a socket
- Binds it to a port (e.g., 8080)
- Calls listen() so the OS starts queueing incoming connections on that socket
- Calls accept() for each queued connection; every accepted connection consumes 1 new FD
// Node.js — single process, single port
const net = require('net');

const server = net.createServer((socket) => {
  // socket is a new FD for this connection
  socket.write('Hello!\n');
  socket.end();
});

server.listen(8080, () => {
  console.log('Listening on port 8080');
});
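The same steps in Python with accept() made explicit, since each accepted connection returns a new socket backed by its own FD (a minimal sketch, no error handling or concurrency):

import socket

# Create a TCP socket, bind it to a port, and mark it as listening
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('0.0.0.0', 8080))
server.listen(128)  # backlog of pending, not-yet-accepted connections

while True:
    conn, addr = server.accept()   # each accepted connection = 1 new FD
    print(f"client {addr}, fd={conn.fileno()}")
    conn.sendall(b'Hello!\n')
    conn.close()                   # releases that FD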
3. Multiple Processes on the Same Port — SO_REUSEPORT#
By default, only one process can bind a port. SO_REUSEPORT changes this — multiple processes can bind the same port, and the OS kernel load balances incoming connections across them.
Port 8080
├── Worker Process 1 (PID 101) ← OS sends some connections here
├── Worker Process 2 (PID 102) ← OS sends some connections here
├── Worker Process 3 (PID 103) ← OS sends some connections here
└── Worker Process 4 (PID 104) ← OS sends some connections here
import socket
import os

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Enable SO_REUSEPORT — multiple processes can each bind port 8080
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
sock.bind(('0.0.0.0', 8080))
sock.listen(1024)

# Accept connections and report which process handled them
while True:
    conn, addr = sock.accept()
    conn.sendall(f'Handled by PID {os.getpid()}\n'.encode())
    conn.close()
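The single-process version above can simply be started several times. As a self-contained sketch, the same idea with one script that forks four workers, each creating and binding its own SO_REUSEPORT socket (toy code, no shutdown handling):

import os
import socket

def serve(port=8080):
    # Each worker creates its OWN listening socket and binds the SAME port
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(('0.0.0.0', port))
    s.listen(1024)
    while True:
        conn, _ = s.accept()
        conn.sendall(f'Handled by PID {os.getpid()}\n'.encode())
        conn.close()

# Fork 4 workers; the kernel spreads new connections across their sockets
for _ in range(4):
    if os.fork() == 0:
        serve()   # child never returns

os.wait()   # keep the parent alive

Hitting the port repeatedly (e.g. curl localhost:8080 in a loop) shows different PIDs, because the kernel hashes each new connection's address/port tuple to one of the listening sockets.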
4. Master / Worker Model#
This is how Gunicorn, Nginx, and Node.js Cluster work.
Master Process
├── Opens port 8080 and creates the socket
├── Does NOT handle requests itself
├── Forks N worker processes
│ ├── Workers inherit the open socket from master via fork()
│ └── Each worker independently calls accept() on the same socket
└── Monitors workers — restarts crashed ones
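In contrast with SO_REUSEPORT, the pre-fork model shares a single listening socket. A minimal Python sketch of the idea (toy code, no restart or signal handling): the master opens the socket once, forks, and every worker calls accept() on the inherited FD.

import os
import socket

# Master: create ONE listening socket before forking
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('0.0.0.0', 8080))
listener.listen(1024)

for _ in range(4):
    if os.fork() == 0:
        # Worker: uses the socket FD inherited from the master
        while True:
            conn, _ = listener.accept()
            conn.sendall(f'Worker PID {os.getpid()}\n'.encode())
            conn.close()

# Master never accepts; it only waits on (and, in a real server, restarts) workers
for _ in range(4):
    os.wait()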
Why Multiple Workers?#
Single Process (Python/Flask with GIL):
Request A arrives → being processed
Request B arrives → WAITS (GIL blocks parallelism)
4 Worker Processes:
Request A → Worker 1 (own GIL, own CPU core)
Request B → Worker 2 (own GIL, own CPU core)
Request C → Worker 3 (own GIL, own CPU core)
→ True parallelism across CPU cores → ~4x throughput
Node.js Cluster Module#
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  console.log(`Master PID: ${process.pid}`);

  // Fork one worker per CPU core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code) => {
    console.log(`Worker ${worker.process.pid} died — restarting`);
    cluster.fork(); // auto-restart
  });
} else {
  // Each worker runs the full server code.
  // Node's cluster does not use SO_REUSEPORT by default: the master listens
  // on port 3000 and hands accepted connections to workers round-robin.
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Handled by Worker PID: ${process.pid}`);
  }).listen(3000);

  console.log(`Worker PID: ${process.pid} started`);
}
Gunicorn (Python) equivalent#
# 4 worker processes, each running your Flask app
gunicorn --workers 4 --bind 0.0.0.0:8080 app:app
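Worker count is normally tied to CPU cores. A sketch of the same setup as a config file, using gunicorn's documented (2 x cores) + 1 starting heuristic (tune for your workload):

# gunicorn.conf.py   (start with: gunicorn -c gunicorn.conf.py app:app)
import multiprocessing

bind = "0.0.0.0:8080"
# Gunicorn's suggested starting point: (2 x CPU cores) + 1 workers
workers = multiprocessing.cpu_count() * 2 + 1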
5. Each Worker = Full Copy of Server Code#
When the master forks, each worker logically gets a complete copy of the server's memory via fork().
Master loads app into memory: 200MB
Forks 4 workers:
Naive: 4 × 200MB = 800MB
Actual: ~200MB + small delta (Copy-on-Write)
Copy-on-Write (CoW)#
The OS doesn't immediately copy memory pages on fork. Workers share the master's memory pages — read-only. Only when a worker writes to a page does the OS make a private copy for that worker.
Master memory page [code: getUserById()]
├── Worker 1 reads it → shared, no copy made ✅
├── Worker 2 reads it → shared, no copy made ✅
└── Worker 1 writes to a variable → OS copies ONLY that page for Worker 1
This is why forking is cheap in Linux.
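A toy illustration of fork semantics (it shows write isolation, not the CoW page sharing itself, which you would have to observe via shared RSS): the child starts from the parent's memory, and a write after fork() affects only the writer's copy.

import os

state = {"counter": 0}   # "app state" loaded before forking

pid = os.fork()
if pid == 0:
    state["counter"] = 999   # this write triggers a private copy of the page
    print(f"child  {os.getpid()}: counter = {state['counter']}")   # 999
    os._exit(0)

os.wait()
print(f"parent {os.getpid()}: counter = {state['counter']}")       # still 0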
6. Why Workers Must Be Stateless#
Each worker is an independent process with its own memory.
Request 1 → lands on Worker 2 → stores session in Worker 2's memory
Request 2 → lands on Worker 4 → Worker 4 has NO idea about that session ❌
Solution: Never store state in process memory. Always use external state stores.
const express = require('express');
const { createClient } = require('redis');
const app = express();
app.use(express.json());
const redis = createClient();
redis.connect();

// WRONG — in-memory state, breaks with multiple workers
const sessions = {};
app.post('/login', (req, res) => {
  sessions[req.body.userId] = { loggedIn: true }; // ❌ only in this worker's memory
  res.send('OK');
});

// CORRECT — state in Redis, visible to all workers
app.post('/login', async (req, res) => {
  await redis.set(`session:${req.body.userId}`, JSON.stringify({ loggedIn: true }));
  res.send('OK');
});
7. Concurrent Connections — The Math#
Machine specs:
RAM: 8GB
Per-process FD limit (after raising): 1,000,000
Memory per SSE connection: ~50KB
FD bottleneck:
1,000,000 FDs = 1M concurrent connections per process
Memory bottleneck:
8GB / 50KB = 160,000 connections per machine
With 64GB RAM: 64GB / 50KB ≈ 1.28M connections
Real bottleneck: memory, not FDs (once the FD limit is raised)
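The same arithmetic as a quick snippet (the 50KB-per-connection figure is the assumption above; measure your own workload):

def max_connections(ram_gb, kb_per_conn=50, fd_limit=1_000_000):
    # Whichever ceiling is lower (memory or file descriptors) wins
    by_memory = ram_gb * 1_000_000 // kb_per_conn   # GB -> KB -> connections
    return min(by_memory, fd_limit)

print(max_connections(8))    # 160000  (memory-bound)
print(max_connections(64))   # 1000000 (FD-bound per process; run more processes)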
At Scale (WhatsApp model)#
10M DAU
→ 10-20% concurrent at any time = 1-2M concurrent connections
→ Dedicated connection servers (stateful)
→ Few processes per machine × 1M FDs each
→ ~2-4 machines for connections alone (theory)
→ Business logic servers: stateless, scale independently
8. Mental Model Summary#
Port 8080
└── OS Socket (shared across workers via fork() inheritance or SO_REUSEPORT)
├── Master Process (lifecycle only)
│ ├── Worker 1 (full app code, own FD table, own memory)
│ ├── Worker 2 (full app code, own FD table, own memory)
│ ├── Worker 3 (full app code, own FD table, own memory)
│ └── Worker 4 (full app code, own FD table, own memory)
│
└── OS distributes incoming connections to workers
Rule of thumb:
- Scale via fewer processes with higher FD limits — not more processes with low limits
- Process overhead (RAM) kills you before FDs do
- Always store state externally — Redis, DB — never in process memory
- Connection layer (stateful) and business logic layer (stateless) should be separate services