Networking Deep Dive — Complete Revision Notes#
Full stack: REST → Sockets → TCP → HTTP/2 → Frameworks
Table of Contents#
- REST APIs
- How One Port Serves Millions
- What is a Socket
- What is a File Descriptor
- What's Inside the Socket Struct
- Buffer Size and RAM
- How Streaming Works at Socket Level
- Streaming JSON — TypeScript and Python
- Backpressure — Slow Consumer Problem
- Circular Buffer — How OS Tracks What's Read
- TCP is Always a Stream
- REST Stateless vs TCP Connections
- HTTP Versions — 1.0, 1.1, 2, 3
- TLS vs TCP Handshake
- HTTP/2 Multiplexing
- HTTP/2 Frame Structure
- Body, Query and Path Params in HTTP/2
- socket.connect() — OS vs HTTP Client
- How a Framework Decodes Packets
- The Full Stack — End to End
1. REST APIs#
Simple intuition#
A standard way for two applications to talk over the internet using HTTP — the same protocol browsers use.
Analogy: a waiter — you don't go to the kitchen yourself. You tell the waiter (API), the kitchen (server) prepares it, waiter brings it back.
Why it exists#
Before REST, every team invented their own conventions. SOAP was complex. REST said: "reuse HTTP and add simple rules on top." Roy Fielding formalized this in his 2000 PhD dissertation.
Core components#
Resource + URL /users/42, /orders, /products
HTTP Verbs GET=read, POST=create, PUT/PATCH=update, DELETE=delete
Statelessness server remembers NOTHING between requests
every request must carry all info it needs (JWT, session token)
Trade-offs#
| Use REST when | Avoid REST when |
|---|---|
| Public APIs, CRUD ops | Real-time bidirectional (use WebSockets) |
| Wide tooling support needed | Complex nested data (use GraphQL) |
| Simplicity matters | High-performance internal services (use gRPC) |
30-second interview answer#
"REST is a set of conventions for building APIs over HTTP. Every resource has its own URL, and HTTP methods define the action. The most important constraint is statelessness — the server remembers nothing between requests, so every request is self-contained. REST became the standard because it's simple and works everywhere HTTP works. Main trade-off is over/under-fetching, which is why GraphQL was created."
Senior insights#
- Statelessness is a scaling superpower — any server behind a load balancer can handle any request because no session state is stored server-side
- Most "REST" APIs are actually just HTTP APIs — almost nobody implements HATEOAS (Fielding's full REST spec)
2. How One Port Serves Millions#
Simple intuition#
A port is just a door. The door number doesn't limit how many conversations can happen inside. What matters is the 4-tuple that uniquely identifies each connection.
The 4-tuple#
Every connection is uniquely identified by:
(Source IP, Source Port, Destination IP, Destination Port)
Example — 3 users all hitting port 443:
103.21.5.1:54231 ↔ 13.0.0.1:443 ← unique connection
203.55.2.8:61024 ↔ 13.0.0.1:443 ← unique connection
91.12.8.44:49871 ↔ 13.0.0.1:443 ← unique connection
OS keeps a hash table keyed by 4-tuple — packet arrives → nanosecond lookup → right socket found.
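You can see the 4-tuple from userspace — a minimal sketch, assuming outbound access to example.com:443:

```python
import socket

# Two connections to the same destination port get different source ports,
# so each 4-tuple is unique.
conns = [socket.create_connection(("example.com", 443)) for _ in range(2)]
for s in conns:
    print(s.getsockname(), "→", s.getpeername())
    # e.g. ('10.0.0.5', 54231) → ('93.184.216.34', 443)
    #      ('10.0.0.5', 54232) → ('93.184.216.34', 443)
for s in conns:
    s.close()
```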
What actually enables millions#
1. Non-blocking async I/O (event loop)
One thread handles thousands of idle connections
"wake me up when data arrives" instead of blocking
2. Load balancers
Distribute connections across multiple servers
3. HTTP Keep-Alive / HTTP/2 multiplexing
Reuse connections — fewer total connections needed
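A minimal sketch of item 1's event loop, using Python's stdlib selectors module (epoll on Linux, kqueue on macOS). Port 9000 is an arbitrary choice; one thread services every connection:

```python
import selectors
import socket

sel = selectors.DefaultSelector()            # epoll/kqueue under the hood
srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 9000))
srv.listen()
srv.setblocking(False)
sel.register(srv, selectors.EVENT_READ)

while True:
    for key, _ in sel.select():              # blocks until SOME socket is ready
        if key.fileobj is srv:
            conn, _addr = srv.accept()       # new client
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = key.fileobj.recv(4096)
            if data:
                key.fileobj.sendall(data)    # echo back
            else:                            # client closed
                sel.unregister(key.fileobj)
                key.fileobj.close()
```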
Real limits (not the port)#
- File descriptors per process (ulimit -n, default 1024)
- Memory — each socket ~2KB struct + buffers
- CPU — processing requests
- Bandwidth — network I/O
Senior insights#
- The C10K problem (1999) — solving 10K concurrent connections required switching from thread-per-connection to event-driven I/O (epoll). This is why Nginx crushed Apache.
- NAT breaks the assumption — corporate offices share one public IP. 65,535 ports shared across entire office. Can cause connection exhaustion from a single IP.
3. What is a Socket#
Simple intuition#
A socket is a file that represents a network connection. The OS makes it look like a file so your program can just read and write — exactly like a text file — and OS handles all network complexity underneath.
Why it was created#
Berkeley BSD Unix (1983): "What if we hide all network complexity behind something programmers already know — a file?"
The "everything is a file" philosophy#
Actual file on disk → file
Network socket → file
Keyboard input (stdin) → file
Terminal output → file
Pipe between processes → file
All have one thing in common: you read from them and write to them.
Same interface. Different things underneath.
Types of sockets#
| Type | Protocol | Analogy | Use case |
|---|---|---|---|
| Stream socket | TCP | Phone call — reliable, ordered | HTTP, SSH, databases |
| Datagram socket | UDP | Postcards — fast, no guarantee | Video calls, gaming, DNS |
| Unix domain socket | None (local only) | Note in same building | Nginx ↔ app on same machine |
Socket lifecycle#
SERVER: CLIENT:
socket() → create fd socket() → create fd
bind() → attach to port connect() → connect to server
listen() → ready to accept
accept() → new fd per client
read/write → communicate read/write → communicate
close() → done close() → done
accept() always returns a brand new socket for each client. Original listening socket stays open.
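The same lifecycle as runnable Python — a sketch of a one-connection-at-a-time echo server (port 9000 is an arbitrary choice):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # socket() → create fd
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 9000))                               # bind() → attach to port
srv.listen(128)                                           # listen() → ready to accept
while True:
    conn, addr = srv.accept()      # accept() → brand-new fd for this client
    data = conn.recv(4096)         # read
    conn.sendall(data)             # write
    conn.close()                   # close this client; srv keeps listening
```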
Senior insights#
- A socket is not a connection — it's an endpoint. It exists before the connection is made (the socket() call). The connection happens at connect() or accept().
- Slow consumers are dangerous — if your app reads slowly, the recv buffer fills and TCP flow control kicks in. One slow consumer can back-pressure the entire pipeline.
4. What is a File Descriptor#
Simple intuition#
A file descriptor is just a number that acts as a nickname for something your program has opened. OS says: "you want to work with this? I'll call it number 4. Whenever you say 4, I'll know what you mean."
The word "file" is misleading#
It doesn't mean a file on disk. It means "something you can read/write to." Unix uses one unified interface for everything.
The 3 tables#
YOUR PROCESS
File Descriptor Table (per process — just an array)
┌────┬──────────┐
│ 0 │ stdin │ ← keyboard
│ 1 │ stdout │ ← terminal
│ 2 │ stderr │ ← terminal
│ 3 │ *───────┼──→ Open File Table entry
│ 4 │ *───────┼──→ Open File Table entry (socket)
└────┴──────────┘
OS: Open File Table (shared across all processes)
tracks: current position, access mode, reference count
OS: Inode Table / Socket Table (actual resource)
for files: inode → data on disk
for sockets: struct in MEMORY only → ip, port, buffers, tcp state
Does OS create a file on disk for a socket?#
No. For sockets, OS creates a data structure in memory only. The file descriptor machinery is reused — because it already works — but nothing is written to disk.
Prove it#
ls -la /proc/$$/fd # see all open fds in your shell
# socket shows as: socket:[12345678] ← listed like a file, no disk
Why this design is brilliant#
read(fd, buffer, size) # works for file, socket, pipe, keyboard
write(fd, buffer, size) # works for everything
close(fd) # works for everything
# write code once, works everywhere
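You can see the shared fd namespace from Python — a sketch assuming a Linux-like system where /etc/hostname exists:

```python
import socket

# A disk file and a network socket get fds from the same namespace
# (exact numbers vary per run).
f = open("/etc/hostname", "rb")
s = socket.socket()
print(f.fileno(), s.fileno())   # e.g. 3 4 — plain integers, different resources
```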
Senior insights#
- File descriptor limit is a real production problem — default 1024 per process. Each open socket = 1 fd. A server handling 10K connections needs 10K+ fds. Hitting the limit causes EMFILE: too many open files. Fix: ulimit -n and /etc/security/limits.conf.
- fd inheritance is a security trap — forked child processes inherit all of the parent's open fds, including database connections. Fix: mark fds with FD_CLOEXEC.
5. What's Inside the Socket Struct#
Simple intuition#
The socket struct is a dashboard for one network connection — everything the OS needs to send, receive, and manage data for that connection.
The 5 buckets#
SOCKET STRUCT
┌─────────────────────────────────────────────┐
│ IDENTITY │
│ local_ip: 192.168.1.5 │
│ local_port: 54231 │
│ remote_ip: 142.250.80.46 │
│ remote_port: 443 │
│ protocol: TCP │
├─────────────────────────────────────────────┤
│ BUFFERS │
│ send_buffer [ data waiting to send ] │
│ recv_buffer [ data waiting to read ] │
│ ~4MB each (tunable) │
├─────────────────────────────────────────────┤
│ TCP STATE │
│ LISTEN → SYN_RECEIVED → ESTABLISHED │
│ → FIN_WAIT_1 → FIN_WAIT_2 │
│ → TIME_WAIT (2 min) → CLOSED │
├─────────────────────────────────────────────┤
│ SEQUENCE NUMBERS │
│ snd_seq, rcv_seq │
│ last_ack_sent, last_ack_received │
│ (TCP uses these for reliable delivery) │
├─────────────────────────────────────────────┤
│ TIMERS │
│ retransmit_timer (resend if no ACK) │
│ keepalive_timer (detect dead clients) │
│ timewait_timer (2 min after close) │
└─────────────────────────────────────────────┘
The buffers are the heart of the socket#
Your code calls write(fd, "Hello", 5)
→ OS copies "Hello" into SEND BUFFER
→ returns immediately ← your code doesn't wait!
→ OS sends over network in background
Packet arrives from network
→ OS puts data into RECV BUFFER
→ your app calls read(fd, buf, size)
→ OS copies from RECV BUFFER into your app's memory
Your app never touches the network directly. It just reads/writes to buffers. OS does everything else.
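You can ask the kernel for a socket's actual buffer sizes — a sketch; the printed values are OS defaults and vary by system:

```python
import socket

s = socket.socket()
print("send buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("recv buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```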
Memory cost#
In Linux kernel (struct tcp_sock): ~1.5KB–2KB per socket struct. 1 million connections = ~2GB RAM just for structs, before buffer data.
Senior insights#
- recv buffer IS the TCP flow control window — free space in recv buffer = TCP window size advertised to sender. Slow app fills buffer → window shrinks → sender automatically slows. No extra code needed.
- TIME_WAIT lasts minutes — after a connection closes, the socket stays in TIME_WAIT (60 seconds on Linux; 2×MSL per the RFC) to ensure no stale packets confuse new connections. Under heavy load, thousands of TIME_WAIT sockets consume memory.
6. Buffer Size and RAM#
The real cost of a large payload#
A 5MB payload doesn't just use 5MB of RAM — data gets copied multiple times:
Client sends 5MB
↓
Kernel recv buffer: 5MB (OS puts it here first)
↓ app calls read()
↓ OS COPIES it
App buffer: 5MB (now lives here too)
↓ framework parses
Parsed object: 5MB+ (JSON parsed, headers, metadata)
Total per request: ~15MB
With thread stack: ~20MB+
Real concurrent request count with 1GB RAM#
Not 200 requests (1GB / 5MB).
More like 50-70 requests (1GB / 15-20MB per request).
Real bottleneck order#
1. FIRST: CPU → parsing, business logic
2. SECOND: Threads → context switching
3. THIRD: RAM → buffers filling up
4. FOURTH: Bandwidth → 1Gbps NIC = 125MB/s ÷ 5MB = 25 req/s max
Production solutions#
1. Streaming → never buffer full body, process 64KB at a time
2. Reject early → check Content-Length, return 413 before reading (see sketch after this list)
3. Upload to S3 → client uploads direct to object storage
API receives just the URL (~1KB)
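A minimal sketch of option 2 (reject early) as FastAPI middleware — the 10MB limit is an arbitrary choice for illustration:

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
MAX_BODY = 10 * 1024 * 1024   # hypothetical hard limit

@app.middleware("http")
async def reject_large_bodies(request: Request, call_next):
    length = request.headers.get("content-length")
    if length and int(length) > MAX_BODY:
        # reject BEFORE reading a single body byte
        return JSONResponse({"error": "payload too large"}, status_code=413)
    return await call_next(request)
```

Still set the same limit at the reverse proxy (Nginx client_max_body_size) — defense in depth.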
Senior insights#
- Slow HTTP attack — if you don't check payload size before reading, attacker sends infinite stream exhausting recv buffers and RAM. Always set hard limits at reverse proxy (Nginx
client_max_body_size). - SO_RCVBUF tuning at scale — increasing recv buffer globally means every socket gets that size, including health check pings. 50,000 connections × 5MB buffer = 244GB reserved. Always tune per-socket.
7. How Streaming Works at Socket Level#
Core insight#
TCP doesn't have "messages" or "files". It's just bytes flowing continuously — like water through a pipe. Streaming = embracing this reality instead of waiting to collect everything first.
Non-streaming vs streaming#
NON-STREAMING:
wait for ALL data → copy to app memory → process
recv buffer holds entire payload idle
STREAMING:
chunk arrives → read immediately → process → buffer drained → TCP window opens → more arrives
recv buffer never holds more than one chunk
Memory comparison (5MB payload, 100 users)#
Without streaming: 100 × 15MB ≈ 1.5GB
With streaming: 100 × 64KB ≈ 6.4MB ← ~200x less memory
The raw read loop#
CHUNK_SIZE = 65536 # 64KB
while True:
chunk = socket.recv(CHUNK_SIZE) # give me UP TO 64KB available NOW
if not chunk:
break # connection closed
process(chunk) # handle immediately
# recv buffer freed → TCP window grows → sender can send more
recv(n) doesn't wait for exactly n bytes. It returns whatever is available now — could be 10KB, could be 64KB.
How frameworks hide this#
// Node.js — req is a readable stream
req.on('data', (chunk) => { // called every time a chunk arrives
process(chunk)
})
req.on('end', () => {
res.send('done')
})
# FastAPI
async for chunk in request.stream():
process(chunk)
Both are just the same recv() loop underneath. Framework calls recv() and emits each chunk to your handler.
Senior insights#
- TCP is always a stream — "streaming mode" doesn't exist at TCP level. It's entirely your application's choice to process chunks vs accumulate everything.
- You can stream response simultaneously while reading request — read 64KB → process → write 64KB output → repeat. Memory stays flat regardless of file size. This is how FFmpeg processes 100GB video on 512MB RAM.
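A sketch of that flat-memory relay loop — transform() is a hypothetical placeholder for whatever per-chunk processing you do:

```python
def transform(chunk: bytes) -> bytes:
    return chunk          # placeholder — real code might re-encode, compress, etc.

def relay(src, dst, chunk_size=65536):
    """Pump bytes src → dst with flat memory: one chunk in flight at a time."""
    while True:
        chunk = src.recv(chunk_size)
        if not chunk:                   # recv() returned b'' → remote closed
            break
        dst.sendall(transform(chunk))   # output streams while input still arrives
```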
8. Streaming JSON — TypeScript and Python#
When streaming JSON helps#
| Scenario | Helps? | Why |
|---|---|---|
| Flat JSON object (any size) | NO | Need full body — closing } not arrived yet |
| Large array of objects | YES | Each item is independent |
| CSV upload | YES | Each row is independent |
| NDJSON | YES | Each line is complete valid JSON |
NDJSON — designed for streaming#
{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
{"id": 3, "name": "Charlie"}
Each line = valid JSON. Used by log pipelines, Kafka, bulk APIs.
TypeScript#
npm install stream-json # large JSON arrays
npm install ndjson # newline-delimited JSON
// stream-json — large array
const { parser } = require('stream-json')
const { streamArray } = require('stream-json/streamers/StreamArray')
app.post('/users', (req, res) => {
req.pipe(parser()).pipe(streamArray())
.on('data', ({ value }) => saveToDatabase(value)) // one object at a time
.on('end', () => res.json({ done: true }))
})
Python#
pip install ijson # large JSON arrays
pip install jsonlines # NDJSON
# ijson — large array (stream = any binary file-like source, e.g. the request body)
import ijson
for obj in ijson.items(stream, 'item'):
save_to_database(obj) # full array NEVER in memory
The honest caveat#
Libraries maintain internal state for partial chunks. If an object is split across chunks:
chunk1: '{"id": 1, "name": "Ali' ← incomplete, held internally
chunk2: 'ce"}, {"id": 2...' ← completes object 1, emits it
Library does the bookkeeping. You just get complete objects out.
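For intuition, here's that bookkeeping done by hand for NDJSON — a sketch; real libraries handle edge cases this skips:

```python
import json

def iter_ndjson(chunks):
    """Buffer partial chunks; emit only complete lines as parsed objects."""
    buf = b""
    for chunk in chunks:                  # chunks may split a line anywhere
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)    # one complete object out
```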
9. Backpressure — Slow Consumer Problem#
The mechanism#
Your app slow to process
→ recv_buffer fills
→ free space shrinks
→ TCP window in ACK shrinks
→ sender slows down
→ sender's send_buffer fills
→ client's write() blocks
Slowness of YOUR app propagates all the way back to client.
No extra code. Pure buffer mechanics.
Three scenarios#
App reads fast: buffer mostly empty → window large → sender at full speed
App reads slow: buffer fills → window shrinks → sender slows automatically
App stops reading: buffer full → window = 0 → sender fully pauses
TCP connection stays ALIVE, just paused
no data lost, resumes when app reads
The timeout caveat#
TCP will wait forever. But clients have timeouts:
App slow for > timeout → client closes connection regardless
→ all that buffered data = wasted
Fix for slow processing:
request arrives → return 202 Accepted immediately
push to queue (Kafka/SQS) → process async at own pace
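A minimal sketch of the 202-and-queue pattern in FastAPI, using an in-process asyncio.Queue as a stand-in for Kafka/SQS:

```python
import asyncio
from fastapi import FastAPI

app = FastAPI()
work_queue: asyncio.Queue = asyncio.Queue(maxsize=1000)   # stand-in for Kafka/SQS

@app.post("/jobs", status_code=202)
async def enqueue(job: dict):
    await work_queue.put(job)       # a worker task drains this at its own pace
    return {"status": "accepted"}   # client unblocked immediately
```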
Senior insights#
- TCP Zero Window ([TCP ZeroWindow] in Wireshark) = receiver buffer full, sender stuck. The fix is never network tuning — it's making the consumer process faster or scaling horizontally.
- This pattern repeats at every layer — Node.js streams have pause()/resume(), Kafka has max.poll.records, gRPC has flow control tokens. All model the same TCP mechanic at the application layer.
10. Circular Buffer — How OS Tracks What's Read#
The wrong mental model#
buffer = [chunk1, chunk2, chunk3]
app reads chunk1 → buffer.remove(chunk1) ← NOT how it works
The real model — two pointers, no deletion#
recv_buffer (4MB fixed block of RAM)
┌────────────────────────────────────────────────────┐
│ consumed │ data waiting to be read │ free │
└────────────────────────────────────────────────────┘
↑ ↑
READ ptr WRITE ptr
(moves on recv() call) (moves on data arrival)
free space = total(4MB) - (WRITE - READ)
free space → TCP window size in next ACK
The recv() call IS the signal#
app calls recv(fd, buf, 64KB)
1. OS copies buffer[READ...READ+64KB] → app's memory
2. READ pointer moves forward 64KB
3. free space recalculated → TCP window in next ACK increases
4. sender knows it can send more
No separate "I read it" notification. The act of calling recv() moves the pointer.
Why circular?#
When WRITE pointer hits end of 4MB block, it wraps to start — reusing already-consumed memory.
[consumed][consumed][new data ][unread ][consumed]
↑ ↑
WRITE (wrapped) READ
Same 4MB block reused forever. No allocation, no GC.
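A toy Python version of the two-pointer mechanic — illustrative only; the kernel does this in C with bit masks instead of modulo:

```python
class RingBuffer:
    """Toy circular buffer mirroring the kernel's READ/WRITE pointer bookkeeping."""
    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.size = size
        self.read = 0     # advances when the app consumes
        self.write = 0    # advances when data arrives

    def free_space(self) -> int:
        return self.size - (self.write - self.read)   # → advertised TCP window

    def put(self, data: bytes) -> None:
        assert len(data) <= self.free_space(), "window is zero — sender must pause"
        for b in data:
            self.buf[self.write % self.size] = b      # wrap via modulo
            self.write += 1

    def get(self, n: int) -> bytes:
        n = min(n, self.write - self.read)            # can't read unwritten data
        out = bytes(self.buf[i % self.size] for i in range(self.read, self.read + n))
        self.read += n    # the act of reading IS the signal — window grows
        return out
```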
Senior insights#
- Circular buffer pattern is everywhere — Linux pipes, GPU command queues, LMAX Disruptor, Kafka producer buffers. Always: fixed memory, zero allocation, no GC, just pointer arithmetic.
- recv() returning 0 = EOF, not error — when the remote closes the connection, recv() returns 0. This is the clean shutdown signal. Break the read loop gracefully — don't treat it as an error.
11. TCP is Always a Stream#
The key insight#
TCP has no concept of "streaming mode" or "buffered mode". There is no flag, no setting, no configuration.
TCP always: receives bytes → puts in recv_buffer → done
TCP never: knows what HTTP is, what JSON is, what streaming means
The streaming vs buffering decision is 100% your application code:
# "buffered" — app waits for everything
data = b''
while len(data) < content_length:
chunk = socket.recv(65536)
data += chunk # accumulating in memory
process(data)
# "streaming" — app processes as it goes
while True:
chunk = socket.recv(65536)
if not chunk: break
process(chunk) # process immediately
Same socket. Same recv_buffer. Same TCP. Only difference: what your code does with each chunk.
12. REST Stateless vs TCP Connections#
The confusion#
"REST is stateless" does NOT mean a new TCP connection per request. These are different layers entirely:
REST (stateless) = server has no memory between requests
TCP connections = completely separate concern, managed by HTTP layer
HTTP version behavior#
HTTP/1.0: 1 connection per request. 10 calls = 10 TCP handshakes. (nobody uses this)
HTTP/1.1: Keep-alive by default. 1 connection reused for all requests.
But requests are ordered — head of line blocking.
Browser workaround: opens 6 parallel connections per origin.
HTTP/2: 1 connection, multiple parallel streams. No head of line blocking at HTTP level.
HTTP/3: 1 QUIC connection (UDP-based). Per-stream reliability.
The 6-connection browser hack#
HTTP/1.1's head of line blocking: slow request blocks all behind it. Browser fix: open 6 connections to same server simultaneously.
Problems with this hack:
6 × TCP handshakes
6 × TLS handshakes
6 × socket structs in OS
6 × TCP slow start
10,000 users × 6 = 60,000 connections server handles
HTTP/2 made this unnecessary — one connection, unlimited parallel streams.
Connection pooling (HTTP client)#
HTTP client keeps a pool of open connections keyed by host+port:
Pool entry: {
fd: 4, ← only link to OS
host: "api.example.com",
port: 443,
http_version: "h2",
in_use: false,
idle_since: timestamp
}
When you make a request → check pool → found open connection → reuse fd → skip handshakes.
Common Python bug#
# BAD — new TCP connection every call, slow
for item in items:
requests.get('https://api.example.com/data') # opens + closes connection each time
# GOOD — one Session, connections reused
session = requests.Session()
for item in items:
session.get('https://api.example.com/data') # same fd reused
Senior insights#
- HTTP/2 is a massive performance win for REST APIs — many small requests fly in parallel over one connection. Just upgrading often gives 30–50% latency improvement with zero app code changes.
- Stateless server + connection reuse is not a contradiction — TCP connection is maintained by OS network stack, below your application entirely. Your Express app being stateless means no session store. Says nothing about OS keeping TCP connection open.
13. HTTP Versions — 1.0, 1.1, 2, 3#
How client and server agree on HTTP version#
ALPN (Application Layer Protocol Negotiation) — happens inside TLS handshake:
Client TLS ClientHello:
+ ALPN extension: ["h2", "http/1.1"] ← what I support
Server TLS ServerHello:
+ ALPN extension: "h2" ← what I picked
No extra round trip. Piggybacks on TLS.
If server doesn't support HTTP/2 → picks "http/1.1" → graceful downgrade.
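You can watch ALPN happen with Python's stdlib ssl module — a sketch assuming outbound access to example.com:443:

```python
import socket
import ssl

ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])            # what WE support
with socket.create_connection(("example.com", 443)) as tcp:
    with ctx.wrap_socket(tcp, server_hostname="example.com") as tls:
        print(tls.selected_alpn_protocol())           # what the server picked, e.g. 'h2'
```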
HTTP/3 discovery (different — uses QUIC, not TCP)#
First visit (HTTP/2):
Server response includes header:
Alt-Svc: h3=":443"; ma=86400 ← "I support HTTP/3"
Second visit:
Client opens QUIC connection on UDP port 443
Uses HTTP/3 directly
Installing/enabling HTTP versions#
You don't "install" HTTP versions. They're built into your server/client software. You just enable them:
# Nginx
listen 443 ssl; # HTTP/1.1
listen 443 ssl http2; # + HTTP/2
listen 443 quic reuseport; # + HTTP/3 (needs Nginx 1.25+)
add_header Alt-Svc 'h3=":443"'; # advertise HTTP/3
# FastAPI — uvicorn only speaks HTTP/1.1; use Hypercorn for HTTP/2
hypercorn main:app --certfile cert.pem --keyfile key.pem
# Caddy — HTTP/2 and HTTP/3 automatic, zero config
api.example.com {
reverse_proxy localhost:3000
}
Verify what's actually running#
curl -v --http2 https://yourdomain.com 2>&1 | grep -E "ALPN|HTTP/"
# → ALPN, offering h2
# → ALPN, server accepted h2
# → HTTP/2 200
Senior insights#
- ALPN failure silently downgrades — a proxy stripping ALPN extension means server never sees h2 preference, both fall back to HTTP/1.1 silently. Always verify with curl or DevTools.
- Domain sharding (HTTP/1.1 optimization) hurts HTTP/2 — spreading assets across subdomains forces multiple connections, losing HTTP/2's single-connection multiplexing benefit.
14. TLS vs TCP Handshake#
Simple intuition#
TCP handshake = "Can we talk?" — just establishing channel exists. TLS handshake = "Can we talk privately?" — verifying identity and agreeing on encryption.
TCP handshake — 3 steps#
Client Server
|──── SYN (seq=1000) ──────────►| "I want to connect"
|◄─── SYN-ACK (seq=5000) ───────| "OK, ready"
|──── ACK ──────────────────────►| "Let's go"
Total: 3 messages — client can send data after 1 round trip
No encryption. No identity. Just: are you there?
TLS handshake — after TCP connects#
Client Server
|──── ClientHello ────────────────────►|
| - TLS version: 1.3 |
| - Supported ciphers |
| - client_random |
| - ALPN: ["h2", "http/1.1"] |
| |
|◄─── ServerHello ────────────────────|
| - Chosen cipher: AES-256 |
| - server_random |
| - ALPN chosen: "h2" |
| - Certificate (public key) |
| - Digital signature |
| |
| Client verifies certificate: |
| → signed by trusted CA? |
| → domain name matches? |
| → not expired? |
| |
|──── Key Exchange (Diffie-Hellman) ──►|
| Both independently compute same |
| session key — key never sent over |
| wire |
| |
|◄─── Finished ────────────────────── |
|──── Finished ────────────────────── |
| TLS established, HTTP/2 starts |
Total: 1 round trip (TLS 1.3)
Comparison#
| | TCP | TLS |
|---|---|---|
| Purpose | Open a channel | Secure the channel |
| Steps | 3 | 4-6 |
| Checks identity? | No | Yes (certificate) |
| Encrypts? | No | Yes (session key) |
| Negotiates? | Sequence numbers | Cipher + HTTP version |
| Required? | Always | Only for HTTPS |
Full timeline for one HTTPS request#
0ms TCP SYN
10ms TCP complete ← 1 round trip
10ms TLS ClientHello
20ms TLS complete ← 1 round trip (TLS 1.3)
20ms First HTTP byte ← real request starts here
Senior insights#
- TLS 1.3 0-RTT resumption — if you've connected before, client reuses session ticket to send encrypted data in first message (zero extra round trips). Trade-off: vulnerable to replay attacks, so only safe for idempotent GET requests.
- Certificate Transparency — all issued certs must be publicly logged. DigiNotar was hacked in 2011, issued fake Google certs, Iranian users intercepted. CT makes fake certs detectable. The CA trust chain is the weakest point in TLS security.
15. HTTP/2 Multiplexing#
The problem HTTP/2 solved#
HTTP/1.1 on one connection:
req1 (500ms DB query) → res1 → req2 → res2 → req3 → res3
Total: 600ms sequential
HTTP/2 on one connection:
req1 → req2 → req3 (all at once)
← res2 (fast query, back first)
← res3
←──── res1 (slow query, back last, nobody waited)
Total: 500ms (just the slowest one)
How it works — frames and stream IDs#
HTTP/2 breaks everything into small frames. Every frame has a 9-byte header:
┌─────────────────────────────────────────────────┐
│ Length (3 bytes) — payload size │
│ Type (1 byte) — HEADERS/DATA/SETTINGS │
│ Flags (1 byte) — END_STREAM/END_HEADERS │
│ Stream ID (4 bytes) — which request this is │
└─────────────────────────────────────────────────┘
Stream ID is the key — multiple requests share one TCP connection by tagging their frames:
One TCP connection carries all streams simultaneously:
stream 1 frames: [H:1][D:1][D:1] ← GET /orders
stream 3 frames: [H:3][D:3] ← GET /profile
stream 5 frames: [H:5] ← GET /settings
On the wire: [H:1][H:3][D:1][H:5][D:3][D:1]
interleaved freely, sorted by receiver using stream ID
Stream ID rules#
Client-initiated: odd numbers (1, 3, 5, 7...)
Server-initiated: even numbers (2, 4, 6...)
Always increasing, never reused
Frame types#
| Frame | Purpose |
|---|---|
| HEADERS | HTTP headers (method, path, status) |
| DATA | Request/response body |
| SETTINGS | Connection config (max streams, frame size) |
| WINDOW_UPDATE | Flow control signals |
| RST_STREAM | Cancel one stream without closing connection |
| GOAWAY | Closing the connection |
| CONTINUATION | Overflow of HEADERS when compressed headers are large |
Two-level flow control#
Connection level: total bytes across ALL streams combined (default 65535)
Stream level: bytes for ONE specific stream (default 65535 per stream)
Why two levels?
File download stream consuming whole connection window
→ all 99 other streams starve
With stream-level control:
→ file download gets limited window
→ other streams get their own windows
→ all 100 streams make progress
HPACK header compression#
Request 1: full headers sent → both sides add to table (~800 bytes)
Request 2: send index numbers only → ~20 bytes
only changed headers sent in full
800 bytes → 20 bytes per request
What HTTP/2 does NOT fix#
HTTP/2 fixed HTTP-level head of line blocking. But TCP-level still exists:
One lost TCP packet → ALL streams wait for retransmit
TCP doesn't know about HTTP/2 streams
→ This is what HTTP/3 + QUIC solves
Senior insights#
- Stream prioritization is underused — HTTP/2 lets clients assign weights to streams. Browsers use it to load HTML before CSS before JS before images. Most backend clients never use it.
- HTTP/2 multiplexing can hurt with slow backends — clients pile up hundreds of concurrent streams. Without
SETTINGS_MAX_CONCURRENT_STREAMS, one HTTP/2 client can overwhelm backend. Always set it in production (Nginx default: 128).
16. HTTP/2 Frame Structure#
The 9-byte frame header is always exactly 9 bytes#
This never changes. The variable part is the payload:
HTTP/2 FRAME:
┌─────────────────────────────────────────────────┐
│ FRAME HEADER (always exactly 9 bytes) │
│ ┌──────────┬──────┬───────┬───────────────┐ │
│ │ Length │ Type │ Flags │ Stream ID │ │
│ │ (3 bytes)│(1b) │ (1b) │ (4 bytes) │ │
│ └──────────┴──────┴───────┴───────────────┘ │
├─────────────────────────────────────────────────┤
│ PAYLOAD (0 to 16,384 bytes, variable) │
│ ← your actual HTTP headers or body live here │
└─────────────────────────────────────────────────┘
The Length field tells receiver exactly what to read#
3 bytes = 24 bits → the field can express up to ~16MB,
but default max frame size = 16,384 bytes
(negotiable up to ~16MB via SETTINGS_MAX_FRAME_SIZE)
Receiver always:
1. Read exactly 9 bytes → parse frame header
2. Read exactly Length bytes → that's the payload
3. No ambiguity, no scanning
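Those two steps translate almost directly into code — a sketch of a frame reader over any connected socket:

```python
def read_exact(sock, n: int) -> bytes:
    """Loop recv() until exactly n bytes arrive (TCP may deliver fewer per call)."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf

def read_frame(sock):
    header = read_exact(sock, 9)                      # step 1: exactly 9 bytes
    length = int.from_bytes(header[0:3], "big")       # Length (3 bytes)
    frame_type, flags = header[3], header[4]          # Type, Flags (1 byte each)
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF  # top bit reserved
    payload = read_exact(sock, length)                # step 2: exactly Length bytes
    return frame_type, flags, stream_id, payload
```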
When HTTP headers are too large — CONTINUATION frames#
Big JWT token or many cookies → HEADERS payload > 16KB
Frame 1: HEADERS
Flags = 0x0 ← END_HEADERS NOT set (more coming)
Payload: [first 16KB of HPACK compressed headers]
Frame 2: CONTINUATION
Flags = 0x4 ← END_HEADERS set (done now)
Stream ID: same as HEADERS frame
Payload: [rest of headers]
Receiver buffers CONTINUATION frames until END_HEADERS = 1
The END_HEADERS and END_STREAM flags#
Flags byte (8 bits):
Bit 0 (0x1) = END_STREAM → no more data on this stream
Bit 2 (0x4) = END_HEADERS → headers complete, no CONTINUATION coming
Examples:
GET request (no body): Flags = 0x5 → END_HEADERS + END_STREAM
POST request (body): HEADERS Flags = 0x4 → END_HEADERS only
last DATA Flags = 0x1 → END_STREAM
Why CONTINUATION frames are rare#
HPACK compresses repeated headers aggressively:
Request 1: ~800 bytes of headers → both sides store in table
Request 2: ~20 bytes (just index references)
→ fits in one frame easily
Senior insights#
- Frame machinery is a real attack surface — the HTTP/2 Rapid Reset attack (CVE-2023-44487) opened streams with HEADERS and instantly cancelled them with RST_STREAM, millions of times per second, forcing servers to do setup work for requests that never completed — the largest DDoS in internet history (late 2023). The related CONTINUATION Flood (2024) sent HEADERS without END_HEADERS, forcing servers to buffer unbounded CONTINUATION frames. Fix: server-side caps on stream resets and on CONTINUATION frames per stream.
17. Body, Query and Path Params in HTTP/2#
Where everything lives#
| Request part | Frame type | Notes |
|---|---|---|
| HTTP method | HEADERS | :method pseudo-header |
| Path params | HEADERS | Part of :path |
| Query params | HEADERS | Part of :path after ? |
| HTTP headers | HEADERS | HPACK compressed |
| Request body | DATA | Raw bytes, uncompressed |
| Large headers | HEADERS + CONTINUATION | Until END_HEADERS flag |
| Large body | Multiple DATA frames | END_STREAM on last one |
Path and query params — always in HEADERS#
GET /users/42/orders?status=pending&limit=10
HEADERS frame payload (HPACK):
:method = GET
:path = /users/42/orders?status=pending&limit=10
:scheme = https
:authority = api.example.com
authorization = Bearer abc
No DATA frame for GET — body is empty
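A sketch of sending that request with Python's hyper-h2 library (pip install h2); TCP and TLS setup are omitted, and wire_bytes would be written to the TLS-wrapped socket:

```python
import h2.connection

conn = h2.connection.H2Connection()
conn.initiate_connection()
stream_id = conn.get_next_available_stream_id()   # client streams: odd numbers
conn.send_headers(stream_id, [
    (":method", "GET"),
    (":path", "/users/42/orders?status=pending&limit=10"),
    (":scheme", "https"),
    (":authority", "api.example.com"),
    ("authorization", "Bearer abc"),
], end_stream=True)            # GET has no body → END_STREAM set on HEADERS
wire_bytes = conn.data_to_send()                  # raw frames, ready for the socket
```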
Request body — DATA frames, separate from HEADERS#
POST /orders
Content-Type: application/json
{"item": "coffee", "qty": 2}
Frame 1: HEADERS
Flags = END_HEADERS (no END_STREAM — body coming)
:method = POST, :path = /orders, content-type = application/json
Frame 2: DATA
Flags = END_STREAM (body done, stream done)
Payload: {"item":"coffee","qty":2} ← raw bytes, no compression
Large body — multiple DATA frames#
POST with 5MB body:
HEADERS [END_HEADERS]
DATA (16KB)
DATA (16KB)
...
DATA (last chunk) [END_STREAM] ← this flag says body is complete
Content-Length in HTTP/2#
Content-Length is advisory — END_STREAM flag on last DATA frame is the real signal. Still sent for validation — if actual bytes ≠ Content-Length → treat as error.
18. socket.connect() — OS vs HTTP Client#
Two completely different things#
RAW SOCKET (talking directly to OS):
s = socket.socket() # system call → OS creates socket struct
s.connect(("google.com", 80)) # system call → OS does TCP handshake
s.send(b"GET / HTTP/1.1\r\n") # system call → OS writes to send_buffer
data = s.recv(1024) # system call → OS reads recv_buffer
HTTP CLIENT (talking to library):
client = httpx.Client()
client.get("https://google.com") # library call → library calls OS internally
What axios.get() actually does internally#
axios.get("https://api.example.com/orders")
↓ check connection pool
↓ OS: socket() → fd = 4
↓ OS: connect(fd=4, ip, 443) → TCP handshake
↓ TLS library: wrap(fd=4) → TLS handshake + ALPN
↓ OS: write(fd=4, HTTP2_frames) → HEADERS frame to send_buffer
↓ OS: recv(fd=4, buf, n) → read response from recv_buffer
↓ parse frames → build response object
↓ return to your code
You called one line. Library made a dozen system calls.
Server side — what app.listen() does#
app.listen(3000)
↓ OS: socket() → fd = 4 (listening socket)
↓ OS: bind(fd=4, port=3000) → "port 3000 = this process"
↓ OS: listen(fd=4, backlog=511) → mark as passive
↓ OS: accept(fd=4) → block, wait for clients
TCP handshake done by OS before accept() returns
new fd = 5 for this client
fd=4 goes back to waiting
The two sockets that always exist#
fd = 4 LISTENING socket → "front door", bound to port, never reads data
fd = 5 CONNECTED socket → one per client, actual data flows here
fd = 6 CONNECTED socket → another client
fd = 7 CONNECTED socket → another client
When your server also makes HTTP calls#
app.get('/orders', async (req, res) => {
const data = await axios.get('http://inventory-service/items')
// ↑ your server acting as HTTP CLIENT here
res.json(data)
})
Your process is simultaneously HTTP server (receiving) and HTTP client (calling other services). Two separate socket pools. This is exactly what microservices are.
19. How a Framework Decodes Packets#
The framework is responsible for decoding — not you#
recv_buffer bytes (raw hex)
↓
HTTP parser (C library — llhttp in Node.js, httptools in Python)
↓ state machine eats bytes:
read until space → METHOD ("GET")
read until space → PATH ("/orders")
read until \r\n → VERSION
loop:
read until ":" → header name
read until \r\n → header value
if \r\n\r\n → headers done, body starts
read remaining → body
↓
Middleware stack runs on parsed data:
express.json() → raw body bytes → req.body JavaScript object
cookieParser() → Cookie string → req.cookies object
authMiddleware() → Auth header → req.user object
↓
YOUR handler runs:
req.method, req.path, req.headers, req.body ← all clean, all decoded
you never saw a single raw byte
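A toy Python version of the parser's job for an HTTP/1.1 request head — real parsers like llhttp do this incrementally, byte by byte, across chunks:

```python
def parse_request_head(raw: bytes):
    head, _, body = raw.partition(b"\r\n\r\n")        # \r\n\r\n → headers done
    request_line, *header_lines = head.split(b"\r\n")
    method, path, version = request_line.split(b" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(b":")         # read until ":" → name
        headers[name.strip().lower()] = value.strip() # rest of line → value
    return method, path, version, headers, body

# parse_request_head(b"GET /orders HTTP/1.1\r\nHost: api.example.com\r\n\r\n")
```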
HTTP parsers in popular frameworks#
| Language | Framework | HTTP Parser |
|---|---|---|
| JavaScript | Express | llhttp (C, ships with Node.js) |
| Python | FastAPI | httptools (C, pip installed) |
| Java | Spring | Netty (Java NIO) |
| Go | net/http | Built into Go stdlib |
| Rust | Actix | httparse (Rust, zero-copy) |
The exact boundary#
─────────────────── recv_buffer bytes ────────────────────
FRAMEWORK'S WORLD
TCP byte stream → HTTP frame parsing → HPACK decompression
→ headers decoded → body assembled → middleware chain
──────────────────── your route handler ──────────────────
YOUR WORLD
req.body, req.headers, req.params, business logic
──────────────────────── res.send() ─────────────────────
FRAMEWORK'S WORLD AGAIN
serialize response → build HTTP frames → write to send_buffer
──────────────────── OS send_buffer bytes ────────────────
Framework vs HTTP server vs HTTP client#
HTTP CLIENT (axios, httpx, curl):
INITIATES connections, sends requests, receives responses
system calls: connect()
HTTP SERVER (Express, FastAPI, Rails):
WAITS for connections, receives requests, sends responses
system calls: bind(), listen(), accept()
FRAMEWORK = HTTP server + HTTP parser + router + middleware system
20. The Full Stack — End to End#
Complete mental model#
SILICON
Electrical signals / radio waves on network hardware
IP LAYER (OS)
Raw packets — may arrive out of order, may be lost
Handles addressing and routing (ordering bytes back into a stream is TCP's job)
TCP LAYER (OS)
Ordered, reliable byte stream
Handles: retransmits, flow control, congestion control
Writes bytes into recv_buffer
Manages circular buffer (READ/WRITE pointers)
HTTP/2 LAYER (framework's C library)
Reads raw bytes from recv_buffer
Parses 9-byte frame headers
Reads exactly Length bytes payload
Sorts frames to streams by stream ID
Runs HPACK decompressor on HEADERS frames
Assembles body from DATA frames
MIDDLEWARE STACK (framework)
JSON parsing → req.body
Cookie parsing → req.cookies
Auth checks → req.user
YOUR HANDLER
req.method, req.path, req.headers, req.body
business logic
res.json(), res.send()
REVERSE (sending response):
Your object → framework serializes → HTTP/2 frames → send_buffer → network
What each layer owns#
| Layer | Owns |
|---|---|
| Your code | Business logic only |
| Framework | HTTP parsing, routing, middleware |
| OS | Socket struct, buffers, TCP state |
| Network | Moving bits between machines |
Why swapping TCP for QUIC (HTTP/3) only affects one layer#
HTTP/3 = same HTTP/2 framing + same HPACK + same stream multiplexing
but QUIC replaces TCP at transport layer
Your app code: unchanged
Framework: unchanged
HTTP/2 frames: unchanged
Transport: TCP → QUIC (UDP-based, per-stream reliability)
Each layer is independent. Replace one layer without touching others.
Quick Reference — Interview Answers#
What is a socket?#
"A socket is an abstraction the OS provides representing one end of a network connection. Under the hood it's a file descriptor — an integer — pointing to an in-memory struct holding the connection's identity (4-tuple), send/receive buffers, TCP state, sequence numbers, and timers. Your code never directly touches the network — it reads and writes to buffers, and the OS handles everything else."
How does one port serve millions?#
"A port alone doesn't define a connection. The OS uses a 4-tuple: source IP, source port, destination IP, destination port. Millions of users all connecting to port 443 each have a unique source IP/port combination. The OS creates a dedicated socket per connection while the listening socket stays open. What actually enables millions is async I/O — one thread handles thousands of idle connections using an event loop — combined with load balancers across multiple machines."
How does streaming work?#
"TCP is always a stream of bytes — no concept of messages or files. Streaming means reading bytes as they arrive in chunks (typically 64KB) and processing each chunk immediately, instead of waiting for everything. At the socket level this is just calling recv() in a loop and handling each return. This integrates with TCP flow control: recv buffer fills when you're slow, TCP window shrinks, sender automatically slows down. Frameworks expose this as streams/generators. Memory stays constant regardless of payload size."
What is HTTP/2 multiplexing?#
"HTTP/2 breaks everything into binary frames with a 9-byte header containing a stream ID. Multiple requests share one TCP connection by tagging their frames with different stream IDs. The receiver sorts frames to streams by ID. A slow stream 1 response doesn't block stream 3 — their frames are independent. This eliminates HTTP-level head of line blocking. HTTP/2 also adds two-level flow control and HPACK header compression. The one remaining problem — TCP-level head of line blocking — is what HTTP/3 solves with QUIC."
How does a framework work?#
"A framework is an HTTP server plus an HTTP parser plus a router. When a request arrives, the OS does the TCP handshake and puts bytes in the recv buffer. The framework's HTTP parser — a C library like llhttp — reads raw bytes, runs a state machine to find frame boundaries, HPACK-decompresses headers, assembles body from DATA frames, runs middleware, then calls your handler with a clean request object. Your business logic never sees a single raw byte."
Topics to explore next: gRPC and Protocol Buffers · WebSockets · Service mesh (Envoy/Istio) · QUIC internals · Database connection pooling