— Writing —
Every process you run will be told to stop; the only question is whether it gets a request it can act on or a bullet it never sees coming. A first-principles walk through graceful shutdown via Unix signals: why SIGTERM is catchable and SIGKILL is not, the five-stage drain (catch → stop intake → drain → flush → exit 0) every well-behaved server runs, PostgreSQL's three shutdown modes mapped to three signals, Redis saving its RDB on SIGTERM (and losing every write since the last snapshot on SIGKILL or the OOM killer), connection draining in web servers, how Kubernetes wires it together with terminationGracePeriodSeconds and a closing SIGKILL, the minimal correct handler in Go and Node, and the five pitfalls: PID 1 with no signal handlers installed, work inside the handler, unbounded drains, the endpoint-removal race, and a grace period shorter than your real drain.
Kernel TCP keep-alive probes the network path; app heartbeats prove the peer process is serving; HTTP Connection: keep-alive only reuses sockets. Why NAT, load balancers, and firewalls still drop “idle” long-lived connections, how defaults like Linux’s two-hour keep-alive idle timer (tcp_keepalive_time = 7200 seconds) compare to prod WebSocket patterns, and when you really need both mechanisms.
A senior-engineer walkthrough of the whole cache stack: CPU L1/L2/L3, TLB, OS page cache, DB buffer pool, in-process LRU (Caffeine), distributed cache (Redis/Memcached), HTTP cache + ETags, CDN edge, browser, DNS, reverse proxies, query plan cache. Latency budgets at every layer, when each layer is the right answer, the patterns (cache-aside, read-through, write-through, write-behind, stale-while-revalidate, single-flight), and the failure modes (thundering herd, hot keys, negative-cache stampedes) — with a decision framework for which layer to actually cache at.
SVG-backed walkthrough: ring hotspots, successor overload when a node dies, and two ring generations during deploy. Virtual nodes, Redis-style key skew, range split for a fat month, and when range partitions beat a hash ring. Case studies (composite, production-shaped), a deep look at range sharding, and a decision table — with metrics and what actually worked.
A beginner-friendly walkthrough: what “concurrent” means, what the C10K and C10M problems actually are, how the industry usually fixes them (event loops vs whole-system design), and what still goes wrong in production — blocking I/O, fd limits, reconnect storms, databases, and tail latency.
Fan-out on write vs fan-out on read — and why Twitter uses a hybrid at 500M users. A beginner-friendly breakdown.