— Writing —
It's 3 AM. Inserts are failing with 'Duplicate entry 2147483647' — but you never inserted a duplicate. Welcome to one of MySQL's most quietly famous outages. A friendly walk through why a signed INT AUTO_INCREMENT dies at 2.1 billion, why the error is misleading, a two-minute laptop repro, the 70% alert that prevents the whole thing — and the two mitigations: ALTER to BIGINT for small tables, and the atomic table-swap senior operators run on billion-row tables in the middle of an outage.
Why “our queue guarantees exactly-once” is almost always a lie once your flow crosses a component boundary: Kafka EOS scope, ack-after-process, unique constraints, SQS FIFO, DB transactions with SMTP, retries without idempotency, Redis pub/sub, replays, gRPC — and what works: transactional outbox, idempotency done right, WebSocket patterns, saga steps, and observability metrics that expose the truth.
When the Garbage Collector pauses, every thread freezes — zero requests, zero responses. This post explains how GC actually works: the heap, mark-and-sweep, minor vs major collections, Stop-the-World pauses, and what triggers them.
A user had access to 1 lakh devices across MongoDB, ClickHouse, and Elasticsearch. We tried seven approaches. All broke. The full journey — dead ends, edge cases, and what production systems actually do.