DNS runs before every single thing you do on the internet — yet most engineers treat it as a black box. Here's exactly what happens in those first 50 milliseconds.

What's actually happening here?

Computers talk to each other using numbers called IP addresses — 142.250.80.46, not google.com. DNS (Domain Name System) is the phonebook that converts the name you type into the number the network actually needs. Every page load, every API call, every app opening on your phone — each one starts with a DNS lookup.
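
As a minimal sketch of what a DNS lookup looks like from application code, the snippet below asks the operating system's resolver for the IPv4 addresses behind a name, using socket.getaddrinfo from Python's standard library. The helper name resolve_ipv4 is made up for this illustration, and the addresses returned for a real domain vary by location and over time.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the IPv4 addresses the OS resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (ip, port)).
    # Keep the unique IPs in the order the resolver returned them.
    addresses = []
    for *_, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling resolve_ipv4("google.com") on a connected machine returns a list of one or more addresses. Everything the rest of this lesson describes happens behind that single call.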

The problem this solves

You can't expect humans to memorise 142.250.80.46 instead of google.com. But more importantly, IP addresses change — companies move servers, switch cloud providers, expand to new regions. DNS lets them change the number silently while the name you use stays the same forever.

How it really works (step by step)

The lookup chain (two caches, then four servers, one question):

  1. Browser cache — your browser first checks if it already looked this up recently. If the answer is cached and the TTL hasn't expired, the journey ends here. No network call at all.

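  2. OS cache — if the browser doesn't know, it asks the operating system, which keeps its own DNS cache. On Windows you can view it with ipconfig /displaydns. On Linux and macOS, note that /etc/hosts is not the cache: it is a static file of name-to-IP overrides that the OS consults before it queries DNS. On Linux systems running systemd-resolved, resolvectl statistics reports cache size, hits and misses.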

  3. Recursive resolver — if neither cache has it, your device contacts a recursive resolver — usually run by your ISP (Jio, Airtel) or a public service (Google's 8.8.8.8, Cloudflare's 1.1.1.1). This server does the heavy lifting on your behalf.

  4. Root name server — the resolver asks one of the 13 root server identities, a.root-servers.net through m.root-servers.net (together replicated to 1,500+ locations via anycast). The root doesn't know the IP for google.com, but it knows who manages .com domains and points the resolver there.

  5. TLD name server — the Top Level Domain server for .com doesn't know the IP either — but it knows which authoritative server Google registered for google.com. It returns that address.

  6. Authoritative name server — this is Google's own server. It holds the actual DNS records and returns the final answer: 142.250.80.46. This is the only server in the chain that returns the address itself; the root and TLD servers only return referrals to the next server.

  7. Response cached and returned — the recursive resolver caches the answer for the duration of the TTL Google set, then returns it to your device. Your browser caches it too. Now the page load can actually begin.
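
The steps above can be sketched as a toy simulation. This is not a real DNS client: the three dictionaries are hypothetical in-memory stand-ins for the root, TLD and authoritative servers, the server labels and the 192.0.2.10 address (from a documentation-only range) are invented, and it only handles simple names such as example.com. What it does show is the referral chain and the TTL-bound cache in the recursive resolver.

```python
import time

# Hypothetical stand-ins for the three kinds of name servers.
# Real servers speak the DNS wire protocol; these dicts only model "who knows what".
ROOT = {"com": "tld-com"}                          # root: which server handles each TLD
TLD = {"tld-com": {"example.com": "ns-example"}}   # TLD: each domain's authoritative server
AUTHORITATIVE = {"ns-example": {"example.com": ("192.0.2.10", 300)}}  # (IP, TTL in seconds)


class RecursiveResolver:
    def __init__(self, clock=time.monotonic):
        self.clock = clock          # injectable so the TTL logic can be tested
        self.cache = {}             # name -> (ip, expires_at)
        self.upstream_queries = 0   # how many servers we had to ask

    def resolve(self, name):
        now = self.clock()
        cached = self.cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]        # still within TTL: no upstream traffic

        tld_label = name.rsplit(".", 1)[-1]
        tld_server = ROOT[tld_label]                 # step 4: root refers us to the TLD
        auth_server = TLD[tld_server][name]          # step 5: TLD refers us to the authoritative
        ip, ttl = AUTHORITATIVE[auth_server][name]   # step 6: the actual answer
        self.upstream_queries += 3

        self.cache[name] = (ip, now + ttl)           # step 7: cache until the TTL runs out
        return ip
```

With a fake clock, the first resolve("example.com") costs three upstream queries, repeat calls cost none, and the chain is only walked again once the 300 seconds have passed.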

The part most tutorials skip

TTL is a weapon, not just a setting. When engineers set a DNS TTL of 3600 seconds (1 hour), they're making a reliability decision: if they need to change their IP address during an outage, any resolver or device that has already cached the old record can keep using it for up to an hour. Production engineers drop the TTL to 60 seconds ahead of a planned migration (early enough for the old, longer TTL to expire from caches), do the migration, then raise it back. Teams that skip this step risk prolonged downtime because they can't redirect traffic fast enough.
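
The timing works out as simple arithmetic, sketched below. The function name plan_ttl_migration, its parameters and the 2x safety factor are assumptions for illustration: strictly, one full old TTL is enough for caches to drain, and the extra margin is only a cautious default.

```python
from datetime import datetime, timedelta


def plan_ttl_migration(migration_at, current_ttl_s, low_ttl_s=60, safety_factor=2):
    """Work out when to lower the TTL and how long clients can stay stale.

    safety_factor is an assumed margin, not something DNS itself requires.
    """
    lead_time = timedelta(seconds=current_ttl_s * safety_factor)
    return {
        # Latest moment to publish the low TTL so old cached records have expired.
        "lower_ttl_by": migration_at - lead_time,
        # Worst-case time a client keeps the old IP after the switch.
        "stale_window_if_lowered_s": low_ttl_s,
        "stale_window_if_skipped_s": current_ttl_s,
    }
```

For a migration at 12:00 with a current TTL of 3600 seconds, this gives a deadline of 10:00 for lowering the TTL, and shrinks the worst-case stale window from an hour to a minute.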

Real company doing this right now

Cloudflare's 1.1.1.1 resolver handles over 1 trillion DNS queries per day. The trick that makes it fast: anycast routing. The IP address 1.1.1.1 is announced from 300+ data centres simultaneously. When your device queries it, BGP routing automatically sends your request to the Cloudflare location that is nearest in network terms, often under 10ms away. There's no central server. Every query is answered somewhere close to you, which is why 1.1.1.1 regularly ranks at or near the top of public resolver speed benchmarks.

What breaks at scale?

DNS is a single point of failure most engineers ignore. In October 2016, a DDoS attack on Dyn (a DNS provider) took down Twitter, Netflix, Reddit, and GitHub simultaneously, not by attacking those companies directly, but by making their domain names unresolvable. Every one of those companies had redundant servers, multiple data centres, and load balancers. None of it mattered because DNS couldn't resolve their names. The fix: host your zone with multiple independent DNS providers and list name servers from each of them at your domain registrar.
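
A rough way to check for this kind of exposure is to group a domain's name servers by provider and see whether more than one group exists. The sketch below is an assumption-laden heuristic: the function names are invented, the host names in the usage note are hypothetical, and "provider" is approximated by the last two labels of each name server's host name. That breaks for suffixes such as co.uk and miscounts a provider that spreads its name servers across several domains.

```python
from collections import defaultdict


def group_by_provider(nameservers):
    """Group name server host names by a crude 'provider' key (last two labels)."""
    groups = defaultdict(list)
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        provider = ".".join(labels[-2:])   # heuristic only, see the caveats above
        groups[provider].append(ns)
    return dict(groups)


def has_provider_redundancy(nameservers):
    """True if the name servers appear to come from at least two providers."""
    return len(group_by_provider(nameservers)) >= 2
```

Given ["ns1.provider-a.example", "ns2.provider-a.example"], the check reports no redundancy. Adding "ns1.provider-b.example" makes it pass.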

The "aha" moment

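DNS doesn't find the fastest server for you. It just returns whatever records the domain's owner published. The intelligence of routing users to nearby servers (what CDNs do) is a separate layer. GeoDNS sits on top of DNS: the authoritative server returns different answers depending on where the query comes from. Anycast sits below it, at the routing level: one IP address is announced from many locations and the network delivers each packet to the closest one.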
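
To make the GeoDNS idea concrete, here is a deliberately tiny sketch. The region table, the function name geodns_answer and the addresses (all from documentation-only ranges) are hypothetical. A real GeoDNS service also has to infer the region, typically from the recursive resolver's address or an EDNS Client Subnet hint rather than from the end user directly.

```python
# Hypothetical region -> address table, using documentation-only IP ranges.
REGION_RECORDS = {
    "eu": "192.0.2.10",
    "us": "198.51.100.10",
    "asia": "203.0.113.10",
}
DEFAULT_REGION = "us"


def geodns_answer(client_region):
    """Return the address a GeoDNS-style authoritative server might hand out.

    Unknown regions fall back to the default region's address.
    """
    return REGION_RECORDS.get(client_region, REGION_RECORDS[DEFAULT_REGION])
```

The name being asked about never changes. Only the answer does, and that is the whole trick.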

Your practical takeaway

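  • Before any production migration, drop your TTL to 60 seconds well in advance: at least one full current TTL before the change, and ideally 2× for safety. If your current TTL is 3600s, lower it 2 hours early. By migration time, caches holding the old 3600s record will have expired, so users are stuck on the old IP for about a minute at most if anything goes wrong.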

  • Never use a single DNS provider for a production system — register your domain with a registrar that supports multiple authoritative nameservers from different providers (e.g., Route53 + Cloudflare). One DNS outage shouldn't take your entire product down.

  • Use dig to debug DNS from the command line: dig google.com shows the full response including TTL remaining; dig +trace google.com walks the entire resolver chain from root to authoritative, showing exactly where a lookup is slow or broken.
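
If you want to script checks around dig, the answer lines it prints are easy to pull apart: each one is whitespace-separated as name, TTL, class, type, data. The sketch below parses one such line. DnsRecord and parse_answer_line are names invented here, and the TTL in the usage note is illustrative, not captured from a real query.

```python
from typing import NamedTuple


class DnsRecord(NamedTuple):
    name: str
    ttl: int      # seconds remaining in the answering resolver's cache
    rclass: str   # almost always "IN" (Internet)
    rtype: str    # record type, e.g. "A" or "NS"
    data: str     # for an A record, the IPv4 address


def parse_answer_line(line):
    """Parse one answer line from dig output into a DnsRecord."""
    # Split on any run of spaces or tabs, at most 4 times,
    # so record data that contains spaces stays in one piece.
    name, ttl, rclass, rtype, data = line.split(None, 4)
    return DnsRecord(name, int(ttl), rclass, rtype, data.strip())
```

Parsing the line "google.com. 137 IN A 142.250.80.46" yields a record with ttl 137 and data "142.250.80.46". Running dig twice a few seconds apart and comparing the ttl field is a quick way to see a cache counting down.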

Lesson 03 · Stage 1 — Network Foundations · System Design made easy