Your buddy started Cat2Log, a website for cataloging cat pictures. You sign up and start posting all the cat pictures you have. But how does this work? When you share cat pictures, check the weather forecast, or watch a video:
- Your browser sends an HTTP request to the website.
- The HTTP request is sent over a protocol called TCP, then routed to its destination by a protocol called IP.
- The website sends an HTTP response to your browser, then your browser displays the page using data in the HTTP response.
This article, and others in our networking series, will walk you through every part of the web stack.
Heads up! If you're an absolute beginner to web programming, check out Mozilla's Server-side website programming guide.
HTTP: Where your message lives
Browsers use HTTP (HyperText Transfer Protocol) to communicate with web servers. With HTTP your browser can request and receive web pages and perform transactions like updating your account or posting images.
HTTP defines a format for messages sent over the internet. It describes the action being performed (like GET, PUT, or DELETE), the URL to perform the action on, and the data format being used, such as images, text, or JSON.
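To make that concrete, here is a minimal sketch of what an HTTP/1.1 request looks like as bytes. The `build_request` helper is a made-up name for illustration, not part of any library, and it only covers the basics (method, path, `Host`, optional headers and body).

```python
from typing import Dict, Optional


def build_request(method: str, host: str, path: str = "/",
                  headers: Optional[Dict[str, str]] = None,
                  body: bytes = b"") -> bytes:
    """Assemble a raw HTTP/1.1 request (illustrative helper, not a library API)."""
    # Request line: the action, the path part of the URL, and the HTTP version.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    # Headers describe the message, for example the data format of the body.
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the (optional) body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(build_request("GET", "example.org").decode("ascii"))
```

Notice that the domain name and the rest of the URL end up in different places in the message.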
Read the following sections of Mozilla's HTTP overview:
- Components of HTTP-based systems (excluding Proxies): This will teach you basic terminology.
- HTTP flow and HTTP messages: This will teach you what HTTP requests and responses look like.
Exercise: Take a look at HTTP messages
If you use a command-line program called curl, you can see the raw HTTP messages for yourself. Open Command Prompt on Windows or Terminal on macOS, then run the following:
curl -v example.org
You should get a result like the following:
> GET / HTTP/1.1
> Host: example.org
> [request info]
>
< HTTP/1.1 200 OK
< [response info]
<
[a bunch of HTML]
The lines prefixed by > are the HTTP request; the lines prefixed by < are the HTTP response. Try other websites and see what you get!
HTTP: Message Format
Now read Mozilla's article about HTTP messages to learn more about the HTTP format. You can skip the section on HTTP/2 Frames.
This is a good time to learn about some common HTTP headers used in web requests:
Don't worry if some terminology is still unfamiliar. You can always look up the definitions of any other headers you see in your test requests.
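If it helps to see the format from the other direction, here is a rough sketch that splits a raw HTTP/1.1 response into its status line, headers, and body. `parse_response` is a hypothetical helper written for this article. It ignores real-world details like chunked encoding and repeated headers, so treat it as a reading aid rather than a real parser.

```python
def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP/1.1 response into its parts (simplified sketch)."""
    # The first blank line separates the head (status line + headers) from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line looks like: HTTP/1.1 200 OK
    parts = status_line.split(" ", 2)
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()

    return {"version": version, "status": status, "reason": reason,
            "headers": headers, "body": body}


if __name__ == "__main__":
    sample = (b"HTTP/1.1 200 OK\r\n"
              b"Content-Type: text/html\r\n"
              b"Content-Length: 5\r\n"
              b"\r\n"
              b"hello")
    print(parse_response(sample))
```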
Exercise: Run an HTTP server
You can run an HTTP server on your own computer. Python is the easiest way to get started. Once you have Python installed, run the following on your command line:
python -m http.server
Then go to localhost:8000 in your browser. You should see a directory listing of the folder where you are running Python. Python here is running an HTTP server on your local computer, and your browser is making HTTP requests to fetch data from it.
You can also use curl to see the details of the response. Run the following in your terminal:
curl -v localhost:8000
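You can also do both halves of this exercise from a single Python script. The sketch below starts the same standard-library server in a background thread and fetches a file from it with `http.client`. The function name `fetch_from_local_server`, the temporary folder, and the `cat.txt` file are all made up for the example, and port 0 just asks the operating system for any free port.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


def fetch_from_local_server(filename: str = "cat.txt", content: bytes = b"meow\n"):
    """Serve one file from a temporary folder and fetch it over HTTP (sketch)."""
    with tempfile.TemporaryDirectory() as folder:
        (pathlib.Path(folder) / filename).write_bytes(content)

        # Same handler that `python -m http.server` uses, pointed at our folder.
        handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                                    directory=folder)
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()

        try:
            # The client side: this is roughly what curl or your browser does.
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/" + filename)
            response = conn.getresponse()
            body = response.read()
            headers = dict(response.getheaders())
            conn.close()
        finally:
            server.shutdown()
            server.server_close()

    return response.status, headers, body


if __name__ == "__main__":
    status, headers, body = fetch_from_local_server()
    print(status, headers, body)
```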
HTTP: Check your knowledge
- What is the structure of an HTTP request or response?
- Which part of an HTTP request contains the domain name?
- Which part of an HTTP request contains the rest of the URL?
TCP: Retry, retry, retry
HTTP messages are sent via a protocol called TCP (Transmission Control Protocol). TCP ensures that messages are delivered reliably—that all the data arrives in one piece.
TCP divides messages into pieces called packets. These packets are sent across the internet individually, and each may take a different route to the destination. Messages are split up for many reasons, but the primary one is to let many devices share the network: once messages are chunked into pieces, pieces of many different messages can be interleaved and sent across the same network.
However, packets are not guaranteed to reach their destination. Packets can get lost, arrive out of order, or get corrupted along the way. Tom Scott covers some of this with his overview of the Two Generals problem:
TCP is designed to deal with these problems, and therefore make it easier for programmers to write network code. Packets are given sequence numbers, and the receiving end will ensure that it has all the packets in sorted order before giving them to the application. The receiver sends an acknowledgment of each packet, and the sender will re-send any packets that are not acknowledged. TCP also dynamically adjusts how quickly messages are sent, so as to not overwhelm the network.
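Here is a toy simulation of that idea: number the packets, lose some at random, re-send whatever was not acknowledged, and reassemble in order on the receiving side. This is only a sketch of the concept. Real TCP numbers bytes rather than packets, relies on timers and cumulative acknowledgments, and also does congestion control. The function names, the `loss_rate` parameter, and the fixed random seed are all invented for the example.

```python
import random


def split_into_packets(message: bytes, size: int):
    """Chunk a message and tag each chunk with a sequence number."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]


def send_reliably(message: bytes, size: int = 4, loss_rate: float = 0.3, seed: int = 0):
    """Deliver `message` over a simulated lossy, reordering network.

    Returns the reassembled message and the total number of transmissions.
    `loss_rate` must be below 1.0 or the loop would never finish.
    """
    rng = random.Random(seed)
    unacked = dict(split_into_packets(message, size))  # sender: not yet acknowledged
    received = {}                                      # receiver: seq -> data
    transmissions = 0

    while unacked:
        pending = list(unacked.items())
        rng.shuffle(pending)                 # packets may arrive out of order
        for seq, data in pending:
            transmissions += 1
            if rng.random() < loss_rate:
                continue                     # packet lost, so no acknowledgment
            received[seq] = data             # receiver stores it by sequence number
            del unacked[seq]                 # acknowledgment reaches the sender

    # The receiver hands data to the application in sequence order.
    reassembled = b"".join(received[seq] for seq in sorted(received))
    return reassembled, transmissions


if __name__ == "__main__":
    data, sent = send_reliably(b"one very important cat picture")
    print(data, "delivered using", sent, "transmissions")
```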
Each TCP packet also carries a port number, which tells the receiving computer which program the packet is for. For example, when you send a request to Handmade Network over HTTPS, your computer sends TCP packets addressed to port 443 on the Handmade Network server. On our end, the server hands the incoming TCP packets to our website, which is listening on port 443.
Watch the following videos to learn more about TCP:
Exercise: Make your own TCP server!
Follow Bharat Chauhan's Writing an HTTP server from scratch. You will start by creating a TCP server that echoes the data it receives. Then, you will expand it to handle HTTP messages.
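If you want a preview of the first step in Python, the sketch below runs a tiny echo server and a client in the same script, talking over the loopback address. `echo_once` and `demo_echo` are names invented for this example, and the approach (one connection, small messages, port 0 so the operating system picks a free port) is an assumption made to keep it short.

```python
import socket
import threading


def echo_once(listener: socket.socket) -> None:
    """Accept a single connection and send back whatever it sends."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:          # empty read: the client finished sending
                break
            conn.sendall(chunk)


def demo_echo(message: bytes) -> bytes:
    """Send `message` to a local echo server and return the reply (small messages only)."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(listener,), daemon=True)
        worker.start()

        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)   # tell the server we are done sending
            reply = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                reply += chunk

        worker.join(timeout=5)
    return reply


if __name__ == "__main__":
    print(demo_echo(b"meow"))
```

Notice that the application code never sees packets, sequence numbers, or acknowledgments. The operating system's TCP implementation handles all of that and hands you a plain stream of bytes.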
IP: What's my address?
Underlying TCP is IP, the Internet Protocol. IP contains the essentials to deliver data from A to B, primarily the source and destination IP addresses. Internet routers use IP addresses to route the packet across the internet.
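To see where those addresses live, here is a small sketch that unpacks the fixed 20-byte part of an IPv4 header using Python's `struct` module. `parse_ipv4_header` is a hypothetical helper, the sample header is hand-built with documentation-only addresses, and IPv4 options and IPv6 are ignored.

```python
import ipaddress
import struct

# Layout of the fixed 20-byte IPv4 header, in network (big-endian) byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def parse_ipv4_header(header: bytes) -> dict:
    """Pull the main fields out of the first 20 bytes of an IPv4 packet."""
    (version_ihl, _tos, total_length, _ident, _flags_fragment,
     ttl, protocol, _checksum, source, destination) = IPV4_HEADER.unpack(header[:20])
    return {
        "version": version_ihl >> 4,                  # 4 for IPv4
        "header_length": (version_ihl & 0x0F) * 4,    # in bytes
        "total_length": total_length,
        "ttl": ttl,                                   # hops left before the packet is dropped
        "protocol": protocol,                         # 6 means the payload is TCP
        "source": str(ipaddress.IPv4Address(source)),
        "destination": str(ipaddress.IPv4Address(destination)),
    }


if __name__ == "__main__":
    # A made-up header: 192.0.2.1 -> 198.51.100.7, carrying TCP.
    sample = IPV4_HEADER.pack(0x45, 0, 40, 0, 0, 64, 6, 0,
                              bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))
    print(parse_ipv4_header(sample))
```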
Watch the following video to learn more about IP:
Exercise:
So how do you actually send the cat? How do you send a real packet yourself? It's time to make that theory stick.
Go through Beej's Guide to Network Programming: Using Internet Sockets. You will learn how to create a server from the ground up (well, from the operating system up).
If you didn't do it above, go through Bharat Chauhan's article Writing an HTTP server from scratch. It will give you a good high-level view of what a web server actually does—critical knowledge for a variety of web applications.
DNS: Wait, how do I get an IP?
The last really important bit for day-to-day network programming is DNS. The job of DNS is to help you find IP addresses for domain names, like "handmade.network".
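In everyday Python code you would usually let the operating system do the lookup, for example with `socket.getaddrinfo`. To show what a lookup looks like on the wire, the sketch below builds the bytes of a basic DNS query by hand. `encode_name` and `build_query` are illustrative names, the query ID is arbitrary, and the code only builds the message rather than sending it.

```python
import struct


def encode_name(domain: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    encoded = b""
    for label in domain.rstrip(".").split("."):
        part = label.encode("ascii")
        encoded += bytes([len(part)]) + part
    return encoded + b"\x00"


def build_query(domain: str, query_id: int = 0x1234, record_type: int = 1) -> bytes:
    """Build a DNS query message. Record type 1 is an A record (IPv4 address)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question: the name, the record type, and class 1 (IN, for "internet").
    question = encode_name(domain) + struct.pack("!HH", record_type, 1)
    return header + question


if __name__ == "__main__":
    print(build_query("handmade.network").hex())
```

A resolver would normally receive a message like this over UDP on port 53 and answer with a response that repeats the question and adds the matching records.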
For a high-level introduction to DNS, there's no one better than Julia Evans. Here are a few very short comics that give you the basics:
- https://wizardzines.com/comics/why-dns/
- https://wizardzines.com/comics/dns-queries/
- https://wizardzines.com/comics/life-of-a-dns-query/
- https://wizardzines.com/comics/dns-record-types/
- https://wizardzines.com/comics/dns-packet/
If these interest you, her full zine on DNS is well worth the money.
Finally, here are a couple of other resources to round things out:
- https://jvns.ca/blog/2022/02/01/a-dns-resolver-in-80-lines-of-go/
- https://jvns.ca/blog/2022/05/10/pages-that-didn-t-make-it-into--how-dns-works-/
Exercise: Mess with DNS
Learn how DNS works in practice by going through Julia Evans's "mess with dns" tool: https://messwithdns.net/
Ethernet and PHY: Rock bottom
Ok, so you've got some of the basics down, and you're ready for some serious spelunking? Let's talk bits and bytes.
Watch this series of videos by Ben Eater, stopping after video 8 (The Internet Protocol).
Written by Colin Davidson. Edited by Ben Visness. Art by Jacob Bell.