2. How the Internet Works - A Bottom-Up View

This note answers a few questions: How is data transferred across the Internet? How are network resources shared? What is the life of a packet as it travels through the network?

Review

The goal of the Internet is to transfer data between end hosts.

2.1 endhost.png

Data Organization: Packets

The source host constructs a packet: a unit of data that gets transported across the Internet.

  • Header: metadata that describes how the data is to be delivered. A switch reads the header to decide what to do with a packet, so it is meaningful to both the network and the endpoints. What information must a header contain? At minimum, the destination address.

  • Payload: the information in the packet, meaningful only to the endpoints (e.g. bits from a file or a video).

In practice, a packet has multiple headers (next lecture).

Communication between a pair of end hosts involves multiple packets.

  • Flow: stream of packets exchanged between two endpoints
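
As a toy illustration (not any real protocol format), here is a minimal sketch of a packet as a header plus a payload; the field names are invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical, simplified packet: real packets carry several headers
# (next lecture), each with many more fields than a destination address.
@dataclass
class Header:
    destination: str   # where the packet should be delivered (read by switches)
    source: str        # who sent it

@dataclass
class Packet:
    header: Header     # meaningful to the network
    payload: bytes     # meaningful only to the endpoints (file bits, video, ...)

# A flow is just a stream of packets exchanged between the same two endpoints.
flow = [
    Packet(Header(destination="host-B", source="host-A"), chunk)
    for chunk in (b"part1", b"part2", b"part3")
]
```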

Bandwidth: number of bits sent (or received) per unit time (bits/second, bps)

  • "Width" of the link

Propagation delay: time it takes a bit to travel along the link (seconds)

  • "Length" of the link

Bandwidth-Delay Product (BDP): bandwidth × propagation delay (bits)

  • "Capacity" of the link

2.2 Link.png

For now, suppose the packet size is 100 B (800 bits), the link bandwidth is 1 Mbps, and the propagation delay is 1 ms.

At 1 Mbps, transmitting one bit takes \(1/10^6\) s, so the first bit reaches point B at \(1/10^6 + 1/10^3\) s. Transmitting all 800 bits takes \(800 \times 1/10^6\) s \(= 0.8\) ms.

The last bit reaches B at \((800 \times 1/10^6) + 1/10^3\) s \(= 1.8\) ms.

In general

\[ \text{Packet delay = Transmission Delay + Propagation Delay} \]

where Transmission Delay = Packet Size / Link Bandwidth, so

\[ \text{Packet delay = (Packet Size / Link Bandwidth) + Propagation Delay} \]

So, which link is better?

  • Link1: bandwidth = 10Mbps and propagation delay = 10ms

  • Link2: bandwidth = 1 Mbps and propagation delay = 1ms

The answer is: it depends.

  • With a 10B packet: \(L_1 \approx 10ms, L_2 \approx 1ms\)

  • With a 10000B packet: \(L_1 \approx 18ms, L_2 \approx 81ms\)
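
A minimal sketch of the delay formula above, reproducing these numbers (plain arithmetic, no assumptions beyond the stated link parameters):

```python
def packet_delay_s(packet_size_bytes: int, bandwidth_bps: float, prop_delay_s: float) -> float:
    """Packet delay = transmission delay (size / bandwidth) + propagation delay."""
    transmission_delay = (packet_size_bytes * 8) / bandwidth_bps
    return transmission_delay + prop_delay_s

links = {
    "Link1": (10e6, 10e-3),  # 10 Mbps, 10 ms
    "Link2": (1e6, 1e-3),    # 1 Mbps, 1 ms
}

for size in (10, 10_000):  # packet sizes in bytes
    for name, (bw, prop) in links.items():
        delay_ms = packet_delay_s(size, bw, prop) * 1e3
        print(f"{size:>6} B on {name}: {delay_ms:.2f} ms")
```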

2.3 Pipe.png

The pipe image shows how much capacity the link has, i.e. how many bits can be in flight on it at once (and hence how busy it is).

2.4 lengthwidth.png

Packet at the Switch

The switch now has to figure out where the packet should go, i.e. which outgoing link to forward it to. Every switch has a forwarding table, which says which link the packet should follow.

2.5 forward.png
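
A minimal sketch of a forwarding table as a plain lookup from destination address to outgoing link (addresses and port names here are hypothetical; real routers match on address prefixes):

```python
# Hypothetical forwarding table: destination address -> outgoing link/port.
forwarding_table = {
    "10.0.1.5": "port-1",
    "10.0.2.7": "port-2",
    "10.0.3.9": "port-2",   # different destinations may share an outgoing link
}

def forward(destination: str) -> str:
    """Return the outgoing port for a packet's destination, or drop it if unknown."""
    port = forwarding_table.get(destination)
    return port if port is not None else "drop"

print(forward("10.0.2.7"))  # -> port-2
```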

Recap: Life of a Packet So Far

  1. Source has some data to send to a destination
  2. Chunks it up into packets: each packet has a payload and a header
  3. Packet travels along a link
  4. Arrives at a switch; switch forwards the packet to its next hop
  5. Repeat 3-4 until we reach destination

Challenges

  1. How to scale the forwarding table?
  2. What if two ports lead to the same destination?
  3. What happens when a link goes down?
  4. Where does the table come from?
  5. How do you know the packet actually gets there?
  6. Where did the address come from?

Challenge: Addressing and Naming

Network address: where host is located

Network name: which host it is

Need an addressing and naming scheme that works at Internet scale! --> IP addresses

Consider when you access a web page

  • Insert URL into browser

  • You want to communicate with the server hosting that URL

How do you get to the server?

  • URL is the user-level name (e.g. cnn.com)

  • Network needs address (where is cnn.com)

Must map (or resolve) host names to addresses. This is done by the Domain Name System (DNS).
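
As a minimal illustration, Python's standard library exposes this resolution step directly (the hostname is just the example above; the address returned depends on where and when you run it):

```python
import socket

# Resolve a human-readable name to a network address via DNS.
hostname = "cnn.com"
address = socket.gethostbyname(hostname)
print(f"{hostname} resolves to {address}")
```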

Challenge: Routing

When a packet arrives at a router, the forwarding table determines which outgoing link the packet is sent on.

How do you compute the forwarding tables necessary to deliver packets?

Conceptually, routing involves

  • Distributed routing algorithm run between switches/routers

  • Gather information about the network topology

  • Compute paths through that topology

  • Store forwarding information in each router

  • This is the forwarding table
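
A minimal sketch of this computation on a toy topology (node names and link costs are made up; real routing protocols distribute the computation and handle failures): run Dijkstra's shortest-path algorithm from one router and record, for each destination, the first hop to use.

```python
import heapq

# Hypothetical topology: node -> {neighbor: link cost}
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

def forwarding_table(source: str) -> dict:
    """Shortest paths from `source` (Dijkstra), returned as destination -> next hop."""
    dist = {source: 0}
    first_hop = {}
    queue = [(0, source, None)]  # (distance, node, first hop used to reach it)
    while queue:
        d, node, hop = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale entry
        for neighbor, cost in topology[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                # Leaving the source, the first hop is the neighbor itself.
                first_hop[neighbor] = neighbor if node == source else hop
                heapq.heappush(queue, (nd, neighbor, first_hop[neighbor]))
    return first_hop

print(forwarding_table("A"))  # e.g. {'B': 'B', 'C': 'B', 'D': 'B'}
```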

Control Plane vs. Data Plane

Control plane: mechanisms used to compute forwarding tables

  • Inherently global: must know topology to compute

  • Routing algorithm is part of the control plane

  • Time scale: per network event

Data plane: using those tables to actually forward packets

  • Inherently local: depends only on arriving packet and local routing table

  • Forwarding mechanism is part of data plane

  • Time scale: per packet arrival

Control Plane Challenge

  • Computing routes at scale

  • In the face of network failures and topology changes

  • While respecting ISPs' need for autonomy. The Internet is composed of many different ISPs, and each gets to make its own decisions about how to do routing within its network, typically without revealing the internals of that decision making. Can we ensure that ISPs' independent decisions result in usable end-to-end routes? --> BGP

Data Plane Challenge

Consider a 1 Tbps link receiving 10,000-bit packets: a new packet arrives every 10 nanoseconds.
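
This follows directly from the transmission delay formula:

\[ \frac{10^4 \ \text{bits}}{10^{12} \ \text{bits/s}} = 10^{-8} \ \text{s} = 10 \ \text{ns} \]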

The following operations must be done after a packet arrives (in 10 ns or less):

  • Parse packet (extract address, etc.)

  • Look up address in forwarding table

  • Update other fields in packet header (if needed)

  • Update relevant internal counters, etc.

  • Send packet to appropriate output link

Important Topics

  • How do we name endhosts on the Internet? (naming)

  • How do we address endhosts? (addressing)

  • How do we map names to addresses? (DNS)

  • How do we compute forwarding tables? (Routing Control Plane --> Project 1)

  • How do we forward packets? (Routing Data Plane)

Fundamental Fact About Networks

The network must support many flows at the same time. Recall that a flow is a stream of packets sent between two end hosts. This means network resources (links and switches) are shared between end hosts. Network resources (like bandwidth) are statistically multiplexed.

Statistical Multiplexing

Combining demands to share resources efficiently (vs. statically partitioning resources).

Long history in CS

  • Process on an OS (vs. every process has own core)

  • Cloud computing (vs. everyone has own datacenter)

Based on the premise: peak of aggregate demand is << aggregate of peak demands

Aggregates, Peaks, Etc.

Define the peak rate of flow \(f\) to be \(P(f)\). The peak rate of a set of flows \(f_1, f_2\): \(P(f_1 + f_2)\).

The aggregate of the peaks:

\[ \sum_{\{f\}} [P(f)] \]

Peak of aggregate

\[ P(\sum_{\{f\}} f) \]

Typically

\[ \sum_{\{f\}} [P(f)] \gg P(\sum_{\{f\}} f) \]

Typically

\[ P(\sum_{\{f\}}f) \approx A(\sum_{\{f\}}f) \]

where \(A(f)\) is the average rate of flow \(f\); \(A(f_1 + f_2)\) is the average rate of set of flows \(f_1, f_2\).

Statistical multiplexing simply means that you don't provision for the absolute worst case (when every flow peaks at the same time). Instead, you share resources and rely on peak rates not all occurring at the same time.
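
A minimal simulation sketch of this premise, with invented traffic numbers (each flow is bursty and mostly idle), showing that the sum of the peaks far exceeds the peak of the aggregate:

```python
import random

random.seed(168)
NUM_FLOWS, NUM_SLOTS = 100, 1000

# Each flow idles at 0.1 Mbps and bursts to 10 Mbps in ~5% of time slots.
flows = [[10.0 if random.random() < 0.05 else 0.1 for _ in range(NUM_SLOTS)]
         for _ in range(NUM_FLOWS)]

aggregate = [sum(f[t] for f in flows) for t in range(NUM_SLOTS)]

sum_of_peaks = sum(max(f) for f in flows)   # provisioning for the worst case
peak_of_aggregate = max(aggregate)          # what the shared link actually needs
avg_of_aggregate = sum(aggregate) / NUM_SLOTS

print(f"sum of peaks:      {sum_of_peaks:.0f} Mbps")
print(f"peak of aggregate: {peak_of_aggregate:.0f} Mbps")
print(f"average aggregate: {avg_of_aggregate:.0f} Mbps")
```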

How Would You Share Network Resources?

Two approaches to sharing:

  1. Reservations: end-hosts explicitly reserve bandwidth when needed (e.g. at the start of a flow).

  2. Best-effort: just send data packets when you have them and hope for the best.

Implementation

Reservations:

  1. Source sends a reservation request to the destination
  2. Switches "establish a circuit"
  3. Source starts sending data
  4. Source sends a "teardown circuit" message

Idea: reserve network capacity for all packets in a flow.

Best-Effort

Idea: allocate resources to each packet independently (independent across switches and across packets)

Both approaches embody statistical multiplexing

  • Circuit switching: resources shared between flows currently in system. Reserve the peak demand for a flow, but don't reserve for all flows that might ever exist.

  • Packet switching: resources shared between packets currently in system. Resources given out on packet-by-packet basis, and never reserve resources.