The Phone Numbers of the Internet

In the scant logic of “hacking” in movies or TV shows, IP addresses seemingly play a big role. Finding an IP address is the key to figuring out exactly who the mysterious hacker is, or how to break into the top-secret computer system, or whatever. This is … mostly false, at least in the way it’s presented in most media, but there are some kernels of truth in there. Let’s start from the top, though: what are IP addresses? Why are they useful? And why would you care, or not care, about hiding yours?

An IP address is basically an internet phone number for your computer. (Or phone, or printer, or smart fridge, or something.) At least, that was the original intention. Nowadays it’s a little bit more complicated, and I’ll explain why in a moment, but the general idea — that an IP address tells the internet where to send something — still stands.

In 1981, a bunch of nerds working on the U.S. government-funded research networks that grew into the internet published Internet Protocol Version 4, or IPv4, as a document called RFC 791. (The Internet Engineering Task Force, one of the organizations that sets the standards for the internet today, wasn’t founded until 1986. IPv1, 2, and 3 were experimental versions that were never widely used.) IPv4 specifies the standard for how packets (pieces of data) are sent across a network of computers to where they need to go. The mechanics of exactly how that happens are beyond the scope of this article, but the important part here is the address. They decided that each computer on the network would be identified by its IPv4 address: a set of four numbers from 0 to 255, commonly written like “219.53.52.12” or something similar. (This format means that each number fits nicely into one binary byte, which is convenient for various reasons that I won’t get into.) Four numbers from 0 to 255 means there are approximately 4.3 billion possible IPv4 addresses.
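If you want to check that arithmetic yourself, here is a minimal sketch in Python using the standard library's ipaddress module. The address is just the example from the paragraph above, and the variable names are my own.

```python
import ipaddress

# The example address from the text, parsed by the standard library.
addr = ipaddress.IPv4Address("219.53.52.12")

# Each of the four numbers fits in one byte (0-255), so the whole
# address is really a single 32-bit integer under the hood.
octets = list(addr.packed)
as_integer = int(addr)

# Four bytes means 256**4 = 2**32 possible addresses.
total_ipv4 = 256 ** 4

print(octets)               # [219, 53, 52, 12]
print(as_integer)
print(f"{total_ipv4:,}")    # 4,294,967,296
```

The dotted format is only a human-friendly way of writing one big number, which is why the count of possible addresses is a clean power of two.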

“Excellent!” they said. “4.3 billion? There’s no way we’ll ever have that many computers to connect!” This seemed like a very reasonable statement in 1981, when CSNET, a precursor to the internet, connected approximately 200 computers. However, as you may know, the internet is a little bit bigger now. As of 2018, there were roughly 22 billion devices connected to the internet. So how do you fit 22 billion devices into 4.3 billion addresses?

One possibility is to expand the number of addresses. In the 1990s, as the internet expanded massively and the developers of IPv4 realized their mistake, IPv6 was developed. Instead of four numbers from 0 to 255, IPv6 addresses have eight numbers from 0 to 65,535. This expands the number of possible addresses from 4.3 billion to 3.4 × 10³⁸, or 340 undecillion, or 340 billion billion billion billion addresses. Our minds aren’t really built to work with numbers like that, but putting it another way, you could assign an IPv6 address to every single atom on the surface of the earth six thousand times over. Just know that we probably won’t run out of IPv6 addresses any time soon. The issue with IPv6 is mostly that the two systems don’t work very well with each other, so the transition from one system to another is slow and painful, with a lot of weird adapters necessary for interoperability. Over twenty years after the standard was first introduced, only about 30% of connections use IPv6.
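The same kind of back-of-the-envelope check works for IPv6. This is a rough sketch, again with Python's standard ipaddress module. The address 2001:db8::1 is an assumption on my part: it comes from the range reserved for documentation examples and doesn't belong to any real device.

```python
import ipaddress

# Eight groups of 16 bits each: 8 * 16 = 128 bits in total.
total_ipv6 = 65536 ** 8
total_ipv4 = 256 ** 4

# An illustrative address from the documentation-only range 2001:db8::/32.
addr = ipaddress.IPv6Address("2001:db8::1")
groups = addr.exploded.split(":")   # the eight groups, written in hexadecimal

print(f"{total_ipv6:.1e}")          # 3.4e+38
print(len(groups))                  # 8
print(total_ipv6 // total_ipv4)     # how many whole IPv4 address spaces fit inside IPv6
```

That last number is 2 to the 96th power, meaning the entire IPv4 address space fits into IPv6 about 79 billion billion billion times.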

Since IPv6 hasn’t solved the problem yet, how about we just share? That’s the idea behind Network Address Translation (NAT), which lets 22 billion devices share 4.3 billion addresses (mostly) seamlessly. So, what is NAT, and why would you use it? Well, most devices, including pretty much all consumer devices like smartphones, don’t really need their own IP address. They are usually the devices that initiate a connection: opening websites, sending texts, and so on. Unlike with a public server, there’s no good reason that any random person on the internet should need to directly connect to your smartphone out of the blue. In fact, it’s a security risk: if an attacker has network access, they might be able to find some crack in your device’s virtual armor and infect it with a virus. This isn’t particularly likely; modern, well-updated devices are pretty well-defended against that sort of thing, which is why you don’t get hacked every time you connect to the Wi-Fi at an airport, but it’s better not to take any chances. (How cyberattacks actually happen is a topic for another article, but suffice it to say, you should forget everything you have learned from any piece of popular media ever.)

Back to NAT. NAT creates a private, small-scale network, usually on the scale of a single home, school, apartment, or business. This entire network, which could contain dozens or hundreds of devices, has a single IP address on the public internet. All communications between the devices inside the network and the internet go through a single router, which hands out private IP addresses to these devices and re-broadcasts the packets they send onto the wider internet, while keeping track of which responses need to be sent where. Each device on a private network behind a NAT gets its own IP from one of the ranges designated for that purpose. (Usually, homes use the range of IP addresses starting with 192.168 for this purpose; larger networks like an office building might use the 10.x.x.x range. Swarthmore’s campus network doesn’t actually use NAT, for reasons I’ll get into in a moment.) To communicate within the private network, you can use one of those private IP addresses. Think of devices on a private network like rooms in a house. If someone says “the kitchen” while standing inside a house, then it’s obvious what room they mean, but if they said the same thing while standing outside on the street, nobody would have any idea which kitchen they’re referring to.
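To make the "ranges designated for that purpose" concrete, here is a small sketch that checks whether an address is one of the private ones. The text above names the 192.168 and 10.x.x.x ranges. The third range in the list, 172.16.0.0/12, comes from the same standard (RFC 1918) but isn't mentioned in this article. The function name is made up for illustration.

```python
import ipaddress

# The three IPv4 ranges set aside for private networks (RFC 1918).
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_private_address(text: str) -> bool:
    """Return True if the address falls inside one of the private ranges."""
    addr = ipaddress.ip_address(text)
    return any(addr in network for network in PRIVATE_RANGES)

for candidate in ["192.168.0.34", "10.1.2.3", "23.185.0.3"]:
    print(candidate, is_private_address(candidate))
```

The first two addresses are "kitchens": they only mean something inside a particular house. The third is a public address that means the same thing everywhere.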

If you’re reading this article from the comfort of your home network, here’s what probably happened for your device to get it from our servers. First, your computer figured out which IP address corresponded to swarthmorephoenix.com through the DNS (a sort of internet phonebook that links website addresses to IPs). It probably found an IP like “23.185.0.3.” Then, it sent a packet labeled “from 192.168.0.34 to 23.185.0.3” to your home router. Your router then re-labeled the packet so that instead of “from 192.168.0.34” (or whatever IP you might have on the local network) it said “from 12.34.56.78” (standing in here for your home’s one public IP) and sent it onwards into the internet, where it made its way to the server. The server sent the article back through the internet to 12.34.56.78, where your router, which would have been keeping an eye out for a reply, re-labeled it a final time and passed it on to your computer. This re-labeling of packets is completely invisible to the outside server: a server could be communicating with dozens of devices at once, but from its perspective it’s only talking to one particularly prolific device. So, a lot of devices can share a single IP address, with (almost) nobody the wiser.
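Here is a toy version of that re-labeling, written as a sketch rather than a description of how any particular router works. One detail is an addition of mine: routers typically tell conversations apart using port numbers, which the walkthrough above leaves out. The class name, the method names, and the port numbers are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src: str      # "address:port" of the sender
    dst: str      # "address:port" of the recipient
    payload: str

class ToyNatRouter:
    """A deliberately simplified NAT: one public address plus a table of port mappings."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_to_private = {}   # public port -> private "address:port"
        self.private_to_port = {}   # private "address:port" -> public port

    def outbound(self, packet: Packet) -> Packet:
        """Re-label a packet leaving the home network with the public address."""
        if packet.src not in self.private_to_port:
            self.private_to_port[packet.src] = self.next_port
            self.port_to_private[self.next_port] = packet.src
            self.next_port += 1
        public_src = f"{self.public_ip}:{self.private_to_port[packet.src]}"
        return Packet(public_src, packet.dst, packet.payload)

    def inbound(self, packet: Packet):
        """Re-label a reply for the right device, or drop it if nobody asked for it."""
        port = int(packet.dst.rsplit(":", 1)[1])
        private_dst = self.port_to_private.get(port)
        if private_dst is None:
            return None   # unsolicited traffic: no mapping, so it goes nowhere
        return Packet(packet.src, private_dst, packet.payload)

router = ToyNatRouter("12.34.56.78")
request = Packet("192.168.0.34:51000", "23.185.0.3:443", "GET the article")
sent = router.outbound(request)
reply = router.inbound(Packet("23.185.0.3:443", sent.src, "here is the article"))
print(sent.src)    # 12.34.56.78:40000
print(reply.dst)   # 192.168.0.34:51000
```

Notice that the server only ever sees 12.34.56.78, and that a packet arriving for a port the router never handed out simply gets dropped.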

I mentioned above that Swarthmore doesn’t actually use NAT, and I’m going to dig into that a bit more here, because Swarthmore’s network is somewhat unusual. You see, back in 1988, when concerns about running out of IP addresses were still quite far off, the college acquired a massive block of IP addresses; specifically, all IP addresses beginning with “130.58,” about 65,000 in all. So, instead of everyone sharing a single IP address, every single device on campus gets its own public IP from that range. (The specific IP assigned to your device can change a lot and might change every time you disconnect and reconnect, or walk around to a different building, or whatever. This is why it’s not really like a phone number.) You can see this for yourself: google “what is my IP?” on two different devices on the Swarthmore network, and you’ll get two different results, but try it with two devices on the same home network and it’ll give you the same IP each time. Note that even though each device technically does have a public IP, this isn’t really a security issue: Swarthmore’s firewall blocks incoming connections for most devices in much the same way that a NAT would. None of this really matters for the average student that isn’t messing around with hosting; I just think it’s kind of neat.
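For the curious, "all IP addresses beginning with 130.58" can be written as the block 130.58.0.0/16, and the "about 65,000" figure falls right out of it. In this short sketch, 130.58.12.200 is a made-up address inside that block, not a real device on campus.

```python
import ipaddress

# "All IP addresses beginning with 130.58", written in CIDR notation.
campus_block = ipaddress.ip_network("130.58.0.0/16")

# A hypothetical on-campus address, and the off-campus example used elsewhere in the article.
on_campus = ipaddress.ip_address("130.58.12.200")
off_campus = ipaddress.ip_address("24.22.105.131")

print(campus_block.num_addresses)    # 65536
print(on_campus in campus_block)     # True
print(off_campus in campus_block)    # False
```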

It would seem that we’ve tackled the great IP address shortage, at least for now. Let’s go back to those tropes I was talking about. Can you actually find where someone lives just by using their IP address? Kind of. IP addresses work much like phone numbers in that IP ranges are associated with approximate geographical areas. For example, for the IP address 24.22.105.131 (which I picked at random), a quick search shows that it’s a Comcast customer in the Portland, Oregon metro area. This obviously isn’t very helpful for finding someone’s house. For that matter, the geolocation might not even be accurate, particularly if that IP address is merely a relay point for a user located elsewhere. But it’s good enough for a lot of things; this is how Slack displays everyone’s time zone, for example.

So, some ten-year-old kid isn’t about to figure out where you live because of your IP address. But things are a bit different for law enforcement, because your internet provider knows which IP address it gave you, as well as who you are and where you live. If you do something criminal on social media — threats or defamation, for example — then law enforcement can subpoena the social media company for its IP address logs (which companies keep for a number of mostly-benign reasons like monitoring suspicious logins), then subpoena your internet provider and show up at your house. Of course, the existence of this practice raises some questions about surveillance and privacy and so on, but on balance it’s probably a good thing that people like this guy get caught. And besides, hiding your IP is trivially easy — just walk to any nearby public hotspot, or use a VPN. (Though if a government agency really wants to figure out who you are, there’s not much you can really do about it.)

What about getting hacked? As I mentioned above, home routers have pretty strong defenses against this sort of thing, most importantly the fact that your home router will block all incoming connections by default because nobody has any business connecting to your home network from the outside world. (A network like Swarthmore’s campus network is much harder to secure, because there are all sorts of reasons people might be connecting from off-campus, and every one of those reasons opens up potential surface area where an attacker can gain a foothold.) But there is one kind of attack (calling it a “hack” is a bit unfair) that does work on basically any network: a Distributed Denial of Service attack, or DDoS attack. A DDoS attack just floods your incoming internet connection with nonsense. None of it has any chance of stealing your data or anything, but the computing power required to reject all of those packets, not to mention all the bandwidth they take up, can easily bring a residential-grade internet connection to a crawl. Larger attacks can even bring down commercial networks, like when a group of hackers brought down the PlayStation and Xbox servers on Christmas morning in 2014. A DDoS attack is basically the internet equivalent of dumping a bunch of dirt into someone’s mailbox. You aren’t going to steal any of their bank account data or anything, but now all their mail is buried under a bunch of dirt and you’ve accomplished your goal: making their life harder.

Avoiding a DDoS attack is one reason you might want to try and hide your IP from random people on the internet, besides just the general principle. It’s also one of a few reasons that peer-to-peer services, which create direct connections between a small group of people for online games or video calls, have been falling out of favor compared to services that pass the data through a central relay. In a peer-to-peer setup, everyone can see each other’s IP address, and a particularly unscrupulous gamer might decide to launch a small DDoS attack against their opponent’s home network in the hopes of bringing down their internet connection to win by default. Peer-to-peer video chat services like Skype are also undesirable for certain people who might want to hide their country of origin or just take every possible precaution in favor of anonymity. (For the record, Zoom uses central servers.) Peer-to-peer connections also just kind of suck: aside from the loss of IP address privacy, they don’t play nicely with all the NAT stuff I mentioned earlier, they cause one person’s crappy connection to ruin everyone’s experience, and I spent a chunk of an internship last summer dealing with them, so I have a bit of a personal vendetta.

So, while the idea of IP addresses as some magical locating device isn’t strictly true, there’s a kernel of truth in there, and you probably shouldn’t be posting your home IP address on Twitter.

This article barely scratches the surface of an incredibly complicated topic that doesn’t get nearly enough attention, and that I personally still have plenty to learn about. We tend to take for granted the ubiquitous but largely invisible infrastructure that keeps the internet and all of modern society running, so I hope I’ve pulled back some small part of the curtain here.

If you have any further questions, would like to see a column on a specific topic, or think that I got something wrong, feel free to email me at zrobins2@swarthmore.edu. You can also DM me on Instagram @software.dude.

Some final notes:

The “4.3 billion addresses” number is technically a bit high — that’s the theoretical upper bound for addresses in that format. In reality, about 300 million IP addresses are reserved for various uses. For example, all 16 million IP addresses starting with 127 are designated as loopback addresses, which are for when a computer wants to connect to itself. This is useful for various reasons I won’t get into here.
For the comparison of how many IPv6 addresses exist, I assumed that each atom on the surface is a square with sides one Ångström (or 0.1 nanometers) long and then ran this WolframAlpha calculation.
As you may have guessed, the concept of a “private network” is closely related to the “virtual private networks” from my first article — as I mentioned there, the original application of VPNs was to create encrypted tunnels through which you could access private internal networks like Swarthmore’s “virtually,” that is, without actually being physically present. Hence the name.
Technically, devices on Swarthmore’s eduroam Wi-Fi network get assigned IPs from some smaller sub-range; ITS allocates various ranges for Wi-Fi versus hardwired devices, different buildings on campus, and so on, along with a few separate fortified areas for highly secure stuff. Either way, though, you get an IP somewhere in Swarthmore’s range.
While the original intention was that each computer would have some individual IP address for fairly long periods of time, analogous to a phone number, later developments such as Wi-Fi have changed that. In the course of a day, your smartphone or laptop might switch between several different IPs as it connects to different Wi-Fi or cellular networks. Also, your home’s IP might change occasionally, due to the way that internet providers allocate addresses. Since there’s no reason anyone actually needs to initiate a connection to your personal devices (unless you’re doing something like running a Minecraft server), this usually doesn’t matter. Industrial-grade computers like website servers usually have a static IP, which changes extremely rarely but is somewhat more complicated to set up.
Thank you to Mark Dumic from ITS for answering my questions about Swarthmore’s campus network, as well as the two people in Willets basement lounge three weeks ago who wanted to know about this topic and made me realize I should write an article about it. (I have entirely forgotten what your names were. Sorry about that.)

Zachary Robinson

Zack Robinson '24 is a sophomore from Portland, Oregon, studying computer science and English literature. He enjoys tinkering with technology, épée fencing, and diving into random Wikipedia rabbit holes.

3 Comments

James Knott says:

March 6, 2021 at 6:10 pm

A couple of facts here. First off, Vint Cerf, the guy who invented IP, intended the 32 bit addresses to be proof of concept only and the real version would have a lot more address bits. Problem is IPv4 “escaped” and became the standard. NAT was a hack developed to get around the shortage, but breaks many things. The first one I was aware of was FTP, which in active mode wouldn’t work through NAT. These days, we need hacks like STUN to get VoIP and games past NAT. So, if we stick with IPv4, we’ll continue to have hacks on a hack, to produce a system that’s still crippled by NAT. As for the address I’m reading this from, the address this article comes from is 2620:12a:8000::3, which is most definitely IPv6. I have also been running IPv6 on my home network for almost 11 years. As you mentioned, IPv6 has an unbelievably huge address space. For example, from my ISP, I get a /56 prefix, which contains 256 /64 prefixes, each containing 18.4 billion, billion addresses, all reachable from the world.

The problem with promoting NAT, as you seem to be doing, is it delays the world moving to IPv6, leaving many stuck behind carrier grade NAT on IPv4, which means they can never connect to their own network, as anyone with a public address can do. So, you’d be doing everyone a favour by promoting IPv6, instead of hanging onto IPv4 and NAT.

- Zachary Robinson says:
  
  March 6, 2021 at 11:18 pm
  
  Thanks for the info! I didn’t know the proof-of-concept fact, that’s interesting. And yeah, I actually spent last summer working with STUN/TURN servers for video calling, it just didn’t make it into the article.
  
  I figured the people reading this article probably aren’t making decisions about IPv6 implementation, but I guess I should say it—if you’re in any position to do so, lobby your ISP and various other institutions to eventually move us all off IPv4! Then maybe my article explaining how addresses work on the internet won’t run to almost three thousand words explaining all the weirdness we have to deal with. 😉
  
James Knott says:

March 9, 2021 at 9:09 pm

Here’s some more info about Vint Cerf and IPv6:

‘ One of the decisions his team needed to make was the size of the address space in the packets.

Some researchers wanted a 128-bit space for the binary address, Cerf (recalled) … But others said, “That’s crazy,” because it’s far larger than necessary, and they suggested a much smaller space. Cerf finally settled on a 32-bit space that was incorporated into IPv4 and provided a respectable 4.3 billion separate addresses.

“It’s enough to do an experiment,” he said. “The problem is the experiment never ended.” ‘
http://www.networkworld.com/article/2227543/software-why-ipv6-vint-cerf-keeps-blaming-himself.html
