One of our first questions as a group was this: What does it look like to stand in the position of a broadband internet provider and look back down the pipe toward a household? We asked some experts around here, including a friendly new provider in our town, a network analysis researcher, and some system administrators.
The answer turns out to be pretty complicated.
- Firstly, I should say that I never actually called my own home provider to ask them for a record of my activity. I’ll try that approach later, but the thought of trying to penetrate their terrible service interface was just too daunting.
- Secondly, this post may or may not be of interest to people who actually design, build or manage broadband networks for a living. Importantly, we approached this stage of the project as novices – and probably picked the most ignorant person in our group to pursue it. So get ready for some seriously basic language.
- Thirdly, if there’s anywhere that I end up blatantly WRONG in my conclusions here, I’d like to know. For any place I get mixed up is likely to be a place that consumers get mixed up too.
- Lastly, we’ve tried to answer our question with a minimum of financial expense, so as to follow a path available to more than just research university professors.
The first thing we learned (or already knew, in some cases) was that when a broadband provider looks at your household, they pretty much see a list of requests – let’s call this a log. These requests originate from different computers (or phones, or DVD players, or game systems, or washing machines, etc.) in your home. They say, “Please fetch me file such-and-such from the following location on the internet.” The service provider simply passes along the requests to the appropriate party, which may or may not actually respond by sending the requested file back.
Now these requests originate from different devices in your home, but they first go to your home’s router, which passes them along to your modem, which passes them along to the service provider. And logs of these requests could originate from any point along that path. Each device might keep a log of such requests, or your router might keep such a log, or the provider might keep one.
So what might that provider’s log contain? The only thing the provider can tell for certain is the destination of these requests. In most cases the provider can also make a good guess about the sort of file that’s requested and returned. So the provider can look at the data and say that household X requested audio files from a server owned by, say, Spotify, at certain times and with a certain regularity.
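To make that a little more concrete, here is a minimal Python sketch of the kind of summary a provider could build from such a log. Everything in it is an assumption made for illustration: the log layout (timestamp, destination, guessed file type), the placeholder hostnames, and the function name `summarize_by_destination` are all invented, and none of it comes from a real provider's log.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log entries: (timestamp, destination host, guessed content type).
# These values are made up for illustration, not taken from any real log.
SAMPLE_LOG = [
    ("2024-01-01 08:00:00", "audio.example-music.test", "audio"),
    ("2024-01-01 08:03:30", "audio.example-music.test", "audio"),
    ("2024-01-01 08:07:00", "audio.example-music.test", "audio"),
    ("2024-01-01 08:05:00", "video.example-tv.test", "video"),
]


def summarize_by_destination(log):
    """Group requests by destination; report the count and the average gap in seconds."""
    times = defaultdict(list)
    for stamp, destination, _kind in log:
        times[destination].append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))

    summary = {}
    for destination, stamps in times.items():
        stamps.sort()
        # Gaps between consecutive requests to the same destination.
        gaps = [(later - earlier).total_seconds()
                for earlier, later in zip(stamps, stamps[1:])]
        average_gap = sum(gaps) / len(gaps) if gaps else None
        summary[destination] = {
            "requests": len(stamps),
            "average_gap_seconds": average_gap,
        }
    return summary


if __name__ == "__main__":
    for destination, stats in summarize_by_destination(SAMPLE_LOG).items():
        print(destination, stats)
```

A steady average gap to one destination is the sort of "regularity" described above. A real provider would presumably work with far more data and far better tools than this.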
Beyond that, the provider can’t tell which device made the request unless the router in your home is specially configured to pass that information along. They can, however, infer or make good guesses about a great deal. They might look at frequency of requests, for example, and make a good guess about whether or not multiple devices are making them.
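As a rough illustration of that last guess, the sketch below counts requests per minute and flags the minutes that look too busy for one device. The threshold `SINGLE_DEVICE_MAX_PER_MINUTE` and the function `busy_minutes` are hypothetical. I have no idea what heuristics a real provider uses, so treat this as a toy, not a description of anyone's actual method.

```python
from collections import Counter

# Hypothetical threshold: more requests in one minute than we assume
# a single device would make. The number is invented for illustration.
SINGLE_DEVICE_MAX_PER_MINUTE = 3


def busy_minutes(request_seconds, limit=SINGLE_DEVICE_MAX_PER_MINUTE):
    """Return the minute offsets whose request count exceeds the limit.

    request_seconds is a list of request times, in whole seconds
    since the start of the log.
    """
    per_minute = Counter(second // 60 for second in request_seconds)
    return sorted(minute for minute, count in per_minute.items() if count > limit)


if __name__ == "__main__":
    # Five requests in the first minute, then one in each of the next two.
    print(busy_minutes([0, 10, 20, 30, 40, 65, 130]))
```

A busy minute hints at more than one device, but it proves nothing: a single laptop loading a page full of images can fire off dozens of requests at once.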
We weren’t sure what sort of guessing our providers might be able to do, because we didn’t know what their logs looked like. When we asked, it turned out that these logs can look quite different depending on which format or protocol a provider uses. And, as in so many such cases, there are proprietary formats and open ones. The formats seem to originate from the companies that make the actual routers.
Obviously, and unlike other test datasets out there, sample sets of internet provider service logs aren’t likely to be in abundance. Though the protocols may not be all that sensitive, the data sure is, given that if it’s good it probably originated from actual people who would rather not share that data.
We looked long and hard, and could in fact find none, not even for such often-used and market-dominant protocols as NetFlow. This was something of a surprise to us. To use an analogy, what if, as a doctor or a patient, you had just learned that there is something called an X-Ray machine, which produces images of one’s insides? You think you might want to try out such a thing – but nobody will share their examples, not even the X-Ray company. You have to just buy a machine and give it a try.
The only way to see what an “X-Ray” of a household’s internet use looks like from the provider’s point of view is, apparently, to buy an expensive Cisco router and start tracking yourself.
And, as that’s not exactly an accessible option for most from an economic or technical point of view, we decided not to pursue it. Instead, we moved “downstream” and adjusted our goals a bit. To use a different analogy, instead of trying to see our home’s power consumption as it looks at the power company, we moved back to where the power enters our house, to measure it there. We decided to start looking for a picture of one household’s internet use at the router, where all one’s networked devices get in line to make requests of the internet god.
More on that in the next post.