By David Oliver
We had the opportunity to record this blog post as a video and put it on YouTube. But the post is all about how companies use NetFlow to track YouTube, because YouTube can, in many cases, suck down bandwidth. So, ironically, it seemed best just to write it out in text.
As we mentioned a week ago, YouTube now supports high-definition content, with bandwidth demands to match. I've done a little research into how YouTube actually works, so I thought I'd offer some ideas to all those companies out there that don't yet have their own solutions for tracking and managing YouTube and other streaming media data, and also give users an idea of exactly how companies can track your YouTube usage at work.
Anyway, when you make a request for a video on YouTube, you are directed to YouTube's servers via one of four IP addresses that are easily found on Google or other search engines. From there you're relayed to the Limelight network, which actually feeds you the video in the Flash-based player. You can see the flows to and from that initial IP address for the HTTP GET of that video.
There are many solutions for providing visibility into traffic on the network by looking at the Cisco NetFlow data (which is already on most Cisco routers). I’m going to refer to NetQoS’s own solution, ReporterAnalyzer, when I talk about tracking NetFlow data.
What we can do with ReporterAnalyzer is monitor the Internet-facing link, and create and use custom reports that look for YouTube's specific IP addresses. If you see a substantial amount of data being transferred to or from those addresses, that's a good marker of YouTube video traffic.
You can rely on those custom reports and run them anytime you want, but companies can also monitor YouTube in real time. If you map HTTP port 80 traffic that involves one of YouTube's IP addresses to some other ephemeral port (and name it something catchy, like "YouTube"), it will show up as its own protocol in both real-time reporting and flow forensics. You could use that data to create custom reports, to get a comprehensive list of users, and to sort YouTube use by volume.
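To make the mapping idea concrete, here is a minimal Python sketch of the same logic applied to flow records that have already been exported. It is not how ReporterAnalyzer is configured or implemented. The record layout (`src`, `dst`, `src_port`, `dst_port`, `protocol`, `bytes`) and the function names are assumptions made for illustration, and the addresses come from the reserved documentation range, not from YouTube.

```python
from collections import defaultdict
from ipaddress import ip_address

# Placeholder addresses from the documentation range, NOT real YouTube servers.
YOUTUBE_ADDRESSES = {ip_address("203.0.113.10"), ip_address("203.0.113.11")}


def classify(flow):
    """Label a flow "YouTube" if it is TCP port 80 traffic involving a listed address."""
    endpoints = {ip_address(flow["src"]), ip_address(flow["dst"])}
    on_port_80 = 80 in (flow["src_port"], flow["dst_port"])
    if flow["protocol"] == "tcp" and on_port_80 and endpoints & YOUTUBE_ADDRESSES:
        return "YouTube"
    return "other"


def top_users(flows):
    """Sum "YouTube" bytes per internal host and sort by volume, largest first."""
    totals = defaultdict(int)
    for flow in flows:
        if classify(flow) != "YouTube":
            continue
        # The internal host is whichever endpoint is not a listed address.
        if ip_address(flow["src"]) in YOUTUBE_ADDRESSES:
            host = flow["dst"]
        else:
            host = flow["src"]
        totals[host] += flow["bytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `top_users` on a list of such records gives you the "list of users sorted by volume" view described above, with both directions of each conversation counted against the internal host.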
The other thing you can do is use analyses to know when YouTube traffic accounts for more than, say, 10% of the traffic on any of your links. ReporterAnalyzer will then go through on a link-by-link basis and tell you about violations, helping you further localize the source of that traffic. You can also configure it to alert you when, and only when, YouTube traffic on a particular link passes a threshold that you set.
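The threshold check itself is simple arithmetic. As a rough sketch, assuming you already have per-link byte counts for YouTube traffic and for all traffic (the `link_bytes` layout and the function name here are hypothetical, not a product feature), it might look like this in Python:

```python
def links_over_threshold(link_bytes, threshold=0.10):
    """Return the links where YouTube's share of traffic exceeds the threshold.

    link_bytes maps a link name to (youtube_bytes, total_bytes).
    The result maps each violating link to its YouTube share (0.0 to 1.0).
    """
    violations = {}
    for link, (youtube_bytes, total_bytes) in link_bytes.items():
        if total_bytes == 0:
            continue  # an idle link cannot violate the threshold
        share = youtube_bytes / total_bytes
        if share > threshold:
            violations[link] = share
    return violations
```

A link carrying 150 MB of YouTube out of 1,000 MB total would be reported at a 15% share, while a link sitting at exactly 10% would not, because the check is "more than" the threshold.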
(The other option is to try to block it entirely, but that's an engineering nightmare. Any employee smart enough to provide good value to a company - particularly a high tech company - will likely be smart enough to know how to circumvent blocks through proxies and other means.)
Custom reports to find the correct addresses and to localize YouTube traffic may take a couple of minutes. The entire real-time application mapping process takes maybe another 15 minutes. I can be showing real-time data specific to YouTube traffic just a few minutes after configuring application mapping. (If your boss asks in the morning for something to track YouTube usage, the company can get YouTube tracking up and running by that afternoon. If the boss just wants a quick snapshot of the current YouTube traffic volume, it could take as little as five minutes through custom reports.)
Of course, this isn't limited to YouTube. You can use similar methods and techniques to find and track streaming audio feeds, other video sites, etc. Any TCP flow is going to create some sort of NetFlow data, and you can localize it based on the source or destination address. So as long as ReporterAnalyzer has visibility of that destination address, it can report on it. As you know, there are a multitude of media-based streaming sites, all of which are going to have their own IP address range, which you can find pretty easily. You can then further localize and label them so that when you pull up reports, they're already differentiated from other traffic.
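Labeling by address range generalizes the same way. The sketch below, again in Python and again only an illustration, looks up an address in a table of ranges and returns a label. The site names and CIDR ranges are made-up placeholders from the reserved documentation blocks, and `label_for` is a hypothetical helper, not part of any product.

```python
from ipaddress import ip_address, ip_network

# Hypothetical labels mapped to placeholder ranges (documentation blocks only).
SITE_RANGES = {
    "video-site-a": [ip_network("198.51.100.0/24")],
    "audio-site-b": [ip_network("203.0.113.0/25")],
}


def label_for(address):
    """Return the label whose range contains the address, or "unlabeled"."""
    addr = ip_address(address)
    for label, networks in SITE_RANGES.items():
        if any(addr in network for network in networks):
            return label
    return "unlabeled"
```

With a table like this, adding another streaming site to your reports is just a matter of adding its range and a name for it.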
While YouTube is great, we at NetQoS have found that YouTube traffic congesting corporate networks is a common issue. For any company, WAN links are a finite resource and need to be managed. That's a concern because you size your network around the capacity needs of the business. YouTube is (usually) non-business traffic, but it's going to share that limited resource. The more you share a resource, the less is available for the requirements you originally scoped it for.
David Oliver is a Product Manager at NetQoS
