Capacity Planning Archives

Bandwidth “shortage” has 1950s precedent.


Most Americans can barely remember a time when there wasn’t enough electricity running to their house or apartment. Oh, certainly, we can remember times when we’d blow a fuse or trip a circuit breaker; but they were rare and usually happened under excessive strain – when the hairdryer and the air conditioner were on at the same time as the electric stove or George Foreman Lean Mean Fat-Reducing Grilling Machine.

In the 1950s, according to this advertisement uncovered by Modern Mechanix, running out of juice was a real problem because the wiring in houses built decades earlier simply didn’t have enough capacity to run all those appliances. Well, of course it was – and this brings me back to my undergraduate days as a history major at the New Jersey Institute of Technology. (Yes, NJIT had a history major. No, it wasn’t a big class…)

Now, the second industrial revolution was years prior – during the 1910s through the 1920s – but consumer adoption of technological advances was stunted for a period of about 40 years. The Great Depression killed disposable income. When the war came and finally did bring some income, there was significant rationing, and many companies specializing in electronics and electromechanical devices were building for the war effort. In a sense, when money started flowing in during the late 40s and 1950s, consumer demand had been “pent up.” Yeah, consumerism went overboard in the 1950s, but can you blame ‘em?

Fast forward to today and replace appliances with applications – there’s a reason that both have the same root word, the Latin “applicare” – and you can see the parallels.

Of course, things are a bit more complicated now than in the 1950s. There are only a few types of electricity – AC/DC, 110v/220v… etc. – so back then it was relatively easy to tell whether or not the problem you were having was caused by a lack of electricity. Does the thing light up? Is it working? If the answer is no to both, you’ve got an electricity problem. If the answer to the first is yes, and the answer to the second is no, you’ve got a busted appliance. (If the answer to the first is no and the second yes, you need to replace the lightbulb.)

Today it’s a little bit harder to diagnose the problems you have with application performance – application, server, and network problems can all look very similar – especially to the end-user. Sometimes you spend time and energy working on one area only to find out it’s not the problem.

But sometimes, yes, the network is the problem, and more capacity is needed. The important thing is to be sure about it rather than just guessing.

There’s another lesson here for consumer applications as well. In the 1950s, the way to solve the problem of new demand on the electrical grid was to upgrade the wiring. Maybe we should be doing that to solve some of our own broadband “shortage” problems, instead of resorting to things like “bandwidth caps” and “aggressive traffic shaping.”

Because it’s not just about not being able to see the latest cat on treadmill video. Videoconferencing via Skype or other method puts people in face-to-face communication. CERN scientists needed to upgrade the Web’s infrastructure to share the massive amounts of data the LHC would create.

But one of the most telling things is what’s going on in Austin right now.

Our local NBC affiliate, KXAN, and our local cable provider, Time Warner Cable, haven’t reached an agreement to show KXAN programming on cable. One of TWC’s gambits is running an advertisement on television showing people how to connect their laptop computers up to their television so that they can watch the streaming video of the NBC shows that they’re missing.

Which, of course, raises the question: if you can get television shows via the Internet directly from the networks themselves, why do you need cable TV or network affiliates in the first place? This move may backfire. One of the reasons I don’t have cable is that I’ve been hooking my TV up to my computer for years now… then again, I have pretty good broadband service.



What network performance taught me about optimizing a lemon


David Oliver talks about his experiences running the 24 Hours of LeMons race in Houston, and how knowing about network performance helped him optimize his junker.








Cisco ships Mexican folk music instead of VPN software. Easy mistake: They’re so similar…


According to The Register, Cisco installation CDs for VPN networks contained music.

Specifically, music that sounded exactly like this.

Now, Mexican folk music of the “narcocorridos” variety has a rich tradition and requires extreme skill to produce, and is greatly enjoyed by many music aficionados. But still, if you’re going to come up with a piece of music designed to surprise the hell out of everyone, you could probably choose no better music in the world.

Knowing Cisco, there’s no way that this was deliberate; but this brings to mind two things: First, is there someone out in Baja California with a copy of VPN software in his or her hand, wondering to themselves: “¿Dónde está mi música?”

Second, will this start a trend of “narcocorrido-rolling” network engineers?

Cisco is doing everything they can to recover from this error, and in a statement, said:


Cisco is aware that some customers have received defective VPN Client CDs as part of recent orders.

Manufacturing is aware of this problem and is actively reshipping new media to impacted customers.

Defective VPN Client CDs can be identified by the following marking on the back of the media which ends in "MX21511/4"


Of course the moral of the story is that you need to test before you deploy. In this case, it was a little embarrassment, and we all pretty much just have a chuckle about it. But deploying technology on the network without knowing the full effects is just asking for trouble.

I mean, what would have happened if the music actually installed? Is your enterprise prepared to handle accordion configuration?



Scalability isn’t just about numbers


Scalability is one of the more overused terms in networking – which makes it hard to explain why it’s important. Well, I mean, beyond the main concept of: “More scalability means you can hook up more computers to it!”

True, the size of the deployment is probably the best way to objectively prove scalability – for example, NetQoS has one ReporterAnalyzer deployment monitoring over 20,000 WAN links. No small feat. But scalability isn’t just the quantity of computers hooked up to the box, but also how much of the quality of the data you maintain when you’ve got tons of computers hooked up to the box. Or to put it another way, scalability means that even in large deployments, you get all the data at high granularity.

Talking about scalability in pure device count is sort of like talking about network performance purely in terms of fault. It is possible to have poor scalability without having no scalability, when you sacrifice detail for device count.

Another key of scalability that many people don’t think about is performance of the device itself. It would be ironic to purchase a device to monitor network performance that had a very slow UI because it strained under the load of monitoring thousands of links.

One of NetQoS’s many accomplishments over the past six months has been getting a patent on a memory management method and system which allows us to manage hundreds of thousands of combinations in a very small memory footprint.

Memory management is a major part of scalability, because allocating memory at runtime is relatively expensive in terms of processor resources. Put another way: the more efficiently you use memory, the harder you can push the processor on other tasks. For this reason, scalability requires efficient memory usage.

In addition to our own products, we also use it in our integrations into Cisco Wide Area Application Services (WAAS) – we’re able to integrate code there with little impact to the host systems.



Why the Olympics stay online – because fewer people than you think are watching.


While we’ve talked quite a bit about what impact the Olympics may have on an enterprise network’s performance, we haven’t talked much about the performance of the NBC site hosting the live streaming of the Olympics. 

According to Jason Perlow at ZDNet, Limelight Networks (which hosts the streaming videos) deployed the videos not by going over the public internet but by hosting the content more locally – at the ISP. That means you’re viewing the Olympics through your ISP’s internal network, and the broader internet doesn’t even enter into the connection.

This is smart thinking, it appears to be working, and by all measures this should be applauded.  Perhaps even duplicated – if you know that multiple employees will download the same content, local hosting on the LAN is preferable to duplicate download streams tying up the more expensive, slower WAN lines.

From the enterprise end of the equation, the fact that Limelight is delivering Olympics video more effectively just means that IT managers cannot count on the streaming servers going down from being unable to handle the demand – IT managers still need to monitor their own networks for performance problems when a big event like the Olympics comes up.

However, it would be wrong to assume that Limelight’s strategy is the only reason why Olympic live-streaming hasn’t slowed to a trickle.

First of all, the site blocks 95.44% of visitors from accessing the content – because it limits the content only to those in the United States.  That’s a lot of people.

Secondly, the site requires Microsoft Silverlight. Most people don’t have Silverlight installed. Some can’t even install it on their systems. And there are certainly going to be quite a few people who just didn’t think installing Silverlight was worth the bother to watch five minutes of Olympic footage they may be mildly interested in.

And finally – none of the really popular sports are being streamed.  Gymnastics, Women’s Beach Volleyball, Swimming (with the exception of synchronized) and most of the track and field events aren’t available live. So you’re left with judo, fencing, and the decathlon.

So while it is a true technological wonder that the lights have stayed on and the site performs admirably – it is important to recognize that Limelight has not found a magic bullet to deal with extremely high internet video demand. 



Latency and Jitter


By Kevin Davis
Adapted from “Sources of Latency” Whitepaper

When network users call the Help Desk to report poor application performance, you don’t typically hear things like “The router’s CPU is too busy!,” “The network utilization is above 70%!,” or “The carrier path has failed-over to a sub-optimal path.” Instead, what you’re likely to hear is “The network is slow” or “The calls on my IP phone sound terrible.”

Complaints that end-users lodge are nearly always based on their quality of experience using the application. And their quality of experience is almost always reliant on time.

Anytime a significant delay occurs in the delivery of network data, application performance suffers. Depending on the type of application and how it works, variances in network delay can have a severe impact on application performance, thereby degrading the end-user’s experience.

Two important measurements of time intervals in network transmission systems are referred to as “latency” and “jitter”. Understanding latency and jitter sources and how their values vary in network architectures is critical to engineering application performance and optimizing information resources. For many regular readers, this will be old-hat, but we’ll go over it again.

Network latency is the amount of time it takes for a packet to be transmitted end-to-end across a network and is composed of five variables:


Network Latency = (Distance Delay) + (Serialization Delay) + (Queue Delay) + (Forwarding Delay) + (Protocol Delay)


Serialization Delay refers to the amount of time it takes for a network interface (such as a router’s interface or computer’s NIC) to perform bitwise transmission of a frame onto the outbound media. Forwarding Delay is the amount of time it takes a network device to process a frame/packet by performing a destination address lookup and forwarding the frame/packet to the outbound interface. Protocol Delay is the amount of time that access or transmission algorithms may contribute to the delay of a network frame, and is typically introduced at the endpoints of the data transmission system.

Serialization delay, on a per-packet basis, becomes insignificant at data rates above 1.544 Mbits/s – or a T1. Forwarding delay is typically insignificant in modern routers and switches (when appropriately configured – significant delay can occur in misconfigured routers.) And Protocol delay typically occurs at the access layer or the end points. So the two major variables that have the most effect on network latency are Distance Delay and Queue Delay.
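To put rough numbers on the serialization claim, here is a minimal sketch. It assumes serialization delay is simply frame size in bits divided by the link rate; the 1500-byte frame and the list of link speeds are illustrative choices, not figures from the whitepaper.

```python
def serialization_delay_ms(frame_bytes: int, link_bps: float) -> float:
    """Time to clock one frame onto the wire, in milliseconds."""
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    return frame_bytes * 8 / link_bps * 1000.0


# Illustrative link rates in bits per second (assumed for this example).
LINKS = {
    "64 kbit/s": 64_000,
    "T1 (1.544 Mbit/s)": 1_544_000,
    "100 Mbit/s": 100_000_000,
    "1 Gbit/s": 1_000_000_000,
}

if __name__ == "__main__":
    for name, bps in LINKS.items():
        # 1500 bytes is an assumed, typical full-size Ethernet payload.
        print(f"{name:>18}: {serialization_delay_ms(1500, bps):.3f} ms")
```

Under these assumptions a full-size frame costs a little under 8 ms at T1 speed and shrinks rapidly as the link gets faster.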

Distance Delay is simply the minimum amount of time that it takes the signals that represent bits to travel down the physical medium. Optical cable carries bits at ~5.5 µs/km, copper cable at ~5.606 µs/km, and satellite at ~3.3 µs/km. (There are a few additional microseconds of delay from amplifying repeaters in optical cable, but compared to distance, the delay is negligible.)
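As a quick worked example, the per-kilometer figures quoted above can be turned into a one-way delay estimate. This is only a sketch: the function name and the 4,000 km route length are made up for illustration, and real cable routes are rarely straight lines.

```python
# Propagation figures quoted in the text, in microseconds per kilometer.
US_PER_KM = {"optical": 5.5, "copper": 5.606, "satellite": 3.3}


def distance_delay_ms(distance_km: float, medium: str) -> float:
    """One-way distance delay in milliseconds for a given path length."""
    if medium not in US_PER_KM:
        raise ValueError(f"unknown medium: {medium}")
    return distance_km * US_PER_KM[medium] / 1000.0


if __name__ == "__main__":
    # 4,000 km is a hypothetical cable route length.
    for medium in US_PER_KM:
        print(f"{medium:>9}: {distance_delay_ms(4000, medium):.1f} ms one-way")
```

Doubling the result gives the round-trip floor that no amount of extra bandwidth will remove.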

Distance delay can have a significant impact on application performance for applications that require a large number of network round trips in order to complete a transaction – for example, custom transactional based applications, database queries, and VoIP, which begins to degrade when one-way end-to-end latency exceeds 200-220 milliseconds.

One of the biggest sources of end-user ire is database queries designed to run over a LAN ported to the WAN. For example, if a user executes a SQL database query that requests 100 rows of a database table, one row at a time, over a link with a latency due to distance of 60 ms, it would take approximately 6 seconds (60 ms * 100 turns) to complete the transaction. The same query executed by a user on a LAN connected to the same database server would take less than 2-3 ms to be completed, as the latency due to distance across the LAN is insignificant.
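The arithmetic in that example is easy to reproduce. The sketch below assumes each application turn costs exactly the per-turn latency and ignores server processing and transfer time; the "batched" case is a hypothetical rewrite of the query that fetches all rows in a single turn.

```python
def transaction_time_s(turns: int, latency_per_turn_ms: float) -> float:
    """Total time spent waiting on the network for a chatty transaction."""
    if turns < 0 or latency_per_turn_ms < 0:
        raise ValueError("turns and latency must be non-negative")
    return turns * latency_per_turn_ms / 1000.0


if __name__ == "__main__":
    # Figures from the example in the text: 100 rows, one per turn, 60 ms each.
    print(f"row-at-a-time: {transaction_time_s(100, 60):.2f} s")
    # Hypothetical batched query: the same 100 rows in one turn.
    print(f"batched:       {transaction_time_s(1, 60):.2f} s")
```

The point of the comparison is that cutting turns helps far more than adding bandwidth when distance is the dominant delay.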

Queue Delay is the amount of time a packet must spend in a network buffer waiting its turn to be transmitted. Network interfaces transmit one frame at a time, typically one bit at a time. As such, when two or more packets are forwarded to a network interface at the same time, or close to the same time – one packet is transmitted while the others are put in a queue on the interface buffer to await their turn at the interface. Packets that are put into the queue must wait until they can be transmitted, adding milliseconds of delay.

Increases in Queue Delay can be measured and detected by monitoring traffic along a given network path. Typically, most intermittent increases in latency above the baseline distance latency can be attributed to network congestion. (In order to reduce the possibility of excessive queue delay, application servers that are members of the same application architecture should be placed on the same Ethernet switch and on the same VLAN to ensure they do not have to compete for uplink bandwidth when problems like the one pictured above occur.)

Worse still, if the problem gets worse and packets wait in increasingly longer lines within the queue, the buffer may become full and the packets may be dropped. Packet drop, in turn, causes TCP connections to throttle back on the rate of transmission.
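The queueing and drop behavior described above can be sketched with a toy single-interface simulation. Everything here is an assumption made for illustration: a strict first-in-first-out queue, a buffer measured in packets rather than bytes, and tail drop when the buffer is full. Real routers offer many other queueing disciplines.

```python
from collections import deque


def simulate_fifo(arrivals, link_bps, buffer_packets):
    """Simulate one outbound interface with a finite FIFO buffer.

    arrivals: list of (arrival_ms, size_bytes), sorted by arrival time.
    Returns one entry per packet: its queue delay in ms, or None if dropped.
    """
    in_system = deque()   # finish times of packets being sent or waiting
    link_free_at = 0.0    # time at which the interface becomes idle
    results = []
    for arrival_ms, size_bytes in arrivals:
        # Forget packets that finished transmitting before this arrival.
        while in_system and in_system[0] <= arrival_ms:
            in_system.popleft()
        # One packet is on the wire; the rest occupy buffer slots.
        if in_system and len(in_system) - 1 >= buffer_packets:
            results.append(None)  # buffer full: tail drop
            continue
        start = max(arrival_ms, link_free_at)
        link_free_at = start + size_bytes * 8 / link_bps * 1000.0
        in_system.append(link_free_at)
        results.append(start - arrival_ms)
    return results


if __name__ == "__main__":
    # Five 1500-byte packets arriving at once on a T1 with a 3-packet buffer.
    burst = [(0.0, 1500)] * 5
    print(simulate_fifo(burst, 1_544_000, 3))
```

In this burst the first packet goes straight out, the next three wait progressively longer, and the fifth finds the buffer full and is dropped.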

Those are some of the main causes of latency – but what about jitter?

Jitter is a term that refers to the variance in the arrival rate of packets from the same data flow, and abnormal jitter values can negatively impact real-time applications like VoIP and video. Jitter is typically created by three different mechanisms in a network: variance in Serialization Delays due to variance in packet sizes, variance in per-packet Queue Delay due to packet spacing from multiple sources at a common outbound interface, or packets taking different routes from source to destination – perhaps due to per-packet load sharing or routing issues.
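One common way to put a number on jitter is the smoothed interarrival estimator defined for RTP in RFC 3550. The sketch below uses that estimator as an example; the whitepaper does not say which measure it has in mind, and the timestamps are invented.

```python
def interarrival_jitter_ms(send_ms, recv_ms):
    """Smoothed interarrival jitter in the style of RFC 3550, in ms.

    send_ms and recv_ms are per-packet send and receive timestamps
    for one flow, in the same order.
    """
    if len(send_ms) != len(recv_ms):
        raise ValueError("timestamp lists must be the same length")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # Change in transit time between consecutive packets.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        # Exponential smoothing with gain 1/16, as in the RFC.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    send = [0, 20, 40, 60]            # hypothetical 20 ms voice packets
    steady = [50, 70, 90, 110]        # constant 50 ms transit time
    bumpy = [50, 75, 90, 110]         # second packet delayed by 5 ms
    print(interarrival_jitter_ms(send, steady))
    print(interarrival_jitter_ms(send, bumpy))
```

A flow with constant transit time scores zero no matter how high its latency is, which is the distinction between latency and jitter in a nutshell.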

The most effective way to deal with jitter is by using low-latency queuing for VoIP and video traffic on network interfaces with large serialization and/or queue delays. In addition, endpoints (such as IP phones) can use jitter buffers or playout delay buffers in order to deliver received packets at a constant rate to the end consumer. These buffers are typically 30-50 ms in depth, and thus they attempt to manage jitter values within these values on any single one-way path. While these buffers technically add 30-50ms in latency, they significantly reduce jitter. Since human beings don’t start to notice latency in VoIP or VideoIP applications till it hits about 200ms, if latency can be kept to under 150 milliseconds, then jitter can be significantly reduced using this method.
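A fixed-depth playout buffer of the kind described above can be sketched in a few lines. This is a simplification under stated assumptions: the buffer depth never adapts, the first packet sets the playout clock, and any packet that arrives after its playout slot is simply discarded. The 40 ms depth and the packet timings are hypothetical.

```python
def playout_schedule(packets, buffer_ms=40.0):
    """Assign playout times using a fixed jitter buffer.

    packets: list of (seq, send_ms, arrival_ms) in send order.
    Returns a list of (seq, playout_ms), with None for packets
    that arrive too late to be played.
    """
    if not packets:
        return []
    _, first_send, first_arrival = packets[0]
    # Hold the first packet for buffer_ms, then play at the sender's pace.
    offset = first_arrival + buffer_ms - first_send
    schedule = []
    for seq, send_ms, arrival_ms in packets:
        playout_ms = send_ms + offset
        on_time = arrival_ms <= playout_ms
        schedule.append((seq, playout_ms if on_time else None))
    return schedule


if __name__ == "__main__":
    # Packets sent every 20 ms with transit times of 50, 65, 45 and 100 ms.
    pkts = [(1, 0, 50), (2, 20, 85), (3, 40, 85), (4, 60, 160)]
    for seq, when in playout_schedule(pkts):
        print(seq, "late, discarded" if when is None else f"play at {when} ms")
```

The first three packets come out evenly spaced at 20 ms intervals despite uneven arrivals, while the fourth exceeds what a 40 ms buffer can absorb.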



Waiting for Firefox


It’s Download Day. At 10:00 a.m. PDT, or noon for us in Austin, Firefox 3.0 was released to the public in what the Mozilla Foundation has dubbed “Download Day.” In fact, they’re attempting to set a Guinness World Record for “most downloads in a 24 hour period.”

So, it was a bit of a concern to us because with all those people downloading Web browsers, there would be sure to be traffic spikes on our network. But the “Download Day” promotion is such a huge success that Mozilla is having trouble keeping their own server up. 

At 10:16 a.m. PDT, I can see a “The server at www.spreadfirefox.com is taking too long to respond” error.  Mozilla.org is also unable to resolve. 

At 10:30 a.m. PDT, it’s still not connecting, and I decide to stop hitting refresh and go and eat lunch. Mmm.  Roast Beef. 

At 11:30 a.m. PDT, Spreadfirefox.com is still not resolving, but Mozilla.org does.  That doesn’t last, however, as I go to download Firefox, I get a “Http/1.1 Service Unavailable” error.   I bring up a copy of “Waiting for Godot” in another browser window.

It is 12:00 noon on the Pacific.  Spreadfirefox.com is still not resolving. 

12:30 p.m. PDT.  Still not working.  I clean off my work desk, something I’ve been putting off for a wh—ew, is that mayonnaise?  (I hope that’s mayonnaise.)

1:00 p.m. PDT. No Firefox, but my desk is now clean. (My closet is now dangerous.) Time to catch up on my RSS feeds to find out if there are any interesting leads that I can investigate. Hmm. Wine 1.0 is out, but that really doesn’t have a lot to do with network performance. Reddit seems to have problems with Firefox too. But somebody has to be getting the browser – there’s over 8000 downloads a minute according to the counting tracker. Wait. Some users report the counts running backward… what, are people uploading it back?

1:45 p.m. PDT. Aha!  Finally.  The page resolves and I begin my download… and it redirects me to Firefox 2.0.0.14.  Great.

1:55 p.m. PDT. I download Opera 9.5.

2:00 p.m. PDT. Mozilla’s page finally shows a link to Firefox 3.0 – but still shows the logo for Firefox 2.  The 7.1 MB download starts at around 50kBytes/s – which is pretty lame for the usual 700kBytes/s I can get when I download from work. 

2:15 p.m. I install Firefox 3.0 and launch it.  It’s nice.  It’s certainly more responsive and uses less memory.  However, my Tab Mix Plus extension isn’t compatible, and furthermore, there’s no option to undo closed tabs.  All in all, a disappointment – if it were a restaurant, it would be infamous for slow service and bad food.

Leaving aside the whole “Undo Closed Tabs” issue, you would think that an organization actively trying to beat the world record for the most downloads in a 24 hour period might, you know, be prepared enough to make sure the servers don’t go down?

Additionally, Mozilla has been promoting “Download Day” for some time now, so it makes sense for IT departments to be prepared for the onslaught of downloads coming into the network from users upgrading their PCs to the latest version of the browser – and keep track of the impact that traffic has on the user experience for more mission-critical apps.



The Half-Bakery: 10 gigabit Ethernet, Virtualization, and the Geek in his Natural Habitat


by Brian Boyko
Editor, Network Performance Daily

Enterprises are adopting more 10 gigabit Ethernet, according to Network Instruments, which reported on its Network Observations blog that nearly one quarter of businesses are implementing 10G networks by the end of the year. The larger the company, the more likely a 10G rollout.

There’s certainly evidence of a trend, but is that evidence of a need-based demand? LAN technology at the gigabit Ethernet level typically has low latency – and I don’t see 10G Ethernet helping with that much if at all. Gigabit Ethernet is still a heck of a lot of bandwidth, especially compared to the bandwidth offered by WAN solutions. In any LAN/WAN/LAN traffic path, it’s almost always the WAN that proves to be the bottleneck.

But it is possible, with large VoIP networks, that you could be overloading the LAN capacity and decide to move to 10G for that reason. This could possibly explain why big companies are more likely to have 10G than smaller companies – because if you’re not hitting the bottleneck on the LAN, 10G doesn’t really help you deliver the applications any faster or more effectively.

What I think is more likely is that 10G has hit a price point where it costs about as much to roll out 10G as it does the older technologies. Instead of 10G taking over the market from companies migrating from 1G, it seems that when companies choose to build new systems, they’re choosing to build them in 10G instead of 1G.

But again, it comes down to application delivery. And if we’re not delivering applications faster, the question is then asked – is there any application that is not feasible to execute on a 1G network for which a 10G network would be suitable?

Then I remembered that I’m a geek, and I like my toys.

Specifically, when I move into my new apartment next month, I’ll be back on my own router hardware. My current place has Ethernet built in – it’s a feature that saves me $50 a month, but the complex houses its own routers, which I have no capability to port-forward, which means that I can’t set up a remote desktop connection so that I can check on my home computer from work. And looking forward to being able to do that again reminds me that perhaps one of the new applications that could propel an adoption to 10G might be combining virtualization with remote desktop software – that is, making the end users work from their desk computers on a virtualized environment on a server. This means that you get more life out of older but still usable desktop hardware. According to the FAQ from RealVNC, at 100Mbps per connection, “most tasks will be indistinguishable performed remotely from if they were performed locally.” Still, 100Mbps fills up a 1Gbps LAN pretty quickly. However, a 10Gb LAN might be able to accommodate this new application.
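For a back-of-the-envelope check on how quickly that fills up, here is a small sketch. It assumes the worst case where every remote desktop session consumes its full 100Mbps at the same moment over one shared link; the `usable_fraction` parameter is a made-up knob for protocol overhead and headroom.

```python
def max_sessions(link_mbps: float,
                 per_session_mbps: float = 100.0,
                 usable_fraction: float = 1.0) -> int:
    """How many full-rate sessions fit on a shared link, rounded down."""
    if per_session_mbps <= 0:
        raise ValueError("per_session_mbps must be positive")
    return int(link_mbps * usable_fraction // per_session_mbps)


if __name__ == "__main__":
    print("1G link: ", max_sessions(1_000), "sessions")
    print("10G link:", max_sessions(10_000), "sessions")
    # Hypothetical: leave 30% headroom for everything else on the LAN.
    print("10G link, 70% usable:", max_sessions(10_000, usable_fraction=0.7))
```

In practice sessions are bursty and would share the link far better than this, so treat the result as a pessimistic floor rather than a sizing rule.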

There are limitations – anything using full screen video or animation (a movie, or a 3-D environment) where there are rapid changes of every pixel will require even more bandwidth before it gets “choppy” – which will probably sink my plans of playing Half Life 2 on my Mac via a remote desktop connection to a PC. But this is certainly one of those “think about it” half baked ideas that may become reality in the near future.



Can you have 21st century broadband with 19th century infrastructure?


We’ve mentioned numerous times that U.S. broadband penetration and speed lag behind those of countries more rural and less populated – in other words, countries the U.S. has no excuse for lagging behind.

Ars Technica recently put out an article detailing what differences in national broadband policy exist that have enabled other nations to surpass the U.S.’s broadband capability. Japan and France have local loop unbundling – that allows for more competition among ISPs.  They also both deploy fiber instead of copper even if it doesn’t show an immediate profit, and competing ISPs are rolling out new fiber infrastructure instead of just leasing lines. 

Japan, France, Sweden, and Canada all treat broadband as a “core infrastructure element” – that is, it is treated as vital to the functioning of the national economy as good roads, bridges, tunnels, and electrical grids.

In all fairness, the U.S. can claim the same thing.  The U.S. may have no broadband policy, may be looking to traffic shaping to solve problems that would be better addressed by more fiber rollouts (oh, and by the way, there’s a new $800,000 deep packet inspection device on the market today to help service providers monitor and shape traffic), and may be relying on increasingly obsolete technologies – but at least we treat it the same as we do our roads, bridges, tunnels, and electrical grids. 

Which is to say, not very well at all.  The American Society of Civil Engineers gave the United States infrastructure a “D” in 2005, down from a score of “D+” in 2003 – and to fix those problems would require $1.6 trillion over five years.  Since then, not much has been done, according to this CBS video reposted on RawStory.com.

Instead, the government is considering plans to lease highways to private companies – using tolls to provide a “free market” solution to the infrastructure problem – but which will ultimately be a government sanctioned private monopoly over certain sections of blacktop. It is difficult to see how this would improve infrastructure, rather than simply allowing private companies to charge the maximum people will pay for the minimum infrastructure service people will put up with.

So, as far as treating broadband infrastructure like the rest of America’s infrastructure, it seems we already do that. But what needs to be clear is that broadband infrastructure is infrastructure – that is, it is just as important for the rural area to get good broadband as it was for them to get good roads back during the Eisenhower administration.

In a macabre way, this limited broadband is good for vendors; if broadband were plentiful there wouldn’t be so great a demand for WAN Optimization tools, for example. Sure, WAN Optimization is a good idea anyway, but it is the high cost of bandwidth that spurs demand forward. It is becoming harder to maintain performance not just because of the various new demands on the network but also because the infrastructure across the country is simply inadequate – thus the demand for network performance monitoring tools. Increasing bandwidth doesn’t always solve the network problem, but insufficient bandwidth always creates one.



Podcast: Prof. Michael Geist of the University of Ottawa on Bell Canada's traffic shaping


We've recently covered Bell Canada throttling P2P service. Today, in this podcast, we speak to Professor Michael Geist, Canada Research Chair in Internet and E-commerce Law at the University of Ottawa, regarding the controversial move by Bell Canada to use traffic shaping on wholesale service providers.

A transcript of this podcast will be provided at the earliest opportunity.


