Application Performance Archives

Three Things You Can Do Today To Improve Network Performance Without Spending a Dime


For months, we’ve been waiting to see what the fallout would be from the sub-prime mortgage crisis.

Apparently, the results are not unlike a hefty bag filled with chili con carne, dropped from the top of a skyscraper. Only instead of a hefty bag, it’s the U.S. economy.

So, as Wall Street explodes like an explosive so explosive it could explode and create a massive explosion, technology turnaround times will probably extend a couple more years as CIOs try to figure out how to use existing tools to solve network management problems and improve performance. How do you do that?

Luckily, there are ways to do that – Cisco routers and switches already have “application-aware” technologies and don’t require any additional purchases – including IP Service Level Agreement (IP SLA), Class Based Quality of Service (CBQoS), and Network Based Application Recognition (NBAR).

Managing Application Response Times with Cisco IP SLA

Now, measuring real application transactions is the most accurate method for measuring response times. But, failing that, you can use Cisco IP SLA to create synthetic transactions. This is not only useful during an IT budget crunch but can also provide useful data when assessing whether to roll out a new application, or when measuring a service provider’s SLA edge-to-edge.

IP SLA operates by sending synthetic transactions between two network devices or between a network device and a server. It can be configured to send different types of synthetic transactions based on port, packet size, type of service, and even more advanced characteristics, as is the case with Voice over Internet Protocol (VoIP) tests. When it gets a response, the sender calculates the response-time metrics appropriate for the test type, and then repeats the test multiple times.
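To make the send, time, repeat cycle concrete, here is a rough Python sketch of the same idea run from a workstation instead of a router. It is not IP SLA and does not talk to IOS; the function names and the choice of a TCP connection as the "transaction" are assumptions made for illustration.

```python
import socket
import statistics
import time
from typing import Callable, Dict, List


def tcp_connect_probe(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP connection setup to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the connection is opened and closed; no payload is sent
    return (time.perf_counter() - start) * 1000.0


def run_probes(probe: Callable[[], float], count: int = 5) -> List[float]:
    """Repeat a probe several times and keep every sample."""
    return [probe() for _ in range(count)]


def summarize(samples_ms: List[float]) -> Dict[str, float]:
    """Reduce the samples to the usual min / average / max summary."""
    return {
        "min_ms": min(samples_ms),
        "avg_ms": statistics.mean(samples_ms),
        "max_ms": max(samples_ms),
    }
```

Something like `summarize(run_probes(lambda: tcp_connect_probe("app.example.com", 443)))` would give a crude response-time summary for a placeholder host. The probe is passed in as a callable so other transaction types can be swapped in without touching the rest.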

Some SNMP polling products can collect data automatically, store it in a database, display the results in a GUI, and provide analytical functions beyond data collection, such as calculating baselines, displaying trends, and triggering threshold alerts based on collected IP SLA data. There’s also the possibility of simply getting the information from the CLI, but extracting the IP SLA response-time metrics and copying them to a spreadsheet can be difficult and tedious. However, for the extremely budget-conscious, it can be done.

Deploying Quality of Service with Cisco CBQoS

QoS is a blanket term for network policies and practices that help to manage different types of data traffic that share network links. Effectively, QoS determines how different types of traffic, with different priorities, are handled whenever tradeoffs that are likely to impede performance must be made.

Now, within any enterprise, the end-user experience with certain applications will always be more critical than it is with others. Strategies to avoid (or at least manage) congestion could include dropping traffic, adjusting application responses, and building packet queues. CBQoS is one way to do this – and it comes with the CBQoS Management Information Base (MIB), which collects statistics about the traffic traversing the router and reports how the QoS configuration is being applied.

Here, an SNMP polling product with application-aware capabilities can get information on input and output QoS class map utilization, drop percentage, and packet counts. It can also report pre-versus-post QoS traffic volume, rate, and packet count, and point out traffic marked in conformance, in excess, and in violation of defined policies.
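For the spreadsheet crowd, the drop-percentage arithmetic is simple enough to sketch. The snippet below assumes you already have two polls of per-class packet counters in hand; the class and field names are made up for illustration and are not actual MIB object names.

```python
from dataclasses import dataclass


@dataclass
class ClassCounters:
    """Hypothetical per-class counters from one poll (illustrative field names)."""
    name: str
    pre_policy_packets: int
    dropped_packets: int


def delta(earlier: ClassCounters, later: ClassCounters) -> ClassCounters:
    """SNMP counters are cumulative, so work on the difference between two polls.

    Counter wrap-around is ignored here to keep the sketch short.
    """
    return ClassCounters(
        later.name,
        later.pre_policy_packets - earlier.pre_policy_packets,
        later.dropped_packets - earlier.dropped_packets,
    )


def drop_percentage(c: ClassCounters) -> float:
    """Share of packets offered to the class that the policy dropped."""
    if c.pre_policy_packets == 0:
        return 0.0
    return 100.0 * c.dropped_packets / c.pre_policy_packets
```

With 2,000 packets offered to a class between polls and 50 of them dropped, `drop_percentage` returns 2.5.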

Without CBQoS, network managers don’t have a whole lot of evidence to verify that their QoS settings are actually improving network performance – in fact, they may even be inadvertently harming performance. CBQoS prevents network managers from flying blind with QoS deployments. And, like IP SLA, it’s built into Cisco IOS.

Gaining a New Level of Visibility with Cisco NBAR

From within the network device operating system, Cisco NBAR can inspect packets traversing the device and identify the corresponding application – for example, TCP traffic running on port 80 could be labeled as Google, SAP, SharePoint, SalesForce, etc. NBAR can also provide utilization, volume, and rate metrics on a per-application basis relative to the network circuit carrying the traffic.

It’s similar to NetFlow, but NetFlow identifies the mix of protocol traffic – it does not provide application-layer visibility. NBAR identifies traffic by application – which is important in setting proper QoS policies. And because NBAR is part of Cisco’s IOS, and the data can be collected with an application-aware SNMP poller (which many of you already have), it can be a more cost-effective solution than application discovery hardware.


Nick Carr takes on Colbert


First off, congrats to Nick Carr – we’ve talked with him (and disagreed with him!) often on the blog and we’re thrilled that he managed to go toe-to-toe with Stephen Colbert on last night’s show.

And, thanks to the Colbert Report’s online presence, here’s an embedded player with that interview.


Although the book plugged is “The Big Switch,” the majority of the interview talks more about the implications of dwindling attention spans due to the Internet’s “hyper” hyperlinked nature – a topic not covered in “The Big Switch,” but instead in the cover article Carr wrote for the Atlantic Monthly, “Is Google Making Us Stoopid?”

The idea, as we’ve mentioned before, is that Carr believes the end result of the Internet’s attention-getting behavior is that it will “scatter our attention and diffuse our concentration.”


“When the Net absorbs a medium, that medium is recreated in the Net's image. It injects the medium's content with hyperlinks, blinking ads, and other digital gewgaws, and it surrounds the content with the content of other media it has absorbed. A new e-mail message, for instance, may announce its arrival as we're glancing over the latest headlines at a newspaper's site. The result is to scatter our attention and diffuse our concentration.”


During the interview, Colbert made a play of ignoring Carr to check his iPhone. Now, that does happen in real life, but I’d say that’s more an indication of individual rudeness than of culture spinning on a dime over the concept of hypertext.

The same criticisms that Carr makes of the Internet could be made of the newspaper – you’re trying to read one thing but it’s broken up, put next to all these other interesting articles, and ads designed to catch your attention… with all these… analog gewgaws, how is one supposed to be informed?

Back when the Atlantic Monthly article first came out, we mentioned that limits on network performance limit the ability to communicate complex thought. But we missed an opportunity to get less academic, more practical, and closer to the issues in a corporate network environment.

While we disagree with Carr’s diagnosis that the Internet causes short attention spans, (I’m a pro-blogger at a tech company, raised on Nintendo and MTV – I’m the poster-child for the 21st century digital boy, and still I managed to summon the concentration to read the book Nick Carr wrote…) we do agree that human attention spans are short.

When I worked at a supermarket retailer back in the early 2000s, as I’ve mentioned (and complained about), we were using a Java-based networking app that took one to two seconds to input each number and move to the next field, and processing the entire report took minutes. The network performance was absolutely horrible. As I pointed out before, we would have mentioned it in the hopes of having the performance improved somehow, except that we all realized that our jobs were essentially superfluous anyway and that we could all be replaced by a very small shell script that could parse the orders as they came in, instead of printing them out and having us enter all of them by hand.

Of course the lot of us at the data entry farm had CNN.com, Slashdot, and All Your Base Are Belong To Us and Hamsterdance open while we waited for the pages to load. (It was a simpler time back then.)

Of course, if we didn’t have outside Internet access, we could very well have distracted ourselves offline with desktop toys or conversation. We did that often anyway – as I said, it just took forever for those fields to come up.

I’ve also heard, second and third-hand, stories of other companies who are shocked to find that employees are going on to do other tasks while they wait for reports to generate, fields to come up, and pages to load – so if you’re honestly worried about dwindling attention spans, it might be better to not curse Google or the Internet, but to go in and actually improve things where you can.


Google Chrome and Network Performance – it’s bigger than you think.


When Google Chrome was released, our genuine reaction around the office was something like this:

[Image: ourreaction.jpg]

Okay, so the last thing the world needs is yet another browser. Between IE, Firefox, Safari, Opera, Flock, Konqueror, Epiphany, Camino, Galeon, SeaMonkey, OmniWeb, and, of course, Wii Internet Channel, Web applications developers already have their hands full.

However, if you work in IT, you are either in the business of developing applications or delivering applications. And sometimes the bottleneck in application delivery is the browser. You can have the best network in the world, with only a couple hundred milliseconds of overall delay – but if it takes seconds to render the JavaScript on the front-end, it’s almost academic. At any rate, the end-user probably can’t tell the difference between delays on the network and delays in the client-side browser.

There are two things that make Chrome stand out – the first is running each tab, and each plug-in, as a separate process, with protected memory address space. Problems in one tab will not crash the entire browser.

The other is advances in JavaScript execution. By running JavaScript in separate processes, Chrome keeps buggy JavaScript from hanging the whole browser, as it would if all JavaScript ran in a single thread in one browser process. That scenario should come as no surprise to anyone who has used Firefox and watched as a single buggy JavaScript site forced a restart of every tab in the browser.
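The process-per-tab idea is easy to see in miniature. The sketch below is only an analogy, not how Chrome is built: each "tab" is a tiny Python script launched as its own operating system process, and the function names are invented for the example.

```python
import subprocess
import sys
from typing import Dict


def run_tab(script: str) -> int:
    """Run one 'tab' as its own OS process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", script])
    return result.returncode


def run_all(tabs: Dict[str, str]) -> Dict[str, int]:
    """Run every tab in turn; a failing tab only affects its own entry."""
    return {name: run_tab(script) for name, script in tabs.items()}
```

Calling `run_all({"good": "pass", "buggy": "import sys; sys.exit(3)"})` reports a non-zero exit code for the buggy tab while the parent process and the other tab carry on, which is the point of giving each tab its own protected address space.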

But Chrome also comes with a JavaScript virtual machine, which speeds up JavaScript-based Web applications by turning the interpreted JavaScript code directly into machine code for your processor and OS. Again, that means faster delivery of the application when the browser is the bottleneck.

There are a few nay-sayers out there that are looking at this from a bottom line point of view – that Google is trying to enter into the browser wars and try to own the space – basically, if you use Google’s browser, even if it’s open-source, you’ll view Google’s advertisements, and make Google money. That’s true enough. But what we really should be taking from this is that even if Google’s code wasn’t open-sourced – and it is – these innovative ideas would eventually make their way into other Web browsers in order to stay competitive. Firefox will likely incorporate changes at least by the next full release, and Microsoft, Apple, and Opera Software will do so if they want to remain competitive.

I’m skeptical that Google Chrome will make it onto enough desktops that Google becomes a key competitor in the Browser Wars. Then again, Mosaic was the first widely popular Web browser, and no one uses it today – but we certainly use a lot of the technological ideas behind Mosaic. It really was a quantum leap forward, and though I may be overly optimistic about it, this really is a quantum leap forward in Web application development.

The point is not Google Chrome. The point is the technology behind Google Chrome.


John Dvorak – baiting the cloud


Saying that your business should never, never, never use cloud-based applications instead of desktop or network/server based ones is about as ridiculous as saying that cloud-based applications will eventually replace IT completely.  

With an article that begins with “Cloud computing apps are for suckers. If there is an alternative that runs locally on your own machine, it will always be better,” John C. Dvorak seems to be going from “baiting Mac users” to “baiting Google users.”

But let’s just take the argument at face value.  Some of the points he makes are good ones – specifically, the ones with performance issues. 


I don't care if you have 30-megabit-per-second service—you'll get flaky performance from most online apps, especially if they're popular. Always remember that your online speed is only as good as the speed at which data is coming at you: The application server may be swamped, and the various nodes along the route could become clogged, too. Nothing is ever as fast as the machine sitting on top of (or beneath) your own desk.


Your desktop is faster than the cloud – that’s true – but is your car? Information stored in the cloud can be accessed from any place with a Net connection. Information stored locally can only be accessed locally – well, unless you connect through a VPN or set up a VNC server. But even for those of us who know how to do it, a VNC server is a hassle, and a security risk unless you do it exactly right. 90 minutes is horrendous downtime for an enterprise application, and Dvorak is right insofar as any application where 90 minutes of downtime is unacceptable shouldn’t be put on the cloud.

But there are plenty of applications – and for small-to-medium companies, e-mail is one of them – where the losses incurred from 90 minutes of downtime are less than the cost of having a dedicated in-house application installed and maintained on the network. (If the opposite is true, don’t use cloud computing, use the in-house application, and keep an eye on how it performs.)

Dvorak also points out that your data is at the mercy of the service provider and that if the service is cut off, for whatever reason, so is your data. That’s true, but if you don’t back up your data, it can just as easily be lost in a hard drive crash. Both are about as likely to happen, in my experience.

To Dvorak, “People tend to forget that software is NOT a service; the whole cloud scheme is a scam to lock users into a single product and somehow extract more money from them.”  There is some aspect of vendor lock-in, but mostly cloud computing is a way to provide an application at low startup costs in exchange for revenue over time – whether through advertising, in the case of Google’s apps, or through a subscription model.  Yes, it is very much “renting” rather than “owning,” but that can very well make financial sense in many cases. 

After that, the arguments get a bit silly. 


What happens if the net is attacked and your entire cloud world is gone for days and days? It just happened in the Republic of Georgia, and it can probably happen anywhere.


If the Russians start bombing us, John, I’m sure that the boss will give us a few days off. 


Ask yourself why the heck will we need six-core, high-performance chips if the cloud takes over everything?


Why do we need six-core, high-performance chips now? In a virtualized server, certainly we’ll need power to spare, but unless you’re doing video editing or animation rendering, a six-core chip is probably overkill. And if we stop putting the big iron in the datacenters of big companies (very unlikely), it will pop up in the data centers of the SaaS providers.

When it comes to performance and scalability, absolutely, standard client-server IT applications and local programs are going to have SaaS beat. Final Cut Pro is not going to the cloud. Photoshop isn’t going to the cloud (though Photoshop Elements is…). But the key advantage of cloud computing isn’t performance or scalability – it is portability. This is why people will pay twice as much for a laptop with the same specs as a desktop computer. Mobility is important.


Whose OC3 Line Is It Anyway?


A number of East Coast based customers of World of Warcraft have been experiencing connection delays and uncomfortable lag – and no one seems to know exactly where the problem is.

The New York Post says that Blizzard is blaming Time Warner Cable for the problem:


"The only commonality between all the players experiencing these disconnects and extreme latency is Time Warner/Road Runner," the company said in a June 23 support post.


But the Digital Communications Director for Time Warner has said that the lags and disconnections are not on their end and points to the traceroutes as evidence.


Take a look at some of the traceroutes posted to the thread in question ... starting here, at comment #446: http://tinyurl.com/5gqe27

If you follow the commenter's posted trace results, you'll notice that it's only on TWC's Roadrunner (rr) network for the first 6 hops — with maximum response times of 10 ms. The response time jumps drastically at hop # 11 — when the trace is no longer on the Roadrunner network.

Scroll down further on the same page to comment #456, and you'll see something similar — a giant leap in lag times. However, this trace never touches our network. It starts at Verizon, goes to Alter.net at hop #5, and then jumps to ATT.net's network at hop #8. Hop #9 shows a response time of 114 ms — quite a jump from the 49ms at hop #8.
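The kind of reading described in those traces, finding the hop where response time suddenly jumps, can be sketched in a few lines of Python. Only the hop 8 and hop 9 figures below come from the quoted comment; the other hops are made-up filler, and the function name is invented for the example.

```python
from typing import List, Tuple


def largest_jump(hops: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return (hop_number, increase_ms) for the biggest rise between consecutive hops."""
    if len(hops) < 2:
        raise ValueError("need at least two hops")
    deltas = [
        (hop, rtt - prev_rtt)
        for (_, prev_rtt), (hop, rtt) in zip(hops, hops[1:])
    ]
    return max(deltas, key=lambda d: d[1])


# Hops 8 and 9 use the response times quoted above; hops 7 and 10 are illustrative.
trace = [(7, 45.0), (8, 49.0), (9, 114.0), (10, 116.0)]
print(largest_jump(trace))  # -> (9, 65.0)
```

A jump like this points at where to start looking rather than proving anything, since routers can be slow to answer trace probes without being slow to forward traffic.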


So, what’s going on?

One of the theories is that Time Warner is lying and is throttling World of Warcraft traffic, considering all the bad blood between savvy broadband users and major ISPs over BitTorrent throttling. And while I can’t prove that they’re not doing so, I have to admit that the theory doesn’t seem very likely because of the nature of World of Warcraft.

See, MMORPGs care more about latency than bandwidth. While patch downloads can be huge, the majority of the content of WoW requires low latency to provide instant responses to actions. Latency in WoW can result in an annoyingly choppy game, and a multi-hundred-millisecond delay may be the difference between a slain dragon and hobbit pâté.

So from a bandwidth-saving perspective, an ISP wouldn’t have a whole lot of motive in blocking World of Warcraft or other MMORPGs.

Additionally, Comcast, Time Warner, and other cable companies were rumored to use BitTorrent throttling because both legal and copyright-infringing video files competed with the standard television cable offerings of those companies. That also doesn’t seem to be the case here: while, more generally, time spent playing WoW is time not spent watching TV, it’s not a specific competition. Indeed, MMORPGs are one of the key drivers for broadband speeds in the U.S., and I have trouble believing that TW or any other company would knowingly interfere with such a cash cow.

Indeed, I believe that TW might be reaching out to users to find out more about the problem because TW might be interested in solving the problem instead of losing customers to other ISPs like Verizon FIOS.

Of course, I don’t know anything – and I wish that I had some inside information to figure out what was going on and solve the problem. Not only would I look like a genius but every one of my friends who plays World of Warcraft would hoist me on their shoulders, and treat me like a Lich King for a Day. Sadly, I think that it’s going to take Blizzard and TWC together to try to triangulate why this problem is happening.


New Managed Services Offerings


By David Byrne,
General Manager of Managed Services

Our tools here at NetQoS deliver the capabilities and insight needed to understand what’s going on across the network from a performance perspective to improve application delivery. Some customers find it difficult to maximize the performance of their monitoring tools given the constant change they’re experiencing, which includes everything from new applications to infrastructure changes to staff turnover. And ironically, customers need to fully utilize their performance monitoring tools to minimize the risk of change and ensure performance remains optimal as they deploy new technologies and make alterations to the infrastructure.

While training helps improve knowledge and skills, a managed services approach can make more sense for customers undergoing these challenges.  For example, a company may go through a rapid amount of change in a short period and simply can’t keep up with the performance monitoring on its own.  Or, perhaps a company might have trouble recruiting and retaining technical employees.  Or maybe a company realizes the IT team’s core strength lies outside performance monitoring and doesn’t want their best engineers spending all their time on it.

For these reasons, we’re announcing the NetQoS Managed Services today, which means we can offer customers the benefit of our talent and resources to perform the traditional tasks required to maintain optimal network performance. Because we can deliver our expert knowledge and best practices at a reduced cost compared to a full-time equivalent, many companies will find our Managed Services a compelling option.

Think of all the devices going out on the network today.  Years ago, you didn’t have to worry about things like streaming audio or video or BlackBerry/iPhone compatibility. But change is constant, and right now it’s exploding, which makes putting the right technical resources on the right tasks critical.  For many companies, it doesn’t make sense to have their broadly skilled engineers focus only on one or two tools.   There is a value to focusing on performance, but for many companies, NetQoS Managed Services might be a better way to do it. 


Latency and Jitter


By Kevin Davis
Adapted from “Sources of Latency” Whitepaper

When network users call the Help Desk to report poor application performance, you don’t typically hear things like “The router’s CPU is too busy!,” “The network utilization is above 70%!,” or “The carrier path has failed-over to a sub-optimal path.” Instead, what you’re likely to hear is “The network is slow” or “The calls on my IP phone sound terrible.”

Complaints that end-users lodge are nearly always based on their quality of experience using the application. And their quality of experience is almost always reliant on time.

Anytime a significant delay occurs in the delivery of network data, application performance suffers. Depending on the type of application and how it works, variances in network delay can have a severe impact on application performance, thereby degrading the end-user’s experience.

Two important measurements of time intervals in network transmission systems are referred to as “latency” and “jitter”. Understanding latency and jitter sources and how their values vary in network architectures is critical to engineering application performance and optimizing information resources. For many regular readers, this will be old-hat, but we’ll go over it again.

Network latency is the amount of time it takes for a packet to be transmitted end-to-end across a network and is composed of five variables:


Network Latency = (Distance Delay) + (Serialization Delay) + (Queue Delay) + (Forwarding Delay) + (Protocol Delay)


Serialization Delay refers to the amount of time it takes for a network interface (such as a router’s interface or computer’s NIC) to perform bitwise transmission of a frame onto the outbound media. Forwarding Delay is the amount of time it takes a network device to process a frame/packet by performing a destination address lookup and forwarding the frame/packet to the outbound interface. Protocol Delay is the amount of time that access or transmission algorithms may contribute to the delay of a network frame, and is typically introduced at the endpoints of the data transmission system.

Serialization delay, on a per-packet basis, becomes insignificant at data rates above 1.544 Mbits/s – or a T1. Forwarding delay is typically insignificant in modern routers and switches (when appropriately configured – significant delay can occur in misconfigured routers.) And Protocol delay typically occurs at the access layer or the end points. So the two major variables that have the most effect on network latency are Distance Delay and Queue Delay.
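To see why serialization delay fades as links get faster, the per-packet arithmetic can be sketched as below. The 1,500-byte frame size and the 100 Mbit/s comparison link are assumptions picked for illustration; the function name is made up.

```python
def serialization_delay_ms(frame_bytes: int, link_bits_per_second: float) -> float:
    """Time to clock one frame onto the wire, in milliseconds."""
    return frame_bytes * 8 / link_bits_per_second * 1000.0


# A 1,500-byte frame on a T1 (1.544 Mbit/s) versus a 100 Mbit/s link.
t1_ms = serialization_delay_ms(1500, 1_544_000)
fast_ms = serialization_delay_ms(1500, 100_000_000)
print(round(t1_ms, 2), round(fast_ms, 2))  # -> 7.77 0.12
```

At T1 speed a full-size frame costs several milliseconds per hop; at 100 Mbit/s the same frame costs a small fraction of one.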

Distance Delay is simply the minimum amount of time that it takes the signals that represent bits to travel across the physical medium. Optical cable adds about 5.5 µs/km, copper cable about 5.606 µs/km, and satellite links about 3.3 µs/km. (There are a few additional microseconds of delay from amplifying repeaters in optical cable, but compared to distance, the delay is negligible.)
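Those per-kilometer figures turn into a back-of-the-envelope calculator easily enough. The sketch below uses the numbers from the paragraph above as-is; the 4,000 km path length is an assumption for the example, and the names are invented.

```python
# Per-kilometer propagation figures as given in the text, in microseconds.
DELAY_US_PER_KM = {"optical": 5.5, "copper": 5.606, "satellite": 3.3}


def distance_delay_ms(path_km: float, medium: str) -> float:
    """One-way distance delay in milliseconds for a path over the given medium."""
    return path_km * DELAY_US_PER_KM[medium] / 1000.0


# A hypothetical 4,000 km optical path, one way.
print(distance_delay_ms(4000, "optical"))  # -> 22.0
```

Double the result for a round trip, and remember that real cable routes are rarely straight lines, so the path length is usually longer than the map distance.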

Distance delay can have a significant impact on application performance for applications that require a large number of network round trips in order to complete a transaction – for example, custom transaction-based applications, database queries, and VoIP, which begins to degrade when one-way end-to-end latency exceeds 200-220 milliseconds.

One of the biggest sources of end-user ire is database queries designed to run over a LAN and then ported to the WAN. For example, if a user executes a SQL database query that requests 100 rows of a database table, one row at a time, over a link with a latency due to distance of 60 ms, it would take approximately 6 seconds (60 ms * 100 turns) to complete the transaction. The same query executed by a user on a LAN connected to the same database server would take less than 2-3 ms to be completed, as the latency due to distance across the LAN is insignificant.
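The 100-row example works out as below. The batching function is an added illustration, on the assumption that the application could be changed to fetch several rows per turn; the names are made up.

```python
import math


def turns_needed(rows: int, rows_per_turn: int) -> int:
    """Number of network turns needed to fetch all rows."""
    return math.ceil(rows / rows_per_turn)


def transaction_time_ms(turns: int, latency_per_turn_ms: float) -> float:
    """Lower bound on transaction time when every turn pays the distance latency."""
    return turns * latency_per_turn_ms


# One row per turn, as in the example: 100 turns at 60 ms each.
print(transaction_time_ms(turns_needed(100, 1), 60))   # -> 6000
# Hypothetical rewrite fetching 50 rows per turn: 2 turns.
print(transaction_time_ms(turns_needed(100, 50), 60))  # -> 120
```

The link is the same in both cases; only the number of turns changes, which is why chatty applications suffer most when they move from the LAN to the WAN.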

Queue Delay is the amount of time a packet must spend in a network buffer waiting its turn to be transmitted. Network interfaces transmit one frame at a time, typically one bit at a time. As such, when two or more packets are forwarded to a network interface at the same time, or close to the same time – one packet is transmitted while the others are put in a queue on the interface buffer to await their turn at the interface. Packets that are put into the queue must wait until they can be transmitted, adding milliseconds of delay.

Increases in Queue Delay can be measured and detected by monitoring traffic along a given network path. Typically, most intermittent increases in latency above the baseline distance latency can be attributed to network congestion. (In order to reduce the possibility of excessive queue delay, application servers that are members of the same application architecture should be placed on the same Ethernet switch and on the same VLAN to ensure they do not have to compete for uplink bandwidth when problems like the one pictured above occur.)

Worse still, if the congestion continues and packets wait in increasingly long lines within the queue, the buffer may become full and packets may be dropped. Packet drops, in turn, cause TCP connections to throttle back their rate of transmission.

Those are some of the main causes of latency – but what about jitter?

Jitter is a term that refers to the variance in the arrival rate of packets from the same data flow, and abnormal jitter values can negatively impact real-time applications like VoIP and video. Jitter is typically created by three different mechanisms in a network: variance in Serialization Delays due to variance in packet sizes, variance in per-packet Queue Delay due to packet spacing from multiple sources at a common outbound interface, or packets taking different routes from source to destination – perhaps due to per-packet load sharing or routing issues.
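One common way to put a number on jitter is the running interarrival jitter estimate defined for RTP in RFC 3550. That is a smoothed average deviation rather than a statistical variance, so treat the sketch below as one possible measure, not the definition used above. Timestamps are assumed to be in milliseconds, and the function name is invented.

```python
from typing import Sequence


def interarrival_jitter(send_ms: Sequence[float], recv_ms: Sequence[float]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    For each consecutive pair of packets, D is the change in transit time:
    (receive spacing) minus (send spacing). The estimate moves 1/16 of the
    way toward |D| on every packet.
    """
    if len(send_ms) != len(recv_ms):
        raise ValueError("need one receive time per send time")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter
```

Packets sent every 20 ms that also arrive every 20 ms give an estimate of zero, however long the path is; the estimate only rises when the spacing at the receiver stops matching the spacing at the sender.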

The most effective way to deal with jitter is by using low-latency queuing for VoIP and video traffic on network interfaces with large serialization and/or queue delays. In addition, endpoints (such as IP phones) can use jitter buffers or playout delay buffers in order to deliver received packets at a constant rate to the end consumer. These buffers are typically 30-50 ms in depth, so they can absorb jitter up to roughly that amount on any single one-way path. While these buffers technically add 30-50 ms of latency, they significantly reduce jitter. Since human beings don’t start to notice latency in VoIP or video-over-IP applications until it hits about 200 ms, keeping network latency under 150 milliseconds leaves room for the buffer, so jitter can be significantly reduced this way without the added delay becoming noticeable.


The Application Delivery Engineer


by Patrick Ancipink

Things used to be easy.

No, wait. Things never used to be easy. In fact, they were horribly complex and frustrating, to the point where engineers pulled their hair out. But now we usually expect around 99.99umpteen% uptime from our network equipment.

So frustration today often stems from the new tasks that enterprise IT engineers are expected to handle beyond the routers and switches.  Application delivery controllers, WAN Optimization controllers, and more latency sensitive applications such as VoIP and Teleconferencing simply mean that the IT teams are being tasked with problems that require them to think in new ways about what it means to be in IT.

If you’ve been to any networking convention or conference, you’ve probably heard “in IT you either develop applications or deliver applications” more times than you’ve seen the Brady Bunch episode in which Marcia gets hit in the face with a football. That doesn’t make it any less true.

Ann Bednarz, writing for Network World, suggests that companies take research firm Gartner’s advice and look to hire “application delivery architects and engineers.” The idea is that there should be at least one person in the IT department whose full time job is worrying about application delivery and tuning on a WAN – someone who can converse with application developers and security teams and end users. 

At NetQoS, we’re trying to help companies get the information they need to either designate and train an existing member of the IT staff for these new responsibilities, or at least know what to look for when hiring for an Application Delivery Engineer position.

For example, some things we’re doing right now include our NetAnalyst training, based on real-world examples of resolving complex network application issues, and integrating our multiple products together in the NetQoS Performance Center.

But there are some more subtle ways in which we’re hoping to get this point across. We argue that the most important metric for network performance management is application response time. And while there are many things that can affect application response time, the most basic is that your best possible application response time is limited by the latency of the connection (especially in financial applications), multiplied by the number of connections that the application has to make. Network engineers often focus on only one aspect of that formula, latency, while application developers focus only on the other aspect, the connections. (That’s if they bother to think about the impact of the app on the network at all. And if they do, their test environment sorely lacks any similarity to the real-world WAN.)

So the value of developing the role of the Application Delivery Engineer, someone who can coordinate the two halves of that Application Response Time equation, becomes clear. 


Podcast: Dr. Jim Metzler on the Next Generation NOC


In a few minutes, Jim Metzler of Ashton, Metzler, and Associates, will be delivering his keynote on the Next Generation NOC at NetQoS Symposium 2008 at Barton Creek Resort in Austin. Last week, we pre-recorded a podcast with Dr. Metzler regarding the speech he is about to give and what he means by a "next generation NOC."

He talks about the changing role of the NOC and moves in enterprises towards integrating what were once separate stovepipe functions to focus on application delivery.

The podcast is below.


Podcast: Dr. Jim Metzler talks about Handbook of Application Delivery 2008 and NetQoS Symposium.


Today, in this podcast, we speak to Dr. Jim Metzler of Ashton, Metzler, and Associates regarding his handbook, "The Handbook of Application Delivery 2008," and his upcoming keynote speech at NetQoS Symposium 2008.



