Network Performance Archives

This-specific-end-to-that-specific-end network performance management.


EMA analyst Dennis Drogseth had a column in Network World yesterday talking about end-to-end application management. In it, he had this to say:


You might believe, and with some real justification, that the term “end to end” is only used by vendors who custom-fit the definition to the scope of their particular product.

Does “end-to-end” application management, for instance, include the mainframe? You bet it does if you’re a vendor that manages the mainframe environment! Does it include capturing the end user experience at the end station, desktop, or mobile device? Once again, the answer is a definitive “yes” if you’re a vendor that has strong QoE (Quality of Experience) roots. Or how about insights into the code and design of the application itself? If you’re one of the few vendors that does this, you’re proud of it and wouldn’t have it any other way!


And this concerned me because, if you do a Google search for [site:networkperformancedaily.com “end-to-end”], you get 122 results. The phrase “end-to-end” appears in a little more than 1 in 5 posts we’ve made to this blog.

So, what do we mean by “end-to-end?”  We’re usually using the phrase in connection with network response times and the end-user experience at the end station; NetQoS is a “vendor that has strong QoE roots.”

Now, we do have some insight into the code and design of the application.  But that isn’t the focus of our tools; the focus is to tell you whether the problem is in the network, server, or application, and if it’s in the application, give you a good idea of where to start your investigation.  (For example, an application that is slow due to unnecessary round-trip transactions behaves differently from an application that is slow due to a memory leak on the server where it is being run.) 

Drogseth is right when he says that no one vendor is optimized to do it all.  In the future, there could be, but then you run into the quality vs. quantity problem.  Is it better to do it all adequately or to do a few things extremely well?

EMA defined five major technology spheres, and last June, they polled more than 400 respondents to find out which of them they believed “most critical to end-to-end application management in 2008.”  The answer was “Network Application Management,” focusing on application flows and end-to-end (as we define it) transaction capabilities. 

For more information on this, I recommend you read the original article up at Network World.  Additionally, Drogseth promises to follow up in his next two columns.



What do YOU think of the NetQoS Performance Center?



Peter Sevcik and Rebecca Wetzel, analysts with NetForecast, would like to know (and of course we at NetQoS are always looking for feedback). They have published an article about NetQoS for their “App Performance View” blog on NetworkWorld.com. It is part of a series of blog posts they are writing about “tools that monitor application performance in real-world environments.” At the end, they ask customers to comment on their experience with the NetQoS Performance Center, and we would like to encourage all of you who have experience with any NetQoS products to respond: What do you think of the NetQoS Performance Center?

You may respond anonymously if you wish, and the comment can be as short or as long as you want to make it. Feedback is encouraged on this blog as well.

And if you are just thinking about deploying any of our products, check out the customer comments already posted at the end of the Network World post!



Three Things You Can Do Today To Improve Network Performance Without Spending a Dime


For months, we’ve been waiting to see what the fallout would be from the sub-prime mortgage crisis.

Apparently, the results are not unlike a hefty bag filled with chili con carne, dropped from the top of a skyscraper. Only instead of a hefty bag, it’s the U.S. economy.

So, as Wall Street explodes like an explosive so explosive it could explode and create a massive explosion, technology turnaround times will probably extend a couple more years as CIOs try to figure out how to use existing tools to solve network management problems and improve performance. How do you do that?

Luckily, there are ways to do that. Cisco routers and switches already ship with “application-aware” technologies that don’t require any additional purchases, including IP Service Level Agreement (IP SLA), Class Based Quality of Service (CBQoS), and Network Based Application Recognition (NBAR).

Managing Application Response Times with Cisco IP SLA

Now, measuring real application transactions is the most accurate method for measuring response times. But, failing that, you can use Cisco IP SLA to create synthetic transactions. This is not only useful during an IT budget crunch but can also provide useful data when assessing whether or not to roll out a new application, or when measuring a service provider’s SLA edge-to-edge.

IP SLA operates by sending synthetic transactions between two network devices or between a network device and a server. It can be configured to send different types of synthetic transactions based on port, packet size, type of service, and even more advanced characteristics, as is the case with Voice over Internet Protocol (VoIP) tests. When it gets a response, the sender then calculates the response-time metrics appropriate for the test type, and then repeats multiple times.
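
To make the idea of a synthetic transaction concrete, here is a minimal sketch in Python. It is not IP SLA and does not run on a router; it only imitates the pattern described above (send a probe, time the response, repeat, summarize) by timing TCP handshakes from an ordinary host. The function names and the default repeat count are my own assumptions.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP handshake to host:port and return it in milliseconds."""
    start = time.perf_counter()
    # create_connection returns only after the three-way handshake completes.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Reduce repeated probe results to the figures a report would show."""
    return {
        "min": min(samples_ms),
        "avg": statistics.fmean(samples_ms),
        "max": max(samples_ms),
    }


def run_probe(host: str, port: int, repeats: int = 5) -> dict[str, float]:
    """Repeat the synthetic transaction several times and summarize it."""
    return summarize([tcp_connect_ms(host, port) for _ in range(repeats)])
```

Calling something like run_probe("intranet.example.com", 443) (a hypothetical host) returns the minimum, average, and maximum connect time in milliseconds. A real IP SLA operation measures from the router's point of view, which is the whole point; this only shows the shape of the measurement.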

Some SNMP polling products can collect data automatically, store it in a database, display the results in a GUI, and provide analytical functions beyond data collection, such as calculating baselines, displaying trends, and triggering threshold alerts based on collected IP SLA data. There’s also the possibility of simply getting the information from the CLI, but extracting the IP SLA response-time metrics and copying them to a spreadsheet can be difficult and tedious. However, for the extremely budget-conscious, it can be done.
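
If you do end up with response times in a spreadsheet, the baseline-and-threshold step is easy to sketch. The example below assumes the samples are already in a Python list; using the median as the baseline and "twice the baseline" as the alert threshold are illustrative choices of mine, not what any particular product does.

```python
import statistics


def baseline_ms(history_ms: list[float]) -> float:
    """Use the median of past samples as the 'normal' response time."""
    return statistics.median(history_ms)


def breaches_threshold(sample_ms: float, baseline: float, factor: float = 2.0) -> bool:
    """Flag a sample that exceeds the baseline by the chosen multiple."""
    return sample_ms > baseline * factor


history = [42.0, 40.0, 45.0, 41.0, 43.0]  # made-up response times in ms
normal = baseline_ms(history)
for latest in (44.0, 95.0):
    status = "ALERT" if breaches_threshold(latest, normal) else "ok"
    print(f"{latest:.0f} ms vs baseline {normal:.0f} ms: {status}")
```

With these made-up numbers the baseline is 42 ms, so 44 ms passes and 95 ms trips the alert.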

Deploying Quality of Service with Cisco CBQoS

QoS is a blanket term for network policies and practices that help to manage different types of data traffic that share network links. Effectively, QoS determines how different types of traffic, with different priorities, are handled whenever tradeoffs that are likely to impede performance must be made.

Now, within any enterprise, the end-user experience with certain applications will always be more critical than it is with others. Strategies to avoid (or at least manage) congestion could include dropping traffic, adjusting application responses, and building packet queues. CBQoS is one way to do this, and it comes with the CBQoS Management Information Base (MIB), which collects statistics about the traffic traversing the router and reports how the QoS configuration is being applied.

Here, an SNMP polling product with application-aware capabilities can get information on input and output QoS class map utilization, drop percentage, and packet counts. It can also get information on pre-versus-post QoS traffic volume, rate, and packet count. It can also point out traffic marked in conformance, in excess, and in violation of defined policies.
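
As a rough illustration of what a poller does with those counters, here is a sketch that turns two successive polls of one QoS class into a drop percentage and a traffic rate. The field names are hypothetical and are not the actual CBQoS MIB object names; the sketch also assumes the counters are cumulative and did not wrap between polls.

```python
from dataclasses import dataclass


@dataclass
class ClassCounters:
    """One poll of a QoS class; field names are illustrative, not MIB objects."""
    pre_policy_bytes: int   # traffic offered to the class before QoS acted
    post_policy_bytes: int  # traffic forwarded after QoS acted
    dropped_packets: int
    total_packets: int


def drop_percent(prev: ClassCounters, curr: ClassCounters) -> float:
    """Percentage of packets dropped between two polls of cumulative counters."""
    packets = curr.total_packets - prev.total_packets
    drops = curr.dropped_packets - prev.dropped_packets
    return 100.0 * drops / packets if packets else 0.0


def rate_bps(prev_bytes: int, curr_bytes: int, interval_s: float) -> float:
    """Convert a byte-counter delta into bits per second."""
    return (curr_bytes - prev_bytes) * 8 / interval_s


earlier = ClassCounters(0, 0, 0, 0)
later = ClassCounters(1_000_000, 900_000, 50, 1_000)  # made-up poll, 60 s later
print(f"drops: {drop_percent(earlier, later):.1f}%")
print(f"pre-QoS rate:  {rate_bps(earlier.pre_policy_bytes, later.pre_policy_bytes, 60):.0f} bps")
print(f"post-QoS rate: {rate_bps(earlier.post_policy_bytes, later.post_policy_bytes, 60):.0f} bps")
```

Comparing the pre- and post-policy rates for each class is what tells you whether a policy is actually shaping the traffic you think it is.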

Without CBQoS, network managers don’t have a whole lot of evidence to verify that their QoS settings are actually improving network performance – in fact, they may even be inadvertently harming performance. CBQoS prevents network managers from flying blind with QoS deployments. And, like IP SLA, it’s built into Cisco IOS.

Gaining a New Level of Visibility with Cisco NBAR

From within the network device operating system, Cisco NBAR can inspect packets traversing the device and identify the corresponding application – for example, TCP traffic running on port 80 could be labeled as Google, SAP, SharePoint, SalesForce, etc. NBAR can also provide utilization, volume, and rate metrics on a per-application basis relative to the network circuit carrying the traffic.

It’s similar to NetFlow, but NetFlow reports traffic mixes by protocol and port; it doesn’t provide application-layer visibility. NBAR identifies traffic by application, which is important in setting proper QoS policies. And because NBAR is part of Cisco’s IOS, and the data can be collected with an application-aware SNMP poller (which many of you already have), it can be a more cost-effective solution than application discovery hardware.
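
The "per-application metrics relative to the circuit" part is just arithmetic once you have byte counts per application. Here is a small sketch; the application names, byte counts, polling interval, and the T1-sized link are all made-up inputs, and the function name is my own.

```python
def utilization_by_app(
    bytes_by_app: dict[str, int], interval_s: float, link_bps: float
) -> dict[str, float]:
    """Return each application's share of link capacity, in percent."""
    result = {}
    for app, nbytes in bytes_by_app.items():
        rate_bps = nbytes * 8 / interval_s  # bytes over the interval -> bits/s
        result[app] = 100.0 * rate_bps / link_bps
    return result


# Made-up five-minute sample on a 1.544 Mbps circuit.
sample = {"http": 28_950_000, "sap": 5_790_000}
for app, pct in utilization_by_app(sample, interval_s=300, link_bps=1_544_000).items():
    print(f"{app}: {pct:.1f}% of the circuit")
```

In this sample, "http" is eating half the circuit, which is exactly the kind of fact you want in hand before writing a QoS policy.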



Nick Carr takes on Colbert


First off, congrats to Nick Carr – we’ve talked with him (and disagreed with him!) often on the blog and we’re thrilled that he managed to go toe-to-toe with Stephen Colbert on last night’s show.

And, thanks to the Colbert Report’s online presence, here’s an embedded player with that interview.


Although the book plugged is “The Big Switch,” the majority of the interview talks more about the implications of dwindling attention spans due to the Internet’s “hyper” hyperlinked nature – a topic not covered in “The Big Switch,” but instead in the cover article Carr wrote for the Atlantic Monthly, “Is Google Making Us Stupid?”

The idea, as we’ve mentioned before, is that Carr believes the end result of the attention-getting behavior of the Internet is that it will “scatter our attention and diffuse our concentration.”


“When the Net absorbs a medium, that medium is recreated in the Net's image. It injects the medium's content with hyperlinks, blinking ads, and other digital gewgaws, and it surrounds the content with the content of other media it has absorbed. A new e-mail message, for instance, may announce its arrival as we're glancing over the latest headlines at a newspaper's site. The result is to scatter our attention and diffuse our concentration.”


During the interview, Colbert made a play of ignoring Carr to check his iPhone. Now, that does happen in real life, but I’d say that’s more an indication of individual rudeness than of culture spinning on a dime over the concept of hypertext.

The same criticisms that Carr makes of the Internet could be made of the newspaper – you’re trying to read one thing but it’s broken up, put next to all these other interesting articles, and ads designed to catch your attention… with all these… analog gewgaws, how is one supposed to be informed?

Back when the Atlantic Monthly article first came out, we argued that the limits on network performance limit the ability to communicate complex thought. But we missed an opportunity to get less academic, more practical, and closer to the issues in a corporate network environment.

While we disagree with Carr’s diagnosis that the Internet causes short attention spans, (I’m a pro-blogger at a tech company, raised on Nintendo and MTV – I’m the poster-child for the 21st century digital boy, and still I managed to summon the concentration to read the book Nick Carr wrote…) we do agree that human attention spans are short.

When I worked at a supermarket retailer back in the early 2000s, as I’ve mentioned (and complained about), we were using a Java-based networking app that took one to two seconds to input each number and move to the next field, and processing the entire report took minutes. The network performance was absolutely horrible. As I pointed out before, we would have mentioned it in the hopes of having the performance improved somehow, except that we all realized that our jobs were essentially superfluous anyway: we could all be replaced by a very small shell script that parsed the orders as they came in, instead of printing them out and having us enter all of them by hand.

Of course the lot of us at the data entry farm had CNN.com, Slashdot, and All Your Base Are Belong To Us and Hamsterdance open while we waited for the pages to load. (It was a simpler time back then.)

Of course, if we didn’t have outside Internet access, we could very well have distracted ourselves offline with desktop toys or conversation. We did that often anyway – as I said, it just took forever for those fields to come up.

I’ve also heard, second and third-hand, stories of other companies who are shocked to find that employees are going on to do other tasks while they wait for reports to generate, fields to come up, and pages to load – so if you’re honestly worried about dwindling attention spans, it might be better to not curse Google or the Internet, but to go in and actually improve things where you can.



A few of a many, or many of a few?


Ken Church, Albert Greenberg and James Hamilton of Microsoft recently put out a paper on “Delivering Embarrassingly Distributed Cloud Services.”[PDF] Like most papers of this type, it’s a dry read, but informative. It looks at the tradeoff between mega-data center size and micro-data center diversity from the viewpoints of both total cost of ownership and performance.

The most important line in the entire report, of course, is “The trade-offs vary by application.” However, they make the argument that applications with little need for server-to-server communications will show benefits in cost, scale, reliability and performance through geo-diversification – in other words, lots of little datacenters as opposed to one big datacenter.

This seems to fly in the face of the trend toward data center consolidation, but there is a point to it: any data center needs redundancy, and a single centralized data center needs proportionally more of it than a set of small data centers does. As Church, Greenberg, and Hamilton put it, “the more geo-diversity, the better. N+1 redundancy becomes more attractive for large N.”
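
One way to read that N+1 line is as simple arithmetic. The sketch below is my own simplification, not a formula from the paper: it assumes N equal-sized sites carry the load and one more identical site sits in reserve, so the spare is 1/(N+1) of everything you build.

```python
def spare_capacity_fraction(n: int) -> float:
    """Share of total built capacity held in reserve under N+1 redundancy."""
    return 1 / (n + 1)


for n in (1, 4, 9, 19):
    print(f"N={n:>2}: {spare_capacity_fraction(n):.0%} of built capacity is spare")
```

With N=1 (one big data center plus a full standby) half of what you pay for sits idle; with N=19 it is 5 percent. That is the sense in which redundancy gets cheaper as the sites get smaller and more numerous.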

The part that really interested me, though, was the networking section. (Section 3, in case you want to skip right to it.) Church, Greenberg, and Hamilton point out that in a large, centralized datacenter, you can have end-to-end control and assure a particular level of performance through supported service level agreements. On the other hand, they argue:


“[with distributed data centers] the cloud service provider has ceded control of quality to its Internet access providers, and so cannot support (or even fully monitor) SLAs on flows that cross out multiple provider networks, as the bulk of the traffic will do. However, by artfully exploiting the diversity in choice of network providers and using performance sensitive global load balancing techniques, performance may not appreciably suffer. Moreover, by exploiting geo-diversity in design, there may be attendant gains in reducing latency…”



“Many large analysis applications are best run centrally in mega data centers… Interactive applications are best run near users… [they] can be delivered with better QoS (e.g., smaller TCP round trip times…) via micro data centers.”


The argument’s sound, especially when you consider that interactive applications are probably the most latency sensitive because they need to make multiple trips to and from the client and server with every interaction.
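
To see why round trips dominate for interactive apps, here is a back-of-the-envelope sketch. It assumes the round trips happen one after another and ignores server time and bandwidth entirely; the turn count and the two round-trip times are made-up numbers.

```python
def network_wait_ms(round_trips: int, rtt_ms: float) -> float:
    """Time spent waiting on the network when round trips run back to back."""
    return round_trips * rtt_ms


TURNS = 20  # hypothetical number of client/server exchanges per interaction
for label, rtt in (("distant mega data center", 80.0), ("nearby micro data center", 10.0)):
    print(f"{label}: {network_wait_ms(TURNS, rtt):.0f} ms per interaction")
```

Under those assumptions the same interaction costs 1,600 ms from far away and 200 ms from nearby, without a single line of application code changing.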

But reducing the propagation delay (or distance delay) is merely one part of the performance equation. By ceding control over router performance and transmission, you have no way of diagnosing network round trip time problems if they occur, and wouldn’t be able to fix them – short of the messy step of changing service providers – even if you did. If something goes wrong, it could negate the speed gains from diversifying servers, so moving to this model is more of a gamble than a guarantee of improvement. Granted, it’s a gamble that might make sense for some apps and some organizations – some apps, apparently, can get away with less than 100% uptime.



Google Chrome and Network Performance – it’s bigger than you think.


When Google Chrome was released, our genuine reaction around the office was something like this:

[Image: ourreaction.jpg]

Okay, so the last thing the world needs is yet another browser. Between IE, Firefox, Safari, Opera, Flock, Konqueror, Epiphany, Camino, Galeon, SeaMonkey, OmniWeb, and, of course, Wii Internet Channel, Web applications developers already have their hands full.

However, if you work in IT, you are either in the business of developing applications or delivering applications. And sometimes the bottleneck in application delivery is the browser. You can have the best network in the world, with only a couple hundred milliseconds of overall delay – but if it takes seconds to render the JavaScript on the front-end, it’s almost academic. At any rate, the end-user probably can’t tell the difference between delays on the network and delays in the client-side browser.

There are two things that make Chrome stand out – the first is running each tab, and each plug-in, as a separate process, with protected memory address space. Problems in one tab will not crash the entire browser.

The other is advances in JavaScript execution. Because JavaScript runs in a separate process for each tab, buggy JavaScript can’t hang the whole browser, like it would if JavaScript ran in a single thread in a single browser process. That scenario should come as no surprise to anyone who has used Firefox and watched as a single buggy JavaScript site made you restart all the tabs on your browser.
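
The isolation idea is easier to see in a toy example. The sketch below is a loose analogy in Python and says nothing about how Chrome is actually built: each "tab" is a separate operating system process, one of them dies on purpose, and the parent and the other tab carry on. The tab names and scripts are invented for the illustration.

```python
import subprocess
import sys

# Each "tab" is a tiny script run in its own process.
TABS = {
    "buggy tab": "import os; os._exit(1)",  # simulates a crash
    "good tab": "print('rendered')",
}


def run_tabs(tabs: dict[str, str]) -> dict[str, str]:
    """Run every tab in a separate process and record how each one ended."""
    outcomes = {}
    for name, script in tabs.items():
        proc = subprocess.run(
            [sys.executable, "-c", script], capture_output=True, text=True
        )
        outcomes[name] = "ok" if proc.returncode == 0 else "crashed"
    return outcomes


print(run_tabs(TABS))
```

The buggy tab runs first and crashes, yet the good tab still renders and the parent still reports on both. In a single-process design, that first failure would have taken everything down with it.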

But Chrome also comes with a JavaScript virtual machine, which speeds up JavaScript-based Web applications by turning the interpreted JavaScript code directly into machine-code for your processor and OS. Again, faster delivery of the application, when the browser is the bottleneck.

There are a few nay-sayers out there that are looking at this from a bottom line point of view – that Google is trying to enter into the browser wars and try to own the space – basically, if you use Google’s browser, even if it’s open-source, you’ll view Google’s advertisements, and make Google money. That’s true enough. But what we really should be taking from this is that even if Google’s code wasn’t open-sourced – and it is – these innovative ideas would eventually make their way into other Web browsers in order to stay competitive. Firefox will likely incorporate changes at least by the next full release, and Microsoft, Apple, and Opera Software will do so if they want to remain competitive.

I’m skeptical that Google Chrome will make it onto enough desktops that Google becomes a key competitor in the Browser Wars. Then again, Mosaic was the first widely popular Web browser, and no one uses it today – but we certainly use a lot of the technological ideas behind Mosaic. It really was a quantum leap forward, and though I may be overly optimistic about it, Chrome really is a quantum leap forward in Web application development.

The point is not Google Chrome. The point is the technology behind Google Chrome.



Cisco’s WAAS and the Olympics


I can’t believe I missed this the first time around.

I was so focused on how the online Olympic video was getting through the last mile, that I completely forgot to ask: How the heck are they getting it from Beijing to the U.S.?

Douglas Gourlay at Cisco has been blogging about how NBC’s been using Cisco’s Wide Area Application Services (WAAS) for WAN optimization, so that NBC’s video editors can use three 155Mbps OC-3 pipes, combined and load-balanced (with, of course, Cisco gear) to get the files directly from Beijing. While I’m not 100% sure that “as if they were stored locally” holds true, it’s clear that WAAS is capable of some amazing stuff – we know because NetQoS has SuperAgent integration on WAAS devices and ACE load balancers. We track stuff like that all the time.


“This reduces operating costs of housing, air travel, transportation, and food. Avoiding 800 airplane trips also supports NBC’s green initiatives for the Olympic Games.”


It also probably makes the video editors a bit grumpy that they didn’t get to go to Beijing.

What I’m curious about is what will happen after the Olympics. Just as Olympic stadiums still stand – and are used – in every host city, I’m wondering if the infrastructure that NBC has to Beijing to deliver high definition video will remain after the Olympics. As China starts to become a new superpower, more news and information is bound to come from Beijing, after all.

And if this can be done for one series of events in one major city, is it that far off from having video-heavy WANs in every city to cover every major event?



Complexity of Thought is Limited by Network Performance


[Ed. Note: The article referenced in this post has since been published online on the Atlantic Monthly Web site.]

Nick Carr, author of “The Big Switch” and “Does IT Matter?” has written “Is Google Making Us Stupid?,” and it has been published in the July issue of the Atlantic Monthly.

Sadly, I called up my local bookstore and they only currently carry the June issue of the Atlantic Monthly, so I can’t give you a well informed critique of Carr’s thoughts.  However, the article is quoted – minimally – by Matt Asay of C|Net.

Of course, there’s a certain amount of irony that Asay seems to use very limited excerpts from Carr’s article to decry the “soundbite culture.” 

Then again, I’m about to give you my thoughts on an article that Asay has read and I haven’t, so if Asay’s article is ironic, this one is downright hypocritical.  So be it.  This seems to be the most important direct quote from Carr’s original article:

The Internet promises to have particularly far-reaching effects on cognition....The Internet, an immeasurably powerful computing system, is subsuming most of our other intellectual technologies. It's becoming our map and our clock, our printing press and our typewriter, our calculator and our telephone, and our radio and TV.

When the Net absorbs a medium, that medium is recreated in the Net's image. It injects the medium's content with hyperlinks, blinking ads, and other digital gewgaws, and it surrounds the content with the content of other media it has absorbed. A new e-mail message, for instance, may announce its arrival as we're glancing over the latest headlines at a newspaper's site. The result is to scatter our attention and diffuse our concentration.

Of course you could make the opposite point.  With RSS feeds, news aggregators, and “long tail” blogs, there is also a point to be made that instead of distracting us and diffusing our concentration, we end up hyper-focused on one or two topics to the complete exclusion of everything else.  (The “scattering” effect actually came up quite a bit in my grad-level journalism classes as a defense of the dying newspaper – that you get to see articles you may not have been interested out of the corner of your eye while you read articles that you are interested in.)

But as I mentioned, I haven’t read the entire article; so instead of taking apart Carr’s argument – let’s put forward a new one. 

The limits on network performance then in turn limit the ability to communicate complex thought. 

Let’s start with Twitter.  A Twitter post is to information what bumper stickers are to philosophy: at 140 characters, there’s not much that can be done.  But Twitter already suffers from network performance problems and outages presumably related to scale.  If Twitter allowed longer posts, that would increase the amount of data traversing the network.

You may be asking: So what?  Twitter, at its core, isn’t very different from the “Friends” feature of LiveJournal – and you can post long posts on LiveJournal.  This is mostly true, but there is a major difference between Twitter’s model and LiveJournal’s.  LiveJournal’s “Friends” posts are pulled out of the database at various different times by various readers who actively “pull” the information to their Web browsers.  Twitter, on the other hand, “pushes” the information like an IM client – and does so simultaneously to multiple users.  Twitter’s big selling point is immediacy – latency needs to be low.  Since a single Twitter user can have hundreds or even thousands of subscribers… well, you can see the implications.  Twitter’s performance problems may seem incongruous for such a “simple little app” but are actually quite complex.
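
Here is the implication spelled out as arithmetic. This is a deliberately crude sketch: it assumes one byte per character and one copy pushed per subscriber, and it ignores protocol overhead and whatever Twitter actually does behind the scenes. The function name and the follower counts are my own inventions.

```python
def push_payload_bytes(post_chars: int, subscribers: int, bytes_per_char: int = 1) -> int:
    """Bytes sent at once when one post is pushed to every subscriber."""
    return post_chars * bytes_per_char * subscribers


for chars in (140, 1_400):
    for subs in (100, 1_000):
        total = push_payload_bytes(chars, subs)
        print(f"{chars:>5} chars x {subs:>5} subscribers = {total:>9,} bytes pushed")
```

A 140-character post to 1,000 subscribers is 140,000 bytes that all have to leave at roughly the same moment; make the post ten times longer and so is the burst. In a pull model, those same reads are spread out over whenever people happen to load the page.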

So, for right now, 140 characters is all that Twitter can handle.  You can blog, you can email, you can IM to express more complex ideas, but because Twitter requires additional demands on the network, the medium’s ability to express complex thought is limited by the performance of the network.

To take a further point, let’s look at YouTube.  There’s another arbitrary limit – 10 minutes or 100MB of data.  Here, the relationship between performance and the limit is a bit more direct; though other video services allow for longer/bigger videos, none of them have the demand that YouTube has.

But the relationship between complex thought and YouTube is a bit less direct – certainly a complex thought can be expressed in 10 minutes.  Perhaps not completely examined like a book – but certainly expressed.  And comparatively, the 10 minute YouTube video delivers subtlety and nuance to the point where it is replacing the 10 second sound-bite usually found on television.  In this case, the medium’s ability to express complex thought is limited by the performance of the network but is still more informative than the alternative. 

Then again, it’s all about how we use the information; if we used every bit of information in a Cisco Telepresence rig to send text, there would be no human that would be able to parse that much text, that quickly.  The 100MB used for those 10 minutes of YouTube video could also hold the entire text of War and Peace 32 times over. 

To talk about whether the Internet makes you stupid (as SomethingAwful.com has been decrying for years) is to oversimplify a complex idea.  If, during the course of one’s Internet browsing, one is easily distracted when looking for information, this distraction will interfere with one’s ability to think about things in depth.  On the other hand, if one thinks about things in depth and does not allow for some distraction, one can end up with a deep, but not particularly broad, amount of information.  Neither one really decreases one’s actual intelligence; it’s just the way that one looks at different subjects.

Eventually I’ll manage to read Mr. Carr’s article and address these points in more depth.  Right now, however, I’m forced to conclude that Google is not making us stupid.

4Chan is making us stupid.   



Podcast: Dr. Jim Metzler talks about Handbook of Application Delivery 2008 and NetQoS Symposium.


Today, in this podcast, we speak to Dr. Jim Metzler at Ashton, Metzler, and Associates regarding his handbook, "The Handbook of Application Delivery 2008," and his upcoming keynote speech at NetQoS Symposium 2008.




Symposium Preview: Kevin Davis on Time-based Troubleshooting.


Kevin Davis, a senior consultant at NetQoS, will be presenting a few training sessions at Symposium about SuperAgent, the end-to-end response time module of the NetQoS Performance Center. This will include a training session about how to use time-based network metrics in troubleshooting.  He talks about his upcoming training session below.

In the session, I’m going to be covering the importance of using a time-based metric in troubleshooting, because end-users complain foremost about time.  For example, they’ll say “the application is running slow,” or they believe “the network is slow.”  To users, everything is based on time, that’s what they’re complaining about.  And they’re correct.

It’s very new to many people to think of performance in terms of “time” – which may seem counterintuitive – because most people are used to reading utilization graphs.  With utilization graphs, however, we don’t know if 70 or 80 or 90 percent utilization is necessarily impacting the user experience.  I mean, we buy networking equipment, routers, switches, firewalls, servers, and we want them to be highly – or efficiently – utilized.  Seeing high utilization could indicate a problem – or it could just indicate that you haven’t over-purchased.  So you can have a link at 90 percent utilization or a router at 90 percent CPU utilization, but you won’t know if that’s impacting the end-user without a time-based metric.

It’s time-based data that tells you how the users are being impacted.  Sure, the utilization data – the interface utilization, memory utilization, I/O utilization – can often tell you what is causing the impact.  But the time base shows you the degree of the impact – the real-world effect on end-users.  With a time-based instrument, such as NetQoS SuperAgent, you can find out where the delay increase is occurring, and whether it’s based in the network, server, or application.

In fact, you can take a look at time-based data and make a determination very quickly as to which entity is creating the performance issue – the beautiful thing about SuperAgent, in particular, is that it trends by time 24/7, so not only can you determine how your important business applications are being impacted today, but you can go back and look at recurring patterns in performance issues.  You can see if today is worse than yesterday or last week or last month.
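
To illustrate the kind of comparison described here, the sketch below takes a response time split into network, server, and application components and finds which one grew the most against a baseline. It is a generic illustration with made-up numbers and invented names; it is not how SuperAgent computes or labels its metrics.

```python
def largest_increase(
    baseline_ms: dict[str, float], today_ms: dict[str, float]
) -> tuple[str, float]:
    """Return the component whose delay grew the most, and by how much."""
    deltas = {name: today_ms[name] - baseline_ms[name] for name in baseline_ms}
    worst = max(deltas, key=deltas.get)
    return worst, deltas[worst]


last_week = {"network": 40.0, "server": 25.0, "application": 60.0}  # ms, made up
today = {"network": 42.0, "server": 180.0, "application": 61.0}     # ms, made up

component, extra = largest_increase(last_week, today)
print(f"Largest increase: {component} (+{extra:.0f} ms)")
```

With these numbers the answer is the server, up 155 ms, while the network barely moved. A utilization graph alone would not have told you that.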

In the session, I’ll also be going over how to architect the data center for performance.  Placement of servers that participate in inter-architectures is critical for the health and performance of the application and indeed the data center.  We also talk about how different protocols, for example, Microsoft’s TCP/IP stack, can impact application performance by enhancing or degrading it. 

It’s important for servers that are serving the same application.  For example, a front-end Web server and a back-end Oracle database really should be on the same switch on the same VLAN.  That way they receive optimum service from the network.  If they do leave the switch, they’ll have to contend with bandwidth going up and down the switch links, and they’ll be switched and routed multiple times. 

Based on measurements from customer environments and from our own laboratories, when two servers are on different switches they can have up to 18 milliseconds of delay between them.  If we think of that in the network engineer’s terms of one millisecond per 100 miles, then when we put two servers on different switches, or on two different VLANs on the same switch, we’re in effect making it look like those servers are 1800 miles apart – like one server is in Los Angeles and the other is in Memphis.
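
The conversion behind that comparison is a one-liner. The sketch below only restates the rule of thumb quoted above (one millisecond per 100 miles); the function name is my own.

```python
MILES_PER_MS = 100.0  # rule of thumb from the text: 1 ms of delay per 100 miles


def equivalent_miles(delay_ms: float) -> float:
    """Express a delay as the distance that would produce it."""
    return delay_ms * MILES_PER_MS


print(f"18 ms of switch-to-switch delay looks like {equivalent_miles(18):.0f} miles")
```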


