Network Performance Archives

A few of a many, or many of a few?


Ken Church, Albert Greenberg and James Hamilton of Microsoft recently put out a paper on “Delivering Embarrassingly Distributed Cloud Services.” [PDF] Like most papers of this type, it’s a dry read, but informative. It looks at the tradeoff between mega-data center size and micro-data center diversity from the viewpoints of both total cost of ownership and performance.

The most important line in the entire report, of course, is “The trade-offs vary by application.” However, they make the argument that applications with little need for server-to-server communications will show benefits in cost, scale, reliability and performance through geo-diversification – in other words, lots of little datacenters as opposed to one big datacenter.

This seems to fly in the face of the trend toward data consolidation, but there is a point to it: any data center needs redundancy, but a single centralized data center needs proportionally more of it than a group of small data centers does. As Church, Greenberg, and Hamilton put it, “the more geo-diversity, the better. N+1 redundancy becomes more attractive for large N.”
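To make the “N+1” point concrete, here is a minimal sketch of the arithmetic. It assumes equally sized sites and a single spare site held in reserve; the function name is made up for illustration and is not from the paper.

```python
def spare_capacity_fraction(n_sites: int) -> float:
    """Spare capacity for N+1 redundancy, as a fraction of active capacity.

    Assumes n_sites equally sized active sites plus one extra site of the
    same size held in reserve to cover a single failure.
    """
    if n_sites < 1:
        raise ValueError("need at least one active site")
    return 1 / n_sites


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50):
        pct = spare_capacity_fraction(n)
        print(f"N={n:>2}: spare capacity is {pct:.0%} of active capacity")
```

Under these assumptions, one big data center needs a full duplicate to survive a failure, while ten small ones need only one tenth extra.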

The part that really interested me, though, was the networking section. (Section 3, in case you want to skip right to it.) Church, Greenberg, and Hamilton point out that in a large, centralized datacenter, you can have end-to-end control and assure a particular level of performance through supported service level agreements. On the other hand, they argue:


“[with distributed data centers] the cloud service provider has ceded control of quality to its Internet access providers, and so cannot support (or even fully monitor) SLAs on flows that cross out multiple provider networks, as the bulk of the traffic will do. However, by artfully exploiting the diversity in choice of network providers and using performance sensitive global load balancing techniques, performance may not appreciably suffer. Moreover, by exploiting geo-diversity in design, there may be attendant gains in reducing latency…”



“Many large analysis applications are best run centrally in mega data centers… Interactive applications are best run near users… [they] can be delivered with better QoS (e.g., smaller TCP round trip times…) via micro data centers.”


The argument’s sound, especially when you consider that interactive applications are probably the most latency sensitive because they need to make multiple trips to and from the client and server with every interaction.

But reducing the propagation delay (or distance delay) is merely one part of the performance equation. By ceding control over router performance and transmission, you have no way of diagnosing network round trip time problems if they occur, and wouldn’t be able to fix them – short of the messy step of changing service providers – even if you did. If something goes wrong, it could negate the speed gained by diversifying servers, so moving to this model is more of a gamble than a guarantee of improvement. Granted, it’s a gamble that might make sense for some apps and some organizations – some apps, apparently, can get away with less than 100% uptime.



Google Chrome and Network Performance – it’s bigger than you think.


When Google Chrome was released, our genuine reaction around the office was something like this:

[Image: ourreaction.jpg]

Okay, so the last thing the world needs is yet another browser. Between IE, Firefox, Safari, Opera, Flock, Konqueror, Epiphany, Camino, Galeon, SeaMonkey, OmniWeb, and, of course, Wii Internet Channel, Web applications developers already have their hands full.

However, if you work in IT, you are either in the business of developing applications or delivering applications. And sometimes the bottleneck in application delivery is the browser. You can have the best network in the world, with only a couple hundred milliseconds of overall delay – but if it takes seconds to render the JavaScript on the front-end, it’s almost academic. At any rate, the end-user probably can’t tell the difference between delays on the network and delays in the client-side browser.

There are two things that make Chrome stand out – the first is running each tab, and each plug-in, as a separate process, with protected memory address space. Problems in one tab will not crash the entire browser.

The other is advances in JavaScript execution. By running JavaScript in a separate process, buggy JavaScript can’t hang the browser, like it would if JavaScript ran in a single thread in one browser process. That scenario should come as no surprise to anyone who has used Firefox and watched as a single buggy JavaScript site made you restart all the tabs in your browser.

But Chrome also comes with a JavaScript virtual machine, which speeds up JavaScript-based Web applications by turning the interpreted JavaScript code directly into machine-code for your processor and OS. Again, faster delivery of the application, when the browser is the bottleneck.

There are a few nay-sayers out there who are looking at this from a bottom-line point of view – that Google is trying to enter the browser wars and own the space – basically, if you use Google’s browser, even if it’s open-source, you’ll view Google’s advertisements, and make Google money. That’s true enough. But what we really should be taking from this is that even if Google’s code weren’t open-sourced – and it is – these innovative ideas would eventually make their way into other Web browsers, because those browsers need them to stay competitive. Firefox will likely incorporate changes at least by the next full release, and Microsoft, Apple, and Opera Software will do so if they want to keep up.

I’m skeptical that Google Chrome will make it onto enough desktops that Google becomes a key competitor in the Browser Wars. Then again, Mosaic was the first widely popular Web browser, and no one uses it today – but we certainly use a lot of the technological ideas behind Mosaic. It really was a quantum leap forward, and though I may be overly optimistic about it, Chrome really is a quantum leap forward in Web application development.

The point is not Google Chrome. The point is the technology behind Google Chrome.



Cisco’s WAAS and the Olympics


I can’t believe I missed this the first time around.

I was so focused on how the online Olympic video was getting through the last mile that I completely forgot to ask: How the heck are they getting it from Beijing to the U.S.?

Douglas Gourlay at Cisco has been blogging about how NBC’s been using Cisco’s Wide Area Application Services (WAAS) for WAN optimization, so that NBC’s video editors can use three 155Mbps OC-3 pipes, combined and load-balanced (with, of course, Cisco gear) to get the files directly from Beijing. While I’m not 100% sure that “as if they were stored locally” holds true, it’s clear that WAAS is capable of some amazing stuff – we know because NetQoS has SuperAgent integration on WAAS devices and ACE load balancers. We track stuff like that all the time.


“This reduces operating costs of housing, air travel, transportation, and food. Avoiding 800 airplane trips also supports NBC’s green initiatives for the Olympic Games.”


It also probably makes the video editors a bit grumpy that they didn’t get to go to Beijing.

What I’m curious about is what will happen after the Olympics. Just as Olympic stadiums still stand – and are used – in every host city, I’m wondering if the infrastructure that NBC has to Beijing to deliver high definition video will remain after the Olympics. As China starts to become a new superpower, more news and information is bound to come from Beijing, after all.

And if this can be done for one series of events in one major city, is it that far off from having video-heavy WANs in every city to cover every major event?



Complexity of Thought is Limited by Network Performance


[Ed. Note: The article referenced in this post has since been published online on the Atlantic Monthly Web site.]

Nick Carr, author of “The Big Switch” and “Does IT Matter?” has written “Is Google Making Us Stupid?,” and it has been published in the July issue of the Atlantic Monthly.

Sadly, I called up my local bookstore and they currently carry only the June issue of the Atlantic Monthly, so I can’t give you a well-informed critique of Carr’s thoughts. However, the article is quoted – minimally – by Matt Asay of C|Net.

Of course, there’s a certain amount of irony that Asay seems to use very limited excerpts from Carr’s article to decry the “soundbite culture.” 

Then again, I’m about to give you my thoughts on an article that Asay has read and I haven’t, so if Asay’s article is ironic, this one is downright hypocritical.  So be it.  This seems to be the most important direct quote from Carr’s original article:

The Internet promises to have particularly far-reaching effects on cognition....The Internet, an immeasurably powerful computing system, is subsuming most of our other intellectual technologies. It's becoming our map and our clock, our printing press and our typewriter, our calculator and our telephone, and our radio and TV.

When the Net absorbs a medium, that medium is recreated in the Net's image. It injects the medium's content with hyperlinks, blinking ads, and other digital gewgaws, and it surrounds the content with the content of other media it has absorbed. A new e-mail message, for instance, may announce its arrival as we're glancing over the latest headlines at a newspaper's site. The result is to scatter our attention and diffuse our concentration.

Of course, you could make the opposite point. With RSS feeds, news aggregators, and “long tail” blogs, there is also a case to be made that instead of distracting us and diffusing our concentration, the Net leaves us hyper-focused on one or two topics to the complete exclusion of everything else. (The “scattering” effect actually came up quite a bit in my grad-level journalism classes as a defense of the dying newspaper – that you get to see articles you may not have been interested in out of the corner of your eye while you read articles that you are interested in.)

But as I mentioned, I haven’t read the entire article; so instead of taking apart Carr’s argument – let’s put forward a new one. 

Limits on network performance in turn limit the ability to communicate complex thought.

Let’s start with Twitter. A Twitter post is to information what bumper stickers are to philosophy; at 140 characters, there’s not much that can be done. But Twitter already suffers from network performance problems and outages presumably related to scale. If Twitter allowed longer posts, that would increase the amount of data traversing the network.

You may be asking: So what? Twitter, at its core, isn’t very different from the “Friends” feature of LiveJournal – and you can post long posts on LiveJournal. This is mostly true, but there is a major difference between Twitter’s model and LiveJournal’s. LiveJournal’s “Friends” posts are pulled out of the database at different times by various readers who actively “pull” the information to their Web browsers. Twitter, on the other hand, “pushes” the information like an IM client – and does so simultaneously to multiple users. Twitter’s big selling point is immediacy – latency needs to be low. Since a single Twitter user can have hundreds or even thousands of subscribers… well, you can see the implications. Twitter’s performance problems may seem incongruous for such a “simple little app” but are actually quite complex.

So, for right now, 140 characters is all that Twitter can handle. You can blog, you can email, you can IM to express more complex ideas, but because Twitter places additional demands on the network, the medium’s ability to express complex thought is limited by the performance of the network.
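A rough sketch of the fan-out arithmetic behind that argument follows. This is not Twitter’s actual architecture, just the push model as described above; the one-byte-per-character default and the per-message overhead parameter are assumptions for illustration.

```python
def push_fanout_bytes(post_chars: int, followers: int,
                      bytes_per_char: int = 1, overhead_bytes: int = 0) -> int:
    """Bytes sent when one post is pushed to every follower at once.

    bytes_per_char and overhead_bytes are illustrative assumptions, not
    measured values for any real service.
    """
    return followers * (post_chars * bytes_per_char + overhead_bytes)


if __name__ == "__main__":
    for chars in (140, 1_400, 14_000):
        total = push_fanout_bytes(chars, followers=1_000)
        print(f"{chars:>6} chars x 1,000 followers = {total:,} bytes per post")
```

In a pull model the same bytes are spread out over whenever readers happen to load the page; in a push model they all leave at once, which is why post length and follower count multiply.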

To take the point further, let’s look at YouTube. There’s another arbitrary limit – 10 minutes or 100MB of data. Here, the relationship between performance and the limit is a bit more direct; though other video services allow for longer/bigger videos, none of them have the demand that YouTube has.

But the relationship between complex thought and YouTube is a bit less direct – certainly a complex thought can be expressed in 10 minutes.  Perhaps not completely examined like a book – but certainly expressed.  And comparatively, the 10 minute YouTube video delivers subtlety and nuance to the point where it is replacing the 10 second sound-bite usually found on television.  In this case, the medium’s ability to express complex thought is limited by the performance of the network but is still more informative than the alternative. 

Then again, it’s all about how we use the information; if we used every bit of information in a Cisco Telepresence rig to send text, there would be no human that would be able to parse that much text, that quickly.  The 100MB used for those 10 minutes of YouTube video could also hold the entire text of War and Peace 32 times over. 
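For the curious, the two figures above can be turned into an average bitrate and an implied size for one copy of the text. This is a back-of-the-envelope sketch that assumes 1MB means 10**6 bytes; the constant and function names are made up for illustration.

```python
LIMIT_BYTES = 100 * 10**6   # the 100MB limit, assuming 1MB = 10**6 bytes
LIMIT_SECONDS = 10 * 60     # the 10 minute limit


def average_bitrate_bps(size_bytes: float, duration_s: float) -> float:
    """Average bits per second needed to deliver size_bytes in duration_s."""
    return size_bytes * 8 / duration_s


def bytes_per_copy(total_bytes: float, copies: int) -> float:
    """Size of one copy if `copies` of them exactly fill total_bytes."""
    return total_bytes / copies


if __name__ == "__main__":
    mbps = average_bitrate_bps(LIMIT_BYTES, LIMIT_SECONDS) / 1e6
    print(f"Average video bitrate at the limit: {mbps:.2f} Mbps")
    mb = bytes_per_copy(LIMIT_BYTES, 32) / 1e6
    print(f"Implied size of one copy of the text: {mb:.3f} MB")
```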

To talk about whether the Internet makes you stupid (as SomethingAwful.com has been decrying for years) is to oversimplify a complex idea. If, during the course of your Internet browsing, you are easily distracted when looking for information, this distraction will interfere with your ability to think about things in depth. On the other hand, if you think about things in depth and do not allow for some distraction, you can end up with a deep, but not particularly broad, store of information. Neither one really decreases your actual intelligence; it’s just the way that you look at different subjects.

Eventually I’ll manage to read Mr. Carr’s article and address these points in more depth.  Right now, however, I’m forced to conclude that Google is not making us stupid.

4Chan is making us stupid.   



Podcast: Dr. Jim Metzler talks about Handbook of Application Delivery 2008 and NetQoS Symposium.


Today, in this podcast, we speak to Dr. Jim Metzler at Ashton, Metzler, and Associates regarding his handbook, "The Handbook of Application Delivery 2008" and his upcoming keynote speech at NetQoS Symposium 2008.




Symposium Preview: Kevin Davis on Time-based Troubleshooting.


Kevin Davis, a senior consultant at NetQoS, will be presenting a few training sessions at Symposium about SuperAgent, the end-to-end response time module of the NetQoS Performance Center. This will include a training session about how to use time-based network metrics in troubleshooting.  He talks about his upcoming training session below.

In the session, I’m going to be covering the importance of using a time-based metric in troubleshooting, because end-users complain foremost about time. For example, they’ll say “the application is running slow,” or they believe “the network is slow.” To users, everything is based on time; that’s what they’re complaining about. And they’re correct.

Thinking of performance in terms of “time” is very new to many people, counterintuitive as that may seem, because most people are used to reading utilization graphs. With utilization graphs, however, we don’t know if 70 or 80 or 90 percent utilization is necessarily impacting the user experience. I mean, we buy networking equipment, routers, switches, firewalls, servers, and we want them to be highly – or efficiently – utilized. Seeing high utilization could indicate a problem – or it could just indicate that you haven’t over-purchased. So you can have a link at 90 percent utilization or a router at 90 percent CPU utilization, but you won’t know if that’s impacting the end-user without a time-based metric.

It’s time-based data that tells you how the users are being impacted. Sure, the utilization data – the interface utilization, memory utilization, I/O utilization – can often tell you what is causing the impact. But the time-based data shows you the degree of the impact – the real-world effect on end-users. With a time-based instrument, such as NetQoS SuperAgent, you can find out where the delay increase is occurring, and whether it’s based in the network, server, or application.

In fact, you can take a look at time-based data and make a determination very quickly as to which entity is creating the performance issue – the beautiful thing about SuperAgent, in particular, is that it trends by time 24/7, so not only can you determine how your important business applications are being impacted today, but you can go back and look at recurring patterns in performance issues.  You can see if today is worse than yesterday or last week or last month.

In the session, I’ll also be going over how to architect the data center for performance.  Placement of servers that participate in inter-architectures is critical for the health and performance of the application and indeed the data center.  We also talk about how different protocols, for example, Microsoft’s TCP/IP stack, can impact application performance by enhancing or degrading it. 

Placement matters most for servers that are serving the same application. For example, a front-end Web server and a back-end Oracle database really should be on the same switch on the same VLAN. That way they receive optimum service from the network. If their traffic does leave the switch, it will have to contend for bandwidth going up and down the switch links, and it will be switched and routed multiple times.

Based on measurements from customer environments and from our own laboratories, when two servers are on different switches they can have up to 18 milliseconds delay between them. If we think of that in the network engineer’s terms of one millisecond per 100 miles, then when we put two servers on different switches, or on two different VLANs on the same switch, we’re in effect making it look like those servers are 1800 miles apart – like one server is in Los Angeles and the other is in Memphis.
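The conversion Davis describes is a one-liner. As a small sketch using his rule of thumb of one millisecond per 100 miles (the constant and function names are made up for illustration):

```python
MS_PER_100_MILES = 1.0  # the rule of thumb quoted above


def equivalent_distance_miles(delay_ms: float) -> float:
    """Distance that would produce delay_ms under the rule of thumb."""
    return delay_ms / MS_PER_100_MILES * 100


if __name__ == "__main__":
    for delay in (1, 5, 18):
        miles = equivalent_distance_miles(delay)
        print(f"{delay:>2} ms of delay looks like {miles:,.0f} miles")
```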



Cisco Beefs Up WAN and Application Acceleration Materials


by Patrick Ancipink
Director of Product Marketing, NetQoS

There’s been a lot of growth (and attendant hype) in technology areas like WAN optimization and application acceleration over the past few years, and for good reason. Anything that helps companies speed up and reduce the risk of strategic IT initiatives like consolidating data centers, turning up new branches or serving an increasingly mobile and scattered user community will be popular.

To help cope with the increasing reliance on the WAN and keep latency in check, there is a dizzying array of vendors and products out there – but if you’re trying to determine precisely which techniques and technologies to implement for your specific needs, the array of vendors quickly goes from “dizzying” to “disorienting” and finally “nauseating.”

Cisco’s been in this Tilt-a-Whirl™ of a market for a while (and NetQoS has been right there with them) and they’ve taken some big steps recently to provide a more holistic approach that centers on building an “application aware” network, rather than trying to highlight one type of implementation against another for a narrow set of capabilities.

NetQoS started working closely and exclusively with Cisco to help customers evaluate, measure, and prove the effectiveness of WAN optimization and application acceleration deployments. As customers move from pilot phases into full production, before/after measurements and comprehensive monitoring are critical to ensure that they are getting the benefits they intended and doing what they need to deliver application performance.

To help get the word out, Cisco just launched a new section of their web site today that contains a wealth of information about, as they call it, “WAN and Application Optimization.” The downloadable presentation, Cisco WAN and Application Optimization Technical Overview Presentation, puts Cisco technologies (and complementary ones, NetQoS included) into a useful context with a methodical approach and framework built around four steps: Profile and Baseline, Optimize, Evolve, and Operate. A whole Campbell’s Factory of Cisco alphabet soup technologies is included (WAAS, ACE, NBAR, NetFlow, CBQoS, IP SLA, PfR) to show how they work in concert and what role they play in the bigger picture.

There’s also the Cisco WAN and Application Optimization Solution Guide, a very in-depth publication (like 227 pages deep) that is targeted for “technical personnel involved in the specification, design, and implementation of specific WAN and application optimization solutions.” We, here at NetQoS, are proud to have contributed several sections to the book regarding the methodology and implementation of network performance monitoring for WAN optimization and application acceleration.

(If you are looking for some lighter fare, the video on the site tells a nice story in about 6 minutes including an airshow, snowmobiles, windsurfers, and skydiving—interesting choices for demonstrating the criticality of serving video over the WAN.  Then again, some company somewhere has to make the recreational products, I suppose.)



Windows Server 2008 launched


Windows Server 2008 officially launched today with little fanfare; but the new enterprise-class operating system has been eagerly awaited by people who eagerly await operating systems, instead of going out and having a good time with their lives.

NetworkWorld has a thorough review of the W2K8 OS up on their site, which spends a bit of time tracking network input/output performance in various tests.

We tested network I/O performance using both emulated I/O and various traffic/assault tests (see How we did it) and found Windows 2008 Server performance has improved - and especially improved when Vista is the client….
The new stacks also have the ability to dynamically respond to communications latency in network connections as they possess the ability to dynamically change TCP packet window size, which allows a communication channel to be more efficiently stuffed with data.

This isn't that surprising; we've covered the redesigned TCP/IP stack previously when Vista came out. What is interesting, however, is that Vista provides the most benefit. Adoption of new server OSes tends to be slow, but so has adoption of Vista on work client computers, with many choosing to stay with XP SP2. For companies concerned about network performance, W2K8 might speed up adoption of desktop Vista. But conversely, Vista's drawbacks (real and perceived) might slow down adoption of W2K8.

In our testing we found that under light loads, the effects in terms of speed of tasks like copying folders, streaming media and loading complex Web pages aren't strongly demonstrated, but the effects under heavy loads, however, favors performance for Vista, strongly. Depending on the mixture of I/O (but pronounced under streaming media and heavy file copying), Vista can be as much as 43% faster than Windows XP SP2 in copying operations and 18% faster in opening concurrent streams.
This also means that there's a two-class affinity for clients of Windows 2008 Server Editions - Vista and everyone else, including Windows XP SP2, MacOS (we used 10.4.10 and 10.5.2) or other SAMBA clients that use SAMBA 3.0.2+ connection methods. If you have a client with the new stack, you're more efficient, and, therefore faster under higher loads, but you're a second-class citizen if your stack isn't up to date.

What I'd like to know is what, specifically, makes W2K8-server/Vista-client combinations so powerful. Is it just Compound TCP? Are there kernel optimizations for network data processing? (I don't have the technical knowledge to address those questions; I'm hoping that my readers will be able to share their theories and the results of any tests they may run.)
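This doesn't answer those questions, but it may help show why a dynamically sized TCP window matters at all. A sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window size divided by round trip time. The sketch below is just that general arithmetic, not a model of the Vista or W2K8 stack, and the function names are made up for illustration.

```python
def max_throughput_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput with one window in flight per round trip."""
    return window_bytes * 8 / rtt_s


def window_needed_bytes(link_bps: float, rtt_s: float) -> float:
    """Window size needed to keep a link of link_bps full at a given RTT."""
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    # 65,535 bytes is the largest window TCP can advertise without scaling.
    for rtt_ms in (1, 10, 100):
        mbps = max_throughput_bps(65_535, rtt_ms / 1000) / 1e6
        print(f"64KB window, {rtt_ms:>3} ms RTT: at most {mbps:,.1f} Mbps")
    need = window_needed_bytes(100e6, 0.1)
    print(f"Filling 100 Mbps at 100 ms RTT needs about {need:,.0f} bytes")
```

A fixed small window is fine on a LAN but starves a high-latency link, which is consistent with the review's finding that the differences show up under load rather than in light use.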

At any rate, while W2K8 is a significant milestone release, for good or ill, the history of server software distribution usually means a slow rollout period - to the point where naming your operating system by year becomes almost a bitter irony; chances are most companies who use W2K3 will want to roll out W2K8 in 2009 at the earliest.



Interesting network applications and the worthwhile endeavor of "attempting not to get blown up."


Just a quick post today - I wanted to call attention to an article by David Talbot of MIT's Technology Review, entitled "A Technology Surges" about how DARPA produced a kind of wikified Google Maps for Iraq-stationed patrol commanders.

The application, called the "Tactical Ground Reporting System" or, because the military loves acronyms, "TIGR" - is a wonderful thing. Junior officers who command patrols study data telling them about key buildings, location data on past attacks, etc., and then they can add the information they found out on their patrol to the map-centered database for the next patrol to study. Using cameras with embedded GPS technology, they can take pictures of the scene on the ground and add them to the database as well.

And of course, the system was designed with the Iraq theatre's networking performance needs in mind.

Deploying it widely required dealing with two main challenges raised by Iraq's spotty data connections: how to synchronize scattered copies of the same database, any one of which a returning patrol leader might modify, and how to give soldiers multimedia information without crashing the system. One solution was a network that carefully rations out bandwidth. For example, the default mode for any photograph is a thumbnail version. A soldier has to click on the thumbnail to see a larger version and will get a response only if bandwidth allows.
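The thumbnail-first rule in that passage can be sketched in a few lines. This is only an illustration of the idea as the article describes it, not TIGR's actual logic; the function name and the 256 kbps threshold are hypothetical.

```python
def choose_image_version(clicked_thumbnail: bool, available_kbps: float,
                         min_kbps_for_full: float = 256.0) -> str:
    """Pick which version of a photo to send.

    The thumbnail is the default. The full image is sent only when the
    user asked for it and the link has enough bandwidth to spare.
    min_kbps_for_full is a made-up threshold for illustration.
    """
    if clicked_thumbnail and available_kbps >= min_kbps_for_full:
        return "full"
    return "thumbnail"


if __name__ == "__main__":
    print(choose_image_version(False, 1000.0))
    print(choose_image_version(True, 64.0))
    print(choose_image_version(True, 1000.0))
```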

With future advances, such a database can be updated and accessed live from the patrol in-country.

The next step, says Maeda, is to install it in Humvees and other military vehicles, allowing soldiers to download and act on new information in real time. Some of these vehicles already have some low-bandwidth connections, and Maeda says DARPA is working on ways to make the software work using these thin pipes.

It's not that any of this should sound unfamiliar. Google Maps mashups for sales data, tourists, and even MMORPG players are used in a similar manner for similar purposes. The significant thing is overcoming the challenges in an unstable, wartime environment where network performance is never a certainty.



Walking on AIR: Adobe's new "offline-online" app dev platform and what it means for network needs


by Brian Boyko
Editor, Network Performance Daily

The release of Adobe AIR today might just bring about major changes - both good and bad - for network performance. AIR is a way to produce Web apps that can be run as desktop apps. It is cross-platform and relies, like Java, on a just-in-time compiler and an interpreter of application bytecode. There are interpreters for Windows and OSX, and a Linux interpreter in development.

"It allows Web application developers - or just application developers - to use the Internet technologies they know, whether it's Flex and ActionScript to target the Flash part of AIR, or Javascript/HTML/CSS to target the AJAX part of AIR," said Phil Costa, director of product management at Adobe. "It allows them to take those applications and run them on the desktop."

Costa explained that with AIR, depending on what the application does and how it is coded, companies may, in theory, see less data crossing the network and improved network performance.

"Today a huge number of corporate networks are moving towards browser based applications, and one of the extra bandwidth requirements that it puts upon the network is that every time you access a [Web based] application, you need to download it. Whether that's HTML or Javascript, or all kinds of Flex and Flash content, that needs to be pulled over the network. Having the application installed locally avoids that. All that will be going forth is the actual data that you're trying to access."
"We've done tests with some of our customers where they've seen our bandwidth [usage] go down for Internet applications in general, because unlike a Web site, which creates both the content and the formatting of the content, most AIR apps are just passing the information back and forth instead of refreshing the page each time."
"Now, depending on what the application does, it may actually add [to] bandwidth requirements for the network as well. One of the things that applications do, is run in the background and connect permanently to a data source's real time streams, or frequently check for data. That could increase the bandwidth requirements. But that's more about what the application specifically does than anything specific about AIR."

AIR's capabilities allow for offline usage as well, which will likely prompt more demand for online apps as the major drawback of SaaS - inaccessibility - is mitigated.

"In addition to giving the developers and then end-user of the application the convenience of launching the [Web] application like any other desktop application," said Costa, "it gives them additional capabilities that they didn't have when they were targeting the browser, such as local storage, either in flat-files or structured storage like a SQL database, which is embedded in there, or drag-and-drop integration with the file system, and cut-and-paste as well as the ability to take data or content offline, and run it when they're on an airplane or just not connected to the network."
"The runtime provides a whole set of APIs for notifying the application when it is on and offline, and so the developer can implement behavior that accounts for that; in many cases what we see is that the developers are caching some of the information offline, so that if the user takes it offline, it will still be available."
"To give you an example… one of our customers, Anthropologie, built an online catalog that lets people browse through things they have, and they built an AIR version which lets customers make little notes to themselves about the product, and rather than store them on the Anthropologie Web site, it stores them locally. The customer can put notes on things the same way they put stickie notes on an actual physical catalog, and they don't have to share that information with the Web site, so it's private to them. It also means, from Anthropologie's standpoint, that they don't have to create massive databases to store that information."

Costa said that Adobe hopes that there will be AIR apps on mobile phones, something that there's no specific date on, but which is on the Adobe roadmap.
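The offline caching pattern Costa describes, where the app keeps a local copy of what it has fetched and falls back to it when the network goes away, looks roughly like the sketch below. It is written in Python as a language-neutral stand-in and does not use or imitate the AIR runtime's APIs; the class name, the `online` flag, and the `fetch` callback are all hypothetical.

```python
from typing import Callable, Dict


class CachedClient:
    """Serve fresh data when online and the last cached copy when offline."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch               # stand-in for a network request
        self._cache: Dict[str, str] = {}  # stand-in for local storage
        self.online = True                # the app flips this on connectivity changes

    def get(self, key: str) -> str:
        if self.online:
            value = self._fetch(key)
            self._cache[key] = value      # keep a copy for offline use
            return value
        if key in self._cache:
            return self._cache[key]
        raise LookupError(f"{key!r} is not cached and the client is offline")


if __name__ == "__main__":
    client = CachedClient(fetch=lambda key: f"catalog page for {key}")
    print(client.get("sweaters"))   # fetched over the "network" and cached
    client.online = False
    print(client.get("sweaters"))   # served from the local cache
```

Only the data crosses the network, and only while connected, which is the bandwidth argument from earlier in the post.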


