In this video (part one of two), Jim Metzler looks back at some predictions he made at the beginning of the year and how they have measured up against reality, in a retrospective interview with Jordan Weiss.
Data Center Archives
When Savvis promises “proximity hosting,” they mean it – according to this New York Times Magazine article. In Weehawken, New Jersey, right outside of the Lincoln Tunnel, there’s a data center that houses the Philadelphia Stock Exchange’s computers. (The PSE is now part of Nasdaq.) Firms compete to have their computers located close – physically and in the networking sense – to the trading exchanges in that data center. Milliseconds of latency are unacceptable in this environment.
“It used to be that things were done in seconds, then milliseconds,” Varghese Thomas, Savvis’s vice president of financial markets, told me. Intervening steps — going through a consolidated ticker vendor like Thomson Reuters— added 150 to 500 milliseconds to the time it takes for information to be exchanged. “These firms said, ‘I can eliminate that latency much further by connecting to the exchanges directly,’ ” Thomas explained. Firms initially linked from their own centers, but that added precious fractions of milliseconds. So they moved into the data center itself. “If you’re in the facility, you’re eliminating that wire.” The specter of infinitesimal delay is why, when the Philadelphia Stock Exchange, the nation’s oldest, upgraded its trading platform in 2006, it decided to locate the bulk of its trading engines 80 miles — and three milliseconds — from Philadelphia, and into NJ2 [in Weehawken, NJ], where, as Thomas notes, the time to communicate between servers is down to a millionth of a second. (Latency concerns are not limited to Wall Street; it is estimated that a 100-millisecond delay reduces Amazon’s sales by 1 percent.)
Back in March 2008, electronic trading made up 60-70 percent of the daily volume of the NYSE. (I’m sorry I don’t have more recent numbers, but they might have been artificially affected by the credit crisis anyway.) And when you remove human beings from trades, the only thing that matters is the speed of a sale: whichever seller’s computer connects first to the buyer makes the sale, and whichever buyer connects to the low-bidding seller first gets the bargain. Speed may not be everything, but it should not be underestimated – and it’s one of the reasons you need to identify immediately any problems with network performance in financial applications. Every second a problem doesn’t get fixed – even a problem that is imperceptible to the end user, like an added 3ms of delay – means more money is lost.
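To put the distances in perspective, here is a minimal sketch of the physical floor on propagation delay. It assumes light in optical fiber travels at roughly two-thirds of its speed in a vacuum (about 200,000 km/s) and that the cable runs in a straight line; the function names are just for illustration. Real paths are longer and pass through switching equipment, which is presumably why the quoted figure for 80 miles (three milliseconds) is higher than this lower bound.

```python
# Rough lower bound on propagation delay over optical fiber.
# Assumption: signal speed in fiber is about 200,000 km/s (two-thirds of c)
# and the path is a straight line with no equipment delay.
SPEED_IN_FIBER_KM_PER_S = 200_000.0
KM_PER_MILE = 1.609344


def one_way_delay_ms(distance_miles: float) -> float:
    """Return the minimum one-way propagation delay in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    return distance_km / SPEED_IN_FIBER_KM_PER_S * 1000.0


def round_trip_delay_ms(distance_miles: float) -> float:
    """Return the minimum round-trip propagation delay in milliseconds."""
    return 2.0 * one_way_delay_ms(distance_miles)


if __name__ == "__main__":
    # 0.01 miles is roughly "inside the same building".
    for miles in (0.01, 1, 80):
        print(
            f"{miles:>6} miles: one-way {one_way_delay_ms(miles):.5f} ms, "
            f"round trip {round_trip_delay_ms(miles):.5f} ms"
        )
```

Even under these generous assumptions, 80 miles costs more than half a millisecond each way, while a server in the same facility pays well under a microsecond.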
Now, if your company is over-leveraged and built on shaky investments, network performance won’t save you – we’ve seen a lot of companies with very good network infrastructures go downhill these past few months.
If you want to learn more about the topic of monitoring trading applications for performance, you might want to check out Alex Malone, Software Engineer Manager at NetQoS, who will be speaking at the Securities Industry and Financial Markets Association Technology Management Conference & Exhibit on June 23-25 in NYC. Alex is scheduled to speak June 24, at 2:35pm. You can also look us up at booth #1822.
Data Center Archives
Cisco announced Cisco EnergyWise at Networkers Barcelona: a software upgrade that allows users of Cisco Catalyst switches to control the energy consumption of pretty much anything that has an IP address. According to the promotional literature and video, you can limit power based on schedules, while allowing exceptions – for example, if a particular employee swipes an ID card to gain access to the building on a weekend, you can turn on the power only in his office until he signs out.
“Why does my toaster need an IP address?” is no longer a joke.
Rob Aldrich at Cisco’s datacenter blog wrote about it, and much of what he has to say is worth a look. If nothing else, the video is kind of cool.
One of the main benefits of data center consolidation and server virtualization is a decrease in the company’s overall power draw – and it’s been argued that IT will have to handle electrical facilities in order to limit power consumption without creating problems with network performance. If facilities and IT do not communicate, you could end up with a real disaster: in a worst-case scenario, someone trying to be too eco-friendly turns off the air conditioners in the server room over a long weekend.
While energy management via IP may seem like yet another thing – in addition to security, VoIP, and of course, network performance – that the network engineer will have to manage, the cost savings can be substantial over time. The trick is that energy management only occurs on things that have an IP address. Since it’s a pain in the butt to try to subnet every single light bulb, this may be an incentive for IPv6 adoption.
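As a rough illustration of the address-space point, the sketch below uses Python's standard `ipaddress` module to compare a typical IPv4 office subnet with a single IPv6 subnet. The two prefixes are illustrative choices (a private range and the documentation range), not anything from the EnergyWise material.

```python
import ipaddress

# A common IPv4 office subnet versus one standard-size IPv6 subnet.
# Both prefixes are illustrative examples only.
ipv4_subnet = ipaddress.ip_network("192.168.1.0/24")
ipv6_subnet = ipaddress.ip_network("2001:db8::/64")

print(f"IPv4 /24 addresses: {ipv4_subnet.num_addresses}")
print(f"IPv6 /64 addresses: {ipv6_subnet.num_addresses}")
```

With 2^64 addresses in one IPv6 subnet, giving every light bulb its own address stops being a subnetting exercise.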
Data Center Archives
This morning we announced that NetQoS Unified Communication Monitor works with Microsoft Office Communications Server 2007 Release 2 [OCS 07 R2]. While it’s easy to get into the small details of how unified communications applications place great demands on the network, and how to handle those demands, I found myself pausing for a moment.
Where, exactly, are we going with this?
And by “we,” I don’t just mean NetQoS as a company but I mean – Us. The big Us. The human condition.
That is, unified communications applications do place more demands on the network than any type of data that’s come before. So why, then, do we do it in the first place?
It’s because treating voice and video as data to be sent over the network allows us to do more with the communication-as-data than we could with the analog alternatives, even if this makes the network as a whole slightly less effective due to congestion – or, if you have good performance monitoring information, perhaps no less effective but more complex. (We try to simplify the complexity as best we can, by using metrics directly from OCS 07 R2, but the necessities of a mixed communications and data network are simply more complex than the needs of a pure data network alone.)
Everything is becoming binary data, and this process is not likely to stop. To those of earlier generations, the New York Times is a newspaper; to many of the young, the New York Times is a news Web page with video and audio content, with text and images being just some of the many offerings possible. A back-of-the-envelope calculation shows that it would be cheaper, over the course of a year’s subscription, to send every New York Times subscriber a free Kindle E-book reader – at retail prices, no less – and send the newspaper to them digitally, than it costs to print out and deliver all those physical papers to the subscribers.
It doesn’t take long to realize three things: technology is getting cheaper, paper and distribution are getting more expensive, and the market of people who read the New York Times but will only do so if they have a physical piece of paper is dying out. The Grey Lady will become data, or there will be no Grey Lady.
Every advantage of the physical paper – portability, permanency, and simplicity – is being lost as technology becomes more portable, more permanent, and simpler. Our standards for digital technology to replace the more traditional equivalent are relatively low. Assuming it takes two and a half seconds to locate an article by flipping through the pages, any system that can serve up a Web page in less than 2500ms is an improvement. Even the sheer scale involved – that you would measure the time it would take to find an article in milliseconds rather than seconds – implies an entire quantum leap from the old way of doing things.
Those of us who work closely with the Web – bloggers, Web designers, media professionals – are aware of CSS, which removes content from layout, and RSS, which removes content from context. How far can we be from a society in which all content is completely removed from any sort of context or layout? A society where everything is abstracted? Where you could download the model of a basketball and print it out on a 3D printer? Or even, if you so chose, have the New York Times printed daily on a basketball…
But in that world, where everything is data, network performance suddenly becomes one of the most important things in the world. The bottlenecks once caused by the unfortunate limitations of pure physics suddenly give way to a single bottleneck – that of network performance.
Digitization is an awesome and powerful force… and while it has been mostly beneficial, I think that too often we do not recognize the power of this inexorable tide – this benevolent but gargantuan inevitability.
I don’t know if I’m ready for that world. I’m not sure I want my news to bounce.
Data Center Archives
Ken Church, Albert Greenberg and James Hamilton of Microsoft recently put out a paper on “Delivering Embarrassingly Distributed Cloud Services.” [PDF] Like most papers of this type, it’s a dry read, but informative. It looks at the tradeoff between mega-data center size and micro-data center diversity from the viewpoints of both total cost of ownership and performance.
The most important line in the entire report, of course, is “The trade-offs vary by application.” However, they make the argument that applications with little need for server-to-server communications will show benefits in cost, scale, reliability and performance through geo-diversification – in other words, lots of little datacenters as opposed to one big datacenter.
This seems to fly in the face of the trend toward consolidation, but there is a point to it: any data center needs redundancy, but one centralized data center needs proportionally more of it than multiple small data centers do. As Church, Greenberg, and Hamilton put it, “the more geo-diversity, the better. N+1 redundancy becomes more attractive for large N.”
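The arithmetic behind that quote can be sketched in a few lines. This is only an illustration under simple assumptions of my own, not the paper's model: every unit (a whole data center, say) has the same capacity, and you provision exactly one spare to survive a single failure.

```python
def capacity_overhead(n: int) -> float:
    """Fraction of extra capacity that N+1 redundancy adds on top of N units.

    Assumes all units are the same size and one spare covers any single failure.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / n


for n in (1, 2, 5, 10, 50):
    print(f"N={n:>2}: {n + 1} units provisioned, {capacity_overhead(n):.0%} overhead")
```

One big data center needs a full second copy to be protected, while fifty small ones need only one extra, which is the sense in which N+1 looks better as N grows.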
The part that really interested me, though, was the networking section. (Section 3, in case you want to skip right to it.) Church, Greenberg, and Hamilton point out that in a large, centralized datacenter, you can have end-to-end control and assure a particular level of performance through supported service level agreements. On the other hand, they argue:
“[with distributed data centers] the cloud service provider has ceded control of quality to its Internet access providers, and so cannot support (or even fully monitor) SLAs on flows that cross out multiple provider networks, as the bulk of the traffic will do. However, by artfully exploiting the diversity in choice of network providers and using performance sensitive global load balancing techniques, performance may not appreciably suffer. Moreover, by exploiting geo-diversity in design, there may be attendant gains in reducing latency…”
“Many large analysis applications are best run centrally in mega data centers… Interactive applications are best run near users… [they] can be delivered with better QoS (e.g., smaller TCP round trip times…) via micro data centers.”
The argument’s sound, especially when you consider that interactive applications are probably the most latency sensitive because they need to make multiple trips to and from the client and server with every interaction.
But reducing the propagation delay (or distance delay) is merely one part of the performance equation. By ceding control over router performance and transmission, you have no way of diagnosing network round trip time problems if they occur, and wouldn’t be able to fix them – short of the messy step of changing service providers – even if you did. If something goes wrong, it could negate the speed increases gained by diversifying servers, so moving to this model is more of a gamble than a guarantee of improvement. Granted, it’s a gamble that might make sense for some apps and some organizations – some apps, apparently, can get away with less than 100% uptime.
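To make the round-trip argument concrete, here is a small sketch with entirely hypothetical numbers: a chatty interactive application that needs 20 round trips per user action, served from a distant mega data center, a nearby micro data center, and that same nearby site when a provider problem inflates the round trip time. The function name and all figures are assumptions for illustration.

```python
def interaction_time_ms(round_trips: int, rtt_ms: float, server_time_ms: float = 0.0) -> float:
    """Total time for one user interaction: network round trips plus server work."""
    return round_trips * rtt_ms + server_time_ms


# Hypothetical scenarios, 20 round trips per interaction in each case.
scenarios = {
    "central mega data center (80 ms RTT)": 80.0,
    "nearby micro data center (10 ms RTT)": 10.0,
    "nearby site, degraded provider path (120 ms RTT)": 120.0,
}

for label, rtt in scenarios.items():
    print(f"{label}: {interaction_time_ms(20, rtt):.0f} ms")
```

Because the round trip time is multiplied by the number of trips, a nearby site wins easily when the network behaves, and loses just as easily when a provider you cannot monitor or control misbehaves.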
Data Center Archives
Cisco has put up a new video in their “Seminar and Webcast Series” talking about “Energy Efficiency in the Data Center.” It may be produced by Cisco but the key points are pretty much vendor-neutral – starting with the idea that “Green” computing is a political/PR buzzword, and the way enterprises should look at the problem is one of efficiency and of sustainability.
Data center power consumption has more than doubled since 2001; the worry is that the trend will continue on an exponential pattern. This power consumption mainly comes from cooling the servers, rather than powering the servers; and with each 1U server (running 24/7/365) requiring the same amount of energy per year as it would take a Toyota Camry to drive 15,000 miles, energy efficiency is crucial.
Part of the solution is to buy more efficient components that cost more up front but pay money back. Another part of the solution is virtualizing servers, consolidating servers, and decommissioning servers.
They also mentioned using provided utilities to step down the voltage if the server was underutilized – a trick laptop owners have been using to get more life out of their batteries on the road. Same concept – if you don’t need all the power, consume less of it.
As far as the network goes, data center consolidation brought on by advances in WAN optimization is a big step towards reducing utility costs. Another step is taking advantage of the movement towards putting tools in the network infrastructure itself rather than as separate appliances – for example, putting SuperAgent network monitoring software (shameless plug) into Cisco’s WAAS.
These are all common-sense solutions and probably not the first time you’ve heard them. But the key point of the video-seminar was that just as we keep harping on the fact that you need to baseline your network performance to ensure that the changes you make to your network are having the desired effect, you also need to baseline your power costs as you make improvements.
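A power baseline does not need to be elaborate. The sketch below estimates a yearly electricity cost from an average wattage, then compares a before and an after measurement. Every number in it is hypothetical (the wattages, the electricity price, and an overhead factor of 2.0 standing in for cooling and other facility load), and the function name is mine, not anything from the Cisco seminar.

```python
HOURS_PER_YEAR = 24 * 365


def annual_energy_cost(avg_watts: float, overhead_factor: float, price_per_kwh: float) -> float:
    """Estimate the yearly electricity cost for one device.

    overhead_factor scales the device's own draw to include cooling and other
    facility overhead (2.0 means one extra watt for every watt of IT load).
    """
    kwh = avg_watts * HOURS_PER_YEAR / 1000.0
    return kwh * overhead_factor * price_per_kwh


# Hypothetical baseline and post-change measurements for a single server.
baseline = annual_energy_cost(avg_watts=300, overhead_factor=2.0, price_per_kwh=0.10)
after = annual_energy_cost(avg_watts=240, overhead_factor=2.0, price_per_kwh=0.10)

print(f"baseline ${baseline:.2f}/yr, after ${after:.2f}/yr, saved ${baseline - after:.2f}/yr")
```

The point is the comparison rather than the absolute figures: without the first number written down, you have no way to show that the second one is an improvement.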
Data Center Archives
Over the past few months, there’s been some discussion on Slashdot regarding MySQL, after MySQL (the company) was bought out by Sun Microsystems. MySQL (the company) has announced that it will be developing some proprietary add-ons to the backup capabilities of MySQL (the database), which will be available only to MySQL’s (the company’s) customers of MySQL (the database) enterprise edition, and not to users of MySQL (the database) community edition.
This has been blown a bit out of proportion. (The Slashdot headline, “Sun may begin close-sourcing MySQL,” was misleading at best.) We e-mailed Steve Curry at MySQL (the company) and he pointed us to some information clearing up the situation.
· Anything that has been released as open-source under GPL continues to be released as open-source under GPL. Sun and MySQL (the company) are not going to start “closing” the open-source MySQL (the database), and it seems unlikely that they would be able to legally do so even if they wished to.
· Improved backup capabilities are being planned in MySQL (the database) 6.0 for both the open-source community version and the enterprise version, which combines open source with proprietary add-ons.
· Proprietary add-ons are being added to the Enterprise version of MySQL (the database). These add-ons are not core-critical; they are essentially added value for paying customers, such as compression, encryption, and specific native drivers – things that a particular business might need but which aren’t critical to the core functioning of MySQL (the database).
· The decision to do so was made before MySQL (the company) was acquired by Sun Microsystems. If anything, Sun has been very open-source friendly, with StarOffice forming the basis of OpenOffice.org, and Solaris and Java both open-source now.
· There is nothing preventing people from forking the MySQL (the database) source code and producing open-source versions of the proprietary capabilities.
The use of proprietary add-ons to an open-source system isn’t even all that rare. Click N’ Run for Linux systems adds proprietary software to the open-source Linux; Mac OS X is built on Darwin, an open-source, BSD-derived operating system.
We also note the irony of a number of proprietary Web applications running off of LAMP stacks, where the L, the A, the M (the DB) and the P are all “free software.”
There are a number of proprietary Web applications running with MySQL (the database) – and a move to “close source” MySQL (the database) would have messed with the business models of many companies – including NetQoS. NetQoS uses MySQL (the database) Enterprise edition in our network monitoring and reporting products and we’re customers of MySQL (the company). So we’re glad this whole thing is a tempest in a teapot.
I tried to think of a prominent case where someone successfully “closed the source” of a flagship product after it was open-sourced - but couldn't until I went much, much farther afield. There is a company “closing the source” on its major flagship product.
That company is Wizards of the Coast, a subsidiary of Hasbro. And the flagship product is “Dungeons and Dragons.”
Wizards (the company) makes Dungeons and Dragons, a role-playing, computer-less tabletop game where you play knights, elves, and powerful wizards (the characters) – a game that has a history of being very attractive to the technology-oriented crowd because of our love of math and power fantasies.
What makes Dungeons and Dragons particularly interesting is that a while back, Wizards (the company) released an “Open Gaming License” (OGL) which allowed third parties to develop additional content for Dungeons and Dragons, and, in fact, create entirely new games in different settings and genres using the rules established in Dungeons and Dragons 3rd edition. If you were a third-party company, you could publish supplements to provide traps, monsters, or new spells for wizards (the characters) to cast. And many did.
This had numerous benefits all around. Players needed to learn only one system, and they had tons of D&D supplements to choose from. Game companies found an audience in D&D players that they might not otherwise have had. Wizards (the company) found a sea of “developers” for their system, which made ownership of D&D’s “core books” more valuable. And while it may not have resulted in a rebirth of the roleplaying game industry, it sure propped it up for a little while longer.
Because game players only had to learn one set of rules to play, the roleplaying game industry standardized quite a bit and the system used in Dungeons and Dragons (known as “d20”) became quite widely used, dominating the RPG field for a time.
D&D “version 4.0” will soon be released, and many game beta testers believe the system has been radically overhauled and improved. However, this new system will not be released under the OGL. It will, however, be released under the “Dungeons and Dragons 4th Edition Game System License” (GSL).
The GSL has not yet been made public, but there are rumors, speculations, and concerns – fueled by online posts made by the brand manager and licensing manager for Dungeons and Dragons, and relayed by the lead writer of third-party publisher Necromancer Games – that the GSL will contain a “poison pill” clause: in order to use the GSL, a game company must not publish anything under the OGL.
This would be like Microsoft saying that developers for Windows Vista are forbidden from publishing anything under the GNU General Public License. The upshot is that developers would have to choose between not developing games with the improved system and destroying their back-catalogs.
Even if you don’t have a huge interest in D&D – in which case, I envy your normal social adjustment and relatively less awkward adolescence – it pays to keep up with this developing situation to see how a fight to close an open-source software product might actually go down. Will Hasbro fail in its efforts to dominate the RPG industry, either shrinking their portion of market share or shrinking the size of the entire market? Or will Hasbro succeed with this business plan, and the publishers of Monopoly (the game) end up with a de facto monopoly (the economic term) on this niche industry?
Update: On May 2, 2008, a week after this article's publication, Wizards of the Coast released an FAQ about the 4th edition licensing terms. Whether this FAQ was changed over the past week while WoTC remained silent or whether this was WoTC policy from the beginning is anybody's guess. The FAQ states:
Q. Can companies still produce 3.x products under the OGL?
A. Yes, but we anticipate that interest in the 4e GSLs will be greater.
Q. Can publishers release new products under both the OGL and 4E GSL?
A. No. Each new product will be either OGL or 4E GSL. If a new product is published under the 4e GSL, it cannot also be published as 3.x product under the OGL; and vice versa.
Q. I have multiple product lines. If I update one product line to 4th Edition, do they all have to be updated?
A. No. Publishers are able to choose on a product line by product line basis which license will work best.
Q. Will there be a different license for other lines, such as d20 Modern?
A. The d20 GSL will allow for other genres of roleplaying games.
Q. Why are there two different licenses?
A. The D&D 4e GSL is specific to the Dungeons & Dragons brand. The d20 GSL allows for non-fantasy genres. Both licenses tie to the 4th edition rule set.
Q. Do I have to give up my right to publish 3.5 OGL products in order to publish 4e compatible products?
A. No. Publishers are free to print product lines under either the OGL or 4E GSL. We would love to see our industry colleagues convert their entire product offerings to 4E, as we are doing, but we do not expect or require entire companies to convert to the new edition.
Q. Can publishers update their previous publications from older editions to the D&D 4th Edition rules?
A. Yes. Publishers participating in the Dungeons & Dragons 4th Edition GSL will be allowed, and encouraged, to convert their publications from earlier editions to the 4th Edition rules.
Data Center Archives
In a few minutes, Jim Metzler of Ashton, Metzler, and Associates, will be delivering his keynote on the Next Generation NOC at NetQoS Symposium 2008 at Barton Creek Resort in Austin. Last week, we pre-recorded a podcast with Dr. Metzler regarding the speech he is about to give and what he means by a "next generation NOC."
He talks about the changing role of the NOC and moves in enterprises towards integrating what were once separate stovepipe functions to focus on application delivery.
The podcast is below.
Data Center Archives
CIA Plan #328 to remove Castro from power in Cuba has succeeded. This plan, also known as "wait until El Presidente gets old and retires," now suggests that there will be vast geopolitical changes.
However, while Castro may be retiring, the communist regime he's supported for nearly half a century isn't going away, and whoever wins the elections in Cuba, the policies vis-à-vis the U.S. probably won't change that much until and unless the U.S. foreign policy towards Cuba drastically changes after the next U.S. election.
Which is a shame, really, because when old hardware (I have no problem calling Castro "old hardware") becomes more troublesome to maintain than to replace, it's usually a good time to re-evaluate the way that your company, or, in some cases, your tropical island nation, does business.
Now, whether Castro is retiring because of his natural health, or because the CIA slipped him a cigar with a really, really slow acting poison back in 1963, it doesn't matter. The point is that we are so used to the way things are, we often don't pause to consider what could be. We don't understand the true cost of maintaining legacy hardware versus trying the new, and don't give much thought to better alternatives if "things are working well enough."
IT professionals have been thwarted many times by using legacy management tools that are outmoded, increasingly and annoyingly cumbersome to maintain and administer, or no longer deliver the value for which they were acquired. Beyond the obvious maintenance and license costs, there are also the opportunity costs related to speed, quality, scalability and efficiency. When you combine high maintenance costs with opportunity costs, the financial penalty for the status quo begins to add up.
Now, no one should advocate tossing out a product or technology just because it's a little long in the tooth. In fact, pursuing long, huge, revolutionary solutions when small fixes would be the most efficient solution can be a serious problem that takes time and energy away from IT. But there is nothing wrong with taking time to assess the situation and ask whether we're better served by what is or what could be.
Data Center Archives
The Internet Corporation for Assigned Names and Numbers (ICANN) recently put out a press release which announced that six of the 13 root servers in the root zone (presumably located in between the Phantom Zone and the Forbidden Zone) now had IPv6 addresses.
It's a small step, but one which is necessary for bigger steps to follow. Giving the root name servers IPv6 addresses paves the way for a full IPv6 end-to-end transmission path for data. As long as the name servers relied on IPv4, at least some form of IPv4 transport remained necessary for Internet transmission.
We were able to ask a few questions about the IPv6 assignment to ICANN and got answers from David Conrad, Vice President of Research and IANA Strategy.
NPD: Why only six of the 13 root servers?
Conrad: 6 of the 13 were ready at this time. Each of the root servers is run independently and are funded through internal means (that is, in general, no one is paying any of the root server operators to operate a root server directly). The 6 that requested ICANN add IPv6 records were the ones that had finished setting up their IPv6 infrastructure sufficiently to provide service.
NPD: Could you explain a bit about the 512 byte limit on the packet sizes?
Conrad: The original specification of the DNS protocol chose 512 bytes as a reasonable approximation of the largest packet that could get through the Internet (of the time, circa 1983) without being fragmented. Enhancements to the DNS protocol since then have allowed for an increase in that limit (specifically, requesters can indicate how large a packet they're willing to accept).
NPD: How will computers get the new info about the root name servers' new IP addresses?
Conrad: The only computers that will actually need the new information are DNS caching servers. When a caching server starts up, it asks one of the 13 root servers it has pre-configured (the root hints) for an up-to-date list of all the root servers. It then uses that new list.
DNS caching servers are typically operated by ISPs or the IT departments of large enterprises. Average PCs and workstations send their DNS queries to these caching servers.
NPD: Would this require operating system upgrades/patches?
Conrad: A patch will likely be supplied to make the change in the root hints permanent (the updated list obtained by the caching servers isn't generally written to disk), but as described previously, caching servers will be able to use the new addresses without the patch.
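For the curious, here is a minimal sketch of what "indicating how large a packet you're willing to accept" can look like on the wire. I am assuming the enhancement Conrad refers to is EDNS0, in which the requester appends an OPT pseudo-record whose class field carries the UDP payload size it will accept. The function name and the 4096-byte default are my own choices for illustration, and the sketch only builds the query bytes; it does not send anything.

```python
import struct


def build_root_ns_query(query_id: int, udp_payload_size: int = 4096) -> bytes:
    """Build a DNS query for the root zone's NS records with an EDNS0 OPT record."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT record below).
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 1)
    # Question: root name (a single zero byte), QTYPE=NS (2), QCLASS=IN (1).
    question = b"\x00" + struct.pack("!HH", 2, 1)
    # OPT pseudo-record: root name, TYPE=41, CLASS holds the UDP payload size,
    # TTL holds extended RCODE/version/flags (all zero here), RDLENGTH=0.
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0, 0)
    return header + question + opt


packet = build_root_ns_query(0x1234)
print(f"{len(packet)} bytes: {packet.hex()}")
```

The query itself stays tiny; what changes is that a server seeing the OPT record is allowed to send back a response larger than the original 512-byte limit, which matters once the list of root servers carries IPv6 addresses as well as IPv4 ones.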
Is this a significant move towards standardizing Internet traffic on IPv6? Tell us your thoughts in our comments section below.
