Commentary Archives

Cyxymu


Yesterday, denial-of-service attacks caused outages at three of the biggest social networking Web sites: Twitter, Facebook, and LiveJournal.

What could be the purpose of such a thing?  Actually, it was a concentrated effort to silence a particular individual, a Georgian (the country) blogger who goes by the name of “Cyxymu,” an economics professor known for his criticisms of Russian conduct during the war in South Ossetia. 

Cyxymu himself blamed the attack on the Russian government, according to an interview he gave to British newspaper The Guardian.


He added: "An attack on such a scale that affected three worldwide services with numerous servers could only be organised by someone with huge resources."


If that seems implausible, Max Kelly, Facebook’s chief security officer, confirmed that the major DoS was targeted primarily at Cyxymu.


Max Kelly, Facebook's chief security officer, confirmed yesterday that the attack that disrupted the Twitter site and caused problems for Facebook and LiveJournal was aimed at Cyxymu. "It was a simultaneous attack across a number of properties targeting him to keep his voice from being heard," he said.


Talk about a backfire, however.  Now everyone’s talking about Cyxymu, and people who hadn’t heard of him before are talking about his blog.  He’s become the next Salam Pax.

I’m not sure what this teaches us about network performance.  Except maybe that we have always lived in a world where butterflies’ wings have brought hurricanes; it’s just that, with everything around the world connected, there are more butterflies and more hurricanes.  That’s the funny thing about globalization and the disappearance of regionalism due to the Internet: regional problems become worldwide ones.

And not to end this on a sour note, but… let’s say it’s true.  Let’s say that Russia decided to take down some of the most important parts of the Internet in order to silence one man.  Maybe it’s time we started realizing that an assault on freedom to communicate anywhere is an assault on freedom to communicate everywhere.

Though with the human race being the way it is, I doubt it. 



Risky Business


Bruce Schneier, if you don’t know him, is one of the Web’s foremost experts on security. I don’t just mean computer security, though he focuses on that – but security overall, including anti-terrorism and crime security. I read his blog often, because even though I’m not a security geek, his writings are very insightful.

Schneier often talks about how human beings can misunderstand the ideas behind risk.

For example, there’s the oft-cited example that statistically it’s safer to have a gun in the home than a pool. (How the gun got into the pool, I’ll never know!)

That is, people are more willing to put up with pools than guns, because you get more enjoyment out of a pool than a gun, and because a gun is designed to be dangerous (if you’re standing at the wrong end of it), while the danger of a pool is an incidental side effect of water and concrete. Yet proportionally, more people die in homes with pools than in homes with guns. So when we evaluate risk, most of us instinctively, and wrongly, think the gun is “riskier” than the pool.

But other than those “freakonomics” type cases, Schneier points out in his latest post that for the most part, human beings do understand risk, and that there is a certain level of risk that we’re comfortable with – indeed, there’s even a certain amount of risk that we crave.

So when he was at a security conference where the speaker made the familiar complaint that end users at a company don’t understand security and don’t grasp its importance, Schneier suggested that perhaps the security researcher didn’t understand the importance of the end users getting their jobs done.


They know what the real risks are at work, and that they all revolve around not getting the job done. Those risks are real and tangible, and employees feel them all the time. The risks of not following security procedures are much less real. Maybe the employee will get caught, but probably not. And even if he does get caught, the penalties aren't serious.

Given this accurate risk analysis, any rational employee will regularly circumvent security to get his or her job done. That's what the company rewards, and that's what the company actually wants.


It’s the old argument about the balance between security and performance – that is, that security is there to prevent loss, and everything else in the company is designed around making necessary gains.

We’ve seen where security procedures have severely degraded application performance, and we’ve seen overreactions become worse than the problems they are designed to solve.

Schneier made this suggestion to the conference presenter:


"Fire someone who breaks security procedure, quickly and publicly," I suggested to the presenter. "That'll increase security awareness faster than any of your posters or lectures or newsletters." If the risks are real, people will get it.


So, in effect, Schneier suggests increasing the consequences of risky security behavior – in other words, increasing the personal risk to employees’ livelihoods. In this case, however, I disagree with him – a public firing of the next employee to write down his password on a post-it-note because he can’t remember which combination of random lowercase, uppercase, numeric and punctuation characters is the active one this month... has risks of its own.

The question is: does the company place as much importance on security as it does on productivity? Is it more important to be secure than to be effective?

In some industries, such as banking, the military, law firms, and hospitals, this may be the case; but for most businesses, such draconian policies make for an unpleasant work environment, and “degrade application performance” in the worst way possible – by degrading the employees.

What’s more, in a highly competitive company, these draconian security measures can be subverted to serve malicious goals – like an auto-immune disease. If you fire someone for putting a post-it-note with the department password on their monitor, how long is it before professional rivals will plant post-it-notes on other people’s computers in order to get competitors for promotions fired? This belongs in the world of David Mamet plays, not in the corporate workplace.

Instead, maybe it’s more important to make sure that the end-user has to understand as little about security as possible, and to be proactive about stopping attacks way before they even reach the end-user.

Because if I heard that someone lost their job because they couldn’t remember “Nei#oEVwi3” and had to write it down… I’d be looking for a new job. And I wouldn’t feel too guilty about using company time to spruce up the resume.



Google Acquires On2 Technologies for $106.5M in Stock Deal


Dow Jones Newswires reports that Google will acquire On2 Technologies, a company that makes video compression technology, for $106.5M worth of stock, presumably for the video site YouTube.

It’s an awfully big investment (a hefty 6.7x multiple of On2’s trailing twelve months (TTM) revenue, one of the highest multiples in tech over the last 18 months) for a site which is perpetually the butt of jokes about not being able to turn a profit. But there are a number of reasons it might be a smart move.

At its core, Google has always been about using the power of computing to make information searchable and organized.  Video has a major limitation – unlike text, you cannot search by keyword, only by ‘tags’ – self-reported information – or by context, in this case, “links.” If 12 people link to a video with the word “Tango,” for example, then chances are the video is about tango in some shape or form.

But Google is pretty good about finding ways around these limitations.  Google 411 was a free service that had a secondary function – it allowed Google to improve its voice recognition algorithms to the point where it could offer Google Voice.  And if it can offer Google Voice, which automatically transcribes audio voicemail messages into searchable text, it’s not that much of a leap to transcribe the audio track of uploaded video into searchable text.  That makes video more attractive to advertisers.

Where On2 fits into this is that On2 offers a video codec, called VP6, which is compatible with Flash video and provides roughly the same quality as the current standard, H.264, at the same bitrates (filesizes).  However, the processing power needed to decode (play) the VP6 codec is significantly less than the processing power needed to decode the H.264 codec. 

Obviously, this is an advantage for Google, which is producing its own “Google OS” for use with low-powered netbooks.  Plus, there are an awful lot of slow computers out there that are still in use.

But less obviously – and this is a guess – because VP6 takes less processing power to decode, complex computations – like trying to do voice recognition – can be done faster when decoding thousands of VP6 files at once, compared to thousands of H.264 files at once.  Even if the difference is on the order of microseconds per video, when you’re talking about the millions of videos on YouTube, those little microseconds add up quickly.

Perhaps Google is losing money on YouTube, but that may be because they're essentially creating a new application and trying to get the best performance out of it before trying to market it. Increases in application performance can often offset hardware costs, power requirements, or bandwidth needs.



Brownouts Vs. Blackouts.


NetworksFirst.com has recently created an online “Impact of Network Downtime” calculator, which you can use to estimate how much money it would cost if your network went down.  It makes a compelling case for fault management and worrying about outages. 


However, the cost of poor application performance is harder to quantify – or at least, requires more sophisticated tools and data – than the cost of a fault.  That may be the reason that many companies still consider fault management, and not performance management, to be the core responsibility of the IT team. Our most recent research conducted with Ashton, Metzler & Associates bears this out:


Fifty percent of respondents indicated that they measure and report on the mean time to repair (MTTR) for a network or application outage. However, only thirty percent confirmed they actually measure and report on the MTTR for degraded application performance, revealing a continuing legacy of fault and availability management over performance management.


As technology has improved, fault problems have, for the most part, been solved.  It’s no longer a distinguishing feature for a network service provider to promise 99.999% uptime.  The next big challenge is maintaining good performance throughout the network.

But in many ways, it’s a hard sell, because unlike a fault cost calculator, it’s difficult to show you exactly why you need performance management tools until you have the more nuanced calculation of what poor performance costs your business. What’s the difference in employee productivity when an application is 10% slower, 50% slower?

These types of metrics have typically been calculated for customer facing applications like Web retailers, but getting the data for internal IT users has been far less popular since it’s considered a soft cost in some arenas. But it really starts to add up if you pay attention.

One NetQoS customer said their typical critical business application “brownout” (before deploying NetQoS products) cost them $6000 per hour and they had about 20 of these per year, each taking about six hours to isolate and resolve. That’s $720k gone per year due to poor application performance ($6k * 6 hours * 20 events = $720k/year). True, a brownout costs less per hour than most estimates you see for out-and-out downtime, but brownouts occur a lot more frequently.
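For anyone who wants to run the same arithmetic on their own numbers, here is a minimal sketch. The brownout figures are the ones the customer quoted above; the function names and the $50,000-per-hour outage cost are made up purely for illustration and do not come from the customer or from any NetQoS tool.

```python
def annual_brownout_cost(cost_per_hour, hours_per_event, events_per_year):
    """Yearly cost, in dollars, of degraded-performance events."""
    return cost_per_hour * hours_per_event * events_per_year


def equivalent_outage_hours(annual_cost, outage_cost_per_hour):
    """How many hours of full downtime would cost the same amount."""
    return annual_cost / outage_cost_per_hour


# Figures quoted by the customer: $6k/hour, 6 hours per event, 20 events/year.
brownout_total = annual_brownout_cost(6_000, 6, 20)
print(f"Brownouts cost ${brownout_total:,} per year")

# ASSUMPTION: a hypothetical outage cost, chosen only to show the comparison.
HYPOTHETICAL_OUTAGE_COST_PER_HOUR = 50_000
hours = equivalent_outage_hours(brownout_total, HYPOTHETICAL_OUTAGE_COST_PER_HOUR)
print(f"Same cost as {hours:.1f} hours of total downtime per year")
```

Swapping in your own per-hour estimates is the whole exercise; the hard part, as noted, is establishing those estimates in the first place.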

It took some investigation and understanding on the customer end to establish the value of different applications and who was using them, and then to run the numbers. But now they have some idea of the cost of all of those shades of gray between up and down, which helps them justify their investments in technology and process improvements to reduce the brownouts as well as the blackouts.

This is why vendors, such as ourselves, are willing to come out and have a conversation and demo with your company.

But even so, consider this idea as an inaccurate but useful shorthand in the form of a Zen koan: If the network is so slow that nothing gets done, is it any different than if the network were down altogether?  And what is the difference between a network that is down for half a day and a network that takes twice as long to get anything done for a full day?
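The koan can be put into numbers with a rough sketch. It assumes, simplistically, that the amount of work done scales inversely with how slow the network is, and it uses a hypothetical eight-hour workday; the function name is invented for this example.

```python
import math


def lost_work_hours(duration_hours, slowdown_factor):
    """Hours of work lost while the network runs `slowdown_factor` times slower.

    A factor of 1 means normal speed; math.inf models a full outage.
    ASSUMPTION: work completed is inversely proportional to the slowdown.
    """
    return duration_hours * (1 - 1 / slowdown_factor)


# Hypothetical 8-hour workday.
down_half_day = lost_work_hours(duration_hours=4, slowdown_factor=math.inf)
slow_full_day = lost_work_hours(duration_hours=8, slowdown_factor=2)

print(f"Down for half a day:          {down_half_day} hours lost")
print(f"Twice as slow for a full day: {slow_full_day} hours lost")
```

Under that assumption both scenarios lose the same four hours of work, which is the point of the shorthand: only one of them shows up in an outage report.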

And if a computer goes down in the woods, but no one receives an error message, did it really have an error at all? And what is the sound of one router crashing?



Notes on John Chambers’ Interview


Recently, in the Herald Tribune, there was an interview with John Chambers, CEO of Cisco. One of the things about the interview that I found particularly interesting was that it seems that Chambers is really into collaboration technologies.

You would expect a networking hardware and software company to be into collaboration – the entire point of networks is to interconnect computers so that two or more computers can work together. But Chambers focuses more on Web 2.0-style collaboration, talking about video and blogs.

Today’s world requires a different leadership style — moving more into a collaboration and teamwork, including learning how to use Web 2.0 technologies. If you had told me I’d be video blogging and blogging, I would have said, no way. And yet our 20-somethings in the company really pushed me to use that more... By the second [video blog], I realized this was going to transform communications — not just for the C.E.O., but it would change how we do business.

From MediaNet to FlipCams to TelePresence, it seems – and I’m only guessing here – that Chambers is trying to take Cisco from a “networking company” to a “collaboration company” – much like Xerox tried to move from being the “copier company” to “the document company” in the late 1990s.

But it is also worth pointing out that collaboration tools are more bandwidth-heavy than they have been in recent years; Chambers choosing to video-blog rather than text-blog is one example. Telecommunications used to be about voice; now it’s about high-definition video. When you have a population of 20- and 30-somethings not afraid to use the technology, they’re going to push that technology to the limits. That’s a good thing, but it bears pointing out that you only want to push the technology to its limits, not over them. That is why network monitoring tools are so important: they tell you exactly what those limits are, and give you an idea of how you can start to overcome them.



Two thoughts on health and the economy:


I’d like to be skinny. And have a million dollars. G’night folks!

------------

I have just been informed that, even though it is Friday, I still need to put at least some effort into writing an intelligent blog post.  So, here goes.

Here’s my first thought about health and the economy: Obviously, there have been massive layoffs across the board, and IT has not been spared.  Over the past two years, not only have there been layoffs due to the general contractions (or as I refer to them, death-spasms) of the economy, but since 2006, there has been an increase in the number of internationally outsourced jobs by IT service vendors, according to Network World. 


Data prepared by Everest Group Inc., a research and outsourcing consulting firm, shows in broad brush fashion the shift of jobs overseas by some major IT services vendors. In 2006, U.S. and European firms typically had less than 20% of their workforces offshore; Now, for most companies that figure may well be generally over 30%.


At the same time, many laid-off workers are starting their own businesses.  Certainly not all of them, but when you need a job and no one is hiring, entrepreneurship and despair seem the only logical choices.


A quarterly survey of 3,000 job seekers conducted by Chicago-based outplacement firm of Challenger, Gray & Christmas, Inc., released Thursday, shows a near doubling in the year-to-year growth of job seekers turning to self-employment.


The problem with this is that because more people are becoming unemployed or self-employed, it creates additional political pressure on the health care debate.  The unemployed and self-employed have to rely on the private insurance market for healthcare, which is where the majority of healthcare horror stories exist, according to NYT opinion columnist and Nobel Laureate Paul Krugman. Krugman argues that government regulations require that employer contributions to health care “can’t discriminate based on pre-existing medical conditions or restrict benefits to highly paid employees,” and thus people don’t see the worst parts of private insurance until they’re either unemployed or self-employed.

Which brings me to my second thought on health and the economy: the “American Recovery and Reinvestment Act” (a.k.a. “the Stimulus”), which gives $19B for healthcare-related information technology but also requires that each American have an electronic health record in 5 years.  TechTarget argues that meeting that deadline simply isn’t that easy:

"The concern is that when you have these programs that are time limited … that the quality of those implementations could go down," said Chad Eckes, the chief information officer for Schaumburg, Ill.-based Cancer Treatment Centers of America (CTCA). "There can be spectacular failures of electronic health records, because folks didn't anticipate what might happen if it was unstable, and that can have disastrous consequences for patient care."

We’ve seen some of the difficulties with medical data networks before – with the importance of maintaining network performance when lives are literally on the line. 

There’s no question that electronic data records are faster than paper records, and (usually) more accurate… or at least, more accessible in a crisis, which is why there is a mandate.  The problem arises when there is poor network performance, in which case records are less accessible.  Moving the patient records from paper to data shifts the onus of providing that information from the administrative staff of a hospital or medical center to the IT department.



The Middle Ground


Deep Packet Inspection, infamously used by Comcast to forge reset packets to disrupt the BitTorrent protocol, by the NSA to spy, and by the government of Iran to identify protestors (pursuant to imprisoning and murdering some of them), is making a comeback in enterprises, according to Christopher Rhoads at the Wall Street Journal.


Out of 145 top-tier communication-services providers, 77% of respondents said they were either strongly or somewhat interested in DPI, according to results of a survey last year by Yankee Group and RCR Wireless News. Most said they wanted it to improve network security, according to the survey.


The concerns, as always, are with traffic prioritization and security.  For traffic prioritization, the obvious uses are placing streaming teleconferencing videos ahead of streaming YouTube videos of cats or wedding dances, and in the case of security, it mostly deals with being able to identify malware on the network, for example, by content, rather than by anomalous behavior. 

The point is that most network traffic monitoring solutions identify traffic by context: Flows, flags, and facts about your devices. DPI identifies traffic by content.  True, DPI gives you a lot of information, but it gives you far more information than you need, with uncomfortable privacy concerns. 

From a more pragmatic standpoint, by focusing efforts on content rather than context, network engineers and network management might end up spending too much of their time micromanaging the network.  That is, it should not be the priority of the network team to prevent non-critical traffic; it should be the priority of the network team to preserve critical traffic.  For most organizations, having a controlled network is not as important as having a network that meets the application performance needs of the business.

And somewhere in the middle of controlling every aspect of the network by content and not knowing or caring what goes on in the network is the middle ground of knowing how your network is being used. 



Microsoft and Yahoo. (Again.)


According to Yahoo Finance, which, you would imagine, might have an accurate take on such things, Microsoft and Yahoo have finally agreed to a partnership.  You will remember that Microsoft tried to purchase Yahoo outright last year, but the deal fell through.  Instead, Yahoo will now use Microsoft’s Bing search engine to power search, while Yahoo will handle the online advertising.

Why Yahoo decided to switch to Bing is unclear at this time, as Yahoo’s engine already serves nearly 20% of the market, compared to Microsoft’s 8.4% (and Google’s 65%).  I’m not prepared to speculate further than saying that Yahoo’s value isn’t really in the search engine, but in the SaaS solutions that are so ubiquitous that one barely thinks of them: Yahoo Mail, Yahoo Groups, Flickr, Del.icio.us, Yahoo Voice, and Upcoming.org.  Yahoo still has more overall users on the Web and more overall pageviews than Google.

Details are still sketchy, but the deal doesn’t seem to affect Yahoo’s SaaS offerings.  Perhaps that’s because Microsoft has gotten more aggressive on the online services front since they last tried to acquire Yahoo in February of last year, offering an ad-supported online version of Office.  Actually, that may explain the deal – Microsoft no longer needs to own Yahoo’s cloud software, but it would still benefit from Yahoo’s ad revenue model.

We’ve talked in general about the effects of cloud computing on application performance. (Long story short: Just because it’s on the cloud doesn’t mean you can forget about making sure apps perform well.) However, one has to consider, if Office goes ad-supported and is widely adopted, how much traffic will be used up serving those ads – especially if they’re large files, like those annoying Flash-based video ads that pop up.  I suppose we’ll find out more as time goes on – whether they’re inconsequential, or eroding network performance in a manner not unlike being nibbled to death by ducks.



The State of Network Management


We recently put together a report with Ashton, Metzler & Associates, trying to gauge the state of network management today. After our best efforts, we have learned a few things.

For example, the state of network management is not Ohio. That’s the Buckeye State.

After checking the 50 states of the U.S., the six states of Australia, and the 31 Estados of Mexico – even broadening our definition to include Canadian Provinces – we still couldn’t find the state of network management.

Then we thought about surveying more than 300 network engineering, operations, and management professionals about how IT organizations manage application performance.

Here’s what we found out:


  • 93 percent of respondents indicated their organization had either formally or informally identified a set of applications that are considered critical to the business. However, only 41 percent of those surveyed indicated that the company’s business managers were involved in identifying the critical applications.
  • 75 percent of respondents said identifying the company’s critical applications has led to at least a moderate change in the way they design, manage and troubleshoot the network infrastructure. The most common change cited was implementation or enhancement of quality of service (QoS) policies.
  • 80 percent of respondents reported that their IT organization has mapped the supporting network infrastructure components upon which key applications depend. These organizations are far more likely to focus their monitoring efforts either exclusively or primarily on these critical components than the non-critical ones.
  • Half of respondents indicated that they measure and report on the mean time to repair (MTTR) for a network or application outage. However, only 30 percent confirmed they actually measure and report on the MTTR for degraded application performance, revealing a continuing legacy of fault and availability management over performance management.

What this means is that we still have a long way to go – that many companies still look at networking problems from a perspective of fault, and not of performance, and that end-users are still likely to notice slow-performing applications before the IT organization.

On the other hand, the good news is that the report shows that IT professionals are focusing more on applications as part of the network, not as a separate discipline.

In “The Mandate for a New Age MOM” Dr. Metzler recommended specific goals IT organizations must meet to effectively manage the network for application performance:


  • Discover all applications that are on the network and identify the handful of them that are the most critical to the running of the business.
  • Baseline the performance and usage of the company’s primary IT resources - the most important business applications and the components of the IT infrastructure that support those applications.
  • Implement tools and processes that allow the IT organization to monitor the key performance metrics (e.g., response time, utilization) of the company’s primary IT resources, and allow the IT organization to quickly respond to a situation once it has impacted the end user.



AT&T confuses, infuriates 4chan.


Yesterday, TechCrunch and Slashdot, among others, reported that AT&T users were unable to access img.4chan.org, one of the subdomains hosting the infamous “b” board.

If you’re unfamiliar with 4chan, do not google it. I have not provided a link to the site in the blog, and that is for very good reason.  It is rather disgusting. 

Still, while crude, 4chan has had a profound influence on Web culture, and is one of the largest participatory Web sites out there – so large that Time.com did a profile on its founder, Moot, who was named Time Magazine’s Most Influential Person of the Year… after 4chan rallied enough followers to completely dominate the online poll rankings, so that the first letter of each of the top 21 people on the list spelled out a secret message.

Here’s the problem: AT&T blocked part of 4chan in order to stop a DDoS attack in its tracks last night.


AT&T made a statement to TechCrunch this morning, explaining exactly what happened.

Beginning Friday, an AT&T customer was impacted by a denial-of-service attack stemming from IP addresses connected to img.4chan.org. To prevent this attack from disrupting service for the impacted AT&T customer, and to prevent the attack from spreading to impact our other customers, AT&T temporarily blocked access to the IP addresses in question for our customers. This action was in no way related to the content at img.4chan.org; our focus was on protecting our customers from malicious traffic.

Overnight Sunday, after we determined the denial-of-service threat no longer existed, AT&T removed the block on the IP addresses in question. We will continue to monitor for denial-of-service activity and any malicious traffic to protect our customers.


However, none of the users of the site, nor its owner, understood why the site was blocked for AT&T users. (AT&T claims that they tried to contact Moot; Moot says he was never contacted.)  In the absence of solid information, a conspiracy theory popped up that AT&T decided to “censor” 4chan.  Within hours, 4chan denizens, known collectively as “Anonymous,” made plans to take on AT&T, much like they took on Scientology. With this morning’s disclosure, those plans look to be on hold, though individual 4chan users may still make decisions, like cancelling service, based on bad information.  It’s a misguided effort, of course, considering that the site is already back up, AT&T has explained their position, and there was no harm meant by the temporary blockage.

But the damage has been done.  That’s the problem with making networking changes without informing people – if you block a particular site, or make a major network change affecting tons of people, you owe it to your users to explain why you’ve made that decision.

A post by “anonimouse” on the Project AT&T web site sums it up:


Why is img.4chan.org blocked?
That is the question you should be asking. Without a why we don't have a reason to do anything. Now, we know this is not a mistake from the customer service convos but we don't know exactly why it is banned.
If it's about Net Neutrality, they have a war coming on.
If it's about the DDoS like the rumor says, we are getting out panties in a bunch for nothing.


If you’re messing with the Web experience anyway, wouldn’t it make sense to return a small HTML page explaining what the problem is and why the decision was made? In fact, the statement AT&T made to TechCrunch would have explained everything, if AT&T had disclosed the information to 4chan’s userbase instead of trying to communicate through the tech media after the fact.
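To show how small such a page could be, here is a rough sketch of a function that builds one. Nothing here reflects how AT&T actually handles blocks; the function name, the wording of the notice, and the status URL are all hypothetical, and actually getting the page in front of users (rather than silently dropping their traffic) is a separate engineering problem this sketch does not address.

```python
from html import escape


def build_block_notice(blocked_host, reason, status_url):
    """Return a small HTML page explaining why a host is temporarily blocked.

    All arguments are escaped so that free-form text cannot inject markup.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Access temporarily blocked</title></head>\n"
        "<body>\n"
        f"<h1>Access to {escape(blocked_host)} is temporarily blocked</h1>\n"
        f"<p>{escape(reason)}</p>\n"
        f'<p>Updates: <a href="{escape(status_url)}">{escape(status_url)}</a></p>\n'
        "</body></html>\n"
    )


# Hypothetical usage; the reason paraphrases AT&T's statement to TechCrunch.
page = build_block_notice(
    blocked_host="img.4chan.org",
    reason=(
        "We have temporarily blocked these IP addresses to protect a customer "
        "from a denial-of-service attack. This is not related to the content "
        "of the site, and the block will be removed once the threat has passed."
    ),
    status_url="https://status.example.net/",  # placeholder URL
)
print(page)
```

A dozen lines of HTML would have answered the “why” before the rumor mill did.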

When you don’t explain why you’ve made changes to the network, people will assume the worst about what you’re doing – in this case, that AT&T censored out of sheer spite.  We’ve seen this with Bilderberg.  We’ve seen this with the undersea internet cables accidentally cut by ships’ anchors.  Now we see this with AT&T. 

Because there wasn’t a little disclosure, a millions-large community of Internet users is now suddenly more aware of the net neutrality issue and likely to support regulation of companies like AT&T – or, in extreme cases, just interested in making life difficult for AT&T in general.  Either way – this is not good from AT&T’s perspective.


