Risk Management Archives

Tackling access management challenges in virtual, cloud environments


CA Technologies releases products designed to manage and secure virtual, cloud applications as well as deliver service assurance capabilities across physical, virtual and cloud environments.

By Denise Dubie

With every innovative new technology comes a mandate for advanced management and security tools to help IT organizations actually achieve the benefits promised by deploying, say, virtual servers or cloud applications.

CA makes ambitious moves to the cloud

This week CA Technologies made news with several high-tech reporting outlets for just that. The software maker Monday announced the general availability of five products within its CA Virtual portfolio: CA Virtual Assurance, CA Virtual Automation, CA Virtual Configuration, CA Virtual Assurance for Infrastructure Managers and CA Virtual Privilege Manager. The vendor also introduced its CA Virtual Foundation Suite, which CA Technologies states in a press release “combines select virtualization management products at a compelling price point.”




Visual Virtual


Brian Bakstran, VP of Product Marketing at our parent company, CA, recently blogged about a study from Network Instruments which talks about how 59% of IT organizations “lack the experience to manage virtualized environments effectively.”

Combine that with the idea that by 2012, 80% of all new servers will be virtual ones, and you start to get this sinking feeling that the entire IT industry knows where it’s going, but hasn’t really thought about what it needs to do once it gets there… sort of like sitting in the first four rows at Sea World, all excited to see Shamu, but forgetting to pack a poncho.

And so vendors like us and our parent company offer that visibility. (In the case of CA, for right now, we’re offering it in spades, with the NetQoS stuff [PDF] and the e-Health stuff and CA Virtualization Management.) 

The main concern that the lack of visibility presents to enterprise IT shops is the idea that mission critical applications that performed fine before virtualization may perform poorly when virtualized, and the IT shop will have no way of being proactive in finding performance problems, nor will they have the tools they need to quickly find the root cause of the problem. 

And visibility is necessary even before virtualization to compare performance to the non-virtualized baseline.  There are some applications that simply will always perform poorly in virtualization, and the sooner those applications are discovered, the better.  Knowing what does and does not work in virtualized environments gives you options – you can replace the app, run the app on a dedicated server, or even recode the app to work better in virtualized environments.  But without visibility, you have no options.
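
To make the baseline idea concrete, here is a minimal sketch in Python. It is only an illustration: the application names, the sample response times, and the 25% tolerance are all made-up assumptions, not output from any CA or NetQoS product. It compares median response times captured before virtualization against those captured after, and flags the apps that got noticeably slower.

```python
from statistics import median

def flag_regressions(baseline, virtualized, tolerance=0.25):
    """Return apps whose median response time grew by more than `tolerance`.

    baseline / virtualized: dict mapping app name -> list of response times (ms).
    The result maps each flagged app to its fractional slowdown (0.55 = 55% slower).
    """
    regressions = {}
    for app, before in baseline.items():
        after = virtualized.get(app)
        if not before or not after:
            continue  # nothing to compare for this app
        base_med = median(before)
        if base_med <= 0:
            continue  # avoid dividing by a zero or nonsensical baseline
        change = (median(after) - base_med) / base_med
        if change > tolerance:
            regressions[app] = round(change, 2)
    return regressions

# Hypothetical sample data, in milliseconds (assumed for illustration only)
baseline = {"billing": [100, 110, 105], "crm": [200, 210, 190]}
virtualized = {"billing": [108, 112, 104], "crm": [320, 300, 310]}
print(flag_regressions(baseline, virtualized))
```

An app that shows up in the result is a candidate for one of the options above: replace it, give it a dedicated server, or recode it.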

Between the reduction in energy consumption and the better utilization of existing servers, the benefits of virtualization are worth the risk, but there’s nothing that says that you can’t bring in everything you can to get visibility into your virtualized servers and mitigate the risk. 



Risky Business


Bruce Schneier, if you don’t know him, is one of the Web’s foremost experts on security. I don’t just mean computer security, though he focuses on that – but security overall, including anti-terrorism and crime security. I read his blog often, because even though I’m not a security geek, his writings are very insightful.

Schneier often talks about how human beings can sometimes misunderstand the ideas behind risk.

For example, there’s the oft-cited example that statistically it’s safer to have a gun in the home than a pool. (How the gun got into the pool, I’ll never know!)

That is, people are more willing to put up with pools than guns, because you get more enjoyment out of a pool than a gun, and because a gun is designed to be dangerous (if you’re standing at the wrong end of it), while the dangerousness of a pool is an incidental side-effect of water and concrete… yet proportionally, more people die in homes with pools than in homes with guns. So when we evaluate risk, most people instinctively think the gun is “riskier” than the pool.

But other than those “freakonomics” type cases, Schneier points out in his latest post that for the most part, human beings do understand risk, and that there is a certain level of risk that we’re comfortable with – indeed, there’s even a certain amount of risk that we crave.

So when he was at a security conference where the speaker made the familiar complaint that end users at a company don’t understand security and don’t grasp its importance, Schneier suggested that perhaps the security researcher didn’t understand the importance of the end users getting their jobs done.


They know what the real risks are at work, and that they all revolve around not getting the job done. Those risks are real and tangible, and employees feel them all the time. The risks of not following security procedures are much less real. Maybe the employee will get caught, but probably not. And even if he does get caught, the penalties aren't serious.

Given this accurate risk analysis, any rational employee will regularly circumvent security to get his or her job done. That's what the company rewards, and that's what the company actually wants.


It’s the old argument about the balance between security and performance – that is, that security is there to prevent loss, and everything else in the company is designed around making necessary gains.

We’ve seen where security procedures have severely degraded application performance, and we’ve seen overreactions become worse than the problems they are designed to solve.

Schneier made this suggestion to the conference presenter:


"Fire someone who breaks security procedure, quickly and publicly," I suggested to the presenter. "That'll increase security awareness faster than any of your posters or lectures or newsletters." If the risks are real, people will get it.


So, in effect, Schneier suggests increasing the consequences of risky security behavior – in other words, increasing the personal risk to employees’ livelihoods. In this case, however, I disagree with him – a public firing of the next employee to write down his password on a post-it-note because he can’t remember which combination of random lowercase, uppercase, numeric and punctuation characters is the active one this month... has risks of its own.

It raises the question: does the company place as much importance on security as it does on productivity? Is it more important to be secure than to be effective?

In some industries, such as banking, the military, law firms, and hospitals, this may be the case; but for most businesses, such draconian policies make for an unpleasant work environment, and “degrade application performance” in the worst way possible – by degrading the employees.

What’s more, in a highly competitive company, these draconian security measures can be subverted to serve malicious goals – like an auto-immune disease. If you fire someone for putting a post-it-note with the department password on their monitor, how long is it before professional rivals will plant post-it-notes on other people’s computers in order to get competitors for promotions fired? This belongs in the world of David Mamet plays, not in the corporate workplace.

Instead, maybe it’s more important to make sure that the end-user has to understand as little about security as possible, and to be proactive about stopping attacks way before they even reach the end-user.

Because if I heard that someone lost their job because they couldn’t remember “Nei#oEVwi3” and had to write it down… I’d be looking for a new job. And I wouldn’t feel too guilty about using company time to spruce up the resume.



Two thoughts on health and the economy:


I’d like to be skinny. And have a million dollars. G’night folks!

------------

I have just been informed that, even though it is Friday, I still need to put at least some effort into writing an intelligent blog post.  So, here goes.

Here’s my first thought about health and the economy: Obviously, there have been massive layoffs across the board, and IT has not been spared.  Over the past two years, not only have there been layoffs due to the general contractions (or as I refer to them, death-spasms) of the economy, but since 2006, there has been an increase in the number of internationally outsourced jobs by IT service vendors, according to Network World. 


Data prepared by Everest Group Inc., a research and outsourcing consulting firm, shows in broad brush fashion the shift of jobs overseas by some major IT services vendors. In 2006, U.S. and European firms typically had less than 20% of their workforces offshore; Now, for most companies that figure may well be generally over 30%.


At the same time, many laid off workers are starting their own businesses.  Certainly not all of them, but when you need a job, and no one is hiring – entrepreneurship and despair seem the only logical choices. 


A quarterly survey of 3,000 job seekers conducted by Chicago-based outplacement firm of Challenger, Gray & Christmas, Inc., released Thursday, shows a near doubling in the year-to-year growth of job seekers turning to self-employment.


The problem with this is that because more people are becoming unemployed or self-employed, it creates additional political pressure on the health care debate.  The unemployed and self-employed have to rely on the private insurance market for healthcare, which is where the majority of healthcare horror stories exist, according to NYT opinion columnist and Nobel Laureate Paul Krugman. Krugman argues that government regulations require that employer contributions to health care “can’t discriminate based on pre-existing medical conditions or restrict benefits to highly paid employees,” and thus people don’t see the worst parts of private insurance until they’re either unemployed or self-employed.

Which brings me to my second thought on health and the economy: the “American Recovery and Reinvestment Act” (a.k.a. “the Stimulus”) gives $19B for healthcare-related information technology, but also requires that each American have an electronic health record in 5 years.  TechTarget argues that meeting that deadline simply isn’t that easy:

"The concern is that when you have these programs that are time limited … that the quality of those implementations could go down," said Chad Eckes, the chief information officer for Schaumburg, Ill.-based Cancer Treatment Centers of America (CTCA). "There can be spectacular failures of electronic health records, because folks didn't anticipate what might happen if it was unstable, and that can have disastrous consequences for patient care."

We’ve seen some of the difficulties with medical data networks before – with the importance of maintaining network performance when lives are literally on the line. 

There’s no question that electronic data records are faster than paper records, and (usually) more accurate… or at least, more accessible in a crisis, which is why there is a mandate.  The problem arises when there is poor network performance, in which case records are less accessible.  Moving the patient records from paper to data shifts the onus of providing that information from the administrative staff of a hospital or medical center to the IT department.



Fear of the Unknown


One of the things holding back the rollout of new applications (like VoIP, Video, and Unified Communications) is the fear that the new applications will cause network performance problems, according to Network World’s Denise Dubie, citing a survey from Apparent Networks.


Nearly 61% said that they had delayed a VoIP implementation due to network performance concerns. Some 35% postponed a video rollout for the same reasons and 26% put a unified communications project on hold. The survey also showed that network managers can’t always validate their service-level agreements (SLA) with external service providers. More than one-quarter of respondents don’t have the capability to validate SLAs.


It would be instructive to know if decision makers are “concerned” that new apps will reduce their performance because they have baselined performance and know that the network cannot handle new application rollouts… or if they’re concerned because they have no idea whether the network can handle it or not.

It’s the difference between being stopped by practicality and being paralyzed by fear.

And if you’re being paralyzed by fear, it’s costing you money.

For example, Cisco decided to “eat its own dogfood” and estimated that they saved $277M from bringing in their own virtual office telecommuting technology – a new application (based on their “Cisco Virtual Office”) for the network that leads to cost savings. If Cisco didn’t know that their network was capable of supporting the CVO application, they would have been out $277M.

Of course, the reason you don’t roll out an application that might save you millions when you don’t know whether those applications will negatively affect network performance is that poor network performance can cost more than whatever you’d save by the rollout.

You can know, or you can be paralyzed by fear of the unknown. I know which I’d rather be.



“The world will look up and shout, ‘Verify our SLA.’ And I’ll whisper, ‘No.’“


Okay, hard decision time.  Do I go grab the very last ticket to see “Watchmen” at the last showing tonight alone, or do I spend a sleepless night tossing and turning, while having nightmares about blue men and conspiracy theorists as I wait to see “Watchmen” with my friends the next day? 

Ha Ha!  Just kidding!  I don’t have any friends!

It should not surprise anyone that I have read Watchmen, and that I consider it the most culturally significant work of comic fiction produced since William Hogarth’s “A Rake’s Progress.”  But it also speaks to me on a personal level.  That is, the core themes of “Watchmen” have quite a lot to do with network performance.

Okay, maybe “network performance” isn’t on a “personal level” in the way that some sort of romantic relationship or spiritual experience would be.  But it’s still important.  Anyway…

Who, indeed, watches the watchmen?  For example, carriers claim that by going to MPLS you would get better performance.  But there’s not a lot of data provided by the service providers to verify those performance gains.  Without that ability to quantify how the carrier network is performing, you have no idea if providers are living up to their service level agreements – or whether you’re better off switching to MPLS in the first place.

Additionally, without watching carefully, you can have well meaning IT teams affecting the network – changes which seem trivial to an application developer may actually cause major repercussions to the network traffic - generating and transmitting graphics required for a CAPTCHA, for instance, when previously you were only dealing with a text-based application.

The most extreme case we’re familiar with is when one team of application developers flipped a single flag in the database, making a graphics field available to the user, sending 1 MB worth of graphics per page on the application.   This one “little” change caused network traffic to balloon dramatically over a single night. 

It is not that carriers or application developers are mean or untrustworthy – just that they may not know the impact of what they do on the network; and as such, you should.

This means, of course, monitoring the overall round-trip time from end-to-end on an application by application basis, and monitoring for sudden changes in the network. 
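
As a rough illustration of what “monitoring for sudden changes” can look like, here is a small Python sketch. Everything in it is an assumption made for the example: the `RttMonitor` class, the window size, the three-standard-deviation threshold, and the sample values are hypothetical, and a real monitoring product will be far more sophisticated. It keeps a short rolling history of round-trip times per application and flags any new sample that jumps well above recent behavior.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev

class RttMonitor:
    """Track recent round-trip times per application and flag sudden jumps."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.threshold = threshold      # how many "spreads" above average counts as a jump
        self.min_samples = min_samples  # don't alert until there is some history
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, app, rtt_ms):
        """Store one sample; return True if it is a sudden jump vs. recent history."""
        samples = self.history[app]
        alert = False
        if len(samples) >= self.min_samples:
            avg = mean(samples)
            # Floor the spread at 5% of the average so a very flat history
            # does not raise alerts on ordinary jitter.
            spread = max(pstdev(samples), 0.05 * avg)
            alert = rtt_ms > avg + self.threshold * spread
        samples.append(rtt_ms)  # the outlier joins the history too, which is a simplification
        return alert

# Hypothetical samples (ms) for one application, ending with a sudden jump
monitor = RttMonitor()
for rtt in [40, 42, 41, 39, 40, 41, 120]:
    if monitor.record("crm", rtt):
        print(f"sudden change for crm: {rtt} ms")
```

Keeping a separate history per application matters: a round-trip time that is normal for a bulk transfer could be a serious regression for an interactive app.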



Change Management when the Network Changes Us.


If you want to score some easy points with the geek crowd, tell them that DRM (Digital Rights Management) stinks. But with the stress of the elections, I could use a few easy points, so humor me.

When you’re evaluating a change to the network, you have to think – always – what will the real effect on network performance be? And this is important – if there’s one overarching theme this blog has had over the past two years, it is that the network does not begin and end with the router. It doesn’t even begin and end with applications. No – when you think about what the real effect on network performance will be, you have to think beyond the technical into the realms of the personal and psychological. That’s true end-to-end performance.

There are two stories that have been making the rounds on the Web recently. The first, which we’ve covered extensively, is the Australian Internet filtering software. The second is the release of the highly anticipated Fallout 3 PC video game bundled with SecuROM DRM software. This is notable for two reasons: First, Fallout 3’s publisher, Bethesda Softworks, earlier took a stand against DRM for the release of its other major product, Oblivion. Second, SecuROM, made by Sony, is especially invasive, and when used with EA’s hit, “Spore,” caused a nasty backlash – which included a campaign to pirate the game via BitTorrent just to spite EA.

Bethesda Softworks insists that the SecuROM software is only used for a CD check and is not nearly as restrictive as the software that came with Spore.

A common complaint with DRM schemes is that they actually cause more problems for the people who legally purchase the game than for those who break the DRM in order to pirate the game. The worst DRM schemes can make using the product a hassle, cause system instability, and just generally be a pain in the butt, while doing nothing to stop piracy. The best DRM schemes don’t get in the end-user’s way, allow for reasonable use and portability, and are never under threat of expiration. Of course, these don’t do anything to stop piracy either, but let’s only concern ourselves with the latter for right now.

Point is, when you’re hassling only the people who adhere to the rules, you’re creating a situation where people will get around the rules.

In the earlier post we did on Australian net filtering, the point was made that for five dollars a month, you can set up a VPN in the United States to get around all the restrictions of any of the proposed filters. However, in doing so, you’re essentially routing all traffic through the United States and back again to Australia – putting added stresses on the very expensive international pipes even when accessing local content that would have been better served via local pipes.

This is obviously a large-scale version of the problem, but enterprises that deny access (rather than de-prioritize bandwidth) to particular protocols, sites, or applications will find employees will get around the obstacles in order to do their job.

But even this is a very specific example of a larger point – one that goes beyond tactics, beyond strategy – to network philosophy. Even if you don’t think of it as a change to the network, when you change how users behave – even if it’s a new policy from the HR department that gets printed on a piece of paper, you change how the network is being used. Changes can occur to the network from outside the network as well – when culture changes, so does the way that people use the network.

All of this is leading to the point: Even though you aren’t planning on changing the network yourself, you should consider always keeping a close eye on your network with network performance monitoring tools. Change management is not just for when you change the network; change management is for when the network changes you.



What network performance taught me about optimizing a lemon


David Oliver talks about his experiences running the 24 Hours LeMons race in Houston, and how knowing about network performance helped him optimize his junker.








Disasters in IT, and Ninja Networking


Other than Unix Beards and “funny” T-shirts with hex code on them – which more accurately qualify as fashion disasters – the biggest project disasters in IT, according to today’s top story in Computerworld, tend to repeat themselves:


When you look at the reasons for project failure, "it's like a top 10 list that just repeats itself over and over again," says Holland, who is also a senior business architect and consultant with HP Services.


You’ve got your usual run of top-ten disasters in the article, including IBM’s Stretch project (overpromised and underdelivered), Knight-Ridder’s Viewtron (misread the market), California and Washington States’ DMV overhaul and FoxMeyer’s ERP program (didn’t make sure the new system worked better than the old one), Apple’s Copland (succumbed to feature-creep), Sainsbury’s warehouse automation (just plain didn’t work), Canada’s Gun Registration System (cost much more than anticipated due to poor planning), and three U.S. government projects (multiple failures with perhaps more in the future).

But one of the things that I noticed was that it’s relatively rare (not unheard of, but relatively rare) to see networking take a prime role in the huge IT disaster stories that get passed around the campfire during IT tribe meetings. And I think that there are a few reasons why that is – the first is that most of these blunders would fall under the category of “strategic errors” as opposed to “tactical errors.” That is, network problems are usually subtle errors caused by mis-configurations and highly technical mistakes. The networking screw-up can be one of the most subtle, stealthy types, compared to the grandiosity of all-out strategic incompetence.

Or in other words, networking performance problems can cause the best laid plans to often go astray; the worst laid plans need no additional help.

Take, for example, a common error from back when they were first rolling out VoIP deployments – companies would roll out VoIP on the network as if it were just another data application, but then found that their other applications slowed to a crawl or even stopped working.

The problem was that VoIP packets are carried over protocols (typically RTP over UDP) that do not back off when the network is congested, while most applications are based on the TCP protocol, which is designed to throttle back its use of the pipeline if packets don’t go through. So what happened was that the VoIP packets would take more of the pipe, TCP applications would be crowded out and drop packets, which would cause the TCP protocol to throttle back, and the VoIP packets would now see the free space and take up more of the pipe, crowd out TCP packets and TCP would throttle back… creating a vicious cycle.
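
Here is a toy Python model of that squeeze. It is a sketch under loud assumptions, not a real protocol simulation: the VoIP traffic is modeled as a fixed-rate stream that never backs off, TCP is approximated by a single additive-increase/multiplicative-decrease flow, and the capacity and rates are arbitrary units I made up. The `simulate` function is hypothetical, too.

```python
def simulate(capacity, voip_rate, rounds=200):
    """Average rate achieved by one TCP-like flow sharing a link with VoIP.

    capacity:  link capacity (arbitrary units, assumed)
    voip_rate: constant load from a stream that never slows down (assumed)
    """
    tcp_rate = 1.0
    total = 0.0
    for _ in range(rounds):
        if voip_rate + tcp_rate > capacity:
            # Link is oversubscribed: packets drop, and only TCP reacts,
            # by halving its rate (never below a minimal trickle).
            tcp_rate = max(tcp_rate / 2, 1.0)
        else:
            # No loss seen: TCP probes for more bandwidth.
            tcp_rate += 1.0
        total += tcp_rate
    return total / rounds

# As the VoIP load grows, the TCP flow's average share shrinks.
for voip in (20, 60, 95, 100):
    print(f"VoIP load {voip:>3}: TCP average rate {simulate(100, voip):.1f}")
```

The design point is the asymmetry: only one side of the contest ever slows down, so every bit of extra voice load comes straight out of the TCP applications.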

Was this a problem with strategy? Was it some form of bureaucratic incompetence? No – it’s just that it was a very subtle effect and if you didn’t know enough about the TCP and VoIP protocols (or even if you did, but didn’t put two and two together until it was deployed) you ended up with a problem.

Networking problems may have major effects but they’re rarely caused by major boneheaded screw-ups. I think that’s one of the reasons why networking and security are the two major areas where IT departments spend a great deal of money: those two kinds of problems are extremely subtle to detect and tricky to solve; security problems by malicious design, networking by nature.

Networking problems are subtle, can strike quickly, and can often leave little trace of their presence. They’re the ninjas of IT problems.

Of course, ninjas can be defeated.



Georgia on my mind.


I’ve been getting a number of e-mails and comments asking why I haven’t yet written anything about the Russian/Georgian war and the supposed “cyber-warfare” taking place. ZDNet has written extensively about the DDoS attacks being waged against Georgian government sites.

At first, I thought that this was solely a security issue. As a general rule, I don’t like to talk a whole lot about computer security on Network Performance Daily because I lack the proper mindset to get around security – security experts are people who look at things and see how to break them down, network performance experts are generally people who look at things and see how to build them better. Besides, there are tons of blogs out there about computer security, and very few about network performance.

I’m not going to get into the geopolitical aspects of it, except to say that getting involved in a land war in Asia is one of the “classic blunders.”

However, I did start thinking about things… I mean… wasn’t the Internet partially designed to be a resistant form of communication in case the Russians ever attacked? The irony of the Russians effectively taking down a country’s Internet is… well, it’d be funny if it wasn’t for all the people dying.

What this does tell me, however, is that cloud computing (and I’ll continue to call it that despite Dell’s claim to the term,) has a long way to go. While the Internet can be cheaper and simpler than having a fully-fledged IT department monitoring in-house servers and applications on leased lines over a WAN, the one problem that in-house IT has licked is fault.

For the most part, we’ve managed to get it so that we no longer worry about fault on the enterprise network. It was a while ago that we passed the 99.999% uptime mark. So while we may worry about security and performance, we typically don’t have to worry about the network not working.
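
For a sense of scale, “five nines” leaves very little room: roughly 5.26 minutes of downtime per year. The arithmetic is simple enough to sketch in Python; the function name and the 365-day year are my own assumptions for the example.

```python
def allowed_downtime_minutes(availability_pct, period_hours=365 * 24):
    """Minutes of downtime permitted over a period at a given availability.

    availability_pct: uptime target as a percentage, e.g. 99.999
    period_hours:     length of the period; defaults to a 365-day year
    """
    return (1 - availability_pct / 100) * period_hours * 60

# Compare a few common uptime targets over one year
for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

By comparison, a service that is down for even “a few minutes” once or twice a year has already used up a five-nines budget.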

But cloud computing still has fault problems. And it doesn’t take the Russians attacking. I love Stumbleupon, but they went down for a few minutes yesterday – Twitter also, but they’ve got problems. Even Gmail, which I greatly rely upon for my personal e-mail, went down for a little while earlier this week.

By and large, cloud computing makes great solutions for smaller companies and start-ups because of the low cost, low maintenance, and portability. However, the tradeoff is reliability – Internet applications simply aren’t as reliable as the bulky solutions that get things done when a single hour of downtime can mean thousands in lost business.

There really is no such thing as a private cloud. The entire concept revolves around using IT services offered from outside companies, which connect on public lines through to shared servers.

This is not to say that there is no room for the cloud in enterprise computing but that incidents like the South Ossetian war show that Internet applications suffer from one fatal flaw: They’re on the Internet.


