Kevin Davis, a senior consultant at NetQoS, will be presenting a few training sessions at Symposium about SuperAgent, the end-to-end response time module of the NetQoS Performance Center. One of them covers how to use time-based network metrics in troubleshooting. He talks about that session below.
In the session, I’m going to cover the importance of using a time-based metric in troubleshooting, because end users complain first and foremost about time. For example, they’ll say “the application is running slow,” or they believe “the network is slow.” To users, everything is based on time; that’s what they’re complaining about. And they’re correct.
Counterintuitive as it may seem, thinking of performance in terms of time is new to many people, because most are used to reading utilization graphs. With utilization graphs, however, we don’t know whether 70, 80, or 90 percent utilization is actually impacting the user experience. After all, we buy networking equipment (routers, switches, firewalls, servers) and we want it to be highly, or at least efficiently, utilized. High utilization could indicate a problem, or it could just indicate that you haven’t over-purchased. So you can have a link at 90 percent utilization or a router at 90 percent CPU utilization, but you won’t know whether that’s impacting the end user without a time-based metric.
It’s time-based data that tells you how the users are being impacted. Sure, the utilization data (interface utilization, memory utilization, I/O utilization) can often tell you what is causing the impact. But the time-based data shows you the degree of the impact: the real-world effect on end users. With a time-based instrument such as NetQoS SuperAgent, you can find out where the delay increase is occurring and whether it’s based in the network, the server, or the application.
In fact, you can look at time-based data and determine very quickly which entity is creating the performance issue. The beautiful thing about SuperAgent in particular is that it trends by time 24/7, so not only can you determine how your important business applications are being impacted today, but you can also go back and look at recurring patterns in performance issues. You can see whether today is worse than yesterday, last week, or last month.
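As a rough illustration of that comparison, here is a minimal Python sketch that takes a baseline and a current delay for each component and reports which one grew the most. The component names, the numbers, and the `largest_increase` helper are all hypothetical, made up for this example; they are not SuperAgent output or part of any NetQoS interface.

```python
# Hypothetical per-component delays in milliseconds. Illustrative values only:
# these are not real measurements and not SuperAgent output.
baseline_ms = {"network": 20.0, "server": 35.0, "application": 50.0}
today_ms = {"network": 22.0, "server": 90.0, "application": 52.0}


def largest_increase(baseline, current):
    """Return (component, increase_ms) for the component whose delay grew most."""
    deltas = {name: current[name] - baseline[name] for name in baseline}
    worst = max(deltas, key=deltas.get)
    return worst, deltas[worst]


component, increase = largest_increase(baseline_ms, today_ms)
print(f"{component} delay rose by {increase:.0f} ms over baseline")
```

With these made-up numbers the server stands out, which is the kind of quick determination a time-based view is meant to support; a utilization graph alone would not show how much slower things became.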
In the session, I’ll also be going over how to architect the data center for performance. Placement of servers that participate in n-tier architectures is critical for the health and performance of the application, and indeed of the data center. I’ll also talk about how different protocol implementations, for example Microsoft’s TCP/IP stack, can enhance or degrade application performance.
Placement matters most for servers that serve the same application. For example, a front-end Web server and a back-end Oracle database really should be on the same switch and the same VLAN. That way they receive optimum service from the network. If their traffic does leave the switch, it has to contend for bandwidth going up and down the switch links, and it gets switched and routed multiple times.
Based on measurements from customer environments and from our own laboratories, two servers on different switches can have up to 18 milliseconds of delay between them. Network engineers use a rule of thumb of one millisecond per 100 miles. By that measure, when we put two servers on different switches, or on two different VLANs on the same switch, we’re in effect making it look like those servers are 1,800 miles apart, as if one server were in Los Angeles and the other in Memphis.
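The arithmetic behind that comparison can be sketched in a few lines of Python. The only inputs are the two figures quoted above (18 milliseconds and one millisecond per 100 miles); the function and constant names are hypothetical, chosen just for this example.

```python
MILES_PER_MS = 100  # rule of thumb quoted in the text: 1 ms of delay per 100 miles


def equivalent_distance_miles(delay_ms, miles_per_ms=MILES_PER_MS):
    """Convert a delay into the distance it would represent under the rule of thumb."""
    return delay_ms * miles_per_ms


print(equivalent_distance_miles(18))  # 1800
```

The same conversion works for any measured delay, so a few extra milliseconds between two servers in one data center can be restated as the distance it makes them appear to be apart.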
