Here are some rather bad photos of my 1972 Holden HQ Kingswood Premier, one of my first ever cars (and one that I sadly no longer own):
This was the V8 Four Litre model (actually 253 cubic inches, often jokingly described as having all the power of a 6 cylinder with the fuel economy of an 8). The engine bay was so huge and empty I could open the bonnet and sit on the side of the car with my feet comfortably inside the bay while I changed spark plugs or cleaned the points.
The Kingswood was not what you would call an instrumented vehicle. The dashboard had a speedo, a fuel gauge and three lights: Temperature, Oil and Charging. I dubbed these three the idiot lights, because once they came on, you were the idiot. (Sorry, no picture; this was the 1980s.)
Modern storage infrastructure, by comparison, is slightly more instrumented. A vast array of metrics is tracked, and these can be used to perform all sorts of analysis, such as:
- Are my hosts getting good response times?
- Are specific disks or arrays being overworked?
- Are my fibre ports being used in a balanced fashion?
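To make the last question concrete, here is a minimal sketch of what a "balance" check could look like once you have per-port throughput figures in hand. Everything in it is an assumption for illustration: the function name `port_imbalance`, the port labels and the MB/s numbers are all made up, and it does not read anything from a real system.

```python
from statistics import mean


def port_imbalance(throughput_mb_s):
    """Return (share of total traffic per port, busiest port / average port)."""
    total = sum(throughput_mb_s.values())
    if total == 0:
        # No traffic at all: every share is zero and there is no imbalance to report.
        return {port: 0.0 for port in throughput_mb_s}, 0.0
    shares = {port: mb / total for port, mb in throughput_mb_s.items()}
    ratio = max(throughput_mb_s.values()) / mean(throughput_mb_s.values())
    return shares, ratio


# Hypothetical sample figures in MB/s, not taken from any real array.
sample = {
    "node1_port1": 120.0,
    "node1_port2": 115.0,
    "node1_port3": 20.0,
    "node1_port4": 145.0,
}

shares, ratio = port_imbalance(sample)
for port, share in sorted(shares.items()):
    print(f"{port}: {share:.0%} of traffic")
print(f"Busiest port carries {ratio:.2f}x the average")
```

A ratio close to 1 suggests the ports are sharing the load evenly; a port with a very small share (like the third one in this made-up sample) is the sort of thing worth a second look.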
So can you do this with the Storwize products? Of course! I documented the built-in tools here (where I talked about the Performance GUI):
And here (where I talked about the performance CLI):
But these tools have only limited usefulness. They are not granular, in that you cannot look at specific hosts or specific arrays or specific FC ports (meaning the three analysis ideas I suggested above are not even possible). So how can we do this analysis?
The good news is that Storwize products do track all the metrics needed for very granular analysis, and the files that hold them are freely accessible. IBM documents these files; here is a fairly old page that covers some of them:
But how to turn these into something useful? There used to be a tool called svcmon but this tool appears to have been killed as per this rather sad blog post:
There is another IBM Community developed tool called qperf which you can access using the link below:
With a graphing tool here:
And another tool here: http://www.stor2rrd.com/
And yet another one here! https://code.google.com/p/svc-perf/
The challenge with many of these tools is that they require manual setup, usually have a limited database engine, and do not always make analysis easy.
You can of course use IBM’s TPC:
You could also consider Intellimagic. Although I have not looked too deeply at this one, these guys wrote IBM’s Disk Magic tool, so they certainly understand storage performance.
The challenge for all Storage Admins is that they are not always experts at diagnosing performance issues. Getting some genuine examples of the thinking process, and of the flow from problem to solution, is vital. This makes BVQ another good choice.
To see an example of how instrumented data presented in a graphical format can be used to generate a useful problem analysis, check out this blog post here:
and another one here:
I really like these posts for two reasons:
- They clearly show just how instrumented the product is.
- They clearly show how using this data in a graphical format can lead to good and quick root cause analysis.
Also have a look at some of these videos:
So how are you instrumenting your Storwize?
What do you find the easiest tool to use?