Storwize V7000 and SVC Performance Monitoring

I have previously blogged about the performance monitoring panel that IBM added to the Storwize V7000 and SVC GUI. You can find that post, with links to some YouTube videos, here. IBM introduced this panel in version V6.2 of the code and enhanced it in V6.3. What I didn't mention at the time is that the same information shown in that GUI panel is also available from the CLI, using two new commands: lssystemstats and lsnodestats. I have hyperlinked both commands to take you to the SVC Information Center (for a more detailed description of the output).

The nice trick here is that you can use them to access some very useful instant performance information without starting the GUI at all. Here are just a few of the lines of output you will get from the system stats command. You can see the Fibre Channel (fc) traffic in MBps and IOPS, and the overall VDisk (volume) and MDisk response times in milliseconds (ms), throughput in MBps, and IOPS.

dr1-poc:~ # svcinfo lssystemstats
stat_name      stat_current stat_peak stat_peak_time 
cpu_pc         0            0         120622045248   
fc_mb          365          365       120622045248   
fc_io          17724        17724     120622045248   
vdisk_mb       135          135       120622045248   
vdisk_io       2169         2169      120622045248   
vdisk_ms       5            5         120622045248   
mdisk_mb       80           80        120622045248   
mdisk_io       539          539       120622045248   
mdisk_ms       18           18        120622045248

We can also query specific nodes. Here is an example run against node 1 in my SVC cluster. Again I have not shown all the output, just a few example lines. You can see that read (_r_) and write (_w_) traffic are shown separately.

dr1-poc:~ # svcinfo lsnodestats 1
node_id node_name stat_name      stat_current stat_peak stat_peak_time 
1       node1     cpu_pc         3            3         120622045503   
1       node1     fc_mb          307          347       120622045453   
1       node1     fc_io          13239        14414     120622045258     
1       node1     vdisk_r_mb     92           96        120622045458   
1       node1     vdisk_r_io     1485         1545      120622045458   
1       node1     vdisk_r_ms     0            0         120622045503   
1       node1     vdisk_w_mb     93           102       120622045258   
1       node1     vdisk_w_io     1502         1647      120622045258   
1       node1     vdisk_w_ms     7            10        120622045248   
1       node1     mdisk_r_mb     6            70        120622045248   
1       node1     mdisk_r_io     26           285       120622045248   
1       node1     mdisk_r_ms     3            6         120622045343   
1       node1     mdisk_w_mb     12           14        120622045423   
1       node1     mdisk_w_io     240          275       120622045423   
1       node1     mdisk_w_ms     24           35        120622045248

You can also filter on just one (or more) values. If you are interested in confirming your CPU utilization before turning on Real-time Compression, here are some commands you could run. Firstly, this command will show just the CPU utilization at that moment:
svcinfo lssystemstats -filtervalue stat_name=cpu_pc

If you plan to send the output to a file, run the same command with no header:
svcinfo lssystemstats -nohdr -filtervalue stat_name=cpu_pc
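As a quick sketch (assuming you run the CLI over SSH from a management host; the user name, cluster address and file path here are just placeholders), you could prefix each sample with a timestamp and append it to a log file:

echo "$(date '+%Y-%m-%d %H:%M:%S') $(ssh admin@cluster svcinfo lssystemstats -nohdr -filtervalue stat_name=cpu_pc)" >> /tmp/cpu_pc.log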

You can also get five minutes of history. You could run this every five minutes (outputting to a file) to build a view over a longer period:
svcinfo lssystemstats -nohdr -history cpu_pc
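One way to automate that (again just a sketch, assuming a Linux management host with key-based SSH access to the cluster; user, host and path are placeholders) is a crontab entry that captures the five-minute history every five minutes:

*/5 * * * * ssh admin@cluster "svcinfo lssystemstats -nohdr -history cpu_pc" >> /var/log/svc_cpu_history.log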

In all the cases above, we are looking at an average of both (or all) nodes (the system stats). You could also run the same commands against just one node with lsnodestats. In this example I run them against node 1:
svcinfo lsnodestats -nohdr -history cpu_pc 1 
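If you want to cover every node rather than just node 1, one possible approach (a sketch only, again assuming SSH access with placeholder names) is to loop over the node IDs reported by lsnode:

for node in $(ssh admin@cluster "svcinfo lsnode -nohdr -delim :" | cut -d: -f1); do
  ssh admin@cluster "svcinfo lsnodestats -nohdr -history cpu_pc $node"
done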

So next time you're logged on to the CLI, check out these commands and add them to your repertoire of handy CLI tricks, or even script them and build your own instant performance monitoring tool (which is an interesting idea).
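To illustrate that last idea, here is a minimal sketch of such a tool: a shell script on a management host that polls lssystemstats over SSH and appends selected stats to a CSV file. The user name, cluster address, stats list, interval and output path are all assumptions to adjust for your own environment, so treat this as a starting point rather than a finished tool.

#!/bin/sh
# Minimal SVC/Storwize stats collector (sketch only).
# Assumes key-based SSH access to the cluster; adjust these placeholders to suit.
CLUSTER="admin@cluster"                  # SSH user and cluster management address
STATS="cpu_pc vdisk_ms mdisk_ms fc_io"   # stats to collect
INTERVAL=60                              # seconds between samples
OUTFILE="/tmp/svc_stats.csv"

echo "timestamp,stat_name,stat_current" > "$OUTFILE"
while true; do
  NOW=$(date '+%Y-%m-%d %H:%M:%S')
  for STAT in $STATS; do
    # -nohdr drops the header line; awk picks out stat_current (column 2)
    VALUE=$(ssh "$CLUSTER" "svcinfo lssystemstats -nohdr -filtervalue stat_name=$STAT" | awk '{print $2}')
    echo "$NOW,$STAT,$VALUE" >> "$OUTFILE"
  done
  sleep "$INTERVAL"
done

You could then pull the CSV into a spreadsheet to graph the trend over time.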

For more hints about using the SVC and Storwize V7000 CLI, read my earlier blog post here. There is also a page on IBM developerWorks here dedicated to SVC and Storwize V7000 scripting. Check it out! If you have any cool scripting hints, I would love to hear them.

About Anthony Vandewerdt

I am an IT Professional who lives and works in Melbourne Australia. This blog is totally my own work. It does not represent the views of any corporation. Constructive and useful comments are very very welcome.

38 Responses to Storwize V7000 and SVC Performance Monitoring

  1. Pingback: Storwize V7000 and SVC Performance Monitoring « Storage CH Blog

  2. carlos says:

    Hi,

    Do those commands work if, let's say, I want to monitor my Storwize for a week?
    The idea is to monitor how my new V7000 behaves, since we already have 15 servers attached to it, we want to add 15 more, and we don't want to impact performance.

    How do I find out about real performance? I want to know whether my V7000 is already hitting the ceiling or whether it is working well.
    Is there any tool for this, or are the ones you posted OK?

    Thanks a lot

  3. Claudio says:

    Hi Anthony, thank you for this article. I have a couple of questions regarding migration between a DS4000 and a V7000.
    What is the prerequisite asking for the ESX host to allow volume copies to be recognized in the Migration Wizard? How do I verify that I meet this requirement ("I am not migrating a clustered ESX host", etc.)?
    I couldn't find any information about it. Can you point me the way?
    Does it mean that I can't migrate LUNs that contain VMware datastores?

    Thank you

    • Hi Claudio… use Storage vMotion and move the VMDKs… that way there is no need to migrate the datastores themselves. If you instead migrate the datastore, you need to take an outage on all VMs using space in that datastore. In addition, the volume ID of the datastore will change when you re-present it to ESX (because it now comes from the V7000 rather than the old storage). This can confuse ESX… you will need to do some funky stuff to get ESX to import the datastore without treating it as a new disk.

  4. Anand Gopinath says:

    Hi Anthony,

    Thanks a lot for all your great posts on the V7000. This seems to be the only blog with useful info on the V7000 for beginners like me.

    I need to build V7000s through the CLI. I would appreciate your help with the following queries:

    1. What is the best practice for creating RAID 5 and RAID 10 MDisks through the CLI?

    Disks from the same SAS chain, or disks split between chains?

    Disks from the same enclosure or different enclosures (like enclosure loss protection in the DS5000s)?

    Any performance presets while creating MDisks through the CLI?
    2. How can we create RAID 10 arrays while ensuring "RAID chain balance"? Does this mean choosing half the number of disks from each SAS chain? How does the V7000 decide which disks form the mirror pair?

    3. Are the hot spares configured in a SAS chain usable only in that chain? What will happen to MDisks spread across the chains?

    4. In one of our existing V7000s, we have 7 enclosures: 4 in chain 1 and 3 in chain 2. We have 5 spares configured in chain 1 and 2 spares configured in chain 2. How will this affect the hot spare protection? Do we need to change this config?

    • 1. What is the best practice for creating RAID 5 and RAID 10 MDisks through the CLI?

      Disks from the same SAS chain, or disks split between chains? <— Split between chains may give you slightly higher performance, but I don't think it is worth the careful planning.

      Disks from the same enclosure or different enclosures (like enclosure loss protection in the DS5000s)? <— There is no concept of enclosure loss protection with the V7000, as the loops don't come back to the controller. I would not bother attempting to achieve this.

      Any performance presets while creating MDisks through the CLI? <– No. The moment you start using the CLI, the machine assumes you know what you are doing.

      2. How can we create RAID 10 arrays while ensuring "RAID chain balance"? Does this mean choosing half the number of disks from each SAS chain? How does the V7000 decide which disks form the mirror pair? <— Good question, but it's not really relevant. As long as each RAID 0 set is on a different chain, we are safe.

      3. Are the hot spares configured in a SAS chain usable only in that chain? What will happen to MDisks spread across the chains? <– The machine can use a hot spare from either chain for either chain, but it will always prefer the local chain and will always try to create unique hot spares for each chain.

      4. In one of our existing V7000s, we have 7 enclosures: 4 in chain 1 and 3 in chain 2. We have 5 spares configured in chain 1 and 2 spares configured in chain 2. How will this affect the hot spare protection? Do we need to change this config? <— As long as you have enough spares to meet the spare goal of each array type, don't worry where they are; the machine will use them on either chain.

      • Anand Gopinath says:

        Thanks, Anthony. I had been looking for these answers for a long time.

        If I specify 8 disks (4 disks from each chain) to form a RAID 10 array, how do I make sure that disks from the same chain do not form a mirror pair?

        Should I specify the drives as mirror pairs on the command line?

        E.g., if I run mkarray with the option -drive a:b:c:d, will it mean a and b form one mirror pair and c and d form another mirror pair?

  5. Hi Anthony,
    I just did a code level upgrade on my V7000 (from 6.1.07 to 6.3.0.4), mainly to extract some performance data. I noticed that in 6.3.0.4 the available command is "lsnodecanisterstats", not lsnodestats. I would also like to know whether it is possible/fair to compare the performance of an XIV (Gen2) with a Storwize V7000, and your overall thoughts on such a comparison.

    Thanks.

    • Good point… in the V7000 they are not nodes, they are canisters. Thanks for letting me know this.

      As for comparing performance, that's what Disk Magic is for. Talk to your IBM pre-sales person or Business Partner who has access to it.
      Most V7000s have far fewer disks than XIVs and far less cache, so comparing the two is not totally fair… but as I said, that is what Disk Magic is for.

      • Thanks.
        Really appreciate your quick response.
        Gen2 XIV | 6 modules | nearly 99% of 43TB utilized | 1TB SATA disks | p750 | fully virtualized (dual VIOS) highly available Live environment, said to be outperformed by a Storwize V7000 | 300GB SAS disks | p740 DR environment. DR has less load, but in a parallel run of a batch process like EOD, the DR process finishes way ahead of Live.
        SAN access for hosts is given via vSCSI in both Live and DR. Is there a best practice for the queue_depth setting when it comes to XIV? Also, is there any advantage to using NPIV over vSCSI?

      • With XIV you really want the queue depth to be deep, as the XIV can handle lots of parallel processes.
        The V7000 will finish each job quicker, but you can throw more jobs at the XIV at the same time.
        What are your queue depths?
        NPIV vs vSCSI is all about moving admin around. I know SAN admins hate NPIV, but system admins find vSCSI more work.
        From a performance perspective I am not sure what differences you will see.
        I think your issue is more likely queue depth.

  6. Thanks. My understanding was that we had set queue_depth to 128 across the board.
    I just checked to get a complete view after you mentioned it, and found out that it's not consistent.
    Production Site
    – 2 nodes (p750), each with dual VIOS
    In VIOS
    appvg – queue_depth – 128
    oradatavg – queue_depth – 128
    oraredovg – queue_depth – 128
    oraindexvg – queue_depth – 128
    oraarchvg – queue_depth – 40
    LPAR – Application_Active & Application_Passive
    appvg – concurrent – queue_depth – 3
    LPAR – Database_Active & Database_Passive
    oradatavg – concurrent – queue_depth – 128
    oraredovg – concurrent – queue_depth – 128
    oraindexvg – concurrent – queue_depth – 128
    oraarchvg – concurrent – queue_depth – 3
    ==================================================
    DR Site
    – 2 nodes (p740), each with a single VIOS
    In VIOS
    appvg – queue_depth – 64
    oradatavg – queue_depth – 64
    oraredovg – queue_depth – 64
    oraindexvg – queue_depth – 64
    oraarchvg – queue_depth – 64
    LPAR – Application
    appvg – queue_depth – 3
    LPAR – Database
    oradatavg – queue_depth – 64
    oraredovg – queue_depth – 64
    oraindexvg – queue_depth – 64
    oraarchvg – queue_depth – 64
    ==================================================

  7. Damien says:

    Hi Anthony! How are you? Have you tried performance testing with Real-time Compression in a replication topology? For example, when you replicate with Metro or Global Mirror from storage A with a generic volume to storage B with a Real-time Compression volume, the performance is VERY BAD and it affects the write performance on storage A! I have escalated the case to IBM but no news so far. Thanks in advance!
    Damien

    • Damien says:

      Obviously, the same kind of RAID and number of spindles at source and destination. When you use a generic volume at the destination instead of Real-time Compression, everything works like a charm.

  8. I have a question you might be able to answer for me. :)

    My company utilizes both XIV and Storwize V7000. We are virtualizing the XIV onto the V7000 and everything works well… However, after doing some testing with SQLIO against this setup, all looks fine from both the server and the V7000 in terms of performance. One would think that if you check the performance stats on the V7000 that is virtualizing the XIV, you would see similar performance stats on the XIV for the storage pools/volumes you are virtualizing. This is not the case in my scenario. Just for example, let's say SQLIO reports 14,000 IOPS at 850 MBps using a 25GB test file, and the V7000 GUI stats say close to the same thing. I would assume that the virtualized storage volumes on the XIV should show similar stats, but instead I see something like 20 IOPS in the XIV GUI metrics. Could you explain whether this is by design or whether something has been misconfigured? I even went as far as disabling caching on the V7000. Thanks.

    • It's a great question. There are lots of good, valid reasons why you are seeing what you are seeing. The two most common causes are the influence of cache (the V7000 is using cache to service reads, eliminating the need to send them to the backend) and I/O coalescence. The V7000 will read 256KB blocks from the XIV, so if your host reads 4KB, the V7000 reads 256KB. Same with writes. This can make the IOPS from host to V7000 dramatically different from V7000 to XIV, normally making the backend IOPS much lower than the front end.
      The third thing is overlapping writes. If you write to the same block a lot, you might overwrite data in V7000 cache before it destages, eliminating that much I/O to the backend.

      So when we add these three things together, we reduce a lot of the backend traffic and make the numbers look different. Even turning off cache will not change the I/O block size conversion, which is the main cause of your confusion.

      • Thank you for the helpful information. Something I noticed as well is that if I run an SQLIO test directly against an XIV volume (not virtualized onto the V7000), my performance results are much lower from the point of view of the OS. I hit 15,000 IOPS when virtualizing my XIV, but when it's not virtualized I only hit around 6,000 max. The test I'm running now uses a 50GB test file with 4 threads, 64 operations, and a 64KB block size for 30 seconds. Is it normal to see much better performance virtualizing a volume as opposed to connecting it directly to the primary storage? I will definitely be virtualizing all my storage if I continue to see this much of a performance increase.

      • Is this XIV Gen2 or Gen3?
        The thread count favors the V7000.
        Add more threads, especially if you have Gen2.

      • We have 3 XIV Gen2s and 2 V7000s. I did more testing today and it appears the XIV doesn't handle 64K random I/O as well as something smaller like 8K sequential I/O. I did notice that the more threads, the better the performance seemed to get, but that was the case on both the XIV and the Storwize.

  9. Fabrice says:

    Hi Anthony,
    I did two tests and I am surprised to get the same performance in both tests.
    The tests were with iozone and a big 200GB file, to get outside the cache.
    All the disk groups are balanced RAID 10, with 300GB 15,000 rpm SAS disks.
    – test1: a pool with 1 MDisk
    – test2: a pool with 4 MDisks
    test1 and test2 gave me the same throughput, around 600MB/s.
    I understand this result for test1, as we are RAID 10 balanced (4 disks x 150MB/s each);
    however I was thinking that test2 would do concurrent writes across all the MDisks.
    When we were in the cache, the throughput was over 1GB/s.

    Do you have any comments or explanation?
    Thanks

  10. It is sometimes very interesting what we find when we look deeper into a performance issue. In this case, VAAI data rates are not visible above the cache layer of the SVC or Storwize. So if you run into performance issues and wonder why, it might be that a VAAI process is sending data underneath the cache layer. Funnily enough, this process is exhausting cache; it is using nearly everything that is available. To see how to uncover this, find more details here: http://bvqwiki.sva.de/x/rQCE

  11. Several V7000 with performance problems fixed this month :)

    We had a very interesting month. We fixed performance problems on 3 different V7000 systems in Europe; all of them are running much better now. The problems were found in very different places: SAN, cache overload, compression, …

    In no case did we have to extend a system; everything was solved with changes to the setup only.

    Do not hesitate to call us when you have performance problems with a Storwize or an SVC. We will use BVQ to locate the reason for them. No risk for you, no purchase obligation.

    Call us: http://tinyurl.com/CALL-BVQ

    I have written a white paper about one of these customer situations:
    http://bvqwiki.sva.de/x/eICr

    One example was also really interesting: a Metro Mirror between two V7000s where a volume on site 1 produced very bad latencies, and the reason was found at site 2!

  12. For everybody who is interested in more detail about performance analysis

    We are offering a WebEx about performance analysis on February 10
    (CET 17:00, EST 11 am, PST 8 am, MSK 8 pm)

    The BVQ Node / Cache / SAN analysis is well suited to find vulnerabilities and performance bottlenecks in SVC / Storwize or in the SAN.

    Enroll here
    http://bvqwiki.sva.de/display/BVQ/BVQ+online+workshops

    About the BVQ Node / Cache / SAN analysis:
    This is a quick test which gives a good overview of whether performance bottlenecks exist in the storage system. It covers the SVC / Storwize nodes, the managed disk groups and the node ports. It is a very quick but deep analysis of these areas in IBM SVC and IBM Storwize systems.

    An experienced engineer is able to perform this test in less than 20 minutes.

  13. I think this is of interest for this Thread

    Until now nobody could deliver full analysis support for the V7.3 code level. There have been many changes in the internal architecture of the systems, so without this support there are many white spaces where you can only guess what's going on!

    We just made BVQ V3.3 available with full V7.3 support!

    Analysis with full Storwize and SVC V7.3 Support!

    We are very proud to announce the general availability of BVQ Version 3.3.

    BVQ Version 3.3 is the first IBM SVC and Storwize monitoring and analysis solution with full support of the IBM SVC and Storwize Family Software V7.3.x.

    IBM SVC and Storwize Family Software V7.3.x enhances overall system performance by introducing a completely new cache architecture. With BVQ V3.3 we are now able to analyze this new cache architecture. This is essential for better understanding performance bottlenecks, because many root causes can only be found at this level.

    Going further, the new V7.3 code delivers information about the complete data path, from the SCSI front end down to the MDisk, which can now be analyzed using the new metrics in BVQ.

    BVQ 3.3 is also able to analyze FC, SAS, PCIe, FCoE, iSCSI, IPREP ports to allow deeper understanding of data flows in the storage backend.

    BVQ V3.3 delivers a completely new experience. The new BVQ docking framework allows you to detach windows and place them anywhere on the screen, or expand the view across several screens.

    Download here
    http://bvqwiki.sva.de/x/NoFmAQ

  14. dale says:

    Hi Anthony, I'm fairly new to the V7000s and I've been trying to troubleshoot an issue directly with IBM to understand why my Global Mirror is stopping all the time.
    IBM are telling me it's because my 1Gb link is oversaturated (we have log shipping and a nightly offsite backup that run).

    However, we have reviewed the bandwidth being used and this does not show any conclusive errors.

    I have been reviewing our infrastructure and I am interested in the background copy rate settings; at present ours is set to 984mbps and using 80%.

    Decreasing these settings seems to have undesirable effects on our environment, an 8-host cluster running 200+ VMs (users see issues with launching and running applications via Citrix 6.5).

    The storage has SSD with Easy Tier enabled, and is in excess of 50TB with about 24TB free.

    Are there any tips or advice I could take a look at?

  15. Ankit Mehta says:

    Hi Anthony,
    I have implemented the SAS chain concept in my environment: I have 10 MDisks, and I have distributed 5 on one chain and the other 5 on the other chain. How can I check in real time that my performance is now better after this change on my V7000, without going further into the application layer?

    Thanks in advance.
    Ankit Mehta

    • Hi Ankit,

      I am not convinced this change will create a huge performance improvement.
      If it does, the way you will see it is in lower VDisk response times.
      If you have TPC you can monitor this, but the built-in tools are not granular enough.
      You may need to go into the application layer to see if things have changed.

  16. Hello Ankit,

    This will now sound like self-promotion (and it is), because I am talking about the tool we offer for SVC / Storwize performance analysis. Our tool is named BVQ and is developed by the biggest IBM Partner in Europe, by real SVC / Storwize experts. (End of self-congratulation.)

    The best reference for our tool is that it is really used to solve problems. I have found that most of the tools on the market are nice-looking sunshine products but are lame ducks when you want to solve problems (stormy weather products :). Some of our experiences:
    https://bvqwiki.sva.de/display/BVQ/BVQ+use+cases+and+experiences

    How we would approach your problem with BVQ

    We can monitor the load on each spindle and also on groups of spindles or MDisks. This will show us whether your load distribution is fair and what the latencies tell us about each single drive. We can also very easily see whether one of the spindles performs worse than the others.
    https://bvqwiki.sva.de/display/BVQ/3.4+BVQ+shows+result+of+Storwize+disk+drives

    We can also look into the SAS ports to figure out the load distribution on these ports and whether we find errors on them. I think this would be the easiest option, instead of monitoring single drives.

    So the answer to your question can be simple. You just have to use the right tool with the needed technical depth. You will not have such a tool even if you have VSC.

    Looking for alternatives!

    You will probably look for alternatives. Please always keep the following things in mind:

    –> Does the alternative have the technical depth to support you? Do you find terms like node core, node port, lower cache, upper cache, drive, MDisk, VDisk copy and VDisk in their product descriptions?

    –> Do you have the impression that the software you are looking at has been used for analysis in different cases? Is this somehow documented? There are many products on the market which I call sunshine products: nice to look at, but not at all helpful when it rains.

  17. I forgot to mention the next steps when you want to try out BVQ

    Contact us via the contact form
    http://tinyurl.com/CALL-BVQ
    ask for a demo or an offline scan analysis

    Contact me personally if you want
    michael.pirker@sva.de

    Get more information from the BVQ WIKI
    https://bvqwiki.sva.de/display/BVQ/English+main+page
    https://bvqwiki.sva.de/display/BVQ/BVQ+use+cases+and+experiences

  18. Pingback: Accessing the Instrumentation | Aussie Storage Blog

  19. parmjit bassi says:

    Hi Anthony

    I am new to the V7000 and currently have a problem with write response times to SAN disk. I have a real-time application running which performs write transactions to disk. The files are very small. At sporadic times we get 100% disk busy for a few seconds, and our application starts buffering the write transactions until the disk is available to continue writing again.
    I have the Storage Insights tool running, monitoring performance on the SAN, and we can see no problems with disks/hosts/ports when the problem occurs.
    The application runs on a Linux-based system, and we have confirmed all queue depth and cache settings against the IBM guidelines for the V7000.

    Is there anything else that you know of that we could check or investigate to find out why this problem is occurring?

    Thanks

    Parmjit
