Storage QoS Is A Must-Have Feature For Enterprises And The Cloud
Later this year, many of the established storage players will finally be adding Storage QoS (Quality of Service) functionality to their systems. Though startups such as SolidFire and NexGen Storage (and some platforms such as IBM's XIV) have been touting this functionality for a few years now, most storage systems today lack Storage QoS. If your primary storage vendor does not have Storage QoS on its roadmap, now is the time to start demanding it.
Normally, when I bring up the topic of Storage QoS with All-Flash Array startups or other high-end array vendors, the typical response I get is "We don't need Storage QoS. Our system is so fast – there are IOPS for everyone!" While this statement may or may not be true (it isn't!), even if a system had a seemingly infinite amount of performance, this would only solve part of the problem with storage performance provisioning. Here are a few things to keep in mind as you evaluate Storage QoS:
Storage QoS allows performance provisioning at a granular level. This functionality should provide controls on transaction (IOPS) and throughput (GB per second) performance. Today these controls are typically set on a per-LUN basis, but ideally they will be enforced as policies at the VM level in the future (vendors such as Tintri are working towards this).
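To make the two controls concrete, here is a minimal sketch of a per-LUN limiter that caps both transactions and throughput within one-second accounting windows. The names `QosPolicy` and `LunLimiter`, the fixed one-second window, and the use of MB rather than GB are my own assumptions for illustration; this is not any vendor's API, and real arrays enforce limits in far more sophisticated ways.

```python
from dataclasses import dataclass


@dataclass
class QosPolicy:
    """Hypothetical per-LUN policy: a cap on transactions and on throughput."""
    max_iops: int         # I/O operations allowed per second
    max_mb_per_s: float   # megabytes allowed per second


class LunLimiter:
    """Admits or rejects I/O for one LUN using one-second accounting windows."""

    def __init__(self, policy: QosPolicy):
        self.policy = policy
        self.window = 0      # index of the current one-second window
        self.ops_used = 0    # operations admitted in this window
        self.mb_used = 0.0   # megabytes admitted in this window

    def admit(self, now_s: float, size_mb: float) -> bool:
        """Return True if an I/O of size_mb arriving at now_s fits both caps."""
        window = int(now_s)
        if window != self.window:
            # A new second has started: reset both counters.
            self.window = window
            self.ops_used = 0
            self.mb_used = 0.0
        if self.ops_used + 1 > self.policy.max_iops:
            return False     # transaction (IOPS) cap reached
        if self.mb_used + size_mb > self.policy.max_mb_per_s:
            return False     # throughput cap reached
        self.ops_used += 1
        self.mb_used += size_mb
        return True
```

A workload of many small I/Os tends to hit the IOPS cap first, while a few large sequential transfers hit the throughput cap first, which is why both controls are worth having.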
'Noisy neighbors' disrupt cloud and enterprise storage delivery. When an application or VM spikes its performance consumption and leaves fewer controller cache and flash resources available for other applications, customers feel the adverse effects of 'noisy neighbors.' This limitation is a major reason why it is so difficult for cloud storage providers to create consistent multi-tenant environments using traditional storage systems that lack Storage QoS.
Storage QoS's ability to limit performance is just as valuable as its performance guarantee. As more and more organizations travel down the road towards implementing global catalogs for automating storage provisioning, the need to control performance resources becomes even more critical.
Resource abuse from Bronze class customers should never be rewarded. After all, if you can't prevent a Bronze level customer from consuming performance at a Gold or Platinum level, the validity of the catalog collapses and you will be left with a chaotic performance free-for-all.
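As a rough illustration of how guarantees and limits work together in a catalog, the sketch below first honors each tenant's guaranteed floor and then hands out any spare capacity to higher tiers first, never exceeding a tier's ceiling. The tier numbers, the `CATALOG` and `PRIORITY` tables, and the `allocate` function are hypothetical values and names I made up for the example, not figures from any product.

```python
# Hypothetical catalog: tier -> (guaranteed floor, hard ceiling), in IOPS.
CATALOG = {
    "bronze": (500, 1_000),
    "gold": (5_000, 10_000),
    "platinum": (20_000, 50_000),
}

# Lower number = served first when spare capacity is handed out.
PRIORITY = {"platinum": 0, "gold": 1, "bronze": 2}


def allocate(capacity, demands):
    """Split array capacity (IOPS) across tenants.

    demands is a list of (tenant_name, tier, requested_iops) tuples.
    Returns a dict mapping tenant_name to granted IOPS.
    """
    grants = {}
    # Pass 1: honor guarantees, never granting more than was requested.
    for name, tier, requested in demands:
        floor, _ = CATALOG[tier]
        grant = min(requested, floor, capacity)
        grants[name] = grant
        capacity -= grant
    # Pass 2: lend spare capacity by tier priority, up to each ceiling.
    for name, tier, requested in sorted(demands, key=lambda d: PRIORITY[d[1]]):
        _, ceiling = CATALOG[tier]
        extra = min(min(requested, ceiling) - grants[name], capacity)
        grants[name] += extra
        capacity -= extra
    return grants


demands = [
    ("batch", "bronze", 40_000),    # a Bronze tenant asking for far too much
    ("erp", "gold", 8_000),
    ("trading", "platinum", 30_000),
]
print(allocate(30_000, demands))
print(allocate(100_000, demands))
```

Note that in the second call the Bronze tenant still stops at its ceiling even though the array has plenty of spare IOPS. That is the point of the limit: spare capacity does not turn a Bronze customer into a Gold one.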
Indiscriminately throwing performance at a problem isn't a good (or fiscally responsible) long-term answer. Though the latest and greatest in the All-Flash Array space can theoretically deliver hundreds of thousands or even millions of IOPS, you should leverage Storage QoS to make sure the right customers and applications reap the performance benefits of your new (and expensive) flash systems.
Some implementations of Storage QoS have a 'burst mode.' This allows arrays to temporarily lend out IOPS and throughput performance in an emergency (e.g. to prevent an app from crashing). While this feature should allow infrastructure and operations (I&O) teams to reduce confrontations with customers, it should not be used as a crutch. When customers exceed their performance allocations, I&O has a responsibility to inform them, since spare resources may not always be available in the future.
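One way to picture burst mode is as a credit bank: a volume that runs below its allocation accrues credits, and it can spend them later to exceed the allocation for a short time. The sketch below assumes that model. The `BurstBudget` class, its parameters, and the `over_seconds` counter are hypothetical, and actual arrays implement bursting in their own ways.

```python
class BurstBudget:
    """Hypothetical credit-based burst model for a single volume."""

    def __init__(self, base_iops, burst_iops, max_credits):
        self.base_iops = base_iops      # normal allocation
        self.burst_iops = burst_iops    # absolute ceiling while bursting
        self.max_credits = max_credits  # most credits that can be banked
        self.credits = 0
        self.over_seconds = 0           # seconds in which demand exceeded base

    def grant(self, requested):
        """Return the IOPS granted for one second of demand."""
        if requested <= self.base_iops:
            # Unused headroom is banked as credits, up to the cap.
            unused = self.base_iops - requested
            self.credits = min(self.max_credits, self.credits + unused)
            return requested
        # Demand is over the allocation: spend credits, but never go
        # past the burst ceiling or spend more than has been banked.
        wanted_extra = min(requested, self.burst_iops) - self.base_iops
        extra = min(wanted_extra, self.credits)
        self.credits -= extra
        self.over_seconds += 1
        return self.base_iops + extra


budget = BurstBudget(base_iops=1_000, burst_iops=2_000, max_credits=1_500)
for demand in (400, 400, 400, 3_000, 3_000, 3_000):
    print(demand, "->", budget.grant(demand))
print("seconds over allocation:", budget.over_seconds)
```

Once the credits run out, the volume falls back to its base allocation. A counter like `over_seconds` is the kind of signal I&O could use to tell a customer that they are routinely exceeding what they bought.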
Storage QoS is a key requirement in my definition of Software Defined Storage, and given the complexity of multi-tenant cloud storage environments and consolidated virtualization implementations, the importance of this technology will only grow in the future.