
Last weekend, I presented a paper on storage workload characterization and consolidation at VPACT '09. In the paper, we characterized several well-known enterprise workloads, including MS Exchange, OLTP, DVDStore, and decision support, in terms of parameters such as seek distance, IO size, read/write ratio, and number of outstanding IOs.
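To make the parameters above concrete, here is a minimal sketch of how they could be computed from a block-level trace. The trace format (start block, length in blocks, read flag) and the function name `characterize` are assumptions for illustration only; they are not the tooling used in the paper.

```python
from collections import Counter


def characterize(trace):
    """Summarize a trace given as (start_block, num_blocks, is_read) tuples
    in arrival order. The tuple layout is a hypothetical format."""
    seek_distances = []
    sizes = Counter()
    reads = 0
    prev_end = None
    for start, length, is_read in trace:
        if prev_end is not None:
            # Distance from the end of the previous IO to the start of this one;
            # 0 means the request is perfectly sequential.
            seek_distances.append(start - prev_end)
        prev_end = start + length
        sizes[length] += 1
        if is_read:
            reads += 1
    sequential = sum(1 for d in seek_distances if d == 0)
    return {
        "read_fraction": reads / len(trace),
        "size_histogram": dict(sizes),
        "seek_distances": seek_distances,
        "sequential_fraction": (
            sequential / len(seek_distances) if seek_distances else 0.0
        ),
    }


# Tiny made-up trace: two sequential reads, then two random IOs.
trace = [(0, 8, True), (8, 8, True), (1000, 16, False), (40, 8, True)]
stats = characterize(trace)
```

A workload whose seek distances cluster at zero is sequential; a wide spread of large positive and negative distances indicates random access.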

One of the main conclusions was that most workloads exhibit a random access pattern. Furthermore, the sequential cases, such as decision support, backup, or virus scanning, seem to be mainly alternative workloads running on the same data sets that the actual applications commonly access in a random manner. Detailed tracing also revealed that most workloads were quite bursty in terms of arrivals.

Based on these two observations, we tested the effect of consolidation on two different workloads: they were first run on separate RAID groups and then on a single RAID group consisting of the physical disks from both. The results were quite encouraging, as the workloads saw a big reduction in 90th percentile latency. This is mainly because a higher degree of burstiness leads to larger gains from statistical multiplexing, and combining the underlying physical disks helps a great deal in absorbing the peaks during bursty intervals.
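The multiplexing argument can be illustrated with a toy queueing model. This is a sketch under strong assumptions, not the experiment from the paper: time is split into ticks, each RAID group serves a fixed number of IOs per tick, and the two made-up workloads burst at different times. All names and numbers here are hypothetical.

```python
import math


def latencies(arrivals, capacity):
    """Per-request delay (in ticks) for a FIFO queue serving `capacity` IOs per tick.

    Every request arriving in a tick is charged the time needed to drain the
    existing backlog plus its whole batch, a pessimistic batch-level estimate.
    """
    backlog = 0
    delays = []
    for count in arrivals:
        delay = (backlog + count) / capacity
        delays.extend([delay] * count)
        backlog = max(0, backlog + count - capacity)
    return delays


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    index = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[index]


# Two bursty workloads whose bursts do not overlap (assumed pattern).
workload_a = [8, 0, 0, 0] * 25
workload_b = [0, 0, 8, 0] * 25
per_group_capacity = 4  # IOs per tick for one RAID group (assumed)

# Isolated: each workload queues on its own RAID group.
isolated_p90 = percentile(
    latencies(workload_a, per_group_capacity)
    + latencies(workload_b, per_group_capacity),
    90,
)

# Consolidated: both workloads share one group with the disks of both.
combined = [a + b for a, b in zip(workload_a, workload_b)]
consolidated_p90 = percentile(latencies(combined, 2 * per_group_capacity), 90)
```

In this toy setup the 90th percentile delay drops from 2 ticks to 1 tick, because each burst gets the capacity of both groups while the other workload is idle. If the bursts coincided exactly, the model would show no gain, so the benefit depends on how uncorrelated the bursts are.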

This suggests that administrators should rethink their rule of thumb of placing workloads on different spindles for better performance isolation. Separate spindles do lead to better predictability in most cases, but they can also cause severe over-provisioning. Isolation based on software mechanisms, at both the OS and the array level, would be much more desirable and cost-effective.

There is also the issue of a semantic information gap: array vendors may not get workload-specific information in the request stream that they see. Hence the overall solution may involve cooperation between both endpoints of the SAN or Ethernet, along with some sort of interface between OS and array vendors for stronger isolation.

Here is something to chew on: What sort of support or interface should array vendors expose to meet the various performance isolation requirements of applications in terms of throughput, latency, or perhaps burstiness? Are weights (or shares) good enough, or do we need other parameters such as latency specifications?
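As a starting point for that discussion, here is a minimal sketch of what a shares-only interface would compute: array throughput divided in proportion to per-workload weights, with unused capacity redistributed. The function `allocate_by_shares`, the workload names, and the numbers are all hypothetical; no real array exposes this exact API.

```python
def allocate_by_shares(total_iops, shares, demand):
    """Divide total_iops in proportion to shares, never giving a workload more
    than its demand and handing leftover capacity to the remaining workloads."""
    alloc = {name: 0.0 for name in shares}
    active = {name for name in shares if demand[name] > 0}
    remaining = float(total_iops)
    while active and remaining > 1e-9:
        total_share = sum(shares[name] for name in active)
        satisfied = set()
        granted = 0.0
        for name in active:
            offer = remaining * shares[name] / total_share
            give = min(offer, demand[name] - alloc[name])
            alloc[name] += give
            granted += give
            if alloc[name] >= demand[name] - 1e-9:
                satisfied.add(name)
        remaining -= granted
        if not satisfied:
            break  # everyone took a full proportional slice; nothing left over
        active -= satisfied
    return alloc


# Hypothetical weights and demands (IOPS).
shares = {"oltp": 2, "backup": 1, "exchange": 1}
demand = {"oltp": 800, "backup": 100, "exchange": 400}
alloc = allocate_by_shares(1000, shares, demand)
```

Here the backup workload needs less than its slice, so its unused capacity is split 2:1 between the other two. Note what the interface does not express: the allocation is purely about throughput, and nothing in it states a latency target or how large a burst a workload may send, which is exactly the gap the question above is probing.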


Ajay Gulati
