I was really excited when I discovered that Citrix Provisioning Services 7.1 has an option to store cache in device RAM with overflow to local disk. I have previously seen cache to RAM deliver 10 times the performance of an SSD disk cache or a fast disk RAID. The drawback with cache to RAM is that when the cache fills up, your VMs instantly crash with a BSOD. You had to be really careful with sizing and monitoring the cache to use this feature, as described in my previous blog post about this topic.
Now, with overflow (or fallback) to disk, this issue should be solved while still making use of all that RAM for extreme performance at a low cost. Right? I rushed to my lab to test, and I made some interesting discoveries.
First, a quick description of my lab. I have a XenServer 6.2 host with a PVS server and a 2008 R2 PVS target device. The new feature requires Windows 7 / Server 2008 R2 or a newer OS.
The device VM has nothing installed beyond the OS and the PVS drivers. I used a small 2GB disk for the cache-to-disk and overflow tests, just to see what happens when the disk cache is full. I'm using Iometer to test performance. To establish a baseline I first tested IO using cache on a local SSD and on a standalone 7200RPM disk, then with the existing cache-to-RAM option without overflow.
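By the way, if you just want a rough sanity check of a disk's random-read behavior without firing up Iometer, a few lines of Python can generate a similar 4K random-read workload. This is only a sketch: the test file path is an assumption, and unlike Iometer it does not bypass the Windows file cache, so don't compare its numbers directly with the charts below.

```python
import os
import random
import time

# Rough 4K random-read workload, loosely mimicking an Iometer baseline run.
# Assumptions: C:\iotest.dat sits on the drive under test; Python 3 on Windows.
TEST_FILE = r"C:\iotest.dat"
FILE_SIZE = 1 * 1024**3      # 1GB test file
BLOCK = 4096                 # 4K request size
DURATION = 30                # seconds per run

def prepare_test_file(path: str) -> None:
    """Write real data so NTFS allocates clusters (a sparse file would skew reads)."""
    if os.path.exists(path) and os.path.getsize(path) >= FILE_SIZE:
        return
    chunk = os.urandom(1024 * 1024)
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE // len(chunk)):
            f.write(chunk)

def random_read_iops(path: str) -> float:
    """Issue 4K reads at random offsets for DURATION seconds and return ops/sec."""
    ops = 0
    deadline = time.time() + DURATION
    with open(path, "rb", buffering=0) as f:
        while time.time() < deadline:
            f.seek(random.randrange(0, FILE_SIZE - BLOCK, BLOCK))
            f.read(BLOCK)
            ops += 1
    return ops / DURATION

if __name__ == "__main__":
    prepare_test_file(TEST_FILE)
    print(f"~{random_read_iops(TEST_FILE):.0f} read IOPS (4K random)")
```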
No surprises here: the SSD was 10x faster than the 7200RPM disk, and cache to RAM was 10x faster than the SSD. I'm only showing read IOPS here; write IOPS is generally about 50% of the read IOPS. I will publish another blog post later testing the different storage options combined with different cache options and hypervisors; I got some interesting results there too.
Back to cache to RAM with overflow to disk. How does it work? The first thing I noticed was the difference in how the RAM cache is assigned to the VM. With the previous RAM cache option, the PVS RAM cache was hidden from the operating system, so when assigning 1GB of RAM as PVS RAM cache on a 4GB VM, the VM sees only 3GB of RAM.
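A quick way to check this from inside the guest is to ask Windows how much physical memory it sees. Here is a small Python sketch using the Win32 GlobalMemoryStatusEx call; with the old RAM cache option a 4GB VM with a 1GB PVS cache reports roughly 3GB, while with the new overflow option it reports the full 4GB.

```python
import ctypes

class MEMORYSTATUSEX(ctypes.Structure):
    """Structure filled by the Win32 GlobalMemoryStatusEx call."""
    _fields_ = [
        ("dwLength", ctypes.c_ulong),
        ("dwMemoryLoad", ctypes.c_ulong),
        ("ullTotalPhys", ctypes.c_ulonglong),
        ("ullAvailPhys", ctypes.c_ulonglong),
        ("ullTotalPageFile", ctypes.c_ulonglong),
        ("ullAvailPageFile", ctypes.c_ulonglong),
        ("ullTotalVirtual", ctypes.c_ulonglong),
        ("ullAvailVirtual", ctypes.c_ulonglong),
        ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
    ]

status = MEMORYSTATUSEX()
status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status))

# With the old RAM cache this prints ~3GB on a 4GB VM; with overflow it prints ~4GB.
print(f"RAM visible to the OS: {status.ullTotalPhys / 1024**3:.1f} GB")
```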
Using the PVS status tray app or the mcli PVS command-line tool, the percentage of RAM cache used is visible when using cache to RAM without overflow.
With overflow enabled, the VM still sees all 4GB of assigned RAM. So where is the cache? Checking the PVS status tray app showed that I had about 1.5GB of cache available, even though I had specified 1GB.
The mcli command I've previously used to monitor RAM cache doesn't show anything either. Looking at the overflow disk (the local disk assigned to the VM where the cache is usually stored), there is a vdiskdif.vhdx file instead of the .vdiskcache file you get when using cache to device hard disk.
So this is where the cache is stored: a VHDX differencing file, whose size equals the amount of cache in use. I assume the idea is to leverage the built-in Windows file system cache to keep this file in RAM. But how much performance do we gain from this, and what happens when vdiskdif.vhdx fills the overflow disk? Let's see. I copied a 2GB file to the PVS device's C: drive. After about 1.5GB the file copy hangs.
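An easy way to watch the overflow happen is to poll the size of vdiskdif.vhdx while the copy runs. A small sketch, assuming the write cache / overflow disk is mounted as D: (adjust the path to your setup):

```python
import os
import shutil
import time

# Assumption: the PVS write cache / overflow disk is mounted as D:
CACHE_FILE = r"D:\vdiskdif.vhdx"
POLL_SECONDS = 5

# Watch the differencing file grow while the 2GB test copy runs,
# and keep an eye on the free space left on the overflow disk.
while True:
    try:
        size_mb = os.path.getsize(CACHE_FILE) / 1024**2
        free_mb = shutil.disk_usage(os.path.dirname(CACHE_FILE)).free / 1024**2
        print(f"vdiskdif.vhdx: {size_mb:,.0f} MB | free on overflow disk: {free_mb:,.0f} MB")
    except OSError:
        print("vdiskdif.vhdx not found (no overflow yet?)")
    time.sleep(POLL_SECONDS)
```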
The VM stays up and shows a warning that the cache disk is full, but anything you try to do on the VM hangs. The only way out is to reboot. So it's still not much better than cache to RAM without overflow, although you can work around it by using a large overflow disk. So what about performance? I ran Iometer again, expecting to see the same result as cache to RAM. But…
The performance is about the same as running with cache to disk.
Here is a complete diagram of my test results:
So the conclusion from my lab results is that this new option brings no benefit over cache to disk or cache to RAM without overflow. UPDATE: I've also tested with Windows 2012 R2, with the same result. The only way to achieve better IOPS is by enabling Intermediate buffering on the image; more on this in a new blog post. If anyone has more information on this subject, please use the comment field or contact me via Twitter.
I will follow up this blog post with more testing: PVS cache to disk using RAIDed disks in a SAN vs. local SSDs, how Intermediate buffering gives near-RAM IOPS, and how Hyper-V vs. XenServer react to the different buffering options.
