Thanks for the responses everyone, really appreciate it.
I've condensed the other questions into this reply.
Steve,
What is the CPU load of the GlusterFS host when comparing the raw brick test to the gluster mount point test? Give it 30 seconds and see what top reports. You'll probably have to significantly increase the count on the test so that it runs that long.
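(For anyone repeating this, one way to capture top alongside the dd run is batch mode; the interval and iteration count below are arbitrary, not something I've tuned:)

dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000 &
top -b -d 5 -n 6 | grep -E 'gluster|dd'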
Gluster mount point:
*4K* on GLUSTER host
[root@gluster1 rep2]# dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 100.076 s, 20.5 MB/s
Top reported this right away:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1826 root 20 0 294m 33m 2540 S 27.2 0.4 0:04.31 glusterfs
2126 root 20 0 1391m 31m 2336 S 22.6 0.4 11:25.48 glusterfsd
Then at about 20+ seconds top reports this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1826 root 20 0 294m 35m 2660 R 141.7 0.5 1:14.94 glusterfs
2126 root 20 0 1392m 31m 2344 S 33.7 0.4 11:46.56 glusterfsd
*4K* Directly on the brick:
dd if=/dev/zero of=test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 4.99367 s, 410 MB/s
7750 root 20 0 102m 648 544 R 50.3 0.0 0:01.52 dd
7719 root 20 0 0 0 0 D 1.0 0.0 0:01.50 flush-253:2
Same test, gluster mount point on OVIRT host:
dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 42.4518 s, 48.2 MB/s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2126 root 20 0 1396m 31m 2360 S 40.5 0.4 13:28.89 glusterfsd
Same test, on OVIRT host but against NFS mount point:
dd if=/dev/zero of=/mnt/rep2-nfs/test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 18.8911 s, 108 MB/s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2141 root 20 0 550m 184m 2840 R 84.6 2.3 16:43.10 glusterfs
2126 root 20 0 1407m 30m 2368 S 49.8 0.4 13:49.07 glusterfsd
Interesting - it looks like if I use an NFS mount point, I incur a CPU hit on two processes instead of just the daemon. I also get much better performance if I'm not running dd against the fuse mount on the GLUSTER host itself.
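For reference, the two mounts on the OVIRT host look along these lines (assuming the volume is named rep2 and served from gluster1; the NFS options are a guess at typical gluster NFS v3 settings):

mount -t glusterfs gluster1:/rep2 /mnt/rep2
mount -t nfs -o vers=3,tcp gluster1:/rep2 /mnt/rep2-nfs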
The storage servers are a bit older, but both are dual-socket, quad-core Opterons with 4x 7200 RPM drives.
A block size of 4k is quite small, so the context-switch overhead involved with fuse would be more noticeable.
Would it be possible to increase the block size for dd and test?
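(The larger-block numbers against the NFS mount are further down; against the fuse mount the equivalent run would be something like the following, roughly the same 2 GB total as the 4k tests:)

dd if=/dev/zero of=/mnt/rep2/test1 bs=128k count=16000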
I'm in the process of setting up a share from my desktop, and I'll see if I can bench between the two systems. I'm not sure if my SSD will impact the tests; I've heard there isn't an advantage to using SSD storage for glusterfs.
Do you have any pointers to this source of information? Typically, glusterfs performance for virtualization workloads is bound by the slowest element in the entire stack. Usually the storage/disks happen to be the bottleneck, and SSD storage does benefit glusterfs.
-Vijay
I had a couple of technical calls with RH (re: RHSS), and when I asked if SSDs could add any benefit I was told no. The context may have been a product comparison with other storage vendors, where they use SSDs for read/write caching, versus having an all-SSD storage domain (which I'm not proposing, but which is effectively what my desktop would provide).
Increasing bs against NFS mount point (gluster backend):
dd if=/dev/zero of=/mnt/rep2-nfs/test1 bs=128k count=16000
16000+0 records in
16000+0 records out
2097152000 bytes (2.1 GB) copied, 19.1089 s, 110 MB/s
GLUSTER host top reports:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2141 root 20 0 550m 183m 2844 R 88.9 2.3 17:30.82 glusterfs
2126 root 20 0 1414m 31m 2408 S 46.1 0.4 14:18.18 glusterfsd
So, roughly the same performance as the 4k writes remotely. I'm guessing that if I could randomize these writes we'd see a large difference.
In fact, when I used FIO on a libgfapi-mounted VM I got slightly faster read/write speeds than on the physical box itself (I assume because of some level of caching). On NFS it was close to half. You'll probably get more interesting results using FIO as opposed to dd.
( -Andrew)
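For what it's worth, a random-write fio pass against the fuse mount would look something like the following; the job size, queue depth and runtime here are arbitrary starting points rather than anything tuned:

fio --name=randwrite --directory=/mnt/rep2 --rw=randwrite --bs=4k --size=1g --numjobs=4 --ioengine=libaio --iodepth=16 --runtime=60 --time_based --group_reporting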