Thanks for the responses everyone, really appreciate it.
I've condensed the other questions into this reply.
Steve,
What is the CPU load of the GlusterFS host when comparing the raw brick
test to the gluster mount point test? Give it 30 seconds and see what top
reports. You’ll probably have to significantly increase the count on the
test so that it runs that long.
- Nick
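A minimal sketch of how the top readings requested above could be captured automatically while the test runs, rather than read off the screen. It assumes procps top (batch mode via -b) is installed; the function name sample_top_during and the log path are made up for illustration.

```shell
# sample_top_during INTERVAL LOGFILE COMMAND [ARGS...]
# Runs COMMAND in the background and appends a filtered top snapshot
# to LOGFILE every INTERVAL seconds until COMMAND exits.
# (Hypothetical helper name; assumes procps top with batch mode.)
sample_top_during() {
    local interval="$1"; shift
    local log="$1"; shift
    "$@" &
    local pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        # Keep only the gluster processes and dd from each snapshot.
        top -b -n 1 | grep -E 'glusterfs|[[:space:]]dd[[:space:]]*$' >> "$log" || true
        sleep "$interval"
    done
    # Return the exit status of the workload itself.
    wait "$pid"
}

# Usage (not run here):
#   sample_top_during 5 /tmp/top.log \
#       dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
```

Sampling every few seconds shows how the load shifts over the run, which is the pattern the "right away" and "20+ seconds" readings below capture by hand.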
Gluster mount point:
*4K* on GLUSTER host
[root@gluster1 rep2]# dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 100.076 s, 20.5 MB/s
Top reported this right away:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1826 root 20 0 294m 33m 2540 S 27.2 0.4 0:04.31 glusterfs
2126 root 20 0 1391m 31m 2336 S 22.6 0.4 11:25.48 glusterfsd
Then at about 20+ seconds top reports this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1826 root 20 0 294m 35m 2660 R 141.7 0.5 1:14.94 glusterfs
2126 root 20 0 1392m 31m 2344 S 33.7 0.4 11:46.56 glusterfsd
*4K* Directly on the brick:
dd if=/dev/zero of=test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 4.99367 s, 410 MB/s
7750 root 20 0 102m 648 544 R 50.3 0.0 0:01.52 dd
7719 root 20 0 0 0 0 D 1.0 0.0 0:01.50 flush-253:2
Same test, gluster mount point on OVIRT host:
dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 42.4518 s, 48.2 MB/s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2126 root 20 0 1396m 31m 2360 S 40.5 0.4 13:28.89 glusterfsd
Same test, on OVIRT host but against NFS mount point:
dd if=/dev/zero of=/mnt/rep2-nfs/test1 bs=4k count=500000
500000+0 records in
500000+0 records out
2048000000 bytes (2.0 GB) copied, 18.8911 s, 108 MB/s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2141 root 20 0 550m 184m 2840 R 84.6 2.3 16:43.10 glusterfs
2126 root 20 0 1407m 30m 2368 S 49.8 0.4 13:49.07 glusterfsd
Interesting: with an NFS mount point, the CPU hit lands on two processes
(glusterfs and glusterfsd) instead of just the daemon. I also get much
better performance when dd (fuse) is not running on the GLUSTER host itself.
The storage servers are a bit older, but are both dual socket quad core
opterons with 4x 7200rpm drives.
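To put the four 4k results side by side, here is a small sketch that recomputes the throughput from the byte counts and elapsed times dd printed, and the ratio between two results. The helper names mbps and ratio are made up for illustration; MB here means 10^6 bytes, which matches the figures dd reports above.

```shell
# mbps BYTES SECONDS: throughput in MB/s (10^6 bytes per MB).
mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# ratio A B: how many times larger A is than B.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Usage (not run here):
#   mbps 2048000000 100.076   # fuse mount on the GLUSTER host
#   mbps 2048000000 42.4518   # fuse mount on the OVIRT host
#   ratio 410 20.5            # brick vs. fuse mount on the GLUSTER host
```

Note that the brick figure is a buffered write (the flush-253:2 kernel thread in top is the page cache being written back), so the ratio against it likely overstates the gap.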
A block size of 4k is quite small, so the context-switch overhead
involved with fuse would be more noticeable.
Would it be possible to increase the block size for dd and test?
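A sketch of what such a block-size sweep might look like, keeping the total amount written constant so the runs are comparable. The function names and the list of block sizes are assumptions for illustration; conv=fdatasync is a GNU dd option that makes dd flush the data before it reports a rate.

```shell
# count_for TOTAL_BYTES BS_BYTES: number of dd records needed
# to write TOTAL_BYTES at block size BS_BYTES.
count_for() {
    echo $(( $1 / $2 ))
}

# sweep_bs TARGET_FILE TOTAL_BYTES: run dd at several block sizes
# and print only dd's summary line for each.
sweep_bs() {
    local target="$1" total="$2" bs
    for bs in 4096 65536 131072 1048576; do
        echo "bs=${bs}"
        # conv=fdatasync (GNU dd) flushes to storage before the rate
        # is reported, so the page cache does not inflate the result.
        dd if=/dev/zero of="$target" bs="$bs" \
           count="$(count_for "$total" "$bs")" conv=fdatasync 2>&1 | tail -n 1
        rm -f "$target"
    done
}

# Usage (not run here):
#   sweep_bs /mnt/rep2/test1 2048000000
```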
I'm in the process of setting up a share from my desktop and I'll see if
I can bench between the two systems. Not sure if my ssd will impact the
tests, I've heard there isn't an advantage using ssd storage for glusterfs.
Do you have any pointers to this source of information? Typically glusterfs
performance for virtualization work loads is bound by the slowest element
in the entire stack. Usually storage/disks happen to be the bottleneck and
ssd storage does benefit glusterfs.
-Vijay
I had a couple of technical calls with RH (re: RHSS), and when I asked if
SSDs could add any benefit I was told no. The context may have been a
product comparison to other storage vendors, where they use SSDs for
read/write caching, versus having an all-SSD storage domain (which I'm not
proposing, but which is effectively what my desktop would provide).
Increasing bs against NFS mount point (gluster backend):
dd if=/dev/zero of=/mnt/rep2-nfs/test1 bs=128k count=16000
16000+0 records in
16000+0 records out
2097152000 bytes (2.1 GB) copied, 19.1089 s, 110 MB/s
GLUSTER host top reports:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2141 root 20 0 550m 183m 2844 R 88.9 2.3 17:30.82 glusterfs
2126 root 20 0 1414m 31m 2408 S 46.1 0.4 14:18.18 glusterfsd
So roughly the same performance as 4k writes remotely. I'm guessing if I
could randomize these writes we'd see a large difference.
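One way the writes could be randomized is with fio rather than dd. The sketch below only generates a job file; the function name, file paths, size, queue depth and runtime are placeholder assumptions, not values from these tests. If the mount does not support O_DIRECT, the direct=1 line would need to be removed.

```shell
# write_randwrite_job JOBFILE TARGET_DIR
# Writes a small fio job file for 4k random writes into TARGET_DIR.
# All values below are illustrative placeholders.
write_randwrite_job() {
    cat > "$1" <<EOF
[global]
directory=$2
size=1g
bs=4k
ioengine=libaio
direct=1
runtime=60
time_based

[randwrite]
rw=randwrite
iodepth=16
EOF
}

# Usage (not run here):
#   write_randwrite_job /tmp/randwrite.fio /mnt/rep2
#   fio /tmp/randwrite.fio
```

Running the same job file against the brick, the fuse mount and the NFS mount would give figures that are directly comparable with each other.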
Check this thread out,
http://raobharata.wordpress.com/2012/10/29/qemu-glusterfs-native-integrat...
it's quite dated but I remember seeing similar figures.
In fact, when I used FIO on a libgfapi mounted VM I got slightly faster
read/write speeds than on the physical box itself (I assume because of some
level of caching). On NFS it was close to half. You'll probably get
more interesting results using FIO as opposed to dd.
( -Andrew)
Sorry Andrew, I meant to reply to your other message. It looks like CentOS
6.5 can't use libgfapi right now; I stumbled across this info in a couple of
threads. Something about the CentOS build having different flags set for
RHEV snapshot support than RHEL, so native gluster storage domains are
disabled because snapshot support is assumed and would break otherwise. I'm
assuming this is still valid, as I cannot get a storage lock when I attempt
a gluster storage domain.
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I've set up an NFS storage domain on my desktop's SSD. I've re-installed
Windows 2008 R2 and initially it was running smoother.
Disk performance peaks at 100MB/s.
If I copy a 250MB file from a share into the Windows VM, it writes out
quickly, less than 5 seconds.
If I copy 20 files, ranging in size from 4k to 200MB and totaling 650MB,
from the share, Windows becomes unresponsive. In top, the desktop's
nfs daemon is barely being touched at all, and then eventually is not hit.
I can still interact with the VM's windows through the spice console.
Eventually the file transfer will start and rocket through the transfer.
I've opened a 271MB zip file with 4454 files and started the extract
process, but the progress window sits on 'calculating...'. After a
significant period of time the decompression starts and runs at
<200KB/second. Windows is guesstimating a 1HR completion time. Eventually
even this freezes up, and my spice console mouse won't grab. I can still
see the resource monitor in the Windows VM doing its thing but have to
power off the VM as it's no longer usable.
The Windows update process is the same. It seems like when the guest needs
quick large writes it's fine, but lots of io causes serious hanging,
unresponsiveness, spice mouse cursor freeze, and eventually poweroff/reboot
is the only way to get it back.
Also, during the Windows 2008 R2 install the 'expanding windows files' task
is quite slow, roughly 1% progress every 20 seconds (~30 mins to complete).
The GLUSTER host shows these stats pretty consistently:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8139 root 20 0 1380m 28m 2476 R 83.1 0.4 8:35.78 glusterfsd
8295 root 20 0 550m 186m 2980 S 4.3 2.4 1:52.56 glusterfs
bwm-ng v0.6 (probing every 2.000s), press 'h' for help
input: /proc/net/dev type: rate
\   iface              Rx              Tx           Total
==============================================================================
       lo:   3719.31 KB/s    3719.31 KB/s    7438.62 KB/s
     eth0:   3405.12 KB/s    3903.28 KB/s    7308.40 KB/s
I've copied the same zip file to an NFS mount point on the OVIRT host
(gluster backend) and get about 25 - 600 KB/s during unzip. The same test
on an NFS mount point (desktop SSD ext4 backend) averaged a network transfer
speed of 5MB/s and completed in about 40 seconds.
I have a RHEL 6.5 guest running on the NFS/gluster backend storage domain,
and just did the same test. Extracting the file took 22.3 seconds (faster
than the fuse mount point on the host !?!?).
GLUSTER host top reported this while the RHEL guest was decompressing the
zip file:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2141 root 20 0 555m 187m 2844 S 4.0 2.4 18:17.00 glusterfs
2122 root 20 0 1380m 31m 2396 S 2.3 0.4 83:19.40 glusterfsd
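The unzip comparisons above could be approximated without the zip file by writing many small files to each mount point and timing the run. This is only a sketch: the function name, target directory and per-file size are assumptions, since the uncompressed sizes of the 4454 files are not known.

```shell
# make_small_files DIR COUNT SIZE_BYTES
# Writes COUNT files of SIZE_BYTES each into DIR, then syncs.
# (Hypothetical helper; the file size is a placeholder.)
make_small_files() {
    local dir="$1" count="$2" size="$3" i=1
    mkdir -p "$dir"
    while [ "$i" -le "$count" ]; do
        # head -c copies exactly SIZE_BYTES of zeros into each file.
        head -c "$size" /dev/zero > "$dir/file_$i"
        i=$(( i + 1 ))
    done
    # Flush cached writes so the elapsed time includes them.
    sync
}

# Usage (not run here):
#   time make_small_files /mnt/rep2/smallfiles 4454 65536
#   time make_small_files /mnt/rep2-nfs/smallfiles 4454 65536
```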
*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)