[Users] Creation of preallocated disk with Gluster replication

Karli Sjöberg Karli.Sjoberg at slu.se
Wed Jan 8 17:55:19 UTC 2014



Sent from my iPhone

> On 8 Jan 2014, at 18:47, "Darrell Budic" <darrell.budic at zenfire.com> wrote:
> 
> Grégoire-
> 
> I think this is expected behavior, or at least the high glusterfsd CPU use during disk creation is. I tried creating a 10 G disk on my test environment and observed similarly high CPU usage by glusterfsd. I ran the creation on the i5 system, which showed 95%-105% CPU for glusterfsd during creation, while the Core2 system ran at ~35-65% glusterfsd utilization. Minor disk wait was observed on both systems, < 10% peak and generally < 5%. I imagine my ZFS cached backends helped a lot here. It took about 3 minutes, roughly what I’d expect for the i5’s disk system. Network usage was about 45% of the 1G link. No errors or messages were logged to /var/log/messages.
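
As a rough sanity check on the figures above, here is a small sketch. It assumes the 10 G disk means 10 GiB, that the creation took exactly 180 seconds, and that every byte crossed the 1 Gbit/s link once on its way to the other replica; protocol overhead is ignored. None of these were measured.

```python
# Back-of-the-envelope check: does a 10 G preallocation in ~3 minutes
# match roughly 45% utilisation of a 1 Gbit/s link?
# Assumptions (not measured): 10 GiB image, 180 s, each byte sent once
# to the other replica, no protocol overhead.

def link_utilisation(size_bytes: int, seconds: float, link_bits_per_s: float) -> float:
    """Return the fraction of the link consumed by the transfer."""
    bits_per_s = size_bytes * 8 / seconds
    return bits_per_s / link_bits_per_s

DISK_BYTES = 10 * 1024**3      # assumed: 10 GiB
DURATION_S = 180               # assumed: "about 3 minutes"
LINK_BPS = 1_000_000_000       # 1 Gbit/s

utilisation = link_utilisation(DISK_BYTES, DURATION_S, LINK_BPS)
throughput_mb_s = DISK_BYTES / DURATION_S / 1e6
print(f"{throughput_mb_s:.1f} MB/s, {utilisation:.0%} of the link")
```

Under those assumptions the transfer averages a little under 60 MB/s, slightly less than half of the link, which is in line with the ~45% observed.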
> 
> Depending on what your test setup looks like, I’d check my network for packet loss or errors first. Then look at my storage setup and test pure throughput on the disks to see what you’ve got, maybe see what else is running. Did you use an NFS cluster or a PosixFS cluster for this?
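
One way to look for interface errors or drops on a Linux host is to read the counters in /proc/net/dev. The sketch below is only an illustration: the function names are made up, and a non-zero counter is a hint to dig further, not a diagnosis.

```python
from pathlib import Path

def parse_net_dev(text: str) -> dict:
    """Map interface name -> receive/transmit error and drop counters.

    Expects the /proc/net/dev layout: two header lines, then one line per
    interface with 8 receive fields followed by 8 transmit fields.
    """
    counters = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        counters[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return counters

def report(path: str = "/proc/net/dev") -> None:
    """Print the counters per interface and flag any non-zero value."""
    for iface, c in parse_net_dev(Path(path).read_text()).items():
        flag = "CHECK" if any(c.values()) else "ok"
        print(f"{iface:10s} {c} {flag}")

if __name__ == "__main__" and Path("/proc/net/dev").exists():
    report()
```

Running it before and after a disk creation and comparing the counters shows whether errors accumulate under load.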
> 
> My test setup, running a version of the nightly self-hosted setup w/ gluster distributed/replicated disks as shared storage, in an NFS cluster:
> 
> Core i5 3570K @ 3.4GHz, 16G RAM
> Boot disks: 2x 32G SATA SSDs in raid-1
> Storage system: 4x500G Seagate RE3s in a ZFS raid-10 w/ 1GB ZIL & ~22G L2ARC caching from boot drives
> 1 1G ethernet
> 2 VMs running
> 
> Core2 Duo E8500 @ 3.16GHz, 8G RAM
> Boot disks: 2x 32G SATA SSDs in raid-1
> Storage system: 2x1500G WD Green drives in a ZFS raid w/ 1GB ZIL & ~22G L2ARC cache from boot drives
> 1 1G ethernet
> 
> They are connected through a Netgear Prosafe+ workgroup style switch, not much going on between them.
> 
>  -Darrell

Just curious, are you running ZFS on Linux?

/K

> 
>> On Jan 8, 2014, at 7:49 AM, gregoire.leroy at retenodus.net wrote:
>> 
>> Hello,
>> 
>> Do you need more information about this issue? Do you think this problem is likely to show up in other cases? I mean, is this expected behaviour in my environment, or is it unexpected?
>> 
>> Is there a way to limit the bandwidth used when creating a preallocated disk, so that it doesn't impact production?
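
To illustrate what rate-limiting a preallocation means in general, here is a minimal sketch that writes zeros to a file while capping the average write rate. It does not reflect how oVirt or VDSM create disks, and the function name, path and rate are hypothetical.

```python
import os
import time

def preallocate_throttled(path: str, size_bytes: int, max_bytes_per_s: float,
                          chunk_bytes: int = 1024 * 1024) -> float:
    """Write size_bytes of zeros to path at no more than max_bytes_per_s.

    Returns the elapsed time in seconds.
    """
    zeros = bytes(chunk_bytes)
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(chunk_bytes, size_bytes - written)
            f.write(zeros[:n])
            written += n
            # Sleep until the average rate since the start is back at the cap.
            earliest = start + written / max_bytes_per_s
            delay = earliest - time.monotonic()
            if delay > 0:
                time.sleep(delay)
        f.flush()
        os.fsync(f.fileno())    # make sure the data actually reached storage
    return time.monotonic() - start
```

For example, `preallocate_throttled("/tmp/test.img", 10 * 1024**3, 30e6)` would take at least about six minutes for 10 GiB at a 30 MB/s cap, trading creation time for headroom on the link.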
>> 
>> Thank you,
>> Regards,
>> Grégoire
>> 
>> On 2014-01-02 17:42, Vijay Bellur wrote:
>>> Adding gluster-users.
>>> On 01/02/2014 08:50 PM, gregoire.leroy at retenodus.net wrote:
>>>> Hello,
>>>> I have a Gluster volume in distributed/replicated mode. I have 2 hosts.
>>>> When I try to create a VM with a preallocated disk, it uses 100% of the
>>>> available CPU and bandwidth (I have 1 Gigabit network card).
>>>> The result is I can't even create a preallocated disk because the engine
>>>> detects a network failure.
>>>> I get this kind of message in /var/log/messages:
>>>> "
>>>> Jan  2 14:13:54 localhost sanlock[3811]: 2014-01-02 14:13:54+0100 167737
>>>> [3811]: s4 kill 21114 sig 15 count 1
>>>> Jan  2 14:13:54 localhost wdmd[3800]: test failed rem 51 now 167737 ping
>>>> 167718 close 167728 renewal 167657 expire 167737 client 3811
>>>> sanlock_ef4978d6-5711-4e01-a0ec-7ffbd9cdbe5d:1
>>>> "
>>>> And this in the oVirt GUI:
>>>> "
>>>> 2014-janv.-02, 15:35 Operation Add-Disk failed to complete.
>>>> 2014-janv.-02, 15:35 Storage Pool Manager runs on Host HOST2 (Address:
>>>> X.X.X.X).
>>>> 2014-janv.-02, 15:35 Invalid status on Data Center GlusterSewan. Setting
>>>> Data Center status to Non Responsive (On host HOST2, Error: done).
>>>> 2014-janv.-02, 15:35 State was set to Up for host HOST2.
>>>> 2014-janv.-02, 15:33 Used Network resources of host HOST2 [98%] exceeded
>>>> defined threshold [95%].
>>>> 2014-janv.-02, 15:33 Add-Disk operation of test_Disk1 was initiated on
>>>> VM test by admin at internal.
>>>> "
>>>> I understand that the creation of a 10 GB disk image generates a lot of
>>>> traffic, but is there a way to limit it so that it doesn't have an
>>>> impact on production? Furthermore, why does it use so much CPU?
>>>> I can see on my monitoring graph a big peak of CPU usage
>>>> when I launched the operation (probably up to 100%).
>>> Do you happen to notice what is consuming CPU? Since the same cluster
>>> does both virtualization and storage, a GigE network might get
>>> saturated very quickly. Is it possible to separate out the management
>>> and data/gluster traffic in this setup?
>>> Regards,
>>> Vijay
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users


