[ovirt-users] Storage network clarification

combuster combuster at gmail.com
Tue Jan 19 17:43:34 UTC 2016


OK, setting up gluster on a dedicated network is easier this time 
around, mostly a point-and-click adventure (setting everything up from 
scratch). Rough snippets for the manual steps follow the list:

- 4 NICs, 2 bonds, one for ovirtmgmt and the other for gluster
- Tagged gluster network for gluster traffic
- configured IP addresses without gateways on the gluster-dedicated 
bonds on both nodes
- set allowed_replica_counts=1,2,3 in the [gluster] section of 
/etc/vdsm/vdsm.conf to allow replica 2
- added transport.socket.bind-address to /etc/glusterfs/glusterd.vol to 
force glusterd to listen only on the gluster-dedicated IP address
- modified /etc/hosts so that the nodes can resolve each other by their 
gluster-dedicated hostnames (optional)
- probed the peers by their gluster hostnames
- created the volume in the same fashion (I also tried creating another 
one from the oVirt webadmin and that works too)
- oVirt picked it up and I was able to create a gluster storage domain 
on this volume (+ optimized the volume for virt store)
- tcpdump and iftop show that replication is going through the 
gluster-dedicated interfaces
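
For reference, the manual bits look roughly like this on my setup (the 
IP addresses, hostnames, volume name and brick paths below are just 
examples):

  # /etc/vdsm/vdsm.conf (on both nodes)
  [gluster]
  allowed_replica_counts = 1,2,3

  # /etc/glusterfs/glusterd.vol (on both nodes, inside the
  # "volume management" block; use that node's gluster IP)
  option transport.socket.bind-address 10.10.10.1

  # /etc/hosts (on both nodes)
  10.10.10.1   node1-gluster
  10.10.10.2   node2-gluster

  # peer probe and volume creation, using the gluster hostnames
  gluster peer probe node2-gluster
  gluster volume create data replica 2 \
      node1-gluster:/bricks/data/brick node2-gluster:/bricks/data/brick

I haven't double-checked the exact option names against every 
gluster/vdsm version, so verify them in your own glusterd.vol and 
vdsm.conf before restarting the services.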

One problem so far: creating preallocated disk images fails. It broke 
after zeroing out some 37 GB of a 40 GB image, but it's an intermittent 
issue (sometimes it fails earlier), and I'm still poking around to find 
the culprit. Thin provisioning works. The bricks and the volume are 
fine, as are the gluster services. It looks bandwidth-related from what 
I can see (a large amount of network traffic during flushes, then 
rpc_clnt_ping_timer_expired followed by sanlock renewal errors), but 
I'll report it as soon as I can confirm it's not a 
hardware/configuration issue.
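
To take VDSM out of the picture, I'm reproducing the zeroing by hand on 
the gluster FUSE mount. This is only an approximation of what VDSM's 
ddWatchCopy does (the exact flags may differ), and the mount path, 
image size and bond name are placeholders:

  # write ~40 GB of zeros straight to the gluster storage domain mount,
  # roughly what the preallocation step does (compare the dd error below)
  dd if=/dev/zero \
     of=/rhev/data-center/mnt/glusterSD/<host>:<volume>/test.img \
     bs=1M count=40960 oflag=direct conv=fsync

  # watch the gluster-dedicated bond while it runs
  iftop -i bond1

If the same "Transport endpoint is not connected" shows up here as 
well, the problem is below VDSM. The rpc_clnt_ping_timer_expired 
messages also make the volume's network.ping-timeout option (42 seconds 
by default, I believe) worth a look, since the client drops the 
connection when the bricks stop answering pings under heavy writes.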

vdsm.log:

> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,782::utils::716::Storage.Misc.excCmd::(watchCmd) FAILED: 
> <err> = ["/usr/bin/dd: error writing 
> '/rhev/data-center/90758579-cae7-4fdf-97e5-e8415db68c54/9cbc0f15-119e-4fe7-94ef-8bc84e0c8254/images/283ddfaa-7fc2-4bea-9acc-c8ff601110de/e3d135b2-a7c0-43d4-b3a5-04991cce73ae': 
> Transport endpoint is not connected", "/usr/bin/dd: closing output 
> file 
> '/rhev/data-center/90758579-cae7-4fdf-97e5-e8415db68c54/9cbc0f15-119e-4fe7-94ef-8bc84e0c8254/images/283ddfaa-7fc2-4bea-9acc-c8ff601110de/e3d135b2-a7c0-43d4-b3a5-04991cce73ae': 
> Transport endpoint is not connected"]; <rc> = 1
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::ERROR::2016-01-19 
> 18:03:20,783::fileVolume::133::Storage.Volume::(_create) Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/fileVolume.py", line 129, in _create
>     vars.task.aborting, sizeBytes)
>   File "/usr/share/vdsm/storage/misc.py", line 350, in ddWatchCopy
>     raise se.MiscBlockWriteException(dst, offset, size)
> MiscBlockWriteException: Internal block device write failure: 
> u'name=/rhev/data-center/90758579-cae7-4fdf-97e5-e8415db68c54/9cbc0f15-119e-4fe7-94ef-8bc84e0c8254/images/283ddfaa-7fc2-4bea-9acc-c8ff601110de/e3d135b2-a7c0-43d4-b3a5-04991cce73ae, 
> offset=0, size=42949672960'
> jsonrpc.Executor/7::DEBUG::2016-01-19 
> 18:03:20,784::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) 
> Return 'GlusterTask.list' in bridge with {'tasks': {}}
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::ERROR::2016-01-19 
> 18:03:20,790::volume::515::Storage.Volume::(create) Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/volume.py", line 476, in create
>     initialSize=initialSize)
>   File "/usr/share/vdsm/storage/fileVolume.py", line 134, in _create
>     raise se.VolumesZeroingError(volPath)
> VolumesZeroingError: Cannot zero out volume: 
> (u'/rhev/data-center/90758579-cae7-4fdf-97e5-e8415db68c54/9cbc0f15-119e-4fe7-94ef-8bc84e0c8254/images/283ddfaa-7fc2-4bea-9acc-c8ff601110de/e3d135b2-a7c0-43d4-b3a5-04991cce73ae',)
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,795::resourceManager::616::Storage.ResourceManager::(releaseResource) 
> Trying to release resource 
> '9cbc0f15-119e-4fe7-94ef-8bc84e0c8254_imageNS.283ddfaa-7fc2-4bea-9acc-c8ff601110de'
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,796::resourceManager::635::Storage.ResourceManager::(releaseResource) 
> Released resource 
> '9cbc0f15-119e-4fe7-94ef-8bc84e0c8254_imageNS.283ddfaa-7fc2-4bea-9acc-c8ff601110de' 
> (0 active users)
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,796::resourceManager::641::Storage.ResourceManager::(releaseResource) 
> Resource 
> '9cbc0f15-119e-4fe7-94ef-8bc84e0c8254_imageNS.283ddfaa-7fc2-4bea-9acc-c8ff601110de' 
> is free, finding out if anyone is waiting for it.
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,796::resourceManager::649::Storage.ResourceManager::(releaseResource) 
> No one is waiting for resource 
> '9cbc0f15-119e-4fe7-94ef-8bc84e0c8254_imageNS.283ddfaa-7fc2-4bea-9acc-c8ff601110de', 
> Clearing records.
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::ERROR::2016-01-19 
> 18:03:20,797::task::866::Storage.TaskManager.Task::(_setError) 
> Task=`bf482d82-d8f9-442d-ba93-da5ec225c8c3`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/storage/task.py", line 332, in run
>     return self.cmd(*self.argslist, **self.argsdict)
>   File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
>     return method(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 1886, in createVolume
>     initialSize=initialSize)
>   File "/usr/share/vdsm/storage/sd.py", line 488, in createVolume
>     initialSize=initialSize)
>   File "/usr/share/vdsm/storage/volume.py", line 476, in create
>     initialSize=initialSize)
>   File "/usr/share/vdsm/storage/fileVolume.py", line 134, in _create
>     raise se.VolumesZeroingError(volPath)
> VolumesZeroingError: Cannot zero out volume: 
> (u'/rhev/data-center/90758579-cae7-4fdf-97e5-e8415db68c54/9cbc0f15-119e-4fe7-94ef-8bc84e0c8254/images/283ddfaa-7fc2-4bea-9acc-c8ff601110de/e3d135b2-a7c0-43d4-b3a5-04991cce73ae',)
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,798::task::885::Storage.TaskManager.Task::(_run) 
> Task=`bf482d82-d8f9-442d-ba93-da5ec225c8c3`::Task._run: 
> bf482d82-d8f9-442d-ba93-da5ec225c8c3 () {} failed - stopping task
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,798::task::1246::Storage.TaskManager.Task::(stop) 
> Task=`bf482d82-d8f9-442d-ba93-da5ec225c8c3`::stopping in state running 
> (force False)
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,798::task::993::Storage.TaskManager.Task::(_decref) 
> Task=`bf482d82-d8f9-442d-ba93-da5ec225c8c3`::ref 1 aborting True
> bf482d82-d8f9-442d-ba93-da5ec225c8c3::DEBUG::2016-01-19 
> 18:03:20,799::task::919::Storage.TaskManager.Task::(_runJobs) 
> Task=`bf482d82-d8f9-442d-ba93-da5ec225c8c3`::aborting: Task is 
> aborted: 'Cannot zero out volume' - code 374


On 01/18/2016 04:00 PM, combuster wrote:
> oVirt is still managing the cluster via the ovirtmgmt network. The same 
> rule applies when tagging networks as VM networks, live-migration 
> networks, etc. Gluster is no different, except that it took a couple 
> of manual steps for us to configure.
>
> On 01/18/2016 03:53 PM, Fil Di Noto wrote:
>> Thanks, I will try this. I am running ovirt-engine 3.6.1.3-1.el7.centos
>>
>> In the configuration described, is oVirt able to manage gluster? I am
>> confused because if oVirt knows the nodes by their ovirtmgmt network
>> IP/hostname, aren't all the VDSM commands going to fail?
>>
>>
>>
>> On Mon, Jan 18, 2016 at 6:39 AM, combuster <combuster at gmail.com> wrote:
>>> Hi Fil,
>>>
>>> this worked for me a couple of months back:
>>>
>>> http://lists.ovirt.org/pipermail/users/2015-November/036235.html
>>>
>>> I'll try to set this up again, and see if there are any issues. 
>>> Which oVirt
>>> release are you running?
>>>
>>> Ivan
>>>
>>> On 01/18/2016 02:56 PM, Fil Di Noto wrote:
>>>> I'm having trouble setting up a dedicated storage network.
>>>>
>>>> I have a separate VLAN designated for storage, and configured separate
>>>> IP addresses for each host that correspond to that subnet. I have
>>>> tested this subnet extensively and it is working as expected.
>>>>
>>>> Prior to adding the hosts, I configured a storage network and set
>>>> the cluster to use that network for storage rather than the
>>>> ovirtmgmt network. I was hoping that this would be recognized when
>>>> the hosts were added, but it was not. I had to reconfigure the
>>>> storage VLAN interface via oVirt "manage host networks" just to bring
>>>> the host networks into compliance. The IP is configured directly on
>>>> bond0.<vlanid>, not on a bridge interface, which I assume is
>>>> correct since it is not a "VM" network.
>>>>
>>>> In this setup I was not able to activate any of the hosts due to VDSM
>>>> gluster errors; I think it was because VDSM was trying to use the
>>>> hostname/IP of the ovirtmgmt network. I manually set up the peers
>>>> using "gluster peer probe" and was able to activate the hosts, but
>>>> they were not using the storage network (verified with tcpdump). I
>>>> also tried adding DNS records for the storage network interfaces
>>>> using different hostnames, but gluster still seemed to treat the
>>>> ovirtmgmt interface as the primary one.
>>>>
>>>> With the hosts active, I couldn't create/activate any volumes until I
>>>> changed the cluster network settings to use the ovirtmgmt network for
>>>> storage. I ended up abandoning the dedicated storage subnet for the
>>>> time being and I'm starting to wonder if running virtualization and
>>>> gluster on the same hosts is intended to work this way.
>>>>
>>>> Assuming that it should work, what is the correct way to configure it?
>>>> I can't find any docs that go into detail about storage networks. Is
>>>> reverse DNS a factor? If I had a better understanding of what oVirt is
>>>> expecting to see, that would be helpful.
>>>
>



