[ovirt-users] Recommended setup for a FC based storage domain
combuster
combuster at archlinux.us
Tue Jun 10 05:19:38 UTC 2014
Hm, another update on this one. If I create another VM with another
virtual disk on a node that already has a VM running from the FC
storage, then libvirt doesn't break. I guess it just happens the
first time on any given node. If that's the case, I would have to
migrate all of the VMs off the other two nodes in this four-node cluster
and start a VM from the FC storage on each of them, just to make sure it
doesn't break during working hours. I guess it would be fine after that.
It seems to me that this is some sort of timeout issue that happens
when I start a VM on the FC storage domain for the first time. It could have
something to do with the FC card driver settings, or libvirt won't wait for
ovirt-engine to present the new LV to the targeted node. I don't see why
ovirt-engine waits for the first-time launch of the VM to present the LV
at all; shouldn't it be doing this at the time of virtual disk
creation, in case I have selected to run the VM on a specific node?
On 06/09/2014 01:49 PM, combuster wrote:
> Bad news happens only when running a VM for the first time, if it helps...
>
> On 06/09/2014 01:30 PM, combuster wrote:
>> OK, I have good news and bad news :)
>>
>> Good news is that I can run different VMs on different nodes when
>> all of their drives are on the FC storage domain. I don't think that all
>> of the I/O is running through the SPM, but I need to test that. Simply put,
>> for every virtual disk that you create on the shared FC storage
>> domain, oVirt will present that vdisk only to the node which is
>> running the VM itself. All nodes can see the domain infrastructure
>> (inbox, outbox, metadata), but the LV for the virtual disk itself
>> is visible only to the node that is running that particular
>> VM. There is no limitation (except for the free space on the storage).
>>
>> Bad news!
>>
>> I can create the virtual disk on the FC storage for a VM, but when I
>> start the VM itself, the node which hosts it goes
>> non-operational, then quickly comes back up again (the iLO fencing agent
>> checks whether the node is OK and brings it back up). During that time, the VM
>> starts on another node (the Default Host parameter was ignored because the
>> assigned host was not available). I can manually migrate it later to the
>> intended node; that works. Luckily for me, on two of the four nodes in
>> the cluster there were no VMs running (I tried this on both, with
>> two different VMs created from scratch, and I got the same result).
>>
>> I've suppressed all logging below WARNING because it was killing the
>> performance of the cluster. From vdsm.log:
>>
>> [code]
>> Thread-305::WARNING::2014-06-09
>> 12:15:53,236::persistentDict::256::Storage.PersistentDict::(refresh)
>> data has no embedded checksum - trust it as it is
>> 55809e40-ccf3-4f7c-aeec-802bc1c326a7::WARNING::2014-06-09
>> 12:17:25,013::utils::129::root::(rmFile) File:
>> /rhev/data-center/a0500f5c-e8d9-42f1-8f04-15b23514c8ed/55338570-e537-412b-97a9-635eea1ecb10/images/90659ad8-bd90-4a0a-bb4e-7c6afe90e925/242a1bce-a434-4246-ad24-b62f99c03a05
>> already removed
>> 55809e40-ccf3-4f7c-aeec-802bc1c326a7::WARNING::2014-06-09
>> 12:17:25,074::blockSD::761::Storage.StorageDomain::(_getOccupiedMetadataSlots)
>> Could not find mapping for lv
>> 55338570-e537-412b-97a9-635eea1ecb10/242a1bce-a434-4246-ad24-b62f99c03a05
>> Thread-305::WARNING::2014-06-09
>> 12:20:54,341::persistentDict::256::Storage.PersistentDict::(refresh)
>> data has no embedded checksum - trust it as it is
>> Thread-305::WARNING::2014-06-09
>> 12:25:55,378::persistentDict::256::Storage.PersistentDict::(refresh)
>> data has no embedded checksum - trust it as it is
>> Thread-305::WARNING::2014-06-09
>> 12:30:56,424::persistentDict::256::Storage.PersistentDict::(refresh)
>> data has no embedded checksum - trust it as it is
>> Thread-1857::WARNING::2014-06-09
>> 12:32:45,639::libvirtconnection::116::root::(wrapper) connection to
>> libvirt broken. ecode: 1 edom: 7
>> Thread-1857::CRITICAL::2014-06-09
>> 12:32:45,640::libvirtconnection::118::root::(wrapper) taking calling
>> process down.
>> Thread-17704::WARNING::2014-06-09
>> 12:32:48,009::libvirtconnection::116::root::(wrapper) connection to
>> libvirt broken. ecode: 1 edom: 7
>> Thread-17704::CRITICAL::2014-06-09
>> 12:32:48,013::libvirtconnection::118::root::(wrapper) taking calling
>> process down.
>> Thread-17704::ERROR::2014-06-09
>> 12:32:48,018::vm::2285::vm.Vm::(_startUnderlyingVm)
>> vmId=`2bee9d79-b8d1-4a5a-a4f7-8092d1c803d9`::The vm start process failed
>> Traceback (most recent call last):
>> File "/usr/share/vdsm/vm.py", line 2245, in _startUnderlyingVm
>> self._run()
>> File "/usr/share/vdsm/vm.py", line 3185, in _run
>> self._connection.createXML(domxml, flags),
>> File
>> "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line
>> 110, in wrapper
>> __connections.get(id(target)).pingLibvirt()
>> File "/usr/lib64/python2.6/site-packages/libvirt.py", line 3389, in
>> getLibVersion
>> if ret == -1: raise libvirtError ('virConnectGetLibVersion()
>> failed', conn=self)
>> libvirtError: internal error client socket is closed
>> Thread-1857::WARNING::2014-06-09
>> 12:32:50,673::vm::1963::vm.Vm::(_set_lastStatus)
>> vmId=`2bee9d79-b8d1-4a5a-a4f7-8092d1c803d9`::trying to set state to
>> Powering down when already Down
>> Thread-1857::WARNING::2014-06-09
>> 12:32:50,815::utils::129::root::(rmFile) File:
>> /var/lib/libvirt/qemu/channels/2bee9d79-b8d1-4a5a-a4f7-8092d1c803d9.com.redhat.rhevm.vdsm
>> already removed
>> Thread-1857::WARNING::2014-06-09
>> 12:32:50,816::utils::129::root::(rmFile) File:
>> /var/lib/libvirt/qemu/channels/2bee9d79-b8d1-4a5a-a4f7-8092d1c803d9.org.qemu.guest_agent.0
>> already removed
>> MainThread::WARNING::2014-06-09
>> 12:33:03,770::fileUtils::167::Storage.fileUtils::(createdir) Dir
>> /rhev/data-center/mnt already exists
>> MainThread::WARNING::2014-06-09
>> 12:33:05,738::clientIF::181::vds::(_prepareBindings) Unable to load
>> the json rpc server module. Please make sure it is installed.
>> storageRefresh::WARNING::2014-06-09
>> 12:33:06,133::fileUtils::167::Storage.fileUtils::(createdir) Dir
>> /rhev/data-center/hsm-tasks already exists
>> Thread-35::ERROR::2014-06-09
>> 12:33:08,375::sdc::137::Storage.StorageDomainCache::(_findDomain)
>> looking for unfetched domain 55338570-e537-412b-97a9-635eea1ecb10
>> Thread-35::ERROR::2014-06-09
>> 12:33:08,413::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain)
>> looking for domain 55338570-e537-412b-97a9-635eea1ecb10
>> Thread-13::WARNING::2014-06-09
>> 12:33:08,417::fileUtils::167::Storage.fileUtils::(createdir) Dir
>> /rhev/data-center/a0500f5c-e8d9-42f1-8f04-15b23514c8ed already exists
>> [/code]
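The broken-connection warnings above (ecode 1, edom 7) mean vdsm lost its socket to libvirtd. If this turns out to be a timeout rather than a libvirtd crash, the server-side keepalive tolerance can be loosened; a sketch for /etc/libvirt/libvirtd.conf (the values are illustrative assumptions, not recommendations, and libvirtd must be restarted afterwards):

```
# /etc/libvirt/libvirtd.conf -- illustrative values only
keepalive_interval = 5    # seconds between keepalive probes to clients
keepalive_count = 10      # unanswered probes before the connection is dropped
```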
>>
>> libvirt breaks, and I guess that it would bring down all of the VMs if
>> there were any on that node.
>>
>> Anybody have an idea why this happens?
>>
>> TIA,
>>
>> Ivan
>>
>> On 06/02/2014 04:38 PM, combuster at archlinux.us wrote:
>>> One word of caution so far: when exporting any VM, the node that acts as SPM
>>> is stressed out to the max. I relieved the stress by a good margin by
>>> lowering the libvirtd and vdsm log levels to WARNING. That shortened the
>>> export procedure by at least five times. But the vdsm process on the SPM node
>>> still shows high CPU usage, so it's best to leave the SPM node with a
>>> decent amount of CPU time to spare. Also, exporting a VM with high vdisk capacity
>>> and thin provisioning enabled (let's say 14GB used of 100GB defined) took
>>> around 50 min over a 10Gb Ethernet interface to a 1Gb export NAS device that
>>> was not stressed at all by other processes. When I did that export with
>>> debug log levels it took 5 hrs :(
>>>
>>> So lowering the log levels is a must in a production environment. I've deleted
>>> the LUN that I exported on the storage (removed it first from oVirt), and for
>>> next weekend I am planning to add a new one, export it again on all the nodes,
>>> and start a few fresh VM installations. Things I'm going to look for are
>>> partition alignment and running VMs from different nodes in the cluster at
>>> the same time. I just hope that not all I/O is going to pass through the SPM;
>>> this is the one thing that bothers me the most.
>>>
>>> I'll report back with these results next week, but if anyone has experience
>>> with this kind of thing, or can point me to some documentation, that would be great.
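For reference, lowering the vdsm log level is done in its Python logging config. A sketch of the relevant stanza, assuming the stock EL6 path /etc/vdsm/logger.conf and the stock section layout (keep whatever handlers your file already lists, and restart vdsmd afterwards):

```
# /etc/vdsm/logger.conf -- raise the root logger from DEBUG to WARNING
[logger_root]
level=WARNING
handlers=syslog,logfile
```

libvirtd's equivalent is the log_level setting in /etc/libvirt/libvirtd.conf (3 = warnings and errors only).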
>>>
>>> On Monday, 2. June 2014. 18.51.52 you wrote:
>>>> I'm curious to hear what other comments arise, as we're analyzing a
>>>> production setup shortly.
>>>>
>>>> On Sun, Jun 1, 2014 at 10:11 PM,<combuster at archlinux.us> wrote:
>>>>> I need to scratch Gluster off because the setup is based on CentOS 6.5, so
>>>>> essential prerequisites like qemu 1.3 and libvirt 1.0.1 are not met.
>>>> Gluster would still work with EL6; afaik it just won't use libgfapi and
>>>> will instead use a standard mount.
>>>>
>>>>> Any info regarding FC storage domain would be appreciated though.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Ivan
>>>>>
>>>>>> On Sunday, 1. June 2014. 11.44.33, combuster at archlinux.us wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a 4-node cluster setup, and my storage options right now are an
>>>>>> FC-based storage, one partition per node on a local drive (~200GB each),
>>>>>> and an NFS-based NAS device. I want to set up the export and ISO domains
>>>>>> on the NAS, and there are no issues or questions regarding those two. I
>>>>>> wasn't aware of any other options at the time for utilizing local storage
>>>>>> (since this is a shared-based datacenter), so I exported a directory from
>>>>>> each partition via NFS, and it works. But I am a little in the dark on the
>>>>>> following:
>>>>>>
>>>>>> 1. Are there any advantages to switching from NFS-based local storage to
>>>>>> a Gluster-based domain with a brick on each partition? I guess it can
>>>>>> only be performance-wise, but maybe I'm wrong. If there are advantages,
>>>>>> are there any tips regarding xfs mount options etc.?
>>>>>>
>>>>>> 2. I've created a volume on the FC-based storage and exported it to all
>>>>>> of the nodes in the cluster on the storage itself. I've configured
>>>>>> multipathing correctly and added an alias for the WWID of the LUN so I
>>>>>> can distinguish this one (and any future volumes) more easily. At first I
>>>>>> created a partition on it, but since oVirt saw only the whole LUN as a raw
>>>>>> device, I erased it before adding the LUN as the FC master storage domain.
>>>>>> I've imported a few VMs and pointed them to the FC storage domain. This
>>>>>> setup works, but:
>>>>>>
>>>>>> - All of the nodes see a device with the alias for the WWID of the
>>>>>> volume, but only the node which is currently the SPM for the cluster can
>>>>>> see the logical volumes inside. Also, when I set up high availability for
>>>>>> VMs residing on the FC storage and select to start on any node in the
>>>>>> cluster, they always start on the SPM. Can multiple nodes run different
>>>>>> VMs on the same FC storage at the same time? (The logical answer would be
>>>>>> that they can, but I wanted to be sure first.) I am not familiar with the
>>>>>> logic oVirt uses to lock a VM's logical volume to prevent corruption.
>>>>>>
>>>>>> - fdisk shows that the logical volumes on the LUN of the FC volume are
>>>>>> misaligned (the partition doesn't end on a cylinder boundary), so I wonder
>>>>>> whether this is because I imported VMs with disks that were created on
>>>>>> local storage before, and whether any _new_ VMs with disks on the FC
>>>>>> storage would be properly aligned.
>>>>>>
>>>>>> This is a new setup with oVirt 3.4 (I did an export of all the VMs on 3.3
>>>>>> and, after a fresh installation of 3.4, imported them back again). I
>>>>>> have room to experiment a little with 2 of the 4 nodes because currently
>>>>>> they are free of any running VMs, but I have limited room for
>>>>>> anything else that would cause unplanned downtime for the four virtual
>>>>>> machines running on the other two nodes in the cluster (currently highly
>>>>>> available, with their drives on the FC storage domain). All in all, I
>>>>>> have 12 VMs running, and I'm asking on the list for advice and guidance
>>>>>> before I make any changes.
>>>>>>
>>>>>> Just trying to find as much info regarding all of this as possible before
>>>>>> acting on it.
>>>>>>
>>>>>> Thank you in advance,
>>>>>>
>>>>>> Ivan
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>
>