
On Fri, Apr 7, 2017 at 2:40 AM Bill James <bill.james@j2.com> wrote:
We are trying to convert our QA environment from local NFS to gluster. When I move a disk of a VM that is running on the same server as the storage, it fails. When I move a disk of a VM running on a different system, it works.
VM running on the same system as the disk:
2017-04-06 13:31:00,588 ERROR (jsonrpc/6) [virt.vm] (vmId='e598485a-dc74-43f7-8447-e00ac44dae21') Unable to start replication for vda to {u'domainID': u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volumeInfo': {'domainID': u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volType': 'path', 'leaseOffset': 0, 'path': u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'volumeID': u'30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'leasePath': u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5.lease', 'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753'}, 'diskType': 'file', 'format': 'cow', 'cache': 'none', u'volumeID': u'30fd46c9-c738-4b13-aeca-3dc9ffc677f5', u'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753', u'poolID': u'8b6303b3-79c6-4633-ae21-71b15ed00675', u'device': 'disk', 'path': u'/rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'propagateErrors': u'off', 'volumeChain': [{'domainID': u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volType': 'path', 'leaseOffset': 0, 'path': u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/6756eb05-6803-42a7-a3a2-10233bf2ca8d', 'volumeID': u'6756eb05-6803-42a7-a3a2-10233bf2ca8d', 'leasePath': u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/6756eb05-6803-42a7-a3a2-10233bf2ca8d.lease', 'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753'}, {'domainID': u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volType': 'path', 'leaseOffset': 0, 'path': u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'volumeID': u'30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'leasePath': u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5.lease', 'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753'}]} (vm:3594)
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 3588, in diskReplicateStart
    self._startDriveReplication(drive)
  File "/usr/share/vdsm/virt/vm.py", line 3713, in _startDriveReplication
    self._dom.blockCopy(drive.name, destxml, flags=flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 941, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 684, in blockCopy
    if ret == -1: raise libvirtError ('virDomainBlockCopy() failed', dom=self)
libvirtError: internal error: unable to execute QEMU command 'drive-mirror': Could not open '/rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5': Permission denied
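For reference, the 'drive-mirror' failure above comes from libvirt's live block copy (the blockCopy() call in the traceback). As an illustration only, this is roughly the equivalent virsh command, with a placeholder domain name and a truncated destination path:

    # Live-copy disk vda to a new destination while the VM keeps running.
    # QEMU itself opens the destination file, which is where the
    # 'Permission denied' above is raised.
    virsh blockcopy test-vm vda \
        /rhev/data-center/.../30fd46c9-c738-4b13-aeca-3dc9ffc677f5 \
        --shallow --reuse-external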
[root@ovirt1 test vdsm]# ls -l /rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5
-rw-rw---- 2 vdsm kvm 197120 Apr 6 13:29 /rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5
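Ownership and mode (vdsm:kvm, rw-rw----) look correct, so the EACCES is presumably not a plain file-permission problem; SELinux is a likely candidate. One way to check on the host running the VM (diagnostic commands assumed here, not output from this thread):

    # Check the SELinux mode and look for recent AVC denials against qemu.
    getenforce
    ausearch -m avc -ts recent | grep -i qemu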
Then if I try to rerun it, it says the following, even though the move failed:
2017-04-06 13:49:27,197 INFO (jsonrpc/1) [dispatcher] Run and protect: getAllTasksStatuses, Return response: {'allTasksStatus': {'078d962c-e682-40f9-a177-2a8b479a7d8b': {'code': 212, 'message': 'Volume already exists', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': '078d962c-e682-40f9-a177-2a8b479a7d8b'}}} (logUtils:52)
So now I have to clean up the disks that failed to move so that I can migrate the VM and then move the disk again. Or so it seems. The disks from the failed moves do exist in the new location, even though the operation "failed".
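To confirm the leftover destination volume exists (the path is taken from the log above; running this check is my suggestion, not something from the thread):

    # List the image directory on the gluster mount to see the stale volume.
    ls -lh /rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/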
vdsm.log attached.
ovirt-engine-tools-4.1.0.4-1.el7.centos.noarch
vdsm-4.19.4-1.el7.centos.x86_64
Hi Bill,

Does it work after setting SELinux to permissive? (setenforce 0)

Can you share the output of:

    ps -efZ | grep vm-name    (filter for the specific VM)
    ls -lhZ /rhev/data-center/mnt
    ls -lhZ /rhev/data-center/mnt/gluster-server:_path/sd_id/images/img_id/vol_id

(assuming the volume was not deleted after the operation)

If the volume is not deleted after the failed move disk operation, this is likely a bug; please file a bug for it.

The actual failure may be a gluster configuration issue, or an SELinux-related bug.

Nir
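If this turns out to be a gluster configuration issue rather than SELinux, the usual suspects are the volume ownership options and the virt option group. A sketch, assuming the volume is named gv2 (inferred from the _gv2 mount path above):

    # Commonly recommended settings for oVirt VM storage on gluster;
    # 36:36 is vdsm:kvm on oVirt hosts.
    gluster volume set gv2 group virt
    gluster volume set gv2 storage.owner-uid 36
    gluster volume set gv2 storage.owner-gid 36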