On Fri, Apr 7, 2017 at 2:40 AM Bill James <bill.james@j2.com> wrote:
We are trying to convert our QA environment from local NFS to Gluster.
When I move a disk whose VM is running on the same server as the
storage, the move fails.
When I move a disk whose VM is running on a different system, it works.

VM running on the same system as the disk:

2017-04-06 13:31:00,588 ERROR (jsonrpc/6) [virt.vm]
(vmId='e598485a-dc74-43f7-8447-e00ac44dae21') Unable to start
replication for vda to {u'domainID':
u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volumeInfo': {'domainID':
u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volType': 'path',
'leaseOffset': 0, 'path':
u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5',
'volumeID': u'30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'leasePath':
u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5.lease',
'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753'}, 'diskType': 'file',
'format': 'cow', 'cache': 'none', u'volumeID':
u'30fd46c9-c738-4b13-aeca-3dc9ffc677f5', u'imageID':
u'7ae9b3f7-3507-4469-a080-d0944d0ab753', u'poolID':
u'8b6303b3-79c6-4633-ae21-71b15ed00675', u'device': 'disk', 'path':
u'/rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5',
'propagateErrors': u'off', 'volumeChain': [{'domainID':
u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volType': 'path',
'leaseOffset': 0, 'path':
u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/6756eb05-6803-42a7-a3a2-10233bf2ca8d',
'volumeID': u'6756eb05-6803-42a7-a3a2-10233bf2ca8d', 'leasePath':
u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/6756eb05-6803-42a7-a3a2-10233bf2ca8d.lease',
'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753'}, {'domainID':
u'6affd8c3-2c51-4cd1-8300-bfbbb14edbe9', 'volType': 'path',
'leaseOffset': 0, 'path':
u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5',
'volumeID': u'30fd46c9-c738-4b13-aeca-3dc9ffc677f5', 'leasePath':
u'/rhev/data-center/mnt/glusterSD/ovirt1-ks.test.j2noc.com:_gv2/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5.lease',
'imageID': u'7ae9b3f7-3507-4469-a080-d0944d0ab753'}]} (vm:3594)
Traceback (most recent call last):
   File "/usr/share/vdsm/virt/vm.py", line 3588, in diskReplicateStart
     self._startDriveReplication(drive)
   File "/usr/share/vdsm/virt/vm.py", line 3713, in _startDriveReplication
     self._dom.blockCopy(drive.name, destxml, flags=flags)
   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
     ret = attr(*args, **kwargs)
   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
     ret = f(*args, **kwargs)
   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 941, in wrapper
     return func(inst, *args, **kwargs)
   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 684, in blockCopy
     if ret == -1: raise libvirtError ('virDomainBlockCopy() failed', dom=self)
libvirtError: internal error: unable to execute QEMU command
'drive-mirror': Could not open
'/rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5':
Permission denied


[root@ovirt1 test vdsm]# ls -l
/rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5
-rw-rw---- 2 vdsm kvm 197120 Apr  6 13:29
/rhev/data-center/8b6303b3-79c6-4633-ae21-71b15ed00675/6affd8c3-2c51-4cd1-8300-bfbbb14edbe9/images/7ae9b3f7-3507-4469-a080-d0944d0ab753/30fd46c9-c738-4b13-aeca-3dc9ffc677f5



Then, if I try to rerun the move, it reports the following, even though the
first move failed:

2017-04-06 13:49:27,197 INFO  (jsonrpc/1) [dispatcher] Run and protect:
getAllTasksStatuses, Return response: {'allTasksStatus':
{'078d962c-e682-40f9-a177-2a8b479a7d8b': {'code': 212,
'message': 'Volume already exists', 'taskState': 'finished',
'taskResult': 'cleanSuccess', 'taskID':
'078d962c-e682-40f9-a177-2a8b479a7d8b'}}} (logUtils:52)


So now I have to clean up the disks that failed to move so that I can
migrate the VM and then move the disk again.
Or so it seems.
The disks from the failed move do exist in the new location, even though
the move "failed".
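
For anyone scanning vdsm.log for leftovers like this, here is a minimal
Python sketch that pulls the task statuses out of a getAllTasksStatuses line
such as the one quoted above. The function name failed_tasks is hypothetical,
the line is rejoined from the wrapped log text, and treating a code of 0 as
success is an assumption, not something the log confirms.

```python
import ast
import re

# The log line quoted above, rejoined into a single string.
LINE = (
    "2017-04-06 13:49:27,197 INFO  (jsonrpc/1) [dispatcher] Run and protect: "
    "getAllTasksStatuses, Return response: {'allTasksStatus': "
    "{'078d962c-e682-40f9-a177-2a8b479a7d8b': {'code': 212, "
    "'message': 'Volume already exists', 'taskState': 'finished', "
    "'taskResult': 'cleanSuccess', 'taskID': "
    "'078d962c-e682-40f9-a177-2a8b479a7d8b'}}} (logUtils:52)"
)


def failed_tasks(line):
    """Map task ID to message for every task with a non-zero code.

    Assumption: a code of 0 means the task succeeded.
    """
    match = re.search(r"Return response: (\{.*\}) \(", line)
    if match is None:
        return {}
    # The response is logged as a Python dict literal, so parse it safely.
    statuses = ast.literal_eval(match.group(1)).get("allTasksStatus", {})
    return {
        task_id: task.get("message", "")
        for task_id, task in statuses.items()
        if task.get("code", 0) != 0
    }


print(failed_tasks(LINE))
```

On the line above this reports task 078d962c-e682-40f9-a177-2a8b479a7d8b
with the message 'Volume already exists'.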

vdsm.log attached.

ovirt-engine-tools-4.1.0.4-1.el7.centos.noarch
vdsm-4.19.4-1.el7.centos.x86_64

Hi Bill,

Does it work after setting SELinux to permissive? (setenforce 0)
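
To confirm which mode the host is actually in before and after the test, a
small Python sketch can read the selinuxfs enforce flag. The function name
selinux_mode is hypothetical, and it assumes selinuxfs is mounted at
/sys/fs/selinux, as on current EL7 hosts; a missing file is treated as
"disabled".

```python
import os


def selinux_mode(enforce_path="/sys/fs/selinux/enforce"):
    """Return 'enforcing', 'permissive', or 'disabled'."""
    if not os.path.exists(enforce_path):
        # Assumption: no selinuxfs mount means SELinux is disabled.
        return "disabled"
    with open(enforce_path) as f:
        # The file holds "1" when enforcing and "0" when permissive.
        return "enforcing" if f.read().strip() == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

Running it once before setenforce 0 and once after should show the mode
change, which rules out the test having been run in the wrong mode.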

Can you share output of:

ps -efZ | grep vm-name
(filter for the specific VM)

ls -lhZ /rhev/data-center/mnt

ls -lhZ /rhev/data-center/mnt/gluster-server:_path/sd_id/images/img_id/vol_id
(assuming the volume was not deleted after the operation).
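
If it is easier to collect this from a script, the following Python 3 sketch
gathers roughly what ls -lhZ shows for a single path: owner, group, mode
string, and SELinux label. The function name describe_path is hypothetical,
and reading the label from the security.selinux extended attribute only
works on Linux; elsewhere the label is reported as None.

```python
import grp
import os
import pwd
import stat


def describe_path(path):
    """Return owner, group, mode string, and SELinux label for path."""
    st = os.stat(path)  # follows symlinks, like ls -L
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid with no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid with no group entry
    try:
        # SELinux keeps the file label in this extended attribute.
        raw = os.getxattr(path, "security.selinux")
        label = raw.rstrip(b"\0").decode()
    except (AttributeError, OSError):
        label = None  # no label, or xattrs unsupported on this platform
    return {
        "owner": owner,
        "group": group,
        "mode": stat.filemode(st.st_mode),
        "selinux": label,
    }
```

Calling describe_path() on the volume path from the error message, and again
on the same volume under /rhev/data-center/mnt, makes it easy to compare the
labels on the two paths side by side.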

If the volume is not deleted after the failed move disk operation, this is
likely a bug; please file a bug report for it.

The actual failure may be a Gluster configuration issue or an
SELinux-related bug.

Nir