Hello, Nir
 
The log is attached.
 
14.08.2018, 01:30, "Nir Soffer" <nsoffer@redhat.com>:
On Mon, Aug 13, 2018 at 1:45 PM Aleksey Maksimov <aleksey.i.maksimov@yandex.ru> wrote:
We use oVirt 4.2.5.2-1.el7 (Hosted engine / 4 hosts in cluster / about twenty virtual machines)
Virtual machine disks are located on a data domain on FC SAN.
Snapshots are created normally for all virtual machines except one, for which we cannot create a snapshot.

When we try to create a snapshot in the oVirt web console, we see these errors:

Aug 13, 2018, 1:05:06 PM Failed to complete snapshot 'KOM-APP14_BACKUP01' creation for VM 'KOM-APP14'.
Aug 13, 2018, 1:05:01 PM VDSM KOM-VM14 command HSMGetAllTasksStatusesVDS failed: Could not acquire resource. Probably resource factory threw an exception.: ()
Aug 13, 2018, 1:05:00 PM Snapshot 'KOM-APP14_BACKUP01' creation for VM 'KOM-APP14' was initiated by petya@sub.holding.com@sub.holding.com-authz.

At the same time, vdsm.log on the host with the "SPM" role shows this:

...
2018-08-13 05:05:06,471-0500 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.getStats succeeded in 0.00 seconds (__init__:573)
2018-08-13 05:05:06,478-0500 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Image.deleteVolumes succeeded in 0.05 seconds (__init__:573)
2018-08-13 05:05:06,478-0500 INFO  (tasks/3) [storage.ThreadPool.WorkerThread] START task bb45ae7e-77e9-4fec-9ee2-8e1f0ad3d589 (cmd=<bound method Task.commit of <vdsm.storage.task.Task instance at 0x7f06b85a2128>>, args=None) (threadPool:208)
2018-08-13 05:05:07,009-0500 WARN  (tasks/3) [storage.ResourceManager] Resource factory failed to create resource '01_img_6db73566-0f7f-4438-a9ef-6815075f45ea.cdf1751b-64d3-42bc-b9ef-b0174c7ea068'. Canceling request. (resourceManager:543)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/resourceManager.py", line 539, in registerResource
    obj = namespaceObj.factory.createResource(name, lockType)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/resourceFactories.py", line 193, in createResource
    lockType)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/resourceFactories.py", line 122, in __getResourceCandidatesList
    imgUUID=resourceName)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 213, in getChain
    if srcVol.isLeaf():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1430, in isLeaf
    return self._manifest.isLeaf()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 139, in isLeaf
    return self.getVolType() == sc.type2name(sc.LEAF_VOL)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 135, in getVolType
    self.voltype = self.getMetaParam(sc.VOLTYPE)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 119, in getMetaParam
    meta = self.getMetadata()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 112, in getMetadata
    md = VolumeMetadata.from_lines(lines)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volumemetadata.py", line 103, in from_lines
    "Missing metadata key: %s: found: %s" % (e, md))
MetaDataKeyNotFoundError: Meta Data key not found error: ("Missing metadata key: 'DOMAIN': found: {}",)
 
Looks like you have a volume without metadata in the chain.
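
To illustrate why an empty metadata block produces exactly this message, here is a minimal sketch. It is not vdsm's actual code: parse_metadata, MissingMetadataKey and REQUIRED_KEYS are hypothetical names, and the KEY=VALUE line format is an assumption. Only the key name DOMAIN and the message text are taken from the traceback above.

```python
# Hypothetical sketch, not vdsm's implementation.
# Assumed subset of required keys; DOMAIN is the key named in the error above,
# VOLTYPE is assumed from the sc.VOLTYPE constant in the traceback.
REQUIRED_KEYS = ("DOMAIN", "VOLTYPE")


class MissingMetadataKey(Exception):
    """Raised when a required key is absent from the parsed metadata."""


def parse_metadata(lines):
    """Parse KEY=VALUE lines into a dict and verify the required keys."""
    md = {}
    for line in lines:
        line = line.strip()
        if "=" not in line:
            continue  # skip blank lines and terminators such as "EOF"
        key, value = line.split("=", 1)
        md[key.strip()] = value.strip()
    for key in REQUIRED_KEYS:
        if key not in md:
            raise MissingMetadataKey(
                "Missing metadata key: %r: found: %r" % (key, md))
    return md
```

With a cleared metadata area the parsed dict is empty, so the first required key is reported as missing and the "found" part is an empty dict, which matches the shape of the error in the log.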
 
This could happen in the past when deleting a volume failed after we
had already cleared the volume metadata. In current 4.2, this cannot
happen, since we clear the metadata only if deleting the volume succeeded.
 
Can you post the complete vdsm log with this error?
 
Once we find the volume without metadata, we can delete the LV
using lvremove. This will fix the issue.
 
Shani, do you remember the bug we had with this error?
This is probably the same issue.
 
Ala, I think we need to add a tool to check and repair such chains.
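
As a rough idea of what the check part of such a tool could do, here is a minimal sketch. It assumes the raw metadata lines of every volume in the image have already been read into a dict keyed by volume id; how they are read from storage is out of scope here. find_volumes_without_metadata is a hypothetical name, not an existing vdsm function.

```python
def find_volumes_without_metadata(metadata_by_volume):
    """Return the sorted ids of volumes whose metadata has no KEY=VALUE line.

    metadata_by_volume is a hypothetical mapping of
    volume id -> list of raw metadata lines.
    """
    return sorted(
        vol_id
        for vol_id, lines in metadata_by_volume.items()
        if not any("=" in line for line in lines)
    )
```

The volumes it reports would only be candidates for the lvremove cleanup described above; the sketch itself repairs nothing.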
 
Nir
 
 
-- 
Best regards,
Aleksey Maksimov

Email: Aleksey.I.Maksimov@Yandex.ru