The error shows that the host was not connected to the pool. Do we check that the host is connected to the pool before trying to upload to it?
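
A minimal sketch of such a pre-check, in Python. Everything in it is hypothetical and for illustration only: `Host`, `connected_pools`, and `pick_upload_host` are made-up names, not engine or vdsm APIs. The idea is simply to filter candidate hosts by pool connectivity before sending prepareImage, rather than finding out from the StoragePoolUnknown error afterwards.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Host:
    """Hypothetical stand-in for a host record; not an oVirt class."""
    name: str
    connected_pools: Set[str] = field(default_factory=set)


def pick_upload_host(hosts: Iterable[Host], pool_id: str) -> Optional[Host]:
    """Return the first host connected to pool_id, or None if there is none."""
    for host in hosts:
        if pool_id in host.connected_pools:
            return host
    return None


# Pool id taken from the logs in this thread.
POOL_ID = "dedc6e8b-30e3-42d1-86f4-a130110f31b1"

# Made-up hosts: one with no pool connection, one connected to POOL_ID.
hosts = [
    Host("host-a"),
    Host("host-b", {POOL_ID}),
]

chosen = pick_upload_host(hosts, POOL_ID)
print(chosen.name if chosen else "no host connected to pool")
```

If no host passes the check, the upload could be rejected up front with a clear message instead of ending up paused by the system.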


On Thu, Jan 25, 2018, 10:33, Daniel Erez <derez@redhat.com> wrote:
On Wed, Jan 24, 2018 at 11:28 PM Dmitry Semenov <zend0@ya.ru> wrote:
24.01.2018, 10:58, "Yaniv Kaul" <ykaul@redhat.com>:
> On Tue, Jan 23, 2018 at 10:39 PM, Dmitry Semenov <zend0@ya.ru> wrote:
>> When uploading a disk image (via the web interface) in cluster01 to storage01 or storage02, everything works.
>> When uploading a disk image (via the web interface) in cluster02 to storage03 or storage04, the upload fails: the image isn't uploaded and the process stops at the stage "paused by System" (whereas uploading directly through the API works without problems).
>>
>> screenshot: https://yadi.sk/i/9WtkDlT23Riqxp
>>
>> Logs are attached (engine.log): https://pastebin.com/54k5j7hC
>
> Can you also share the vdsm logs, at least from 01c04x09.unix.local? It seems to have failed there.
> Y.

Here is a link to the vdsm.log file from 01c04x09.unix.local: https://pastebin.com/KiqYSYnP

According to the log [1], there was an error on the prepareImage invocation.
It seems the storage pool wasn't found in the vdsm cache, though according to the engine log [2]
the engine sends the correct pool id ('dedc6e8b-30e3-42d1-86f4-a130110f31b1').

@Nir - what do you think? How could the pool be missing from the host? Is it a cache/connection issue?

[1]
2018-01-23 23:16:44,005+0300 INFO  (jsonrpc/5) [vdsm.api] START prepareImage(sdUUID=u'629ab576-638a-4c6f-b9c4-8d5cda64f9b2', spUUID=u'dedc6e8b-30e3-42d1-86f4-a130110f31b1', imgUUID=u'636c67ca-baec-4b32-be36-c3fb0ba24e83', leafUUID=u'f03fc310-7a09-4e23-a196-7dde70a1f616', allowIllegal=True) from=::ffff:10.65.35.10,51672, flow_id=0c170481-1518-426a-90c9-9ad7be27b926, task_id=345c9223-4425-49c3-981a-5f6a7d2dbee0 (api:46)
2018-01-23 23:16:44,006+0300 INFO  (jsonrpc/5) [vdsm.api] FINISH prepareImage error=Unknown pool id, pool not connected: (u'dedc6e8b-30e3-42d1-86f4-a130110f31b1',) from=::ffff:10.65.35.10,51672, flow_id=0c170481-1518-426a-90c9-9ad7be27b926, task_id=345c9223-4425-49c3-981a-5f6a7d2dbee0 (api:50)
2018-01-23 23:16:44,006+0300 ERROR (jsonrpc/5) [storage.TaskManager.Task] (Task='345c9223-4425-49c3-981a-5f6a7d2dbee0') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in prepareImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3151, in prepareImage
    self.getPool(spUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 349, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: (u'dedc6e8b-30e3-42d1-86f4-a130110f31b1',)
2018-01-23 23:16:44,006+0300 INFO  (jsonrpc/5) [storage.TaskManager.Task] (Task='345c9223-4425-49c3-981a-5f6a7d2dbee0') aborting: Task is aborted: "Unknown pool id, pool not connected: (u'dedc6e8b-30e3-42d1-86f4-a130110f31b1',)" - code 309 (task:1181)
2018-01-23 23:16:44,007+0300 ERROR (jsonrpc/5) [storage.Dispatcher] FINISH prepareImage error=Unknown pool id, pool not connected: (u'dedc6e8b-30e3-42d1-86f4-a130110f31b1',) (dispatcher:82)
2018-01-23 23:16:44,007+0300 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Image.prepare failed (error 309) in 0.00 seconds (__init__:573)
[2]
2018-01-23 23:16:38,408+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateImageVDSCommand] (default task-38) [0c170481-1518-426a-90c9-9ad7be27b926] START, CreateImageVDSCommand( CreateImageVDSCommandParameters:{storagePoolId='dedc6e8b-30e3-42d1-86f4-a130110f31b1', ignoreFailoverLimit='false', storageDomainId='629ab576-638a-4c6f-b9c4-8d5cda64f9b2', imageGroupId='636c67ca-baec-4b32-be36-c3fb0ba24e83', imageSizeInBytes='23622320128', volumeFormat='COW', newImageId='f03fc310-7a09-4e23-a196-7dde70a1f616', imageType='Sparse', newImageDescription='{"DiskAlias":"aaaaaaaaaaaaaaaaaaaa","DiskDescription":""}', imageInitialSizeInBytes='1068367872'}), log id: 7666c2e4 
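
The cross-check between [1] and [2] can be sketched in Python as follows. This is only an illustration: the regular expressions are written against the two excerpts above and may not match other vdsm or engine versions, the sample lines are shortened copies of those excerpts, and the function names are made up for this example.

```python
import re

# Matches the vdsm "pool not connected" error and captures the pool id.
VDSM_ERROR_RE = re.compile(r"pool not connected: \(u'([0-9a-f-]+)',\)")
# Matches the pool id the engine puts into its command parameters.
ENGINE_POOL_RE = re.compile(r"storagePoolId='([0-9a-f-]+)'")


def collect_ids(pattern, lines):
    """Return the set of ids captured by pattern in the given log lines."""
    found = set()
    for line in lines:
        match = pattern.search(line)
        if match:
            found.add(match.group(1))
    return found


# Shortened copies of the log lines quoted in [1] and [2].
vdsm_lines = [
    "INFO (jsonrpc/5) [vdsm.api] FINISH prepareImage error=Unknown pool id, "
    "pool not connected: (u'dedc6e8b-30e3-42d1-86f4-a130110f31b1',)",
]
engine_lines = [
    "START, CreateImageVDSCommand( CreateImageVDSCommandParameters:"
    "{storagePoolId='dedc6e8b-30e3-42d1-86f4-a130110f31b1', "
    "ignoreFailoverLimit='false'})",
]

missing_on_host = collect_ids(VDSM_ERROR_RE, vdsm_lines)
sent_by_engine = collect_ids(ENGINE_POOL_RE, engine_lines)

# Ids that vdsm rejected but the engine never sent would point to a wrong id;
# an empty result means the engine sent the same id the host reports as unknown.
unexpected = missing_on_host - sent_by_engine
print("ids rejected by vdsm but not sent by engine:", sorted(unexpected))
```

With the excerpts above the difference is empty, which is consistent with the reading that the id itself is correct and the host simply is not connected to that pool.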




Here is what I noticed: the server 01c04x09.unix.local is in cluster01, but I am uploading the image file to storage02 (cluster02).

>
>> image size: ~1.3 GB
>>
>> my scheme:
>>
>> data_center_01
>>   cluster01
>>     host01  \
>>     host02  - storage01, storage02
>>     host03  /
>>
>>   cluster02
>>     host04  \
>>     host05  - storage03, storage04
>>     host06  /
>>
>> HostedEngine in cluster01
>> oVirt: Version 4.2.0.2-1.el7.centos
>>
>> --
>> Best regards,
>> _______________________________________________
>> Devel mailing list
>> Devel@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel

-- 
Best regards,