On Tue, Oct 27, 2015 at 5:06 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:


I don't understand the meaning of the sentence above:

          Local storage datacenter name is an internal name
          and currently will not be shown in engine's admin UI.


It's just an internal label. I think we can simply remove that question and always use the default value; nothing will change.

Probably better.
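(Just to check on my side where that label actually ends up, I suppose something like this would show it, assuming the usual answer-file location; the key name and default below are quoted from memory and may differ slightly between versions:)

# look for the datacenter-name entry in the saved hosted-engine answer file
grep -i datacenter /etc/ovirt-hosted-engine/answers.conf
# expected to show something like:
#   OVEHOSTED_STORAGE/storageDatacenterName=str:hosted_datacenter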
 
 

How is the chosen "she_datacenter" name related to the "Default" datacenter where the hypervisor is placed? Do I have to create it manually (I don't see this she_datacenter in the webadmin portal)?

Also, I know there is an open bug.


But it seems I'm not able to import the storage domain...
In the events, when I import it, I see this sequence:

Storage Domain she_sdomain was added by admin@internal
VDSM ovc71.localdomain.local command failed: Cannot acquire host id: (u'9f1ec45d-0c32-4bfc-8b67-372d6f204fd1', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
Failed to attach Storage Domains to Data Center Default. (User: admin@internal)
Failed to attach Storage Domain she_sdomain to Data Center Default. (User: admin@internal)

What should the flow be to work around the bug? Do I actually have to attach it to the "Default" datacenter, or what? Is it expected to be fixed before 3.6?

Postponing to 3.6.1, as it has not been identified as a blocker.

But is this a regression from 3.5.x, or did this problem also exist in all the 3.5 versions where the self-hosted engine was in place?
 

You can try to add the first additional storage domain for other VMs.
The datacenter should come up, and at that point you can try importing the hosted-engine storage domain.
You cannot add other VMs to that storage domain, nor will you be able to once the auto-import works.
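(For reference, a rough, untested sketch of how one could check the datacenter status and attach the imported domain through the REST API instead of the webadmin UI; the engine hostname, password and DC UUID are placeholders:)

# check that the Default datacenter is up
curl -k -u 'admin@internal:PASSWORD' \
    https://engine.example.com/ovirt-engine/api/datacenters

# attach the imported hosted-engine storage domain to that datacenter
curl -k -u 'admin@internal:PASSWORD' -H 'Content-Type: application/xml' \
    -d '<storage_domain><name>she_sdomain</name></storage_domain>' \
    https://engine.example.com/ovirt-engine/api/datacenters/DC_UUID/storagedomains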
 


So I was indeed able to add a separate NFS data domain and attach it to the Default DC, which then came up as active.
Then I tried to import/attach the self-hosted engine domain as well; it went into the locked state, but then the self-hosted engine VM itself went down (no qemu process on the hypervisor).
In /var/log/libvirt/qemu/HostedEngine.log on the hypervisor I can see:

2015-10-28 13:59:02.233+0000: shutting down

Is this expected?

What should I do now to get the self-hosted engine VM up again and see what happened?
Any logs on the hypervisor to check?
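For what it's worth, these are the places I would guess at first, something like (just a sketch, assuming the standard hosted-engine HA tools and log paths are in place):

# overall hosted-engine view from the host, including maintenance state
hosted-engine --vm-status

# HA agent / broker logs on the hypervisor
tail -n 100 /var/log/ovirt-hosted-engine-ha/agent.log
tail -n 100 /var/log/ovirt-hosted-engine-ha/broker.log

# if nothing looks fatally wrong, I suppose I could try starting the engine VM again
hosted-engine --vm-start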

In /var/log/sanlock.log
2015-10-28 14:57:14+0100 854 [829]: s4 lockspace 3662a51f-39de-4533-97fe-d49bf98e2d43:1:/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN/3662a51f-39de-4533-97fe-d49bf98e2d43/dom_md/ids:0
2015-10-28 14:57:34+0100 874 [829]: s4:r3 resource 3662a51f-39de-4533-97fe-d49bf98e2d43:SDM:/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN/3662a51f-39de-4533-97fe-d49bf98e2d43/dom_md/leases:1048576 for 4,17,1698
2015-10-28 14:57:35+0100 875 [825]: s4 host 1 1 854 1bfba2b1-2353-4d4e-9000-f97585b54df1.ovc71.loca
2015-10-28 14:57:35+0100 875 [825]: s4 host 250 1 0 1bfba2b1-2353-4d4e-9000-f97585b54df1.ovc71.loca
2015-10-28 14:59:00+0100 960 [830]: s1:r4 resource 9f1ec45d-0c32-4bfc-8b67-372d6f204fd1:SDM:/rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN/9f1ec45d-0c32-4bfc-8b67-372d6f204fd1/dom_md/leases:1048576 for 4,17,1698
2015-10-28 14:59:02+0100 962 [825]: s1 kill 3341 sig 9 count 1
2015-10-28 14:59:02+0100 962 [825]: dead 3341 ci 2 count 1
2015-10-28 14:59:08+0100 968 [830]: s5 lockspace 9f1ec45d-0c32-4bfc-8b67-372d6f204fd1:1:/rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN/9f1ec45d-0c32-4bfc-8b67-372d6f204fd1/dom_md/ids:0
2015-10-28 14:59:30+0100 990 [825]: s5 host 1 4 968 1bfba2b1-2353-4d4e-9000-f97585b54df1.ovc71.loca
2015-10-28 14:59:30+0100 990 [825]: s5 host 250 1 0 aa89bb89-20a1-414b-8ee3-0430fdc330f8.ovc71.loca
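To see what sanlock itself currently holds, I suppose something like this would help (just a sketch; the path is the hosted-engine domain ids file from the log above):

# lockspaces, resources and watched pids currently known to sanlock
sanlock client status

# dump the on-disk lockspace of the hosted-engine storage domain
sanlock direct dump /rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN/9f1ec45d-0c32-4bfc-8b67-372d6f204fd1/dom_md/ids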



/var/log/vdsm/vdsm.log
Thread-1247::DEBUG::2015-10-28 14:59:00,043::task::993::Storage.TaskManager.Task::(_decref) Task=`56dd2372-f454-4188-8bf3-ab543d677c14`::ref 0 aborting False
Thread-1247::ERROR::2015-10-28 14:59:00,096::API::1847::vds::(_getHaInfo) failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1827, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
    dom_type=dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
    .format(sd_type, options, e))
RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '9f1ec45d-0c32-4bfc-8b67-372d6f204fd1'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>
Thread-1247::INFO::2015-10-28 14:59:00,112::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:42165 stopped
Thread-1248::DEBUG::2015-10-28 14:59:00,137::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StoragePool.connectStorageServer' in bridge with {u'connectionParams': [{u'id': u'189c29a5-6830-453c-aca3-7d82f2382dd8', u'connection': u'ovc71.localdomain.local:/SHE_DOMAIN', u'iqn': u'', u'user': u'', u'protocol_version': u'3', u'tpgt': u'1', u'password': '********', u'port': u''}], u'storagepoolID': u'00000000-0000-0000-0000-000000000000', u'domainType': 1}
Thread-1248::DEBUG::2015-10-28 14:59:00,138::task::595::Storage.TaskManager.Task::(_updateState) Task=`9ca908a0-45e2-41d5-802c-dc0bd2414a69`::moving from state init -> state preparing
Thread-1248::INFO::2015-10-28 14:59:00,139::logUtils::48::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'189c29a5-6830-453c-aca3-7d82f2382dd8', u'connection': u'ovc71.localdomain.local:/SHE_DOMAIN', u'iqn': u'', u'user': u'', u'protocol_version': u'3', u'tpgt': u'1', u'password': '********', u'port': u''}], options=None)
Thread-1248::DEBUG::2015-10-28 14:59:00,142::fileUtils::143::Storage.fileUtils::(createdir) Creating directory: /rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN mode: None
Thread-1248::DEBUG::2015-10-28 14:59:00,143::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/sudo -n /usr/bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 ovc71.localdomain.local:/SHE_DOMAIN /rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN (cwd None)
Thread-1248::DEBUG::2015-10-28 14:59:00,199::hsm::2405::Storage.HSM::(__prefetchDomains) nfs local path: /rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN
Thread-1248::DEBUG::2015-10-28 14:59:00,201::hsm::2429::Storage.HSM::(__prefetchDomains) Found SD uuids: (u'9f1ec45d-0c32-4bfc-8b67-372d6f204fd1',)
Thread-1248::DEBUG::2015-10-28 14:59:00,202::hsm::2489::Storage.HSM::(connectStorageServer) knownSDs: {9f1ec45d-0c32-4bfc-8b67-372d6f204fd1: storage.nfsSD.findDomain, 3662a51f-39de-4533-97fe-d49bf98e2d43: storage.nfsSD.findDomain}
Thread-1248::INFO::2015-10-28 14:59:00,202::logUtils::51::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': u'189c29a5-6830-453c-aca3-7d82f2382dd8'}]}
Thread-1248::DEBUG::2015-10-28 14:59:00,202::task::1191::Storage.TaskManager.Task::(prepare) Task=`9ca908a0-45e2-41d5-802c-dc0bd2414a69`::finished: {'statuslist': [{'status': 0, 'id': u'189c29a5-6830-453c-aca3-7d82f2382dd8'}]}
Thread-1248::DEBUG::2015-10-28 14:59:00,202::task::595::Storage.TaskManager.Task::(_updateState) Task=`9ca908a0-45e2-41d5-802c-dc0bd2414a69`::moving from state preparing -> state finished
Thread-1248::DEBUG::2015-10-28 14:59:00,203::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-1248::DEBUG::2015-10-28 14:59:00,203::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-1248::DEBUG::2015-10-28 14:59:00,203::task::993::Storage.TaskManager.Task::(_decref) Task=`9ca908a0-45e2-41d5-802c-dc0bd2414a69`::ref 0 aborting False
Thread-1248::DEBUG::2015-10-28 14:59:00,203::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return 'StoragePool.connectStorageServer' in bridge with [{'status': 0, 'id': u'189c29a5-6830-453c-aca3-7d82f2382dd8'}]
Thread-1249::DEBUG::2015-10-28 14:59:00,218::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'StorageDomain.getInfo' in bridge with {u'storagedomainID': u'9f1ec45d-0c32-4bfc-8b67-372d6f204fd1'}
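Given the RequestError about the FilesystemBackend above, I guess the next thing to look at would be the HA broker/agent pair themselves, roughly along these lines (assuming the standard ovirt-hosted-engine-ha service names and log locations):

# are the HA services alive on the hypervisor?
systemctl status ovirt-ha-broker ovirt-ha-agent

# broker side of the failed set_storage_domain call
tail -n 100 /var/log/ovirt-hosted-engine-ha/broker.log

# restarting them should be harmless and lets them re-scan the storage
systemctl restart ovirt-ha-broker ovirt-ha-agent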

Current filesystem layout on the hypervisor, still without any qemu process for the hosted engine:
[root@ovc71 log]# df -h
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/centos-root               27G  2.6G   24G  10% /
devtmpfs                             4.9G     0  4.9G   0% /dev
tmpfs                                4.9G  4.0K  4.9G   1% /dev/shm
tmpfs                                4.9G  8.6M  4.9G   1% /run
tmpfs                                4.9G     0  4.9G   0% /sys/fs/cgroup
/dev/mapper/OVIRT_DOMAIN-NFS_DOMAIN   20G   36M   20G   1% /NFS_DOMAIN
/dev/mapper/OVIRT_DOMAIN-SHE_DOMAIN   25G  2.9G   23G  12% /SHE_DOMAIN
/dev/mapper/OVIRT_DOMAIN-ISO_DOMAIN  5.0G   33M  5.0G   1% /ISO_DOMAIN
/dev/sda1                            497M  130M  368M  27% /boot
ovc71.localdomain.local:/NFS_DOMAIN   20G   35M   20G   1% /rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN
ovc71.localdomain.local:/SHE_DOMAIN   25G  2.9G   23G  12% /rhev/data-center/mnt/ovc71.localdomain.local:_SHE__DOMAIN