[ovirt-users] Cannot retrieve answer file from 1st HE host when setting up 2nd host

Simone Tiraboschi stirabos at redhat.com
Tue Dec 22 14:26:32 UTC 2015


On Tue, Dec 22, 2015 at 3:06 PM, Will Dennis <wdennis at nec-labs.com> wrote:

> See attached for requested logs
>


Thanks, the issue is here:
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22
00:40:53.496109] C [MSGID: 106002]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume engine. Stopping local bricks.
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22
00:40:53.496410] C [MSGID: 106002]
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action]
0-management: Server quorum lost for volume vmdata. Stopping local bricks.

So at that point Gluster lost server quorum, the local bricks were stopped,
and the file system became read-only.
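
If you want to double check that on the host, a minimal sketch like the
following (just an illustration, assuming the gluster CLI is in the PATH
and using the volume names from your logs) shows whether the peers are
still connected and whether the local bricks are up:

# Minimal sketch: check peer connectivity and brick status for the
# volumes named in the logs above ('engine' and 'vmdata'). Assumes the
# gluster CLI is available in the PATH; run as root on the host.
import subprocess

for cmd in (['gluster', 'peer', 'status'],
            ['gluster', 'volume', 'status', 'engine'],
            ['gluster', 'volume', 'status', 'vmdata']):
    print('### ' + ' '.join(cmd))
    print(subprocess.check_output(cmd))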

On getStorageDomainsList, VDSM internally raises an exception because the
file system is read-only:

Thread-141::DEBUG::2015-12-21
11:29:59,666::fileSD::157::Storage.StorageDomainManifest::(__init__)
Reading domain in path
/rhev/data-center/mnt/glusterSD/localhost:_engine/e89b6e64-bd7d-4846-b970-9af32a3295ee
Thread-141::DEBUG::2015-12-21
11:29:59,666::__init__::320::IOProcessClient::(_run) Starting IOProcess...
Thread-141::DEBUG::2015-12-21
11:29:59,680::persistentDict::192::Storage.PersistentDict::(__init__)
Created a persistent dict with FileMetadataRW backend
Thread-141::ERROR::2015-12-21
11:29:59,686::hsm::2898::Storage.HSM::(getStorageDomainsList) Unexpected
error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2882, in getStorageDomainsList
    dom = sdCache.produce(sdUUID=sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 100, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 124, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
    return GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 198, in __init__
    validateFileSystemFeatures(manifest.sdUUID, manifest.mountpoint)
  File "/usr/share/vdsm/storage/fileSD.py", line 93, in
validateFileSystemFeatures
    oop.getProcessPool(sdUUID).directTouch(testFilePath)
  File "/usr/share/vdsm/storage/outOfProcess.py", line 350, in directTouch
    ioproc.touch(path, flags, mode)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 543,
in touch
    self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 427,
in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 30] Read-only file system
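
You can reproduce that probe by hand, outside of hosted-engine-setup; this
is only a simplified sketch of what validateFileSystemFeatures/directTouch
does (it skips O_DIRECT), with the path copied from the traceback above:

# Simplified sketch of the writability probe that fails above: try to
# create a test file under the GlusterFS mount (path taken from the
# traceback; VDSM additionally uses O_DIRECT, which is omitted here).
import errno
import os

DOMAIN_PATH = ('/rhev/data-center/mnt/glusterSD/localhost:_engine/'
               'e89b6e64-bd7d-4846-b970-9af32a3295ee')
probe = os.path.join(DOMAIN_PATH, '__writability_probe__')

try:
    fd = os.open(probe, os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
    os.unlink(probe)
    print('mount is writable')
except OSError as e:
    if e.errno == errno.EROFS:
        print('mount is read-only, same EROFS as in the traceback')
    else:
        raise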

But instead of reporting a failure to hosted-engine-setup, VDSM reported a
successful execution in which it simply did not find any storage domain
there (this one is a real bug; I'm going to open a bug report on it, may I
attach your logs there?):

Thread-141::INFO::2015-12-21
11:29:59,702::logUtils::51::dispatcher::(wrapper) Run and protect:
getStorageDomainsList, Return response: {'domlist': []}
Thread-141::DEBUG::2015-12-21
11:29:59,702::task::1191::Storage.TaskManager.Task::(prepare)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::finished: {'domlist': []}
Thread-141::DEBUG::2015-12-21
11:29:59,702::task::595::Storage.TaskManager.Task::(_updateState)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::moving from state preparing ->
state finished
Thread-141::DEBUG::2015-12-21
11:29:59,703::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-141::DEBUG::2015-12-21
11:29:59,703::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-141::DEBUG::2015-12-21
11:29:59,703::task::993::Storage.TaskManager.Task::(_decref)
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::ref 0 aborting False
Thread-141::INFO::2015-12-21
11:29:59,704::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request
handler for 127.0.0.1:39718 stopped

And so, because VDSM doesn't report any existing storage domain,
hosted-engine-setup assumes that you are deploying the first host, which is
the source of your original issue.
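
The decision hosted-engine-setup takes boils down to something like the
sketch below (hypothetical names, not the real code; it only illustrates
why an empty 'domlist' sends you down the first-host path):

# Hypothetical sketch of the choice hosted-engine-setup makes from the
# getStorageDomainsList answer shown above; names are illustrative only.
def choose_deployment_mode(domlist):
    if not domlist:
        # No storage domain visible: assume this is the first host and
        # create a brand new hosted-engine storage domain.
        return 'first host'
    # A domain is already there: fetch the answer file from it and
    # continue as an additional host.
    return 'additional host'

# With the read-only mount VDSM answered {'domlist': []}, hence:
print(choose_deployment_mode([]))  # -> 'first host'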



>
>
> *From:* Simone Tiraboschi [mailto:stirabos at redhat.com]
> *Sent:* Tuesday, December 22, 2015 8:56 AM
> *To:* Will Dennis
> *Cc:* Sahina Bose; Yedidyah Bar David
>
> *Subject:* Re: [ovirt-users] Cannot retrieve answer file from 1st HE host
> when setting up 2nd host
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:44 PM, Will Dennis <wdennis at nec-labs.com> wrote:
>
> Which logs are needed?
>
>
> Let's start with vdsm.log and /var/log/messages.
> It's also quite strange that you have that amount of data in mom.log, so
> that one could be interesting as well.
>
>
>
>
>
> /var/log/vdsm
>
> total 24M
>
> drwxr-xr-x   3 vdsm kvm  4.0K Dec 18 20:10 .
>
> drwxr-xr-x. 13 root root 4.0K Dec 20 03:15 ..
>
> drwxr-xr-x   2 vdsm kvm     6 Dec  9 03:24 backup
>
> -rw-r--r--   1 vdsm kvm  2.5K Dec 21 11:29 connectivity.log
>
> -rw-r--r--   1 vdsm kvm  173K Dec 21 11:21 mom.log
>
> -rw-r--r--   1 vdsm kvm  2.0M Dec 17 10:09 mom.log.1
>
> -rw-r--r--   1 vdsm kvm  2.0M Dec 17 04:06 mom.log.2
>
> -rw-r--r--   1 vdsm kvm  2.0M Dec 16 22:03 mom.log.3
>
> -rw-r--r--   1 vdsm kvm  2.0M Dec 16 16:00 mom.log.4
>
> -rw-r--r--   1 vdsm kvm  2.0M Dec 16 09:57 mom.log.5
>
> -rw-r--r--   1 root root 115K Dec 21 11:29 supervdsm.log
>
> -rw-r--r--   1 root root 2.7K Oct 16 11:38 upgrade.log
>
> -rw-r--r--   1 vdsm kvm   13M Dec 22 08:44 vdsm.log
>
>
>
>
>
> *From:* Simone Tiraboschi [mailto:stirabos at redhat.com]
> *Sent:* Tuesday, December 22, 2015 3:58 AM
> *To:* Will Dennis; Sahina Bose
> *Cc:* Yedidyah Bar David; users
> *Subject:* Re: [ovirt-users] Cannot retrieve answer file from 1st HE host
> when setting up 2nd host
>
>
>
>
>
>
>
> On Tue, Dec 22, 2015 at 2:09 AM, Will Dennis <wdennis at nec-labs.com> wrote:
>
> http://ur1.ca/ocstf
>
>
>
>
> 2015-12-21 11:28:39 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:SEND                 Please specify the full
> shared storage connection path to use (example: host:/path):
> 2015-12-21 11:28:55 DEBUG otopi.plugins.otopi.dialog.human
> dialog.__logString:219 DIALOG:RECEIVE    localhost:/engine
>
>
>
> OK, so you are trying to deploy hosted-engine on GlusterFS in a
> hyper-converged way (using the same hosts for virtualization and for
> serving GlusterFS). Unfortunately I have to advise you that this is not a
> supported configuration on oVirt 3.6 due to several open bugs.
>
> So I'm glad you can help us test it, but I have to warn you that this
> scheme is not production-ready today.
>
>
>
> In your case it seems that VDSM correctly connects to the GlusterFS volume
> and sees all the bricks:
>
>
> 2015-12-21 11:28:55 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.nfs plugin.execute:936
> execute-output: ('/sbin/gluster', '--mode=script', '--xml', 'volume',
> 'info', 'engine', '--remote-host=localhost') stdout:
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <cliOutput>
>   <opRet>0</opRet>
>   <opErrno>0</opErrno>
>   <opErrstr/>
>   <volInfo>
>     <volumes>
>       <volume>
>         <name>engine</name>
>         <id>974c9da4-b236-4fc1-b26a-645f14601db8</id>
>         <status>1</status>
>         <statusStr>Started</statusStr>
>         <brickCount>6</brickCount>
>         <distCount>3</distCount>
>
>
>
> but then VDSM doesn't find any storage domain there:
>
>
>
>
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage.Plugin._late_customization
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getExistingDomain:476 _getExistingDomain
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._storageServerConnection:638 connectStorageServer
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._storageServerConnection:701 {'status': {'message': 'OK', 'code':
> 0}, 'statuslist': [{'status': 0, 'id':
> '67ece152-dd66-444c-8d18-4249d1b8f488'}]}
> 2015-12-21 11:29:58 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getStorageDomainsList:595 getStorageDomainsList
> 2015-12-21 11:29:59 DEBUG
> otopi.plugins.ovirt_hosted_engine_setup.storage.storage
> storage._getStorageDomainsList:598 {'status': {'message': 'OK', 'code': 0},
> 'domlist': []}
>
>
>
> Can you please also attach the corresponding VDSM logs?
>
>
>
> Adding Sahina here.
>
>
>
>
>
> On Dec 21, 2015, at 11:58 AM, Simone Tiraboschi <stirabos at redhat.com> wrote:
>
>
> On Mon, Dec 21, 2015 at 5:52 PM, Will Dennis <wdennis at nec-labs.com> wrote:
>
> However, when I went to the 3rd host and ran the setup, I selected
> 'glusterfs' and gave the path of the engine volume, and it came back and
> incorrectly identified it as the first host instead of an additional
> host... How does setup determine that? I confirmed on this 3rd host that
> the engine volume is available and has the GUID subfolder of the hosted
> engine...
>
>
> Can you please attach the hosted-engine-setup log from that host as well?
>
>
>
>
>