Gervais,
I checked the logs and I see:
jsonrpc.Executor/1::ERROR::2016-07-19
16:19:10,283::task::868::Storage.TaskManager.Task::(_setError)
Task=`b27c8bbd-ca35-44ca-97ae-88c4e91f6eec`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 875, in _run
return fn(*args, **kargs)
File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 2700, in getStorageDomainInfo
dom = self.validateSdUUID(sdUUID)
File "/usr/share/vdsm/storage/hsm.py", line 285, in validateSdUUID
sdDom.validate()
File "/usr/share/vdsm/storage/fileSD.py", line 485, in validate
raise se.StorageDomainAccessError(self.sdUUID)
StorageDomainAccessError: Domain is either partially accessible or
entirely inaccessible: (u'248f46f0-d793-4581-9810-c9d965e2f286',)
Thread-21821::ERROR::2016-07-19
16:19:14,348::api::195::root::(_getHaInfo) failed to retrieve Hosted
Engine HA info
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 174,
in _getHaInfo
stats = instance.get_all_stats()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
self._configure_broker_conn(broker)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
dom_type=dom_type)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
.format(sd_type, options, e))
RequestError: Failed to set storage domain FilesystemBackend, options
{'dom_type': 'nfs3', 'sd_uuid':
'248f46f0-d793-4581-9810-c9d965e2f286'}: Request failed: <class
'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>
After a couple of the issues above, vdsm was restarted and 'Connection reset
by peer' errors started to occur. In between the connection resets I can see:
Thread-76::ERROR::2016-07-19
16:21:25,024::api::195::root::(_getHaInfo) failed to retrieve Hosted
Engine HA info
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 174,
in _getHaInfo
stats = instance.get_all_stats()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
with broker.connection(self._retries, self._wait):
File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
self.connect(retries, wait)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of
errors has exceeded the limit (1)
and
Thread-315::ERROR::2016-07-19
16:26:58,541::vm::765::virt.vm::(_startUnderlyingVm)
vmId=`4013c829-c9d7-4b72-90d5-6fe58137504c`::The vm start process
failed
Traceback (most recent call last):
File "/usr/share/vdsm/virt/vm.py", line 706, in _startUnderlyingVm
self._run()
File "/usr/share/vdsm/virt/vm.py", line 1995, in _run
self._connection.createXML(domxml, flags),
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py",
line 123, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3611, in createXML
if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: resource busy: Failed to acquire lock: error -243
and
Thread-6834::ERROR::2016-07-20
17:18:10,030::task::868::Storage.TaskManager.Task::(_setError)
Task=`f6d8d5df-a55f-4ccb-af11-f1b44b9757d0`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 875, in _run
return fn(*args, **kargs)
File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 3473, in stopMonitoringDomain
raise se.StorageDomainIsMemberOfPool(sdUUID)
StorageDomainIsMemberOfPool: Storage domain is member of pool:
'domain=248f46f0-d793-4581-9810-c9d965e2f286'
In the logs I can see that vdsm was restarted at 2016-07-21
14:55:03,607 and the issues stopped occurring.
Was there any hardware (storage) issue?
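If you want to rule that out from the host side, a few quick checks might
help (the unit names and paths below are just the defaults for an NFS
hosted-engine setup, adjust as needed):

  # is the hosted-engine storage domain still mounted and readable?
  ls -l /rhev/data-center/mnt/
  # are vdsm and the HA services up?
  systemctl status vdsmd ovirt-ha-agent ovirt-ha-broker
  # overall hosted-engine state, including the storage it sees
  hosted-engine --vm-status
  # 'Failed to acquire lock: error -243' looks like a sanlock error, so this may help too
  sanlock client status

The broker and agent logs under /var/log/ovirt-hosted-engine-ha/ usually
have more detail on the BrokerConnectionError as well.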
I can see from your previous email that the issues started to occur
again on 2016-07-22.
Do you see any errors like those above?
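A quick way to look for them is something like (default vdsm log path):

  grep -E 'StorageDomainAccessError|BrokerConnectionError|Failed to acquire lock' /var/log/vdsm/vdsm.log

Since the new errors are SSL related, it would also help to know whether
vdsm's TLS endpoint still answers, e.g.:

  openssl s_client -connect <host>:54321

where <host> is the affected host and 54321 is vdsm's default port.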
Thanks,
Piotr
On Fri, Jul 22, 2016 at 3:05 PM, Gervais de Montbrun
<gervais(a)demontbrun.com> wrote:
Hi Simone,
I did have the issue you link to below when running `hosted-engine --deploy`
on this server while setting it up for 3.6, and I've commented on the bug
with my experiences. I did get the host working under 3.6 with no errors,
but this error has cropped up since upgrading to 4.0.1.
I did not have the same issue on all of my hosts, but the error I am
experiencing now:
JsonRpc (StompReactor)::ERROR::2016-07-22
09:59:56,062::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:11,240::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:21,158::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:21,441::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:26,717::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:31,856::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:36,982::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22
10:00:52,180::betterAsyncore::113::vds.dispatcher::(recv) SSL error during
reading data: unexpected eof
is happening on all of them.
:-(
Cheers,
Gervais
On Jul 22, 2016, at 5:35 AM, Simone Tiraboschi <stirabos(a)redhat.com> wrote:
On Thu, Jul 21, 2016 at 8:08 PM, Gervais de Montbrun
<gervais(a)demontbrun.com> wrote:
Hi Martin
Logs are attached.
Thank you for any help you can offer.
:-)
Cheers,
Gervais
see also this one:
https://bugzilla.redhat.com/show_bug.cgi?id=1358530
the results are pretty similar.
On Jul 21, 2016, at 10:20 AM, Martin Perina <mperina(a)redhat.com> wrote:
So could you please share logs?
Thanks
Martin
On Thu, Jul 21, 2016 at 3:17 PM, Gervais de Montbrun
<gervais(a)demontbrun.com> wrote:
Hi Oved,
Thanks for the suggestion.
I tried setting "management_ip = 0.0.0.0", but got the same result. BTW,
management_ip='0.0.0.0' (as suggested in the post) doesn't work for me;
vdsmd wouldn't start with it.
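For reference, this is roughly the snippet I mean in /etc/vdsm/vdsm.conf
(I'm assuming [addresses] is the right section, that is where the option
sits on my host):

  [addresses]
  # unquoted value: vdsmd starts, but the errors remain;
  # the quoted form from the post kept vdsmd from starting at all
  management_ip = 0.0.0.0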
Cheers,
Gervais
On Jul 20, 2016, at 10:50 AM, Oved Ourfali <oourfali(a)redhat.com> wrote:
Also, this thread seems similar; it also talks about an IPv4/IPv6 issue.
Does it help?
[1]
http://lists.ovirt.org/pipermail/users/2016-June/040602.html
On Wed, Jul 20, 2016 at 4:43 PM, Martin Perina <mperina(a)redhat.com> wrote:
Hi,
could you please create a bug and attach engine host logs (all from
/var/log/ovirt-engine) and VDSM logs (from /var/log/vdsm)?
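In case it makes collecting them easier, something as simple as this should
do (filenames are just a suggestion):

  # on the engine machine
  tar czf engine-logs.tar.gz /var/log/ovirt-engine
  # on the affected host
  tar czf vdsm-logs.tar.gz /var/log/vdsm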
Thanks
Martin Perina
On Wed, Jul 20, 2016 at 1:50 PM, Gervais de Montbrun
<gervais(a)demontbrun.com> wrote:
Hi Qiong,
I am experiencing the exact same issue. All four of my hosts are throwing
the same error into vdsm.log. If you find a solution, please let me know.