
Gervais, I checked the logs and I see:

jsonrpc.Executor/1::ERROR::2016-07-19 16:19:10,283::task::868::Storage.TaskManager.Task::(_setError) Task=`b27c8bbd-ca35-44ca-97ae-88c4e91f6eec`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2700, in getStorageDomainInfo
    dom = self.validateSdUUID(sdUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 285, in validateSdUUID
    sdDom.validate()
  File "/usr/share/vdsm/storage/fileSD.py", line 485, in validate
    raise se.StorageDomainAccessError(self.sdUUID)
StorageDomainAccessError: Domain is either partially accessible or entirely inaccessible: (u'248f46f0-d793-4581-9810-c9d965e2f286',)

Thread-21821::ERROR::2016-07-19 16:19:14,348::api::195::root::(_getHaInfo) failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 174, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
    dom_type=dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
    .format(sd_type, options, e))
RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '248f46f0-d793-4581-9810-c9d965e2f286'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>

After a couple of the above issues, vdsm was restarted and 'Connection reset by peer' started to occur.
In between the connection resets I can see:

Thread-76::ERROR::2016-07-19 16:21:25,024::api::195::root::(_getHaInfo) failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 174, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
    with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 99, in connection
    self.connect(retries, wait)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 78, in connect
    raise BrokerConnectionError(error_msg)
BrokerConnectionError: Failed to connect to broker, the number of errors has exceeded the limit (1)

and

Thread-315::ERROR::2016-07-19 16:26:58,541::vm::765::virt.vm::(_startUnderlyingVm) vmId=`4013c829-c9d7-4b72-90d5-6fe58137504c`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 706, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1995, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 916, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3611, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: resource busy: Failed to acquire lock: error -243

and

Thread-6834::ERROR::2016-07-20 17:18:10,030::task::868::Storage.TaskManager.Task::(_setError) Task=`f6d8d5df-a55f-4ccb-af11-f1b44b9757d0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 875, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3473, in stopMonitoringDomain
    raise se.StorageDomainIsMemberOfPool(sdUUID)
StorageDomainIsMemberOfPool: Storage domain is member of pool: 'domain=248f46f0-d793-4581-9810-c9d965e2f286'

In the logs I can see that vdsm was restarted on 2016-07-21 14:55:03,607 and the issues stopped occurring. Was there any hardware (storage) issue? I can see from your previous email that the issues started to occur again on 2016-07-22. Do you see any errors like those above?

Thanks,
Piotr

On Fri, Jul 22, 2016 at 3:05 PM, Gervais de Montbrun <gervais@demontbrun.com> wrote:
Hi Simone,
I did have the issue you link to below when doing a `hosted-engine --deploy` on this server when I was setting it up to run 3.6. I've commented on the bug with my experiences. I did get the host working in 3.6 and there were no errors, but this one has cropped up since upgrading to 4.0.1.
I did not have the same issue on all of my hosts, but the error I am experiencing now:
JsonRpc (StompReactor)::ERROR::2016-07-22 09:59:56,062::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:11,240::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:21,158::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:21,441::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:26,717::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:31,856::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:36,982::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
JsonRpc (StompReactor)::ERROR::2016-07-22 10:00:52,180::betterAsyncore::113::vds.dispatcher::(recv) SSL error during reading data: unexpected eof
is happening on all of them. :-(
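For anyone hitting the same thing, here is a minimal sketch of a reachability check I put together (assuming vdsm's default JSON-RPC port of 54321, and assuming the IPv4/IPv6 binding issue referenced further down the thread is involved; the hostname is a placeholder, not one of my actual hosts):

import socket

HOST = "ovirt-host.example.com"  # placeholder: replace with one of your hosts
PORT = 54321                     # vdsm's default JSON-RPC port

# Try plain TCP connects over IPv4 and IPv6 to see which family vdsm answers on.
for family, label in ((socket.AF_INET, "IPv4"), (socket.AF_INET6, "IPv6")):
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.settimeout(5)
    try:
        sock.connect((HOST, PORT))
        print("%s: TCP connect to %s:%d succeeded" % (label, HOST, PORT))
    except socket.error as err:
        print("%s: TCP connect to %s:%d failed: %s" % (label, HOST, PORT, err))
    finally:
        sock.close()

If one address family connects and the other is refused, that would point at the binding/listen side rather than at storage.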
Cheers, Gervais
On Jul 22, 2016, at 5:35 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:
On Thu, Jul 21, 2016 at 8:08 PM, Gervais de Montbrun <gervais@demontbrun.com> wrote:
Hi Martin
Logs are attached.
Thank you for any help you can offer. :-)
Cheers, Gervais
see also this one: https://bugzilla.redhat.com/show_bug.cgi?id=1358530
the results are pretty similar.
On Jul 21, 2016, at 10:20 AM, Martin Perina <mperina@redhat.com> wrote:
So could you please share logs?
Thanks
Martin
On Thu, Jul 21, 2016 at 3:17 PM, Gervais de Montbrun <gervais@demontbrun.com> wrote:
Hi Oved,
Thanks for the suggestion.
I tried setting "management_ip = 0.0.0.0" but got the same result. BTW, management_ip='0.0.0.0' (as suggested in the post) doesn't work for me; vdsmd wouldn't start.
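For reference, a sketch of what I tried in /etc/vdsm/vdsm.conf (the [addresses] section name is my assumption based on the stock config layout; please double-check it against your own file before copying):

[addresses]
# Force vdsm's management port onto IPv4 (unquoted value;
# the quoted '0.0.0.0' form kept vdsmd from starting for me)
management_ip = 0.0.0.0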
Cheers, Gervais
On Jul 20, 2016, at 10:50 AM, Oved Ourfali <oourfali@redhat.com> wrote:
Also, this thread [1] seems similar; it also discusses an IPv4/IPv6 issue. Does it help?
[1] http://lists.ovirt.org/pipermail/users/2016-June/040602.html
On Wed, Jul 20, 2016 at 4:43 PM, Martin Perina <mperina@redhat.com> wrote:
Hi,
could you please create a bug and attach engine host logs (all from /var/log/ovirt-engine) and VDSM logs (from /var/log/vdsm)?
Thanks
Martin Perina
On Wed, Jul 20, 2016 at 1:50 PM, Gervais de Montbrun <gervais@demontbrun.com> wrote:
Hi Qiong,
I am experiencing the exact same issue. All four of my hosts are throwing the same error to vdsm.log. If you find a solution, please let me know.