[ovirt-users] Hosted-Engine HA problem

Niels de Vos ndevos at redhat.com
Thu Oct 30 20:11:25 UTC 2014


On Thu, Oct 30, 2014 at 09:07:24PM +0530, Vijay Bellur wrote:
> On 10/30/2014 06:45 PM, Jiri Moskovcak wrote:
> >On 10/30/2014 09:22 AM, Jaicel R. Sabonsolin wrote:
> >>Hi Guys,
> >>
> >>I need help with my oVirt Hosted-Engine HA setup. I am running 2
> >>oVirt hosts and 2 Gluster nodes with replicated volumes. I already have
> >>VMs running on my hosts, and they migrate normally when I, for example,
> >>power off the host they are running on. The problem is that the
> >>engine does not migrate when I switch off the host that hosts the engine.
> >>
> >>    oVirt        3.4.3-1.el6
> >>    KVM         0.12.1.2 - 2.415.el6_5.10
> >>    LIBVIRT   libvirt-0.10.2-29.el6_5.9
> >>    VDSM      vdsm-4.14.17-0.el6
> >>
> >>
> >>Right now, I get this result from hosted-engine --vm-status:
> >>
> >>       File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
> >>         "__main__", fname, loader, pkg_name)
> >>       File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
> >>         exec code in run_globals
> >>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 111, in <module>
> >>         if not status_checker.print_status():
> >>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 58, in print_status
> >>         all_host_stats = ha_cli.get_all_host_stats()
> >>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 137, in get_all_host_stats
> >>         return self.get_all_stats(self.StatModes.HOST)
> >>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 86, in get_all_stats
> >>         constants.SERVICE_TYPE)
> >>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 171, in get_stats_from_storage
> >>         result = self._checked_communicate(request)
> >>       File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 199, in _checked_communicate
> >>         .format(message or response))
> >>    ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: <type 'exceptions.OSError'>
> >>
> >>
> >>Restarting ha-broker and ha-agent normalizes the status, but eventually
> >>it becomes "false" and then returns to the result above. I hope you
> >>can help me with this.
> >>
> >
> >Hi Jaicel,
> >please attach agent.log and broker.log from the host where you are trying
> >to run hosted-engine --vm-status. I have a feeling that you ran into a
> >known problem on Gluster, a stalled file descriptor. In that case the
> >only known solution at this time is to restart the broker and agent, as
> >you have already found out.
> >
> 
> Adding Niels and gluster-devel to troubleshoot from Gluster NFS perspective.

I'd welcome any details on this "stalled file descriptor" problem. Is
there a bug filed with some details like logs, sysrq-t and maybe even
tcpdumps? If there is an easy way to reproduce this behaviour, I can
surely look into it and hopefully come up with some advice or a fix.

Thanks,
Niels


