
On Wed, Apr 10, 2013 at 08:59:01AM -0500, Tony Feldmann wrote:
I am having a strange issue in my oVirt cluster. I have 2 hosts: one runs the engine and is also added as a host, and the other is a separate system added as a host. Both systems run gluster across local disks for shared storage.

Everything was working fine until last night, when the system that also runs the engine went unresponsive in the admin page. All VMs on that host were still running. I shut them down from within the guest OS, as I could not do anything to the VMs while the host was in the unresponsive state.

After getting the VMs off and rebooting the host, the vdsmd service says that it is running, but it continually restarts the vdsm process and dumps out this message: detected unhandled Python exception in '/usr/share/vdsm/vdsm'. All services say they are up and running, but the host stays in the unresponsive state and the vdsm process keeps respawning. There is also no data in vdsm.log. Can anyone shed any light on this for me?
vdsm-devel@fedorahosted.org may be a better place to ask vdsm-specific questions.

Could you log into the non-operational host as root and stop the vdsm service? Then become the vdsm user with su -s /bin/bash - vdsm and run /usr/share/vdsm/vdsm manually. Do you see anything in particular?

Dan.
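
The steps above can be sketched as a small script, in case that is easier to follow. This is only an illustration, not a tested procedure: it assumes the service is called vdsmd (as in the original report) and that the `service` command is available on the host. The `run` helper and the `DRY_RUN` variable are made up for this sketch; with the default `DRY_RUN=1` the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the manual debugging steps; meant to be run as root on the affected host.
# Assumptions: the service is named "vdsmd" and is controlled via `service`.
# DRY_RUN and run() are hypothetical helpers for this example only.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Only show what would be executed.
        echo "+ $*"
    else
        "$@"
    fi
}

debug_vdsm() {
    # Stop the service so that vdsm is no longer respawned in the background.
    run service vdsmd stop
    # Start vdsm in the foreground as the vdsm user, so that any Python
    # traceback is printed to the terminal instead of being lost.
    run su -s /bin/bash - vdsm -c /usr/share/vdsm/vdsm
}

debug_vdsm
```

Running it with `DRY_RUN=0` as root would actually execute the two commands; anything vdsm prints before it dies is what is worth posting back to the list.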