Dan Kenigsberg wrote:
On Wed, Apr 10, 2013 at 08:59:01AM -0500, Tony Feldmann wrote:
> I am having a strange issue in my ovirt cluster. I have 2 hosts, 1 running
> engine and added as a host and one other system added as a host. Both
> systems are running gluster across local disks for shared storage.
> Everything was working fine until last night, where my system that is also
> running the engine went unresponsive in the admin page. All vms were still
> running that were on the host. I shut down the vms that were on the host
> from within the guest os as I was not able to do anything to the vm with
> the host in unresponsive state. After getting the vms off and rebooting
> the host, the vdsmd service says that it is running, but it continually
> restarts the vdsm process and dumps out these messages: detected unhandled
> Python exception in '/usr/share/vdsm/vdsm'. All services say they are up
> and running but the host stays in unresponsive state and the vdsm process
> keeps respawning. There is also no data in the vdsm.log. Can anyone shed
> any light on this for me?
>
vdsm-devel(a)fedorahosted.org may be a better place to ask vdsm-specific
questions.
Could you log into the non-operational host as root and stop the vdsmd
service?
Then become the vdsm user with
    su -s /bin/bash - vdsm
and run /usr/share/vdsm/vdsm manually. Does it print anything in
particular?
Please also check the owner and permissions of /var/log/vdsm/vdsm.log.
It should be owned by vdsm:kvm, not root:root.
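A quick way to check that ownership could look like the sketch below. The
helper names owner_of and check_owner are made up for illustration, and it
assumes GNU coreutils stat (for the -c option). It only reports a mismatch;
the chown it suggests is left for you to run as root once you have confirmed
the owner really is wrong.

```shell
#!/bin/sh
# Hypothetical helpers, not part of vdsm. Assumes GNU coreutils stat.

# Print "owner:group" for a path.
owner_of() {
    stat -c '%U:%G' "$1"
}

# Return 0 if the path has the expected owner:group, 1 on a mismatch,
# 2 if the path could not be examined at all.
check_owner() {
    path=$1
    expected=$2
    actual=$(owner_of "$path") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $path is $actual"
    else
        echo "MISMATCH: $path is $actual, expected $expected"
        return 1
    fi
}

# Only look at the real log if it exists, so the sketch is harmless elsewhere.
LOG=/var/log/vdsm/vdsm.log
if [ -e "$LOG" ]; then
    check_owner "$LOG" vdsm:kvm || echo "To fix, as root: chown vdsm:kvm $LOG"
fi
```

If the log turns out to be root:root, vdsm running as the vdsm user cannot
write to it, which would fit the empty vdsm.log you describe.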
Joop