A large number of my virtual machines are reported as paused, but they are actually up and running. Unfortunately, I don't look at this cluster very often, so I don't know exactly when it got into this state. I believe it may have happened a couple of months ago, when I deployed a change to /etc/vdsm/vdsm.conf to throttle virtual machine migrations. I deployed the change via Puppet and had Puppet restart the vdsm daemon, so all of the vdsm daemons in the cluster were restarted within a short period of time. Because this happened some time ago, the logs from that period have all been rotated out.
I have tried restarting vdsmd and libvirtd, but that does not return the virtual machines to an Up state. "vdsClient -s 0 getAllVmStats" reports them as paused, while "virsh -r list --all" reports them as running. From the GUI I can click Run, and the VM then shows “Up” in both the GUI and vdsClient, but if I restart vdsm the VMs revert to displaying a paused state. I also found that the *.recovery files in /var/run/vdsm contain a status of Paused even while the GUI and vdsClient report the VM as up. I tried changing the status in this file while vdsm was down and then starting vdsm, but it had no effect, and the entry in the *.recovery file reverted to Paused:
p335
sasS'status'
p336
S'Paused'
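For anyone who wants to check the same thing on their own hosts, here is a minimal sketch of how the stored status could be read out of the recovery files. It assumes the *.recovery files are Python pickles (which the excerpt above suggests) and that the state is stored under a 'status' key somewhere in the pickled structure. The helper names find_status and recovery_status are my own, not part of vdsm. Since pickle.load can execute code, this should only be run against files written by vdsm itself.

```python
import glob
import pickle


def find_status(obj):
    """Return the first value stored under a 'status' key.

    Searches nested dicts, lists and tuples, because the exact layout
    of the recovery file is an assumption here.
    """
    if isinstance(obj, dict):
        if "status" in obj:
            return obj["status"]
        children = obj.values()
    elif isinstance(obj, (list, tuple)):
        children = obj
    else:
        return None
    for child in children:
        found = find_status(child)
        if found is not None:
            return found
    return None


def recovery_status(path):
    """Load one recovery file and return its stored status, or None."""
    with open(path, "rb") as fh:
        return find_status(pickle.load(fh))


if __name__ == "__main__":
    # Path taken from the post; adjust if your hosts keep the files elsewhere.
    for path in sorted(glob.glob("/var/run/vdsm/*.recovery")):
        print(path, recovery_status(path))
```

Comparing this output with "virsh -r list --all" on the same host shows which VMs have a stale Paused status on disk.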
Any idea what might have caused the VMs to be reported as paused, and how to recover from it?
I am running ovirt-3.3.4-1 and vdsm-4.13.3-4.