A large number of my virtual machines are reporting a paused state, but they are actually up and running. Unfortunately, I don't look at this cluster very often, so I don't know exactly when it got into this state. I believe it may have happened a couple of months back, when I deployed a change to /etc/vdsm/vdsm.conf to throttle virtual machine migrations. I deployed this change via Puppet and had Puppet cycle the vdsm daemon, so all of the vdsm daemons in the cluster were restarted within a short period of time. Since this happened some time back, all the logs have rotated off.

 

I have tried restarting vdsmd and libvirtd, but that doesn't return the virtual machines to an up state. The output of "vdsClient -s 0 getAllVmStats" reports them as paused, while "virsh -r list --all" reports them as running. From the GUI I can click Run, and the VM will then show “Up” in both the GUI and vdsClient. But if I restart vdsm, the VMs revert to displaying a paused state. I did find that the *.recovery files in /var/run/vdsm contain a status of Paused even when the GUI and vdsClient are reporting the VM as up. I also tried changing the status in this file while vdsm was down and then starting vdsm, but it had no effect, and the entry in the *.recovery file reverted to Paused.

 

p335
sasS'status'
p336
S'Paused'
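
The excerpt above looks like Python pickle (protocol 0) text, so the status can be read without opening the file in an editor. Below is a minimal sketch of how I checked it, assuming the *.recovery file unpickles to a dict with a top-level 'status' key. That layout is my guess from the excerpt, not something I confirmed against the vdsm source, and the helper name read_recovery_status is just for illustration.

```python
import pickle


def read_recovery_status(path):
    """Return the 'status' value stored in a pickled dict file, or None.

    Assumption: the file unpickles to a dict that has a top-level
    'status' key (guessed from the excerpt, not verified in vdsm).
    Only unpickle files you trust, such as ones vdsm wrote locally.
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    if isinstance(data, dict):
        return data.get("status")
    return None
```

Calling read_recovery_status() on one of the files under /var/run/vdsm should return the same value shown in the excerpt if the assumption about the layout holds.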

 

Any idea what might have caused the VMs to report as paused, and how to recover from it?

 

I am running ovirt-3.3.4-1 and vdsm-4.13.3-4.