[Users] Host Non-Operational from sanlock and VM fails to migrate

Itamar Heim iheim at redhat.com
Sun Feb 9 22:15:21 UTC 2014


On 02/03/2014 06:58 PM, Trey Dockendorf wrote:
> I have a 2-node oVirt 3.3.2 cluster set up and am evaluating it
> for production use on our HPC system for managing our VM
> infrastructure.  Currently I'm trying to utilize our DDR InfiniBand
> fabric for the storage domains in oVirt using NFS over RDMA.  I've
> noticed some unstable behavior and it seems in every case to begin
> with sanlock.
>
> The oVirt web admin interface shows the following message as the first
> sign of trouble on 2014-Feb-03 07:45.
>
> "Invalid status on Data Center Default. Setting Data Center status to
> Non Responsive (On host vm01.brazos.tamu.edu, Error: Network error
> during communication with the Host.).".
>
> The single VM I had running is stuck in the "Migrating From" state.
> virsh shows the VM paused on the crashed host and the one it attempted
> to migrate to.
>
> Right now I have a few concerns.
>
> 1) The cause of the sanlock (or other) instability, and whether it's
> related to a bug or to an issue with using NFSoRDMA.
> 2) Why the VM failed to migrate if the second host had no issues.  If
> the first host is down, should the VM be considered offline and booted
> on the second host after the first is fenced?
>
> Attached are logs from the failed host (vm01) and the healthy host
> (vm02) as well as the engine.  The failed host's /var/log/messages is
> also attached (vm01_message.log).
>
> Thanks
> - Trey

Was this resolved?


