
On Thu, Apr 7, 2016 at 5:05 PM, <nicolas@devels.es> wrote:
Hi,
Lately we're having a lot of events like these:
2016-04-07 14:54:25,247 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-2) [] domain '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' in problem. vds: 'host7.domain.com'
2016-04-07 14:54:40,501 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-17) [] Domain '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' recovered from problem. vds: 'host7.domain.com'
2016-04-07 14:54:40,501 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-17) [] Domain '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' has recovered from problem. No active host in the DC is reporting it as problematic, so clearing the domain recovery timer.
2016-04-07 14:54:46,314 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-30) [] domain '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' in problem. vds: 'host5.domain.com'
2016-04-07 14:55:01,589 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-32) [] Domain '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' recovered from problem. vds: 'host5.domain.com'
2016-04-07 14:55:01,589 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (org.ovirt.thread.pool-8-thread-32) [] Domain '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' has recovered from problem. No active host in the DC is reporting it as problematic, so clearing the domain recovery timer.
So far I've seen this warning for only one domain, but it doesn't look good nevertheless. Not sure if it's related, but I can't find a disk with this UUID. How can I start debugging?
This is oVirt 3.6.4.1-1, using an iSCSI-based storage backend.
This may be related to this bug:
https://bugzilla.redhat.com/1081962

Running this tool on the vdsm log will give a better picture of what is happening on the vdsm side:
https://github.com/oVirt/vdsm/blob/master/contrib/repoplot

You can see examples of the output in the bug:
https://bugzilla.redhat.com/attachment.cgi?id=1130967

We are working on improved monitoring that will eliminate this issue.

Nir
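
As a rough first look on the engine side, the "in problem" / "recovered from problem" lines quoted in the question can be paired up per host to see how long each episode lasted. The following is only a sketch, not part of oVirt or vdsm: the function name problem_windows and the regular expression are made up for illustration, and the pattern assumes the engine.log line layout shown in the excerpt above.

```python
import re
from datetime import datetime

# Assumption: lines look like the IrsProxyData entries quoted in the question.
# The "has recovered from problem" summary lines are deliberately not matched.
EVENT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \w+\s+"
    r"\[org\.ovirt\.engine\.core\.vdsbroker\.irsbroker\.IrsProxyData\].*?"
    r"[Dd]omain '(?P<domain>[^']+)' (?P<state>in problem|recovered from problem)\. "
    r"vds: '(?P<host>[^']+)'"
)


def problem_windows(lines):
    """Yield (host, domain, start, seconds) for each problem/recovery pair."""
    open_since = {}  # (host, domain) -> time the domain was first reported in problem
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        key = (m["host"], m["domain"])
        if m["state"] == "in problem":
            # Keep the earliest timestamp if the problem is reported repeatedly.
            open_since.setdefault(key, ts)
        elif key in open_since:
            start = open_since.pop(key)
            yield key[0], key[1], start, (ts - start).total_seconds()
```

Passing an open engine.log file object to problem_windows() and printing the tuples shows which hosts report the domain as problematic and for how long. This only summarizes what the engine saw; repoplot on the vdsm log remains the way to see what happened on the host itself.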