Liron Aravot has submitted this change and it was merged.
Change subject: core: introducing host immediate domain recovery mechanism
......................................................................
core: introducing host immediate domain recovery mechanism
oVirt engine allows a host to be activated even if it can't access some
of the data center's storage domains, provided that those domains are
marked as "inactive". A domain is marked inactive when all the hosts that
are already in status Up have reported it as problematic, so there's no
need to prevent "new" hosts from being activated.
If we failed to connect to the storage server of an inactive domain, we
won't have the link for that domain and we won't be able to produce it
(as the mount was possibly unavailable when we attempted to connect to
the storage server).
If connectivity to that domain returns, a host that was already active
before might report that it has access to the domain, which will cause
the engine to change that domain's status to "active". The issue is that
hosts that were activated after the connectivity was lost would move to
non-operational (causing VM migration, etc.), as they possibly won't have
a connection to the domain (it's a race between the domain status being
changed to Active and the domain auto recovery mechanism) and won't have
the needed links of that domain.
The implemented solution attempts to prevent hosts from moving to
non-operational status, to avoid its related effects.
A new Quartz job is set to run every 30 seconds; that job will inspect
all the host reports that were gathered since its last run. The
motivation for that implementation is to aggregate the operations on the
different hosts together, to avoid long wait times and blocking other
"pool" operations.
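The batching idea above can be sketched roughly as follows. This is an
illustration only, not the code from this change: DomainReportBatcher,
submit, drain and schedule are hypothetical names, and a plain
ScheduledExecutorService stands in for the Quartz job.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from ovirt-engine.
class DomainReportBatcher {
    private final Object lock = new Object();
    private List<String[]> pending = new ArrayList<>();

    // Called whenever a host reports a problem with a domain.
    void submit(String hostId, String domainId) {
        synchronized (lock) {
            pending.add(new String[] {hostId, domainId});
        }
    }

    // Takes everything gathered since the previous call and groups it by
    // host, so that each host is handled once per run.
    Map<String, Set<String>> drain() {
        List<String[]> batch;
        synchronized (lock) {
            batch = pending;
            pending = new ArrayList<>();
        }
        Map<String, Set<String>> byHost = new LinkedHashMap<>();
        for (String[] report : batch) {
            byHost.computeIfAbsent(report[0], k -> new LinkedHashSet<>())
                  .add(report[1]);
        }
        return byHost;
    }

    // Stand-in for the periodic job (assumption: the real change uses
    // Quartz). The handler receives one aggregated batch per run.
    ScheduledExecutorService schedule(long periodSeconds,
                                      Consumer<Map<String, Set<String>>> handler) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> handler.accept(drain()),
                periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Handling one aggregated batch per run, rather than reacting to every
report as it arrives, is what keeps the per-host work from serializing
behind other "pool" operations in this sketch.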
If any host has a "new" report on a domain that is active or unknown
and that it can't access for a "storage" reason, that host will be
reconnected to the storage servers of the active/unknown domains and
will refresh its storage pool metadata.
The engine will attempt to "recover" each host only once for each
problematic report, to avoid flooding the system with recovery attempts.
If the host still has a problem accessing the domain, it will be moved
to non-operational as usual.
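The "recover once per problematic report" rule can be illustrated with
the sketch below. It is a simplification under stated assumptions, not
the engine's implementation: RecoveryPolicy, its enums and its methods
are hypothetical, and the handling of inactive domains and of
non-storage reasons is assumed rather than taken from this change.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: none of these names come from ovirt-engine.
class RecoveryPolicy {
    enum DomainStatus { ACTIVE, UNKNOWN, INACTIVE }
    enum Action { RECOVER, USUAL_HANDLING, IGNORE }

    // Host/domain pairs that already received one recovery attempt.
    private final Set<String> attempted = new HashSet<>();

    Action onProblemReport(String hostId, String domainId,
                           DomainStatus status, boolean storageReason) {
        if (status == DomainStatus.INACTIVE) {
            // Assumption: problems with inactive domains do not affect hosts.
            return Action.IGNORE;
        }
        if (!storageReason) {
            // Assumption: only "storage" reasons qualify for recovery.
            return Action.USUAL_HANDLING;
        }
        // Set.add returns true only the first time, so a repeated problem
        // falls through to the usual handling (move to non-operational).
        return attempted.add(key(hostId, domainId))
                ? Action.RECOVER
                : Action.USUAL_HANDLING;
    }

    // Assumption: once the host sees the domain again, a later problem
    // counts as a "new" report and earns a fresh recovery attempt.
    void onDomainAccessible(String hostId, String domainId) {
        attempted.remove(key(hostId, domainId));
    }

    private static String key(String hostId, String domainId) {
        return hostId + "/" + domainId;
    }
}
```

In this sketch RECOVER would correspond to reconnecting the host to the
domain's storage servers and refreshing its storage pool metadata.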
Change-Id: Idb7b2fe8c87805986aaf25cd0f24f605d67d4186
Bug-Url: https://bugzilla.redhat.com/show_bug.cgi?id=1093924
Signed-off-by: Liron Aravot <laravot(a)redhat.com>
---
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VdsEventListener.java
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/storage/ConnectHostToStoragePoolServerCommandBase.java
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/storage/ConnectHostToStoragePoolServersCommand.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/action/ConnectHostToStoragePoolServersParameters.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/businessentities/IVdsEventListener.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/config/ConfigValues.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/locks/LockingGroup.java
M backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/irsbroker/IrsProxyData.java
M backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/storage/StoragePoolDomainHelper.java
M packaging/dbscripts/upgrade/pre_upgrade/0000_config.sql
10 files changed, 363 insertions(+), 56 deletions(-)
Approvals:
Allon Mureinik: Looks good to me, approved
Liron Aravot: Verified; Looks good to me, approved
--
To view, visit
http://gerrit.ovirt.org/27523
Gerrit-MessageType: merged
Gerrit-Change-Id: Idb7b2fe8c87805986aaf25cd0f24f605d67d4186
Gerrit-PatchSet: 4
Gerrit-Project: ovirt-engine
Gerrit-Branch: master
Gerrit-Owner: Liron Aravot <laravot(a)redhat.com>
Gerrit-Reviewer: Allon Mureinik <amureini(a)redhat.com>
Gerrit-Reviewer: Daniel Erez <derez(a)redhat.com>
Gerrit-Reviewer: Federico Simoncelli <fsimonce(a)redhat.com>
Gerrit-Reviewer: Liron Aravot <laravot(a)redhat.com>
Gerrit-Reviewer: Maor Lipchuk <mlipchuk(a)redhat.com>
Gerrit-Reviewer: Oved Ourfali <oourfali(a)redhat.com>
Gerrit-Reviewer: Roy Golan <rgolan(a)redhat.com>
Gerrit-Reviewer: Tal Nisan <tnisan(a)redhat.com>
Gerrit-Reviewer: automation(a)ovirt.org
Gerrit-Reviewer: oVirt Jenkins CI Server