----- Original Message -----
Hi,
Here is a short summary of today's call; please correct me if I forgot or
misunderstood something.
Ayal argued that the failed host/storage domain should be reactivated
by a periodically executed job; he would prefer if the engine could
[try to] correct the problem right on discovery.
Livnat's point was that this is hard to implement and it is OK if we
move it to Nonoperational state and periodically check it again.
There was some argument over whether to call the current behavior a bug
or a missing behavior; I believe this is not very important.
I did not fully understand the last few sentences from Livnat. Did we
manage to agree on a change in the plan?
A couple of points that we agreed upon:
1. No need for a new mechanism; just initiate this from the monitoring context.
Preferably, if not difficult, evaluate the monitoring data first: if the host should
remain in non-op, then don't bother running initVdsOnUp.
2. Configuration of when to call initVdsOnUp is orthogonal to the auto-init behaviour.
If introduced, it should be on by default, the user should be able to configure it either
on or off for the host in general (no lower granularity), and it can only be configured
via the API.
When disabled, initVdsOnUp would be called only when an admin activates the host/storage,
and any error would keep it inactive (I still don't understand why this is at all needed,
but whatever).
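
To make points 1 and 2 concrete, here is a rough sketch of the decision the monitoring
context would make on each pass. It is only an illustration under stated assumptions: the
names Host, Status, auto_init_enabled, monitoring_cycle and activate_by_admin are
hypothetical, and init_vds_on_up is a stand-in callable for initVdsOnUp, not the engine's
actual code or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Status(Enum):
    UP = "Up"
    NON_OPERATIONAL = "NonOperational"


@dataclass
class Host:
    name: str
    status: Status = Status.UP
    # Hypothetical per-host flag (point 2): on by default, host-level only.
    auto_init_enabled: bool = True


def monitoring_cycle(host: Host, report_ok: bool,
                     init_vds_on_up: Callable[[Host], bool]) -> Status:
    """One monitoring pass over a single host.

    report_ok: whether the latest monitoring data says the host is healthy.
    init_vds_on_up: stand-in for initVdsOnUp; returns True on success.
    """
    if host.status is Status.UP:
        if not report_ok:
            host.status = Status.NON_OPERATIONAL
        return host.status

    # Host is currently non-operational.
    if not report_ok:
        # Point 1: monitoring data says it should stay non-op, so skip init.
        return host.status
    if not host.auto_init_enabled:
        # Point 2: flag is off, so wait for an admin to activate the host.
        return host.status
    if init_vds_on_up(host):
        host.status = Status.UP
    return host.status


def activate_by_admin(host: Host,
                      init_vds_on_up: Callable[[Host], bool]) -> Status:
    """Manual activation: run init once; any error keeps the host inactive."""
    host.status = Status.UP if init_vds_on_up(host) else Status.NON_OPERATIONAL
    return host.status
```

The point of the sketch is that no new periodic job is involved: the same pass that reads
the monitoring data decides whether an init attempt is worth making at all.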
Note that going forward what I envision is the engine pushing down the entire host
configuration once, and from that point on the host would try to keep this configuration
up and running. Once this happens there will be no need for initVdsOnUp at all.
Anyway, I agree with Ayal that it would be very nice if the engine
could fix the issues right on discovery, but I also agree that this
feature would take a bigger effort. It would be nice to know what
effort it would take to get the monitoring to do this safely. Could we
still call it monitoring then?
Laszlo
----- Original Message -----
> From: "Ayal Baron" <abaron(a)redhat.com>
> To: "Laszlo Hornyak" <lhornyak(a)redhat.com>
> Cc: engine-devel(a)ovirt.org, "Yaniv Kaul" <ykaul(a)redhat.com>
> Sent: Wednesday, February 15, 2012 12:46:05 PM
> Subject: Re: [Engine-devel] Autorecovery feature plan for review
>
>
>
> ----- Original Message -----
> > Hi Ayal,
> >
> > ----- Original Message -----
> > > From: "Ayal Baron" <abaron(a)redhat.com>
> > > To: "Yaniv Kaul" <ykaul(a)redhat.com>
> > > Cc: engine-devel(a)ovirt.org
> > > Sent: Wednesday, February 15, 2012 12:19:48 PM
> > > Subject: Re: [Engine-devel] Autorecovery feature plan for
> > > review
> > >
> > >
> > > >
> > > > I still fail to understand why you 'punish' existing objects
> > > > and
> > > > not
> > > > giving them the new feature enabled by default.
> > >
> > > This is not a feature, it's a bug!
> >
> > Whatever we call it, it is a change in behavior. We agreed that
> > it
> > will be enabled for all existing objects by default.
> >
> > http://globalnerdy.com/wordpress/wp-content/uploads/2007/12/bug_vs_featur...
> >
> > > This should not be treated as a feature and this should not be
> > > configurable!
> >
> > I can imagine some situations when I would not like the
> > autorecovery
> > to happen, but if everyone agrees not to make it configurable, I
> > will just remove it from my patchset.
>
> It's not autorecovery, you're not recovering anything. You're
> reflecting the fact that the resource is back to normal (not due to
> anything that the engine did).
> This is why it is a bug today.
> This is why it should not be configurable.
>
> >
> > > Today an object moves to non-operational due to state reported
> > > by
> > > vdsm. The object should immediately return to up the moment
> > > vdsm
> > > reports the object as ok (this means that you don't stop
> > > monitoring
> > > just because there is an error).
> > > That's it. no db field and no nothing...
> > > This pertains to storage domains, network, host status,
> > > whatever.
> > >
> > > > Y.
> > > >
> > > > > b. In environment to be clean installed -we have 0 existing
> > > > > entities -
> > > > > after clean install all new entities in the system will be
> > > > > create
> > > > > with
> > > > > auto recoverable set to true.
> > > > > Will this be considered a bad behavior?
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Engine-devel mailing list
> > > > > Engine-devel(a)ovirt.org
> > > > >
http://lists.ovirt.org/mailman/listinfo/engine-devel
> > > >
> >
>