From fsimonce at redhat.com Tue Feb 7 10:27:09 2012 From: fsimonce at redhat.com (Federico Simoncelli) Date: Tue, 07 Feb 2012 05:27:09 -0500 (EST) Subject: FOSDEM 2012: VDSM - The oVirt Node Management Agent In-Reply-To: <1fdea180-7a6c-4296-8415-cd9d7a99b18b@zmail16.collab.prod.int.phx2.redhat.com> Message-ID: <36af1f31-b53f-4baa-b5a1-8bd849ff904a@zmail16.collab.prod.int.phx2.redhat.com> Here you can find my slides presented at FOSDEM 2012: http://people.redhat.com/fsimonce/fosdem-2012/ -- Federico From oschreib at redhat.com Wed Feb 8 16:05:19 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Wed, 08 Feb 2012 11:05:19 -0500 (EST) Subject: oVirt's Next Release In-Reply-To: Message-ID: <7d2c39b3-17f9-4345-a19f-561fac119b53@zmail14.collab.prod.int.phx2.redhat.com> The First release of oVirt is (almost) out, so we should talk about the next release. Please share your thoughts about: 1. Release Date 2. Release Criteria (affecting #1) Hopefully, we will have a formal vote about those suggestion in the next weekly meeting. Regards, Ofer Schreiber. From aliguori at us.ibm.com Wed Feb 8 22:05:38 2012 From: aliguori at us.ibm.com (Anthony Liguori) Date: Wed, 08 Feb 2012 16:05:38 -0600 Subject: How to effectively lurk oVirt projects with Gerrit? Message-ID: <4F32F1B2.6050501@us.ibm.com> Hi, Perhaps I'm getting too set in my ways, but I'm having a really hard time following oVirt projects with Gerrit. I'm hoping I can get some advice here from anyone with a similar needs. I know there's -patches lists like vdsm-patches but the mails generated by Gerrit only seem to be commits, not patches posted for inclusion or comments on those patches. It also doesn't inline the patches in the mails which makes casual reading difficult. In short, how can one be a silent observer via a mailing list and get a good feeling for what's going on in the project? Is there a way to make Gerrit send the actual patches to the various mailing lists as they're posted for review (along with review comments to those patches)? Regards, Anthony Liguori From iheim at redhat.com Wed Feb 8 22:34:16 2012 From: iheim at redhat.com (Itamar Heim) Date: Wed, 08 Feb 2012 17:34:16 -0500 (EST) Subject: How to effectively lurk oVirt projects with Gerrit? In-Reply-To: <4F32F1B2.6050501@us.ibm.com> References: <4F32F1B2.6050501@us.ibm.com> Message-ID: <08c301cce6b1$6c6e7580$454b6080$@com> > -----Original Message----- > From: arch-bounces at ovirt.org [mailto:arch-bounces at ovirt.org] On Behalf Of Anthony Liguori > Sent: Thursday, February 09, 2012 0:06 AM > To: arch at ovirt.org > Subject: How to effectively lurk oVirt projects with Gerrit? > > Hi, > > Perhaps I'm getting too set in my ways, but I'm having a really hard time > following oVirt projects with Gerrit. I'm hoping I can get some advice here > from anyone with a similar needs. > > I know there's -patches lists like vdsm-patches but the mails generated by > Gerrit only seem to be commits, not patches posted for inclusion or comments on > those patches. It also doesn't inline the patches in the mails which makes > casual reading difficult. All emails should go to the -patches lists, these would include initial patch email, comments and commit - but yes, they do not include the payload. It doesn't include the inline, since we had some issues contributing that code to gerrit upstream (well, their gerrit was down, then some CLA issues - hopefully all resolved by now). 
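(As an aside: until inline patches are supported, the payload of any change announced on a -patches list can be pulled from Gerrit directly and read locally. A minimal sketch, assuming anonymous read access at git://gerrit.ovirt.org/<project> and a hypothetical change number 900, patch set 1 -- Gerrit publishes every patch set under refs/changes/<last two digits of the change>/<change>/<patchset>:

    git clone git://gerrit.ovirt.org/vdsm.git && cd vdsm
    # fetch patch set 1 of (hypothetical) change 900 and show its full diff
    git fetch origin refs/changes/00/900/1
    git log -p -1 FETCH_HEAD

The clone URL and change number above are illustrative assumptions; the refs/changes layout itself is standard Gerrit behaviour.)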
> > In short, how can one be a silent observer via a mailing list and get a good > feeling for what's going on in the project? > > Is there a way to make Gerrit send the actual patches to the various mailing > lists as they're posted for review (along with review comments to those patches)? As I mentioned above, we saw this gap as well, and (Gal Hammer) wrote a patch for gerrit to close it. But I did not want to build a forked version of our gerrit until these patches are accepted into upstream gerrit. Once they do, either will wait for the next version, or just backport this specific patch to the version we have. Btw, you can also register in gerrit only to specific project and get the emails directly, rather than via the mailing list. (login, settings, watched projects, choose per project (or even branch) granularity to watch new/comment/submit emails. HTH, Itamar From aliguori at us.ibm.com Wed Feb 8 22:41:32 2012 From: aliguori at us.ibm.com (Anthony Liguori) Date: Wed, 08 Feb 2012 16:41:32 -0600 Subject: How to effectively lurk oVirt projects with Gerrit? In-Reply-To: <08c301cce6b1$6c6e7580$454b6080$@com> References: <4F32F1B2.6050501@us.ibm.com> <08c301cce6b1$6c6e7580$454b6080$@com> Message-ID: <4F32FA1C.9000507@us.ibm.com> On 02/08/2012 04:34 PM, Itamar Heim wrote: All emails should go to the -patches lists, these would include initial > patch email, comments and commit - but yes, they do not include the > payload. > It doesn't include the inline, since we had some issues contributing that > code to gerrit upstream (well, their gerrit was down, then some CLA issues > - hopefully all resolved by now). Ah, okay. Now that I look more closely, I guess I can see that you're right re: everything getting posted. But it's fairly difficult to see because the subject is the same and the typical style of inline replying isn't used. Perhaps the patch you spoke of fixes this. >> >> In short, how can one be a silent observer via a mailing list and get a > good >> feeling for what's going on in the project? >> >> Is there a way to make Gerrit send the actual patches to the various > mailing >> lists as they're posted for review (along with review comments to those > patches)? > > As I mentioned above, we saw this gap as well, and (Gal Hammer) wrote a > patch for gerrit to close it. > But I did not want to build a forked version of our gerrit until these > patches are accepted into upstream gerrit. > Once they do, either will wait for the next version, or just backport this > specific patch to the version we have. I'm glad to hear this! I'll wait until that gets deployed I guess. > Btw, you can also register in gerrit only to specific project and get the > emails directly, rather than via the mailing list. I'm happy using a mailing list. Thanks for the quick response. Regards, Anthony Liguori > (login, settings, watched projects, choose per project (or even branch) > granularity to watch new/comment/submit emails. 
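(For readers who prefer pulling to being mailed, the same watch information is also reachable from Gerrit's SSH interface once an account exists. A minimal sketch, assuming Gerrit's default SSH port 29418 on gerrit.ovirt.org; USERNAME is a placeholder:

    # list the ten most recent open changes on a project
    ssh -p 29418 USERNAME@gerrit.ovirt.org gerrit query --format=TEXT status:open project:vdsm limit:10
    # follow new patch sets, comments and merges as a live JSON feed
    ssh -p 29418 USERNAME@gerrit.ovirt.org gerrit stream-events

stream-events requires the server to grant that capability; both commands are stock Gerrit, but the host and port here are assumptions about the oVirt installation.)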
> > HTH, > Itamar > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From iheim at redhat.com Wed Feb 8 22:46:48 2012 From: iheim at redhat.com (Itamar Heim) Date: Wed, 08 Feb 2012 17:46:48 -0500 (EST) Subject: oVirt's Next Release In-Reply-To: <7d2c39b3-17f9-4345-a19f-561fac119b53@zmail14.collab.prod.int.phx2.redhat.com> References: <7d2c39b3-17f9-4345-a19f-561fac119b53@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <08d501cce6b3$2c768100$85638300$@com> > -----Original Message----- > From: arch-bounces at ovirt.org [mailto:arch-bounces at ovirt.org] On Behalf Of Ofer Schreiber > Sent: Wednesday, February 08, 2012 18:05 PM > To: arch at ovirt.org > Subject: oVirt's Next Release > > The First release of oVirt is (almost) out, so we should talk about the next release. > > Please share your thoughts about: > 1. Release Date > 2. Release Criteria (affecting #1) > > Hopefully, we will have a formal vote about those suggestion in the next weekly meeting. iirc, we discussed timely releases every 6 months, with an exception for the first one of 3 months. So I think a may-ish release would make sense. Maybe increase the quality bar a bit by feature freezing master branch for 2 or 4 weeks, then branching and cherry-picking show-stoppers. (concept of freezing master branch can be decided later. If not, it means more cherry-picking for bug fixes to version branch. I think a 2-4 weeks of feature freeze after 2 months of development is reasonable even on master branch). From oschreib at redhat.com Thu Feb 9 08:44:17 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Thu, 09 Feb 2012 03:44:17 -0500 (EST) Subject: oVirt's Next Release In-Reply-To: <08d501cce6b3$2c768100$85638300$@com> Message-ID: <18b5797a-c95f-43c6-aced-1f9540328333@zmail14.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > > > > -----Original Message----- > > From: arch-bounces at ovirt.org [mailto:arch-bounces at ovirt.org] On > > Behalf > Of Ofer Schreiber > > Sent: Wednesday, February 08, 2012 18:05 PM > > To: arch at ovirt.org > > Subject: oVirt's Next Release > > > > The First release of oVirt is (almost) out, so we should talk about > > the > next release. > > > > Please share your thoughts about: > > 1. Release Date > > 2. Release Criteria (affecting #1) > > > > Hopefully, we will have a formal vote about those suggestion in the > > next > weekly meeting. > > iirc, we discussed timely releases every 6 months, with an exception > for > the first one of 3 months. > So I think a may-ish release would make sense. > Maybe increase the quality bar a bit by feature freezing master > branch for > 2 or 4 weeks, then branching and cherry-picking show-stoppers. > (concept of freezing master branch can be decided later. If not, it > means > more cherry-picking for bug fixes to version branch. I think a 2-4 > weeks > of feature freeze after 2 months of development is reasonable even on > master branch). +1 on May'ish release. The early branching/freeze is fine, but IMO the release criteria should contain stuff like working snapshots, fencing, migration. etc. > > > From iheim at redhat.com Thu Feb 9 10:55:38 2012 From: iheim at redhat.com (Itamar Heim) Date: Thu, 09 Feb 2012 12:55:38 +0200 Subject: Adding gluster support Message-ID: <4F33A62A.20307@redhat.com> Hi, The following wiki describes the approach i'm suggesting for adding gluster support in phases to ovirt. 
http://www.ovirt.org/wiki/AddingGlusterSupportToOvirt comments welcome. Thanks, Itamar From mburns at redhat.com Thu Feb 9 12:44:24 2012 From: mburns at redhat.com (Mike Burns) Date: Thu, 09 Feb 2012 07:44:24 -0500 Subject: oVirt's Next Release In-Reply-To: <18b5797a-c95f-43c6-aced-1f9540328333@zmail14.collab.prod.int.phx2.redhat.com> References: <18b5797a-c95f-43c6-aced-1f9540328333@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <1328791464.2553.32.camel@beelzebub.mburnsfire.net> On Thu, 2012-02-09 at 03:44 -0500, Ofer Schreiber wrote: > > ----- Original Message ----- > > > > > > > -----Original Message----- > > > From: arch-bounces at ovirt.org [mailto:arch-bounces at ovirt.org] On > > > Behalf > > Of Ofer Schreiber > > > Sent: Wednesday, February 08, 2012 18:05 PM > > > To: arch at ovirt.org > > > Subject: oVirt's Next Release > > > > > > The First release of oVirt is (almost) out, so we should talk about > > > the > > next release. > > > > > > Please share your thoughts about: > > > 1. Release Date > > > 2. Release Criteria (affecting #1) > > > > > > Hopefully, we will have a formal vote about those suggestion in the > > > next > > weekly meeting. > > > > iirc, we discussed timely releases every 6 months, with an exception > > for > > the first one of 3 months. > > So I think a may-ish release would make sense. > > Maybe increase the quality bar a bit by feature freezing master > > branch for > > 2 or 4 weeks, then branching and cherry-picking show-stoppers. > > (concept of freezing master branch can be decided later. If not, it > > means > > more cherry-picking for bug fixes to version branch. I think a 2-4 > > weeks > > of feature freeze after 2 months of development is reasonable even on > > master branch). > > > +1 on May'ish release. > > The early branching/freeze is fine, but IMO the release criteria should contain stuff like working snapshots, fencing, migration. etc. How coordinated do we need to be w.r.t. releases? ovirt-node will probably have some intermediate release(s) if we go with a May release, but I don't think that should be a problem as long as we have an ISO with the right components available when the oVirt Project releases. Currently, Node is looking at small incremental releases in February (2.2.3) and March (2.3.0). I can see finishing up another release (2.3.1/2.4.0) toward the end of April or in May which could fall into the right time frame for the overall release, but haven't gotten that far out yet. Mike > > > > > > > > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From pmyers at redhat.com Thu Feb 9 16:45:09 2012 From: pmyers at redhat.com (Perry Myers) Date: Thu, 09 Feb 2012 11:45:09 -0500 Subject: Some thoughts on enhancing High Availability in oVirt Message-ID: <4F33F815.7040905@redhat.com> warning: tl;dr Right now, HA in oVirt is limited to VM level granularity. Each VM provides a heartbeat through vdsm back to the oVirt Engine. If that heartbeat is lost, the VM is terminated and (if the user has configured it) the VM is relaunched. If the host running that VM has lost its heartbeat, the host is fenced (via a remote power operation) and all HA VMs are restarted on an alternate host. Also, the policies for controlling if/when a VM should be restarted are somewhat limited and hardcoded. So there are two things that we can improve here: 1. Provide introspection into VMs so that we can monitor the health of individual services and not just the VM 2. 
Provide a more configurable way of expressing policy for when a VM (and its services) should trigger remediation by the HA subsystem We can tackle these two things in isolation, or we can try to combine and solve them at the same time. Some possible paths (not the only ones) might be: * Leverage Pacemaker Cloud (http://pacemaker-cloud.org/) Pacemaker Cloud works by providing a generic (read: virt mgmt system agnostic) way of managing HA for virtual machines and their services. At a high level the concept is that you define 1 or more virtual machines to be in a application group, and pcmk-cloud spawns a process to monitor that application group using either Matahari/QMF or direct SSH access. pcmk-cloud is not meant to be a user facing component, so integration work would need to be done here to have oVirt consume the pcmk-cloud REST API for specifying what the application groups (sets of VMs) are and exposing that through the oVirt web UI. pcmk-cloud at a high level has the following functions: + monitoring of services through Matahari/QMF/SSH + monitoring of VMs through Matahari/QMF/SSH/Deltacloud + control of services through Matahari/QMF/SSH + control of VMs through Deltacloud or the native provider (in this case oVirt Engine REST API) + policy engine/model (per application group) to make decisions about when to control services/VMs based on the monitoring input Integration decisions: + pcmk-cloud to use existing transports for monitoring/control (QMF/SSH) or do we leverage a new transport via vdsm/ovirt-guest- agent? + pcmk-cloud could act as the core policy engine to determine VM placement in the oVirt datacenter/clusters or it could be used solely for the monitoring/remediation aspect * Leverage guest monitoring agents w/ ovirt-guest-agent This would be taking the Services Agent from Matahari (which is just a C library) and utilizing it from the ovirt-guest-agent. So oga would setup recurring monitoring of services using this lib and use its existing communication path with vdsm->oVirt Engine to report back service events. In turn, oVirt Engine would need to interpret these events and then issue service control actions back to oga Conceptually this is very similar to using pcmk-cloud in the case where pcmk-cloud utilizes information obtained through oga/vdsm through oVirt Engine instead of communicating directly to Guests via QMF/SSH. In fact, taking this route would probably end up duplicating some effort because effectively you'd need the pcmk-cloud concept of the Cloud Application Policy Engine (formerly called DPE/Deployable Policy Engine) built directly into oVirt Engine anyhow. So part of looking at this is determining how much reuse/integration of existing components makes sense vs. just re-implementing similar concepts. I've cc'd folks from the HA community/pcmk-cloud and hopefully we can have a bit of a discussion to determine the best path forward here. Perry From iheim at redhat.com Thu Feb 9 16:49:29 2012 From: iheim at redhat.com (Itamar Heim) Date: Thu, 09 Feb 2012 18:49:29 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F33F815.7040905@redhat.com> References: <4F33F815.7040905@redhat.com> Message-ID: <4F33F919.7050206@redhat.com> On 02/09/2012 06:45 PM, Perry Myers wrote: > warning: tl;dr > > Right now, HA in oVirt is limited to VM level granularity. Each VM > provides a heartbeat through vdsm back to the oVirt Engine. If that > heartbeat is lost, the VM is terminated and (if the user has configured > it) the VM is relaunched. 
If the host running that VM has lost its > heartbeat, the host is fenced (via a remote power operation) and all HA > VMs are restarted on an alternate host. > > Also, the policies for controlling if/when a VM should be restarted are > somewhat limited and hardcoded. > > So there are two things that we can improve here: > > 1. Provide introspection into VMs so that we can monitor the health of > individual services and not just the VM > > 2. Provide a more configurable way of expressing policy for when a VM > (and its services) should trigger remediation by the HA subsystem > > We can tackle these two things in isolation, or we can try to combine > and solve them at the same time. > > Some possible paths (not the only ones) might be: > > > * Leverage Pacemaker Cloud (http://pacemaker-cloud.org/) > > Pacemaker Cloud works by providing a generic (read: virt mgmt system > agnostic) way of managing HA for virtual machines and their services. > At a high level the concept is that you define 1 or more virtual > machines to be in a application group, and pcmk-cloud spawns a process > to monitor that application group using either Matahari/QMF or direct > SSH access. > > pcmk-cloud is not meant to be a user facing component, so integration > work would need to be done here to have oVirt consume the pcmk-cloud > REST API for specifying what the application groups (sets of VMs) are > and exposing that through the oVirt web UI. > > pcmk-cloud at a high level has the following functions: > + monitoring of services through Matahari/QMF/SSH > + monitoring of VMs through Matahari/QMF/SSH/Deltacloud > + control of services through Matahari/QMF/SSH > + control of VMs through Deltacloud or the native provider (in this > case oVirt Engine REST API) > + policy engine/model (per application group) to make decisions about > when to control services/VMs based on the monitoring input > > Integration decisions: > + pcmk-cloud to use existing transports for monitoring/control > (QMF/SSH) or do we leverage a new transport via vdsm/ovirt-guest- > agent? > + pcmk-cloud could act as the core policy engine to determine VM > placement in the oVirt datacenter/clusters or it could be used > solely for the monitoring/remediation aspect > > > * Leverage guest monitoring agents w/ ovirt-guest-agent > > This would be taking the Services Agent from Matahari (which is just a C > library) and utilizing it from the ovirt-guest-agent. So oga would > setup recurring monitoring of services using this lib and use its > existing communication path with vdsm->oVirt Engine to report back > service events. In turn, oVirt Engine would need to interpret these > events and then issue service control actions back to oga > > Conceptually this is very similar to using pcmk-cloud in the case where > pcmk-cloud utilizes information obtained through oga/vdsm through oVirt > Engine instead of communicating directly to Guests via QMF/SSH. In > fact, taking this route would probably end up duplicating some effort > because effectively you'd need the pcmk-cloud concept of the Cloud > Application Policy Engine (formerly called DPE/Deployable Policy Engine) > built directly into oVirt Engine anyhow. I think we first need to look at the larger question of policy engine at ovirt-engine. the two main candidates are pacemaker and drools (jboss rules). pacemaker for having logic in the area. drools for having easier java integration and integrated UI to create policies by users. 
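(Whichever engine is chosen, the configurable piece described in item 2 above boils down to evaluating per-service rules against monitoring input. A minimal Python sketch of such a data-driven policy check -- the service names, thresholds and action names are illustrative assumptions, not an existing oVirt, Pacemaker or Drools API:

    # Illustrative only: per-service remediation rules of the kind a policy
    # engine would evaluate for an HA application group.
    POLICY = {
        "httpd":    {"max_restarts": 3, "window_sec": 600, "escalate": "restart_vm"},
        "postgres": {"max_restarts": 1, "window_sec": 300, "escalate": "migrate_vm"},
    }

    def remediate(service, failure_times, now):
        """Return the action to take after a service failure.

        failure_times -- timestamps (seconds) of previous failures of this service
        now           -- current timestamp (seconds)
        """
        rule = POLICY.get(service)
        if rule is None:
            return "ignore"
        recent = [t for t in failure_times if now - t <= rule["window_sec"]]
        if len(recent) < rule["max_restarts"]:
            return "restart_service"   # try an in-guest restart first
        return rule["escalate"]        # repeated failures: escalate to the VM level

A real engine would add the transport (QMF/SSH or vdsm/ovirt-guest-agent) and fencing semantics on top; the point is only that the rules stay data, editable per application group.)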
> > So part of looking at this is determining how much reuse/integration of > existing components makes sense vs. just re-implementing similar concepts. > > I've cc'd folks from the HA community/pcmk-cloud and hopefully we can > have a bit of a discussion to determine the best path forward here. > > Perry From pmyers at redhat.com Thu Feb 9 16:55:00 2012 From: pmyers at redhat.com (Perry Myers) Date: Thu, 09 Feb 2012 11:55:00 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F33F919.7050206@redhat.com> References: <4F33F815.7040905@redhat.com> <4F33F919.7050206@redhat.com> Message-ID: <4F33FA64.6000101@redhat.com> > I think we first need to look at the larger question of policy engine at > ovirt-engine. the two main candidates are pacemaker and drools (jboss > rules). > pacemaker for having logic in the area. > drools for having easier java integration and integrated UI to create > policies by users. Agreed, as I mentioned in my email they're interrelated i.e. if you're going to use Pacemaker's policy engine then it absolutely makes sense to just go with Pacemaker Cloud, since that's precisely what it does (uses the core Pacemaker PE) OTOH, if you decide to use drools, then it may make more sense to integrate the HA concepts directly into the drools PE and then the only other thing you can leverage would be the library that does the monitoring of services at the end points. From mburns at redhat.com Thu Feb 9 22:00:54 2012 From: mburns at redhat.com (Mike Burns) Date: Thu, 09 Feb 2012 17:00:54 -0500 Subject: Repo files Message-ID: <1328824854.2553.69.camel@beelzebub.mburnsfire.net> Currently, we are providing 2 repo files, both named ovirt-engine.repo. One is located in the nightly area of the repository, and the other in the stable area. Would it make more sense to: 1. Have a single repo file containing an entry for both stable and nightly with just stable enabled by default? 2. Provide this in some form of ovirt-release RPM? Mike From sgordon at redhat.com Thu Feb 9 22:09:35 2012 From: sgordon at redhat.com (Steve Gordon) Date: Thu, 09 Feb 2012 17:09:35 -0500 (EST) Subject: Repo files In-Reply-To: <1328824854.2553.69.camel@beelzebub.mburnsfire.net> Message-ID: <187ab29d-96d2-471c-9d3b-08b0718e9c13@zmail15.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > From: "Mike Burns" > To: arch at ovirt.org > Sent: Thursday, February 9, 2012 5:00:54 PM > Subject: Repo files > > Currently, we are providing 2 repo files, both named > ovirt-engine.repo. > One is located in the nightly area of the repository, and the other > in > the stable area. > > Would it make more sense to: > > 1. Have a single repo file containing an entry for both stable and > nightly with just stable enabled by default? > 2. Provide this in some form of ovirt-release RPM? > This was brought up previously and my belief was (and is) that yes we should do this. At the time the repo and spec file I made with assistance from Karsten were put here: http://www.ovirt.org/wiki/Yum_repo_file A few minor changes are required (for instance I don't believe we include the arch in the path as originally suggested nor do we have SRPM directories yet) but I still think for the most part this holds. In particular I think using a repo file distributed in an RPM is very useful because: - We can update the repo file and bump the NVR of the RPM as required and have yum pick it up as required, rather than putting it on the website and hoping users see the note to grab an updated one. 
- One repo file to rule them all, this configuration supports both nightly and stable from one file (ideally) in a consistent location. - Will make life much easier when we want to add a GPG key and start signing packages. Steve From pmyers at redhat.com Thu Feb 9 22:33:48 2012 From: pmyers at redhat.com (Perry Myers) Date: Thu, 09 Feb 2012 17:33:48 -0500 Subject: Repo files In-Reply-To: <187ab29d-96d2-471c-9d3b-08b0718e9c13@zmail15.collab.prod.int.phx2.redhat.com> References: <187ab29d-96d2-471c-9d3b-08b0718e9c13@zmail15.collab.prod.int.phx2.redhat.com> Message-ID: <4F3449CC.8020008@redhat.com> On 02/09/2012 05:09 PM, Steve Gordon wrote: > > > ----- Original Message ----- >> From: "Mike Burns" >> To: arch at ovirt.org >> Sent: Thursday, February 9, 2012 5:00:54 PM >> Subject: Repo files >> >> Currently, we are providing 2 repo files, both named >> ovirt-engine.repo. >> One is located in the nightly area of the repository, and the other >> in >> the stable area. >> >> Would it make more sense to: >> >> 1. Have a single repo file containing an entry for both stable and >> nightly with just stable enabled by default? >> 2. Provide this in some form of ovirt-release RPM? >> > > This was brought up previously and my belief was (and is) that yes we should do this. At the time the repo and spec file I made with assistance from Karsten were put here: > > http://www.ovirt.org/wiki/Yum_repo_file > > A few minor changes are required (for instance I don't believe we include the arch in the path as originally suggested nor do we have SRPM directories yet) but I still think for the most part this holds. In particular I think using a repo file distributed in an RPM is very useful because: > > - We can update the repo file and bump the NVR of the RPM as required and have yum pick it up as required, rather than putting it on the website and hoping users see the note to grab an updated one. > - One repo file to rule them all, this configuration supports both nightly and stable from one file (ideally) in a consistent location. > - Will make life much easier when we want to add a GPG key and start signing packages. +1 to Steve's proposal From abeekhof at redhat.com Thu Feb 9 23:26:01 2012 From: abeekhof at redhat.com (Andrew Beekhof) Date: Fri, 10 Feb 2012 10:26:01 +1100 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F33F919.7050206@redhat.com> References: <4F33F815.7040905@redhat.com> <4F33F919.7050206@redhat.com> Message-ID: <4F345609.6000701@redhat.com> Resending as a link instead of attachment since I blew past the mailing list's size limit. On 10/02/12 3:49 AM, Itamar Heim wrote: > I think we first need to look at the larger question of policy engine at > ovirt-engine. the two main candidates are pacemaker and drools (jboss > rules). > pacemaker for having logic in the area. > drools for having easier java integration and integrated UI to create > policies by users. Linking to a previous PE/Drools comparison I did. Should be reasonably current. tl;dr - the Policy Engine is totally awesome /and/ makes coffee! http://dl.dropbox.com/u/363965/RHEV%20Rules%20Engine%20Analysis.pdf -- Andrew From kroberts at redhat.com Fri Feb 10 14:42:00 2012 From: kroberts at redhat.com (Keith Robertson) Date: Fri, 10 Feb 2012 09:42:00 -0500 Subject: New oVirt GIT Repo Request Message-ID: <4F352CB8.8060006@redhat.com> All, I would like to move some of the oVirt tools into their own GIT repos so that they are easier to manage/maintain. 
In particular, I would like to move the ovirt-log-collector, ovirt-iso-uploader, and ovirt-image-uploader each into their own GIT repos. The Plan: Step 1: Create naked GIT repos on oVirt.org for the 3 tools. Step 2: Link git repos to gerrit. Step 3: Populate naked GIT repos with source and build standalone spec files for each. Step 4: In one patch do both a) and b)... a) Update oVirt manager GIT repo by removing tool source. b) Update oVirt manager GIT repo such that spec has dependencies on 3 new RPMs. Optional: - These three tools share some python classes that are very similar. I would like to create a GIT repo (perhaps ovirt-tools-common) to contain these classes so that a fix in one place will fix the issue everywhere. Perhaps we can also create a naked GIT repo for these common classes while addressing the primary concerns above. Please comment, Keith Robertson From ppinatti at linux.vnet.ibm.com Fri Feb 10 19:33:44 2012 From: ppinatti at linux.vnet.ibm.com (Paulo de Rezende Pinatti) Date: Fri, 10 Feb 2012 17:33:44 -0200 Subject: New oVirt GIT Repo Request In-Reply-To: <4F352CB8.8060006@redhat.com> References: <4F352CB8.8060006@redhat.com> Message-ID: <4F357118.6050204@linux.vnet.ibm.com> Hi, a suggestion for the plan: you could use git filter-branch with the --subdirectory-filter option for creating the repos at step 1. That way it will keep commit history of the files being moved in the new repos. Paulo de Rezende Pinatti Staff Software Engineer IBM Linux Technology Center On 02/10/2012 12:42 PM, Keith Robertson wrote: > All, > > I would like to move some of the oVirt tools into their own GIT repos > so that they are easier to manage/maintain. In particular, I would > like to move the ovirt-log-collector, ovirt-iso-uploader, and > ovirt-image-uploader each into their own GIT repos. > > The Plan: > Step 1: Create naked GIT repos on oVirt.org for the 3 tools. > Step 2: Link git repos to gerrit. > Step 3: Populate naked GIT repos with source and build standalone spec > files for each. > Step 4: In one patch do both a) and b)... > a) Update oVirt manager GIT repo by removing tool source. > b) Update oVirt manager GIT repo such that spec has dependencies on 3 > new RPMs. > > Optional: > - These three tools share some python classes that are very similar. > I would like to create a GIT repo (perhaps ovirt-tools-common) to > contain these classes so that a fix in one place will fix the issue > everywhere. Perhaps we can also create a naked GIT repo for these > common classes while addressing the primary concerns above. > > Please comment, > Keith Robertson > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From kwade at redhat.com Sat Feb 11 02:36:09 2012 From: kwade at redhat.com (Karsten 'quaid' Wade) Date: Fri, 10 Feb 2012 18:36:09 -0800 Subject: Are you attending the next workshop? Message-ID: <4F35D419.4010809@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 If you plan on going to the 21 March workshop in Beijing, please send in email to rsvp at ovirt.org soonest. If you aren't sure you are attending but want to, go ahead and email rsvp at ovirt.org anyway so we have your spot held for you. Every attendee is requested to send an individual email to rsvp at ovirt.org. This means if your manager is sending you, you still need to email us to get on the list. Thank you, - - Karsten - -- name: Karsten 'quaid' Wade, Sr. 
Community Architect team: Red Hat Community Architecture & Leadership uri: http://communityleadershipteam.org http://TheOpenSourceWay.org gpg: AD0E0C41 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFPNdQZ2ZIOBq0ODEERAjrKAJ9yomnT1qVvHFR+CXhjxFMy6HJMNQCfYLA2 9s9Vipv0Y2octBcTEevwpQw= =Q1Gr -----END PGP SIGNATURE----- From kwade at redhat.com Sat Feb 11 02:36:41 2012 From: kwade at redhat.com (Karsten 'quaid' Wade) Date: Fri, 10 Feb 2012 18:36:41 -0800 Subject: Are you attending the next workshop? Message-ID: <4F35D439.90401@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 If you plan on going to the 21 March workshop in Beijing, please send in email to rsvp at ovirt.org soonest. If you aren't sure you are attending but want to, go ahead and email rsvp at ovirt.org anyway so we have your spot held for you. Every attendee is requested to send an individual email to rsvp at ovirt.org. This means if your manager is sending you, you still need to email us to get on the list. Thank you, - - Karsten - -- name: Karsten 'quaid' Wade, Sr. Community Architect team: Red Hat Community Architecture & Leadership uri: http://communityleadershipteam.org http://TheOpenSourceWay.org gpg: AD0E0C41 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFPNdQ52ZIOBq0ODEERAnicAJ90O69pgyq7ATkMdEh7yGOvA/mQVQCfY06Y 3+ZBWDGKO2r6cBNzQ/I4O3M= =c3gB -----END PGP SIGNATURE----- From oschreib at redhat.com Sat Feb 11 08:37:14 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Sat, 11 Feb 2012 03:37:14 -0500 (EST) Subject: Repo files In-Reply-To: <4F3449CC.8020008@redhat.com> References: <187ab29d-96d2-471c-9d3b-08b0718e9c13@zmail15.collab.prod.int.phx2.redhat.com> <4F3449CC.8020008@redhat.com> Message-ID: <5642EADE-87A3-4D85-BB29-AE3C3D55E493@redhat.com> On 10 Feb 2012, at 00:33, Perry Myers wrote: > On 02/09/2012 05:09 PM, Steve Gordon wrote: >> >> >> ----- Original Message ----- >>> From: "Mike Burns" >>> To: arch at ovirt.org >>> Sent: Thursday, February 9, 2012 5:00:54 PM >>> Subject: Repo files >>> >>> Currently, we are providing 2 repo files, both named >>> ovirt-engine.repo. >>> One is located in the nightly area of the repository, and the other >>> in >>> the stable area. >>> >>> Would it make more sense to: >>> >>> 1. Have a single repo file containing an entry for both stable and >>> nightly with just stable enabled by default? >>> 2. Provide this in some form of ovirt-release RPM? >>> >> >> This was brought up previously and my belief was (and is) that yes we should do this. At the time the repo and spec file I made with assistance from Karsten were put here: >> >> http://www.ovirt.org/wiki/Yum_repo_file >> >> A few minor changes are required (for instance I don't believe we include the arch in the path as originally suggested nor do we have SRPM directories yet) but I still think for the most part this holds. In particular I think using a repo file distributed in an RPM is very useful because: >> >> - We can update the repo file and bump the NVR of the RPM as required and have yum pick it up as required, rather than putting it on the website and hoping users see the note to grab an updated one. >> - One repo file to rule them all, this configuration supports both nightly and stable from one file (ideally) in a consistent location. >> - Will make life much easier when we want to add a GPG key and start signing packages. 
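(To make the proposal concrete, a single combined file along these lines would cover both points from Mike's original mail -- the baseurl paths are assumptions about the ovirt.org layout, not the final URLs:

    # /etc/yum.repos.d/ovirt.repo, shipped by an ovirt-release RPM
    [ovirt-stable]
    name=oVirt stable releases
    baseurl=http://ovirt.org/releases/stable/rpm/Fedora/$releasever/
    enabled=1
    gpgcheck=0

    [ovirt-nightly]
    name=oVirt nightly builds
    baseurl=http://ovirt.org/releases/nightly/rpm/Fedora/$releasever/
    enabled=0
    gpgcheck=0

gpgcheck stays 0 only until packages are signed, at which point the same ovirt-release RPM can flip it and carry the key.)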
> > +1 to Steve's proposal +1 here as well. Feel free to combine the repo files, sounds better than the current configuration. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch -------------- next part -------------- An HTML attachment was scrubbed... URL: From oschreib at redhat.com Sat Feb 11 08:48:54 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Sat, 11 Feb 2012 03:48:54 -0500 (EST) Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F352CB8.8060006@redhat.com> References: <4F352CB8.8060006@redhat.com> Message-ID: <8FF5A0E4-AE69-4F19-87A9-2BEEE70DD78D@redhat.com> On 10 Feb 2012, at 16:42, Keith Robertson wrote: > All, > > I would like to move some of the oVirt tools into their own GIT repos so that they are easier to manage/maintain. In particular, I would like to move the ovirt-log-collector, ovirt-iso-uploader, and ovirt-image-uploader each into their own GIT repos. > > The Plan: > Step 1: Create naked GIT repos on oVirt.org for the 3 tools. > Step 2: Link git repos to gerrit. > Step 3: Populate naked GIT repos with source and build standalone spec files for each. > Step 4: In one patch do both a) and b)... > a) Update oVirt manager GIT repo by removing tool source. > b) Update oVirt manager GIT repo such that spec has dependencies on 3 new RPMs. > > Optional: > - These three tools share some python classes that are very similar. I would like to create a GIT repo (perhaps ovirt-tools-common) to contain these classes so that a fix in one place will fix the issue everywhere. Perhaps we can also create a naked GIT repo for these common classes while addressing the primary concerns above. +1 on the entire suggestion. about the common stuff- will this package be obsolete once the tools will be base on the sdk? > > Please comment, > Keith Robertson > _______________________________________________ > Engine-devel mailing list > Engine-devel at ovirt.org > http://lists.ovirt.org/mailman/listinfo/engine-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From kroberts at redhat.com Sat Feb 11 13:43:36 2012 From: kroberts at redhat.com (Keith Robertson) Date: Sat, 11 Feb 2012 08:43:36 -0500 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <8FF5A0E4-AE69-4F19-87A9-2BEEE70DD78D@redhat.com> References: <4F352CB8.8060006@redhat.com> <8FF5A0E4-AE69-4F19-87A9-2BEEE70DD78D@redhat.com> Message-ID: <4F367088.5010608@redhat.com> On 02/11/2012 03:48 AM, Ofer Schreiber wrote: > > On 10 Feb 2012, at 16:42, Keith Robertson > wrote: > >> All, >> >> I would like to move some of the oVirt tools into their own GIT repos >> so that they are easier to manage/maintain. In particular, I would >> like to move the ovirt-log-collector, ovirt-iso-uploader, and >> ovirt-image-uploader each into their own GIT repos. >> >> The Plan: >> Step 1: Create naked GIT repos on oVirt.org for >> the 3 tools. >> Step 2: Link git repos to gerrit. >> Step 3: Populate naked GIT repos with source and build standalone >> spec files for each. >> Step 4: In one patch do both a) and b)... >> a) Update oVirt manager GIT repo by removing tool source. >> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >> new RPMs. >> >> Optional: >> - These three tools share some python classes that are very similar. >> I would like to create a GIT repo (perhaps ovirt-tools-common) to >> contain these classes so that a fix in one place will fix the issue >> everywhere. 
Perhaps we can also create a naked GIT repo for these >> common classes while addressing the primary concerns above. > > +1 on the entire suggestion. > about the common stuff- will this package be obsolete once the tools > will be base on the sdk? No. The SDK is different it provides a common mechanism for accessing the REST API. Whereas, the common tools repo is more geared to the tooling (e.g. common classes for logging, option parsing, etc.). It would look like this... [Common Tools] [REST SDK] \ / [image-uploader, iso-uploader, log-collector] Cheers, Keith > >> >> Please comment, >> Keith Robertson >> _______________________________________________ >> Engine-devel mailing list >> Engine-devel at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/engine-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From kroberts at redhat.com Sat Feb 11 13:44:58 2012 From: kroberts at redhat.com (Keith Robertson) Date: Sat, 11 Feb 2012 08:44:58 -0500 Subject: New oVirt GIT Repo Request In-Reply-To: <4F357118.6050204@linux.vnet.ibm.com> References: <4F352CB8.8060006@redhat.com> <4F357118.6050204@linux.vnet.ibm.com> Message-ID: <4F3670DA.5010502@redhat.com> On 02/10/2012 02:33 PM, Paulo de Rezende Pinatti wrote: > Hi, > > a suggestion for the plan: you could use git filter-branch with the > --subdirectory-filter option for creating the repos at step 1. That > way it will keep commit history of the files being moved in the new > repos. No argument from me here. I'd like to keep my history. > > > Paulo de Rezende Pinatti > Staff Software Engineer > IBM Linux Technology Center > > > On 02/10/2012 12:42 PM, Keith Robertson wrote: >> All, >> >> I would like to move some of the oVirt tools into their own GIT repos >> so that they are easier to manage/maintain. In particular, I would >> like to move the ovirt-log-collector, ovirt-iso-uploader, and >> ovirt-image-uploader each into their own GIT repos. >> >> The Plan: >> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >> Step 2: Link git repos to gerrit. >> Step 3: Populate naked GIT repos with source and build standalone >> spec files for each. >> Step 4: In one patch do both a) and b)... >> a) Update oVirt manager GIT repo by removing tool source. >> b) Update oVirt manager GIT repo such that spec has dependencies on >> 3 new RPMs. >> >> Optional: >> - These three tools share some python classes that are very similar. >> I would like to create a GIT repo (perhaps ovirt-tools-common) to >> contain these classes so that a fix in one place will fix the issue >> everywhere. Perhaps we can also create a naked GIT repo for these >> common classes while addressing the primary concerns above. >> >> Please comment, >> Keith Robertson >> _______________________________________________ >> Arch mailing list >> Arch at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/arch >> > From sgordon at redhat.com Sat Feb 11 17:07:06 2012 From: sgordon at redhat.com (Steve Gordon) Date: Sat, 11 Feb 2012 12:07:06 -0500 (EST) Subject: Repo files In-Reply-To: <5642EADE-87A3-4D85-BB29-AE3C3D55E493@redhat.com> Message-ID: > On 10 Feb 2012, at 00:33, Perry Myers wrote: > > > On 02/09/2012 05:09 PM, Steve Gordon wrote: > >> > >> > >> ----- Original Message ----- > >>> From: "Mike Burns" > >>> To: arch at ovirt.org > >>> Sent: Thursday, February 9, 2012 5:00:54 PM > >>> Subject: Repo files > >>> > >>> Currently, we are providing 2 repo files, both named > >>> ovirt-engine.repo. 
> >>> One is located in the nightly area of the repository, and the > >>> other > >>> in > >>> the stable area. > >>> > >>> Would it make more sense to: > >>> > >>> 1. Have a single repo file containing an entry for both stable > >>> and > >>> nightly with just stable enabled by default? > >>> 2. Provide this in some form of ovirt-release RPM? > >>> > >> > >> This was brought up previously and my belief was (and is) that yes > >> we should do this. At the time the repo and spec file I made with > >> assistance from Karsten were put here: > >> > >> http://www.ovirt.org/wiki/Yum_repo_file > >> > >> A few minor changes are required (for instance I don't believe we > >> include the arch in the path as originally suggested nor do we > >> have SRPM directories yet) but I still think for the most part > >> this holds. In particular I think using a repo file distributed > >> in an RPM is very useful because: > >> > >> - We can update the repo file and bump the NVR of the RPM as > >> required and have yum pick it up as required, rather than putting > >> it on the website and hoping users see the note to grab an > >> updated one. > >> - One repo file to rule them all, this configuration supports both > >> nightly and stable from one file (ideally) in a consistent > >> location. > >> - Will make life much easier when we want to add a GPG key and > >> start signing packages. > > > > +1 to Steve's proposal > > +1 here as well. > > Feel free to combine the repo files, sounds better than the current > configuration. > Will make the required changes on Monday and put up an RPM/SRPM. Steve From iheim at redhat.com Sat Feb 11 22:41:39 2012 From: iheim at redhat.com (Itamar Heim) Date: Sun, 12 Feb 2012 00:41:39 +0200 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F352CB8.8060006@redhat.com> References: <4F352CB8.8060006@redhat.com> Message-ID: <4F36EEA3.50006@redhat.com> On 02/10/2012 04:42 PM, Keith Robertson wrote: > All, > > I would like to move some of the oVirt tools into their own GIT repos so > that they are easier to manage/maintain. In particular, I would like to > move the ovirt-log-collector, ovirt-iso-uploader, and > ovirt-image-uploader each into their own GIT repos. > > The Plan: > Step 1: Create naked GIT repos on oVirt.org for the 3 tools. > Step 2: Link git repos to gerrit. above two are same step - create a project in gerrit. I'll do that if list doesn't have any objections by monday. > Step 3: Populate naked GIT repos with source and build standalone spec > files for each. > Step 4: In one patch do both a) and b)... > a) Update oVirt manager GIT repo by removing tool source. > b) Update oVirt manager GIT repo such that spec has dependencies on 3 > new RPMs. > > Optional: > - These three tools share some python classes that are very similar. I > would like to create a GIT repo (perhaps ovirt-tools-common) to contain > these classes so that a fix in one place will fix the issue everywhere. > Perhaps we can also create a naked GIT repo for these common classes > while addressing the primary concerns above. would this hold both python and java common code? 
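(For what it is worth, Paulo's history-preserving suggestion from earlier in the thread would make the repo population step look roughly like the following -- the subdirectory path, repository URLs and project name are assumptions for illustration:

    # start from a throwaway clone of the engine repository
    git clone git://gerrit.ovirt.org/ovirt-engine.git ovirt-log-collector
    cd ovirt-log-collector
    # keep only the tool's subtree, rewriting history so its commits survive
    git filter-branch --prune-empty --subdirectory-filter backend/manager/tools/log-collector -- --all
    # point at the new (empty) Gerrit project and push the result
    git remote set-url origin ssh://USERNAME@gerrit.ovirt.org:29418/ovirt-log-collector.git
    git push origin master

Repeating the same steps with the other two subdirectories would populate all three repositories with their existing history intact.)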
From iheim at redhat.com Sat Feb 11 22:44:40 2012 From: iheim at redhat.com (Itamar Heim) Date: Sun, 12 Feb 2012 00:44:40 +0200 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F36EEA3.50006@redhat.com> References: <4F352CB8.8060006@redhat.com> <4F36EEA3.50006@redhat.com> Message-ID: <4F36EF58.70702@redhat.com> On 02/12/2012 12:41 AM, Itamar Heim wrote: >> The Plan: >> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >> Step 2: Link git repos to gerrit. > > above two are same step - create a project in gerrit. > I'll do that if list doesn't have any objections by monday. > >> Step 3: Populate naked GIT repos with source and build standalone spec >> files for each. >> Step 4: In one patch do both a) and b)... >> a) Update oVirt manager GIT repo by removing tool source. >> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >> new RPMs. Also, run at list pylint jobs tracking changes to these new repos (cc'd eedri to coordinate with) From eedri at redhat.com Sun Feb 12 08:26:10 2012 From: eedri at redhat.com (Eyal Edri) Date: Sun, 12 Feb 2012 03:26:10 -0500 (EST) Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F36EF58.70702@redhat.com> Message-ID: ----- Original Message ----- > From: "Itamar Heim" > To: "Keith Robertson" > Cc: engine-devel at ovirt.org, arch at ovirt.org, "Eyal Edri" > Sent: Sunday, February 12, 2012 12:44:40 AM > Subject: Re: [Engine-devel] New oVirt GIT Repo Request > > On 02/12/2012 12:41 AM, Itamar Heim wrote: > >> The Plan: > >> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. > >> Step 2: Link git repos to gerrit. > > > > above two are same step - create a project in gerrit. > > I'll do that if list doesn't have any objections by monday. > > > >> Step 3: Populate naked GIT repos with source and build standalone > >> spec > >> files for each. > >> Step 4: In one patch do both a) and b)... > >> a) Update oVirt manager GIT repo by removing tool source. > >> b) Update oVirt manager GIT repo such that spec has dependencies > >> on 3 > >> new RPMs. > > Also, run at list pylint jobs tracking changes to these new repos > (cc'd > eedri to coordinate with) > It shouldn't be a problem if we want to test python code (pyflakes/pylint) via gerrit & jenkins. (we might want to use -E just for errors, otherwise we'll get a lot of warnings we can't handle). testing java (+maven) code will require moving to maven 3.0.X due to jenkins plugins backward compatible issues. Eyal. From kroberts at redhat.com Sun Feb 12 13:32:12 2012 From: kroberts at redhat.com (Keith Robertson) Date: Sun, 12 Feb 2012 08:32:12 -0500 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F36EEA3.50006@redhat.com> References: <4F352CB8.8060006@redhat.com> <4F36EEA3.50006@redhat.com> Message-ID: <4F37BF5C.20801@redhat.com> On 02/11/2012 05:41 PM, Itamar Heim wrote: > On 02/10/2012 04:42 PM, Keith Robertson wrote: >> All, >> >> I would like to move some of the oVirt tools into their own GIT repos so >> that they are easier to manage/maintain. In particular, I would like to >> move the ovirt-log-collector, ovirt-iso-uploader, and >> ovirt-image-uploader each into their own GIT repos. >> >> The Plan: >> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >> Step 2: Link git repos to gerrit. > > above two are same step - create a project in gerrit. > I'll do that if list doesn't have any objections by monday. Sure, np. > >> Step 3: Populate naked GIT repos with source and build standalone spec >> files for each. 
>> Step 4: In one patch do both a) and b)... >> a) Update oVirt manager GIT repo by removing tool source. >> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >> new RPMs. >> >> Optional: >> - These three tools share some python classes that are very similar. I >> would like to create a GIT repo (perhaps ovirt-tools-common) to contain >> these classes so that a fix in one place will fix the issue everywhere. >> Perhaps we can also create a naked GIT repo for these common classes >> while addressing the primary concerns above. > > would this hold both python and java common code? None of the 3 tools currently have any requirement for Java code, but I think the installer does. That said, I wouldn't have a problem mixing Java code in the "common" component as long as they're in separate package directories. If we do something like this do we want a "python" common RPM and a "java" common RPM or just a single RPM for all common code? I don't really have a preference. Perhaps: common/src/ common/src//com/ovirt/whatever From bazulay at redhat.com Mon Feb 13 12:31:37 2012 From: bazulay at redhat.com (Barak Azulay) Date: Mon, 13 Feb 2012 14:31:37 +0200 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F37BF5C.20801@redhat.com> References: <4F352CB8.8060006@redhat.com> <4F36EEA3.50006@redhat.com> <4F37BF5C.20801@redhat.com> Message-ID: <4F3902A9.2060405@redhat.com> On 02/12/2012 03:32 PM, Keith Robertson wrote: > On 02/11/2012 05:41 PM, Itamar Heim wrote: >> On 02/10/2012 04:42 PM, Keith Robertson wrote: >>> All, >>> >>> I would like to move some of the oVirt tools into their own GIT repos so >>> that they are easier to manage/maintain. In particular, I would like to >>> move the ovirt-log-collector, ovirt-iso-uploader, and >>> ovirt-image-uploader each into their own GIT repos. >>> >>> The Plan: >>> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >>> Step 2: Link git repos to gerrit. >> >> above two are same step - create a project in gerrit. >> I'll do that if list doesn't have any objections by monday. > Sure, np. >> >>> Step 3: Populate naked GIT repos with source and build standalone spec >>> files for each. >>> Step 4: In one patch do both a) and b)... >>> a) Update oVirt manager GIT repo by removing tool source. >>> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >>> new RPMs. >>> >>> Optional: >>> - These three tools share some python classes that are very similar. I >>> would like to create a GIT repo (perhaps ovirt-tools-common) to contain >>> these classes so that a fix in one place will fix the issue everywhere. >>> Perhaps we can also create a naked GIT repo for these common classes >>> while addressing the primary concerns above. >> >> would this hold both python and java common code? > > None of the 3 tools currently have any requirement for Java code, but I > think the installer does. That said, I wouldn't have a problem mixing > Java code in the "common" component as long as they're in separate > package directories. > > If we do something like this do we want a "python" common RPM and a > "java" common RPM or just a single RPM for all common code? I don't > really have a preference. I would go with separating the java common and python common, even if it's just to ease build/release issues. 
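(A build-level sketch of that separation, with one source repository producing two noarch subpackages -- all names and paths here are placeholders, not an agreed layout:

    Name:           ovirt-tools-common
    Version:        0.1
    Release:        1%{?dist}
    Summary:        Code shared by the oVirt engine tools
    License:        GPLv2+
    BuildArch:      noarch

    %package        -n ovirt-tools-common-python
    Summary:        Shared Python classes (logging, option parsing) for the tools
    Requires:       python

    %package        -n ovirt-tools-common-java
    Summary:        Shared Java classes for the engine-side tools
    Requires:       java

    %description
    Code shared by ovirt-log-collector, ovirt-iso-uploader and
    ovirt-image-uploader.
    # (per-subpackage %description and %files sections omitted from this sketch)

The tool packages would then simply Require the python subpackage and the Java consumers the java one, so either side can be rebuilt and released on its own.)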
> > Perhaps: > common/src/ > common/src//com/ovirt/whatever > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From dougsland at redhat.com Mon Feb 13 15:57:27 2012 From: dougsland at redhat.com (Douglas Landgraf) Date: Mon, 13 Feb 2012 10:57:27 -0500 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F3902A9.2060405@redhat.com> References: <4F352CB8.8060006@redhat.com> <4F36EEA3.50006@redhat.com> <4F37BF5C.20801@redhat.com> <4F3902A9.2060405@redhat.com> Message-ID: <4F3932E7.1010501@redhat.com> On 02/13/2012 07:31 AM, Barak Azulay wrote: > On 02/12/2012 03:32 PM, Keith Robertson wrote: >> On 02/11/2012 05:41 PM, Itamar Heim wrote: >>> On 02/10/2012 04:42 PM, Keith Robertson wrote: >>>> All, >>>> >>>> I would like to move some of the oVirt tools into their own GIT >>>> repos so >>>> that they are easier to manage/maintain. In particular, I would >>>> like to >>>> move the ovirt-log-collector, ovirt-iso-uploader, and >>>> ovirt-image-uploader each into their own GIT repos. >>>> >>>> The Plan: >>>> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >>>> Step 2: Link git repos to gerrit. >>> >>> above two are same step - create a project in gerrit. >>> I'll do that if list doesn't have any objections by monday. >> Sure, np. >>> >>>> Step 3: Populate naked GIT repos with source and build standalone spec >>>> files for each. >>>> Step 4: In one patch do both a) and b)... >>>> a) Update oVirt manager GIT repo by removing tool source. >>>> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >>>> new RPMs. >>>> >>>> Optional: >>>> - These three tools share some python classes that are very similar. I >>>> would like to create a GIT repo (perhaps ovirt-tools-common) to >>>> contain >>>> these classes so that a fix in one place will fix the issue >>>> everywhere. >>>> Perhaps we can also create a naked GIT repo for these common classes >>>> while addressing the primary concerns above. >>> >>> would this hold both python and java common code? >> >> None of the 3 tools currently have any requirement for Java code, but I >> think the installer does. That said, I wouldn't have a problem mixing >> Java code in the "common" component as long as they're in separate >> package directories. >> >> If we do something like this do we want a "python" common RPM and a >> "java" common RPM or just a single RPM for all common code? I don't >> really have a preference. > > I would go with separating the java common and python common, even if > it's just to ease build/release issues. > +1 and if needed one package be required to the other. -- Cheers Douglas From kroberts at redhat.com Mon Feb 13 13:20:40 2012 From: kroberts at redhat.com (Keith Robertson) Date: Mon, 13 Feb 2012 08:20:40 -0500 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F3932E7.1010501@redhat.com> References: <4F352CB8.8060006@redhat.com> <4F36EEA3.50006@redhat.com> <4F37BF5C.20801@redhat.com> <4F3902A9.2060405@redhat.com> <4F3932E7.1010501@redhat.com> Message-ID: <4F390E28.9060300@redhat.com> On 02/13/2012 10:57 AM, Douglas Landgraf wrote: > On 02/13/2012 07:31 AM, Barak Azulay wrote: >> On 02/12/2012 03:32 PM, Keith Robertson wrote: >>> On 02/11/2012 05:41 PM, Itamar Heim wrote: >>>> On 02/10/2012 04:42 PM, Keith Robertson wrote: >>>>> All, >>>>> >>>>> I would like to move some of the oVirt tools into their own GIT >>>>> repos so >>>>> that they are easier to manage/maintain. 
In particular, I would >>>>> like to >>>>> move the ovirt-log-collector, ovirt-iso-uploader, and >>>>> ovirt-image-uploader each into their own GIT repos. >>>>> >>>>> The Plan: >>>>> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >>>>> Step 2: Link git repos to gerrit. >>>> >>>> above two are same step - create a project in gerrit. >>>> I'll do that if list doesn't have any objections by monday. >>> Sure, np. >>>> >>>>> Step 3: Populate naked GIT repos with source and build standalone >>>>> spec >>>>> files for each. >>>>> Step 4: In one patch do both a) and b)... >>>>> a) Update oVirt manager GIT repo by removing tool source. >>>>> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >>>>> new RPMs. >>>>> >>>>> Optional: >>>>> - These three tools share some python classes that are very >>>>> similar. I >>>>> would like to create a GIT repo (perhaps ovirt-tools-common) to >>>>> contain >>>>> these classes so that a fix in one place will fix the issue >>>>> everywhere. >>>>> Perhaps we can also create a naked GIT repo for these common classes >>>>> while addressing the primary concerns above. >>>> >>>> would this hold both python and java common code? >>> >>> None of the 3 tools currently have any requirement for Java code, but I >>> think the installer does. That said, I wouldn't have a problem mixing >>> Java code in the "common" component as long as they're in separate >>> package directories. >>> >>> If we do something like this do we want a "python" common RPM and a >>> "java" common RPM or just a single RPM for all common code? I don't >>> really have a preference. >> >> I would go with separating the java common and python common, even if >> it's just to ease build/release issues. >> > +1 and if needed one package be required to the other. > Sounds like a plan. Full speed ahead. Cheers From kwade at redhat.com Tue Feb 14 01:05:36 2012 From: kwade at redhat.com (Karsten 'quaid' Wade) Date: Mon, 13 Feb 2012 17:05:36 -0800 Subject: Schedule for 21 March workshop Message-ID: <4F39B360.8010706@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 We need to figure out a schedule for the upcoming workshop. Based on the last workshop, it shouldn't be hard to figure out what we want to cover. The Red Hat, IBM, and Intel teams will have just come out of a two-day intensive on the 19th and 20th, so will all be better prepared to deal with the open workshop. The materials in the open workshop will be the same topics, but covered in less depth. Here are some topic ideas I've heard so far: * oVirt intro/overview/architecture * oVirt architecture * Engine deep dive * VDSM deep dive * Getting started with dev environment * API/SDK/CLI * Node * History and reports * Guest agent * Engine tools * How to interact & participate * Open discussion What ideas do you have? What do you think must be covered? What do you think should be covered? What is safe to not cover? Thanks - Karsten - -- name: Karsten 'quaid' Wade, Sr. 
Community Architect team: Red Hat Community Architecture & Leadership uri: http://communityleadershipteam.org http://TheOpenSourceWay.org gpg: AD0E0C41 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFPObNg2ZIOBq0ODEERAqNCAJ9bJete89+tRpFWFbRV/LQCkFegpQCgohdU V8FSmSBZXuwt1n9HzJEgbgM= =GemC -----END PGP SIGNATURE----- From agl at us.ibm.com Tue Feb 14 16:31:11 2012 From: agl at us.ibm.com (Adam Litke) Date: Tue, 14 Feb 2012 10:31:11 -0600 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F33F815.7040905@redhat.com> References: <4F33F815.7040905@redhat.com> Message-ID: <20120214163111.GD2784@us.ibm.com> On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote: > warning: tl;dr > > Right now, HA in oVirt is limited to VM level granularity. Each VM > provides a heartbeat through vdsm back to the oVirt Engine. If that > heartbeat is lost, the VM is terminated and (if the user has configured > it) the VM is relaunched. If the host running that VM has lost its > heartbeat, the host is fenced (via a remote power operation) and all HA > VMs are restarted on an alternate host. > Has anyone considered how live snapshots and live block copy will intersect HA to provide a better end-user experience? For example, will we be able to handle a storage connection failure without power-cycling VMs by migrating storage to a failover storage domain and/or live-migrating the VM to a host with functioning storage connections? > Also, the policies for controlling if/when a VM should be restarted are > somewhat limited and hardcoded. > > So there are two things that we can improve here: > > 1. Provide introspection into VMs so that we can monitor the health of > individual services and not just the VM > > 2. Provide a more configurable way of expressing policy for when a VM > (and its services) should trigger remediation by the HA subsystem > > We can tackle these two things in isolation, or we can try to combine > and solve them at the same time. > > Some possible paths (not the only ones) might be: > I also want to mention Memory Overcommitment Manager. It hasn't been included in vdsm yet, but the patches will be hitting gerrit within the next couple of days. MOM will contribute a single-host policy which is useful for making decisions about the condition of a host and applying remediation policies: ballooning, ksm, cgroups, vm ejection (migrating to another host). It is lightweight and will integrate seamlessly with vdsm from an oVirt-engine perspective. > * Leverage Pacemaker Cloud (http://pacemaker-cloud.org/) > > Pacemaker Cloud works by providing a generic (read: virt mgmt system > agnostic) way of managing HA for virtual machines and their services. > At a high level the concept is that you define 1 or more virtual > machines to be in a application group, and pcmk-cloud spawns a process > to monitor that application group using either Matahari/QMF or direct > SSH access. > > pcmk-cloud is not meant to be a user facing component, so integration > work would need to be done here to have oVirt consume the pcmk-cloud > REST API for specifying what the application groups (sets of VMs) are > and exposing that through the oVirt web UI. 
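As a sketch of what "oVirt consuming the pcmk-cloud REST API" could look like from the engine side, something like the following might register an application group; the endpoint path and payload fields are assumptions for illustration, not the actual pcmk-cloud API:

# Hypothetical sketch only: the endpoint, payload shape and field names are
# assumptions for illustration, not the real pcmk-cloud or oVirt Engine API.
import json
import urllib.request

def register_application_group(base_url, name, vm_ids, min_running):
    """POST an application-group definition to a policy service (assumed API)."""
    payload = {
        "name": name,               # logical name of the application group
        "vms": vm_ids,              # VM identifiers the group is composed of
        "min_running": min_running, # simple example of a per-group policy knob
    }
    req = urllib.request.Request(
        url=base_url + "/application_groups",      # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example call against a hypothetical service:
# register_application_group("http://localhost:8080", "webapp",
#                            ["vm-db-01", "vm-web-01"], min_running=2)
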
> > pcmk-cloud at a high level has the following functions: > + monitoring of services through Matahari/QMF/SSH > + monitoring of VMs through Matahari/QMF/SSH/Deltacloud > + control of services through Matahari/QMF/SSH > + control of VMs through Deltacloud or the native provider (in this > case oVirt Engine REST API) > + policy engine/model (per application group) to make decisions about > when to control services/VMs based on the monitoring input > > Integration decisions: > + pcmk-cloud to use existing transports for monitoring/control > (QMF/SSH) or do we leverage a new transport via vdsm/ovirt-guest- > agent? > + pcmk-cloud could act as the core policy engine to determine VM > placement in the oVirt datacenter/clusters or it could be used > solely for the monitoring/remediation aspect > > > * Leverage guest monitoring agents w/ ovirt-guest-agent > > This would be taking the Services Agent from Matahari (which is just a C > library) and utilizing it from the ovirt-guest-agent. So oga would > setup recurring monitoring of services using this lib and use its > existing communication path with vdsm->oVirt Engine to report back > service events. In turn, oVirt Engine would need to interpret these > events and then issue service control actions back to oga > > Conceptually this is very similar to using pcmk-cloud in the case where > pcmk-cloud utilizes information obtained through oga/vdsm through oVirt > Engine instead of communicating directly to Guests via QMF/SSH. In > fact, taking this route would probably end up duplicating some effort > because effectively you'd need the pcmk-cloud concept of the Cloud > Application Policy Engine (formerly called DPE/Deployable Policy Engine) > built directly into oVirt Engine anyhow. > > So part of looking at this is determining how much reuse/integration of > existing components makes sense vs. just re-implementing similar concepts. > > I've cc'd folks from the HA community/pcmk-cloud and hopefully we can > have a bit of a discussion to determine the best path forward here. > > Perry > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > -- Adam Litke IBM Linux Technology Center From ykaul at redhat.com Tue Feb 14 16:40:20 2012 From: ykaul at redhat.com (Yaniv Kaul) Date: Tue, 14 Feb 2012 18:40:20 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <20120214163111.GD2784@us.ibm.com> References: <4F33F815.7040905@redhat.com> <20120214163111.GD2784@us.ibm.com> Message-ID: <4F3A8E74.50202@redhat.com> On 02/14/2012 06:31 PM, Adam Litke wrote: > On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote: >> warning: tl;dr >> >> Right now, HA in oVirt is limited to VM level granularity. Each VM >> provides a heartbeat through vdsm back to the oVirt Engine. If that >> heartbeat is lost, the VM is terminated and (if the user has configured >> it) the VM is relaunched. If the host running that VM has lost its >> heartbeat, the host is fenced (via a remote power operation) and all HA >> VMs are restarted on an alternate host. >> > Has anyone considered how live snapshots and live block copy will intersect HA > to provide a better end-user experience? For example, will we be able to handle > a storage connection failure without power-cycling VMs by migrating storage to a > failover storage domain and/or live-migrating the VM to a host with functioning > storage connections? 
I think migrating a paused VM (due to EIO) is something KVM is afraid to do - there might be in-flight (in the host already) data en-route to the storage. I'm not entirely sure how you migrate the storage, when it's failed. Y. > >> Also, the policies for controlling if/when a VM should be restarted are >> somewhat limited and hardcoded. >> >> So there are two things that we can improve here: >> >> 1. Provide introspection into VMs so that we can monitor the health of >> individual services and not just the VM >> >> 2. Provide a more configurable way of expressing policy for when a VM >> (and its services) should trigger remediation by the HA subsystem >> >> We can tackle these two things in isolation, or we can try to combine >> and solve them at the same time. >> >> Some possible paths (not the only ones) might be: >> > I also want to mention Memory Overcommitment Manager. It hasn't been included > in vdsm yet, but the patches will be hitting gerrit within the next couple of > days. MOM will contribute a single-host policy which is useful for making > decisions about the condition of a host and applying remediation policies: > ballooning, ksm, cgroups, vm ejection (migrating to another host). It is > lightweight and will integrate seamlessly with vdsm from an oVirt-engine > perspective. > >> * Leverage Pacemaker Cloud (http://pacemaker-cloud.org/) >> >> Pacemaker Cloud works by providing a generic (read: virt mgmt system >> agnostic) way of managing HA for virtual machines and their services. >> At a high level the concept is that you define 1 or more virtual >> machines to be in a application group, and pcmk-cloud spawns a process >> to monitor that application group using either Matahari/QMF or direct >> SSH access. >> >> pcmk-cloud is not meant to be a user facing component, so integration >> work would need to be done here to have oVirt consume the pcmk-cloud >> REST API for specifying what the application groups (sets of VMs) are >> and exposing that through the oVirt web UI. >> >> pcmk-cloud at a high level has the following functions: >> + monitoring of services through Matahari/QMF/SSH >> + monitoring of VMs through Matahari/QMF/SSH/Deltacloud >> + control of services through Matahari/QMF/SSH >> + control of VMs through Deltacloud or the native provider (in this >> case oVirt Engine REST API) >> + policy engine/model (per application group) to make decisions about >> when to control services/VMs based on the monitoring input >> >> Integration decisions: >> + pcmk-cloud to use existing transports for monitoring/control >> (QMF/SSH) or do we leverage a new transport via vdsm/ovirt-guest- >> agent? >> + pcmk-cloud could act as the core policy engine to determine VM >> placement in the oVirt datacenter/clusters or it could be used >> solely for the monitoring/remediation aspect >> >> >> * Leverage guest monitoring agents w/ ovirt-guest-agent >> >> This would be taking the Services Agent from Matahari (which is just a C >> library) and utilizing it from the ovirt-guest-agent. So oga would >> setup recurring monitoring of services using this lib and use its >> existing communication path with vdsm->oVirt Engine to report back >> service events. In turn, oVirt Engine would need to interpret these >> events and then issue service control actions back to oga >> >> Conceptually this is very similar to using pcmk-cloud in the case where >> pcmk-cloud utilizes information obtained through oga/vdsm through oVirt >> Engine instead of communicating directly to Guests via QMF/SSH. 
In >> fact, taking this route would probably end up duplicating some effort >> because effectively you'd need the pcmk-cloud concept of the Cloud >> Application Policy Engine (formerly called DPE/Deployable Policy Engine) >> built directly into oVirt Engine anyhow. >> >> So part of looking at this is determining how much reuse/integration of >> existing components makes sense vs. just re-implementing similar concepts. >> >> I've cc'd folks from the HA community/pcmk-cloud and hopefully we can >> have a bit of a discussion to determine the best path forward here. >> >> Perry >> _______________________________________________ >> Arch mailing list >> Arch at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/arch >> From Caitlin.Bestler at nexenta.com Tue Feb 14 18:14:31 2012 From: Caitlin.Bestler at nexenta.com (Caitlin Bestler) Date: Tue, 14 Feb 2012 18:14:31 +0000 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3A8E74.50202@redhat.com> References: <4F33F815.7040905@redhat.com> <20120214163111.GD2784@us.ibm.com> <4F3A8E74.50202@redhat.com> Message-ID: <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E406F@AUSP01DAG0103.collaborationhost.net> Yaniv Kaul wrote: > I think migrating a paused VM (due to EIO) is something KVM is afraid to do - there might be in-flight > (in the host already) data en-route to the storage. > I'm not entirely sure how you migrate the storage, when it's failed. A Storage Service that is already defined to be highly available should be usable by an HA VM without any problem. When you migrate a VM the current operations all fail, but the storage stack merely retries them. There are even solutions for dealing with storage targets that were on the intra-host IP subnet. See my presentation On NAS proxies at last year's Xen Summit: http://xen.org/files/xensummit_santaclara11/aug2/11_Bestler_Tailoring_NAS_Proxies_for_Virtual_Machines.pdf From iheim at redhat.com Tue Feb 14 18:32:17 2012 From: iheim at redhat.com (Itamar Heim) Date: Tue, 14 Feb 2012 20:32:17 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <20120214163111.GD2784@us.ibm.com> References: <4F33F815.7040905@redhat.com> <20120214163111.GD2784@us.ibm.com> Message-ID: <4F3AA8B1.6060300@redhat.com> On 02/14/2012 06:31 PM, Adam Litke wrote: > On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote: >> warning: tl;dr >> >> Right now, HA in oVirt is limited to VM level granularity. Each VM >> provides a heartbeat through vdsm back to the oVirt Engine. If that >> heartbeat is lost, the VM is terminated and (if the user has configured >> it) the VM is relaunched. If the host running that VM has lost its >> heartbeat, the host is fenced (via a remote power operation) and all HA >> VMs are restarted on an alternate host. >> > > Has anyone considered how live snapshots and live block copy will intersect HA > to provide a better end-user experience? For example, will we be able to handle > a storage connection failure without power-cycling VMs by migrating storage to a > failover storage domain and/or live-migrating the VM to a host with functioning > storage connections? 
cc'ing Dor - iirc, he mentioned an issue with live migrating a guest post an IO error From dlaor at redhat.com Tue Feb 14 22:28:28 2012 From: dlaor at redhat.com (Dor Laor) Date: Wed, 15 Feb 2012 00:28:28 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3AA8B1.6060300@redhat.com> References: <4F33F815.7040905@redhat.com> <20120214163111.GD2784@us.ibm.com> <4F3AA8B1.6060300@redhat.com> Message-ID: <4F3AE00C.6000906@redhat.com> On 02/14/2012 08:32 PM, Itamar Heim wrote: > On 02/14/2012 06:31 PM, Adam Litke wrote: >> On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote: >>> warning: tl;dr >>> >>> Right now, HA in oVirt is limited to VM level granularity. Each VM >>> provides a heartbeat through vdsm back to the oVirt Engine. If that >>> heartbeat is lost, the VM is terminated and (if the user has configured >>> it) the VM is relaunched. If the host running that VM has lost its >>> heartbeat, the host is fenced (via a remote power operation) and all HA >>> VMs are restarted on an alternate host. >>> >> >> Has anyone considered how live snapshots and live block copy will >> intersect HA >> to provide a better end-user experience? For example, will we be able >> to handle >> a storage connection failure without power-cycling VMs by migrating >> storage to a >> failover storage domain and/or live-migrating the VM to a host with >> functioning >> storage connections? Not sure I get the scope here - if the storage is dead, the VM won't be able to copy the storage to its new destination. There is only one theoretical chance it will work - for shared storage, if one of the hosts has its hba/nic/link/port dead maybe some other host will be able to access the storage. It seems like a long shot to me. More over, not all of the guest IO reached the storage prior to the migration. Even w/ ODIRECT there is still various caches around, some belong to the VM, some may be meta data for image files. We won't be able to switch to another host w/o writing the data in most cases. > > cc'ing Dor - iirc, he mentioned an issue with live migrating a guest > post an IO error In short, while it may be theoretically possible in rare cases, I rather not to relay on it. Seems that the 'average' storage array/HBA is more stable than live migration + IO errors path... Cheers, Dor From abaron at redhat.com Tue Feb 14 22:50:13 2012 From: abaron at redhat.com (Ayal Baron) Date: Tue, 14 Feb 2012 17:50:13 -0500 (EST) Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3AE00C.6000906@redhat.com> Message-ID: ----- Original Message ----- > On 02/14/2012 08:32 PM, Itamar Heim wrote: > > On 02/14/2012 06:31 PM, Adam Litke wrote: > >> On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote: > >>> warning: tl;dr > >>> > >>> Right now, HA in oVirt is limited to VM level granularity. Each > >>> VM > >>> provides a heartbeat through vdsm back to the oVirt Engine. If > >>> that > >>> heartbeat is lost, the VM is terminated and (if the user has > >>> configured > >>> it) the VM is relaunched. If the host running that VM has lost > >>> its > >>> heartbeat, the host is fenced (via a remote power operation) and > >>> all HA > >>> VMs are restarted on an alternate host. > >>> > >> > >> Has anyone considered how live snapshots and live block copy will > >> intersect HA > >> to provide a better end-user experience? 
For example, will we be > >> able > >> to handle > >> a storage connection failure without power-cycling VMs by > >> migrating > >> storage to a > >> failover storage domain and/or live-migrating the VM to a host > >> with > >> functioning > >> storage connections? > > > Not sure I get the scope here - if the storage is dead, the VM won't > be > able to copy the storage to its new destination. There is only one > theoretical chance it will work - for shared storage, if one of the > hosts has its hba/nic/link/port dead maybe some other host will be > able > to access the storage. It seems like a long shot to me. More over, > not all of the guest IO reached the storage prior to the migration. > Even > w/ ODIRECT there is still various caches around, some belong to the > VM, VM caches are irrelevant as the data is migrated with the VM > some may be meta data for image files. We won't be able to switch to > another host w/o writing the data in most cases. Since we *always* use O_DIRECT the I/O will not be ack'd until it has reached the disk system (not necessarily written to disk though), which means that migration is safe in this respect (the I/O would be resent on destination host) However, consider the case where the VM has migrated, resent the I/O (successfully) gone on to write some more things and then the source host regains access to the storage and the in-flight (outdated) I/O makes it to disk and corrupts it. In addition, I'm not sure how you migrate a qemu which is in d-state? > > > > cc'ing Dor - iirc, he mentioned an issue with live migrating a > > guest > > post an IO error > > In short, while it may be theoretically possible in rare cases, I > rather > not to relay on it. Seems that the 'average' storage array/HBA is > more > stable than live migration + IO errors path... > > Cheers, > Dor > > > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From abaron at redhat.com Tue Feb 14 23:11:54 2012 From: abaron at redhat.com (Ayal Baron) Date: Tue, 14 Feb 2012 18:11:54 -0500 (EST) Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F33FA64.6000101@redhat.com> Message-ID: ----- Original Message ----- > > I think we first need to look at the larger question of policy > > engine at > > ovirt-engine. the two main candidates are pacemaker and drools > > (jboss > > rules). > > pacemaker for having logic in the area. > > drools for having easier java integration and integrated UI to > > create > > policies by users. > > Agreed, as I mentioned in my email they're interrelated I'm not sure I agree. This entire thread assumes that the way to do this is to have the engine continuously monitor all services on all (HA) guests and according to varying policies reschedule VMs (services within VMs?) I don't think this is scalable (and wrt drools/pacemaker, assuming what Andrew says is correct, drools doesn't even remotely come close to supporting even relatively small scales) Engine should decide on policy, the hosts should enforce it. What this would translate to is a more distributed way of monitoring and moving around of VMs/services. E.g. for each service, engine would run the VM on host A and let host B know that it is the failover node for this service. Node B would be monitoring the heartbeats for the services it is in charge of and take over when needed. 
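To make that concrete, the host-B side of this could be as small as a heartbeat table plus a staleness check; everything below (class, timeout, takeover hook) is an illustrative sketch, not an existing vdsm interface:

# Minimal sketch of the host-side failover watcher described above; all names
# are hypothetical and not part of vdsm.
import time

HEARTBEAT_TIMEOUT = 10.0  # seconds without a heartbeat before taking over

class FailoverWatcher:
    def __init__(self, services):
        # services: names of the services this host is the failover node for
        self.last_seen = {name: time.monotonic() for name in services}

    def on_heartbeat(self, name):
        """Called whenever a heartbeat for a watched service arrives."""
        self.last_seen[name] = time.monotonic()

    def check(self):
        """Return the services whose heartbeats have gone stale."""
        now = time.monotonic()
        return [name for name, seen in self.last_seen.items()
                if now - seen > HEARTBEAT_TIMEOUT]

def takeover(name):
    # Placeholder: in the scheme sketched above this would start the service's
    # VM locally and report the change back to the engine.
    print("taking over service %s" % name)

if __name__ == "__main__":
    watcher = FailoverWatcher(["db", "web"])
    while True:
        for stale in watcher.check():
            takeover(stale)
            watcher.last_seen.pop(stale)  # stop watching once we've taken over
        time.sleep(1)
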
In case host B crashes, engine would choose a different host to be the failover node (note that there can be more than 2 nodes with a predefined order of priority). > > i.e. if you're going to use Pacemaker's policy engine then it > absolutely > makes sense to just go with Pacemaker Cloud, since that's precisely > what > it does (uses the core Pacemaker PE) > > OTOH, if you decide to use drools, then it may make more sense to > integrate the HA concepts directly into the drools PE and then the > only > other thing you can leverage would be the library that does the > monitoring of services at the end points. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From Caitlin.Bestler at nexenta.com Wed Feb 15 00:02:02 2012 From: Caitlin.Bestler at nexenta.com (Caitlin Bestler) Date: Wed, 15 Feb 2012 00:02:02 +0000 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: References: <4F33FA64.6000101@redhat.com> Message-ID: <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> Aval Baron wrote: > I'm not sure I agree. > This entire thread assumes that the way to do this is to have the engine continuously monitor all services on all (HA) guests and according to > varying policies reschedule VMs (services within VMs?) I don't think this is scalable (and wrt drools/pacemaker, assuming what Andrew says is > correct, drools doesn't even remotely come close to supporting even relatively small scales) >Engine should decide on policy, the hosts should enforce it. >What this would translate to is a more distributed way of monitoring and moving around of VMs/services. > E.g. for each service, engine would run the VM on host A and let host B know that it is the failover node for > this service. Node B would be monitoring the heartbeats for the services it is in charge of and take over when > needed. In case host B crashes, engine would choose a different host to be the failover node (note that there >can be more than 2 nodes with a predefined order of priority). As long as you expect the VM to enforce reliability on the raw storage devices then you are going to have problems with restarting HA VMs. If you switch your thinking to making the storage operations HA, then all you need is a response cache. A restarted VM replays the operation, and the cached response is retransmitted (or the operation is benignly re-applied). Without defining the operations so that they can be benignly re-applied or adding a response cache you will always be able to come up with some order of failure that won't work. There is no cost-effective way to guarantee that you snapshot the VM only when there is no in-flight storage activity. From pmyers at redhat.com Wed Feb 15 01:36:48 2012 From: pmyers at redhat.com (Perry Myers) Date: Tue, 14 Feb 2012 20:36:48 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: References: Message-ID: <4F3B0C30.1070402@redhat.com> > I'm not sure I agree. > This entire thread assumes that the way to do this is to have the > engine continuously monitor all services on all (HA) guests and > according to varying policies reschedule VMs (services within VMs?) That's one interpretation of what I wrote, but not the only one. Pacemaker Cloud doesn't rely on a single process (like oVirt Engine) to monitor all VMs and the services in those VMs. It relies on spawning a monitor process for each logical grouping of VMs in an 'application group'. 
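A minimal way to picture that per-group delegation (one small monitor per application group instead of one global monitor) is sketched below; this is illustrative Python, not pcmk-cloud code, and the monitor body is a stand-in:

# Illustrative sketch of "one monitor process per application group"; the
# group names and monitor body are made up for the example.
import time
from multiprocessing import Process

def monitor_group(group_name, vm_ids):
    """Stand-in for a per-group policy engine (the DPE/CAPE described here)."""
    while True:
        for vm in vm_ids:
            # A real monitor would check VM/service health and apply policy;
            # here we only pretend to poll.
            pass
        time.sleep(5)

def spawn_monitors(groups):
    """Spawn one long-lived monitor process per application group."""
    procs = []
    for name, vms in groups.items():
        p = Process(target=monitor_group, args=(name, vms), daemon=True)
        p.start()
        procs.append(p)
    return procs

if __name__ == "__main__":
    monitors = spawn_monitors({
        "webapp":  ["vm-web-01", "vm-db-01"],
        "billing": ["vm-bill-01"],
    })
    print("spawned %d group monitors" % len(monitors))
    time.sleep(30)  # in a real service the parent would supervise these
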
So the engine doesn't need to continuously monitor every VM and every service, it delegates to the Cloud Policy Engine (CPE) which in turn creates a daemon (DPE[1]) to monitor each application group. > I don't think this is scalable (and wrt drools/pacemaker, assuming > what Andrew says is correct, drools doesn't even remotely come close > to supporting even relatively small scales) The way to deal with drools poor scaling is... don't use drools :) But you're right, having oVirt Engine be the sole entity for monitoring every service on every VM is not scalable, which is the reason why the Pacemaker Cloud architecture doesn't do it that way. > Engine should decide on policy, the hosts should enforce it. This is how Pacemaker Cloud works as well, except right now I'd restate it as: Engine should decide on policy and the DPEs should enforce it. In the current thinking the DPEs run co-located with the CPE, which would run nearby (but not necessarily on the same server as) the oVirt Engine. However, you bring up a good point in that the DPEs could be distributed to the hosts. (Right now CPE/DPE communication uses IPC but this could be replaced with something TCP oriented) Note: Not relying on anything from the host was a design constraint for Pacemaker Cloud. oVirt is different in that you can put things on the hosts, so there may be optimizations we can make due to this relaxed constraint, like putting the DPEs onto the Hosts. > What this would translate to is a more distributed way of monitoring > and moving around of VMs/services. E.g. for each service, engine > would run the VM on host A and let host B know that it is the > failover node for this service. That seems restrictive. Why not allow that VM to fail over to 'any other node in the cloud' vs. picking a specific piece of hardware? If you allow it to just pick the best available node at the time using predefined policies that will result in less focus on the individual hosts and make things more cloud-like (abstraction of resources) > Node B would be monitoring the > heartbeats for the services it is in charge of and take over when > needed. In case host B crashes, engine would choose a different host > to be the failover node (note that there can be more than 2 nodes > with a predefined order of priority). Agree with this... Sort of what I said above, the DPE could run on HostB to monitor stuff running on Hosts A and C (for case where there are multiple VMs across different hosts in an application group). And if the DPE or HostB fails, then the CPE would respawn a new DPE on a new host. I think Pacemaker Cloud could fit the paradigm you're looking for here. But it will require a little integration work. On the other hand, if you are looking to keep this more Java oriented or very tightly integrated with the oVirt codebase, then you could probably take similar concepts as what has already been done in pcmk-cloud and re-implement them. Either way works. We'd be happy to assist either with integration of pcmk-cloud here or with general advice on HA as you work on the Java implementation. Perry [1] This daemon right now is called the DPE for Deployable Policy Engine, since in the Aeolus terminology a Deployable was a set of VMs that were coordinated to run an application. For example, 2 VMs, one running a database and the other running a web server. Aeolus terminology has changed and 'Deployable' is no longer used to describe this. 
Instead this is called an Application Set/Group Because pcmk-cloud adopted Aeolus terminology and the Deployable term is not really well known, we're probably going to rename the DPE to be "Cloud Application Policy Engine" or CAPE. From pmyers at redhat.com Wed Feb 15 01:41:29 2012 From: pmyers at redhat.com (Perry Myers) Date: Tue, 14 Feb 2012 20:41:29 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> References: <4F33FA64.6000101@redhat.com> <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> Message-ID: <4F3B0D49.2040904@redhat.com> > As long as you expect the VM to enforce reliability on the raw > storage devices then you are going to have problems with restarting > HA VMs. If you switch your thinking to making the storage operations > HA, then all you need is a response cache. > > A restarted VM replays the operation, and the cached response is > retransmitted (or the operation is benignly re-applied). Without > defining the operations so that they can be benignly re-applied or > adding a response cache you will always be able to come up with some > order of failure that won't work. There is no cost-effective way to > guarantee that you snapshot the VM only when there is no in-flight > storage activity. How is this any different than a bare metal host crashing while writes are in flight either to a local disk or FC disk? When something crashes (be it physical or virtual) you're always going to lose some data that was in flight but not committed to disk (network has same issue). It's up to individual applications to be resilient to this. I think this issue is somewhat orthogonal to simply providing reduced MTTR by restarting failed services or VMs. From abaron at redhat.com Wed Feb 15 06:32:31 2012 From: abaron at redhat.com (Ayal Baron) Date: Wed, 15 Feb 2012 01:32:31 -0500 (EST) Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3B0C30.1070402@redhat.com> Message-ID: ----- Original Message ----- > > I'm not sure I agree. > > This entire thread assumes that the way to do this is to have the > > engine continuously monitor all services on all (HA) guests and > > according to varying policies reschedule VMs (services within VMs?) > > That's one interpretation of what I wrote, but not the only one. > > Pacemaker Cloud doesn't rely on a single process (like oVirt Engine) > to > monitor all VMs and the services in those VMs. It relies on spawning > a > monitor process for each logical grouping of VMs in an 'application > group'. > > So the engine doesn't need to continuously monitor every VM and every > service, it delegates to the Cloud Policy Engine (CPE) which in turn > creates a daemon (DPE[1]) to monitor each application group. Where is the daemon spawn? on the engine or in a distributed fashion? if the latter then drools is irrelevant. if the former then it would just make things worse (scalability wise) > > > I don't think this is scalable (and wrt drools/pacemaker, assuming > > what Andrew says is correct, drools doesn't even remotely come > > close > > to supporting even relatively small scales) > > The way to deal with drools poor scaling is... don't use drools :) > > But you're right, having oVirt Engine be the sole entity for > monitoring > every service on every VM is not scalable, which is the reason why > the > Pacemaker Cloud architecture doesn't do it that way. 
> > > Engine should decide on policy, the hosts should enforce it. > > This is how Pacemaker Cloud works as well, except right now I'd > restate > it as: Engine should decide on policy and the DPEs should enforce it. > > In the current thinking the DPEs run co-located with the CPE, which > would run nearby (but not necessarily on the same server as) the > oVirt > Engine. > > However, you bring up a good point in that the DPEs could be > distributed > to the hosts. (Right now CPE/DPE communication uses IPC but this > could > be replaced with something TCP oriented) > > Note: Not relying on anything from the host was a design constraint > for > Pacemaker Cloud. oVirt is different in that you can put things on > the > hosts, so there may be optimizations we can make due to this relaxed > constraint, like putting the DPEs onto the Hosts. > > > What this would translate to is a more distributed way of > > monitoring > > and moving around of VMs/services. E.g. for each service, engine > > would run the VM on host A and let host B know that it is the > > failover node for this service. > > That seems restrictive. Why not allow that VM to fail over to 'any > other node in the cloud' vs. picking a specific piece of hardware? > If > you allow it to just pick the best available node at the time using > predefined policies that will result in less focus on the individual > hosts and make things more cloud-like (abstraction of resources) > > > Node B would be monitoring the > > heartbeats for the services it is in charge of and take over when > > needed. In case host B crashes, engine would choose a different > > host > > to be the failover node (note that there can be more than 2 nodes > > with a predefined order of priority). > > Agree with this... Sort of what I said above, the DPE could run on > HostB > to monitor stuff running on Hosts A and C (for case where there are > multiple VMs across different hosts in an application group). And if > the DPE or HostB fails, then the CPE would respawn a new DPE on a new > host. > > I think Pacemaker Cloud could fit the paradigm you're looking for > here. > But it will require a little integration work. On the other hand, > if > you are looking to keep this more Java oriented or very tightly > integrated with the oVirt codebase, then you could probably take > similar > concepts as what has already been done in pcmk-cloud and re-implement > them. > > Either way works. We'd be happy to assist either with integration of > pcmk-cloud here or with general advice on HA as you work on the Java > implementation. > > Perry > > > [1] This daemon right now is called the DPE for Deployable Policy > Engine, since in the Aeolus terminology a Deployable was a set of > VMs that were coordinated to run an application. For example, 2 > VMs, one running a database and the other running a web server. > > Aeolus terminology has changed and 'Deployable' is no longer used > to describe this. Instead this is called an Application > Set/Group > > Because pcmk-cloud adopted Aeolus terminology and the Deployable > term is not really well known, we're probably going to rename the > DPE to be "Cloud Application Policy Engine" or CAPE. 
> From iheim at redhat.com Wed Feb 15 06:36:31 2012 From: iheim at redhat.com (Itamar Heim) Date: Wed, 15 Feb 2012 08:36:31 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: References: Message-ID: <4F3B526F.1060200@redhat.com> On 02/15/2012 01:11 AM, Ayal Baron wrote: > > > ----- Original Message ----- >>> I think we first need to look at the larger question of policy >>> engine at >>> ovirt-engine. the two main candidates are pacemaker and drools >>> (jboss >>> rules). >>> pacemaker for having logic in the area. >>> drools for having easier java integration and integrated UI to >>> create >>> policies by users. >> >> Agreed, as I mentioned in my email they're interrelated > > I'm not sure I agree. > This entire thread assumes that the way to do this is to have the engine continuously monitor all services on all (HA) guests and according to varying policies reschedule VMs (services within VMs?) > I don't think this is scalable (and wrt drools/pacemaker, assuming what Andrew says is correct, drools doesn't even remotely come close to supporting even relatively small scales) > > Engine should decide on policy, the hosts should enforce it. > What this would translate to is a more distributed way of monitoring and moving around of VMs/services. E.g. for each service, engine would run the VM on host A and let host B know that it is the failover node for this service. Node B would be monitoring the heartbeats for the services it is in charge of and take over when needed. In case host B crashes, engine would choose a different host to be the failover node (note that there can be more than 2 nodes with a predefined order of priority). HA is a simple use case of policy. load balancing/power saving is something more continuous which requires constant global view of workload, could be schedule based, etc. > >> >> i.e. if you're going to use Pacemaker's policy engine then it >> absolutely >> makes sense to just go with Pacemaker Cloud, since that's precisely >> what >> it does (uses the core Pacemaker PE) >> >> OTOH, if you decide to use drools, then it may make more sense to >> integrate the HA concepts directly into the drools PE and then the >> only >> other thing you can leverage would be the library that does the >> monitoring of services at the end points. >> _______________________________________________ >> Arch mailing list >> Arch at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/arch >> From iheim at redhat.com Wed Feb 15 06:55:46 2012 From: iheim at redhat.com (Itamar Heim) Date: Wed, 15 Feb 2012 08:55:46 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3B0D49.2040904@redhat.com> References: <4F33FA64.6000101@redhat.com> <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> <4F3B0D49.2040904@redhat.com> Message-ID: <4F3B56F2.5050101@redhat.com> On 02/15/2012 03:41 AM, Perry Myers wrote: >> As long as you expect the VM to enforce reliability on the raw >> storage devices then you are going to have problems with restarting >> HA VMs. If you switch your thinking to making the storage operations >> HA, then all you need is a response cache. >> >> A restarted VM replays the operation, and the cached response is >> retransmitted (or the operation is benignly re-applied). Without >> defining the operations so that they can be benignly re-applied or >> adding a response cache you will always be able to come up with some >> order of failure that won't work. 
There is no cost-effective way to >> guarantee that you snapshot the VM only when there is no in-flight >> storage activity. > > How is this any different than a bare metal host crashing while writes > are in flight either to a local disk or FC disk? When something crashes > (be it physical or virtual) you're always going to lose some data that > was in flight but not committed to disk (network has same issue). It's > up to individual applications to be resilient to this. > > I think this issue is somewhat orthogonal to simply providing reduced > MTTR by restarting failed services or VMs. don't you fence the other node first to make sure it won't write after you started another one? here we are talking about moving the VM, without fencing the host. From abaron at redhat.com Wed Feb 15 07:03:52 2012 From: abaron at redhat.com (Ayal Baron) Date: Wed, 15 Feb 2012 02:03:52 -0500 (EST) Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3B526F.1060200@redhat.com> Message-ID: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > On 02/15/2012 01:11 AM, Ayal Baron wrote: > > > > > > ----- Original Message ----- > >>> I think we first need to look at the larger question of policy > >>> engine at > >>> ovirt-engine. the two main candidates are pacemaker and drools > >>> (jboss > >>> rules). > >>> pacemaker for having logic in the area. > >>> drools for having easier java integration and integrated UI to > >>> create > >>> policies by users. > >> > >> Agreed, as I mentioned in my email they're interrelated > > > > I'm not sure I agree. > > This entire thread assumes that the way to do this is to have the > > engine continuously monitor all services on all (HA) guests and > > according to varying policies reschedule VMs (services within > > VMs?) > > I don't think this is scalable (and wrt drools/pacemaker, assuming > > what Andrew says is correct, drools doesn't even remotely come > > close to supporting even relatively small scales) > > > > Engine should decide on policy, the hosts should enforce it. > > What this would translate to is a more distributed way of > > monitoring and moving around of VMs/services. E.g. for each > > service, engine would run the VM on host A and let host B know > > that it is the failover node for this service. Node B would be > > monitoring the heartbeats for the services it is in charge of and > > take over when needed. In case host B crashes, engine would choose > > a different host to be the failover node (note that there can be > > more than 2 nodes with a predefined order of priority). > > HA is a simple use case of policy. *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust. > load balancing/power saving is something more continuous which > requires > constant global view of workload, could be schedule based, etc. power saving is a specific load balancing policy. Once policy changes (either manually or automatically) then it is engine's job to reshuffle the deck (move VMs around, designate new failover nodes, etc). There is no question that the engine should periodically get the state of all the VMs / services it is managing (where it is running etc), but HA decisions need to consider a lot more data and are of finer granularity than general VM placement (health check frequency, intra-vm services monitoring, etc). > > > > > >> > >> i.e. 
if you're going to use Pacemaker's policy engine then it > >> absolutely > >> makes sense to just go with Pacemaker Cloud, since that's > >> precisely > >> what > >> it does (uses the core Pacemaker PE) > >> > >> OTOH, if you decide to use drools, then it may make more sense to > >> integrate the HA concepts directly into the drools PE and then the > >> only > >> other thing you can leverage would be the library that does the > >> monitoring of services at the end points. > >> _______________________________________________ > >> Arch mailing list > >> Arch at ovirt.org > >> http://lists.ovirt.org/mailman/listinfo/arch > >> > > From lpeer at redhat.com Wed Feb 15 11:12:44 2012 From: lpeer at redhat.com (Livnat Peer) Date: Wed, 15 Feb 2012 13:12:44 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> Message-ID: <4F3B932C.9030602@redhat.com> On 15/02/12 09:03, Ayal Baron wrote: > > > ----- Original Message ----- >> On 02/15/2012 01:11 AM, Ayal Baron wrote: >>> >>> >>> ----- Original Message ----- >>>>> I think we first need to look at the larger question of policy >>>>> engine at >>>>> ovirt-engine. the two main candidates are pacemaker and drools >>>>> (jboss >>>>> rules). >>>>> pacemaker for having logic in the area. >>>>> drools for having easier java integration and integrated UI to >>>>> create >>>>> policies by users. >>>> >>>> Agreed, as I mentioned in my email they're interrelated >>> >>> I'm not sure I agree. >>> This entire thread assumes that the way to do this is to have the >>> engine continuously monitor all services on all (HA) guests and >>> according to varying policies reschedule VMs (services within >>> VMs?) >>> I don't think this is scalable (and wrt drools/pacemaker, assuming >>> what Andrew says is correct, drools doesn't even remotely come >>> close to supporting even relatively small scales) >>> >>> Engine should decide on policy, the hosts should enforce it. >>> What this would translate to is a more distributed way of >>> monitoring and moving around of VMs/services. E.g. for each >>> service, engine would run the VM on host A and let host B know >>> that it is the failover node for this service. Node B would be >>> monitoring the heartbeats for the services it is in charge of and >>> take over when needed. In case host B crashes, engine would choose >>> a different host to be the failover node (note that there can be >>> more than 2 nodes with a predefined order of priority). >> >> HA is a simple use case of policy. > > *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust. I think that the main concept of what Perry suggested (leaving the implementation details aside :)) is to add HA of services. I like this idea and I would like to extend it a little bit. How about services that are spread on more than a single VM. I would like to be able to define a service and specify which VM/s provides this service and add HA flag on the service. Then i would like to manage policies around it - I define a service with 3 VMs providing this service and I want to have at least 2 VM running it at any given time. (now the VMs are not highly available only the service is.) > >> load balancing/power saving is something more continuous which >> requires >> constant global view of workload, could be schedule based, etc. 
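The service-level HA suggested above (a service provided by several VMs, with a minimum number that must stay up) could be modelled roughly like this; the class and field names are invented for illustration and do not exist in oVirt:

# Loose sketch of a service-level HA definition; names are illustrative only.
from dataclasses import dataclass

@dataclass
class HAService:
    name: str
    provider_vms: list          # VMs that provide this service
    min_running: int = 1        # policy: how many providers must be up

    def is_satisfied(self, running_vms):
        """True if enough provider VMs are currently running."""
        up = sum(1 for vm in self.provider_vms if vm in running_vms)
        return up >= self.min_running

# Example: a service backed by three VMs, of which at least two must run.
web = HAService("webfront", ["vm-a", "vm-b", "vm-c"], min_running=2)
print(web.is_satisfied(running_vms={"vm-a", "vm-c"}))   # True
print(web.is_satisfied(running_vms={"vm-b"}))           # False
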
> > power saving is a specific load balancing policy. Once policy changes (either manually or automatically) then it is engine's job to reshuffle the deck (move VMs around, designate new failover nodes, etc). > There is no question that the engine should periodically get the state of all the VMs / services it is managing (where it is running etc), but HA decisions need to consider a lot more data and are of finer granularity than general VM placement (health check frequency, intra-vm services monitoring, etc). > >> >> >>> >>>> >>>> i.e. if you're going to use Pacemaker's policy engine then it >>>> absolutely >>>> makes sense to just go with Pacemaker Cloud, since that's >>>> precisely >>>> what >>>> it does (uses the core Pacemaker PE) >>>> >>>> OTOH, if you decide to use drools, then it may make more sense to >>>> integrate the HA concepts directly into the drools PE and then the >>>> only >>>> other thing you can leverage would be the library that does the >>>> monitoring of services at the end points. >>>> _______________________________________________ >>>> Arch mailing list >>>> Arch at ovirt.org >>>> http://lists.ovirt.org/mailman/listinfo/arch >>>> >> >> > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From abaron at redhat.com Wed Feb 15 11:43:59 2012 From: abaron at redhat.com (Ayal Baron) Date: Wed, 15 Feb 2012 06:43:59 -0500 (EST) Subject: Empty cdrom drive. In-Reply-To: <7743c661-ea82-465a-9f6f-5ac0da2a9962@zmail13.collab.prod.int.phx2.redhat.com> Message-ID: ----- Original Message ----- > On Tue, Feb 14, 2012 at 10:59:22AM -0500, Igor Lvovsky wrote: > > > > Hi, > > I want to discuss $subject on the email just to be sure that we all > > on the > > same page. > > > > So, today in 3.0 vdsm has two ways to create VM with cdrom : > > 1. If RHEV-M ask to create VM with cdrom, vdsm just create it > > 2. RHEV-M doesn't ask to create VM with cdrom, vdsm still creates > > VM with > > empty cdrom. Vdsm creates this device as 'hdc' (IDE device, > > index 2), > > because of libvirt restrictions. > > In this case RHEV-M will be able to "insert" cdrom on the fly > > with > > changeCD request. > > > > In the new style API we want to get rid from stupid scenario #2, > > because > > we want to be able to create VM without cdrom at all. > > > It means, that now we need to change a little our scenarios: > > 1. If RHEV-M ask to create VM with cdrom, vdsm just create it > > 2. RHEV-M doesn't want to create VM with cdrom, but it want to be > > able to > > "insert" cdrom on the fly after this. Here we have two options: > > a. RHEV-M should to pass empty cdrom device on VM creation and > > use > > regular changeCD after that > > b. RHEV-M can create VM without cdrom and add cdrom later > > through > > hotplugDisk command. > > Let's leave hotpluggin for a later discussion. Currently I am worried > about backward and forward compatibility. > > 1. Currently, all VMs created by ovirt-Engine has an IDE cdrom > device. This > behavior should be maintained when Engine is upgraded, to minimize > surprises to guests. > > 2. In the new "devices" API introduced by Igor, Engine is responsible > to > know about all guest devices and their addresses. > > 1+2=3. Engine has to be aware of the fact that even if it did not > explicitly request for a cdrom, such a device exist. > > 4. Vdsm would very much prefer that Engine explictly request that an > empty cdrom device is included. 
This would allow us to start VMs with > no > cdrom device at all in the future. > > I understand that this may be a complex feat for Engine, as it > requires > a complex upgrade path to the VM data base. To be done correctly, it > requires a compatible change to the ovirt API, too. > > 5. I suggest a hackish API that would let us solve the problem in > stages: Engine would not have to explicitly list an empty CD. > However, > it would send a hack flag: hackAutoCD=True for all VM starting up. Why? changing the API is much more difficult than changing a few bits of code. All that is required from engine is to always add a cdrom if one does not exist. Same amount of work that would be required of vdsm to implement the hack, but would be much more flexible. Backward compatibility is not an issue as the API is new. Using the old API would keep the old behaviour. > > If this flag is True, Vdsm would add an IDE CDROM to the devices > list. > > In the future, Engine would drop this flag and specify the CDROM only > when needed. > > Please note that (3) is still correct - Engine would see the CDROM > device and its address even if it was empty when the VM started > running. > > > Comments? > > Dan. > From simon at redhat.com Wed Feb 15 12:13:59 2012 From: simon at redhat.com (Simon Grinberg) Date: Wed, 15 Feb 2012 07:13:59 -0500 (EST) Subject: Empty cdrom drive. In-Reply-To: Message-ID: <6307ac4f-31c9-4852-9eed-84ce84a7169f@zmail17.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > From: "Ayal Baron" > To: "Dan Kenigsberg" > Cc: arch at ovirt.org, "Igor Lvovsky" > Sent: Wednesday, February 15, 2012 1:43:59 PM > Subject: Re: Empty cdrom drive. > > > ----- Original Message ----- > > On Tue, Feb 14, 2012 at 10:59:22AM -0500, Igor Lvovsky wrote: > > > > > > Hi, > > > I want to discuss $subject on the email just to be sure that we > > > all > > > on the > > > same page. > > > > > > So, today in 3.0 vdsm has two ways to create VM with cdrom : > > > 1. If RHEV-M ask to create VM with cdrom, vdsm just create it > > > 2. RHEV-M doesn't ask to create VM with cdrom, vdsm still > > > creates > > > VM with > > > empty cdrom. Vdsm creates this device as 'hdc' (IDE device, > > > index 2), > > > because of libvirt restrictions. > > > In this case RHEV-M will be able to "insert" cdrom on the fly > > > with > > > changeCD request. > > > > > > In the new style API we want to get rid from stupid scenario #2, > > > because > > > we want to be able to create VM without cdrom at all. > > > > > It means, that now we need to change a little our scenarios: > > > 1. If RHEV-M ask to create VM with cdrom, vdsm just create it > > > 2. RHEV-M doesn't want to create VM with cdrom, but it want to > > > be > > > able to > > > "insert" cdrom on the fly after this. Here we have two > > > options: > > > a. RHEV-M should to pass empty cdrom device on VM creation > > > and > > > use > > > regular changeCD after that > > > b. RHEV-M can create VM without cdrom and add cdrom later > > > through > > > hotplugDisk command. > > > > Let's leave hotpluggin for a later discussion. Currently I am > > worried > > about backward and forward compatibility. > > > > 1. Currently, all VMs created by ovirt-Engine has an IDE cdrom > > device. This > > behavior should be maintained when Engine is upgraded, to minimize > > surprises to guests. > > > > 2. In the new "devices" API introduced by Igor, Engine is > > responsible > > to > > know about all guest devices and their addresses. > > > > 1+2=3. 
Engine has to be aware of the fact that even if it did not > > explicitly request for a cdrom, such a device exist. > > > > 4. Vdsm would very much prefer that Engine explictly request that > > an > > empty cdrom device is included. This would allow us to start VMs > > with > > no > > cdrom device at all in the future. > > > > I understand that this may be a complex feat for Engine, as it > > requires > > a complex upgrade path to the VM data base. To be done correctly, > > it > > requires a compatible change to the ovirt API, too. Why? VM DB will probably change anyhow with more fields added between versions so upgrade scripts should handle that. In any case if VDSM will report all devices it implies that DB change is a must (at least for the running config tables). P.S. Not sure that change CD should only apply to the running config. ATM the engine does not 'remember' the inserted CDs, thus a user that wants to change CD from the menu has no idea which CD is currently inserted. Having VDSM report devices and placing it in the running conf will solve that issue but what about sustaining after power on/off. For physical machines the last inserted CD is the CD available after power on. In the current implementation it's the one configured in the VM properties. What happens if it's a highly available VM that a user has changed the CD and now has crashed and restarted? The user will either loose his inserted CD or worse the CD will change. > > > > 5. I suggest a hackish API that would let us solve the problem in > > stages: Engine would not have to explicitly list an empty CD. > > However, > > it would send a hack flag: hackAutoCD=True for all VM starting up. > > Why? changing the API is much more difficult than changing a few bits > of code. > All that is required from engine is to always add a cdrom if one does > not exist. > Same amount of work that would be required of vdsm to implement the > hack, but would be much more flexible. > Backward compatibility is not an issue as the API is new. > Using the old API would keep the old behaviour. +1 > > > > > If this flag is True, Vdsm would add an IDE CDROM to the devices > > list. > > > > In the future, Engine would drop this flag and specify the CDROM > > only > > when needed. > > > > Please note that (3) is still correct - Engine would see the CDROM > > device and its address even if it was empty when the VM started > > running. > > > > > > Comments? > > > > Dan. > > > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From sdake at redhat.com Wed Feb 15 15:37:49 2012 From: sdake at redhat.com (Steven Dake) Date: Wed, 15 Feb 2012 08:37:49 -0700 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: References: Message-ID: <4F3BD14D.70003@redhat.com> On 02/14/2012 11:32 PM, Ayal Baron wrote: > > > ----- Original Message ----- >>> I'm not sure I agree. >>> This entire thread assumes that the way to do this is to have the >>> engine continuously monitor all services on all (HA) guests and >>> according to varying policies reschedule VMs (services within VMs?) >> >> That's one interpretation of what I wrote, but not the only one. >> >> Pacemaker Cloud doesn't rely on a single process (like oVirt Engine) >> to >> monitor all VMs and the services in those VMs. It relies on spawning >> a >> monitor process for each logical grouping of VMs in an 'application >> group'. 
>> >> So the engine doesn't need to continuously monitor every VM and every >> service, it delegates to the Cloud Policy Engine (CPE) which in turn >> creates a daemon (DPE[1]) to monitor each application group. > > Where is the daemon spawn? on the engine or in a distributed fashion? if the latter then drools is irrelevant. if the former then it would just make things worse (scalability wise) > Ayal, CPE (cloud policy engine - responsible for starting/stopping cloud application policy engines, provides an API for third party control) runs on the same machines as the CAPE(aka DPE) (cloud application policy engine - responsible for maintaining the availability of the resources and virtual machines in one cloud application - including recovery escalation, ordering constraints, fault detection, fault isolation, instantiation of vms). This collection of software components could be collocated with the engine, or a separate machine entirely since the project provides an API to third party projects. One thing that may not be entirely clear is that there is a new DPE process for each cloud application (which could be monitor several hundreds VMs for large applications). This converts the inherent inability of any policy engine to scale to large object counts into a kernel scheduling problem and memory consumption problem (kernel.org scheduler rocks, memory is cheap). The CAPE processes could be spawned in a distributed fashion very trivially, if/when we run into scaling problems with a single node. No sense optimizing for a condition that may not be relevant. One intentional aspect of our project is focused around reliability. Our CAPE process is approximately 2kloc. Its very small code footprint is designed to be easy to "get right" vs a huge monolithic code base which increases the possible failure scenarios. As a short note about scalability, my laptop can run 1000 CAPE processes with 1% total cpu utilization (measured with top) and 5gig memory utilization (measured with free). The design's upper limit on scale is based upon a) limitations of kernel scheduling b) memory consumption of the CAPE process. Regards -steve From dfediuck at redhat.com Wed Feb 15 16:15:30 2012 From: dfediuck at redhat.com (Doron Fediuck) Date: Wed, 15 Feb 2012 18:15:30 +0200 Subject: Schedule for 21 March workshop In-Reply-To: <4F39B360.8010706@redhat.com> References: <4F39B360.8010706@redhat.com> Message-ID: <4F3BDA22.8080402@redhat.com> On 14/02/12 03:05, Karsten 'quaid' Wade wrote: > We need to figure out a schedule for the upcoming workshop. Based on > the last workshop, it shouldn't be hard to figure out what we want to > cover. > > The Red Hat, IBM, and Intel teams will have just come out of a two-day > intensive on the 19th and 20th, so will all be better prepared to deal > with the open workshop. > > The materials in the open workshop will be the same topics, but > covered in less depth. > > Here are some topic ideas I've heard so far: > * oVirt intro/overview/architecture > * oVirt architecture > * Engine deep dive > * VDSM deep dive > * Getting started with dev environment > * API/SDK/CLI > * Node > * History and reports > * Guest agent > * Engine tools > * How to interact & participate > * Open discussion > > What ideas do you have? > > What do you think must be covered? > > What do you think should be covered? > > What is safe to not cover? > > Thanks - Karsten Just a general comment; Can we run a poll in the oVirt web site with topic suggestions, and also email address for new suggestions? 
After today's #ovirt Meeting, it looks like there's a lot of interest in the workshop, but maybe not everyone are aware of this list. So we can ask for opinions / suggestions in the site. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch -- /d "All computers wait at the same speed." From dfediuck at redhat.com Wed Feb 15 16:20:04 2012 From: dfediuck at redhat.com (Doron Fediuck) Date: Wed, 15 Feb 2012 18:20:04 +0200 Subject: Schedule for 21 March workshop In-Reply-To: <4F3BDA22.8080402@redhat.com> References: <4F39B360.8010706@redhat.com> <4F3BDA22.8080402@redhat.com> Message-ID: <4F3BDB34.206@redhat.com> On 15/02/12 18:15, Doron Fediuck wrote: > On 14/02/12 03:05, Karsten 'quaid' Wade wrote: >> We need to figure out a schedule for the upcoming workshop. Based on >> the last workshop, it shouldn't be hard to figure out what we want to >> cover. >> >> The Red Hat, IBM, and Intel teams will have just come out of a two-day >> intensive on the 19th and 20th, so will all be better prepared to deal >> with the open workshop. >> >> The materials in the open workshop will be the same topics, but >> covered in less depth. >> >> Here are some topic ideas I've heard so far: >> * oVirt intro/overview/architecture >> * oVirt architecture >> * Engine deep dive >> * VDSM deep dive >> * Getting started with dev environment >> * API/SDK/CLI >> * Node >> * History and reports >> * Guest agent >> * Engine tools >> * How to interact & participate >> * Open discussion >> >> What ideas do you have? >> >> What do you think must be covered? >> >> What do you think should be covered? >> >> What is safe to not cover? >> >> Thanks - Karsten > > Just a general comment; > Can we run a poll in the oVirt web site with topic suggestions, > and also email address for new suggestions? > After today's #ovirt Meeting, it looks like there's a lot > of interest in the workshop, but maybe not everyone are > aware of this list. So we can ask for opinions / suggestions > in the site. > One more thing... As a tribute to the hosting country, may we add to the event page "oVirt workshop 2012" in Chinese? Maybe even a few more words about the project and workshop. -- /d "Email returned to sender -- insufficient voltage." From lpeer at redhat.com Wed Feb 15 16:45:47 2012 From: lpeer at redhat.com (Livnat Peer) Date: Wed, 15 Feb 2012 18:45:47 +0200 Subject: Schedule for 21 March workshop In-Reply-To: <4F3BDB34.206@redhat.com> References: <4F39B360.8010706@redhat.com> <4F3BDA22.8080402@redhat.com> <4F3BDB34.206@redhat.com> Message-ID: <4F3BE13B.1070409@redhat.com> On 15/02/12 18:20, Doron Fediuck wrote: > On 15/02/12 18:15, Doron Fediuck wrote: >> On 14/02/12 03:05, Karsten 'quaid' Wade wrote: >>> We need to figure out a schedule for the upcoming workshop. Based on >>> the last workshop, it shouldn't be hard to figure out what we want to >>> cover. >>> >>> The Red Hat, IBM, and Intel teams will have just come out of a two-day >>> intensive on the 19th and 20th, so will all be better prepared to deal >>> with the open workshop. >>> >>> The materials in the open workshop will be the same topics, but >>> covered in less depth. 
>>> >>> Here are some topic ideas I've heard so far: >>> * oVirt intro/overview/architecture >>> * oVirt architecture >>> * Engine deep dive >>> * VDSM deep dive >>> * Getting started with dev environment >>> * API/SDK/CLI >>> * Node >>> * History and reports >>> * Guest agent >>> * Engine tools >>> * How to interact & participate >>> * Open discussion >>> >>> What ideas do you have? >>> >>> What do you think must be covered? >>> >>> What do you think should be covered? >>> >>> What is safe to not cover? >>> >>> Thanks - Karsten >> >> Just a general comment; >> Can we run a poll in the oVirt web site with topic suggestions, >> and also email address for new suggestions? >> After today's #ovirt Meeting, it looks like there's a lot >> of interest in the workshop, but maybe not everyone are >> aware of this list. So we can ask for opinions / suggestions >> in the site. >> > > One more thing... > As a tribute to the hosting country, may we add to the event page > "oVirt workshop 2012" in Chinese? Maybe even a few more words > about the project and workshop. > +1, nice idea. From yzaslavs at redhat.com Wed Feb 15 16:50:33 2012 From: yzaslavs at redhat.com (Yair Zaslavsky) Date: Wed, 15 Feb 2012 18:50:33 +0200 Subject: Schedule for 21 March workshop In-Reply-To: <4F3BE13B.1070409@redhat.com> References: <4F39B360.8010706@redhat.com> <4F3BDA22.8080402@redhat.com> <4F3BDB34.206@redhat.com> <4F3BE13B.1070409@redhat.com> Message-ID: <4F3BE259.1010801@redhat.com> On 02/15/2012 06:45 PM, Livnat Peer wrote: > On 15/02/12 18:20, Doron Fediuck wrote: >> On 15/02/12 18:15, Doron Fediuck wrote: >>> On 14/02/12 03:05, Karsten 'quaid' Wade wrote: >>>> We need to figure out a schedule for the upcoming workshop. Based on >>>> the last workshop, it shouldn't be hard to figure out what we want to >>>> cover. >>>> >>>> The Red Hat, IBM, and Intel teams will have just come out of a two-day >>>> intensive on the 19th and 20th, so will all be better prepared to deal >>>> with the open workshop. >>>> >>>> The materials in the open workshop will be the same topics, but >>>> covered in less depth. >>>> >>>> Here are some topic ideas I've heard so far: >>>> * oVirt intro/overview/architecture >>>> * oVirt architecture >>>> * Engine deep dive >>>> * VDSM deep dive >>>> * Getting started with dev environment >>>> * API/SDK/CLI >>>> * Node >>>> * History and reports >>>> * Guest agent >>>> * Engine tools >>>> * How to interact & participate >>>> * Open discussion >>>> >>>> What ideas do you have? >>>> >>>> What do you think must be covered? >>>> >>>> What do you think should be covered? >>>> >>>> What is safe to not cover? >>>> >>>> Thanks - Karsten >>> >>> Just a general comment; >>> Can we run a poll in the oVirt web site with topic suggestions, >>> and also email address for new suggestions? >>> After today's #ovirt Meeting, it looks like there's a lot >>> of interest in the workshop, but maybe not everyone are >>> aware of this list. So we can ask for opinions / suggestions >>> in the site. >>> >> >> One more thing... >> As a tribute to the hosting country, may we add to the event page >> "oVirt workshop 2012" in Chinese? Maybe even a few more words >> about the project and workshop. >> > > +1, nice idea. +1 - but will that be Mandarin or also Cantonese? How it is usually taken care of? 
> > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From abaron at redhat.com Wed Feb 15 16:48:17 2012 From: abaron at redhat.com (Ayal Baron) Date: Wed, 15 Feb 2012 11:48:17 -0500 (EST) Subject: Schedule for 21 March workshop In-Reply-To: <4F39B360.8010706@redhat.com> Message-ID: ----- Original Message ----- > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > We need to figure out a schedule for the upcoming workshop. Based on > the last workshop, it shouldn't be hard to figure out what we want to > cover. > > The Red Hat, IBM, and Intel teams will have just come out of a > two-day > intensive on the 19th and 20th, so will all be better prepared to > deal > with the open workshop. > > The materials in the open workshop will be the same topics, but > covered in less depth. > > Here are some topic ideas I've heard so far: > * oVirt intro/overview/architecture > * oVirt architecture > * Engine deep dive > * VDSM deep dive > * Getting started with dev environment > * API/SDK/CLI All of the above - must be covered > * Node > * History and reports > * Guest agent > * Engine tools The above I'm not so sure in 1 day > * How to interact & participate > * Open discussion Above should be covered > > What ideas do you have? > > What do you think must be covered? > > What do you think should be covered? > > What is safe to not cover? > > Thanks - Karsten > - -- > name: Karsten 'quaid' Wade, Sr. Community Architect > team: Red Hat Community Architecture & Leadership > uri: http://communityleadershipteam.org > http://TheOpenSourceWay.org > gpg: AD0E0C41 > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iD8DBQFPObNg2ZIOBq0ODEERAqNCAJ9bJete89+tRpFWFbRV/LQCkFegpQCgohdU > V8FSmSBZXuwt1n9HzJEgbgM= > =GemC > -----END PGP SIGNATURE----- > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From dfediuck at redhat.com Wed Feb 15 16:49:43 2012 From: dfediuck at redhat.com (Doron Fediuck) Date: Wed, 15 Feb 2012 18:49:43 +0200 Subject: Schedule for 21 March workshop In-Reply-To: <4F3BE259.1010801@redhat.com> References: <4F39B360.8010706@redhat.com> <4F3BDA22.8080402@redhat.com> <4F3BDB34.206@redhat.com> <4F3BE13B.1070409@redhat.com> <4F3BE259.1010801@redhat.com> Message-ID: <4F3BE227.20901@redhat.com> On 15/02/12 18:50, Yair Zaslavsky wrote: > On 02/15/2012 06:45 PM, Livnat Peer wrote: >> On 15/02/12 18:20, Doron Fediuck wrote: >>> On 15/02/12 18:15, Doron Fediuck wrote: >>>> On 14/02/12 03:05, Karsten 'quaid' Wade wrote: >>>>> We need to figure out a schedule for the upcoming workshop. Based on >>>>> the last workshop, it shouldn't be hard to figure out what we want to >>>>> cover. >>>>> >>>>> The Red Hat, IBM, and Intel teams will have just come out of a two-day >>>>> intensive on the 19th and 20th, so will all be better prepared to deal >>>>> with the open workshop. >>>>> >>>>> The materials in the open workshop will be the same topics, but >>>>> covered in less depth. 
>>>>> >>>>> Here are some topic ideas I've heard so far: >>>>> * oVirt intro/overview/architecture >>>>> * oVirt architecture >>>>> * Engine deep dive >>>>> * VDSM deep dive >>>>> * Getting started with dev environment >>>>> * API/SDK/CLI >>>>> * Node >>>>> * History and reports >>>>> * Guest agent >>>>> * Engine tools >>>>> * How to interact & participate >>>>> * Open discussion >>>>> >>>>> What ideas do you have? >>>>> >>>>> What do you think must be covered? >>>>> >>>>> What do you think should be covered? >>>>> >>>>> What is safe to not cover? >>>>> >>>>> Thanks - Karsten >>>> >>>> Just a general comment; >>>> Can we run a poll in the oVirt web site with topic suggestions, >>>> and also email address for new suggestions? >>>> After today's #ovirt Meeting, it looks like there's a lot >>>> of interest in the workshop, but maybe not everyone are >>>> aware of this list. So we can ask for opinions / suggestions >>>> in the site. >>>> >>> >>> One more thing... >>> As a tribute to the hosting country, may we add to the event page >>> "oVirt workshop 2012" in Chinese? Maybe even a few more words >>> about the project and workshop. >>> >> >> +1, nice idea. > +1 - but will that be Mandarin or also Cantonese? How it is usually > taken care of? > Since there's a translation work going on, I'm sure there's a convention. Having this text in the site may also increase the traffic. -- /d ?Funny,? he intoned funereally, ?how just when you think life can't possibly get any worse it suddenly does.? --Douglas Adams, The Hitchhiker's Guide to the Galaxy From dfediuck at redhat.com Wed Feb 15 17:07:06 2012 From: dfediuck at redhat.com (Doron Fediuck) Date: Wed, 15 Feb 2012 19:07:06 +0200 Subject: Schedule for 21 March workshop In-Reply-To: References: Message-ID: <4F3BE63A.9020405@redhat.com> On 15/02/12 18:48, Ayal Baron wrote: > > > ----- Original Message ----- > We need to figure out a schedule for the upcoming workshop. Based on > the last workshop, it shouldn't be hard to figure out what we want to > cover. > > The Red Hat, IBM, and Intel teams will have just come out of a > two-day > intensive on the 19th and 20th, so will all be better prepared to > deal > with the open workshop. > > The materials in the open workshop will be the same topics, but > covered in less depth. > > Here are some topic ideas I've heard so far: > * oVirt intro/overview/architecture > * oVirt architecture > * Engine deep dive > * VDSM deep dive > * Getting started with dev environment > * API/SDK/CLI > >> All of the above - must be covered +1 with these exceptions: - Unsure how deep we can dive in engine and vdsm in this time frame. This would probably be a soften version of SFO workshop. - The "getting started" is more a hands-on kind of a session, so we can do it like we did in SFO, or maybe in a lab-session. The initial maven dependency fetch alone may take >30 minutes (depending on Internet connection), or we may create a local lab repo, or skip it like we did in SFO. Depending on participants feedback, which we need prior to the workshop. > > * Node Node architecture and how-to should also be covered, so people are aware of it's read-only / stateless nature. > * History and reports > * Guest agent > * Engine tools Since engine tools are being widely used (especially the configuration), I'd add them to the must list, for a short informative session. > >> The above I'm not so sure in 1 day > > * How to interact & participate > * Open discussion > >> Above should be covered +1 > > > What ideas do you have? 
> A feedback session- both from development POV, and from usability POV. We may get some fresh ideas there. > What do you think must be covered? > > What do you think should be covered? > > What is safe to not cover? > > Thanks - Karsten -- /d "All computers wait at the same speed." From Caitlin.Bestler at nexenta.com Wed Feb 15 17:08:35 2012 From: Caitlin.Bestler at nexenta.com (Caitlin Bestler) Date: Wed, 15 Feb 2012 17:08:35 +0000 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3B0D49.2040904@redhat.com> References: <4F33FA64.6000101@redhat.com> <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> <4F3B0D49.2040904@redhat.com> Message-ID: <719CD19D2B2BFA4CB1B3F00D2A8CDCD04148B3F2@AUSP01DAG0101.collaborationhost.net> Perry Myers wrote: >> As long as you expect the VM to enforce reliability on the raw storage >> devices then you are going to have problems with restarting HA VMs. If >> you switch your thinking to making the storage operations HA, then all >> you need is a response cache. >> >> A restarted VM replays the operation, and the cached response is >> retransmitted (or the operation is benignly re-applied). Without >> defining the operations so that they can be benignly re-applied or >> adding a response cache you will always be able to come up with some >> order of failure that won't work. There is no cost-effective way to >> guarantee that you snapshot the VM only when there is no in-flight >> storage activity. > How is this any different than a bare metal host crashing while writes are > in flight either to a local disk or FC disk? When something crashes (be it > physical or virtual) you're always going to lose some data that was in flight > but not committed to disk (network has same issue). It's up to individual > applications to be resilient to this. Don't think of a storage write as being a write to a device. It is a request to a service made in the context of a session. The session protocol includes the necessary logic to complete the transaction even when a TCP connection is broken. Examples of this include multi-connection iSCSI and NFSv4. Both of which can be used to back a virtual disk. When a VM is migrated you break the connections by it or were made on its behalf. The pre-existing session logic will make in-progress operations retry until they are successful. The key is thinking of block storage as a service, rather than as a device. From pmyers at redhat.com Thu Feb 16 02:52:19 2012 From: pmyers at redhat.com (Perry Myers) Date: Wed, 15 Feb 2012 21:52:19 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3B56F2.5050101@redhat.com> References: <4F33FA64.6000101@redhat.com> <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> <4F3B0D49.2040904@redhat.com> <4F3B56F2.5050101@redhat.com> Message-ID: <4F3C6F63.9050208@redhat.com> On 02/15/2012 01:55 AM, Itamar Heim wrote: > On 02/15/2012 03:41 AM, Perry Myers wrote: >>> As long as you expect the VM to enforce reliability on the raw >>> storage devices then you are going to have problems with restarting >>> HA VMs. If you switch your thinking to making the storage operations >>> HA, then all you need is a response cache. >>> >>> A restarted VM replays the operation, and the cached response is >>> retransmitted (or the operation is benignly re-applied). 
Without >>> defining the operations so that they can be benignly re-applied or >>> adding a response cache you will always be able to come up with some >>> order of failure that won't work. There is no cost-effective way to >>> guarantee that you snapshot the VM only when there is no in-flight >>> storage activity. >> >> How is this any different than a bare metal host crashing while writes >> are in flight either to a local disk or FC disk? When something crashes >> (be it physical or virtual) you're always going to lose some data that >> was in flight but not committed to disk (network has same issue). It's >> up to individual applications to be resilient to this. >> >> I think this issue is somewhat orthogonal to simply providing reduced >> MTTR by restarting failed services or VMs. > > don't you fence the other node first to make sure it won't write after > you started another one? yes > here we are talking about moving the VM, without fencing the host. Ok. I don't see how that's possible... If you don't fence the other host (either by cutting off I/O via sanlock, SCSI reservations or power fencing) then you always run the risk of both VMs accessing shared storage at the same time from two different hosts leading to data corruption. From pmyers at redhat.com Thu Feb 16 02:55:01 2012 From: pmyers at redhat.com (Perry Myers) Date: Wed, 15 Feb 2012 21:55:01 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <719CD19D2B2BFA4CB1B3F00D2A8CDCD04148B3F2@AUSP01DAG0101.collaborationhost.net> References: <4F33FA64.6000101@redhat.com> <719CD19D2B2BFA4CB1B3F00D2A8CDCD0011E427D@AUSP01DAG0103.collaborationhost.net> <4F3B0D49.2040904@redhat.com> <719CD19D2B2BFA4CB1B3F00D2A8CDCD04148B3F2@AUSP01DAG0101.collaborationhost.net> Message-ID: <4F3C7005.4020904@redhat.com> On 02/15/2012 12:08 PM, Caitlin Bestler wrote: > Perry Myers wrote: > >>> As long as you expect the VM to enforce reliability on the raw storage >>> devices then you are going to have problems with restarting HA VMs. If >>> you switch your thinking to making the storage operations HA, then all >>> you need is a response cache. >>> >>> A restarted VM replays the operation, and the cached response is >>> retransmitted (or the operation is benignly re-applied). Without >>> defining the operations so that they can be benignly re-applied or >>> adding a response cache you will always be able to come up with some >>> order of failure that won't work. There is no cost-effective way to >>> guarantee that you snapshot the VM only when there is no in-flight >>> storage activity. > >> How is this any different than a bare metal host crashing while writes are >> in flight either to a local disk or FC disk? When something crashes (be it >> physical or virtual) you're always going to lose some data that was in flight >> but not committed to disk (network has same issue). It's up to individual >> applications to be resilient to this. > > Don't think of a storage write as being a write to a device. It is a request to > a service made in the context of a session. The session protocol includes the > necessary logic to complete the transaction even when a TCP connection is > broken. Examples of this include multi-connection iSCSI and NFSv4. Both of > which can be used to back a virtual disk. > > When a VM is migrated you break the connections by it or were made on its > behalf. The pre-existing session logic will make in-progress operations retry > until they are successful. 
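
(A tiny sketch of the response-cache idea referred to above, with made-up names and no tie to any real oVirt, iSCSI or NFS code: if each storage request carries a client-chosen request id, a request replayed after a reconnect or VM restart is answered from the cache instead of being applied twice.)

    # Minimal idempotent request handling: responses are cached by request id,
    # so a replayed request returns the stored result rather than re-executing.
    class StorageService:
        def __init__(self):
            self._responses = {}

        def handle(self, request_id, operation, *args):
            if request_id in self._responses:   # replay after reconnect/restart
                return self._responses[request_id]
            result = operation(*args)            # applied exactly once
            self._responses[request_id] = result
            return result

    def write_block(block, data):
        return "wrote %d bytes to block %d" % (len(data), block)

    if __name__ == "__main__":
        svc = StorageService()
        print(svc.handle("req-42", write_block, 7, b"hello"))  # executes the write
        print(svc.handle("req-42", write_block, 7, b"hello"))  # served from the cache
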
> > The key is thinking of block storage as a service, rather than as a device. Ok, that is clearer. I can see how this would relate to providing better data integrity in the face of hardware/software faults (at the expense of performance), but it doesn't replace the need for monitoring/remediation of failed hosts/VMs/services. So this is something that would be used in conjunction with a traditional HA solution, not in replace of. Perry From pmyers at redhat.com Thu Feb 16 03:01:16 2012 From: pmyers at redhat.com (Perry Myers) Date: Wed, 15 Feb 2012 22:01:16 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3B932C.9030602@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> Message-ID: <4F3C717C.3050708@redhat.com> >>> HA is a simple use case of policy. >> >> *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust. > > I think that the main concept of what Perry suggested (leaving the > implementation details aside :)) is to add HA of services. That's it in a nutshell :) > I like this idea and I would like to extend it a little bit. > How about services that are spread on more than a single VM. > I would like to be able to define a service and specify which VM/s > provides this service and add HA flag on the service. That is in line with what I was proposing. There are two ways you can do service HA... * Take a set of OSes (guests) and a set of services. Each service can run on any of the guests. Therefore services can be failed over from one live guest to another. This is effectively how the Pacemaker HA stack works on both bare metal and virtual clusters * Take a set of OSes (guests) and on each guest place a specific set of services. Services can be restarted if they fail on a specific guest, but if a guest/host fails, rather than failing over the service to another live running guest, instead the entire guest responsible for that service is restarted. The recovery time is slightly longer in this case because recovery involves restarting a VM instead of just starting a service on another running VM. But the positive here is that the configuration and policies are not as complex, and since VMs typically can start fairly quickly the failover time is still adequate for most users Both models work. Pacemaker HA uses the first model, Pacemaker Cloud uses the second, but over time could be adapted to include the 1st. > Then i would like to manage policies around it - I define a service > with 3 VMs providing this service and I want to have at least 2 VM > running it at any given time. (now the VMs are not highly available only > the service is.) Yep. This is in line with use case #1 above. Perry From oschreib at redhat.com Thu Feb 16 13:15:54 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Thu, 16 Feb 2012 08:15:54 -0500 (EST) Subject: Release process proposal In-Reply-To: <8d76cb0a-3c3e-4cea-a046-3d5ca7cfee71@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Since we currently doesn't have any official Release process, here's my proposal: 1. oVirt will issue a new release every 6 months. a. EXCEPTION: First three releases will be issued in a ~3 month interval. b. Exact dates must be set for each release. 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. a. 
Release criteria will include MUST items and SHOULD items (held in wiki) + MUST items will DELAY the release in any case. + SHOULD items will include less critical flows and new features. + SHOULD items will be handled as "best-effort" by component owners b. Component owners (e.g. Node, engine-core, vdsm) must ACK the criteria suggested. 3. OPTIONAL: Discuss the new version number according to the release criteria/amount of features. a. OR BETTER: Increase MAJOR version every second release b. Versions will be handled by each component. c. The general oVirt version will be the engine version. 5. 60 Days before release - Feature freeze a. All component owners must create a new versioned branch b. "Beta" version should be supplied immediately after. + And on a nightly basis afterwards. c. Stabilization efforts should start on the new builds. d. Cherry-pick fixes for important issues only. + Zero/Minimal changes to user interface. + Inform in advance on any user interface change. e. At this stage, we should start working on the release notes. 6. 30 days before release - release candidate a. If no serious issues are found the last release candidate automatically becomes the final release. b. Release manager will create a wiki with list of release blockers c. Only release blockers should be fixed in this stage. 7. Create a new RC if needed a. There must be at least one week between the last release candidate and the final release b. Go/No go meetings will happen once a week in this stage. + Increase the number of meetings according to the release manager's decision. + Release manager will inform the community on any delay. 8. Release a. Create ANNOUNCE message a few days before the actual release. b. PARTY Have any comments? ideas? share them with the list! Thanks, Ofer Schreiber. From lpeer at redhat.com Thu Feb 16 13:33:44 2012 From: lpeer at redhat.com (Livnat Peer) Date: Thu, 16 Feb 2012 15:33:44 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3C717C.3050708@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> Message-ID: <4F3D05B8.10302@redhat.com> On 16/02/12 05:01, Perry Myers wrote: >>>> HA is a simple use case of policy. >>> >>> *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust. >> >> I think that the main concept of what Perry suggested (leaving the >> implementation details aside :)) is to add HA of services. > > That's it in a nutshell :) > >> I like this idea and I would like to extend it a little bit. >> How about services that are spread on more than a single VM. >> I would like to be able to define a service and specify which VM/s >> provides this service and add HA flag on the service. > > That is in line with what I was proposing. There are two ways you can > do service HA... > > * Take a set of OSes (guests) and a set of services. Each service can > run on any of the guests. Therefore services can be failed over from > one live guest to another. This is effectively how the Pacemaker HA > stack works on both bare metal and virtual clusters > > * Take a set of OSes (guests) and on each guest place a specific set of > services. Services can be restarted if they fail on a specific guest, > but if a guest/host fails, rather than failing over the service to > another live running guest, instead the entire guest responsible for > that service is restarted. 
The recovery time is slightly longer in this > case because recovery involves restarting a VM instead of just starting > a service on another running VM. But the positive here is that the > configuration and policies are not as complex, and since VMs typically > can start fairly quickly the failover time is still adequate for most users > Can a service be spread on more than one VM? For example if I have enterprise application that requires application server (AS) and a data base (DB), the AS and DB can not live in the same guest because of different access restrictions (based on real use case). The service availability is dependent on both guests being active, and an optimization is to run both of them on the same host. > Both models work. Pacemaker HA uses the first model, Pacemaker Cloud > uses the second, but over time could be adapted to include the 1st. > >> Then i would like to manage policies around it - I define a service >> with 3 VMs providing this service and I want to have at least 2 VM >> running it at any given time. (now the VMs are not highly available only >> the service is.) > > Yep. This is in line with use case #1 above. > > Perry From pmyers at redhat.com Thu Feb 16 13:37:49 2012 From: pmyers at redhat.com (Perry Myers) Date: Thu, 16 Feb 2012 08:37:49 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3D05B8.10302@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> Message-ID: <4F3D06AD.60805@redhat.com> > Can a service be spread on more than one VM? > For example if I have enterprise application that requires application > server (AS) and a data base (DB), the AS and DB can not live in the same > guest because of different access restrictions (based on real use case). > The service availability is dependent on both guests being active, and > an optimization is to run both of them on the same host. Yep. That should all be possible to specify with the correct policy in Pacemaker Cloud, and with a bare metal Pacemaker cluster this sort of deployment is very straightforward. Even the colocation or anti-colocation policy to keep the two guests on the same host or separate hosts[1] should be possible. Perry [1] as long as the underlying cloud itself supports the ability to instruct on colocation/anti-colocation... Right now my understanding is that there is no easy way to specify this in oVirt and certainly in EC2 there's no way to specify this unless you were to put each VM in a different Region From mburns at redhat.com Thu Feb 16 13:45:12 2012 From: mburns at redhat.com (Mike Burns) Date: Thu, 16 Feb 2012 08:45:12 -0500 Subject: Release process proposal In-Reply-To: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> References: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <1329399912.9399.42.camel@beelzebub.mburnsfire.net> On Thu, 2012-02-16 at 08:15 -0500, Ofer Schreiber wrote: > Since we currently doesn't have any official Release process, here's my proposal: > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month interval. > b. Exact dates must be set for each release. decided week after previous release? > > 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. > a. 
Release criteria will include MUST items and SHOULD items (held in wiki) > + MUST items will DELAY the release in any case. > + SHOULD items will include less critical flows and new features. > + SHOULD items will be handled as "best-effort" by component owners > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the criteria suggested. > > 3. OPTIONAL: Discuses the new version number according to the release criteria/amount of features. > a. OR BETTER: Increase MAJOR version every second release > b. Versions will be handled by each component. > c. The general oVirt version will the engine version. Seems case-by-case is better for this. Probably depends what changes are made and how big they are. > > 5. 60 Days before release - Feature freeze I assume 30 days (maybe 45?) for the first couple releases since they're only 3 month intervals... > a. All component owners must create a new versioned branch > b. "Beta" version should be supplied immediately after. > + And on a nightly basis afterwards. that's huge overhead on each component owner/builder unless we get some process of delivering these from jenkins for each component. > c. Stabilization efforts should start on the new builds. > d. Cherry-pick fixes for important issues only. > + Zero/Minimal changes to user interface. > + Inform in advance on any user interface change. > e. At this stage, we should start working on the release notes. > > 6. 30 days before release - release candidate 15 days for first few releases? 20? > a. If no serious issues are found the last release candidate automatically becomes the final release. > b. Release manager will create a wiki with list of release blockers > c. Only release blockers should be fixed in this stage. > > 7. Create a new RC if needed > a. There must be at least one week between the last release candidate and the final release > b. Go/No go meetings will happen once a week in this stage. > + Increase the amount of meeting according to the release manager decision. > + Release manager will inform the community on any delay. > > 8. Release > a. Create ANNOUNCE message few days before actual release. > b. PARTY > > Have any comments? ideas? share them with the list! Should there be a step for RPM and binary signing? > > Thanks, > Ofer Schreiber. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From apevec at gmail.com Thu Feb 16 13:54:03 2012 From: apevec at gmail.com (Alan Pevec) Date: Thu, 16 Feb 2012 14:54:03 +0100 Subject: Release process proposal In-Reply-To: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> References: <8d76cb0a-3c3e-4cea-a046-3d5ca7cfee71@zmail14.collab.prod.int.phx2.redhat.com> <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: On Thu, Feb 16, 2012 at 2:15 PM, Ofer Schreiber wrote: > Since we currently doesn't have any official Release process, here's my proposal: > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month interval. ... > 5. 60 Days before release - Feature freeze For first three, 30 days before release probably makes sense. > ?a. All component owners must create a new versioned branch > ?b. "Beta" version should be supplied immediately after. > ? ?+ And on a nightly basis afterwards. > ?c. Stabilization efforts should start on the new builds. > ?d. Cherry-pick fixes for important issues only. 
To clarify, all patches go to the trunk first and get cherry-picked to versioned branched. > ? ?+ Zero/Minimal changes to user interface. > ? ?+ Inform in advance on any user interface change. Also any API or changes which affect other components, e.g. vdsm/ovirt-node interactions got broken in the past > ?e. At this stage, we should start working on the release notes. > > 6. 30 days before release - release candidate 14 days/2 weeks for first three 3-months releases > ?a. If no serious issues are found the last release candidate automatically becomes the final release. That means, "rcN" will not be included in the version-release string? What would be release tarball named? I see current tarball for engine includes sequence after version http://www.ovirt.org/releases/stable/src/ but node and sdk are just name-version - would each RC be simply bump in version major.minor.micro ? Should we maybe unify naming across all projects? > ?b. Release manager will create a wiki with list of release blockers > ?c. Only release blockers should be fixed in this stage. > > 7. Create a new RC if needed > ?a. There must be at least one week between the last release candidate and the final release > ?b. Go/No go meetings will happen once a week in this stage. > ? ?+ Increase the amount of meeting according to the release manager decision. > ? ?+ Release manager will inform the community on any delay. > > 8. Release > ?a. Create ANNOUNCE message few days before actual release. > ?b. PARTY YES! But wait, before that, let's document also what needs to be uploaded and where :) Cheers, Alan From ykaul at redhat.com Thu Feb 16 14:02:36 2012 From: ykaul at redhat.com (Yaniv Kaul) Date: Thu, 16 Feb 2012 16:02:36 +0200 Subject: Release process proposal In-Reply-To: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> References: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <4F3D0C7C.6020502@redhat.com> On 02/16/2012 03:15 PM, Ofer Schreiber wrote: > Since we currently doesn't have any official Release process, here's my proposal: > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month interval. > b. Exact dates must be set for each release. > > 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. discussed is fine, what is the ETA of a decision? > a. Release criteria will include MUST items and SHOULD items (held in wiki) > + MUST items will DELAY the release in any case. > + SHOULD items will include less critical flows and new features. So (major) new features are decided and agreed upon (and documented) on the release criteria? > + SHOULD items will be handled as "best-effort" by component owners > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the criteria suggested. What about general items? For example: no data corruption bugs, no known security issues, etc.? More interestingly, what about the ugly, hairy bugs, which are not blockers, just gooey? > > 3. OPTIONAL: Discuses the new version number according to the release criteria/amount of features. > a. OR BETTER: Increase MAJOR version every second release > b. Versions will be handled by each component. > c. The general oVirt version will the engine version. > > 5. 60 Days before release - Feature freeze > a. All component owners must create a new versioned branch > b. "Beta" version should be supplied immediately after. > + And on a nightly basis afterwards. > c. 
Stabilization efforts should start on the new builds. > d. Cherry-pick fixes for important issues only. What is 'important' ? > + Zero/Minimal changes to user interface. Unless the UI sucks, of course. If we get feedback it's wrong, we should change. Been known to happen before. Also, what about API changes? > + Inform in advance on any user interface change. > e. At this stage, we should start working on the release notes. > > 6. 30 days before release - release candidate > a. If no serious issues are found the last release candidate automatically becomes the final release. Define 'serious'. > b. Release manager will create a wiki with list of release blockers > c. Only release blockers should be fixed in this stage. > > 7. Create a new RC if needed > a. There must be at least one week between the last release candidate and the final release > b. Go/No go meetings will happen once a week in this stage. > + Increase the amount of meeting according to the release manager decision. > + Release manager will inform the community on any delay. > > 8. Release Do we expect to release all components together? For example, what if we found a blocker in component X, and component Y is fine and dandy? it'll just wait? Y. > a. Create ANNOUNCE message few days before actual release. > b. PARTY > > Have any comments? ideas? share them with the list! > > Thanks, > Ofer Schreiber. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From jchoate at redhat.com Thu Feb 16 14:21:36 2012 From: jchoate at redhat.com (Jon Choate) Date: Thu, 16 Feb 2012 09:21:36 -0500 (EST) Subject: Release process proposal In-Reply-To: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <46dbf30a-e7dd-44ac-b22a-606cc34e2658@zmail12.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > From: "Ofer Schreiber" > To: arch at ovirt.org > Sent: Thursday, February 16, 2012 8:15:54 AM > Subject: Release process proposal > > Since we currently doesn't have any official Release process, here's > my proposal: > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month > interval. > b. Exact dates must be set for each release. > > 2. A week after the n-1 release is out, a release criteria for the > new release should be discussed. > a. Release criteria will include MUST items and SHOULD items (held > in wiki) > + MUST items will DELAY the release in any case. > + SHOULD items will include less critical flows and new features. > + SHOULD items will be handled as "best-effort" by component > owners > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the > criteria suggested. > > 3. OPTIONAL: Discuses the new version number according to the release > criteria/amount of features. > a. OR BETTER: Increase MAJOR version every second release This seems very arbitrary. If we do this, what does a major or minor version signify? Just the passage of time? > b. Versions will be handled by each component. > c. The general oVirt version will the engine version. > > 5. 60 Days before release - Feature freeze So we estimate that 1/3 of our development efforts need to go into bug fixing and stabilization? That seems REALLY high. If it takes two months to stabilize a release of oVirt I think we need to take a look at our code quality and development processes. > a. All component owners must create a new versioned branch > b. 
"Beta" version should be supplied immediately after. > + And on a nightly basis afterwards. > c. Stabilization efforts should start on the new builds. > d. Cherry-pick fixes for important issues only. > + Zero/Minimal changes to user interface. > + Inform in advance on any user interface change. > e. At this stage, we should start working on the release notes. Shouldn't we be working on the release notes throughout the development cycle? > > 6. 30 days before release - release candidate > a. If no serious issues are found the last release candidate > automatically becomes the final release. So there will not be the traditional vote requiring 3 acks to approve a release? > b. Release manager will create a wiki with list of release blockers > c. Only release blockers should be fixed in this stage. > > 7. Create a new RC if needed > a. There must be at least one week between the last release > candidate and the final release > b. Go/No go meetings will happen once a week in this stage. > + Increase the amount of meeting according to the release manager > decision. > + Release manager will inform the community on any delay. > > 8. Release > a. Create ANNOUNCE message few days before actual release. a1. Encourage everyone to blog / tweet about the release > b. PARTY > > Have any comments? ideas? share them with the list! > > Thanks, > Ofer Schreiber. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From sdake at redhat.com Thu Feb 16 14:23:21 2012 From: sdake at redhat.com (Steven Dake) Date: Thu, 16 Feb 2012 07:23:21 -0700 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3D05B8.10302@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> Message-ID: <4F3D1159.4090302@redhat.com> On 02/16/2012 06:33 AM, Livnat Peer wrote: > On 16/02/12 05:01, Perry Myers wrote: >>>>> HA is a simple use case of policy. >>>> >>>> *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust. >>> >>> I think that the main concept of what Perry suggested (leaving the >>> implementation details aside :)) is to add HA of services. >> >> That's it in a nutshell :) >> >>> I like this idea and I would like to extend it a little bit. >>> How about services that are spread on more than a single VM. >>> I would like to be able to define a service and specify which VM/s >>> provides this service and add HA flag on the service. >> >> That is in line with what I was proposing. There are two ways you can >> do service HA... >> >> * Take a set of OSes (guests) and a set of services. Each service can >> run on any of the guests. Therefore services can be failed over from >> one live guest to another. This is effectively how the Pacemaker HA >> stack works on both bare metal and virtual clusters >> >> * Take a set of OSes (guests) and on each guest place a specific set of >> services. Services can be restarted if they fail on a specific guest, >> but if a guest/host fails, rather than failing over the service to >> another live running guest, instead the entire guest responsible for >> that service is restarted. The recovery time is slightly longer in this >> case because recovery involves restarting a VM instead of just starting >> a service on another running VM. 
But the positive here is that the >> configuration and policies are not as complex, and since VMs typically >> can start fairly quickly the failover time is still adequate for most users >> > > Can a service be spread on more than one VM? > For example if I have enterprise application that requires application > server (AS) and a data base (DB), the AS and DB can not live in the same > guest because of different access restrictions (based on real use case). > The service availability is dependent on both guests being active, and > an optimization is to run both of them on the same host. > > Absolutely. In this case the Cloud Application is the combination of thw two separate VM components (database VM and AS VM). A CAPE (cloud application policy engine) maintains the HA state of both VMs including correcting for resource (db,as) or vm failures, and ensuring ordering constraints even during recovery (the AS would start after the DB in this model). Our target scaling atm is 10 VMs per CAPE with 36 resources per VM. These are arbitrary - we could likely go an order of magnitude beyond. Note that is *per cape* - where the cape process limit is limited by memory and scheduling capabilities of the system. Regards -steve > >> Both models work. Pacemaker HA uses the first model, Pacemaker Cloud >> uses the second, but over time could be adapted to include the 1st. >> >>> Then i would like to manage policies around it - I define a service >>> with 3 VMs providing this service and I want to have at least 2 VM >>> running it at any given time. (now the VMs are not highly available only >>> the service is.) >> >> Yep. This is in line with use case #1 above. >> >> Perry > > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From oschreib at redhat.com Thu Feb 16 14:29:35 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Thu, 16 Feb 2012 09:29:35 -0500 (EST) Subject: Release process proposal In-Reply-To: Message-ID: <11795cc7-b723-4c6f-951a-15061f358fb6@zmail14.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > On Thu, Feb 16, 2012 at 2:15 PM, Ofer Schreiber > wrote: > > Since we currently doesn't have any official Release process, > > here's my proposal: > > > > 1. oVirt will issue a new release every 6 months. > > a. EXCEPTION: First three releases will be issued in a ~3 month > > interval. > ... > > 5. 60 Days before release - Feature freeze > > For first three, 30 days before release probably makes sense. > > > ?a. All component owners must create a new versioned branch > > ?b. "Beta" version should be supplied immediately after. > > ? ?+ And on a nightly basis afterwards. > > ?c. Stabilization efforts should start on the new builds. > > ?d. Cherry-pick fixes for important issues only. > > To clarify, all patches go to the trunk first and get cherry-picked > to > versioned branched. > > > ? ?+ Zero/Minimal changes to user interface. > > ? ?+ Inform in advance on any user interface change. > > Also any API or changes which affect other components, e.g. > vdsm/ovirt-node interactions got broken in the past > > > ?e. At this stage, we should start working on the release notes. > > > > 6. 30 days before release - release candidate > > 14 days/2 weeks for first three 3-months releases > > > ?a. If no serious issues are found the last release candidate > > ?automatically becomes the final release. > > That means, "rcN" will not be included in the version-release string? 
> What would be release tarball named? > I see current tarball for engine includes sequence after version > http://www.ovirt.org/releases/stable/src/ > but node and sdk are just name-version - would each RC be simply bump > in version major.minor.micro ? > Should we maybe unify naming across all projects? Well, we can rebuild the binaries with on the same git hash, just without the RCx string about the engine - the _0001 is part of the version, hope to remove this next build. I don't care so much about naming conventions. maybe enforce the "Beta" and "RC" strings when needed. > > > ?b. Release manager will create a wiki with list of release > > ?blockers > > ?c. Only release blockers should be fixed in this stage. > > > > 7. Create a new RC if needed > > ?a. There must be at least one week between the last release > > ?candidate and the final release > > ?b. Go/No go meetings will happen once a week in this stage. > > ? ?+ Increase the amount of meeting according to the release > > ? ?manager decision. > > ? ?+ Release manager will inform the community on any delay. > > > > 8. Release > > ?a. Create ANNOUNCE message few days before actual release. > > ?b. PARTY > > YES! > But wait, before that, let's document also what needs to be uploaded > and where :) > > Cheers, > Alan > From ryanh at us.ibm.com Thu Feb 16 15:41:09 2012 From: ryanh at us.ibm.com (Ryan Harper) Date: Thu, 16 Feb 2012 09:41:09 -0600 Subject: Release process proposal In-Reply-To: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> References: <8d76cb0a-3c3e-4cea-a046-3d5ca7cfee71@zmail14.collab.prod.int.phx2.redhat.com> <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: <20120216154109.GL12402@us.ibm.com> * Ofer Schreiber [2012-02-16 09:30]: > Since we currently doesn't have any official Release process, here's my proposal: > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month interval. > b. Exact dates must be set for each release. > > 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. > a. Release criteria will include MUST items and SHOULD items (held in wiki) > + MUST items will DELAY the release in any case. Maybe there might be some review during development phase to determine if the MUST items are still MUST? Thinking back to the issues which delayed the last release a bit; were all of those MUST? and if not, then we may find that we have other items that aren't MUST, and we choose to delay release. > + SHOULD items will include less critical flows and new features. > + SHOULD items will be handled as "best-effort" by component owners > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the criteria suggested. > > 3. OPTIONAL: Discuses the new version number according to the release criteria/amount of features. > a. OR BETTER: Increase MAJOR version every second release > b. Versions will be handled by each component. > c. The general oVirt version will the engine version. > > 5. 60 Days before release - Feature freeze > a. All component owners must create a new versioned branch > b. "Beta" version should be supplied immediately after. > + And on a nightly basis afterwards. > c. Stabilization efforts should start on the new builds. > d. Cherry-pick fixes for important issues only. > + Zero/Minimal changes to user interface. > + Inform in advance on any user interface change. > e. 
At this stage, we should start working on the release notes. > > 6. 30 days before release - release candidate > a. If no serious issues are found the last release candidate automatically becomes the final release. > b. Release manager will create a wiki with list of release blockers > c. Only release blockers should be fixed in this stage. > > 7. Create a new RC if needed > a. There must be at least one week between the last release candidate and the final release > b. Go/No go meetings will happen once a week in this stage. > + Increase the amount of meeting according to the release manager decision. > + Release manager will inform the community on any delay. > > 8. Release > a. Create ANNOUNCE message few days before actual release. > b. PARTY > > Have any comments? ideas? share them with the list! > > Thanks, > Ofer Schreiber. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx ryanh at us.ibm.com From abaron at redhat.com Thu Feb 16 16:14:57 2012 From: abaron at redhat.com (Ayal Baron) Date: Thu, 16 Feb 2012 11:14:57 -0500 (EST) Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3BD14D.70003@redhat.com> Message-ID: <0a3c8fd9-5b51-4e8f-81cf-fdaa3daf7cec@zmail13.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > On 02/14/2012 11:32 PM, Ayal Baron wrote: > > > > > > ----- Original Message ----- > >>> I'm not sure I agree. > >>> This entire thread assumes that the way to do this is to have the > >>> engine continuously monitor all services on all (HA) guests and > >>> according to varying policies reschedule VMs (services within > >>> VMs?) > >> > >> That's one interpretation of what I wrote, but not the only one. > >> > >> Pacemaker Cloud doesn't rely on a single process (like oVirt > >> Engine) > >> to > >> monitor all VMs and the services in those VMs. It relies on > >> spawning > >> a > >> monitor process for each logical grouping of VMs in an > >> 'application > >> group'. > >> > >> So the engine doesn't need to continuously monitor every VM and > >> every > >> service, it delegates to the Cloud Policy Engine (CPE) which in > >> turn > >> creates a daemon (DPE[1]) to monitor each application group. > > > > Where is the daemon spawn? on the engine or in a distributed > > fashion? if the latter then drools is irrelevant. if the former > > then it would just make things worse (scalability wise) > > > > Ayal, > > CPE (cloud policy engine - responsible for starting/stopping cloud > application policy engines, provides an API for third party control) > runs on the same machines as the CAPE(aka DPE) (cloud application > policy > engine - responsible for maintaining the availability of the > resources > and virtual machines in one cloud application - including recovery > escalation, ordering constraints, fault detection, fault isolation, > instantiation of vms). This collection of software components could > be > collocated with the engine, or a separate machine entirely since the > project provides an API to third party projects. > > One thing that may not be entirely clear is that there is a new DPE > process for each cloud application (which could be monitor several > hundreds VMs for large applications). 
This converts the inherent > inability of any policy engine to scale to large object counts into a > kernel scheduling problem and memory consumption problem (kernel.org > scheduler rocks, memory is cheap). > > The CAPE processes could be spawned in a distributed fashion very > trivially, if/when we run into scaling problems with a single node. > No > sense optimizing for a condition that may not be relevant. > > One intentional aspect of our project is focused around reliability. > Our CAPE process is approximately 2kloc. Its very small code > footprint > is designed to be easy to "get right" vs a huge monolithic code base > which increases the possible failure scenarios. > > As a short note about scalability, my laptop can run 1000 CAPE > processes > with 1% total cpu utilization (measured with top) and 5gig memory > utilization (measured with free). The design's upper limit on scale > is > based upon a) limitations of kernel scheduling b) memory consumption > of > the CAPE process. But they all schedule the services to run on the same set of resources (hosts / memory / cpu), how do you coordinate? > > Regards > -steve > From sdake at redhat.com Fri Feb 17 00:29:14 2012 From: sdake at redhat.com (Steven Dake) Date: Thu, 16 Feb 2012 17:29:14 -0700 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <0a3c8fd9-5b51-4e8f-81cf-fdaa3daf7cec@zmail13.collab.prod.int.phx2.redhat.com> References: <0a3c8fd9-5b51-4e8f-81cf-fdaa3daf7cec@zmail13.collab.prod.int.phx2.redhat.com> Message-ID: <4F3D9F5A.6020604@redhat.com> On 02/16/2012 09:14 AM, Ayal Baron wrote: > > > ----- Original Message ----- >> On 02/14/2012 11:32 PM, Ayal Baron wrote: >>> >>> >>> ----- Original Message ----- >>>>> I'm not sure I agree. >>>>> This entire thread assumes that the way to do this is to have the >>>>> engine continuously monitor all services on all (HA) guests and >>>>> according to varying policies reschedule VMs (services within >>>>> VMs?) >>>> >>>> That's one interpretation of what I wrote, but not the only one. >>>> >>>> Pacemaker Cloud doesn't rely on a single process (like oVirt >>>> Engine) >>>> to >>>> monitor all VMs and the services in those VMs. It relies on >>>> spawning >>>> a >>>> monitor process for each logical grouping of VMs in an >>>> 'application >>>> group'. >>>> >>>> So the engine doesn't need to continuously monitor every VM and >>>> every >>>> service, it delegates to the Cloud Policy Engine (CPE) which in >>>> turn >>>> creates a daemon (DPE[1]) to monitor each application group. >>> >>> Where is the daemon spawn? on the engine or in a distributed >>> fashion? if the latter then drools is irrelevant. if the former >>> then it would just make things worse (scalability wise) >>> >> >> Ayal, >> >> CPE (cloud policy engine - responsible for starting/stopping cloud >> application policy engines, provides an API for third party control) >> runs on the same machines as the CAPE(aka DPE) (cloud application >> policy >> engine - responsible for maintaining the availability of the >> resources >> and virtual machines in one cloud application - including recovery >> escalation, ordering constraints, fault detection, fault isolation, >> instantiation of vms). This collection of software components could >> be >> collocated with the engine, or a separate machine entirely since the >> project provides an API to third party projects. 
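
(To make the "recovery escalation" and "ordering constraints" mentioned above concrete, a toy sketch with hypothetical names, not the real CAPE implementation: a failed resource is restarted a few times, then recovery escalates to restarting its VM, and resources are brought back in dependency order.)

    # Start order encodes the ordering constraint: database before application server.
    START_ORDER = [("vm-db-1", "database"), ("vm-as-1", "appserver")]
    MAX_RESOURCE_RETRIES = 3

    def start_resource(vm, resource):
        print("starting %s on %s" % (resource, vm))

    def restart_vm(vm):
        print("escalating: restarting", vm)

    def recover(vm, resource, failures):
        # Restart the resource itself first; after repeated failures escalate to
        # the VM, then re-assert the application's start order.
        if failures < MAX_RESOURCE_RETRIES:
            start_resource(vm, resource)
            return
        restart_vm(vm)
        for dep_vm, dep_resource in START_ORDER:
            start_resource(dep_vm, dep_resource)

    if __name__ == "__main__":
        recover("vm-as-1", "appserver", failures=1)  # plain resource restart
        recover("vm-as-1", "appserver", failures=3)  # escalation path
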
>> >> One thing that may not be entirely clear is that there is a new DPE >> process for each cloud application (which could be monitor several >> hundreds VMs for large applications). This converts the inherent >> inability of any policy engine to scale to large object counts into a >> kernel scheduling problem and memory consumption problem (kernel.org >> scheduler rocks, memory is cheap). >> >> The CAPE processes could be spawned in a distributed fashion very >> trivially, if/when we run into scaling problems with a single node. >> No >> sense optimizing for a condition that may not be relevant. >> >> One intentional aspect of our project is focused around reliability. >> Our CAPE process is approximately 2kloc. Its very small code >> footprint >> is designed to be easy to "get right" vs a huge monolithic code base >> which increases the possible failure scenarios. >> >> As a short note about scalability, my laptop can run 1000 CAPE >> processes >> with 1% total cpu utilization (measured with top) and 5gig memory >> utilization (measured with free). The design's upper limit on scale >> is >> based upon a) limitations of kernel scheduling b) memory consumption >> of >> the CAPE process. > > But they all schedule the services to run on the same set of resources (hosts / memory / cpu), how do you coordinate? > Ayal, The Pacemaker Cloud model is based upon the assumption that if a vm is requested to be started, it will start (or fail to start, in which case recovery will be executed) (unlimited resources). This is the model in public clouds. We have had some interest in scheduling the vm starting on specific hosts, but don't know of specific APIs to gather information about how to make a decision to place specific VMs. No project currently solves this scheduling problem. I believe part of the reason is that there is no standardized method to gather topology information for the VM infrastructure. However, the objective of finding an appropriate host/memory/cpu to instantiate a VM is orthogonal to the objective of providing high availability, recovery escalation and ordering guarantees with virtual machines and resources[1] that compose a cloud application. Regards -steve [1] the term resource used indicates an individual component application, such as Apache's http. >> >> Regards >> -steve >> From iheim at redhat.com Fri Feb 17 15:09:30 2012 From: iheim at redhat.com (Itamar Heim) Date: Fri, 17 Feb 2012 17:09:30 +0200 Subject: [Engine-devel] New oVirt GIT Repo Request In-Reply-To: <4F390E28.9060300@redhat.com> References: <4F352CB8.8060006@redhat.com> <4F36EEA3.50006@redhat.com> <4F37BF5C.20801@redhat.com> <4F3902A9.2060405@redhat.com> <4F3932E7.1010501@redhat.com> <4F390E28.9060300@redhat.com> Message-ID: <4F3E6DAA.5050206@redhat.com> On 02/13/2012 03:20 PM, Keith Robertson wrote: > On 02/13/2012 10:57 AM, Douglas Landgraf wrote: >> On 02/13/2012 07:31 AM, Barak Azulay wrote: >>> On 02/12/2012 03:32 PM, Keith Robertson wrote: >>>> On 02/11/2012 05:41 PM, Itamar Heim wrote: >>>>> On 02/10/2012 04:42 PM, Keith Robertson wrote: >>>>>> All, >>>>>> >>>>>> I would like to move some of the oVirt tools into their own GIT >>>>>> repos so >>>>>> that they are easier to manage/maintain. In particular, I would >>>>>> like to >>>>>> move the ovirt-log-collector, ovirt-iso-uploader, and >>>>>> ovirt-image-uploader each into their own GIT repos. >>>>>> >>>>>> The Plan: >>>>>> Step 1: Create naked GIT repos on oVirt.org for the 3 tools. >>>>>> Step 2: Link git repos to gerrit. 
>>>>> >>>>> above two are same step - create a project in gerrit. >>>>> I'll do that if list doesn't have any objections by monday. >>>> Sure, np. >>>>> >>>>>> Step 3: Populate naked GIT repos with source and build standalone >>>>>> spec >>>>>> files for each. >>>>>> Step 4: In one patch do both a) and b)... >>>>>> a) Update oVirt manager GIT repo by removing tool source. >>>>>> b) Update oVirt manager GIT repo such that spec has dependencies on 3 >>>>>> new RPMs. >>>>>> >>>>>> Optional: >>>>>> - These three tools share some python classes that are very >>>>>> similar. I >>>>>> would like to create a GIT repo (perhaps ovirt-tools-common) to >>>>>> contain >>>>>> these classes so that a fix in one place will fix the issue >>>>>> everywhere. >>>>>> Perhaps we can also create a naked GIT repo for these common classes >>>>>> while addressing the primary concerns above. >>>>> >>>>> would this hold both python and java common code? >>>> >>>> None of the 3 tools currently have any requirement for Java code, but I >>>> think the installer does. That said, I wouldn't have a problem mixing >>>> Java code in the "common" component as long as they're in separate >>>> package directories. >>>> >>>> If we do something like this do we want a "python" common RPM and a >>>> "java" common RPM or just a single RPM for all common code? I don't >>>> really have a preference. >>> >>> I would go with separating the java common and python common, even if >>> it's just to ease build/release issues. >>> >> +1 and if needed one package be required to the other. >> > Sounds like a plan. Full speed ahead. The following repo's were created: ovirt-image-uploader ovirt-iso-uploader ovirt-log-collector ovirt-tools-common-python I've used the existing ovirt-engine-tools group for its maintainers, as this is only a split of part of the tools from using the engine git, but tools project was defined as separate wrt maintainers. From abaron at redhat.com Sun Feb 19 10:06:24 2012 From: abaron at redhat.com (Ayal Baron) Date: Sun, 19 Feb 2012 05:06:24 -0500 (EST) Subject: Empty cdrom drive. In-Reply-To: <20120215122224.GG23523@redhat.com> Message-ID: <1264df1d-1dc0-4cec-ae0a-04722960b15a@zmail13.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > On Wed, Feb 15, 2012 at 07:16:15AM -0500, Igor Lvovsky wrote: > > > > > > > -----Original Message----- > > > From: Dan Kenigsberg [mailto:danken at redhat.com] > > > Sent: Wednesday, February 15, 2012 1:25 PM > > > To: Igor Lvovsky > > > Cc: Livnat Peer; Doron Fediuck; Ayal Baron; > > > ovirt-devel at redhat.com > > > Subject: Re: Empty cdrom drive. > > > > > > On Tue, Feb 14, 2012 at 10:59:22AM -0500, Igor Lvovsky wrote: > > > > > > > > Hi, > > > > I want to discuss $subject on the email just to be sure that we > > > > all on > > the > > > > same page. > > > > > > > > So, today in 3.0 vdsm has two ways to create VM with cdrom : > > > > 1. If RHEV-M ask to create VM with cdrom, vdsm just create it > > > > 2. RHEV-M doesn't ask to create VM with cdrom, vdsm still > > > > creates VM > > with > > > > empty cdrom. Vdsm creates this device as 'hdc' (IDE device, > > > > index > > 2), > > > > because of libvirt restrictions. > > > > In this case RHEV-M will be able to "insert" cdrom on the > > > > fly with > > > > changeCD request. > > > > > > > > In the new style API we want to get rid from stupid scenario > > > > #2, > > because > > > > we want to be able to create VM without cdrom at all. > > > > > > > It means, that now we need to change a little our scenarios: > > > > 1. 
If RHEV-M ask to create VM with cdrom, vdsm just create it > > > > 2. RHEV-M doesn't want to create VM with cdrom, but it want to > > > > be > > able to > > > > "insert" cdrom on the fly after this. Here we have two > > > > options: > > > > a. RHEV-M should to pass empty cdrom device on VM creation > > > > and use > > > > regular changeCD after that > > > > b. RHEV-M can create VM without cdrom and add cdrom later > > > > through > > > > hotplugDisk command. > > > > > > Let's leave hotpluggin for a later discussion. Currently I am > > > worried > > > about backward and forward compatibility. > > > > > > 1. Currently, all VMs created by ovirt-Engine has an IDE cdrom > > > device. > > This > > > behavior should be maintained when Engine is upgraded, to > > > minimize > > > surprises to guests. > > > > > > 2. In the new "devices" API introduced by Igor, Engine is > > > responsible to > > > know about all guest devices and their addresses. > > > > > > 1+2=3. Engine has to be aware of the fact that even if it did not > > > explicitly request for a cdrom, such a device exist. > > > > > > 4. Vdsm would very much prefer that Engine explictly request that > > > an > > > empty cdrom device is included. This would allow us to start VMs > > > with no > > > cdrom device at all in the future. > > > > > > I understand that this may be a complex feat for Engine, as it > > > requires > > > a complex upgrade path to the VM data base. To be done correctly, > > > it > > > requires a compatible change to the ovirt API, too. > > > > > > 5. I suggest a hackish API that would let us solve the problem in > > > stages: Engine would not have to explicitly list an empty CD. > > > However, > > > it would send a hack flag: hackAutoCD=True for all VM starting > > > up. > > > > > > > -1. I am disagree with this approach. > > If Engine can send hackAutoCD for old VMs I can't find a reason why > > do not > > send device instead. > > My suggestion was that Engine would send hackAutoCD for *ALL* VMs - > until Engine is capable to handle VMs without CD devices. Absolutely not. The semantics are funky and API changes are much harder to do then simple code change. it's a hack, don't put it in the API! > > Vdsm would ignore this flag if a cdrom is already specified in > "devices". Thus, future Engines can drop the hackAutoCD flag and gain > cdrom-less VMs. > > > As you mentioned above vdsm anyway will return this cdrom device to > > Engine > > and engine will > > need to put it in DB. This will cover all old VMs and there is no > > reason > > do not send empty cdrom device > > for new created VM. > > > > > If this flag is True, Vdsm would add an IDE CDROM to the devices > > > list. > > > > > > In the future, Engine would drop this flag and specify the CDROM > > > only > > > when needed. > > > > > > Please note that (3) is still correct - Engine would see the > > > CDROM > > > device and its address even if it was empty when the VM started > > > running. > From lpeer at redhat.com Sun Feb 19 11:06:39 2012 From: lpeer at redhat.com (Livnat Peer) Date: Sun, 19 Feb 2012 13:06:39 +0200 Subject: [Engine-devel] Empty cdrom drive. In-Reply-To: References: Message-ID: <4F40D7BF.7040401@redhat.com> On 15/02/12 11:29, Miki Kenneth wrote: > > > ----- Original Message ----- >> From: "Ayal Baron" >> To: "Yaniv Kaul" >> Cc: engine-devel at ovirt.org >> Sent: Wednesday, February 15, 2012 11:23:54 AM >> Subject: Re: [Engine-devel] Empty cdrom drive. 
>> >> >> >> ----- Original Message ----- >>> On 02/15/2012 09:44 AM, Igor Lvovsky wrote: >>>> Hi, >>>> I want to discuss $subject on the email just to be sure that we >>>> all >>>> on the >>>> same page. >>>> >>>> So, today in 3.0 vdsm has two ways to create VM with cdrom : >>>> 1. If RHEV-M ask to create VM with cdrom, vdsm just create it >>>> 2. RHEV-M doesn't ask to create VM with cdrom, vdsm still >>>> creates >>>> VM with >>>> empty cdrom. Vdsm creates this device as 'hdc' (IDE device, >>>> index 2), >>>> because of libvirt restrictions. >>>> In this case RHEV-M will be able to "insert" cdrom on the >>>> fly >>>> with >>>> changeCD request. >>>> >>>> In the new style API we want to get rid from stupid scenario #2, >>>> because >>>> we want to be able to create VM without cdrom at all. >>>> It means, that now we need to change a little our scenarios: >>>> 1. If RHEV-M ask to create VM with cdrom, vdsm just create it >>>> 2. RHEV-M doesn't want to create VM with cdrom, but it want to >>>> be >>>> able to >>>> "insert" cdrom on the fly after this. Here we have two >>>> options: >>>> a. RHEV-M should to pass empty cdrom device on VM creation >>>> and >>>> use >>>> regular changeCD after that >>>> b. RHEV-M can create VM without cdrom and add cdrom later >>>> through >>>> hotplugDisk command. >>>> The preferred solution IMO would be to let the user choose if he wants a VM with CD or not. I think the motivation for the above is to 'save' IDE slot if a user does not need CD. If the user wants to have a VM with CD the engine would create an empty CD and pass it to VDSM as a device, but if the user does not require a CD there is no reason to create it in VDSM nor in the OE (oVirt Engine). Supporting the above requires the engine upgrade to create empty CD device to all VMs. Dan - what happens in 3.0 API if the engine passes the element cdrom but with empty path attribute. (I know that if the engine does not pass cdrom element VDSM creates empty CD) Livnat >>>> Note: The new libvirt remove previous restriction on cdrom >>>> devices. >>>> Now >>>> cdrom can be created as IDE or VIRTIO device in any index. >>>> It means we can easily hotplug it. >>> >>> I didn't know a CDROM can be a virtio device, but in any way it >>> requires >>> driver (which may not exist on Windows). >>> I didn't know an IDE CDROM can be hot-plugged (only USB-based?), >> >> It can't be hotplugged. >> usb based is not ide (the ide device is the usb port, the cdrom is a >> usb device afaik). >> >> The point of this email is that since we want to support being able >> to start VMs *without* a cdrom then the default behaviour of >> attaching a cdrom device needs to be implemented in engine or we >> shall have a regression. > This is a regression that we can not live with... >> In the new API (for stable device addresses) vdsm doesn't >> automatically attach a cdrom. >> >>> perhaps >>> I'm wrong here. >>> Y. 
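As an illustration of the option raised above (the engine passing an explicit but empty cdrom element in the new-style devices API), the entry being discussed would look roughly like the sketch below. The field names here are assumptions made for the example and should be checked against the real vdsm device schema rather than read as its API.

    # Sketch only: field names are assumed, not confirmed against vdsm.
    empty_cdrom = {
        'type': 'disk',
        'device': 'cdrom',
        'iface': 'ide',
        'index': '2',    # the traditional hdc slot discussed above
        'path': '',      # empty path, i.e. an empty drive; changeCD fills it later
        'readonly': 'true',
    }

    vm_create_params = {
        'vmId': '00000000-0000-0000-0000-000000000001',
        'devices': [empty_cdrom],   # omit this entry entirely for a cdrom-less VM
    }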
>>> >>>> >>>> >>>> Regards, >>>> Igor Lvovsky >>>> >>>> >>>> _______________________________________________ >>>> Engine-devel mailing list >>>> Engine-devel at ovirt.org >>>> http://lists.ovirt.org/mailman/listinfo/engine-devel >>> >>> _______________________________________________ >>> Engine-devel mailing list >>> Engine-devel at ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/engine-devel >>> >> _______________________________________________ >> Engine-devel mailing list >> Engine-devel at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/engine-devel >> > _______________________________________________ > Engine-devel mailing list > Engine-devel at ovirt.org > http://lists.ovirt.org/mailman/listinfo/engine-devel From oschreib at redhat.com Sun Feb 19 13:13:11 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Sun, 19 Feb 2012 08:13:11 -0500 (EST) Subject: Release process proposal In-Reply-To: <4F3D0C7C.6020502@redhat.com> Message-ID: ----- Original Message ----- > On 02/16/2012 03:15 PM, Ofer Schreiber wrote: > > Since we currently doesn't have any official Release process, > > here's my proposal: > > > > 1. oVirt will issue a new release every 6 months. > > a. EXCEPTION: First three releases will be issued in a ~3 month > > interval. > > b. Exact dates must be set for each release. > > > > 2. A week after the n-1 release is out, a release criteria for the > > new release should be discussed. > > discussed is fine, what is the ETA of a decision? Will add that to v2 proposal > > > a. Release criteria will include MUST items and SHOULD items > > (held in wiki) > > + MUST items will DELAY the release in any case. > > + SHOULD items will include less critical flows and new > > features. > > So (major) new features are decided and agreed upon (and documented) > on > the release criteria? Sure, depends on their importance. just to clarify, we're not deciding about new feature, that's up to the community/component owners, we're just agreed which feature is important enough to be a MUST. if a feature is missing, it should be discussed with the relevant mailing list. > > > + SHOULD items will be handled as "best-effort" by component > > owners > > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the > > criteria suggested. > > What about general items? For example: no data corruption bugs, no > known > security issues, etc.? > More interestingly, what about the ugly, hairy bugs, which are not > blockers, just gooey? Sound like an important part of the release criteria (e.g. MUST - No data corruption bugs in "MUST" flows). If something is not a blocker, so we won't block the release, unless we will discuss it, and decide this item is a MUST, and we forgot it during the release criteria preparation. > > > > > 3. OPTIONAL: Discuses the new version number according to the > > release criteria/amount of features. > > a. OR BETTER: Increase MAJOR version every second release > > b. Versions will be handled by each component. > > c. The general oVirt version will the engine version. > > > > 5. 60 Days before release - Feature freeze > > a. All component owners must create a new versioned branch > > b. "Beta" version should be supplied immediately after. > > + And on a nightly basis afterwards. > > c. Stabilization efforts should start on the new builds. > > d. Cherry-pick fixes for important issues only. > > What is 'important' ? Release blockers for sure. Probably best effort on "High" importance bugs. > > > + Zero/Minimal changes to user interface. 
> > Unless the UI sucks, of course. If we get feedback it's wrong, we > should > change. Been known to happen before. That's under "Minimal" > Also, what about API changes? Same. If it's really necessary, fix it. if not, don't put the whole release in risk just because a certain API doesn't look good enough. > > > + Inform in advance on any user interface change. > > e. At this stage, we should start working on the release notes. > > > > 6. 30 days before release - release candidate > > a. If no serious issues are found the last release candidate > > automatically becomes the final release. > > Define 'serious'. blockers (violets MUST items in release criteria) > > > b. Release manager will create a wiki with list of release > > blockers > > c. Only release blockers should be fixed in this stage. > > > > 7. Create a new RC if needed > > a. There must be at least one week between the last release > > candidate and the final release > > b. Go/No go meetings will happen once a week in this stage. > > + Increase the amount of meeting according to the release > > manager decision. > > + Release manager will inform the community on any delay. > > > > 8. Release > > Do we expect to release all components together? For example, what if > we > found a blocker in component X, and component Y is fine and dandy? > it'll > just wait? > Y. Probably wait, depends on the importance (should be discussed) > > > a. Create ANNOUNCE message few days before actual release. > > b. PARTY > > > > Have any comments? ideas? share them with the list! > > > > Thanks, > > Ofer Schreiber. > > _______________________________________________ > > Arch mailing list > > Arch at ovirt.org > > http://lists.ovirt.org/mailman/listinfo/arch > > From oschreib at redhat.com Sun Feb 19 13:37:29 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Sun, 19 Feb 2012 08:37:29 -0500 (EST) Subject: Release process proposal In-Reply-To: <46dbf30a-e7dd-44ac-b22a-606cc34e2658@zmail12.collab.prod.int.phx2.redhat.com> Message-ID: <654a75f0-b5d0-421d-b818-0dcc4a23554b@zmail14.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > > > ----- Original Message ----- > > From: "Ofer Schreiber" > > To: arch at ovirt.org > > Sent: Thursday, February 16, 2012 8:15:54 AM > > Subject: Release process proposal > > > > Since we currently doesn't have any official Release process, > > here's > > my proposal: > > > > 1. oVirt will issue a new release every 6 months. > > a. EXCEPTION: First three releases will be issued in a ~3 month > > interval. > > b. Exact dates must be set for each release. > > > > 2. A week after the n-1 release is out, a release criteria for the > > new release should be discussed. > > a. Release criteria will include MUST items and SHOULD items > > (held > > in wiki) > > + MUST items will DELAY the release in any case. > > + SHOULD items will include less critical flows and new > > features. > > + SHOULD items will be handled as "best-effort" by component > > owners > > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the > > criteria suggested. > > > > 3. OPTIONAL: Discuses the new version number according to the > > release > > criteria/amount of features. > > a. OR BETTER: Increase MAJOR version every second release > > > > This seems very arbitrary. If we do this, what does a major or minor > version signify? Just the passage of time? Point taken > > > > > b. Versions will be handled by each component. > > c. The general oVirt version will the engine version. > > > > 5. 
60 Days before release - Feature freeze > > > > So we estimate that 1/3 of our development efforts need to go into > bug fixing and stabilization? > That seems REALLY high. If it takes two months to stabilize a > release of oVirt I think we need > to take a look at our code quality and development processes I think this amount of time is necessary, especially in the scale of the oVirt project (multiple build blocks, api, features etc) > > > > > a. All component owners must create a new versioned branch > > b. "Beta" version should be supplied immediately after. > > + And on a nightly basis afterwards. > > c. Stabilization efforts should start on the new builds. > > d. Cherry-pick fixes for important issues only. > > + Zero/Minimal changes to user interface. > > + Inform in advance on any user interface change. > > e. At this stage, we should start working on the release notes. > > > > Shouldn't we be working on the release notes throughout the > development cycle? define "working on the the release notes". each component owner should handle his release notes. in this stage, we should gather all the release notes into something that looks reasonable. > > > > > > > 6. 30 days before release - release candidate > > a. If no serious issues are found the last release candidate > > automatically becomes the final release. > > So there will not be the traditional vote requiring 3 acks to approve > a release? Sounds reasonable. > > > b. Release manager will create a wiki with list of release > > blockers > > c. Only release blockers should be fixed in this stage. > > > > 7. Create a new RC if needed > > a. There must be at least one week between the last release > > candidate and the final release > > b. Go/No go meetings will happen once a week in this stage. > > + Increase the amount of meeting according to the release > > manager > > decision. > > + Release manager will inform the community on any delay. > > > > 8. Release > > a. Create ANNOUNCE message few days before actual release. > > > a1. Encourage everyone to blog / tweet about the release +1 > > > > > b. PARTY > > > > Have any comments? ideas? share them with the list! > > > > Thanks, > > Ofer Schreiber. > > _______________________________________________ > > Arch mailing list > > Arch at ovirt.org > > http://lists.ovirt.org/mailman/listinfo/arch > > > From oschreib at redhat.com Sun Feb 19 13:51:29 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Sun, 19 Feb 2012 08:51:29 -0500 (EST) Subject: Release process proposal In-Reply-To: <94db87ca-45d4-4733-b7c4-46be71dae106@zmail14.collab.prod.int.phx2.redhat.com> Message-ID: ----- Original Message ----- > Since we currently doesn't have any official Release process, here's > my proposal: > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month > interval. > b. Exact dates must be set for each release. Release process proposal V2 (with few open items) 1. oVirt will issue a new release every 6 months. a. EXCEPTION: First three releases will be issued in a ~3 month interval. b. Exact dates must be set for each release. 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. a. Release criteria will include MUST items and SHOULD items (held in wiki) + MUST items will DELAY the release in any case. + SHOULD items will include less critical flows and new features. + SHOULD items will be handled as "best-effort" by component owners b. Component owners (e.g. 
Node, engine-core, vdsm) must ACK the criteria suggested. c. Release criteria discussions shouldn't take more then 2 weeks d. Progress on MUST items should be review every month, during the weekly meeting 3. Discuses the new version number according to the release criteria/amount of features. a. Versions will be handled by each component. b. The general oVirt version will the engine version. 5. 60 Days before release - Feature freeze a. EXCEPTION: 30 days for 3 month release cycle b. All component owners must create a new versioned branch c. "Beta" version should be supplied immediately after. + And on a nightly basis afterwards. d. Stabilization efforts should start on the new builds. e. Cherry-pick fixes for high priority bugs. + Zero/Minimal changes to user interface. + Inform in advance on any user interface change, and any API change. f. At this stage, we should start working on the release notes. 6. 30 days before release - release candidate a. EXCEPTION: 15 days for 3 month release cycle b. If no blockers (MUST violations) are found the last release candidate automatically becomes the final release. + Rebuild without the "RC" string. + ANOTHER OPTION- Avoid "Beta" or "RC" strings, just use major.minor.micro, and bump the micro every time needed. c. Release manager will create a wiki with list of release blockers d. Only release blockers should be fixed in this stage. e. OPTIONAL: final release requires three +1 from community members + This item is currently optional, I'm not sure what a +1 means (does a +1 means "I tested this release", or "This release generally looks fine for me"?) 7. Create a new RC if needed a. There must be at least one week between the last release candidate and the final release b. Go/No go meetings will happen once a week in this stage. + Increase the amount of meeting according to the release manager decision. + Release manager will inform the community on any delay. 8. Release a. Create ANNOUNCE message few days before actual release. b. Move all release candidate sources/binaries into the "stable" directory c. Encourage community members to blog / tweet about the release d. PARTY Have any comments? ideas? share them with the list! Thanks, Ofer Schreiber. From lpeer at redhat.com Sun Feb 19 13:55:06 2012 From: lpeer at redhat.com (Livnat Peer) Date: Sun, 19 Feb 2012 15:55:06 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F3D1159.4090302@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> Message-ID: <4F40FF3A.6020401@redhat.com> On 16/02/12 16:23, Steven Dake wrote: > On 02/16/2012 06:33 AM, Livnat Peer wrote: >> On 16/02/12 05:01, Perry Myers wrote: >>>>>> HA is a simple use case of policy. >>>>> >>>>> *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust. >>>> >>>> I think that the main concept of what Perry suggested (leaving the >>>> implementation details aside :)) is to add HA of services. >>> >>> That's it in a nutshell :) >>> >>>> I like this idea and I would like to extend it a little bit. >>>> How about services that are spread on more than a single VM. >>>> I would like to be able to define a service and specify which VM/s >>>> provides this service and add HA flag on the service. >>> >>> That is in line with what I was proposing. 
There are two ways you can >>> do service HA... >>> >>> * Take a set of OSes (guests) and a set of services. Each service can >>> run on any of the guests. Therefore services can be failed over from >>> one live guest to another. This is effectively how the Pacemaker HA >>> stack works on both bare metal and virtual clusters >>> >>> * Take a set of OSes (guests) and on each guest place a specific set of >>> services. Services can be restarted if they fail on a specific guest, >>> but if a guest/host fails, rather than failing over the service to >>> another live running guest, instead the entire guest responsible for >>> that service is restarted. The recovery time is slightly longer in this >>> case because recovery involves restarting a VM instead of just starting >>> a service on another running VM. But the positive here is that the >>> configuration and policies are not as complex, and since VMs typically >>> can start fairly quickly the failover time is still adequate for most users >>> >> >> Can a service be spread on more than one VM? >> For example if I have enterprise application that requires application >> server (AS) and a data base (DB), the AS and DB can not live in the same >> guest because of different access restrictions (based on real use case). >> The service availability is dependent on both guests being active, and >> an optimization is to run both of them on the same host. >> >> > > Absolutely. > > In this case the Cloud Application is the combination of thw two > separate VM components (database VM and AS VM). A CAPE (cloud > application policy engine) maintains the HA state of both VMs including > correcting for resource (db,as) or vm failures, and ensuring ordering > constraints even during recovery (the AS would start after the DB in > this model). > ok, how would a flow look like to the user (oVirt user)? - Adding new service in OE - Specifying for the service which VMs provide it (?) - Specify how the service can be monitored (? how does CAPE knows what to look for as the service heartbeat?) - Marking th service as HA What's next? Where can the user define the policy about this service (i.e. 'should be available only on Tuesdays' or 'should be available only between 0800-1700 CET' etc)? > Our target scaling atm is 10 VMs per CAPE with 36 resources per VM. > These are arbitrary - we could likely go an order of magnitude beyond. > Note that is *per cape* - where the cape process limit is limited by > memory and scheduling capabilities of the system. > > Regards > -steve >> >>> Both models work. Pacemaker HA uses the first model, Pacemaker Cloud >>> uses the second, but over time could be adapted to include the 1st. >>> >>>> Then i would like to manage policies around it - I define a service >>>> with 3 VMs providing this service and I want to have at least 2 VM >>>> running it at any given time. (now the VMs are not highly available only >>>> the service is.) >>> >>> Yep. This is in line with use case #1 above. 
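A short sketch may help summarize the second model above (restart the service in place, then escalate to restarting the guest). It is only an illustration of the escalation idea, using stub helper names, and not how pcmk-cloud actually implements recovery.

    # Illustration of recovery escalation: retry a failed service on its own
    # guest a few times, then escalate to restarting the whole guest.  The
    # helpers are stubs standing in for calls to the guest and the manager.
    MAX_SERVICE_RESTARTS = 3

    def restart_service(vm, service):
        print("restarting %s on %s" % (service, vm))

    def restart_vm(vm):
        print("restarting guest %s" % vm)

    def service_healthy(vm, service):
        return False   # stub: pretend the service never comes back

    def recover(vm, service):
        for attempt in range(1, MAX_SERVICE_RESTARTS + 1):
            restart_service(vm, service)
            if service_healthy(vm, service):
                return "%s recovered on %s after %d restart(s)" % (service, vm, attempt)
        restart_vm(vm)   # escalation: the service keeps failing on this guest
        return "escalated to a guest restart for %s on %s" % (service, vm)

    if __name__ == "__main__":
        print(recover("as-vm", "appserver"))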
>>> >>> Perry >> >> _______________________________________________ >> Arch mailing list >> Arch at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/arch > From mgoldboi at redhat.com Sun Feb 19 14:07:27 2012 From: mgoldboi at redhat.com (Moran Goldboim) Date: Sun, 19 Feb 2012 16:07:27 +0200 Subject: Release process proposal In-Reply-To: References: Message-ID: <4F41021F.8040900@redhat.com> On 02/19/2012 03:51 PM, Ofer Schreiber wrote: > ----- Original Message ----- >> Since we currently doesn't have any official Release process, here's >> my proposal: >> >> 1. oVirt will issue a new release every 6 months. >> a. EXCEPTION: First three releases will be issued in a ~3 month >> interval. >> b. Exact dates must be set for each release. > > > Release process proposal V2 (with few open items) > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month interval. > b. Exact dates must be set for each release. > > 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. > a. Release criteria will include MUST items and SHOULD items (held in wiki) > + MUST items will DELAY the release in any case. > + SHOULD items will include less critical flows and new features. > + SHOULD items will be handled as "best-effort" by component owners > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the criteria suggested. > c. Release criteria discussions shouldn't take more then 2 weeks > d. Progress on MUST items should be review every month, during the weekly meeting > > 3. Discuses the new version number according to the release criteria/amount of features. > a. Versions will be handled by each component. > b. The general oVirt version will the engine version. > > 5. 60 Days before release - Feature freeze > a. EXCEPTION: 30 days for 3 month release cycle > b. All component owners must create a new versioned branch > c. "Beta" version should be supplied immediately after. > + And on a nightly basis afterwards. > d. Stabilization efforts should start on the new builds. > e. Cherry-pick fixes for high priority bugs. > + Zero/Minimal changes to user interface. > + Inform in advance on any user interface change, and any API change. > f. At this stage, we should start working on the release notes. > > 6. 30 days before release - release candidate > a. EXCEPTION: 15 days for 3 month release cycle > b. If no blockers (MUST violations) are found the last release candidate automatically becomes the final release. > + Rebuild without the "RC" string. > + ANOTHER OPTION- Avoid "Beta" or "RC" strings, just use major.minor.micro, and bump the micro every time needed. > c. Release manager will create a wiki with list of release blockers > d. Only release blockers should be fixed in this stage. > e. OPTIONAL: final release requires three +1 from community members > + This item is currently optional, I'm not sure what a +1 means (does a +1 means "I tested this release", or "This release generally looks fine for me"?) > > 7. Create a new RC if needed > a. There must be at least one week between the last release candidate and the final release > b. Go/No go meetings will happen once a week in this stage. > + Increase the amount of meeting according to the release manager decision. > + Release manager will inform the community on any delay. > > 8. Release > a. Create ANNOUNCE message few days before actual release. > b. Move all release candidate sources/binaries into the "stable" directory > c. 
Encourage community members to blog / tweet about the release > d. PARTY > > Have any comments? ideas? share them with the list! please consider the following: -6. f.30 days before release - release candidate - Test day - i think should be part of the process -6. g.30 days before release - dev leads/representatives from each component participation on the weekly meetings statuses Moran. > Thanks, > Ofer Schreiber. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch -------------- next part -------------- An HTML attachment was scrubbed... URL: From oschreib at redhat.com Sun Feb 19 14:42:37 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Sun, 19 Feb 2012 09:42:37 -0500 (EST) Subject: Release process proposal In-Reply-To: <4F41021F.8040900@redhat.com> Message-ID: +1 on both comments. ----- Original Message ----- > On 02/19/2012 03:51 PM, Ofer Schreiber wrote: > > ----- Original Message ----- > > > > Since we currently doesn't have any official Release process, > > > here's > > > > > > my proposal: > > > > > > 1. oVirt will issue a new release every 6 months. > > > > > > a. EXCEPTION: First three releases will be issued in a ~3 month > > > > > > interval. > > > > > > b. Exact dates must be set for each release. > > > > > > > > Release process proposal V2 (with few open items) > > > 1. oVirt will issue a new release every 6 months. > > > a. EXCEPTION: First three releases will be issued in a ~3 month > > interval. > > > b. Exact dates must be set for each release. > > > 2. A week after the n-1 release is out, a release criteria for the > > new release should be discussed. > > > a. Release criteria will include MUST items and SHOULD items (held > > in > > wiki) > > > + MUST items will DELAY the release in any case. > > > + SHOULD items will include less critical flows and new features. > > > + SHOULD items will be handled as "best-effort" by component owners > > > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the > > criteria suggested. > > > c. Release criteria discussions shouldn't take more then 2 weeks > > > d. Progress on MUST items should be review every month, during the > > weekly meeting > > > 3. Discuses the new version number according to the release > > criteria/amount of features. > > > a. Versions will be handled by each component. > > > b. The general oVirt version will the engine version. > > > 5. 60 Days before release - Feature freeze > > > a. EXCEPTION: 30 days for 3 month release cycle > > > b. All component owners must create a new versioned branch > > > c. "Beta" version should be supplied immediately after. > > > + And on a nightly basis afterwards. > > > d. Stabilization efforts should start on the new builds. > > > e. Cherry-pick fixes for high priority bugs. > > > + Zero/Minimal changes to user interface. > > > + Inform in advance on any user interface change, and any API > > change. > > > f. At this stage, we should start working on the release notes. > > > 6. 30 days before release - release candidate > > > a. EXCEPTION: 15 days for 3 month release cycle > > > b. If no blockers (MUST violations) are found the last release > > candidate automatically becomes the final release. > > > + Rebuild without the "RC" string. > > > + ANOTHER OPTION- Avoid "Beta" or "RC" strings, just use > > major.minor.micro, and bump the micro every time needed. > > > c. Release manager will create a wiki with list of release blockers > > > d. 
Only release blockers should be fixed in this stage. > > > e. OPTIONAL: final release requires three +1 from community members > > > + This item is currently optional, I'm not sure what a +1 means > > (does > > a +1 means "I tested this release", or "This release generally > > looks > > fine for me"?) > > > > > > 7. Create a new RC if needed > > > a. There must be at least one week between the last release > > candidate > > and the final release > > > b. Go/No go meetings will happen once a week in this stage. > > > + Increase the amount of meeting according to the release manager > > decision. > > > + Release manager will inform the community on any delay. > > > 8. Release > > > a. Create ANNOUNCE message few days before actual release. > > > b. Move all release candidate sources/binaries into the "stable" > > directory > > > c. Encourage community members to blog / tweet about the release > > > d. PARTY > > > Have any comments? ideas? share them with the list! > > please consider the following: > -6. f.30 days before release - release candidate - Test day - i think > should be part of the process > -6. g.30 days before release - dev leads/representatives from each > component participation on the weekly meetings statuses Moran. > > Thanks, > > > Ofer Schreiber. > > > _______________________________________________ > > > Arch mailing list Arch at ovirt.org > > http://lists.ovirt.org/mailman/listinfo/arch > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmyers at redhat.com Sun Feb 19 15:42:39 2012 From: pmyers at redhat.com (Perry Myers) Date: Sun, 19 Feb 2012 10:42:39 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F40FF3A.6020401@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> Message-ID: <4F41186F.1050103@redhat.com> >> Absolutely. >> >> In this case the Cloud Application is the combination of thw two >> separate VM components (database VM and AS VM). A CAPE (cloud >> application policy engine) maintains the HA state of both VMs including >> correcting for resource (db,as) or vm failures, and ensuring ordering >> constraints even during recovery (the AS would start after the DB in >> this model). >> > > ok, how would a flow look like to the user (oVirt user)? > > - Adding new service in OE > - Specifying for the service which VMs provide it (?) That could work, or you could do: 1. Adding a new VM (or set of VMs in OE) 2. Adding one or more services to associate with those VMs Just depends on what the easier user experience is. From the perspective of pcmk-cloud, we get the same data in the end, which is a config file that specifies the resources we care about (both VMs and services on those VMs) > - Specify how the service can be monitored (? how does CAPE knows what > to look for as the service heartbeat?) For each service you would specify whether or not to use: * an OCF resource agent (see resources-agents package in Fedora and other distros) * A systemd unit or sysV init script * Some other custom script (which would need to be either in OCF RA or init script style) > - Marking th service as HA > > What's next? > Where can the user define the policy about this service There would need to be UI in OE that exposed an interface for adding policy information. 
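For illustration, the information such a UI would have to collect could reduce to a structure like the one below: which VMs make up the application, which services run in each, how each service is monitored (an OCF resource agent, a systemd unit, or a custom script), and start ordering between the VMs. This is a hypothetical shape with invented names, not pacemaker-cloud's actual configuration format.

    # Hypothetical structure, for illustration only.
    application_group = {
        "name": "webshop",
        "ha": True,
        "vms": [
            {"name": "db-vm",
             "services": [{"name": "database",
                           "monitor": "ocf:heartbeat:pgsql"}]},
            {"name": "as-vm",
             "services": [{"name": "appserver",
                           # a systemd unit or custom OCF-style script also works
                           "monitor": "systemd:appserver.service"}]},
        ],
        # ordering from the AS+DB example earlier in the thread: DB starts first
        "start_order": ["db-vm", "as-vm"],
    }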
Because the Pacemaker policy engine is very flexible, it would make sense to only define very specific knobs in the UI, otherwise it could get very confusing for the users. For more complex policies, it might be better to provide a way to manually edit the policy file and upload it rather than trying to model everything in the UI. > (i.e. 'should be > available only on Tuesdays' or 'should be available only between > 0800-1700 CET' etc)? For this example, what do you mean by 'should be available'? In general with HA, the idea is to 'keep the service running as much as possible'. The above example seems less like an HA concern and more of a general resource scheduling concern. I think using the Pacemaker Rules engine with pcmk-cloud, this should be possible as well, but I'll let Andrew/Steve comment further on that. Perry From iheim at redhat.com Sun Feb 19 16:06:45 2012 From: iheim at redhat.com (Itamar Heim) Date: Sun, 19 Feb 2012 18:06:45 +0200 Subject: Release process proposal In-Reply-To: References: Message-ID: <4F411E15.8060707@redhat.com> On 02/19/2012 03:51 PM, Ofer Schreiber wrote: > > ----- Original Message ----- >> Since we currently doesn't have any official Release process, here's >> my proposal: >> >> 1. oVirt will issue a new release every 6 months. >> a. EXCEPTION: First three releases will be issued in a ~3 month >> interval. >> b. Exact dates must be set for each release. > > > > Release process proposal V2 (with few open items) (I'd change the subject to say it's a V2) > > 1. oVirt will issue a new release every 6 months. > a. EXCEPTION: First three releases will be issued in a ~3 month interval. > b. Exact dates must be set for each release. > > 2. A week after the n-1 release is out, a release criteria for the new release should be discussed. > a. Release criteria will include MUST items and SHOULD items (held in wiki) > + MUST items will DELAY the release in any case. > + SHOULD items will include less critical flows and new features. > + SHOULD items will be handled as "best-effort" by component owners there is a difference between defining quality goals and PRD-like (feature) goals. what could be a MUST here? > b. Component owners (e.g. Node, engine-core, vdsm) must ACK the criteria suggested. > c. Release criteria discussions shouldn't take more then 2 weeks s/then/than/ > d. Progress on MUST items should be review every month, during the weekly meeting s/review/reviewed/ and? I'm not sure there is a lot to be done other than to revisit with owner / un-must them? > > 3. Discuses the new version number according to the release criteria/amount of features. > a. Versions will be handled by each component. > b. The general oVirt version will the engine version. > > 5. 60 Days before release - Feature freeze > a. EXCEPTION: 30 days for 3 month release cycle > b. All component owners must create a new versioned branch I don't see why this is a must, as long as they can specify the version to be used (i.e., it could be an existing released version or an already existing branch created prior to this date). > c. "Beta" version should be supplied immediately after. > + And on a nightly basis afterwards. a nightly beta? I wouldn't call a nightly build a beta, just a nightly build of the branch. > d. Stabilization efforts should start on the new builds. > e. Cherry-pick fixes for high priority bugs. > + Zero/Minimal changes to user interface. > + Inform in advance on any user interface change, and any API change. > f. 
At this stage, we should start working on the release notes. > > 6. 30 days before release - release candidate > a. EXCEPTION: 15 days for 3 month release cycle > b. If no blockers (MUST violations) are found the last release candidate automatically becomes the final release. > + Rebuild without the "RC" string. > + ANOTHER OPTION- Avoid "Beta" or "RC" strings, just use major.minor.micro, and bump the micro every time needed. > c. Release manager will create a wiki with list of release blockers > d. Only release blockers should be fixed in this stage. > e. OPTIONAL: final release requires three +1 from community members > + This item is currently optional, I'm not sure what a +1 means (does a +1 means "I tested this release", or "This release generally looks fine for me"?) > > 7. Create a new RC if needed > a. There must be at least one week between the last release candidate and the final release > b. Go/No go meetings will happen once a week in this stage. > + Increase the amount of meeting according to the release manager decision. > + Release manager will inform the community on any delay. > > 8. Release > a. Create ANNOUNCE message few days before actual release. > b. Move all release candidate sources/binaries into the "stable" directory > c. Encourage community members to blog / tweet about the release > d. PARTY > > Have any comments? ideas? share them with the list! > > Thanks, > Ofer Schreiber. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From lpeer at redhat.com Sun Feb 19 20:55:40 2012 From: lpeer at redhat.com (Livnat Peer) Date: Sun, 19 Feb 2012 22:55:40 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F41186F.1050103@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> Message-ID: <4F4161CC.5040407@redhat.com> On 19/02/12 17:42, Perry Myers wrote: >>> Absolutely. >>> >>> In this case the Cloud Application is the combination of thw two >>> separate VM components (database VM and AS VM). A CAPE (cloud >>> application policy engine) maintains the HA state of both VMs including >>> correcting for resource (db,as) or vm failures, and ensuring ordering >>> constraints even during recovery (the AS would start after the DB in >>> this model). >>> >> >> ok, how would a flow look like to the user (oVirt user)? >> >> - Adding new service in OE >> - Specifying for the service which VMs provide it (?) > > That could work, or you could do: > > 1. Adding a new VM (or set of VMs in OE) > 2. Adding one or more services to associate with those VMs > > Just depends on what the easier user experience is. From the > perspective of pcmk-cloud, we get the same data in the end, which is a > config file that specifies the resources we care about (both VMs and > services on those VMs) > >> - Specify how the service can be monitored (? how does CAPE knows what >> to look for as the service heartbeat?) > > For each service you would specify whether or not to use: > * an OCF resource agent (see resources-agents package in Fedora and > other distros) > * A systemd unit or sysV init script > * Some other custom script (which would need to be either in OCF RA or > init script style) > >> - Marking th service as HA >> >> What's next? 
>> Where can the user define the policy about this service > > There would need to be UI in OE that exposed an interface for adding > policy information. Because the Pacemaker policy engine is very > flexible, it would make sense to only define very specific knobs in the > UI, otherwise it could get very confusing for the users. For more > complex policies, it might be better to provide a way to manually edit > the policy file and upload it rather than trying to model everything in > the UI. > >> (i.e. 'should be >> available only on Tuesdays' or 'should be available only between >> 0800-1700 CET' etc)? > > For this example, what do you mean by 'should be available'? In general > with HA, the idea is to 'keep the service running as much as possible'. > You are right, I mixed two use cases. Let's focus on HA for start. Let say CAPE found VM/service is down, does it initiate runVM by OE API? Who chooses on which host to start the VM and who is responsible for doing setup work in case it is required by the VM? for example if a VM is using direct LUN then we might need to connect the host to that LUN before starting the VM on the target host. If CAPE use OE to start the VM the setup will be taken-care-of by OE as part of starting the VM. > The above example seems less like an HA concern and more of a general > resource scheduling concern. I think using the Pacemaker Rules engine > with pcmk-cloud, this should be possible as well, but I'll let > Andrew/Steve comment further on that. > > Perry From oschreib at redhat.com Mon Feb 20 11:52:58 2012 From: oschreib at redhat.com (Ofer Schreiber) Date: Mon, 20 Feb 2012 06:52:58 -0500 (EST) Subject: Release process proposal In-Reply-To: <4F411E15.8060707@redhat.com> Message-ID: <335f4c84-4107-461c-87a5-8c1058dc9a86@zmail14.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > On 02/19/2012 03:51 PM, Ofer Schreiber wrote: > > > > ----- Original Message ----- > >> Since we currently doesn't have any official Release process, > >> here's > >> my proposal: > >> > >> 1. oVirt will issue a new release every 6 months. > >> a. EXCEPTION: First three releases will be issued in a ~3 month > >> interval. > >> b. Exact dates must be set for each release. > > > > > > > > Release process proposal V2 (with few open items) > > (I'd change the subject to say it's a V2) > > > > > 1. oVirt will issue a new release every 6 months. > > a. EXCEPTION: First three releases will be issued in a ~3 month > > interval. > > b. Exact dates must be set for each release. > > > > 2. A week after the n-1 release is out, a release criteria for the > > new release should be discussed. > > a. Release criteria will include MUST items and SHOULD items > > (held in wiki) > > + MUST items will DELAY the release in any case. > > + SHOULD items will include less critical flows and new > > features. > > + SHOULD items will be handled as "best-effort" by component > > owners > > there is a difference between defining quality goals and PRD-like > (feature) goals. > what could be a MUST here? Release criteria is absolutely not a PRD. it's the minimal criteria needed in order to release a specific version to the world. IMO, new (big) features can be in the release criteria (only as SHOULD). I do see some value saying "This release SHOULD contain the new ovirt-engine-coffee-maker package", especially if Eli, the owner of that component really want it in the next version of oVirt, and we have users waiting for it. > > > b. Component owners (e.g. 
Node, engine-core, vdsm) must ACK the > > criteria suggested. > > c. Release criteria discussions shouldn't take more then 2 weeks > > s/then/than/ > > > d. Progress on MUST items should be review every month, during > > the weekly meeting > > s/review/reviewed/ > and? I'm not sure there is a lot to be done other than to revisit > with > owner / un-must them? un-must is one of the possibilities. it will raise the owner/community attention as well. > > > > > 3. Discuses the new version number according to the release > > criteria/amount of features. > > a. Versions will be handled by each component. > > b. The general oVirt version will the engine version. > > > > 5. 60 Days before release - Feature freeze > > a. EXCEPTION: 30 days for 3 month release cycle > > b. All component owners must create a new versioned branch > > I don't see why this is a must, as long as they can specify the > version > to be used (i.e., it could be an existing released version or an > already > existing branch created prior to this date). I think branching is a must. especially during the "release candidate" phase. What will you do with new commits (that should not get into the new version?) (I don't care about "new" branch, just a separate one. If a specific owner doesn't want to release a new version, that's a different discussion) > > > c. "Beta" version should be supplied immediately after. > > + And on a nightly basis afterwards. > > a nightly beta? I wouldn't call a nightly build a beta, just a > nightly > build of the branch. Any suggestion about the frequency of Beta builds? I'm not sure we want to create so many different releases (e.g - stable, beta/rc, nightly) > > > d. Stabilization efforts should start on the new builds. > > e. Cherry-pick fixes for high priority bugs. > > + Zero/Minimal changes to user interface. > > + Inform in advance on any user interface change, and any API > > change. > > f. At this stage, we should start working on the release notes. > > > > 6. 30 days before release - release candidate > > a. EXCEPTION: 15 days for 3 month release cycle > > b. If no blockers (MUST violations) are found the last release > > candidate automatically becomes the final release. > > + Rebuild without the "RC" string. > > + ANOTHER OPTION- Avoid "Beta" or "RC" strings, just use > > major.minor.micro, and bump the micro every time needed. > > c. Release manager will create a wiki with list of release > > blockers > > d. Only release blockers should be fixed in this stage. > > e. OPTIONAL: final release requires three +1 from community > > members > > + This item is currently optional, I'm not sure what a +1 > > means (does a +1 means "I tested this release", or "This > > release generally looks fine for me"?) > > > > 7. Create a new RC if needed > > a. There must be at least one week between the last release > > candidate and the final release > > b. Go/No go meetings will happen once a week in this stage. > > + Increase the amount of meeting according to the release > > manager decision. > > + Release manager will inform the community on any delay. > > > > 8. Release > > a. Create ANNOUNCE message few days before actual release. > > b. Move all release candidate sources/binaries into the "stable" > > directory > > c. Encourage community members to blog / tweet about the release > > d. PARTY > > > > Have any comments? ideas? share them with the list! > > > > Thanks, > > Ofer Schreiber. 
> > _______________________________________________ > > Arch mailing list > > Arch at ovirt.org > > http://lists.ovirt.org/mailman/listinfo/arch > > From abeekhof at redhat.com Tue Feb 21 00:44:29 2012 From: abeekhof at redhat.com (Andrew Beekhof) Date: Tue, 21 Feb 2012 11:44:29 +1100 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F41186F.1050103@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> Message-ID: <4F42E8ED.1060003@redhat.com> On 20/02/12 2:42 AM, Perry Myers wrote: >>> Absolutely. >>> >>> In this case the Cloud Application is the combination of thw two >>> separate VM components (database VM and AS VM). A CAPE (cloud >>> application policy engine) maintains the HA state of both VMs including >>> correcting for resource (db,as) or vm failures, and ensuring ordering >>> constraints even during recovery (the AS would start after the DB in >>> this model). >>> >> >> ok, how would a flow look like to the user (oVirt user)? >> >> - Adding new service in OE >> - Specifying for the service which VMs provide it (?) > > That could work, or you could do: > > 1. Adding a new VM (or set of VMs in OE) > 2. Adding one or more services to associate with those VMs > > Just depends on what the easier user experience is. From the > perspective of pcmk-cloud, we get the same data in the end, which is a > config file that specifies the resources we care about (both VMs and > services on those VMs) > >> - Specify how the service can be monitored (? how does CAPE knows what >> to look for as the service heartbeat?) > > For each service you would specify whether or not to use: > * an OCF resource agent (see resources-agents package in Fedora and > other distros) > * A systemd unit or sysV init script > * Some other custom script (which would need to be either in OCF RA or > init script style) > >> - Marking th service as HA >> >> What's next? >> Where can the user define the policy about this service > > There would need to be UI in OE that exposed an interface for adding > policy information. Because the Pacemaker policy engine is very > flexible, it would make sense to only define very specific knobs in the > UI, otherwise it could get very confusing for the users. For more > complex policies, it might be better to provide a way to manually edit > the policy file and upload it rather than trying to model everything in > the UI. Definitely agree. You'd want to figure out the use cases and then how to map that to PE concepts. Don't start with the PE concepts and work backwards :-) > >> (i.e. 'should be >> available only on Tuesdays' or 'should be available only between >> 0800-1700 CET' etc)? > > For this example, what do you mean by 'should be available'? In general > with HA, the idea is to 'keep the service running as much as possible'. You can tell the PE that a given resource should only be running during certain times though. > The above example seems less like an HA concern and more of a general > resource scheduling concern. I think using the Pacemaker Rules engine > with pcmk-cloud, this should be possible as well, but I'll let > Andrew/Steve comment further on that. It is. 
Whether you really want that is a separate question :-) From sdake at redhat.com Tue Feb 21 01:34:44 2012 From: sdake at redhat.com (Steven Dake) Date: Mon, 20 Feb 2012 18:34:44 -0700 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F4161CC.5040407@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> <4F4161CC.5040407@redhat.com> Message-ID: <4F42F4B4.6070701@redhat.com> On 02/19/2012 01:55 PM, Livnat Peer wrote: > On 19/02/12 17:42, Perry Myers wrote: >>>> Absolutely. >>>> >>>> In this case the Cloud Application is the combination of thw two >>>> separate VM components (database VM and AS VM). A CAPE (cloud >>>> application policy engine) maintains the HA state of both VMs including >>>> correcting for resource (db,as) or vm failures, and ensuring ordering >>>> constraints even during recovery (the AS would start after the DB in >>>> this model). >>>> >>> >>> ok, how would a flow look like to the user (oVirt user)? >>> >>> - Adding new service in OE >>> - Specifying for the service which VMs provide it (?) >> >> That could work, or you could do: >> >> 1. Adding a new VM (or set of VMs in OE) >> 2. Adding one or more services to associate with those VMs >> >> Just depends on what the easier user experience is. From the >> perspective of pcmk-cloud, we get the same data in the end, which is a >> config file that specifies the resources we care about (both VMs and >> services on those VMs) >> >>> - Specify how the service can be monitored (? how does CAPE knows what >>> to look for as the service heartbeat?) >> >> For each service you would specify whether or not to use: >> * an OCF resource agent (see resources-agents package in Fedora and >> other distros) >> * A systemd unit or sysV init script >> * Some other custom script (which would need to be either in OCF RA or >> init script style) >> >>> - Marking th service as HA >>> >>> What's next? >>> Where can the user define the policy about this service >> >> There would need to be UI in OE that exposed an interface for adding >> policy information. Because the Pacemaker policy engine is very >> flexible, it would make sense to only define very specific knobs in the >> UI, otherwise it could get very confusing for the users. For more >> complex policies, it might be better to provide a way to manually edit >> the policy file and upload it rather than trying to model everything in >> the UI. >> >>> (i.e. 'should be >>> available only on Tuesdays' or 'should be available only between >>> 0800-1700 CET' etc)? >> >> For this example, what do you mean by 'should be available'? In general >> with HA, the idea is to 'keep the service running as much as possible'. >> > > You are right, I mixed two use cases. > Let's focus on HA for start. > > Let say CAPE found VM/service is down, does it initiate runVM by OE API? > Who chooses on which host to start the VM and who is responsible for > doing setup work in case it is required by the VM? for example if a VM > is using direct LUN then we might need to connect the host to that LUN > before starting the VM on the target host. > > If CAPE use OE to start the VM the setup will be taken-care-of by OE as > part of starting the VM. > > Currently CAPE uses deltacloud APIs to start/stop instances. 
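A rough sketch of what those start/stop calls through a Deltacloud frontend can look like from Python is below; the endpoint, port, credentials and instance id are illustrative assumptions, not pacemaker-cloud or oVirt code:

    import requests

    DELTACLOUD = "http://localhost:3001/api"   # deltacloudd's usual default port (assumed here)
    AUTH = ("admin@internal", "secret")        # credentials are passed through to the backend provider

    def instance_action(instance_id, action):
        # Deltacloud exposes lifecycle operations as POST actions on an instance
        r = requests.post("%s/instances/%s/%s" % (DELTACLOUD, instance_id, action),
                          auth=AUTH, headers={"Accept": "application/json"})
        r.raise_for_status()
        return r

    if __name__ == "__main__":
        instance_action("inst-db", "stop")     # e.g. terminate the failed database VM
        instance_action("inst-db", "start")    # and ask the IaaS platform to bring it back

Which host the restarted VM lands on is still the engine's decision, as described next.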
The choosing of which host to start the vm is an act of scheduling which, in our model, is in the domain of the IAAS platform, I expect the typical start operation would look like: 1. cape determines which VMs to start 2. cape sends instance start operations to deltacloudd 3. deltacloudd sends instance start operations to OE API 4. OE starts the vms The model we have been operating under is that setup work of the actual virtual machine image is done prior to launching. Physical resource mapping (such as LUNs or block storage) are again the domain of the IAAS platform. Note we have had some informal requests to also handle scheduling, but would need topology information about the physical resources available in order to make those decisions. Currently there is no "standardized" way to determine the topology. We don't tackle this problem (currently) in our implementation. The project is only focused on HA. Regards -steve > >> The above example seems less like an HA concern and more of a general >> resource scheduling concern. I think using the Pacemaker Rules engine >> with pcmk-cloud, this should be possible as well, but I'll let >> Andrew/Steve comment further on that. >> >> Perry > From pmyers at redhat.com Tue Feb 21 01:42:05 2012 From: pmyers at redhat.com (Perry Myers) Date: Mon, 20 Feb 2012 20:42:05 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F42F4B4.6070701@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> <4F4161CC.5040407@redhat.com> <4F42F4B4.6070701@redhat.com> Message-ID: <4F42F66D.4030706@redhat.com> > Currently CAPE uses deltacloud APIs to start/stop instances. > > The choosing of which host to start the vm is an act of scheduling > which, in our model, is in the domain of the IAAS platform, I expect > the typical start operation would look like: > 1. cape determines which VMs to start > 2. cape sends instance start operations to deltacloudd > 3. deltacloudd sends instance start operations to OE API > 4. OE starts the vms > > The model we have been operating under is that setup work of the actual > virtual machine image is done prior to launching. > > Physical resource mapping (such as LUNs or block storage) are again the > domain of the IAAS platform. > > Note we have had some informal requests to also handle scheduling, but > would need topology information about the physical resources available > in order to make those decisions. Currently there is no "standardized" > way to determine the topology. We don't tackle this problem (currently) > in our implementation. The project is only focused on HA. Right, so the path forward can either be: 1. pcmk-cloud operates as described above, offloading VM placement to the IAAS platform (oVirt Engine) and oVirt Engine can either: a. Continue to do VM placement via it's existing algorithms b. Include usage of the Pacemaker Policy Engine specifically to handle VM placement but not HA 2. pcmk-cloud is expanded to include VM placement, and incorporates add'l policies to do this. This requires oVirt Engine to: a. 
Expose APIs that allow pcmk-cloud to examine the current host/network/storage topologies and utilization in order to make the proper choice based on the constraints I think that model 1 is the better one to go with, and we'll leave IAAS platforms to do what they do best, which is understand the available hardware in order to optimally place VMs. (Just as we do w/ EC2, OpenStack, Aeolus) But I also think that 1b is worth exploring as a parallel effort, as more expressive policy for VM placement may be a good thing and the Pacemaker library for the PE could do a good job here. Perry From lpeer at redhat.com Tue Feb 21 13:09:03 2012 From: lpeer at redhat.com (Livnat Peer) Date: Tue, 21 Feb 2012 15:09:03 +0200 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F42F4B4.6070701@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> <4F4161CC.5040407@redhat.com> <4F42F4B4.6070701@redhat.com> Message-ID: <4F43976F.1020206@redhat.com> On 21/02/12 03:34, Steven Dake wrote: > On 02/19/2012 01:55 PM, Livnat Peer wrote: >> On 19/02/12 17:42, Perry Myers wrote: >>>>> Absolutely. >>>>> >>>>> In this case the Cloud Application is the combination of thw two >>>>> separate VM components (database VM and AS VM). A CAPE (cloud >>>>> application policy engine) maintains the HA state of both VMs including >>>>> correcting for resource (db,as) or vm failures, and ensuring ordering >>>>> constraints even during recovery (the AS would start after the DB in >>>>> this model). >>>>> >>>> >>>> ok, how would a flow look like to the user (oVirt user)? >>>> >>>> - Adding new service in OE >>>> - Specifying for the service which VMs provide it (?) >>> >>> That could work, or you could do: >>> >>> 1. Adding a new VM (or set of VMs in OE) >>> 2. Adding one or more services to associate with those VMs >>> >>> Just depends on what the easier user experience is. From the >>> perspective of pcmk-cloud, we get the same data in the end, which is a >>> config file that specifies the resources we care about (both VMs and >>> services on those VMs) >>> >>>> - Specify how the service can be monitored (? how does CAPE knows what >>>> to look for as the service heartbeat?) >>> >>> For each service you would specify whether or not to use: >>> * an OCF resource agent (see resources-agents package in Fedora and >>> other distros) >>> * A systemd unit or sysV init script >>> * Some other custom script (which would need to be either in OCF RA or >>> init script style) >>> >>>> - Marking th service as HA >>>> >>>> What's next? >>>> Where can the user define the policy about this service >>> >>> There would need to be UI in OE that exposed an interface for adding >>> policy information. Because the Pacemaker policy engine is very >>> flexible, it would make sense to only define very specific knobs in the >>> UI, otherwise it could get very confusing for the users. For more >>> complex policies, it might be better to provide a way to manually edit >>> the policy file and upload it rather than trying to model everything in >>> the UI. >>> >>>> (i.e. 'should be >>>> available only on Tuesdays' or 'should be available only between >>>> 0800-1700 CET' etc)? >>> >>> For this example, what do you mean by 'should be available'? 
In general >>> with HA, the idea is to 'keep the service running as much as possible'. >>> >> >> You are right, I mixed two use cases. >> Let's focus on HA for start. >> >> Let say CAPE found VM/service is down, does it initiate runVM by OE API? >> Who chooses on which host to start the VM and who is responsible for >> doing setup work in case it is required by the VM? for example if a VM >> is using direct LUN then we might need to connect the host to that LUN >> before starting the VM on the target host. >> >> If CAPE use OE to start the VM the setup will be taken-care-of by OE as >> part of starting the VM. >> >> > > Currently CAPE uses deltacloud APIs to start/stop instances. > > The choosing of which host to start the vm is an act of scheduling > which, in our model, is in the domain of the IAAS platform, I expect > the typical start operation would look like: > 1. cape determines which VMs to start > 2. cape sends instance start operations to deltacloudd > 3. deltacloudd sends instance start operations to OE API > 4. OE starts the vms > > The model we have been operating under is that setup work of the actual > virtual machine image is done prior to launching. > Few more questions: - If the user initiates stop to HA VM does OE has to coordinate that with cape? terminate CAPE as well? - How does CAPE makes the decision that it is 'safe' to restart the resource? For example currently if OE looses the VM heart beat but we have the host heart beat we know that it is safe to restart the VM. If we loose the host heart beat (which implies we loose the VM heart beat as well) we do not start the VM until we fence the host (or the user can manually approve he rebooted the host). - Currently OE is monitoring the VMs for collecting statistics (CPU, memory, network usage etc.) if OE uses CAPE for providing HA of VMs (or services) it won't 'save' OE the need to monitor the VM for statistics, so if the purpose of this integration is to help with OE scalability don't we need to take care of the monitoring of the VM statistics as well? Livnat > Physical resource mapping (such as LUNs or block storage) are again the > domain of the IAAS platform. > > Note we have had some informal requests to also handle scheduling, but > would need topology information about the physical resources available > in order to make those decisions. Currently there is no "standardized" > way to determine the topology. We don't tackle this problem (currently) > in our implementation. The project is only focused on HA. > > Regards > -steve > >> >>> The above example seems less like an HA concern and more of a general >>> resource scheduling concern. I think using the Pacemaker Rules engine >>> with pcmk-cloud, this should be possible as well, but I'll let >>> Andrew/Steve comment further on that. 
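Since the thread keeps returning to how a service inside a VM is health-checked, here is a minimal sketch of the OCF resource-agent contract mentioned above: the agent is invoked with one action argument (start/stop/monitor/meta-data) and reports health through its exit code. The wrapped "myapp" service and its pidfile are made-up examples, and a real agent would emit full meta-data XML:

    #!/usr/bin/env python
    import os
    import sys

    OCF_SUCCESS, OCF_ERR_GENERIC, OCF_NOT_RUNNING = 0, 1, 7
    PIDFILE = "/var/run/myapp.pid"           # illustrative only

    def running():
        try:
            pid = int(open(PIDFILE).read().strip())
            os.kill(pid, 0)                  # signal 0: existence check, sends nothing
            return True
        except (IOError, ValueError, OSError):
            return False

    def main(action):
        if action == "monitor":
            return OCF_SUCCESS if running() else OCF_NOT_RUNNING
        if action == "start":
            return OCF_SUCCESS if os.system("service myapp start") == 0 else OCF_ERR_GENERIC
        if action == "stop":
            return OCF_SUCCESS if os.system("service myapp stop") == 0 else OCF_ERR_GENERIC
        if action == "meta-data":
            print("<resource-agent name='myapp'/>")   # trimmed; real agents describe parameters/actions
            return OCF_SUCCESS
        return OCF_ERR_GENERIC

    if __name__ == "__main__":
        sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "monitor"))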
>>> >>> Perry >> > From sdake at redhat.com Tue Feb 21 16:27:58 2012 From: sdake at redhat.com (Steven Dake) Date: Tue, 21 Feb 2012 09:27:58 -0700 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F43976F.1020206@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> <4F4161CC.5040407@redhat.com> <4F42F4B4.6070701@redhat.com> <4F43976F.1020206@redhat.com> Message-ID: <4F43C60E.90509@redhat.com> On 02/21/2012 06:09 AM, Livnat Peer wrote: > On 21/02/12 03:34, Steven Dake wrote: >> On 02/19/2012 01:55 PM, Livnat Peer wrote: >>> On 19/02/12 17:42, Perry Myers wrote: >>>>>> Absolutely. >>>>>> >>>>>> In this case the Cloud Application is the combination of thw two >>>>>> separate VM components (database VM and AS VM). A CAPE (cloud >>>>>> application policy engine) maintains the HA state of both VMs including >>>>>> correcting for resource (db,as) or vm failures, and ensuring ordering >>>>>> constraints even during recovery (the AS would start after the DB in >>>>>> this model). >>>>>> >>>>> >>>>> ok, how would a flow look like to the user (oVirt user)? >>>>> >>>>> - Adding new service in OE >>>>> - Specifying for the service which VMs provide it (?) >>>> >>>> That could work, or you could do: >>>> >>>> 1. Adding a new VM (or set of VMs in OE) >>>> 2. Adding one or more services to associate with those VMs >>>> >>>> Just depends on what the easier user experience is. From the >>>> perspective of pcmk-cloud, we get the same data in the end, which is a >>>> config file that specifies the resources we care about (both VMs and >>>> services on those VMs) >>>> >>>>> - Specify how the service can be monitored (? how does CAPE knows what >>>>> to look for as the service heartbeat?) >>>> >>>> For each service you would specify whether or not to use: >>>> * an OCF resource agent (see resources-agents package in Fedora and >>>> other distros) >>>> * A systemd unit or sysV init script >>>> * Some other custom script (which would need to be either in OCF RA or >>>> init script style) >>>> >>>>> - Marking th service as HA >>>>> >>>>> What's next? >>>>> Where can the user define the policy about this service >>>> >>>> There would need to be UI in OE that exposed an interface for adding >>>> policy information. Because the Pacemaker policy engine is very >>>> flexible, it would make sense to only define very specific knobs in the >>>> UI, otherwise it could get very confusing for the users. For more >>>> complex policies, it might be better to provide a way to manually edit >>>> the policy file and upload it rather than trying to model everything in >>>> the UI. >>>> >>>>> (i.e. 'should be >>>>> available only on Tuesdays' or 'should be available only between >>>>> 0800-1700 CET' etc)? >>>> >>>> For this example, what do you mean by 'should be available'? In general >>>> with HA, the idea is to 'keep the service running as much as possible'. >>>> >>> >>> You are right, I mixed two use cases. >>> Let's focus on HA for start. >>> >>> Let say CAPE found VM/service is down, does it initiate runVM by OE API? >>> Who chooses on which host to start the VM and who is responsible for >>> doing setup work in case it is required by the VM? 
for example if a VM >>> is using direct LUN then we might need to connect the host to that LUN >>> before starting the VM on the target host. >>> >>> If CAPE use OE to start the VM the setup will be taken-care-of by OE as >>> part of starting the VM. >>> >>> >> >> Currently CAPE uses deltacloud APIs to start/stop instances. >> >> The choosing of which host to start the vm is an act of scheduling >> which, in our model, is in the domain of the IAAS platform, I expect >> the typical start operation would look like: >> 1. cape determines which VMs to start >> 2. cape sends instance start operations to deltacloudd >> 3. deltacloudd sends instance start operations to OE API >> 4. OE starts the vms >> >> The model we have been operating under is that setup work of the actual >> virtual machine image is done prior to launching. >> > > Few more questions: > > - If the user initiates stop to HA VM does OE has to coordinate that > with cape? terminate CAPE as well? > There is another process called a CPE (cloud policy engine) which provides a REST API for start/stop of instances. This process starts and stops the CAPE processes as necessary. > - How does CAPE makes the decision that it is 'safe' to restart the > resource? when monitoring fails in some way we terminate the node via deltacloud. > For example currently if OE looses the VM heart beat but we have the > host heart beat we know that it is safe to restart the VM. If we loose > the host heart beat (which implies we loose the VM heart beat as well) > we do not start the VM until we fence the host (or the user can manually > approve he rebooted the host). > This particular use case could be handled with a bit of extra code on our end. Use case seems reasonable. > > - Currently OE is monitoring the VMs for collecting statistics (CPU, > memory, network usage etc.) if OE uses CAPE for providing HA of VMs (or > services) it won't 'save' OE the need to monitor the VM for statistics, > so if the purpose of this integration is to help with OE scalability > don't we need to take care of the monitoring of the VM statistics as well? > We support multiple transport mechanisms per a separate cape binary. Please have a look at http://www.pacemaker-cloud.org/downloads/cape-ovirt.pdf This shows how ovirt support could be added by pacemaker cloud devs. Essentially ovirt.o would communicate with current ovirt monitoring infrastructure via whatever method makes the most sense. The operations that trans_ssh.o, or matahari.o or ovirt.o need are vm healthcheck, reosurce start, stop, monitor. Regards -steve > Livnat > >> Physical resource mapping (such as LUNs or block storage) are again the >> domain of the IAAS platform. >> >> Note we have had some informal requests to also handle scheduling, but >> would need topology information about the physical resources available >> in order to make those decisions. Currently there is no "standardized" >> way to determine the topology. We don't tackle this problem (currently) >> in our implementation. The project is only focused on HA. >> >> Regards >> -steve >> >>> >>>> The above example seems less like an HA concern and more of a general >>>> resource scheduling concern. I think using the Pacemaker Rules engine >>>> with pcmk-cloud, this should be possible as well, but I'll let >>>> Andrew/Steve comment further on that. 
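A rough sketch of the per-transport contract described above (vm healthcheck plus resource start/stop/monitor) follows; the class and method names are invented for illustration, and the real pacemaker-cloud transports are native code, so this only shows the shape of the interface:

    class Transport(object):
        """One object per transport (ssh, matahari, a possible ovirt one)."""
        def vm_healthcheck(self, vm_id):
            raise NotImplementedError
        def resource_start(self, vm_id, resource):
            raise NotImplementedError
        def resource_stop(self, vm_id, resource):
            raise NotImplementedError
        def resource_monitor(self, vm_id, resource):
            raise NotImplementedError

    class OvirtTransport(Transport):
        """Hypothetical oVirt-backed transport: VM health could come from the
        engine's existing monitoring rather than an in-guest agent."""
        def __init__(self, engine_client):
            self.engine = engine_client       # e.g. a thin oVirt REST API wrapper (assumed)

        def vm_healthcheck(self, vm_id):
            # assumes the wrapper exposes a vm_status() helper returning 'up'/'down'
            return self.engine.vm_status(vm_id) == "up"

        def resource_monitor(self, vm_id, resource):
            # service-level checks would still need some guest channel or agent
            raise NotImplementedError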
>>>> >>>> Perry >>> >> > From pmyers at redhat.com Tue Feb 21 19:30:50 2012 From: pmyers at redhat.com (Perry Myers) Date: Tue, 21 Feb 2012 14:30:50 -0500 Subject: all-in-one install via oVirt Node Message-ID: <4F43F0EA.2090502@redhat.com> I believe folks have already gotten an all-in-one install working via a Fedora 16 host which runs the oVirt Engine infrastructure and vdsm side by side, so that the same host running OE can also run VMs under OE. This conversation came up on users list from the perspective of oVirt Node: http://lists.ovirt.org/pipermail/users/2012-February/000702.html It doesn't make sense to install OE onto oVirt Node itself. The lack of a full OS and stateless FS would make that a waste of effort :) However, what about doing the following: * Boot oVirt Node, do basic config (install to disk, networking) * Start a VM specifically to run oVirt Engine. This VM would need to run off of either: + dedicated local storage (/data partition, so local disk must be large enough) + an iSCSI or FC LUN, but this LUN should never be part of the normal set of LUNs that oVirt Engine would use for other VMs * Once this VM is up and running and oVirt Engine is up then you can register oVirt Node to that Engine * You'd want to make sure that OE is not able to fence the node it is running on, and the resources needed by the OE VM must be subtracted from the total available resources (would this be done automatically already?) To support this, I think we'd just need to provide an easy way for an oVirt Node user to start up that mgmt VM outside of normal OE control. Since libvirt is already present and we have a simple TUI version of virt-manager, maybe we can utilize that interface to do this. Of course, once the oVirt Node is registered with OE, the local libvirt/virt-manager TUI should not be used for anything except for management of the OE VM itself. Would something like this be desirable for PoC or demo purposes? If folks don't think it's a good idea, I'm happy to not do it but curious as to what people think. Perry From pmyers at redhat.com Tue Feb 21 19:18:42 2012 From: pmyers at redhat.com (Perry Myers) Date: Tue, 21 Feb 2012 14:18:42 -0500 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F43C071.2070005@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> <4F4161CC.5040407@redhat.com> <4F42F4B4.6070701@redhat.com> <4F43976F.1020206@redhat.com> <4F43C071.2070005@redhat.com> Message-ID: <4F43EE12.9080904@redhat.com> >> - How does CAPE makes the decision that it is 'safe' to restart the >> resource? > > it terminates it via deltacloud. This may not be sufficient, as in your > use case. We could add additional support here to do extra fencing > operations to match the underlying IAAS platform. I don't see why this would be necessary. 1. CAPE makes a call to deltacloud (stop VM A) 2. deltacloud in turn uses oVirt REST API to issue a stop VM A command 3. oVirt Engine tries to stop the VM via vdsm (fencing the VM itself) 4. if the VM cannot be terminated due to host being inaccessible, OE at that point would fall back to host fencing So the IAAS platform is responsible for ensuring that a 'stop VM' command via deltacloud results in either: 1. 
success with assurance that the VM has been terminated by some means (be it VM fencing or host fencing) 2. failure, which could mean that an unexpected error occurred or that host fencing ultimately failed to power down the host What should never happen in oVirt Engine is: 3. success, but oVirt Engine is not sure if the VM is terminated or not As that could lead to some nice data corruption :) In this sense, pcmk-cloud using deltacloud to talk to oVirt Engine for VM lifecycle control is no different than how pcmk-cloud would use deltacloud for talking to EC2 or any other cloud provider From sdake at redhat.com Tue Feb 21 17:53:00 2012 From: sdake at redhat.com (Steven Dake) Date: Tue, 21 Feb 2012 10:53:00 -0700 Subject: Some thoughts on enhancing High Availability in oVirt In-Reply-To: <4F43C60E.90509@redhat.com> References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com> <4F3C717C.3050708@redhat.com> <4F3D05B8.10302@redhat.com> <4F3D1159.4090302@redhat.com> <4F40FF3A.6020401@redhat.com> <4F41186F.1050103@redhat.com> <4F4161CC.5040407@redhat.com> <4F42F4B4.6070701@redhat.com> <4F43976F.1020206@redhat.com> <4F43C60E.90509@redhat.com> Message-ID: <4F43D9FC.6050101@redhat.com> On 02/21/2012 09:27 AM, Steven Dake wrote: > On 02/21/2012 06:09 AM, Livnat Peer wrote: >> On 21/02/12 03:34, Steven Dake wrote: >>> On 02/19/2012 01:55 PM, Livnat Peer wrote: >>>> On 19/02/12 17:42, Perry Myers wrote: >>>>>>> Absolutely. >>>>>>> >>>>>>> In this case the Cloud Application is the combination of thw two >>>>>>> separate VM components (database VM and AS VM). A CAPE (cloud >>>>>>> application policy engine) maintains the HA state of both VMs including >>>>>>> correcting for resource (db,as) or vm failures, and ensuring ordering >>>>>>> constraints even during recovery (the AS would start after the DB in >>>>>>> this model). >>>>>>> >>>>>> >>>>>> ok, how would a flow look like to the user (oVirt user)? >>>>>> >>>>>> - Adding new service in OE >>>>>> - Specifying for the service which VMs provide it (?) >>>>> >>>>> That could work, or you could do: >>>>> >>>>> 1. Adding a new VM (or set of VMs in OE) >>>>> 2. Adding one or more services to associate with those VMs >>>>> >>>>> Just depends on what the easier user experience is. From the >>>>> perspective of pcmk-cloud, we get the same data in the end, which is a >>>>> config file that specifies the resources we care about (both VMs and >>>>> services on those VMs) >>>>> >>>>>> - Specify how the service can be monitored (? how does CAPE knows what >>>>>> to look for as the service heartbeat?) >>>>> >>>>> For each service you would specify whether or not to use: >>>>> * an OCF resource agent (see resources-agents package in Fedora and >>>>> other distros) >>>>> * A systemd unit or sysV init script >>>>> * Some other custom script (which would need to be either in OCF RA or >>>>> init script style) >>>>> >>>>>> - Marking th service as HA >>>>>> >>>>>> What's next? >>>>>> Where can the user define the policy about this service >>>>> >>>>> There would need to be UI in OE that exposed an interface for adding >>>>> policy information. Because the Pacemaker policy engine is very >>>>> flexible, it would make sense to only define very specific knobs in the >>>>> UI, otherwise it could get very confusing for the users. For more >>>>> complex policies, it might be better to provide a way to manually edit >>>>> the policy file and upload it rather than trying to model everything in >>>>> the UI. 
>>>>> >>>>>> (i.e. 'should be >>>>>> available only on Tuesdays' or 'should be available only between >>>>>> 0800-1700 CET' etc)? >>>>> >>>>> For this example, what do you mean by 'should be available'? In general >>>>> with HA, the idea is to 'keep the service running as much as possible'. >>>>> >>>> >>>> You are right, I mixed two use cases. >>>> Let's focus on HA for start. >>>> >>>> Let say CAPE found VM/service is down, does it initiate runVM by OE API? >>>> Who chooses on which host to start the VM and who is responsible for >>>> doing setup work in case it is required by the VM? for example if a VM >>>> is using direct LUN then we might need to connect the host to that LUN >>>> before starting the VM on the target host. >>>> >>>> If CAPE use OE to start the VM the setup will be taken-care-of by OE as >>>> part of starting the VM. >>>> >>>> >>> >>> Currently CAPE uses deltacloud APIs to start/stop instances. >>> >>> The choosing of which host to start the vm is an act of scheduling >>> which, in our model, is in the domain of the IAAS platform, I expect >>> the typical start operation would look like: >>> 1. cape determines which VMs to start >>> 2. cape sends instance start operations to deltacloudd >>> 3. deltacloudd sends instance start operations to OE API >>> 4. OE starts the vms >>> >>> The model we have been operating under is that setup work of the actual >>> virtual machine image is done prior to launching. >>> >> >> Few more questions: >> >> - If the user initiates stop to HA VM does OE has to coordinate that >> with cape? terminate CAPE as well? >> > > There is another process called a CPE (cloud policy engine) which > provides a REST API for start/stop of instances. This process starts > and stops the CAPE processes as necessary. > correction: replace instances above with "cloud applications" >> - How does CAPE makes the decision that it is 'safe' to restart the >> resource? > > when monitoring fails in some way we terminate the node via deltacloud. > >> For example currently if OE looses the VM heart beat but we have the >> host heart beat we know that it is safe to restart the VM. If we loose >> the host heart beat (which implies we loose the VM heart beat as well) >> we do not start the VM until we fence the host (or the user can manually >> approve he rebooted the host). >> > > This particular use case could be handled with a bit of extra code on > our end. Use case seems reasonable. > >> >> - Currently OE is monitoring the VMs for collecting statistics (CPU, >> memory, network usage etc.) if OE uses CAPE for providing HA of VMs (or >> services) it won't 'save' OE the need to monitor the VM for statistics, >> so if the purpose of this integration is to help with OE scalability >> don't we need to take care of the monitoring of the VM statistics as well? >> > > We support multiple transport mechanisms per a separate cape binary. > Please have a look at > > http://www.pacemaker-cloud.org/downloads/cape-ovirt.pdf > > This shows how ovirt support could be added by pacemaker cloud devs. > Essentially ovirt.o would communicate with current ovirt monitoring > infrastructure via whatever method makes the most sense. The operations > that trans_ssh.o, or matahari.o or ovirt.o need are vm healthcheck, > reosurce start, stop, monitor. > > Regards > -steve > >> Livnat >> >>> Physical resource mapping (such as LUNs or block storage) are again the >>> domain of the IAAS platform. 
>>> >>> Note we have had some informal requests to also handle scheduling, but >>> would need topology information about the physical resources available >>> in order to make those decisions. Currently there is no "standardized" >>> way to determine the topology. We don't tackle this problem (currently) >>> in our implementation. The project is only focused on HA. >>> >>> Regards >>> -steve >>> >>>> >>>>> The above example seems less like an HA concern and more of a general >>>>> resource scheduling concern. I think using the Pacemaker Rules engine >>>>> with pcmk-cloud, this should be possible as well, but I'll let >>>>> Andrew/Steve comment further on that. >>>>> >>>>> Perry >>>> >>> >> > > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From kwade at redhat.com Wed Feb 22 11:35:04 2012 From: kwade at redhat.com (Karsten 'quaid' Wade) Date: Wed, 22 Feb 2012 03:35:04 -0800 Subject: oVirt weekly meeting reminder - 1500 UTC Wed 22 Feb in #ovirt Message-ID: <4F44D2E8.1070206@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 We'll be in #ovirt on irc.oftc.net. Agenda I have so far: * Update on release process plan * Beijing workshop status * What else? Anything missing? - -- name: Karsten 'quaid' Wade, Sr. Community Architect team: Red Hat Community Architecture & Leadership uri: http://communityleadershipteam.org http://TheOpenSourceWay.org gpg: AD0E0C41 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFPRNLn2ZIOBq0ODEERArYgAJ9FIoh04RJosL3/aKAO5WM5B636ngCfepv1 qsemz6xPsxUC3k49FiSvmxw= =VF9G -----END PGP SIGNATURE----- From simon at redhat.com Wed Feb 22 15:32:17 2012 From: simon at redhat.com (Simon Grinberg) Date: Wed, 22 Feb 2012 10:32:17 -0500 (EST) Subject: all-in-one install via oVirt Node In-Reply-To: <4F43F0EA.2090502@redhat.com> Message-ID: ----- Original Message ----- > From: "Perry Myers" > To: arch at ovirt.org > Cc: "Joseph Boggs" > Sent: Tuesday, February 21, 2012 9:30:50 PM > Subject: all-in-one install via oVirt Node > > I believe folks have already gotten an all-in-one install working via > a > Fedora 16 host which runs the oVirt Engine infrastructure and vdsm > side > by side, so that the same host running OE can also run VMs under OE. > > This conversation came up on users list from the perspective of oVirt > Node: > http://lists.ovirt.org/pipermail/users/2012-February/000702.html > > It doesn't make sense to install OE onto oVirt Node itself. The lack > of > a full OS and stateless FS would make that a waste of effort :) > > However, what about doing the following: > > * Boot oVirt Node, do basic config (install to disk, networking) > * Start a VM specifically to run oVirt Engine. This VM would need to > run off of either: > + dedicated local storage (/data partition, so local disk must be > large enough) > + an iSCSI or FC LUN, but this LUN should never be part of the > normal > set of LUNs that oVirt Engine would use for other VMs > * Once this VM is up and running and oVirt Engine is up then you can > register oVirt Node to that Engine > * You'd want to make sure that OE is not able to fence the node it is > running on, and the resources needed by the OE VM must be > subtracted > from the total available resources (would this be done > automatically > already?) 
> > To support this, I think we'd just need to provide an easy way for an > oVirt Node user to start up that mgmt VM outside of normal OE > control. > Since libvirt is already present and we have a simple TUI version of > virt-manager, maybe we can utilize that interface to do this. > > Of course, once the oVirt Node is registered with OE, the local > libvirt/virt-manager TUI should not be used for anything except for > management of the OE VM itself. > > Would something like this be desirable for PoC or demo purposes? If > folks don't think it's a good idea, I'm happy to not do it but > curious > as to what people think. It's fairly trivial to install OE on a full fedora host so what is the purpose of this POC/Demo? Usually if it's for demo purposes or just to save on hardware while you are testing oVirt then you'll probably want to use the Desktop mode otherwise you'll require another machine as the client (which rules out OVirt node) If it's as POC towards a self hosted OE (otherwise I don't see the motivation to move to all in one oVirt node) then you'll probably want to stick to shared storage and add more staff like some kind of script running on the nodes ensuring that the OE VM will be started on any node. It's not that trivial :) > Perry > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From pmyers at redhat.com Wed Feb 22 15:36:13 2012 From: pmyers at redhat.com (Perry Myers) Date: Wed, 22 Feb 2012 10:36:13 -0500 Subject: all-in-one install via oVirt Node In-Reply-To: References: Message-ID: <4F450B6D.7050805@redhat.com> > It's fairly trivial to install OE on a full fedora host so what is > the purpose of this POC/Demo? It came up in the context of a request on the users list. Personally I tend to agree with your point above, but figured it was worth seeing what other folks think. As I said in my initial email, I'm happy to not go down this path if folks don't think it's useful From dfediuck at redhat.com Wed Feb 22 15:40:06 2012 From: dfediuck at redhat.com (Doron Fediuck) Date: Wed, 22 Feb 2012 17:40:06 +0200 Subject: all-in-one install via oVirt Node In-Reply-To: <4F450B6D.7050805@redhat.com> References: <4F450B6D.7050805@redhat.com> Message-ID: <4F450C56.5060804@redhat.com> On 22/02/12 17:36, Perry Myers wrote: >> It's fairly trivial to install OE on a full fedora host so what is >> the purpose of this POC/Demo? > > It came up in the context of a request on the users list. Personally I > tend to agree with your point above, but figured it was worth seeing > what other folks think. > > As I said in my initial email, I'm happy to not go down this path if > folks don't think it's useful +2 Knowing both sides (node and engine), we definitely do not wish to go that path if it's not essential. -- /d Why doesn't DOS ever say "EXCELLENT command or filename!" From kwade at redhat.com Wed Feb 29 00:11:10 2012 From: kwade at redhat.com (Karsten 'quaid' Wade) Date: Tue, 28 Feb 2012 16:11:10 -0800 Subject: Need a co-guest for FLOSS Weekly 7 March Message-ID: <4F4D6D1E.20307@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi: We've lined up to do an interview about oVirt on FLOSS Weekly on Wednesday 7 March at 1630 or 1730 UTC. So I'm prepared to talk about the open source community stuff - how we got to where we are, how we've structured the project, governance, contributions, the workshops, and how all of this has gone thus far. 
What I can't do is talk from the perspective of a developer contributor, nor can I (yet) articulate the technical structure of the project. I'm looking for someone to join me who: * Is comfortable speaking about oVirt technology and open source development for up to an hour with me to FLOSS Weekly's audience. * Can do the little bit of prep work we need to do in advance. * Can make that meeting time. * Is able to join using whatever video technology we agree to. (They use Skype by default, but that's not a real option for me; I'm considering asking them to use Google Talk, unless someone else wants to manage a SIP solution here.) I've got a few people lined up (Chris Wright, Jason Brooks) who can speak about the technology, but not to the same depth and breadth as many of you. I don't think it matters which sub-project you are involved in, especially if you understand the technology overall. It would be great to get someone who represents one of the other Board companies or who is here for their own reasons. If you are interested, let me know ASAP. If there is more than one person, maybe we can sort out who on this list? If we give a good interview, we can get invited back. :) - - Karsten - -- name: Karsten 'quaid' Wade, Sr. Community Architect team: Red Hat Community Architecture & Leadership uri: http://communityleadershipteam.org http://TheOpenSourceWay.org gpg: AD0E0C41 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFPTW0e2ZIOBq0ODEERAuXFAKDZtgPuXYAU+YZNuZX4idn+435D1QCdE2qD +XdGIsKFLhFPOasDneIubCc= =9dOu -----END PGP SIGNATURE----- From ilvovsky at redhat.com Wed Feb 29 08:10:30 2012 From: ilvovsky at redhat.com (Igor Lvovsky) Date: Wed, 29 Feb 2012 03:10:30 -0500 (EST) Subject: Hot plug NIC wiki pages Message-ID: <00a701ccf6ba$41df3980$c59dac80$@com> Hi, Please review following Hotplug NIC design: http://www.ovirt.org/wiki/Features/HotplugNic http://www.ovirt.org/wiki/Features/Design/DetailedHotlugNic Regards, Igor Lvovsky From kmestery at cisco.com Wed Feb 29 15:08:09 2012 From: kmestery at cisco.com (Kyle Mestery (kmestery)) Date: Wed, 29 Feb 2012 15:08:09 +0000 Subject: Need a co-guest for FLOSS Weekly 7 March In-Reply-To: <4F4D6D1E.20307@redhat.com> References: <4F4D6D1E.20307@redhat.com> Message-ID: <37835A7A-47C1-4575-9D03-0415E890A4CB@cisco.com> Hi Karsten: If no one else has volunteered yet, I can join you for this, although I may be at the same depth as Chris and Jason. :) Thanks! Kyle On Feb 28, 2012, at 6:11 PM, Karsten 'quaid' Wade wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi: > > We've lined up to do an interview about oVirt on FLOSS Weekly on > Wednesday 7 March at 1630 or 1730 UTC. > > So I'm prepared to talk about the open source community stuff - how we > got to where we are, how we've structured the project, governance, > contributions, the workshops, and how all of this has gone thus far. > > What I can't do is talk from the perspective of a developer > contributor, nor can I (yet) articulate the technical structure of the > project. > > I'm looking for someone to join me who: > > * Is comfortable speaking about oVirt technology and open source > development for up to an hour with me to FLOSS Weekly's audience. > > * Can do the little bit of prep work we need to do in advance. > > * Can make that meeting time. > > * Is able to join using whatever video technology we agree to. 
(They > use Skype by default, but that's not a real option for me; I'm > considering asking them to use Google Talk, unless someone else wants > to manage a SIP solution here.) > > I've got a few people lined up (Chris Wright, Jason Brooks) who can > speak about the technology, but not to the same depth and breadth as > many of you. I don't think it matters which sub-project you are > involved in, especially if you understand the technology overall. > > It would be great to get someone who represents one of the other Board > companies or who is here for their own reasons. > > If you are interested, let me know ASAP. If there is more than one > person, maybe we can sort out who on this list? If we give a good > interview, we can get invited back. :) > > - - Karsten From iheim at redhat.com Wed Feb 29 15:26:46 2012 From: iheim at redhat.com (Itamar Heim) Date: Wed, 29 Feb 2012 17:26:46 +0200 Subject: Need a co-guest for FLOSS Weekly 7 March In-Reply-To: <4F4D6D1E.20307@redhat.com> References: <4F4D6D1E.20307@redhat.com> Message-ID: <4F4E43B6.9060101@redhat.com> On 02/29/2012 02:11 AM, Karsten 'quaid' Wade wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi: > > We've lined up to do an interview about oVirt on FLOSS Weekly on > Wednesday 7 March at 1630 or 1730 UTC. > > So I'm prepared to talk about the open source community stuff - how we > got to where we are, how we've structured the project, governance, > contributions, the workshops, and how all of this has gone thus far. > > What I can't do is talk from the perspective of a developer > contributor, nor can I (yet) articulate the technical structure of the > project. > > I'm looking for someone to join me who: > > * Is comfortable speaking about oVirt technology and open source > development for up to an hour with me to FLOSS Weekly's audience. > > * Can do the little bit of prep work we need to do in advance. > > * Can make that meeting time. > > * Is able to join using whatever video technology we agree to. (They > use Skype by default, but that's not a real option for me; I'm > considering asking them to use Google Talk, unless someone else wants > to manage a SIP solution here.) > > I've got a few people lined up (Chris Wright, Jason Brooks) who can > speak about the technology, but not to the same depth and breadth as > many of you. I don't think it matters which sub-project you are > involved in, especially if you understand the technology overall. > > It would be great to get someone who represents one of the other Board > companies or who is here for their own reasons. > > If you are interested, let me know ASAP. If there is more than one > person, maybe we can sort out who on this list? If we give a good > interview, we can get invited back. :) Hi Karsten, I can take this one. Thanks, Itamar From kwade at redhat.com Wed Feb 29 15:50:13 2012 From: kwade at redhat.com (Karsten 'quaid' Wade) Date: Wed, 29 Feb 2012 07:50:13 -0800 Subject: Meeting minutes 2012-02-29 Message-ID: <4F4E4935.60103@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Minutes: http://ovirt.org/meetings/ovirt/2012/ovirt.2012-02-29-15.02.html Minutes (text): http://ovirt.org/meetings/ovirt/2012/ovirt.2012-02-29-15.02.txt Log: http://ovirt.org/meetings/ovirt/2012/ovirt.2012-02-29-15.02.log.html ============== #ovirt Meeting ============== Meeting started by quaid at 15:02:11 UTC. The full logs are available at http://ovirt.org/meetings/ovirt/2012/ovirt.2012-02-29-15.02.log.html . 
Meeting summary - --------------- * Hello & kick the chair around for not getting an agenda out (quaid, 15:02:56) * Agenda discussion (quaid, 15:05:31) * ACTION: quaid or any other meeting organizer to send out agenda 24 hours in advance of meeting, which spurs status reports (quaid, 15:13:35) * Better target is EOD Monday for agenda email (quaid, 15:15:03) * AGREED: Monday EOD & no later than Tue morning, meeting organizer sends out agenda to trigger status emails from sub-projects and bug status (quaid, 15:15:54) * Status on release process (follow-up) (quaid, 15:16:49) * LINK: www.ovirt.org/wiki/Second_Release (oschreib, 15:18:06) * deadline to get in features-that-are-blockers is 7 March (quaid, 15:37:14) * Beijing workshop update (quaid, 15:38:42) * RSVPs have slowed down, jbrooks and I are working on getting them moving again (quaid, 15:39:12) * Some information about where the venue is - for booking hotels - is now on the workshop email list (quaid, 15:39:33) * Any other sub-projects have anything to bring up today? (quaid, 15:41:30) * Anything else for today? (quaid, 15:44:42) Meeting ended at 15:48:39 UTC. Action Items - ------------ * quaid or any other meeting organizer to send out agenda 24 hours in advance of meeting, which spurs status reports Action Items, by person - ----------------------- * quaid * quaid or any other meeting organizer to send out agenda 24 hours in advance of meeting, which spurs status reports * **UNASSIGNED** * (none) People Present (lines said) - --------------------------- * quaid (53) * oschreib (17) * pmyers (9) * mgoldboi (8) * ovirtbot (6) * mburns (5) * ichristo (2) * sgordon (1) * mestery (1) * aglitke (1) * jb_netapp (1) - -- name: Karsten 'quaid' Wade, Sr. Community Architect team: Red Hat Community Architecture & Leadership uri: http://communityleadershipteam.org http://TheOpenSourceWay.org gpg: AD0E0C41 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFPTkk12ZIOBq0ODEERArzQAKCeCJ+32ikKellvCiQSNZmD77qh8ACgxAxN mM2Wqr9dbK6l7xO9K3EJXHY= =OXbr -----END PGP SIGNATURE----- From pmyers at redhat.com Wed Feb 29 16:22:10 2012 From: pmyers at redhat.com (Perry Myers) Date: Wed, 29 Feb 2012 11:22:10 -0500 Subject: freenode vs. oftc Message-ID: <4F4E50B2.3060904@redhat.com> Now that we've gotten #ovirt at freenode unblocked and reopened... Thoughts on making freenode our primary channel and deprecating the oftc #ovirt channel? Many of our other related projects are on freenode (vdsm for example), and so we're sort of an outlier by being on oftc. Consolidation on a single irc network may be desirable. +1 and -1's appreciated, and we'll see if there is a consensus From eblake at redhat.com Wed Feb 29 16:26:58 2012 From: eblake at redhat.com (Eric Blake) Date: Wed, 29 Feb 2012 09:26:58 -0700 Subject: freenode vs. oftc In-Reply-To: <4F4E50B2.3060904@redhat.com> References: <4F4E50B2.3060904@redhat.com> Message-ID: <4F4E51D2.3080603@redhat.com> On 02/29/2012 09:22 AM, Perry Myers wrote: > Now that we've gotten #ovirt at freenode unblocked and reopened... > > Thoughts on making freenode our primary channel and deprecating the oftc > #ovirt channel? > > Many of our other related projects are on freenode (vdsm for example), > and so we're sort of an outlier by being on oftc. Consolidation on a > single irc network may be desirable. #virt (for libvirt, virt-manager) is still stuck on oftc; any way we can chase down claiming that channel as well? 
> > +1 and -1's appreciated, and we'll see if there is a consensus +1, even if we can't move #virt for a while. -- Eric Blake eblake at redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 620 bytes Desc: OpenPGP digital signature URL: From acathrow at redhat.com Wed Feb 29 16:27:19 2012 From: acathrow at redhat.com (Andrew Cathrow) Date: Wed, 29 Feb 2012 11:27:19 -0500 (EST) Subject: freenode vs. oftc In-Reply-To: <4F4E50B2.3060904@redhat.com> Message-ID: <10652590-7455-42ec-88c5-2d1d440cae9b@zmail07.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > From: "Perry Myers" > To: arch at ovirt.org > Sent: Wednesday, February 29, 2012 11:22:10 AM > Subject: freenode vs. oftc > > Now that we've gotten #ovirt at freenode unblocked and reopened... > > Thoughts on making freenode our primary channel and deprecating the > oftc > #ovirt channel? > > Many of our other related projects are on freenode (vdsm for > example), > and so we're sort of an outlier by being on oftc. Consolidation on a > single irc network may be desirable. > > +1 and -1's appreciated, and we'll see if there is a consensus #kvm is on freenode #virt (libvirt & friends) is on oftc #vdsm is on freenode #rhev is on freenode So yes, it's all over the place. But on the plus side we do have over 100 people on IRC as I type this today. Ideally we'd all be on one place on freenode. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From ovedo at redhat.com Wed Feb 29 16:30:08 2012 From: ovedo at redhat.com (Oved Ourfalli) Date: Wed, 29 Feb 2012 11:30:08 -0500 (EST) Subject: freenode vs. oftc In-Reply-To: <4F4E50B2.3060904@redhat.com> Message-ID: <2acd0ce1-9c81-465c-846d-7dcabd5bb503@zmail02.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > From: "Perry Myers" > To: arch at ovirt.org > Sent: Wednesday, February 29, 2012 6:22:10 PM > Subject: freenode vs. oftc > > Now that we've gotten #ovirt at freenode unblocked and reopened... > > Thoughts on making freenode our primary channel and deprecating the > oftc > #ovirt channel? > > Many of our other related projects are on freenode (vdsm for > example), > and so we're sort of an outlier by being on oftc. Consolidation on a > single irc network may be desirable. > > +1 and -1's appreciated, and we'll see if there is a consensus +1. > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch > From anthony at codemonkey.ws Wed Feb 29 16:36:25 2012 From: anthony at codemonkey.ws (Anthony Liguori) Date: Wed, 29 Feb 2012 10:36:25 -0600 Subject: freenode vs. oftc In-Reply-To: <4F4E51D2.3080603@redhat.com> References: <4F4E50B2.3060904@redhat.com> <4F4E51D2.3080603@redhat.com> Message-ID: <4F4E5409.3080601@codemonkey.ws> On 02/29/2012 10:26 AM, Eric Blake wrote: > On 02/29/2012 09:22 AM, Perry Myers wrote: >> Now that we've gotten #ovirt at freenode unblocked and reopened... >> >> Thoughts on making freenode our primary channel and deprecating the oftc >> #ovirt channel? >> >> Many of our other related projects are on freenode (vdsm for example), >> and so we're sort of an outlier by being on oftc. Consolidation on a >> single irc network may be desirable. 
> > #virt (for libvirt, virt-manager) is still stuck on oftc; any way we can > chase down claiming that channel as well? #qemu recently moved to oftc. Getting admin support on freenode is much harder than it is on oftc. That was our primary reason for moving. Regards, Anthony Liguori > >> >> +1 and -1's appreciated, and we'll see if there is a consensus > > +1, even if we can't move #virt for a while. > > > > > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From pmyers at redhat.com Wed Feb 29 16:38:44 2012 From: pmyers at redhat.com (Perry Myers) Date: Wed, 29 Feb 2012 11:38:44 -0500 Subject: freenode vs. oftc In-Reply-To: <4F4E5409.3080601@codemonkey.ws> References: <4F4E50B2.3060904@redhat.com> <4F4E51D2.3080603@redhat.com> <4F4E5409.3080601@codemonkey.ws> Message-ID: <4F4E5494.8080801@redhat.com> On 02/29/2012 11:36 AM, Anthony Liguori wrote: > On 02/29/2012 10:26 AM, Eric Blake wrote: >> On 02/29/2012 09:22 AM, Perry Myers wrote: >>> Now that we've gotten #ovirt at freenode unblocked and reopened... >>> >>> Thoughts on making freenode our primary channel and deprecating the oftc >>> #ovirt channel? >>> >>> Many of our other related projects are on freenode (vdsm for example), >>> and so we're sort of an outlier by being on oftc. Consolidation on a >>> single irc network may be desirable. >> >> #virt (for libvirt, virt-manager) is still stuck on oftc; any way we can >> chase down claiming that channel as well? > > #qemu recently moved to oftc. Didn't realize this... > Getting admin support on freenode is much harder than it is on oftc. > That was our primary reason for moving. Yes, we've seen this too Maybe it makes sense for vdsm to move to oftc and then we can consolidate there? I don't have an preference really between fn and oftc, but think it's generally useful to have the related projects share the same irc network. Since kvm, libvirt and ovirt are all on oftc, moving vdsm might make more sense From yzaslavs at redhat.com Wed Feb 29 17:31:13 2012 From: yzaslavs at redhat.com (Yair Zaslavsky) Date: Wed, 29 Feb 2012 19:31:13 +0200 Subject: freenode vs. oftc In-Reply-To: <10652590-7455-42ec-88c5-2d1d440cae9b@zmail07.collab.prod.int.phx2.redhat.com> References: <10652590-7455-42ec-88c5-2d1d440cae9b@zmail07.collab.prod.int.phx2.redhat.com> Message-ID: <4F4E60E1.4020508@redhat.com> On 02/29/2012 06:27 PM, Andrew Cathrow wrote: > > > ----- Original Message ----- >> From: "Perry Myers" >> To: arch at ovirt.org >> Sent: Wednesday, February 29, 2012 11:22:10 AM >> Subject: freenode vs. oftc >> >> Now that we've gotten #ovirt at freenode unblocked and reopened... >> >> Thoughts on making freenode our primary channel and deprecating the >> oftc >> #ovirt channel? >> >> Many of our other related projects are on freenode (vdsm for >> example), >> and so we're sort of an outlier by being on oftc. Consolidation on a >> single irc network may be desirable. >> >> +1 and -1's appreciated, and we'll see if there is a consensus > > #kvm is on freenode > #virt (libvirt & friends) is on oftc > #vdsm is on freenode > #rhev is on freenode Looking from java-side developer perspective, I found other java-open source projects (JBoss stuff) on freenode. So, +1 on that as well. > > So yes, it's all over the place. > But on the plus side we do have over 100 people on IRC as I type this today. > > Ideally we'd all be on one place on freenode. 
> > > > > >> _______________________________________________ >> Arch mailing list >> Arch at ovirt.org >> http://lists.ovirt.org/mailman/listinfo/arch >> > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From jbrooks at redhat.com Wed Feb 29 20:37:23 2012 From: jbrooks at redhat.com (Jason Brooks) Date: Wed, 29 Feb 2012 12:37:23 -0800 Subject: freenode vs. oftc In-Reply-To: <4F4E50B2.3060904@redhat.com> References: <4F4E50B2.3060904@redhat.com> Message-ID: <4F4E8C83.3060904@redhat.com> On 02/29/2012 08:22 AM, Perry Myers wrote: > Now that we've gotten #ovirt at freenode unblocked and reopened... > > Thoughts on making freenode our primary channel and deprecating the oftc > #ovirt channel? > > Many of our other related projects are on freenode (vdsm for example), > and so we're sort of an outlier by being on oftc. Consolidation on a > single irc network may be desirable. > > +1 and -1's appreciated, and we'll see if there is a consensus +1 > _______________________________________________ > Arch mailing list > Arch at ovirt.org > http://lists.ovirt.org/mailman/listinfo/arch From abaron at redhat.com Wed Feb 29 22:41:14 2012 From: abaron at redhat.com (Ayal Baron) Date: Wed, 29 Feb 2012 17:41:14 -0500 (EST) Subject: freenode vs. oftc In-Reply-To: <4F4E5494.8080801@redhat.com> Message-ID: <4c8d0923-4aa4-4d63-b546-18d649cb2062@zmail13.collab.prod.int.phx2.redhat.com> ----- Original Message ----- > On 02/29/2012 11:36 AM, Anthony Liguori wrote: > > On 02/29/2012 10:26 AM, Eric Blake wrote: > >> On 02/29/2012 09:22 AM, Perry Myers wrote: > >>> Now that we've gotten #ovirt at freenode unblocked and reopened... > >>> > >>> Thoughts on making freenode our primary channel and deprecating > >>> the oftc > >>> #ovirt channel? > >>> > >>> Many of our other related projects are on freenode (vdsm for > >>> example), > >>> and so we're sort of an outlier by being on oftc. Consolidation > >>> on a > >>> single irc network may be desirable. > >> > >> #virt (for libvirt, virt-manager) is still stuck on oftc; any way > >> we can > >> chase down claiming that channel as well? > > > > #qemu recently moved to oftc. > > Didn't realize this... > > > Getting admin support on freenode is much harder than it is on > > oftc. > > That was our primary reason for moving. > > Yes, we've seen this too > > Maybe it makes sense for vdsm to move to oftc and then we can > consolidate there? I don't have an preference really between fn and > oftc, but think it's generally useful to have the related projects > share > the same irc network. > > Since kvm, libvirt and ovirt are all on oftc, moving vdsm might make > more sense I don't have a personal preference, but kvm is on freenode not oftc. So is RHEV. Not directly virt related but #lvm is on freenode as well (vdsm makes extensive use of it so the same logic follows here). Jboss is mainly on freenode as well (no oftc presence) So it sounds to me simpler to move ovirt over to freenode now that we have the channel, but I don't feel strongly about it either way. 
> _______________________________________________
> Arch mailing list
> Arch at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/arch
>

From lhh at redhat.com Wed Feb 15 18:31:20 2012
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 15 Feb 2012 18:31:20 -0000
Subject: Some thoughts on enhancing High Availability in oVirt
In-Reply-To: <4F3B932C.9030602@redhat.com>
References: <99643c65-e6e4-4718-808c-907f2d24571b@zmail13.collab.prod.int.phx2.redhat.com> <4F3B932C.9030602@redhat.com>
Message-ID: <4F3BF9F5.9050600@redhat.com>

On 02/15/2012 06:12 AM, Livnat Peer wrote:
>>> HA is a simple use case of policy.
>>
>> *Today* HA is simply 'if the VM is down, restart it', but what Perry was
>> suggesting was to improve this to something more robust.
>
> I think that the main concept of what Perry suggested (leaving the
> implementation details aside :)) is to add HA of services.

FYI - What we have in cluster today is simply an external monitor script
to monitor something inside the VM (opaque). This only moderately improves
the availability of the application, since we still restart the VM to
restart the application.

(Point of fact, this functionality was built specifically for RHEVM-in-VM
deployments on bare-metal clusters.)

> I like this idea and I would like to extend it a little bit.
> How about services that are spread across more than a single VM?
> I would like to be able to define a service and specify which VM/s
> provides this service and add an HA flag on the service.

This sounds like Pacemaker's 'clone' concept. I don't know if Pacemaker
Cloud supports this or not; Steve?

> Then I would like to manage policies around it - I define a service
> with 3 VMs providing this service and I want to have at least 2 VMs
> running it at any given time. (Now the VMs are not highly available,
> only the service is.)

Sure, that's a natural extension.

-- Lon

From smizrahi at redhat.com Thu Feb 16 15:49:15 2012
From: smizrahi at redhat.com (Saggi Mizrahi)
Date: Thu, 16 Feb 2012 15:49:15 -0000
Subject: [RFC] Ovirt feature PosixFS Connections
In-Reply-To:
Message-ID: <916f2ebb-70fa-4eba-9297-ed14e8929b7d@zmail16.collab.prod.int.phx2.redhat.com>

http://www.ovirt.org/wiki/Features/PosixFSConnection

From smizrahi at redhat.com Thu Feb 16 15:49:59 2012
From: smizrahi at redhat.com (Saggi Mizrahi)
Date: Thu, 16 Feb 2012 15:49:59 -0000
Subject: [RFC] Ovirt feature Storage Server Connection References
In-Reply-To: <3d4fde3a-5294-4ee6-a622-569d37cfcedd@zmail16.collab.prod.int.phx2.redhat.com>
Message-ID: <5f7afb02-c28b-4342-8723-f9e812414cea@zmail16.collab.prod.int.phx2.redhat.com>

http://www.ovirt.org/wiki/Features/ConnectionReferences

From ilvovsky at redhat.com Sun Feb 19 15:42:02 2012
From: ilvovsky at redhat.com (Igor Lvovsky)
Date: Sun, 19 Feb 2012 10:42:02 -0500 (EST)
Subject: [Engine-devel] Empty cdrom drive.
In-Reply-To: <4F40D7BF.7040401@redhat.com>
References: <4F40D7BF.7040401@redhat.com>
Message-ID: <014f01ccef1d$aab3c560$001b5020$@com>

> -----Original Message-----
> From: engine-devel-bounces at ovirt.org [mailto:engine-devel-bounces at ovirt.org]
> On Behalf Of Livnat Peer
> Sent: Sunday, February 19, 2012 1:07 PM
> To: Dan Kenigsberg
> Cc: engine-devel at ovirt.org; arch at ovirt.org
> Subject: Re: [Engine-devel] Empty cdrom drive.
>
> On 15/02/12 11:29, Miki Kenneth wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Ayal Baron"
> >> To: "Yaniv Kaul"
> >> Cc: engine-devel at ovirt.org
> >> Sent: Wednesday, February 15, 2012 11:23:54 AM
> >> Subject: Re: [Engine-devel] Empty cdrom drive.
> >>
> >>
> >> ----- Original Message -----
> >>> On 02/15/2012 09:44 AM, Igor Lvovsky wrote:
> >>>> Hi,
> >>>> I want to discuss $subject on the email just to be sure that we are
> >>>> all on the same page.
> >>>>
> >>>> So, today in 3.0 vdsm has two ways to create a VM with a cdrom:
> >>>> 1. If RHEV-M asks to create a VM with a cdrom, vdsm just creates it.
> >>>> 2. If RHEV-M doesn't ask to create a VM with a cdrom, vdsm still
> >>>>    creates the VM with an empty cdrom. Vdsm creates this device as
> >>>>    'hdc' (IDE device, index 2), because of libvirt restrictions.
> >>>>    In this case RHEV-M will be able to "insert" a cdrom on the fly
> >>>>    with a changeCD request.
> >>>>
> >>>> In the new-style API we want to get rid of stupid scenario #2,
> >>>> because we want to be able to create a VM without a cdrom at all.
> >>>> It means that now we need to change our scenarios a little:
> >>>> 1. If RHEV-M asks to create a VM with a cdrom, vdsm just creates it.
> >>>> 2. RHEV-M doesn't want to create a VM with a cdrom, but it wants to
> >>>>    be able to "insert" a cdrom on the fly after this. Here we have
> >>>>    two options:
> >>>>    a. RHEV-M should pass an empty cdrom device on VM creation and
> >>>>       use a regular changeCD after that
> >>>>    b. RHEV-M can create the VM without a cdrom and add a cdrom later
> >>>>       through the hotplugDisk command.
> >>>>
>
> The preferred solution IMO would be to let the user choose if he wants a
> VM with a CD or not.
> I think the motivation for the above is to 'save' an IDE slot if a user
> does not need a CD.
>
> If the user wants to have a VM with a CD, the engine would create an empty
> CD and pass it to VDSM as a device, but if the user does not require a
> CD there is no reason to create it in VDSM or in the OE (oVirt Engine).
>
> Supporting the above requires the engine upgrade to create an empty CD
> device for all VMs.
>

+1. Indeed, this is the right thing to do.

> Dan - what happens in the 3.0 API if the engine passes the cdrom element
> but with an empty path attribute? (I know that if the engine does not
> pass a cdrom element, VDSM creates an empty CD.)

We will still create an empty CD.

>
>
> Livnat
>
>
> >>>> Note: the new libvirt removes the previous restriction on cdrom
> >>>> devices. Now a cdrom can be created as an IDE or VIRTIO device at
> >>>> any index. It means we can easily hotplug it.
> >>>
> >>> I didn't know a CDROM can be a virtio device, but in any case it
> >>> requires a driver (which may not exist on Windows).
> >>> I didn't know an IDE CDROM can be hot-plugged (only USB-based?),
> >>
> >> It can't be hotplugged.
> >> USB-based is not IDE (the IDE device is the USB port, the cdrom is a
> >> USB device afaik).
> >>
> >> The point of this email is that since we want to support being able
> >> to start VMs *without* a cdrom, the default behaviour of attaching a
> >> cdrom device needs to be implemented in the engine, or we shall have
> >> a regression.
> > This is a regression that we cannot live with...
> >> In the new API (for stable device addresses) vdsm doesn't
> >> automatically attach a cdrom.
> >>
> >>> perhaps
> >>> I'm wrong here.
> >>> Y.
> >>>>
> >>>> Regards,
> >>>> Igor Lvovsky
>
> _______________________________________________
> Engine-devel mailing list
> Engine-devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/engine-devel
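
To make the two options in the thread above concrete, here is a minimal
Python sketch of the engine-side decision between always attaching an empty
cdrom device (option a) and omitting it entirely (option b). The dictionary
keys ('type', 'device', 'iface', 'index', 'path', 'readonly') are
illustrative assumptions loosely modelled on the thread's terminology, not
the actual VDSM or engine API; only the 'hdc' / IDE index 2 detail comes
from the thread itself.

    # Hedged sketch: the key names below are assumptions, not the real
    # VDSM/engine device schema.

    def build_devices(with_cdrom):
        """Return the device list an engine might send on VM creation."""
        devices = []  # disks, NICs, etc. would also be appended here

        if with_cdrom:
            # Option (a): always attach an *empty* cdrom so that a later
            # changeCD call can insert media without any hotplug.
            devices.append({
                'type': 'disk',
                'device': 'cdrom',
                'iface': 'ide',   # historically 'hdc', i.e. IDE index 2
                'index': '2',
                'path': '',       # empty path == empty drive
                'readonly': 'true',
            })
        # Option (b): omit the cdrom entirely and rely on hotplugDisk later.
        return devices

    if __name__ == '__main__':
        print(build_devices(with_cdrom=True))   # VM with an empty CD drive
        print(build_devices(with_cdrom=False))  # VM with no CD device at all

Option (b) saves the IDE slot, but it depends on being able to hotplug a
cdrom later, which is exactly the concern raised above for IDE-based drives.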