[Engine-devel] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

Hello All,

In preparation for ovirt-engine 3.2, the entire "vdsm-bootstrap" bootstrap was rewritten from scratch into a more pluggable and flexible implementation, available in git master and in the nightly snapshots.

As far as packaging is concerned, there are now two more dependencies of ovirt-engine:

* otopi -- oVirt Task Oriented Pluggable Installer/Implementation
* ovirt-host-deploy -- oVirt host deploy tool

These packages replace the legacy vdsm-bootstrap package that was distributed with vdsm.

Git repositories are available at [1][2].
Documentation is available in the Git repositories (README* files).
Builds are available at the usual place [3].
Bugzilla components will be available shortly.
Change log is attached.

There is no change in the way the engine performs the host deployment process in terms of user experience, other than improved event log messages during deployment.

The deployment log is fetched from the host and stored on the engine machine at /var/log/ovirt-engine/host-deploy; on the host it is at /tmp/ovirt-host-deploy*.log and is deleted once fetched to the engine.

Among other features, the ovirt-host-deploy package can be installed manually on a host and executed to prepare the host for installation. In the future we may be able to add a host to the engine without performing the deployment process; for now it will be usable for integration tests.

The internals are completely different. Instead of having 3 different bootstrap sequences:

1. host install
2. ovirt-node install
3. ovirt-node approve

we now have a single sequence that is common to host and node installation or re-installation. The end result is a much simpler implementation.

Please report any issues, even minor ones, so we can stabilize it for the 3.2 release.

Best Regards,
Alon Bar-Lev.

[1] http://gerrit.ovirt.org/gitweb?p=otopi.git;a=tree
[2] http://gerrit.ovirt.org/gitweb?p=ovirt-host-deploy.git;a=tree
[3] http://www.ovirt.org/releases/nightly/rpm/Fedora/17/noarch/

---

Change Log

* offline packager feature.
* tuned is installed with virtual-host profile.
* initial implementation based on otopi.
* implementation is based on legacy vdsm-bootstrap package functionality.
* legacy-removed: legacy VDSM (<3.0) config upgrade.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
* legacy-removed: kernel version test, package dependency is sufficient.
* legacy-removed: do not add kernel parameter processor.max_cstate=1
  warn if constant_tsc is not available
  https://bugzilla.redhat.com/show_bug.cgi?id=770153
* legacy-change: io elevator scheduler set in kernel command-line
  use either udev rule in vdsm package or tuned.
* legacy-change: vdsm libvirt reconfigure
  vdsm is reconfigured with a file-based trigger instead of an unsupported systemd init.d parameter.
* legacy-change: distribution checks are simpler, based on Python platform; minimum:
  - rhel-6.2
  - fedora-17
* legacy-change: minimum vdsm version is taken from engine, not hard coded.
* legacy-change: pki now uses m2crypto to generate the certificate request and parse certificates.
* legacy-change: use iproute2 instead of python ethtool to avoid another dependency for host name validation.
* legacy-change: use iproute2 instead of reading /proc/net/route for route information and interface information.
* legacy-change: do not use vdsm.netinfo for vlan and bonding as it requires /usr/share/vdsm modules, and it is trivial anyway.
* legacy-change: use vdsm-store-net-config script to commit network config instead of internal duplicate implementation.
* legacy-change: /etc/vdsm/vdsm.conf is overridden unless VDSM/configOverride environment is set to True
* legacy-change: /etc/vdsm/vdsm.conf is not read for fake_qemu; set the VDSM/checkVirtHardware environment to False to avoid hardware detection.
* legacy-change: following gluster packages are not installed:
  - glusterfs-rdma
  - glusterfs-geo-replication
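The change log mentions reading route information through iproute2 rather than /proc/net/route. A minimal sketch of that idea follows; it is not the ovirt-host-deploy code. The function names and the sample output are illustrative assumptions, and only the `ip route show` command itself comes from iproute2.

```python
import subprocess


def _value_after(fields, keyword):
    """Return the token that follows keyword in fields, or None."""
    if keyword in fields:
        index = fields.index(keyword)
        if index + 1 < len(fields):
            return fields[index + 1]
    return None


def parse_default_route(ip_route_output):
    """Return (gateway, device) of the first default route, or None."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "default":
            return _value_after(fields, "via"), _value_after(fields, "dev")
    return None


def read_default_route():
    """Query the live routing table; assumes the iproute2 'ip' binary is on PATH."""
    output = subprocess.check_output(
        ["ip", "route", "show"], universal_newlines=True
    )
    return parse_default_route(output)


# Illustrative sample of 'ip route show' output (addresses are made up).
SAMPLE = (
    "default via 192.168.1.1 dev eth0 proto static\n"
    "192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.10\n"
)
print(parse_default_route(SAMPLE))
```

Keeping the parser separate from the subprocess call makes it testable on a machine without the `ip` binary.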

On Wed, Nov 28, 2012 at 08:59:10AM -0500, Alon Bar-Lev wrote:
These packages replace the legacy vdsm-bootstrap package that was distributed with vdsm.
Hurray! I suspect that a `git-rm vds_bootstrap/*` is pending?
Git repositories are available at [1][2]. Documentation is available in the Git repositories (README* files). Builds are available at the usual place [3]. Bugzilla components will be available shortly.
Are there requests to add the components to Fedora (18, EPEL6)? I think we should add these requests as blockers for Bug 881006 - Tracker: oVirt 3.2 release.
* tuned is installed with virtual-host profile.
I never understood why this is an installer step, and not part of vdsmd start up
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
* legacy-change: /etc/vdsm/vdsm.conf is overridden unless VDSM/configOverride environment is set to True
I'm a bit confused by the negation: I'd expect VDSM/configOverride=True to mean "override /etc/vdsm/vdsm.conf".
Alon, thanks for your tremendous work on this. I cannot wait to have it up and running in the release. Dan.

----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com>
To: "Alon Bar-Lev" <alonbl@redhat.com>
Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "engine-devel" <engine-devel@ovirt.org>, "users" <users@ovirt.org>
Sent: Wednesday, November 28, 2012 4:41:04 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 08:59:10AM -0500, Alon Bar-Lev wrote:
These packages replace the legacy vdsm-bootstrap package that was distributed with vdsm.
Hurray!
I suspect that a `git-rm vds_bootstrap/*` is pending?
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Git repositories are available at [1][2]. Documentation is available in the Git repositories (README* files). Builds are available at the usual place [3]. Bugzilla components will be available shortly.
Are there requests to add the components to Fedora (18, EPEL6)? I think we should add these requests as blockers for Bug 881006 - Tracker: oVirt 3.2 release.
Yes, I am on this one.
* tuned is installed with virtual-host profile.
I never understood why this is an installer step, and not part of vdsmd start up
There may be several methods to tune a machine. Why should VDSM depend on a specific one?
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
* legacy-change: /etc/vdsm/vdsm.conf is overridden unless VDSM/configOverride environment is set to True
I'm a bit confused by the negation: I'd expect VDSM/configOverride=True to mean "override /etc/vdsm/vdsm.conf".
Sorry! It should be False.
Alon, thanks for your tremendous work on this. I cannot wait to have it up and running in the release.
Thank you! I truly hope that from this point we can only make it better. Alon.
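With the correction above (the value that preserves the file is False), the intended behaviour of VDSM/configOverride can be sketched as follows. This is an illustration only, not the ovirt-host-deploy implementation: the function name, the default of True when the key is unset, and the rule that a missing file is always written are assumptions.

```python
import os

# Environment key name as given in the change log.
CONFIG_OVERRIDE_KEY = "VDSM/configOverride"


def should_write_vdsm_conf(environment, conf_path="/etc/vdsm/vdsm.conf"):
    """Decide whether the deploy step may (re)write vdsm.conf.

    Assumptions: the override defaults to True when the key is unset,
    and a missing file is always written because there is nothing to keep.
    """
    if not os.path.exists(conf_path):
        return True
    return bool(environment.get(CONFIG_OVERRIDE_KEY, True))
```

Under these assumptions, an existing vdsm.conf is left untouched only when the environment explicitly sets VDSM/configOverride to False.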

On Wed, Nov 28, 2012 at 09:45:17AM -0500, Alon Bar-Lev wrote:
On Wed, Nov 28, 2012 at 08:59:10AM -0500, Alon Bar-Lev wrote:
These packages replace the legacy vdsm-bootstrap package that was distributed with vdsm.
Hurray!
I suspect that a `git-rm vds_bootstrap/*` is pending?
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
* tuned is installed with virtual-host profile.
I never understood why this is an installer step, and not part of vdsmd start up
There may be several methods to tune a machine. Why should VDSM depend on a specific one?
Maybe because I tend to install vdsm using `yum`, and would like it to do The Right Thing to make the host an oVirt node. I suspect that if ovirt-host-deploy proves to be easy to use, I could follow my `yum install vdsm` with `ovirt-host-deploy`.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
Alon, thanks for your tremendous work on this. I cannot wait to have it up and running in the release.
Thank you! I truly hope that from this point we can only make it better.
Do you mean that we've reached rock bottom? ;-)

----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com>
To: "Alon Bar-Lev" <alonbl@redhat.com>
Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "engine-devel" <engine-devel@ovirt.org>, "users" <users@ovirt.org>
Sent: Wednesday, November 28, 2012 9:48:41 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 09:45:17AM -0500, Alon Bar-Lev wrote:
On Wed, Nov 28, 2012 at 08:59:10AM -0500, Alon Bar-Lev wrote:
These packages replace the legacy vdsm-bootstrap package that was distributed with vdsm.
Hurray!
I suspect that a `git-rm vds_bootstrap/*` is pending?
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
I tried to find one, but the more I dig, the more I find that we need to support the old legacy.
* tuned is installed with virtual-host profile.
I never understood why this is an installer step, and not part of vdsmd start up
There may be several methods to tune a machine. Why should VDSM depend on a specific one?
Maybe because I tend to install vdsm using `yum`, and would like it to do The Right Thing to make the host an oVirt node. I suspect that if ovirt-host-deploy proves to be easy to use, I could follow my `yum install vdsm` with `ovirt-host-deploy`.
I will be glad if you try it out.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
I usually do not make any jokes... A global system setting should not go into a package-specific location. Usually core dumps are off by default; I like this approach, as an unattended system may quickly consume all disk space because of dumps. If sysadmin manually enables dumps, he may do this at a location of his own choice. If we want to automatically enable dumps I guess it should go to /var/lib/core or similar.
Alon, thanks for your tremendous work on this. I cannot wait to have it up and running in the release.
Thank you! I truly hope that from this point we can only make it better.
Do you mean that we've reached rock bottom? ;-)
No, I mean that now I have some infrastructure that I can use to provide solutions. Regards, Alon
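The two concerns raised above, a neutral dump location such as /var/lib/core and the risk of filling the disk, can be sketched together. This is a hypothetical helper, not part of vdsm or otopi: the function names, the file name layout, and the free-space threshold are assumptions. The `%e`, `%p` and `%t` specifiers are the standard ones documented in core(5).

```python
import os
import shutil

CORE_PATTERN_FILE = "/proc/sys/kernel/core_pattern"


def build_core_pattern(directory="/var/lib/core"):
    """Build a core_pattern value: %e = executable, %p = PID, %t = dump time."""
    return os.path.join(directory, "core.%e.%p.%t")


def enough_free_space(directory, min_free_bytes):
    """Check that the dump directory's filesystem has at least min_free_bytes free."""
    return shutil.disk_usage(directory).free >= min_free_bytes


def enable_core_dumps(directory, min_free_bytes, pattern_file=CORE_PATTERN_FILE):
    """Write the pattern only if the target has room; return True if written.

    Writing to the real /proc file requires root; pattern_file is a
    parameter so the logic can be exercised against an ordinary file.
    """
    if not enough_free_space(directory, min_free_bytes):
        return False
    with open(pattern_file, "w") as handle:
        handle.write(build_core_pattern(directory) + "\n")
    return True
```

Note that setting the pattern alone does not produce dumps while the core size limit (`ulimit -c`) is 0.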

On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
I tried to find one, but the more I dig, the more I find that we need to support the old legacy.
Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
I usually do not make any jokes... A global system setting should not go into a package-specific location. Usually core dumps are off by default; I like this approach, as an unattended system may quickly consume all disk space because of dumps.
If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)
If sysadmin manually enables dumps, he may do this at a location of his own choice.
Note that we've just swapped hats: you're arguing for letting a local admin log in and mess with system configuration, and I'm for keeping a centralized feature for storing and collecting core dumps.
If we want to automatically enable dumps I guess it should go to /var/lib/core or similar.

----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com>
To: "Alon Bar-Lev" <alonbl@redhat.com>
Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "engine-devel" <engine-devel@ovirt.org>, "users" <users@ovirt.org>
Sent: Wednesday, November 28, 2012 10:39:42 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
I tried to find one, but the more I dig, the more I find that we need to support the old legacy.
Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?
We should start to detach from distro-specific procedures.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
I usually do not make any jokes... A global system setting should not go into a package-specific location. Usually core dumps are off by default; I like this approach, as an unattended system may quickly consume all disk space because of dumps.
If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)
There should be a vdsm-debug-aids (or similar) to perform such changes. Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.
If sysadmin manually enables dumps, he may do this at a location of his own choice.
Note that we've just swapped hats: you're arguing for letting a local admin log in and mess with system configuration, and I'm for keeping a centralized feature for storing and collecting core dumps.
Because problems like crashes are investigated per case and per reproduction scenario. But again, I may be wrong, and we should have a VDSM API command to start/stop storing dumps, and manage this via its master...
If we want to automatically enable dumps I guess it should go to /var/lib/core or similar.

* Alon Bar-Lev <alonbl@redhat.com> [2012-11-28 14:47]:
If sysadmin manually enables dumps, he may do this at a location of his own choice.
Note that we've just swapped hats: you're arguing for letting a local admin log in and mess with system configuration, and I'm for keeping a centralized feature for storing and collecting core dumps.
Because problems like crashes are investigated per case and per reproduction scenario. But again, I may be wrong, and we should have a VDSM API command to start/stop storing dumps, and manage this via its master...
I very much like this idea. There was a thread a while back discussing [1] this very idea; I was looking for a way to enable a 'debugging' mode as well as a way to programmatically collect debugging info (which could include host stats, guest stats, logs and any core files). Certainly in such a scenario, being able to enable/disable various features of a debugging mode could include whether to enable core dumps as well as where to save them on the host.
1. http://comments.gmane.org/gmane.comp.emulators.ovirt.vdsm.devel/1387
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh@us.ibm.com
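As a rough illustration of the programmatic collection described above, the sketch below packs logs and core files into one archive. No such VDSM API is described in this thread; the function name and the example glob patterns are hypothetical.

```python
import glob
import os
import tarfile


def collect_debug_files(patterns, archive_path):
    """Pack every regular file matching the glob patterns into a gzip tarball.

    Returns the sorted list of files that were added.
    """
    files = sorted(
        {
            path
            for pattern in patterns
            for path in glob.glob(pattern)
            if os.path.isfile(path)
        }
    )
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in files:
            archive.add(path)
    return files


# Hypothetical patterns; the real locations depend on how the host is configured.
EXAMPLE_PATTERNS = ["/var/log/vdsm/*.log", "/var/lib/core/core.*"]
```

A debugging mode could call such a helper on demand, so that dumps and logs are gathered centrally instead of by a local admin.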

On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com> To: "Alon Bar-Lev" <alonbl@redhat.com> Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "engine-devel" <engine-devel@ovirt.org>, "users" <users@ovirt.org> Sent: Wednesday, November 28, 2012 10:39:42 PM Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
I tried to find such, but the more I dig I find that we need to support old legacy.
Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?
We should start and detach from specific distro procedures.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
I usually do not make any jokes... A global system setting should not go into a package-specific location. Core dumps are usually off by default; I like this approach, as an unattended system may quickly consume all disk space because of dumps.
If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)
There should be a vdsm-debug-aids (or similar) to perform such changes. Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.
I agree with your statement above that a single package should not override a global system setting. We should really work to remove as many of these from vdsm as we possibly can. It will help to make vdsm a much safer/well-behaved package.
If sysadmin manually enables dumps, he may do this at a location of his own choice.
Note that we've just swapped hats: you're arguing for letting a local admin log in and mess with system configuration, and I'm for keeping a centralized feature for storing and collecting core dumps.
After all, problems like crashes are investigated per case and reproduction scenario. But again, I may be wrong and we should have a VDSM API command to start/stop storing dumps and manage this via its master...
-- Adam Litke <agl@us.ibm.com> IBM Linux Technology Center
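The start/stop command floated above could, under some assumptions, look like the Python sketch below. CoreDumpSwitch is a made-up name and not part of any VDSM API. The pattern file defaults to the kernel's /proc/sys/kernel/core_pattern, but it is a parameter so the logic can be tried on an ordinary file without root. The previous system-wide value is saved on start and put back on stop, so the host is not left permanently modified.

```python
from pathlib import Path

CORE_PATTERN = Path("/proc/sys/kernel/core_pattern")


class CoreDumpSwitch:
    """Hypothetical start/stop toggle for storing core dumps."""

    def __init__(self, pattern_file=CORE_PATTERN):
        self.pattern_file = Path(pattern_file)
        self._saved = None  # pattern that was active before start()

    def start(self, dump_dir):
        """Point core dumps at dump_dir, remembering the old pattern once."""
        if self._saved is None:
            self._saved = self.pattern_file.read_text().strip()
        # %e = executable name, %p = PID (kernel core_pattern specifiers)
        self.pattern_file.write_text(f"{dump_dir}/core.%e.%p\n")

    def stop(self):
        """Restore the pattern that was active before the first start()."""
        if self._saved is not None:
            self.pattern_file.write_text(self._saved + "\n")
            self._saved = None
```

Writing to the real file requires root, and the chosen dump directory is up to whoever issues the command; neither is handled here.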

On Wed, Nov 28, 2012 at 03:29:35PM -0600, Adam Litke wrote:
On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com> To: "Alon Bar-Lev" <alonbl@redhat.com> Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "engine-devel" <engine-devel@ovirt.org>, "users" <users@ovirt.org> Sent: Wednesday, November 28, 2012 10:39:42 PM Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
I tried to find such, but the more I dig I find that we need to support old legacy.
Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?
We should start and detach from specific distro procedures.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
I usually do not make any jokes... A global system setting should not go into a package-specific location. Core dumps are usually off by default; I like this approach, as an unattended system may quickly consume all disk space because of dumps.
If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)
There should be a vdsm-debug-aids (or similar) to perform such changes. Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.
I agree with your statement above that a single package should not override a global system setting. We should really work to remove as many of these from vdsm as we possibly can. It will help to make vdsm a much safer/well-behaved package.
I'm fine with dropping these from vdsm, but I think they are good for ovirt - we would like to (be able to) enforce policy on our nodes. If configuring core dumps is removed from vdsm, it should go somewhere else, or our log-collector users would miss their beloved dumps.
Dan.
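On the disk-space worry raised earlier in this exchange: wherever the dumps end up living, a size cap is one way to keep an unattended host from filling up. The Python sketch below is only an illustration and is not the logrotate rule that vdsm ships; prune_cores and its parameters are hypothetical. It keeps the newest core files that fit in a byte budget and deletes the rest.

```python
from pathlib import Path


def prune_cores(core_dir, max_bytes):
    """Delete core files until the directory fits in max_bytes.

    Files are visited newest first; any file that would push the running
    total over the budget is removed. Returns the removed file names.
    """
    files = sorted(
        (f for f in Path(core_dir).iterdir() if f.is_file()),
        key=lambda f: f.stat().st_mtime,
        reverse=True,  # newest first, so recent crashes survive
    )
    kept_bytes = 0
    removed = []
    for f in files:
        size = f.stat().st_size
        if kept_bytes + size <= max_bytes:
            kept_bytes += size
        else:
            f.unlink()
            removed.append(f.name)
    return removed
```

Keeping the newest dumps is a design choice; a policy that preserves the first crash of each process would be equally defensible.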

On Thu, Nov 29, 2012 at 10:00:12AM +0200, Dan Kenigsberg wrote:
On Wed, Nov 28, 2012 at 03:29:35PM -0600, Adam Litke wrote:
On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
----- Original Message -----
From: "Dan Kenigsberg" <danken@redhat.com> To: "Alon Bar-Lev" <alonbl@redhat.com> Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "engine-devel" <engine-devel@ovirt.org>, "users" <users@ovirt.org> Sent: Wednesday, November 28, 2012 10:39:42 PM Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.
Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? users@ovirt.org, please chime in!
I tried to find such, but the more I dig I find that we need to support old legacy.
Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?
We should start and detach from specific distro procedures.
* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern
Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.
It does not mean it should be at /var/lib/vdsm ... :)
I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?
I usually do not make any jokes... A global system setting should not go into a package-specific location. Core dumps are usually off by default; I like this approach, as an unattended system may quickly consume all disk space because of dumps.
If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)
There should be a vdsm-debug-aids (or similar) to perform such changes. Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.
I agree with your statement above that a single package should not override a global system setting. We should really work to remove as many of these from vdsm as we possibly can. It will help to make vdsm a much safer/well-behaved package.
I'm fine with dropping these from vdsm, but I think they are good for ovirt - we would like to (be able to) enforce policy on our nodes.
If configuring core dumps is removed from vdsm, it should go somewhere else, or our log-collector users would miss their beloved dumps.
Yes, I agree. From my point of view the plan was to do the following:
1. Remove unnecessary system configuration changes. This includes things like Royce's supervdsm startup process patch (and accompanying sudo->supervdsm conversions) which allows us to remove some of the sudo configuration.
2. Isolate the remaining tweaks into vdsm-tool.
3. Provide a service/program that can be run to configure a system to work in an ovirt-engine controlled cluster.
Doing this allows vdsm to be safely installed on any system as a basic prerequisite for other software.
-- Adam Litke <agl@us.ibm.com> IBM Linux Technology Center
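Steps 2 and 3 of the plan above can be pictured with a small Python sketch. It is not vdsm-tool's actual interface; Tweak and configure are hypothetical names. Each system change is isolated as a named tweak, and a separate entry point applies them only when explicitly asked, so installing the package alone changes nothing on the host.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tweak:
    """One isolated system configuration change (hypothetical)."""
    name: str
    apply: Callable[[], None]


def configure(tweaks: List[Tweak], dry_run: bool = True) -> List[str]:
    """Apply every tweak in order, or only report them when dry_run is set."""
    report = []
    for tweak in tweaks:
        if dry_run:
            report.append(f"would apply: {tweak.name}")
        else:
            tweak.apply()
            report.append(f"applied: {tweak.name}")
    return report
```

Defaulting to a dry run reflects the concern in this thread that nothing should modify system-wide settings unless someone deliberately requests it.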