[ovirt-users] Select As SPM Fails

Pavel Gashev Pax at acronis.com
Thu Jan 19 20:34:53 UTC 2017


The fix in 4.17.35 is backported from oVirt 4.0. You will not hit it again.

Technically, vdsm 4.17.35 has been released as part of RHEV 3.6.9, so it's effectively the recommended version if you run 3.6.
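
For reference, a quick way to check which vdsm build a host is currently running:

# Print the installed vdsm package version; builds older than 4.17.35 on
# CentOS 7.3 hit the "Logical Volume extend failed" issue discussed in this thread.
rpm -q vdsm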

________________________________
From: Beau Sapach <bsapach at ualberta.ca>
Sent: Jan 19, 2017 10:58 PM
To: Michael Watters
Cc: Pavel Gashev; users at ovirt.org
Subject: Re: [ovirt-users] Select As SPM Fails

Hmmm, makes sense, thanks for the info!  I'm not enthusiastic about installing packages from outside the oVirt repos, so I will probably look into an upgrade regardless.  I noticed that oVirt 4 only lists support for RHEL/CentOS 7.2; will a situation like this crop up again eventually as incremental OS updates push the hosts past the supported version?  I've been running oVirt for less than a year, so I'm curious what to expect.

On Thu, Jan 19, 2017 at 10:42 AM, Michael Watters <Michael.Watters at dart.biz> wrote:
You can upgrade vdsm without upgrading to ovirt 4.  I went through the
same issue on our cluster a few weeks ago and the process was pretty
simple.

You'll need to do this on each of your hosts.

# Enable EPEL, then grab the vdsm source at the v4.17.35 tag
yum --enablerepo=extras install -y epel-release git
git clone https://github.com/oVirt/vdsm.git
cd vdsm
git checkout v4.17.35
# Install the build dependencies the project lists, then build the RPMs
yum install -y `cat ./automation/build-artifacts.packages`
./automation/build-artifacts.sh

# Install the freshly built RPMs and restart vdsm
cd /root/rpmbuild/RPMS/noarch
yum --enablerepo=extras install centos-release-qemu-ev
yum localinstall vdsm-4.17.35-1.el7.centos.noarch.rpm vdsm-cli-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-hook-vmfex-dev-4.17.35-1.el7.centos.noarch.rpm vdsm-infra-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-jsonrpc-4.17.35-1.el7.centos.noarch.rpm vdsm-python-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-xmlrpc-4.17.35-1.el7.centos.noarch.rpm vdsm-yajsonrpc-4.17.35-1.el7.centos.noarch.rpm
systemctl restart vdsmd

The qemu-ev repo is needed to avoid dependency errors.
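
Once vdsmd is back up, a quick sanity check (nothing vdsm-specific, just standard rpm/systemd commands) is:

# Confirm the rebuilt 4.17.35 packages are what's actually installed
rpm -qa | grep '^vdsm'
# Make sure the daemon restarted cleanly
systemctl status vdsmd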


On Thu, 2017-01-19 at 09:16 -0700, Beau Sapach wrote:
> Uh oh, looks like an upgrade to version 4 is the only option then....
> unless I'm missing something.
>
> On Thu, Jan 19, 2017 at 1:36 AM, Pavel Gashev <Pax at acronis.com>
> wrote:
> > Beau,
> >
> > Looks like you have upgraded to CentOS 7.3. Now you have to update
> > the vdsm package to 4.17.35.
> >
> >
> > From: <users-bounces at ovirt.org> on behalf of Beau Sapach <bsapach at ualberta.ca>
> > Date: Wednesday 18 January 2017 at 23:56
> > To: "users at ovirt.org" <users at ovirt.org>
> > Subject: [ovirt-users] Select As SPM Fails
> >
> > Hello everyone,
> >
> > I'm about to start digging through the mailing list archives in
> > search of a solution but thought I would post to the list as well.
> > I'm running oVirt 3.6 on a 2 node CentOS7 cluster backed by fiber
> > channel storage and with a separate engine VM running outside of
> > the cluster (NOT hosted-engine).
> >
> > When I try to move the SPM role from one node to the other I get
> > the following in the web interface:
> >
> >
> >
> > When I look into /var/log/ovirt-engine/engine.log I see the
> > following:
> >
> > 2017-01-18 13:35:09,332 ERROR
> > [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
> > (default task-26) [6990cfca] Failed in 'HSMGetAllTasksStatusesVDS' method
> > 2017-01-18 13:35:09,340 ERROR
> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > (default task-26) [6990cfca] Correlation ID: null, Call Stack: null,
> > Custom Event ID: -1, Message: VDSM v6 command failed: Logical Volume extend failed
> >
> > When I look at the task list on the host currently holding the SPM
> > role (in this case 'v6'), using: vdsClient -s 0 getAllTasks, I see
> > a long list like this:
> >
> > dc75d3e7-cea7-449b-9a04-76fd8ef0f82b :
> >          verb = downloadImageFromStream
> >          code = 554
> >          state = recovered
> >          tag = spm
> >          result =
> >          message = Logical Volume extend failed
> >          id = dc75d3e7-cea7-449b-9a04-76fd8ef0f82b
> >
> > When I look at /var/log/vdsm/vdsm.log on the host in question (v6)
> > I see messages like this:
> >
> > '531dd533-22b1-47a0-aae8-76c1dd7d9a56': {'code': 554, 'tag': u'spm',
> > 'state': 'recovered', 'verb': 'downloadImageFromStream', 'result': '',
> > 'message': 'Logical Volume extend failed', 'id':
> > '531dd533-22b1-47a0-aae8-76c1dd7d9a56'}
> >
> > As well as the error from the attempted extend of the logical
> > volume:
> >
> > e980df5f-d068-4c84-8aa7-9ce792690562::ERROR::2017-01-18
> > 13:24:50,710::task::866::Storage.TaskManager.Task::(_setError)
> > Task=`e980df5f-d068-4c84-8aa7-9ce792690562`::Unexpected error
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/task.py", line 873, in _run
> >     return fn(*args, **kargs)
> >   File "/usr/share/vdsm/storage/task.py", line 332, in run
> >     return self.cmd(*self.argslist, **self.argsdict)
> >   File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
> >     return method(self, *args, **kwargs)
> >   File "/usr/share/vdsm/storage/sp.py", line 1776, in downloadImageFromStream
> >     .copyToImage(methodArgs, sdUUID, imgUUID, volUUID)
> >   File "/usr/share/vdsm/storage/image.py", line 1373, in copyToImage
> >     / volume.BLOCK_SIZE)
> >   File "/usr/share/vdsm/storage/blockVolume.py", line 310, in extend
> >     lvm.extendLV(self.sdUUID, self.volUUID, sizemb)
> >   File "/usr/share/vdsm/storage/lvm.py", line 1179, in extendLV
> >     _resizeLV("lvextend", vgName, lvName, size)
> >   File "/usr/share/vdsm/storage/lvm.py", line 1175, in _resizeLV
> >     raise se.LogicalVolumeExtendError(vgName, lvName, "%sM" % (size, ))
> > LogicalVolumeExtendError: Logical Volume extend failed:
> > 'vgname=ae05947f-875c-4507-ad51-62b0d35ef567 lvname=caaef597-eddd-4c24-8df2-a61f35f744f8 newsize=1M'
> > e980df5f-d068-4c84-8aa7-9ce792690562::DEBUG::2017-01-18
> > 13:24:50,711::task::885::Storage.TaskManager.Task::(_run)
> > Task=`e980df5f-d068-4c84-8aa7-9ce792690562`::Task._run: e980df5f-d068-4c84-8aa7-9ce792690562 () {} failed - stopping task
> >
> > The logical volume in question is an OVF_STORE disk that lives on
> > one of the fiber channel backed LUNs.  If I run:
> >
> > vdsClient -s 0 ClearTask TASK-UUID-HERE
> >
> > for each task that appears in the:
> >
> > vdsClient -s 0 getAllTasks
> >
> > output then they disappear and I'm able to move the SPM role to the
> > other host.
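> >
> > (Something like this untested loop should clear them all in one go; it
> > assumes the getAllTasks output format shown above, where only the task
> > UUID header lines end in " :".)
> >
> > for task in $(vdsClient -s 0 getAllTasks | awk '/ :$/ {print $1}'); do
> >     vdsClient -s 0 ClearTask "$task"
> > done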
> >
> > This problem then crops up again on the new host once the SPM role
> > is moved.  What's going on here?  Does anyone have any insight as
> > to how to prevent this task from re-appearing?  Or why it's failing
> > in the first place?
> >
> > Beau
> >
> >
> >
> >
>
>
>
> --
> Beau Sapach
> System Administrator | Information Technology Services | University
> of Alberta Libraries
> Phone: 780.492.4181 | Email: Beau.Sapach at ualberta.ca
>



--
Beau Sapach
System Administrator | Information Technology Services | University of Alberta Libraries
Phone: 780.492.4181 | Email: Beau.Sapach at ualberta.ca


