[ovirt-users] Select As SPM Fails
Michael Watters
Michael.Watters at dart.biz
Thu Jan 19 17:42:36 UTC 2017
You can upgrade vdsm without upgrading to oVirt 4. I went through the
same issue on our cluster a few weeks ago and the process was pretty
simple. You'll need to run the following on each of your hosts.
# Fetch the vdsm source and check out the matching release tag.
yum --enablerepo=extras install -y epel-release git
git clone https://github.com/oVirt/vdsm.git
cd vdsm
git checkout v4.17.35
# Install the build dependencies the project lists, then build the RPMs.
yum install -y `cat ./automation/build-artifacts.packages`
./automation/build-artifacts.sh
# Install the freshly built packages.
cd /root/rpmbuild/RPMS/noarch
yum --enablerepo=extras install centos-release-qemu-ev
yum localinstall vdsm-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-hook-vmfex-dev-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-infra-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-jsonrpc-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-python-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-xmlrpc-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-yajsonrpc-4.17.35-1.el7.centos.noarch.rpm \
    vdsm-cli-4.17.35-1.el7.centos.noarch.rpm
systemctl restart vdsmd
The qemu-ev repo is needed to avoid dependency errors.
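After the restart it's worth confirming the new build actually took;
a quick sanity check on each host:

rpm -q vdsm
systemctl is-active vdsmd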
On Thu, 2017-01-19 at 09:16 -0700, Beau Sapach wrote:
> Uh oh, looks like an upgrade to version 4 is the only option then....
> unless I'm missing something.
>
> On Thu, Jan 19, 2017 at 1:36 AM, Pavel Gashev <Pax at acronis.com>
> wrote:
> > Beau,
> >
> > Looks like you have upgraded to CentOS 7.3. Now you have to update
> > the vdsm package to 4.17.35.
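> >
> > For reference, one quick way to see which vdsm build a host is
> > currently running:
> >
> > rpm -q vdsm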
> >
> >
> > From: <users-bounces at ovirt.org> on behalf of Beau Sapach
> > <bsapach at ualberta.ca>
> > Date: Wednesday 18 January 2017 at 23:56
> > To: "users at ovirt.org" <users at ovirt.org>
> > Subject: [ovirt-users] Select As SPM Fails
> >
> > Hello everyone,
> >
> > I'm about to start digging through the mailing list archives in
> > search of a solution but thought I would post to the list as well.
> > I'm running oVirt 3.6 on a 2-node CentOS 7 cluster backed by Fibre
> > Channel storage, with a separate engine VM running outside of the
> > cluster (NOT hosted-engine).
> >
> > When I try to move the SPM role from one node to the other I get
> > the following in the web interface:
> >
> > [screenshot of the error dialog not included in the archive]
> >
> > When I look into /var/log/ovirt-engine/engine.log I see the
> > following:
> >
> > 2017-01-18 13:35:09,332 ERROR
> > [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
> > (default task-26) [6990cfca] Failed in 'HSMGetAllTasksStatusesVDS' method
> > 2017-01-18 13:35:09,340 ERROR
> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > (default task-26) [6990cfca] Correlation ID: null, Call Stack: null,
> > Custom Event ID: -1, Message: VDSM v6 command failed: Logical Volume
> > extend failed
> >
> > When I look at the task list on the host currently holding the SPM
> > role (in this case 'v6'), using: vdsClient -s 0 getAllTasks, I see
> > a long list like this:
> >
> > dc75d3e7-cea7-449b-9a04-76fd8ef0f82b :
> >     verb = downloadImageFromStream
> >     code = 554
> >     state = recovered
> >     tag = spm
> >     result =
> >     message = Logical Volume extend failed
> >     id = dc75d3e7-cea7-449b-9a04-76fd8ef0f82b
> >
> > When I look at /var/log/vdsm/vdsm.log on the host in question (v6)
> > I see messages like this:
> >
> > '531dd533-22b1-47a0-aae8-76c1dd7d9a56': {'code': 554, 'tag':
> > u'spm', 'state': 'recovered', 'verb': 'downloadImageFromStream',
> > 'result': '', 'message': 'Logical Volume extend failed', 'id':
> > '531dd533-22b1-47a0-aae8-76c1dd7d9a56'}
> >
> > As well as the error from the attempted extend of the logical
> > volume:
> >
> > e980df5f-d068-4c84-8aa7-9ce792690562::ERROR::2017-01-18 13:24:50,710::task::866::Storage.TaskManager.Task::(_setError) Task=`e980df5f-d068-4c84-8aa7-9ce792690562`::Unexpected error
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/task.py", line 873, in _run
> >     return fn(*args, **kargs)
> >   File "/usr/share/vdsm/storage/task.py", line 332, in run
> >     return self.cmd(*self.argslist, **self.argsdict)
> >   File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
> >     return method(self, *args, **kwargs)
> >   File "/usr/share/vdsm/storage/sp.py", line 1776, in downloadImageFromStream
> >     .copyToImage(methodArgs, sdUUID, imgUUID, volUUID)
> >   File "/usr/share/vdsm/storage/image.py", line 1373, in copyToImage
> >     / volume.BLOCK_SIZE)
> >   File "/usr/share/vdsm/storage/blockVolume.py", line 310, in extend
> >     lvm.extendLV(self.sdUUID, self.volUUID, sizemb)
> >   File "/usr/share/vdsm/storage/lvm.py", line 1179, in extendLV
> >     _resizeLV("lvextend", vgName, lvName, size)
> >   File "/usr/share/vdsm/storage/lvm.py", line 1175, in _resizeLV
> >     raise se.LogicalVolumeExtendError(vgName, lvName, "%sM" % (size, ))
> > LogicalVolumeExtendError: Logical Volume extend failed: 'vgname=ae05947f-875c-4507-ad51-62b0d35ef567 lvname=caaef597-eddd-4c24-8df2-a61f35f744f8 newsize=1M'
> > e980df5f-d068-4c84-8aa7-9ce792690562::DEBUG::2017-01-18 13:24:50,711::task::885::Storage.TaskManager.Task::(_run) Task=`e980df5f-d068-4c84-8aa7-9ce792690562`::Task._run: e980df5f-d068-4c84-8aa7-9ce792690562 () {} failed - stopping task
> >
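> > The extend request in that last error is newsize=1M, which is very
> > likely smaller than the LV already is, so lvextend refuses. One way
> > to compare, as a sketch (run on the SPM host; the VG and LV UUIDs
> > are the ones from the log above):
> >
> > lvs -o vg_name,lv_name,lv_size \
> >     ae05947f-875c-4507-ad51-62b0d35ef567/caaef597-eddd-4c24-8df2-a61f35f744f8
> >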
> > The logical volume in question is an OVF_STORE disk that lives on
> > one of the Fibre Channel-backed LUNs. If I run:
> >
> > vdsClient -s 0 ClearTask TASK-UUID-HERE
> >
> > for each task that appears in the:
> >
> > vdsClient -s 0 getAllTasks
> >
> > output then they disappear and I'm able to move the SPM role to the
> > other host.
> >
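> > To clear the whole backlog in one pass, a small loop along these
> > lines should also work (a sketch; it assumes each task ID appears
> > at the start of its getAllTasks entry, as in the listing above):
> >
> > for t in $(vdsClient -s 0 getAllTasks | grep -oE '^[0-9a-f-]{36}'); do
> >     vdsClient -s 0 ClearTask "$t"
> > done
> >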
> > This problem then crops up again on the new host once the SPM role
> > is moved. What's going on here? Does anyone have any insight as
> > to how to prevent this task from re-appearing? Or why it's failing
> > in the first place?
> >
> > Beau
> >
>
> --
> Beau Sapach
> System Administrator | Information Technology Services | University
> of Alberta Libraries
> Phone: 780.492.4181 | Email: Beau.Sapach at ualberta.ca
>