[Users] snapshot taken while live VM operates cause unresponsive hypervisor behavior
Alexandru Vladulescu
avladulescu at bfproject.ro
Fri Feb 8 09:36:12 UTC 2013
Hi,
Thanks, Kevin, for your feedback on this. I am attaching the relevant
logs in case the developers want to dig further into this for any new
evidence or related cases.
So here are the RPM versions installed:
[root@hyper02 ~]# rpm -qa | grep vdsm
vdsm-cli-4.10.0-0.44.14.el6.noarch
vdsm-4.10.0-0.44.14.el6.x86_64
vdsm-gluster-4.10.0-0.44.14.el6.noarch
vdsm-xmlrpc-4.10.0-0.44.14.el6.noarch
vdsm-bootstrap-4.10.0-0.44.14.el6.noarch
vdsm-python-4.10.0-0.44.14.el6.x86_64
[root@hyper02 ~]# rpm -qa | grep qemu
qemu-img-rhev-0.12.1.2-2.295.el6.8.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.295.el6.8.x86_64
gpxe-roms-qemu-0.9.7-6.9.el6.noarch
qemu-kvm-rhev-0.12.1.2-2.295.el6.8.x86_64
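
A note on the Fault in the quoted traceback below: the message comes from
Python's XML-RPC marshaller, which refuses to encode None unless allow_none
is set. The "Fault 1" wrapper around a TypeError suggests the failure happened
on the server side while the reply was being built, not in vdsClient itself.
The following is only a minimal sketch of that behaviour, assuming Python 3's
xmlrpc.client (the host above runs Python 2.6's xmlrpclib, which raises the
same message). The response dict and its field names are hypothetical and are
not taken from VDSM.

```python
import xmlrpc.client


def try_marshal(value, allow_none=False):
    """Return the XML-RPC response payload for value, or the error text."""
    try:
        # methodresponse=True builds a <methodResponse> document, which is
        # what a server sends back; it requires a one-element tuple.
        return xmlrpc.client.dumps(
            (value,), methodresponse=True, allow_none=allow_none
        )
    except TypeError as exc:
        return "TypeError: %s" % exc


# Hypothetical reply with one unset field (names are illustrative only).
response = {"vmId": "hypothetical-id", "pauseCode": None}

# Default marshaller: the None value makes the whole reply unencodable.
print(try_marshal(response))

# With allow_none=True the None is sent as the <nil/> extension element.
print(try_marshal(response, allow_none=True))
```

If this is what is happening here, a single None-valued field in any VM's
stats would be enough to make every "list" call fail, which would match the
symptom of all vdsClient commands breaking while the VMs keep running.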
Regards,
Alex.
On 02/08/2013 10:44 AM, Kevin Maziere Aubry wrote:
> Hi
>
> I had the same problem with oVirt 3.1 on F17, and it is resolved with
> F18 + oVirt 3.2.
>
> Kevin
>
>
> 2013/2/7 Dafna Ron <dron at redhat.com>
>
> also full engine, vdsm and libvirtd logs
>
>
> On 02/07/2013 05:05 PM, Shu Ming wrote:
> > The libvirt and qemu versions on the VDSM host may help with
> > debugging.
> > Alexandru Vladulescu:
> >>
> >> Hi,
> >>
> >>
> >> Using 3.1.0-3.19 from dreyou's repo on CentOS 6.3, running
> >> multiple VMs on the hypervisor node and attempting to take a
> >> snapshot of one of the running VMs generates errors from the VDSM
> >> daemon, as shown below:
> >>
> >> [root@hyper02 ~]# vdsClient -s 0 list
> >> Traceback (most recent call last):
> >>   File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
> >>     code, message = commands[command][0](commandArgs)
> >>   File "/usr/share/vdsm/vdsClient.py", line 280, in do_list
> >>     response = self.s.list(True, vms)
> >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
> >>     return self.__send(self.__name, args)
> >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
> >>     verbose=self.__verbose
> >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
> >>     return self._parse_response(h.getfile(), sock)
> >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
> >>     return u.close()
> >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
> >>     raise Fault(**self._stack[0])
> >> Fault: <Fault 1: "<type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled">
> >>
> >>
> >> Paste from ovirt admin GUI log:
> >>
> >> 2013-Feb-07, 15:29:59 VM ipa01 is down. Exit message: User shut down
> >> 2013-Feb-07, 15:29:34 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
> >> 2013-Feb-07, 15:29:32 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
> >> 2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
> >> 2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02). Trying to migrate to another Host.
> >> 2013-Feb-07, 15:29:08 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
> >> 2013-Feb-07, 15:24:58 Detected new Host Hyper02. Host state was set to Up.
> >> 2013-Feb-07, 15:24:53 Host Hyper02 is initializing. Message: Recovering from crash or Initializing
> >> 2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
> >> 2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
> >> 2013-Feb-07, 15:20:43 Snapshot [initial installation][after ipa server installation] creation for VM ipa01 has been completed.
> >> 2013-Feb-07, 15:20:43 VM ipa01 has paused due to unknown storage error.
> >>
> >> Looking at htop, I could see that the VMs were still running and
> >> responsive, but every vdsClient command failed with the traceback
> >> pasted above. The workaround was to connect to each VM, shut it
> >> down, restart vdsmd and power the VMs up again. There is no problem
> >> with the storage size, as it has approx. 800 GB of free space and is
> >> accessed over NFS. All NFS mount points were working fine at that
> >> time.
> >>
> >> After the snapshot was taken, the VM went into paused mode and I had
> >> to un-pause it from the GUI, but the VM was still responsive over the
> >> network and operating normally.
> >>
> >> Has anybody seen something like this so far, or is it just
> >> related to the 3.1 version? I searched the mailing list for similar
> >> topics and could not find anything related to this error. I will
> >> happily provide logs if needed.
> >>
> >>
> >> Regards,
> >> Alex
> >>
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at ovirt.org <mailto:Users at ovirt.org>
> >> http://lists.ovirt.org/mailman/listinfo/users
> >
> >
> > --
> > ---
> > Shu Ming
> > Open Virtualization Engineering; CSTL, IBM Corp.
> > Tel: 86-10-82451626 Tieline: 9051626 E-mail: shuming at cn.ibm.com or shuming at linux.vnet.ibm.com
> > Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193, PRC
> >
> >
>
>
> --
> Dafna Ron
>
>
>
>
> --
>
> Kevin Mazière
> Responsable Infrastructure
> Alter Way -- Hosting
> 1 rue Royal - 227 Bureaux de la Colline
> 92213 Saint-Cloud Cedex
> Tél : +33 (0)1 41 16 38 41
> Mob : +33 (0)7 62 55 57 05
> http://www.alterway.fr
>
>
--
Alexandru Vladulescu
Platform System Engineer
---------------------------------------------------------------------------------
Bright Future Project Romania
Web url : www.bfproject.ro
Skype : avladulescu
Mobile : +4(0)726.373.098
---------------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: snapshot-issue.tgz
Type: application/x-compressed-tar
Size: 46309 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20130208/0ae52dc1/attachment-0001.bin>