[Users] snapshot taken while live VM operates cause unresponsive hypervisor behavior

Alexandru Vladulescu avladulescu at bfproject.ro
Fri Feb 8 09:36:12 UTC 2013


Hi,

Thanks, Kevin, for your feedback on this. I will try to attach the
relevant logs in case the developers want to dig further into it or
look for other related cases.


So here are the rpm versions installed:

[root@hyper02 ~]# rpm -qa | grep vdsm
vdsm-cli-4.10.0-0.44.14.el6.noarch
vdsm-4.10.0-0.44.14.el6.x86_64
vdsm-gluster-4.10.0-0.44.14.el6.noarch
vdsm-xmlrpc-4.10.0-0.44.14.el6.noarch
vdsm-bootstrap-4.10.0-0.44.14.el6.noarch
vdsm-python-4.10.0-0.44.14.el6.x86_64
[root@hyper02 ~]# rpm -qa | grep qemu
qemu-img-rhev-0.12.1.2-2.295.el6.8.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.295.el6.8.x86_64
gpxe-roms-qemu-0.9.7-6.9.el6.noarch
qemu-kvm-rhev-0.12.1.2-2.295.el6.8.x86_64
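
A side note on the traceback quoted below, in case it helps with the
debugging: "cannot marshal None unless allow_none is enabled" is the
error Python's XML-RPC marshaller raises when a value being serialized
contains None. One possible reading (not confirmed by the logs) is that
vdsm hit a None somewhere in the reply it tried to send. The sketch
below reproduces only that marshalling behavior. It uses Python 3's
xmlrpc.client (the renamed xmlrpclib), and the response dict and its
field name are made up for illustration, not taken from vdsm.

```python
import xmlrpc.client

# Hypothetical reply containing a None value. The field name is invented
# for this example and is not an actual vdsm field.
response = ({"vmId": "ipa01", "hypothetical_field": None},)


def try_marshal(value, allow_none):
    """Return the XML-RPC payload, or the error text if marshalling fails."""
    try:
        return xmlrpc.client.dumps(
            value, methodresponse=True, allow_none=allow_none
        )
    except TypeError as err:
        return "TypeError: %s" % err


# Default behavior: None cannot be serialized.
strict = try_marshal(response, allow_none=False)

# With allow_none enabled, None is encoded as the <nil/> extension.
relaxed = try_marshal(response, allow_none=True)

print(strict)
print("<nil/>" in relaxed)
```

Whether the right fix is enabling allow_none or making sure the value is
never None in the first place is a design decision for the vdsm side;
the sketch only shows where the message comes from.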

Regards,
Alex.


On 02/08/2013 10:44 AM, Kevin Maziere Aubry wrote:
> Hi
>
> I had the same problem on oVirt 3.1 on F17, and it is resolved on
> F18 + oVirt 3.2.
>
> Kevin
>
>
> 2013/2/7 Dafna Ron <dron at redhat.com>
>
>     also full engine, vdsm and libvirtd logs
>
>
>     On 02/07/2013 05:05 PM, Shu Ming wrote:
>     > The libvirt and qemu versions on the VDSM host may help with
>     > debugging.
>     > Alexandru Vladulescu:
>     >>
>     >> Hi,
>     >>
>     >>
>     >> Using 3.1.0-3.19 from dreyou's repo on CentOS 6.3, with multiple
>     >> VMs running on the hypervisor node, attempting to take a snapshot
>     >> of one of the running VMs generates errors from the VDSM daemon,
>     >> as shown below:
>     >>
>     >> [root@hyper02 ~]# vdsClient -s 0 list
>     >> Traceback (most recent call last):
>     >>   File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
>     >>     code, message = commands[command][0](commandArgs)
>     >>   File "/usr/share/vdsm/vdsClient.py", line 280, in do_list
>     >>     response = self.s.list(True, vms)
>     >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
>     >>     return self.__send(self.__name, args)
>     >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
>     >>     verbose=self.__verbose
>     >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
>     >>     return self._parse_response(h.getfile(), sock)
>     >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
>     >>     return u.close()
>     >>   File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
>     >>     raise Fault(**self._stack[0])
>     >> Fault: <Fault 1: "<type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled">
>     >>
>     >>
>     >> Paste from ovirt admin GUI log:
>     >>
>     >> 2013-Feb-07, 15:29:59 VM ipa01 is down. Exit message: User shut down
>     >> 2013-Feb-07, 15:29:34 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
>     >> 2013-Feb-07, 15:29:32 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
>     >> 2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
>     >> 2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02). Trying to migrate to another Host.
>     >> 2013-Feb-07, 15:29:08 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
>     >> 2013-Feb-07, 15:24:58 Detected new Host Hyper02. Host state was set to Up.
>     >> 2013-Feb-07, 15:24:53 Host Hyper02 is initializing. Message: Recovering from crash or Initializing
>     >> 2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
>     >> 2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
>     >> 2013-Feb-07, 15:20:43 Snapshot [initial installation][after ipa server installation] creation for VM ipa01 has been completed.
>     >> 2013-Feb-07, 15:20:43 VM ipa01 has paused due to unknown storage error.
>     >>
>     >> Looking at htop, I could see that the VMs were still running and
>     >> responsive, but every vdsClient command failed with the traceback
>     >> pasted above. The workaround was to connect to each VM, shut it
>     >> down, restart vdsmd, and power the VMs up again. Storage space is
>     >> not the problem: the storage has approx. 800 GB free and is
>     >> accessed over NFS. All NFS mount points were working at that time.
>     >>
>     >> After the snapshot was taken, the VM went into paused mode and I
>     >> had to un-pause it from the GUI, although the VM was still
>     >> responsive over the network and kept operating.
>     >>
>     >> Has anybody seen something like this before, or is it specific
>     >> to the 3.1 version? I searched the mailing list for similar
>     >> topics and could not find anything related to this error. I would
>     >> happily provide logs if needed.
>     >>
>     >>
>     >> Regards,
>     >> Alex
>     >>
>     >>
>     >> _______________________________________________
>     >> Users mailing list
>     >> Users at ovirt.org
>     >> http://lists.ovirt.org/mailman/listinfo/users
>     >
>     >
>     > --
>     > ---
>     > Shu Ming
>     > Open Virtualization Engineering; CSTL, IBM Corp.
>     > Tel: 86-10-82451626  Tieline: 9051626
>     > E-mail: shuming at cn.ibm.com or shuming at linux.vnet.ibm.com
>     > Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193, PRC
>     >
>     >
>
>
>     --
>     Dafna Ron
>
>
>
>
> -- 
>
> Kevin Mazière
> Responsable Infrastructure
> Alter Way -- Hosting
> 1 rue Royal - 227 Bureaux de la Colline
> 92213 Saint-Cloud Cedex
> Tél : +33 (0)1 41 16 38 41
> Mob : +33 (0)7 62 55 57 05
> http://www.alterway.fr
>
>


-- 
Alexandru Vladulescu
Platform System Engineer
---------------------------------------------------------------------------------
Bright Future Project Romania
Web url : www.bfproject.ro
Skype :   avladulescu
Mobile :  +4(0)726.373.098
---------------------------------------------------------------------------------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: snapshot-issue.tgz
Type: application/x-compressed-tar
Size: 46309 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20130208/0ae52dc1/attachment-0001.bin>

