Hi,

Thanks, Kevin, for your feedback on this. I will try to attach the relevant logs in case anyone wants to dig further into it or probe any new cases.


So here are the rpm versions installed:

[root@hyper02 ~]# rpm -qa | grep vdsm
vdsm-cli-4.10.0-0.44.14.el6.noarch
vdsm-4.10.0-0.44.14.el6.x86_64
vdsm-gluster-4.10.0-0.44.14.el6.noarch
vdsm-xmlrpc-4.10.0-0.44.14.el6.noarch
vdsm-bootstrap-4.10.0-0.44.14.el6.noarch
vdsm-python-4.10.0-0.44.14.el6.x86_64
[root@hyper02 ~]# rpm -qa | grep qemu
qemu-img-rhev-0.12.1.2-2.295.el6.8.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.295.el6.8.x86_64
gpxe-roms-qemu-0.9.7-6.9.el6.noarch
qemu-kvm-rhev-0.12.1.2-2.295.el6.8.x86_64
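

For what it's worth, the 'cannot marshal None unless allow_none is enabled' Fault quoted further down is raised by xmlrpclib's marshaller whenever a response contains a None value, i.e. vdsm returned None somewhere in its list() reply. A minimal sketch with Python 3's xmlrpc.client (the successor to Python 2's xmlrpclib; the marshalling behaviour is the same) reproduces both the failure and the allow_none escape hatch:

```python
import xmlrpc.client  # xmlrpclib in Python 2

# Marshalling a None value without allow_none raises the TypeError
# that vdsClient surfaces wrapped in an xmlrpclib Fault:
try:
    xmlrpc.client.dumps((None,), methodresponse=True)
except TypeError as err:
    print(err)  # cannot marshal None unless allow_none is enabled

# With allow_none=True, None is encoded as the <nil/> extension instead:
body = xmlrpc.client.dumps((None,), methodresponse=True, allow_none=True)
print("<nil/>" in body)  # True
```

So the client-side traceback is really a symptom of the server handing back a None it should not have.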


Regards,
Alex.


On 02/08/2013 10:44 AM, Kevin Maziere Aubry wrote:
Hi

I had the same problem with oVirt 3.1 on F17, and it is resolved on F18 + oVirt 3.2.

Kevin


2013/2/7 Dafna Ron <dron@redhat.com>
also full engine, vdsm and libvirtd logs


On 02/07/2013 05:05 PM, Shu Ming wrote:
> The libvirt and qemu versions on the VDSM host may help the debugging.
> Alexandru Vladulescu:
>>
>> Hi,
>>
>>
>> Using 3.1.0-3.19 from dreyou's repo on CentOS 6.3, running
>> multiple VMs on the hypervisor node, attempting to take a snapshot
>> of one of the running VMs generates errors from the VDSM daemon, as
>> shown below:
>>
>> [root@hyper02 ~]# vdsClient -s 0 list
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
>>     code, message = commands[command][0](commandArgs)
>>   File "/usr/share/vdsm/vdsClient.py", line 280, in do_list
>>     response = self.s.list(True, vms)
>>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
>>     return self.__send(self.__name, args)
>>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
>>     verbose=self.__verbose
>>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
>>     return self._parse_response(h.getfile(), sock)
>>   File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
>>     return u.close()
>>   File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
>>     raise Fault(**self._stack[0])
>> Fault: <Fault 1: "<type 'exceptions.TypeError'>:cannot marshal None
>> unless allow_none is enabled">
>>
>>
>> Paste from the oVirt admin GUI log:
>>
>> 2013-Feb-07, 15:29:59 VM ipa01 is down. Exit message: User shut down
>> 2013-Feb-07, 15:29:34 Migration failed due to Error: Fatal error
>> during migration (VM: ipa01, Source Host: Hyper02).
>> 2013-Feb-07, 15:29:32 Starting migration of VM ipa01 from Host
>> Hyper02 to Host Hyper01 (User: admin@internal.).
>> 2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error
>> during migration (VM: ipa01, Source Host: Hyper02).
>> 2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error
>> during migration (VM: ipa01, Source Host: Hyper02). Trying to migrate
>> to another Host.
>> 2013-Feb-07, 15:29:08 Starting migration of VM ipa01 from Host
>> Hyper02 to Host Hyper01 (User: admin@internal.).
>> 2013-Feb-07, 15:24:58 Detected new Host Hyper02. Host state was set
>> to Up.
>> 2013-Feb-07, 15:24:53 Host Hyper02 is initializing. Message:
>> Recovering from crash or Initializing
>> 2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host:
>> Hyper02).
>> 2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host:
>> Hyper02).
>> 2013-Feb-07, 15:20:43 Snapshot [initial installation][after ipa
>> server installation] creation for VM ipa01 has been completed.
>> 2013-Feb-07, 15:20:43 VM ipa01 has paused due to unknown storage
>> error.
>>
>> Looking through htop, I could see that the VMs were still running and
>> responsive, but any vdsClient command failed with the traceback pasted
>> above. The solution was to connect to each VM, shut it down, restart
>> vdsmd, and power the VM up again. There is no problem with storage
>> capacity, as it has approx. 800 GB of free space and runs over NFS;
>> all NFS mount points were working fine at that time.
>>
>> After being snapshotted, the VM went into pause mode and I had to
>> un-pause it from the GUI, though the VM was still responsive over the
>> network and operational.
>>
>> Has anybody seen anything like this, or is it just related to the 3.1
>> version? I should mention that I searched similar mailing list topics
>> and could not find anything related to this error. I will happily
>> provide logs if needed.
>>
>>
>> Regards,
>> Alex
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>
>
> --
> ---
> 舒明 Shu Ming
> Open Virtualization Engineering; CSTL, IBM Corp.
> Tel: 86-10-82451626  Tieline: 9051626 E-mail: shuming@cn.ibm.com or shuming@linux.vnet.ibm.com
> Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193, PRC
>
>


--
Dafna Ron



--

Kevin Mazière
Responsable Infrastructure
Alter Way – Hosting
1 rue Royal - 227 Bureaux de la Colline
92213 Saint-Cloud Cedex
Tél : +33 (0)1 41 16 38 41
Mob : +33 (0)7 62 55 57 05
http://www.alterway.fr




-- 
Alexandru Vladulescu
Platform System Engineer
---------------------------------------------------------------------------------
Bright Future Project Romania
Web url : www.bfproject.ro
Skype :   avladulescu
Mobile :  +4(0)726.373.098
---------------------------------------------------------------------------------