[Users] snapshot taken while live VM operates cause unresponsive hypervisor behavior

Alexandru Vladulescu avladulescu at bfproject.ro
Thu Feb 7 13:45:28 UTC 2013


Hi,


I am running version 3.1.0-3.19 from dreyou's repo on CentOS 6.3. With 
multiple VMs running on the hypervisor node, taking a snapshot of one 
of the running VMs makes the VDSM daemon return errors, as shown below:
[root@hyper02 ~]# vdsClient -s 0 list
Traceback (most recent call last):
  File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
    code, message = commands[command][0](commandArgs)
  File "/usr/share/vdsm/vdsClient.py", line 280, in do_list
    response = self.s.list(True, vms)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
Fault: <Fault 1: "<type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled">
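
For context on the final Fault line: this is the message Python's 
XML-RPC marshaller raises when a value to be sent contains None and the 
allow_none option is off. The sketch below reproduces the same message 
using only the standard library. It uses Python 3's xmlrpc.client (the 
successor of the xmlrpclib module seen in the traceback); the helper 
try_marshal and the sample dictionary are hypothetical and are not taken 
from VDSM. It does not explain why a None value ended up in the VDSM 
response in the first place.

```python
import xmlrpc.client


def try_marshal(value, allow_none=False):
    """Marshal `value` as an XML-RPC response.

    Returns the XML string on success, or a string starting with
    "TypeError:" if the marshaller rejects the value.
    """
    try:
        return xmlrpc.client.dumps(
            (value,), methodresponse=True, allow_none=allow_none
        )
    except TypeError as exc:
        return "TypeError: %s" % exc


if __name__ == "__main__":
    # Hypothetical VM status entry with one field left as None.
    sample = {"vmId": "example", "pid": None}

    # Default settings: the None field makes marshalling fail.
    print(try_marshal(sample))

    # With allow_none=True the None is encoded as a <nil/> element.
    print(try_marshal(sample, allow_none=True))
```

The first call fails with the same "cannot marshal None" text as in the 
Fault above; the second succeeds because None is sent as <nil/>.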


Paste from the oVirt admin GUI log (newest entries first):

2013-Feb-07, 15:29:59 VM ipa01 is down. Exit message: User shut down
2013-Feb-07, 15:29:34 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
2013-Feb-07, 15:29:32 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02). Trying to migrate to another Host.
2013-Feb-07, 15:29:08 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
2013-Feb-07, 15:24:58 Detected new Host Hyper02. Host state was set to Up.
2013-Feb-07, 15:24:53 Host Hyper02 is initializing. Message: Recovering from crash or Initializing
2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
2013-Feb-07, 15:20:43 Snapshot [initial installation][after ipa server installation] creation for VM ipa01 has been completed.
2013-Feb-07, 15:20:43 VM ipa01 has paused due to unknown storage error.

Looking at htop, I could see that the VMs were still running and 
responsive, but every vdsClient command failed with the traceback 
pasted above. The workaround was to connect to each VM, shut it down, 
restart vdsmd and power the VMs up again. Storage space is not the 
problem: the storage is served over NFS and has approx. 800 GB free. All 
NFS mount points were working fine at that time.

After the snapshot was taken, the VM went into the paused state and I 
had to un-pause it from the GUI, although the VM was still responsive 
over the network and kept operating.

Has anybody seen something like this before, or is it specific to 
version 3.1? I searched the mailing list for similar topics and could 
not find anything related to this error. I would happily provide logs 
if needed.


Regards,
Alex