[Users] snapshot taken while live VM operates cause unresponsive hypervisor behavior
Alexandru Vladulescu
avladulescu at bfproject.ro
Thu Feb 7 08:45:28 EST 2013
Hi,
I am using version 3.1.0-3.19 from dreyou's repo on CentOS 6.3, with
multiple VMs running on the hypervisor node. Attempting to take a snapshot
of one of the running VMs generates errors from the VDSM daemon, as shown below:
[root@hyper02 ~]# vdsClient -s 0 list
Traceback (most recent call last):
  File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
    code, message = commands[command][0](commandArgs)
  File "/usr/share/vdsm/vdsClient.py", line 280, in do_list
    response = self.s.list(True, vms)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1253, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1392, in _parse_response
    return u.close()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 838, in close
    raise Fault(**self._stack[0])
Fault: <Fault 1: "<type 'exceptions.TypeError'>:cannot marshal None unless allow_none is enabled">
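For reference, the Fault text is the message Python's XML-RPC marshaller produces when it is asked to serialize a None value while allow_none is disabled, which suggests the server side hit a None somewhere in the response it was building. A minimal sketch that reproduces the message with the standard library follows. It uses Python 3's xmlrpc.client (the traceback above comes from Python 2.6's xmlrpclib), and the payload and its field names are made up for illustration; they are not what VDSM actually returns.

```python
import xmlrpc.client

# Hypothetical response payload; the field names are illustrative only
# and are NOT taken from VDSM.
payload = {"vmId": "example", "status": None}


def try_marshal(value, allow_none):
    """Return the XML-RPC encoding of value, or the marshalling error text."""
    try:
        # methodresponse=True expects a one-element tuple of return values.
        return xmlrpc.client.dumps(
            (value,), methodresponse=True, allow_none=allow_none
        )
    except TypeError as exc:
        return "TypeError: %s" % exc


strict = try_marshal(payload, allow_none=False)
relaxed = try_marshal(payload, allow_none=True)

print(strict)               # the marshalling error text
print("<nil/>" in relaxed)  # with allow_none=True, None is sent as <nil/>
```

This only illustrates where the error string comes from; which VDSM field is None after the snapshot would have to be read from the vdsm logs.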
Paste from the oVirt admin GUI log (newest entries first):
2013-Feb-07, 15:29:59 VM ipa01 is down. Exit message: User shut down
2013-Feb-07, 15:29:34 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
2013-Feb-07, 15:29:32 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02).
2013-Feb-07, 15:29:10 Migration failed due to Error: Fatal error during migration (VM: ipa01, Source Host: Hyper02). Trying to migrate to another Host.
2013-Feb-07, 15:29:08 Starting migration of VM ipa01 from Host Hyper02 to Host Hyper01 (User: admin@internal.).
2013-Feb-07, 15:24:58 Detected new Host Hyper02. Host state was set to Up.
2013-Feb-07, 15:24:53 Host Hyper02 is initializing. Message: Recovering from crash or Initializing
2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
2013-Feb-07, 15:22:18 VM ipa01 was resumed by admin@internal (Host: Hyper02).
2013-Feb-07, 15:20:43 Snapshot [initial installation][after ipa server installation] creation for VM ipa01 has been completed.
2013-Feb-07, 15:20:43 VM ipa01 has paused due to unknown storage error.
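Since the GUI lists events newest first, the sequence is easier to follow when reordered. Here is a small sketch that sorts a few of the entries above chronologically and prints the seconds elapsed since the VM paused. The timestamp format string is my assumption based on how the entries look, and the helper name parse_events is hypothetical.

```python
from datetime import datetime

# Entries copied from the GUI log above (wrapped lines joined), newest first.
GUI_LOG = """\
2013-Feb-07, 15:29:59 VM ipa01 is down. Exit message: User shut down
2013-Feb-07, 15:24:58 Detected new Host Hyper02. Host state was set to Up.
2013-Feb-07, 15:24:53 Host Hyper02 is initializing. Message: Recovering from crash or Initializing
2013-Feb-07, 15:20:43 VM ipa01 has paused due to unknown storage error.
"""


def parse_events(text):
    """Return (timestamp, message) pairs sorted oldest first."""
    events = []
    for line in text.splitlines():
        # Each entry looks like: "<date>, <time> <message>".
        date_part, time_part, message = line.split(" ", 2)
        # %b assumes English month abbreviations (the default C locale).
        stamp = datetime.strptime(
            date_part + " " + time_part, "%Y-%b-%d, %H:%M:%S"
        )
        events.append((stamp, message))
    return sorted(events, key=lambda event: event[0])


events = parse_events(GUI_LOG)
start = events[0][0]
for stamp, message in events:
    elapsed = int((stamp - start).total_seconds())
    print("+%4ds  %s" % (elapsed, message))
```

On these entries, the host re-initialization shows up a little over four minutes after the VM paused.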
In htop I could see that the VMs were still running and responsive, but
every vdsClient command failed with the traceback pasted above. The
workaround was to connect to each VM, shut it down, restart vdsmd, and
power the VMs up again. Storage capacity is not the problem: the storage
is on NFS with approximately 800 GB of free space, and all NFS mount
points were working at that time.
After the snapshot was taken, the VM went into a paused state and I had
to un-pause it from the GUI, even though the VM was still responsive over
the network and kept operating.
Has anybody seen something like this before, or is it specific to
version 3.1? I searched the mailing list for similar topics and could
not find anything related to this error. I will happily provide logs
if needed.
Regards,
Alex