VDSM - sampling.py - remove() called without previous add()

This is a multi-part message in MIME format. --------------060105040505080104010504 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Hi, In sampling.py, remove() is being called without calling add() before, which throws: JsonRpc (StompReactor)::DEBUG::2015-04-28 17:35:55,061::stompReactor::94::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command=u'SEND'> Thread-37401::DEBUG::2015-04-28 17:35:55,062::__init__::445::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.destroy' in bridge with {u'vmID': u'6ec9c0a0-2879-4bfe-9a79-92471881ebfe'} JsonRpcServer::DEBUG::2015-04-28 17:35:55,062::__init__::482::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request Thread-37401::INFO::2015-04-28 17:35:55,062::API::334::vds::(destroy) vmContainerLock acquired by vm 6ec9c0a0-2879-4bfe-9a79-92471881ebfe Thread-37401::DEBUG::2015-04-28 17:35:55,062::vm::3513::vm.Vm::(destroy) vmId=`6ec9c0a0-2879-4bfe-9a79-92471881ebfe`::destroy Called Thread-37401::INFO::2015-04-28 17:35:55,062::vm::3444::vm.Vm::(releaseVm) vmId=`6ec9c0a0-2879-4bfe-9a79-92471881ebfe`::Release VM resources Thread-37401::WARNING::2015-04-28 17:35:55,062::vm::375::vm.Vm::(_set_lastStatus) vmId=`6ec9c0a0-2879-4bfe-9a79-92471881ebfe`::trying to set state to Powering down when already Down Thread-37401::ERROR::2015-04-28 17:35:55,063::__init__::469::jsonrpc.JsonRpcServer::(_serveRequest) Internal server error Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 464, in _serveRequest res = method(**params) File "/usr/share/vdsm/rpc/Bridge.py", line 273, in _dynamicMethod result = fn(*methodArgs) File "/usr/share/vdsm/API.py", line 339, in destroy res = v.destroy() File "/usr/share/vdsm/virt/vm.py", line 3515, in destroy result = self.doDestroy() File "/usr/share/vdsm/virt/vm.py", line 3533, in doDestroy return self.releaseVm() File "/usr/share/vdsm/virt/vm.py", line 3448, in releaseVm sampling.stats_cache.remove(self.id) File "/usr/share/vdsm/virt/sampling.py", line 428, in remove if vmid in self._vm_last_timestamp.keys(): KeyError: u'6ec9c0a0-2879-4bfe-9a79-92471881ebfe' Thread-37401::DEBUG::2015-04-28 17:35:55,063::stompReactor::158::yajsonrpc.StompServer::(send) Sending response In file '/usr/share/vdsm/virt/sampling.py': def add(self, vmid): """ Warm up the cache for the given VM. This is to avoid races during the first sampling and the first reporting, which may result in a VM wrongly reported as unresponsive. """ with self._lock: self._vm_last_timestamp[vmid] = self._clock() def remove(self, vmid): """ Remove any data from the cache related to the given VM. """ with self._lock: * if vmid in self._vm_last_timestamp.keys():**<----------------- I patched here as a workarround ** del self._vm_last_timestamp[vmid]* --------------060105040505080104010504 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit <html> <head> <meta http-equiv="content-type" content="text/html; charset=utf-8"> </head> <body bgcolor="#FFFFFF" text="#000000"> Hi,<br> <br> In sampling.py, remove() is being called without calling add() before, which throws:<br> <blockquote>JsonRpc (StompReactor)::DEBUG::2015-04-28 17:35:55,061::stompReactor::94::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command=u'SEND'><br> Thread-37401::DEBUG::2015-04-28 17:35:55,062::__init__::445::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'VM.destroy' in bridge with {u'vmID': u'6ec9c0a0-2879-4bfe-9a79-92471881ebfe'}<br> JsonRpcServer::DEBUG::2015-04-28 17:35:55,062::__init__::482::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request<br> Thread-37401::INFO::2015-04-28 17:35:55,062::API::334::vds::(destroy) vmContainerLock acquired by vm 6ec9c0a0-2879-4bfe-9a79-92471881ebfe<br> Thread-37401::DEBUG::2015-04-28 17:35:55,062::vm::3513::vm.Vm::(destroy) vmId=`6ec9c0a0-2879-4bfe-9a79-92471881ebfe`::destroy Called<br> Thread-37401::INFO::2015-04-28 17:35:55,062::vm::3444::vm.Vm::(releaseVm) vmId=`6ec9c0a0-2879-4bfe-9a79-92471881ebfe`::Release VM resources<br> Thread-37401::WARNING::2015-04-28 17:35:55,062::vm::375::vm.Vm::(_set_lastStatus) vmId=`6ec9c0a0-2879-4bfe-9a79-92471881ebfe`::trying to set state to Powering down when already Down<br> Thread-37401::ERROR::2015-04-28 17:35:55,063::__init__::469::jsonrpc.JsonRpcServer::(_serveRequest) Internal server error<br> Traceback (most recent call last):<br> File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 464, in _serveRequest<br> res = method(**params)<br> File "/usr/share/vdsm/rpc/Bridge.py", line 273, in _dynamicMethod<br> result = fn(*methodArgs)<br> File "/usr/share/vdsm/API.py", line 339, in destroy<br> res = v.destroy()<br> File "/usr/share/vdsm/virt/vm.py", line 3515, in destroy<br> result = self.doDestroy()<br> File "/usr/share/vdsm/virt/vm.py", line 3533, in doDestroy<br> return self.releaseVm()<br> File "/usr/share/vdsm/virt/vm.py", line 3448, in releaseVm<br> sampling.stats_cache.remove(self.id)<br> File "/usr/share/vdsm/virt/sampling.py", line 428, in remove<br> if vmid in self._vm_last_timestamp.keys():<br> KeyError: u'6ec9c0a0-2879-4bfe-9a79-92471881ebfe'<br> Thread-37401::DEBUG::2015-04-28 17:35:55,063::stompReactor::158::yajsonrpc.StompServer::(send) Sending response<br> </blockquote> In file '/usr/share/vdsm/virt/sampling.py':<br> <br> def add(self, vmid):<br> """<br> Warm up the cache for the given VM.<br> This is to avoid races during the first sampling and the first<br> reporting, which may result in a VM wrongly reported as unresponsive.<br> """<br> with self._lock:<br> self._vm_last_timestamp[vmid] = self._clock()<br> <br> def remove(self, vmid):<br> """<br> Remove any data from the cache related to the given VM.<br> """<br> with self._lock:<br> <b> if vmid in self._vm_last_timestamp.keys():</b><b> <----------------- I patched here as a workarround<br> </b><b> del self._vm_last_timestamp[vmid]</b><br> <br> </body> </html> --------------060105040505080104010504--

In sampling.py, remove() is being called without calling add() before, which throws:
Fixed in master, please update: https://gerrit.ovirt.org/40223 Nir

On 28-04-2015 17:44, Nir Soffer wrote:
In sampling.py, remove() is being called without calling add() before, which throws: Fixed in master, please update: https://gerrit.ovirt.org/40223
Hi Nir, Still present: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 464, in _serveRequest res = method(**params) File "/usr/share/vdsm/rpc/Bridge.py", line 273, in _dynamicMethod result = fn(*methodArgs) File "/usr/share/vdsm/API.py", line 339, in destroy res = v.destroy() File "/usr/share/vdsm/virt/vm.py", line 3517, in destroy result = self.doDestroy() File "/usr/share/vdsm/virt/vm.py", line 3535, in doDestroy return self.releaseVm() File "/usr/share/vdsm/virt/vm.py", line 3450, in releaseVm <-------------------------------- Also called here sampling.stats_cache.remove(self.id) File "/usr/share/vdsm/virt/sampling.py", line 428, in remove del self._vm_last_timestamp[vmid] Consider making the function foolproof/flow-independent: 428,429c428 < if vmid in self._vm_last_timestamp.keys(): < del self._vm_last_timestamp[vmid] ---
del self._vm_last_timestamp[vmid]

----- Original Message -----
From: "Christopher Pereira" <kripper@imatronix.cl> To: "Nir Soffer" <nsoffer@redhat.com> Cc: devel@ovirt.org Sent: Wednesday, April 29, 2015 12:33:01 PM Subject: Re: [ovirt-devel] VDSM - sampling.py - remove() called without previous add()
On 28-04-2015 17:44, Nir Soffer wrote:
In sampling.py, remove() is being called without calling add() before, which throws: Fixed in master, please update: https://gerrit.ovirt.org/40223
Hi Nir,
Still present:
Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 464, in _serveRequest res = method(**params) File "/usr/share/vdsm/rpc/Bridge.py", line 273, in _dynamicMethod result = fn(*methodArgs) File "/usr/share/vdsm/API.py", line 339, in destroy res = v.destroy() File "/usr/share/vdsm/virt/vm.py", line 3517, in destroy result = self.doDestroy() File "/usr/share/vdsm/virt/vm.py", line 3535, in doDestroy return self.releaseVm() File "/usr/share/vdsm/virt/vm.py", line 3450, in releaseVm <-------------------------------- Also called here sampling.stats_cache.remove(self.id) File "/usr/share/vdsm/virt/sampling.py", line 428, in remove del self._vm_last_timestamp[vmid]
So you reproduced this bug with VDSM with 40223 applied (let's say, today's master branch)? Which flow is this, just shutdding down a VM? I did basic sanity check as part of verification of 40223, and didn't noticed blatant mistakes. :\ Bests, -- Francesco Romani RedHat Engineering Virtualization R & D Phone: 8261328 IRC: fromani

This is a multi-part message in MIME format. --------------010803090902010902080006 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit On 29-04-2015 7:46, Francesco Romani wrote:
----- Original Message -----
From: "Christopher Pereira" <kripper@imatronix.cl> To: "Nir Soffer" <nsoffer@redhat.com> Cc: devel@ovirt.org Sent: Wednesday, April 29, 2015 12:33:01 PM Subject: Re: [ovirt-devel] VDSM - sampling.py - remove() called without previous add()
On 28-04-2015 17:44, Nir Soffer wrote:
In sampling.py, remove() is being called without calling add() before, which throws: Fixed in master, please update: https://gerrit.ovirt.org/40223
Hi Nir,
Still present:
Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 464, in _serveRequest res = method(**params) File "/usr/share/vdsm/rpc/Bridge.py", line 273, in _dynamicMethod result = fn(*methodArgs) File "/usr/share/vdsm/API.py", line 339, in destroy res = v.destroy() File "/usr/share/vdsm/virt/vm.py", line 3517, in destroy result = self.doDestroy() File "/usr/share/vdsm/virt/vm.py", line 3535, in doDestroy return self.releaseVm() File "/usr/share/vdsm/virt/vm.py", line 3450, in releaseVm <-------------------------------- Also called here sampling.stats_cache.remove(self.id) File "/usr/share/vdsm/virt/sampling.py", line 428, in remove del self._vm_last_timestamp[vmid] So you reproduced this bug with VDSM with 40223 applied (let's say, today's master branch)? Yes. I cloned github and am pulling patches from gerrit. I also checked the file to confirm the patch was applied.
Which flow is this, just shutdding down a VM? I guess the VM was down and about to start. It probably failed to start because the MasterStorageDomain was down. I did basic sanity check as part of verification of 40223, and didn't noticed blatant mistakes. :\
I saw the patch eliminating the remove() call from one section, but it was also called on another one (vm.py, line 3450). --------------010803090902010902080006 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 7bit <html> <head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> </head> <body bgcolor="#FFFFFF" text="#000000"> <br> <div class="moz-signature"> <style> .signature, .small-signature { font-family:"Calibri","sans-serif";mso-fareast-font-family:"Times New Roman"; color:#7F7F7F; } .signature { font-size:10pt; } .small-signature { font-size:8pt; }</style></div> <div class="moz-cite-prefix">On 29-04-2015 7:46, Francesco Romani wrote:<br> </div> <blockquote cite="mid:2087988230.7158335.1430304391574.JavaMail.zimbra@redhat.com" type="cite"> <pre wrap="">----- Original Message ----- </pre> <blockquote type="cite"> <pre wrap="">From: "Christopher Pereira" <a class="moz-txt-link-rfc2396E" href="mailto:kripper@imatronix.cl"><kripper@imatronix.cl></a> To: "Nir Soffer" <a class="moz-txt-link-rfc2396E" href="mailto:nsoffer@redhat.com"><nsoffer@redhat.com></a> Cc: <a class="moz-txt-link-abbreviated" href="mailto:devel@ovirt.org">devel@ovirt.org</a> Sent: Wednesday, April 29, 2015 12:33:01 PM Subject: Re: [ovirt-devel] VDSM - sampling.py - remove() called without previous add() On 28-04-2015 17:44, Nir Soffer wrote: </pre> <blockquote type="cite"> <blockquote type="cite"> <pre wrap="">In sampling.py, remove() is being called without calling add() before, which throws: </pre> </blockquote> <pre wrap="">Fixed in master, please update: <a class="moz-txt-link-freetext" href="https://gerrit.ovirt.org/40223">https://gerrit.ovirt.org/40223</a> </pre> </blockquote> <pre wrap="">Hi Nir, Still present: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 464, in _serveRequest res = method(**params) File "/usr/share/vdsm/rpc/Bridge.py", line 273, in _dynamicMethod result = fn(*methodArgs) File "/usr/share/vdsm/API.py", line 339, in destroy res = v.destroy() File "/usr/share/vdsm/virt/vm.py", line 3517, in destroy result = self.doDestroy() File "/usr/share/vdsm/virt/vm.py", line 3535, in doDestroy return self.releaseVm() File "/usr/share/vdsm/virt/vm.py", line 3450, in releaseVm <-------------------------------- Also called here sampling.stats_cache.remove(self.id) File "/usr/share/vdsm/virt/sampling.py", line 428, in remove del self._vm_last_timestamp[vmid] </pre> </blockquote> <pre wrap=""> So you reproduced this bug with VDSM with 40223 applied (let's say, today's master branch)? </pre> </blockquote> Yes. I cloned github and am pulling patches from gerrit.<br> I also checked the file to confirm the patch was applied.<br> <br> <blockquote cite="mid:2087988230.7158335.1430304391574.JavaMail.zimbra@redhat.com" type="cite"> <pre wrap="">Which flow is this, just shutdding down a VM? </pre> </blockquote> I guess the VM was down and about to start. It probably failed to start because the MasterStorageDomain was down.<br> <blockquote cite="mid:2087988230.7158335.1430304391574.JavaMail.zimbra@redhat.com" type="cite"> <pre wrap="">I did basic sanity check as part of verification of 40223, and didn't noticed blatant mistakes. :\ </pre> </blockquote> I saw the patch eliminating the remove() call from one section, but it was also called on another one (vm.py, line 3450).<br> <br> </body> </html> --------------010803090902010902080006--
participants (3)
-
Christopher Pereira
-
Francesco Romani
-
Nir Soffer