system tests failing on template export

Hello,
We've got several cases today where system tests failed when attempting to export templates:
http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/...
Related engine.log looks something like this: https://paste.fedoraproject.org/449936/47643643/raw/
I could not find any obvious issues in the SPM logs. Could someone please take a look to confirm what may be causing this issue?
Full logs from the test are available here: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/artifact/
Regards, Evgheni Dereveanchin

Adam,
I see constant failures due to this and found:
2016-10-17 03:55:21,045 ERROR (jsonrpc/3) [storage.TaskManager.Task] Task=`8989d694-7099-449b-bd66-4d63786be089`::Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2212, in getAllTasksInfo
    allTasksInfo = sp.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
Please take a look; I am not sure whether it is related. You can find the latest build here [1].
Thanks, Piotr
[1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/668/
On Fri, Oct 14, 2016 at 11:22 AM, Evgheni Dereveanchin <ederevea@redhat.com> wrote:
Hello,
We've got several cases today where system tests failed when attempting to export templates:
http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/...
Related engine.log looks something like this: https://paste.fedoraproject.org/449936/47643643/raw/
I could not find any obvious issues in SPM logs, could someone please take a look to confirm what may be causing this issue?
Full logs from the test are available here: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/artifact/
Regards, Evgheni Dereveanchin _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

On 17/10/16 11:51 +0200, Piotr Kliczewski wrote:
Adam,
I see constant failures due to this and found:
2016-10-17 03:55:21,045 ERROR (jsonrpc/3) [storage.TaskManager.Task] Task=`8989d694-7099-449b-bd66-4d63786be089`::Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2212, in getAllTasksInfo
    allTasksInfo = sp.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
This usually indicates that the SPM role has been lost, which most likely happens because of connection issues with the storage. What storage environment is being used for the system tests?
-- Adam Litke
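As a rough illustration of why a lost SPM role surfaces as this particular exception, here is a minimal sketch of a "secured object" guard. It is not VDSM's actual securable.py: the names secured, FakePool, become_spm, lose_spm and get_all_tasks_info are hypothetical, and only the exception name and its message are taken from the traceback above.

```python
import functools


class SecureError(RuntimeError):
    """Raised when a guarded method is called while the object is unsafe."""


def secured(method):
    """Allow the call only while the owning object is marked safe."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self._safe:
            raise SecureError("Secured object is not in safe state")
        return method(self, *args, **kwargs)
    return wrapper


class FakePool:
    """Hypothetical stand-in for a storage pool; not VDSM code."""

    def __init__(self):
        self._safe = False  # unsafe until the SPM role is acquired

    def become_spm(self):
        self._safe = True

    def lose_spm(self):
        self._safe = False

    @secured
    def get_all_tasks_info(self):
        return {}


def demo():
    """Call the guarded method before, during and after holding the role."""
    pool = FakePool()
    outcomes = []
    for change in (None, pool.become_spm, pool.lose_spm):
        if change is not None:
            change()
        try:
            pool.get_all_tasks_info()
            outcomes.append("ok")
        except SecureError as exc:
            outcomes.append(str(exc))
    return outcomes
```

In this sketch the guard raises the same error whether the role was never acquired or was lost later, so the exception alone does not say which of the two happened.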

All,
I noticed that on Friday the problem did not occur, but we have a different one [1] which could be related to storage as well.
Thanks, Piotr
[1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/692/console

On Mon, Oct 24, 2016 at 2:30 PM, Piotr Kliczewski < piotr.kliczewski@gmail.com> wrote:
All,
I noticed that on Friday the problem did not occur, but we have a different one [1] which could be related to storage as well.
Thanks, Piotr
[1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/692/console
This one is actually https://bugzilla.redhat.com/show_bug.cgi?id=1379130 . Y.

It happened again in [1]:
2016-11-20 10:48:12,106 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2205, in getAllTasksInfo
    allTasksInfo = sp.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-11-20 10:48:12,109 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1175)
2016-11-20 10:48:12,110 ERROR (jsonrpc/2) [storage.Dispatcher] Secured object is not in safe state (dispatcher:80)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
    raise self.error
SecureError: Secured object is not in safe state
[1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/artifa...
The storage is running on the same VM as the engine (to save memory) and it's serving both NFS and iSCSI. Do you think running it on the same VM as the engine might cause such issues?
On Mon, Oct 17, 2016 at 11:45 PM, Adam Litke <alitke@redhat.com> wrote:
On Fri, Oct 14, 2016 at 11:22 AM, Evgheni Dereveanchin <ederevea@redhat.com> wrote:
Hello,
We've got several cases today where system tests failed when attempting to export templates:
http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/(root)/004_basic_sanity/template_export/
Related engine.log looks something like this: https://paste.fedoraproject.org/449936/47643643/raw/
I could not find any obvious issues in SPM logs, could someone please take a look to confirm what may be causing this issue?
Full logs from the test are available here: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/artifact/
Regards, Evgheni Dereveanchin _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Eyal Edri Associate Manager RHV DevOps EMEA ENG Virtualization R&D Red Hat Israel phone: +972-9-7692018 irc: eedri (on #tlv #rhev-dev #rhev-integ)
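For anyone searching the VDSM logs for further occurrences, the following is a minimal sketch of a scanner for this failure. It assumes the unflattened log layout, with each record starting on its own line and the final `SecureError:` line of a traceback starting at column zero, and it handles the two task ID spellings seen in this thread. The names RECORD_RE, TASK_RE and find_secure_errors are made up for the example.

```python
import re

# Start of a log record as it appears in the excerpts in this thread:
# "<date> <time>,<ms> <LEVEL> (<thread>) [<logger>] <message>"
RECORD_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]+)\) \[(?P<logger>[^\]]+)\] "
    r"(?P<message>.*)$"
)

# Task IDs are written as Task=`<uuid>` in the October excerpt and as
# (Task='<uuid>') in the November one.
TASK_RE = re.compile(r"Task=[`'](?P<task>[0-9a-f-]{36})[`']")


def find_secure_errors(lines):
    """Yield (timestamp, task_id) for ERROR records ending in a SecureError.

    task_id is None when the record does not carry a task ID.
    """
    current = None
    for line in lines:
        match = RECORD_RE.match(line)
        if match:
            # A new record starts; remember it only if it is an ERROR.
            current = match if match.group("level") == "ERROR" else None
        elif current is not None and line.startswith("SecureError:"):
            task = TASK_RE.search(current.group("message"))
            task_id = task.group("task") if task else None
            yield current.group("ts"), task_id
            current = None
```

Passing an open file object works as well as a list of strings, for example `list(find_secure_errors(open(path)))` with `path` pointing at a vdsm.log copied from the job artifacts.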

On Sun, Nov 20, 2016 at 6:25 PM, Eyal Edri <eedri@redhat.com> wrote:
It happened again in [1]
2016-11-20 10:48:12,106 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2205, in getAllTasksInfo
    allTasksInfo = sp.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-11-20 10:48:12,109 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1175)
2016-11-20 10:48:12,110 ERROR (jsonrpc/2) [storage.Dispatcher] Secured object is not in safe state (dispatcher:80)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
    raise self.error
SecureError: Secured object is not in safe state
This can also mean that the SPM has not started yet. Maybe you are not waiting until the SPM is ready before you try to perform an operation?
Who is the owner of this test? This person should debug this test.
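If the test is racing SPM startup, polling for readiness before issuing the export would rule that out. A minimal sketch, assuming two hypothetical callables that the test suite would have to supply, spm_is_ready and export_template; nothing here is taken from the actual test code or from the engine API, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for(condition, timeout=120.0, interval=3.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or timeout seconds pass.

    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def export_template_when_ready(spm_is_ready, export_template):
    """Run export_template() only once spm_is_ready() reports True.

    Both arguments are hypothetical callables supplied by the test.
    """
    if not wait_for(spm_is_ready):
        raise RuntimeError("SPM was not ready within the timeout")
    return export_template()
```

A timeout that raises a distinct error also makes the failure easier to tell apart in the Jenkins report: "SPM was not ready" points at the environment, while a SecureError after a successful wait points at the role being lost mid-test.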
http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/artifa...
The storage is running on the same VM as the engine (to save memory) and it's serving both NFS and iSCSI. Do you think running it on the same VM as the engine might cause such issues?
I don't think so, but this prevents testing a lot of interesting negative flows. For example, when one storage server is down, the system should be able to use the other storage domain. Having each storage server in its own VM makes this possible. Also, we may want to test multiple storage servers of the same type. The storage servers should be decoupled so we can start any number of them as needed for the current test.

On Nov 20, 2016 6:33 PM, "Nir Soffer" <nsoffer@redhat.com> wrote:
On Sun, Nov 20, 2016 at 6:25 PM, Eyal Edri <eedri@redhat.com> wrote:
It happened again in [1]
2016-11-20 10:48:12,106 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2205, in getAllTasksInfo
    allTasksInfo = sp.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-11-20 10:48:12,109 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1175)
2016-11-20 10:48:12,110 ERROR (jsonrpc/2) [storage.Dispatcher] Secured object is not in safe state (dispatcher:80)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
    raise self.error
SecureError: Secured object is not in safe state
This can also mean that the SPM is not started yet. Maybe you are not waiting until the SPM is ready before you try to perform an operation?
Who is the owner of this test? This person should debug this test.
The relevant team for the feature.
http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/artifa...
The storage is running on the same VM as the engine (to save memory) and it's serving both NFS and iSCSI. Do you think running it on the same VM as the engine might cause such issues?
I don't think so, but this prevents testing a lot of interesting negative flows.
Which don't belong in CI.
For example, when one storage server is down, the system should be able to use the other storage domain. Having each storage server in its own VM makes this possible.
You have both NFS and iSCSI there. It's trivial to set up multiple of each if needed, of course. I do wish to add more IPs and test iSCSI bonding, as well as both NFSv3 and NFSv4.
Also, we may want to test multiple storage servers of the same type. The storage servers should be decoupled so we can start any number of them as needed for the current test.
Right, but not in this suite. Again, it's trivial to do so. The main motivation was to conserve resources so everyone could run the tests.
Y.
participants (6)
- Adam Litke
- Evgheni Dereveanchin
- Eyal Edri
- Nir Soffer
- Piotr Kliczewski
- Yaniv Kaul