[ OST Failure Report ] [ oVirt 4.2 (ovirt-engine) ] [ 04-03-2018 ] [ 004_basic_sanity.disk_operations ]

Hi,
The following test failed OST: 004_basic_sanity.disk_operations.
Link to suspected patch: https://gerrit.ovirt.org/c/88404/
Link to the failed job: http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/
Link to all test logs:
- engine <http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/artifact/exported-artifacts/basic-suit-4.2-el7/test_logs/basic-suite-4.2/post-004_basic_sanity.py/lago-basic-suite-4-2-engine>
- host 0 <http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/artifact/exported-artifacts/basic-suit-4.2-el7/test_logs/basic-suite-4.2/post-004_basic_sanity.py/lago-basic-suite-4-2-host-0/_var_log>
- host 1 <http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/artifact/exported-artifacts/basic-suit-4.2-el7/test_logs/basic-suite-4.2/post-004_basic_sanity.py/lago-basic-suite-4-2-host-1/_var_log>
Error snippet from engine:
2018-03-04 09:50:14,823-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-12) [] EVENT_ID: VM_DOWN_ERROR(119), VM vm2 is down with error. Exit message: Lost connection with qemu process.
Error snippet from host:
Mar 4 09:56:27 lago-basic-suite-4-2-host-1 libvirtd: 2018-03-04 14:56:27.831+0000: 1189: error : qemuDomainAgentAvailable:6010 : Guest agent is not responding: QEMU guest agent is not connected
Thanks,
--
DANIEL BELENKY
RHV DEVOPS

On Sun, Mar 4, 2018 at 5:18 PM, Daniel Belenky <dbelenky@redhat.com> wrote:
Hi,
The following test failed OST: 004_basic_sanity.disk_operations.
Link to suspected patch: https://gerrit.ovirt.org/c/88404/
Link to the failed job: http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/
Link to all test logs:
- engine <http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/artifact/exported-artifacts/basic-suit-4.2-el7/test_logs/basic-suite-4.2/post-004_basic_sanity.py/lago-basic-suite-4-2-engine>
- host 0 <http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/artifact/exported-artifacts/basic-suit-4.2-el7/test_logs/basic-suite-4.2/post-004_basic_sanity.py/lago-basic-suite-4-2-host-0/_var_log>
- host 1 <http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/1019/artifact/exported-artifacts/basic-suit-4.2-el7/test_logs/basic-suite-4.2/post-004_basic_sanity.py/lago-basic-suite-4-2-host-1/_var_log>
Error snippet from engine:
2018-03-04 09:50:14,823-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-12) [] EVENT_ID: VM_DOWN_ERROR(119), VM vm2 is down with error. Exit message: Lost connection with qemu process.
Error snippet from host:
Mar 4 09:56:27 lago-basic-suite-4-2-host-1 libvirtd: 2018-03-04 14:56:27.831+0000: 1189: error : qemuDomainAgentAvailable:6010 : Guest agent is not responding: QEMU guest agent is not connected
That's not surprising - there's no guest agent there.
However, Benny and I were just discussing this issue as I've seen this failure on my host as well:

ovirtlago.testlib: ERROR: Unhandled exception in <function all_jobs_finished at 0x3d62320>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 219, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-4.2_change-queue-tester/ovirt-system-tests/basic-suite-4.2/test-scenarios/004_basic_sanity.py", line 519, in all_jobs_finished
    jobs = engine.jobs_service().list(search='correlation_id=%s' % correlation_id)
TypeError: list() got an unexpected keyword argument 'search'
lago.utils: ERROR: Error while running thread
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/lago/utils.py", line 58, in _ret_via_queue
    queue.put({'return': func()})
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 59, in wrapper
    return func(get_test_prefix(), *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 78, in wrapper
    prefix.virt_env.engine_vm().get_api(api_ver=4), *args, **kwargs
  File "/home/jenkins/workspace/ovirt-4.2_change-queue-tester/ovirt-system-tests/basic-suite-4.2/test-scenarios/004_basic_sanity.py", line 522, in live_storage_migration
    testlib.assert_true_within_long(all_jobs_finished)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 271, in assert_true_within_long
    assert_equals_within_long(func, True, allowed_exceptions)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 258, in assert_equals_within_long
    func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 219, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-4.2_change-queue-tester/ovirt-system-tests/basic-suite-4.2/test-scenarios/004_basic_sanity.py", line 519, in all_jobs_finished
    jobs = engine.jobs_service().list(search='correlation_id=%s' % correlation_id)
TypeError: list() got an unexpected keyword argument 'search'

Y.
Thanks,
--
DANIEL BELENKY
RHV DEVOPS
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
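
The TypeError in the traceback above is what Python raises when a call passes a keyword that the installed function does not declare. As a rough sketch of how a test could detect this before calling, the snippet below inspects the signature of list() and falls back to an unfiltered listing. The two service classes are hypothetical stand-ins, not the real SDK classes, and the sketch uses Python 3's inspect.signature, whereas the failing job ran on Python 2.7.

```python
import inspect


class OldJobsService(object):
    """Hypothetical stand-in for an SDK build whose list() lacks 'search'."""

    def list(self, max=None):
        return ["job-a", "job-b"]


class NewJobsService(object):
    """Hypothetical stand-in for an SDK build whose list() accepts 'search'."""

    def list(self, max=None, search=None):
        return ["job-a"] if search else ["job-a", "job-b"]


def supports_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    param = params.get(name)
    if param is not None and param.kind in keyword_kinds:
        return True
    # A **kwargs catch-all would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def list_jobs(service, correlation_id):
    """Use server-side search when available, else return the full list."""
    if supports_keyword(service.list, "search"):
        return service.list(search="correlation_id=%s" % correlation_id)
    # Fallback: the caller has to filter the unfiltered result itself.
    return service.list()
```

Whether a fallback is appropriate, or whether the test should simply require a newer SDK, is a separate decision; the sketch only shows how the mismatch can be detected.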

On Sun, Mar 4, 2018 at 5:31 PM Yaniv Kaul <ykaul@redhat.com> wrote:
That's not surprising - there's no guest agent there.
There are 2 issues here:
- We are testing without a guest agent, although running with a guest agent is the recommended configuration (snapshots may not be consistent without it).
- vdsm should not report errors about the guest agent, since it does not know whether a guest agent is installed. This message should be an INFO message like "could not stop the vm using guest agent, falling back to ...".
Generally we should not see ERROR or WARN messages in OST. Any repeating error or warning should be reported as a bug.
Nir
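
One way to find the repeating errors and warnings mentioned above is to count log lines by level and message after masking the parts that change between repeats. This is only a sketch: the function name, the regular expressions, and the digit-masking rule are assumptions made for illustration, loosely based on the two log styles quoted in this thread.

```python
import re
from collections import Counter

# Matches the level token in the two log styles quoted in this thread:
# engine.log ("... ERROR [logger] ...") and libvirtd ("...: error : ...").
LEVEL_RE = re.compile(r"\b(ERROR|WARN(?:ING)?)\b", re.IGNORECASE)
# Timestamps, thread ids and counters differ between repeats, so mask digits.
DIGITS_RE = re.compile(r"\d+")


def repeated_problems(lines, min_count=2):
    """Return {masked message: count} for ERROR/WARN lines seen repeatedly."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if not match:
            continue
        # Keep the level and everything after it, with digit runs masked.
        message = DIGITS_RE.sub("N", line[match.start():]).strip()
        counts[message] += 1
    return {msg: n for msg, n in counts.items() if n >= min_count}
```

Feeding it the lines of an engine or host log would give a short list of candidates for bug reports. The masking is deliberately crude, so messages that differ only in numbers are grouped together.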

On Sun, Mar 4, 2018 at 5:48 PM, Nir Soffer <nsoffer@redhat.com> wrote:
There are 2 issues here:
- we are testing without guest agent when this is the recommended configuration (snapshots may not be consistent without guest agent)
We are still using Cirros. I need to get a CentOS with cloud-init uploaded (WIP...)
- vdsm should not report errors about guest agent since it does not know if guest agent is installed or not. This message should be an INFO message like "could not stop the vm using guest agent, falling back to ..."
Generally we should not see ERROR or WARN message in OST. Any repeating error or warning should be reported as a bug.
There are several on storage... Should we file BZs on them? Y.
Nir

On Sun, Mar 4, 2018 at 6:43 PM Yaniv Kaul <ykaul@redhat.com> wrote:
On Sun, Mar 4, 2018 at 5:48 PM, Nir Soffer <nsoffer@redhat.com> wrote:
There are 2 issues here:
- we are testing without guest agent when this is the recommended configuration (snapshots may not be consistent without guest agent)
We are still using Cirros. I need to get a CentOS with cloud-init uploaded (WIP...)
- vdsm should not report errors about guest agent since it does not know if guest agent is installed or not. This message should be an INFO message like "could not stop the vm using guest agent, falling back to ..."
Generally we should not see ERROR or WARN message in OST. Any repeating error or warning should be reported as a bug.
There are several on storage... Should we file BZs on them?
I think we have only warnings now, but they should be fixed, so please file a bug.
Y.
Nir

https://bugzilla.redhat.com/show_bug.cgi?id=1553893
On Sun, Mar 4, 2018 at 10:51 PM, Nir Soffer <nsoffer@redhat.com> wrote:
I think we have only warnings now, but they should be fixed, so please file a bug.

https://gerrit.ovirt.org/#/c/84338/
This patch was merged today, and it seems to be causing the problem, since it uses an SDK feature that is currently available only in the 4.3 SDK versions. It looks like it should be available in the SDK when it's built against the 4.2.29 ovirt-engine-api-model.
On Sun, Mar 4, 2018 at 5:18 PM, Daniel Belenky <dbelenky@redhat.com> wrote:
Hi,
The following test failed OST: 004_basic_sanity.disk_operations.

I've prepared a revert patch [1] in the meantime, but it's strange that it passed check-patch. Can you verify which SDK version was used and which one you need?
[1] https://gerrit.ovirt.org/#/c/88441/
On Sun, Mar 4, 2018 at 6:01 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/84338/
This patch was merged today, and it seems to be causing the problem, since it uses an SDK feature that is currently available only in the 4.3 SDK versions. It looks like it should be available in the SDK when it's built against the 4.2.29 ovirt-engine-api-model.
--
Eyal Edri
MANAGER
RHV DevOps
EMEA VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
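
To answer the question of which SDK version an environment actually has, a check along these lines could be run on the test host. It is a sketch under stated assumptions: the distribution name "ovirt-engine-sdk-python" is assumed to be how the Python SDK is registered in the environment, the helper names are hypothetical, and the minimum version that provides the needed feature is not stated in this thread, so the caller has to supply it.

```python
from importlib import metadata

# Assumption: the Python SDK is installed under this distribution name.
SDK_DIST = "ovirt-engine-sdk-python"


def parse_version(text):
    """Turn '4.2.4' into (4, 2, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_sdk_version(dist=SDK_DIST):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def sdk_is_at_least(minimum, dist=SDK_DIST):
    """Return True if the installed SDK is at or above `minimum`."""
    found = installed_sdk_version(dist)
    return found is not None and parse_version(found) >= parse_version(minimum)
```

Printing installed_sdk_version() in both the check-patch environment and the change-queue environment would show whether the two jobs ran against different SDK builds.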
participants (6): Benny Zlotnik, Dafna Ron, Daniel Belenky, Eyal Edri, Nir Soffer, Yaniv Kaul