FYI - Many nightly suites, including Network, HE and HC, have been failing for the last 2-3 days

Not sure if they are all the same issue, but they seem to have started failing around the same time:

  ovirt-system-tests_hc-basic-suite-4.2            1 day 12 hr - #824    12 hr - #825   57 min       integ-tests
  ovirt-system-tests_hc-basic-suite-master         2 days 12 hr - #1043  12 hr - #1045  51 min       integ-tests
  ovirt-system-tests_he-basic-ansible-suite-4.3    N/A                   11 hr - #11    24 sec       integ-tests
  ovirt-system-tests_he-basic-ipv6-suite-master    N/A                   12 hr - #11    10 min       integ-tests
  ovirt-system-tests_he-basic-iscsi-suite-master   2 days 10 hr - #871   10 hr - #873   1 hr 36 min  integ-tests
  ovirt-system-tests_he-basic-suite-master         2 days 11 hr - #1147  11 hr - #1149  1 hr 26 min  integ-tests
  ovirt-system-tests_he-node-ng-suite-master       2 days 12 hr - #727   12 hr - #729   1 hr 54 min  integ-tests
  ovirt-system-tests_network-suite-master          3 days 12 hr - #926   12 hr - #929   45 min       integ-tests
  ovirt-system-tests_openshift-on-ovirt-suite-4.2  3 days 10 hr - #187   10 hr - #190   45 min       integ-tests

Links to jobs can be found here: https://jenkins.ovirt.org/

--
Eyal Edri
MANAGER
RHV/CNV DevOps
EMEA VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig>
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)

At least for the network suite, these failures occur because the test issues a change_cluster request while DisconnectStoragePoolVDSCommand has not yet finished. The engine rejects the request, which fails the test setup.

Ahmad, can you please have a look at the engine log below [2]? Currently, change_cluster requests from the test are being rejected by the engine. I believe this is because the locking group of DisconnectStoragePoolVDSCommand was moved to VDS in patch [1]. Is this a desired outcome, and should the tests be changed to reflect it (and if so, which status notification from the engine should they wait for)?

Thanks

P.S. Please note that due to this patch LockingGroup.VDS_POOL_AND_STORAGE_CONNECTIONS has only one consumer, which means it does not lock against anything. Is this a desired outcome of the patch?

[1] https://gerrit.ovirt.org/#/c/98405

[2] Network-suite-master.tests.test_mac_pools.test_mac_pools_in_different_clusters_dont_overlap

Error Message

test setup failure

Stacktrace

default_data_center = <ovirtlib.datacenterlib.DataCenter object at 0x7fbe88aa8790>
host_1_up = <ovirtlib.hostlib.Host object at 0x7fbe8404a810>
cluster_0 = <ovirtlib.clusterlib.Cluster object at 0x7fbe77f5a050>

    @pytest.fixture(scope='module')
    def host_1_in_cluster_0(default_data_center, host_1_up, cluster_0):
        current_cluster = host_1_up.get_cluster()
        host_1_up.change_cluster(cluster_0)

network-suite-master/tests/test_mac_pools.py:49

2019-03-17 07:16:40,493-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-65) [] START, DisconnectStoragePoolVDSCommand(HostName = lago-network-suite-master-host-1, DisconnectStoragePoolVDSCommandParameters:{hostId='2ab16b7a-fd38-41c5-98be-c8db1fdece42', storagePoolId='239d1be2-48a4-11e9-bec6-5452c0a8c902', vds_spm_id='2'}), log id: 44ad4438

2019-03-17 07:16:43,635-04 INFO [org.ovirt.engine.core.bll.ChangeVDSClusterCommand] (default task-2) [eded9adc-b780-4640-ba5e-ec989f620d57] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[2ab16b7a-fd38-41c5-98be-c8db1fdece42=VDS]', sharedLocks=''}'

2019-03-17 07:16:43,635-04 WARN [org.ovirt.engine.core.bll.ChangeVDSClusterCommand] (default task-2) [eded9adc-b780-4640-ba5e-ec989f620d57] Validation of action 'ChangeVDSCluster' failed for user admin@internal-authz. Reasons: VAR__ACTION__UPDATE,VAR__TYPE__HOST,ACTION_TYPE_FAILED_OBJECT_LOCKED

2019-03-17 07:16:43,636-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-2) [eded9adc-b780-4640-ba5e-ec989f620d57] method: runAction, params: [ChangeVDSCluster, ChangeVDSClusterParameters:{commandId='00522716-6557-4da0-8d02-802fbf402c24', user='null', commandType='Unknown'}], timeElapsed: 18ms

2019-03-17 07:16:43,637-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-2) [] Operation Failed: [Cannot edit Host. Related operation is currently in progress. Please try again later.]

2019-03-17 07:16:44,575-04 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-65) [] FINISH, DisconnectStoragePoolVDSCommand, return: , log id: 44ad4438

On Sun, Mar 17, 2019 at 4:05 PM Eyal Edri <eedri@redhat.com> wrote:
> Not sure if they are all the same issue, but they seem to have started failing around the same time:
> [...]
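
On the question of whether the tests should wait: one possible test-side workaround would be to retry the cluster change for as long as the engine reports the host as locked. The following is only a sketch under stated assumptions. `HostLockedError` and `retry_while_locked` are hypothetical names invented for illustration; they are not part of ovirtlib or the oVirt SDK, and a real test would have to map the "Related operation is currently in progress" fault onto whatever exception the SDK actually raises.

```python
import time


class HostLockedError(Exception):
    """Hypothetical stand-in for the engine's 'Related operation is
    currently in progress' fault (ACTION_TYPE_FAILED_OBJECT_LOCKED)."""


def retry_while_locked(action, timeout=30.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call action() until it stops raising HostLockedError.

    Re-raises the last HostLockedError once `timeout` seconds have passed.
    `sleep` and `clock` are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        try:
            return action()
        except HostLockedError:
            if clock() >= deadline:
                raise
            sleep(interval)
```

In the fixture quoted above, the call would then look roughly like retry_while_locked(lambda: host_1_up.change_cluster(cluster_0)), assuming change_cluster were made to raise such an exception. Note that this only hides the race from the test; it does not answer whether the new locking behaviour is desired.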
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/I75ZI4LYD3IKZC...

We should probably lock both groups in VdsEventListener#processStorageOnVdsInactive or in RemoveVdsCommand.

Ahmad, please evaluate and adjust.

On Wed, Mar 20, 2019 at 11:25 AM Eitan Raviv <eraviv@redhat.com> wrote:
> At least for the network suite, these failures occur because the test issues a change_cluster request while DisconnectStoragePoolVDSCommand has not yet finished. The engine rejects the request, which fails the test setup.
> [...]

On Sun, Mar 17, 2019 at 3:04 PM Eyal Edri <eedri@redhat.com> wrote:
> Not sure if they are all the same issue, but they seem to have started failing around the same time:
>
>   ovirt-system-tests_hc-basic-suite-4.2 1 day 12 hr - #824 12 hr - #825 57 min  integ-tests
>   ovirt-system-tests_hc-basic-suite-master 2 days 12 hr - #1043 12 hr - #1045 51 min  integ-tests
>   ovirt-system-tests_he-basic-ansible-suite-4.3 N/A 11 hr - #11 24 sec  integ-tests
>   ovirt-system-tests_he-basic-ipv6-suite-master N/A 12 hr - #11 10 min  integ-tests
>   ovirt-system-tests_he-basic-iscsi-suite-master 2 days 10 hr - #871 10 hr - #873 1 hr 36 min  integ-tests
>   ovirt-system-tests_he-basic-suite-master 2 days 11 hr - #1147 11 hr - #1149 1 hr 26 min  integ-tests
>   ovirt-system-tests_he-node-ng-suite-master 2 days 12 hr - #727 12 hr - #729 1 hr 54 min  integ-tests

It's a NullPointerException on the engine side. I opened a bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1690159 which is not on POST

>   ovirt-system-tests_network-suite-master 3 days 12 hr - #926 12 hr - #929 45 min  integ-tests
>   ovirt-system-tests_openshift-on-ovirt-suite-4.2 3 days 10 hr - #187 10 hr - #190 45 min  integ-tests
> [...]

I am not sure that locking both groups would be sufficient, because there is still a chance that the removeNetworks request will start and acquire the lock before the DisconnectStorage operation starts.

So probably the correct and foolproof solution is to not move the host to maintenance until all related storage operations have terminated.

On Wed, Mar 20, 2019 at 2:07 PM Simone Tiraboschi <stirabos@redhat.com> wrote:
> It's a NullPointerException on the engine side. I opened a bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1690159 which is not on POST
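
The ordering argument above can be illustrated with a toy model. This is a sketch only, not engine code: `LockManager`, `run` and the operation names are hypothetical, and which locking groups each real command takes is an assumption made for the example. It shows that a non-blocking exclusive lock prevents the two operations from overlapping, but whichever request arrives first wins and the other one is rejected.

```python
class LockManager:
    """Toy in-memory lock table keyed by (entity id, locking group)."""

    def __init__(self):
        self._held = set()

    def try_acquire(self, keys):
        """Acquire all keys or none; return False if any key is already held."""
        keys = set(keys)
        if self._held & keys:
            return False
        self._held |= keys
        return True

    def release(self, keys):
        """Drop the given keys from the lock table."""
        self._held -= set(keys)


def run(order):
    """Submit the operations in arrival order; return those that got their locks."""
    host = "host-1"
    # Assumed lock sets, for illustration only.
    wanted = {
        "disconnect_storage": [(host, "VDS"),
                               (host, "VDS_POOL_AND_STORAGE_CONNECTIONS")],
        "remove_networks": [(host, "VDS")],
    }
    locks = LockManager()
    return [op for op in order if locks.try_acquire(wanted[op])]


print(run(["disconnect_storage", "remove_networks"]))
print(run(["remove_networks", "disconnect_storage"]))
```

In the first ordering only the storage disconnect gets its locks; in the second only the network request does, so the storage operation is the one that gets rejected. Locking more groups changes which operation fails, not whether one of them fails, which is why the thread moves towards delaying the host state change instead.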

I was told that intervening in the host state machine is delicate, but I think that this is the only correct approach.

Benny, Ahmad, Tal: do you have a plan to resolve this? We are entering a third week with this constant failure.

On Wed, Mar 20, 2019 at 2:42 PM Eitan Raviv <eraviv@redhat.com> wrote:
> I am not sure that locking both groups would be sufficient, because there is still a chance that the removeNetworks request will start and acquire the lock before the DisconnectStorage operation starts.
> So probably the correct and foolproof solution is to not move the host to maintenance until all related storage operations have terminated.

The patch was reverted on Thursday.

On Sat, Mar 23, 2019 at 8:48 PM Dan Kenigsberg <danken@redhat.com> wrote:
> I was told that intervening in the host state machine is delicate, but I think that this is the only correct approach.
>
> Benny, Ahmad, Tal: do you have a plan to resolve this? We are entering a third week with this constant failure.

Unfortunately, the network suite is still failing on

    Cannot edit Host. Related operation is currently in progress. Please try again later.

Can you check if that's the same issue? Did you revert from 4.3 too?

http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/19/

On Sat, 23 Mar 2019, 21:27 Benny Zlotnik, <bzlotnik@redhat.com> wrote:
> The patch was reverted on Thursday

After some offline discussions, it seems that the change that should be implemented in order to solve the original problem (remove host fails due to disconnect storage in progress) is to leave the host in 'preparing for maintenance' until all relevant storage operations are completed.

On Sat, Mar 23, 2019 at 11:36 PM Dan Kenigsberg <danken@redhat.com> wrote:
> Unfortunately, the network suite is still failing on
>
>     Cannot edit Host. Related operation is currently in progress. Please try again later.
>
> Can you check if that's the same issue? Did you revert from 4.3 too?
>
> http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/19/
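
As a rough illustration of the proposal above, the host could be modelled as staying in 'preparing for maintenance' while any storage operation is still pending. This is a minimal sketch, not the engine's actual state machine: the `Host` class, `HostStatus` values and method names are all hypothetical and exist only to show the intended transition rule.

```python
from enum import Enum


class HostStatus(Enum):
    UP = "up"
    PREPARING_FOR_MAINTENANCE = "preparing_for_maintenance"
    MAINTENANCE = "maintenance"


class Host:
    """Toy host that only reaches MAINTENANCE once storage operations are done."""

    def __init__(self):
        self.status = HostStatus.UP
        self.pending_storage_ops = set()

    def storage_op_started(self, name):
        self.pending_storage_ops.add(name)

    def storage_op_finished(self, name):
        self.pending_storage_ops.discard(name)
        self._maybe_finish()

    def request_maintenance(self):
        self.status = HostStatus.PREPARING_FOR_MAINTENANCE
        self._maybe_finish()

    def _maybe_finish(self):
        # Complete the transition only when nothing storage-related is in flight.
        if (self.status is HostStatus.PREPARING_FOR_MAINTENANCE
                and not self.pending_storage_ops):
            self.status = HostStatus.MAINTENANCE
```

With this rule, a client that waits for the maintenance status before removing the host or changing its cluster would no longer race with a storage disconnect that is still running.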

patched 4.3!

On Sun, Mar 24, 2019 at 9:06 AM Eitan Raviv <eraviv@redhat.com> wrote:
> After some offline discussions it seems that the change that should be
> implemented in order to solve the original problem (remove host fails due
> to disconnect storage in progress) is to leave the host in 'preparing for
> maintenance' until all relevant storage operations are completed.
>
> On Sat, Mar 23, 2019 at 11:36 PM Dan Kenigsberg <danken@redhat.com> wrote:
>> Unfortunately, the network suite is still failing on
>>
>> Cannot edit Host. Related operation is currently in progress. Please try
>> again later.
>>
>> Can you check if that's the same issue? Did you revert from 4.3 too?
>>
>> http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/19/
>>
>> On Sat, 23 Mar 2019, 21:27 Benny Zlotnik, <bzlotnik@redhat.com> wrote:
>>> The patch was reverted on Thursday
>>>
>>> On Sat, Mar 23, 2019 at 8:48 PM Dan Kenigsberg <danken@redhat.com> wrote:
>>>> I was told that intervening in the host state machine is delicate, but
>>>> I think that this is the only correct approach.
>>>>
>>>> Benny, Ahmad, Tal: do you have a plan to resolve this? We are entering
>>>> a third week with this constant failure.
>>>>
>>>> On Wed, Mar 20, 2019 at 2:42 PM Eitan Raviv <eraviv@redhat.com> wrote:
>>>>> I am not sure that locking both groups would be sufficient, because
>>>>> there is still a chance that the removeNetworks request will start and
>>>>> acquire the lock before the DisconnectStorage operation starts.
>>>>> So probably the correct and foolproof solution is to not move the
>>>>> host to maintenance until all related storage ops terminate.
>>>>>
>>>>> On Wed, Mar 20, 2019 at 2:07 PM Simone Tiraboschi <stirabos@redhat.com> wrote:
>>>>>> On Sun, Mar 17, 2019 at 3:04 PM Eyal Edri <eedri@redhat.com> wrote:
>>>>>>> Not sure if all the same issue, but seems to be failing around the same time:
>>>>>>>
>>>>>>> ovirt-system-tests_hc-basic-suite-4.2 1 day 12 hr - #824 12 hr - #825 57 min integ-tests
>>>>>>> ovirt-system-tests_hc-basic-suite-master 2 days 12 hr - #1043 12 hr - #1045 51 min integ-tests
>>>>>>> ovirt-system-tests_he-basic-ansible-suite-4.3 N/A 11 hr - #11 24 sec integ-tests
>>>>>>> ovirt-system-tests_he-basic-ipv6-suite-master N/A 12 hr - #11 10 min integ-tests
>>>>>>> ovirt-system-tests_he-basic-iscsi-suite-master 2 days 10 hr - #871 10 hr - #873 1 hr 36 min integ-tests
>>>>>>> ovirt-system-tests_he-basic-suite-master 2 days 11 hr - #1147 11 hr - #1149 1 hr 26 min integ-tests
>>>>>>> ovirt-system-tests_he-node-ng-suite-master 2 days 12 hr - #727 12 hr - #729 1 hr 54 min integ-tests
>>>>>>
>>>>>> It's a NullPointerException on engine side:
>>>>>> I opened a bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1690159 which is not on POST
>>>>>>
>>>>>>> ovirt-system-tests_network-suite-master 3 days 12 hr - #926 12 hr - #929 45 min integ-tests
>>>>>>> ovirt-system-tests_openshift-on-ovirt-suite-4.2 3 days 10 hr - #187 10 hr - #190 45 min integ-tests
>>>>>>>
>>>>>>> Links to jobs can be found here: https://jenkins.ovirt.org/
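The approach described in the quoted discussion (keep the host in 'preparing for maintenance' until the related storage operations finish, so that a host-edit request cannot slip in while a disconnect is still pending) can be illustrated with a toy state machine. This is only a sketch under stated assumptions: `ToyHost`, `HostStatus`, `OperationInProgress` and all method names are hypothetical and are not taken from ovirt-engine, and the rule that a cluster change requires the maintenance state is an assumption of the model.

```python
from enum import Enum


class HostStatus(Enum):
    UP = "up"
    PREPARING_FOR_MAINTENANCE = "preparing_for_maintenance"
    MAINTENANCE = "maintenance"


class OperationInProgress(Exception):
    """Raised when a host edit is attempted before maintenance is reached."""


class ToyHost:
    """Hypothetical model; not the engine's real host state machine."""

    def __init__(self) -> None:
        self.status = HostStatus.UP
        self.pending_storage_ops = 0
        self.cluster = "cluster_0"

    def request_maintenance(self, storage_ops: int) -> None:
        # The host does not jump straight to MAINTENANCE; it first waits
        # for every outstanding storage operation to report completion.
        self.status = HostStatus.PREPARING_FOR_MAINTENANCE
        self.pending_storage_ops = storage_ops
        self._finish_if_idle()

    def storage_op_finished(self) -> None:
        if self.pending_storage_ops > 0:
            self.pending_storage_ops -= 1
        self._finish_if_idle()

    def _finish_if_idle(self) -> None:
        if (self.status is HostStatus.PREPARING_FOR_MAINTENANCE
                and self.pending_storage_ops == 0):
            self.status = HostStatus.MAINTENANCE

    def change_cluster(self, new_cluster: str) -> None:
        # Assumption of this model: edits are only accepted in MAINTENANCE.
        if self.status is not HostStatus.MAINTENANCE:
            raise OperationInProgress(
                "Related operation is currently in progress."
            )
        self.cluster = new_cluster
```

In this model a client that waits for the maintenance status before issuing `change_cluster` can no longer race with a pending disconnect, because the status itself only changes once the storage work is done.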

but http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/21/ is still failing. Was your patch merged?

On Sun, Mar 24, 2019 at 10:14 AM Ahmad Khiet <akhiet@redhat.com> wrote:
> patched 4.3!

The patch was merged on 4.3, but test run [1] has build [2], which is one patch before Ahmed's merge...

[1] http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/21/
[2] ovirt-engine-4.3.2.2-0.0.master.20190324105929.git8b0969c.el7.noarch.rpm

On Mon, Mar 25, 2019 at 8:42 AM Dan Kenigsberg <danken@redhat.com> wrote:
but http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/21/ is still failing. Was your patch merged?
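For reference, the commit a test run was built from can be read off the engine rpm name, as in build [2] above. The following is a minimal sketch: the regular expression is inferred only from the single file name quoted in this thread, and `parse_engine_rpm` and `is_built_from` are hypothetical helper names. A prefix match only tells whether the build was cut exactly at a given commit; deciding whether a build already contains an earlier commit would need an ancestry check in a git checkout (for example `git merge-base --is-ancestor`).

```python
import re
from typing import NamedTuple


class EngineBuild(NamedTuple):
    version: str
    timestamp: str
    short_hash: str


# Pattern inferred from one file name in this thread; other builds may
# follow a different naming scheme.
_BUILD_RE = re.compile(
    r"^ovirt-engine-(?P<version>[\d.]+)-.*"
    r"\.(?P<timestamp>\d{14})\.git(?P<short_hash>[0-9a-f]+)\."
)


def parse_engine_rpm(filename: str) -> EngineBuild:
    """Extract version, build timestamp and short commit hash."""
    match = _BUILD_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized engine rpm name: {filename}")
    return EngineBuild(**match.groupdict())


def is_built_from(build: EngineBuild, full_commit: str) -> bool:
    """True only if the build was cut exactly at full_commit."""
    return full_commit.startswith(build.short_hash)


build = parse_engine_rpm(
    "ovirt-engine-4.3.2.2-0.0.master.20190324105929.git8b0969c.el7.noarch.rpm"
)
print(build.short_hash, build.timestamp)
```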

This issue seems to be solved now, please let me know if not.

The build https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/4408/ with ovirt-engine d0a215d862eb819f0bbdd51fed012f9b972c1bdf, which includes Ahmed's commit a236c90d54652503d43d8315582effb74050d22e, succeeded. Also the build https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/4407/ with default repos succeeded.

On Mon, 25 Mar 2019 08:58:25 +0200 Eitan Raviv <eraviv@redhat.com> wrote:
the patch was merged on 4.3 but test run [1] has build [2] which is one patch before Ahmed's merge...
[1] http://jenkins.ovirt.org/job/ovirt-system-tests_network-suite-4.3/21/
[2] ovirt-engine-4.3.2.2-0.0.master.20190324105929.git8b0969c.el7.noarch.rpm

ovirt-engine change queue and ovirt-system-tests check_patch also look good with ovirt-engine rpm including Ahmed's revert.

Thanks

On Mon, Mar 25, 2019 at 9:38 PM Dominik Holler <dholler@redhat.com> wrote:
This issue seems to be solved now, please let me know if not.
The build https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/4408/ with ovirt-engine d0a215d862eb819f0bbdd51fed012f9b972c1bdf which includes Ahmed's commit a236c90d54652503d43d8315582effb74050d22e succeeded. Also the build https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/4407/ with default repos succeeded.
participants (7)
- Ahmad Khiet
- Benny Zlotnik
- Dan Kenigsberg
- Dominik Holler
- Eitan Raviv
- Eyal Edri
- Simone Tiraboschi