On Mon, Mar 6, 2017 at 9:46 AM, Dan Kenigsberg <danken@redhat.com> wrote:
It is fair to suspect an interaction with setupNetworks, but let us
On Mon, Mar 6, 2017 at 10:11 AM, Piotr Kliczewski <pkliczew@redhat.com> wrote:
>
>
> On Mon, Mar 6, 2017 at 8:23 AM, Dan Kenigsberg <danken@redhat.com> wrote:
>>
>> On Sun, Mar 5, 2017 at 9:50 PM, Piotr Kliczewski <pkliczew@redhat.com>
>> wrote:
>> >
>> >
>> > On Sun, Mar 5, 2017 at 8:29 AM, Dan Kenigsberg <danken@redhat.com>
>> > wrote:
>> >>
>> >> Piotr, could you provide more information?
>> >>
>> >> Which setupNetworks action triggers this problem? Any idea which lock
>> >> we used to take, and when we dropped it?
>> >
>> >
>> > I thought that this [1] would make sure that setupNetworks is an exclusive
>> > operation on a host, which seems not to be the case.
>> > In the logs I saw the following message sent:
>> >
>> >
>> > {"jsonrpc":"2.0","method":"Host.setupNetworks","params":{"networks":{"VLAN200_Network":{"vlan":"200","netmask":"255.255.255.0","ipv6autoconf":false,"nic":"eth0","bridged":"false","ipaddr":"192.0.3.1","dhcpv6":false,"mtu":1500,"switch":"legacy"}},"bondings":{},"options":{"connectivityTimeout":120,"connectivityCheck":"true"}},"id":"3f7f74ea-fc39-4815-831b-5e3b1c22131d"}
>> >
>> > A few seconds later there was:
>> >
>> >
>> > {"jsonrpc":"2.0","method":"Host.getAllVmStats","params":{},"id":"67d510eb-6dfc-4f67-97b6-a4e63c670ff2"}
>> >
>> > and still while we were calling pings there was:
>> >
>> >
>> > {"jsonrpc":"2.0","method":"StoragePool.getSpmStatus","params":{"storagepoolID":"8cc227da-70e7-4557-aa01-6d8ddee6f847"},"id":"d4d04c7c-47b8-44db-867b-770e1e19361c"}
>> >
>> > My assumption was that those calls should not happen, and that the calls
>> > themselves or their responses could be corrupted.
>> > What do you think?
>> >
>> > [1]
>> >
>> > https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/network/host/HostSetupNetworksCommand.java#L285
>>
>> I suspect that getVmStats and getSpmStatus simply do not take the
>> hostmonitoring lock, and I don't see anything wrong in that.
>>
>> Note that during 006_migration, we set only a mere migration network,
>> not the management network. This operation should not interfere with
>> Engine-Vdsm communication in any way; I don't yet understand why you
>> suspect that it does.
>
>
> My assumption here is based on seeing this failure twice, both times during
> setupNetworks.
> The pattern is that the call that fails is always one which "should not"
> occur during such an operation.
>
put some substance into it.
What is the mode of failure of the other command?
I am not sure what you mean. Can you please explain?