[ovirt-devel] [ OST Failure Report ] [ master ] [ 03.03.2017 ] [006_migrations host is in Connecting state]

Dan Kenigsberg danken at redhat.com
Mon Mar 6 08:46:47 UTC 2017


On Mon, Mar 6, 2017 at 10:11 AM, Piotr Kliczewski <pkliczew at redhat.com> wrote:
>
>
> On Mon, Mar 6, 2017 at 8:23 AM, Dan Kenigsberg <danken at redhat.com> wrote:
>>
>> On Sun, Mar 5, 2017 at 9:50 PM, Piotr Kliczewski <pkliczew at redhat.com>
>> wrote:
>> >
>> >
>> > On Sun, Mar 5, 2017 at 8:29 AM, Dan Kenigsberg <danken at redhat.com>
>> > wrote:
>> >>
>> >> Piotr, could you provide more information?
>> >>
>> >> Which setupNetworks action triggers this problem? Any idea which lock
>> >> we used to take and when we dropped it?
>> >
>> >
>> > I thought that this [1] would make sure that setupNetworks is an exclusive
>> > operation on a host, which seems not to be the case.
>> > In the logs I saw the following message sent:
>> >
>> >
>> > {"jsonrpc":"2.0","method":"Host.setupNetworks","params":{"networks":{"VLAN200_Network":{"vlan":"200","netmask":"255.255.255.0","ipv6autoconf":false,"nic":"eth0","bridged":"false","ipaddr":"192.0.3.1","dhcpv6":false,"mtu":1500,"switch":"legacy"}},"bondings":{},"options":{"connectivityTimeout":120,"connectivityCheck":"true"}},"id":"3f7f74ea-fc39-4815-831b-5e3b1c22131d"}
>> >
>> > A few seconds later there was:
>> >
>> >
>> > {"jsonrpc":"2.0","method":"Host.getAllVmStats","params":{},"id":"67d510eb-6dfc-4f67-97b6-a4e63c670ff2"}
>> >
>> > and, while we were still calling pings, there was:
>> >
>> >
>> > {"jsonrpc":"2.0","method":"StoragePool.getSpmStatus","params":{"storagepoolID":"8cc227da-70e7-4557-aa01-6d8ddee6f847"},"id":"d4d04c7c-47b8-44db-867b-770e1e19361c"}
>> >
>> > My assumption was that those calls should not happen, and that the calls
>> > themselves or their responses could be corrupted.
>> > What do you think?
>> >
>> > [1]
>> >
>> > https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/network/host/HostSetupNetworksCommand.java#L285
>>
>> I suspect that getVmStats and getSpmStatus simply do not take the
>> hostmonitoring lock, and I don't see anything wrong in that.
>>
>> Note that during 006_migration, we set up only a migration network,
>> not the management network. This operation should not interfere with
>> Engine-Vdsm communication in any way; I don't yet understand why you
>> suspect that it does.
>
>
> My assumption is based on having seen this failure twice, both times during
> setupNetworks.
> The pattern is that the call that fails is always one which "should not"
> occur during such an operation.
>

It is fair to suspect an interaction with setupNetworks, but let us
put some substance into it.
What is the mode of failure of the other command?
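
One way to answer that from the logs is to pair each JSON-RPC request with its response by "id" and see whether the call got an error, got no response at all, or arrived as an unparsable payload. The following is only a rough sketch, assuming the messages can be extracted from the logs one JSON document per line; classify_calls is a hypothetical helper, not part of Engine or Vdsm.

```python
import json


def classify_calls(lines):
    """Pair JSON-RPC 2.0 requests with responses by "id".

    Hypothetical helper: assumes one JSON message per input line.
    Returns (report, unparsable), where report is a list of
    (method, id, outcome) tuples in request order and unparsable
    counts lines that are not valid JSON objects.
    """
    requests = {}   # id -> method name, in order of first appearance
    outcomes = {}   # id -> "ok" or "error"
    unparsable = 0

    for line in lines:
        try:
            msg = json.loads(line)
        except ValueError:
            # A truncated or corrupted payload ends up here.
            unparsable += 1
            continue
        if not isinstance(msg, dict):
            unparsable += 1
            continue

        msg_id = msg.get("id")
        if "method" in msg:
            requests[msg_id] = msg["method"]
        elif "error" in msg:
            outcomes[msg_id] = "error"
        elif "result" in msg:
            outcomes[msg_id] = "ok"

    report = [
        (method, msg_id, outcomes.get(msg_id, "no response"))
        for msg_id, method in requests.items()
    ]
    return report, unparsable
```

A call reported as "no response" while setupNetworks is in flight would point at a lost or timed-out reply, whereas a non-zero unparsable count would support the corruption theory.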

