[ OST Failure Report ] [ oVirt Master (log-collector/engine/vdsm) ] [ 23/25/26-03-2018 ] [004_basic_sanity.hotplug_cpu ]

Milan Zamazal mzamazal at redhat.com
Thu Mar 29 12:34:36 UTC 2018


Milan Zamazal <mzamazal at redhat.com> writes:

> Dafna Ron <dron at redhat.com> writes:
>
>> Can you post the fix that you added on the mail?
>
> I split hotplug_cpu test in https://gerrit.ovirt.org/89551

After some thought, I have a different fix:
https://gerrit.ovirt.org/89589

This one checks whether ssh (and not only the network) is available after
resume; if it isn't, it reboots the VM, just as if the network weren't
available.
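
A rough sketch of the idea, for illustration only. This is not the code
from the patch above: the helper names (ssh_reachable,
ensure_guest_reachable), the retry count and the delay are all made up
here, and a successful TCP connect to port 22 is only a stand-in for a
real ssh login.

```python
import socket
import time


def ssh_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to the guest's SSH port succeeds.

    Hypothetical helper: an open port is only an approximation of
    "ssh is available"; a real check would attempt an actual login.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ensure_guest_reachable(check, reboot, attempts=3, delay=1.0,
                           sleep=time.sleep):
    """Poll check(); if it never succeeds, fall back to reboot().

    check and reboot are caller-supplied callables (assumptions, not
    OST API).  Returns True if a reboot was triggered, False otherwise.
    """
    for attempt in range(attempts):
        if check():
            return False
        # Wait between attempts, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay)
    reboot()
    return True
```

In a test this could be called as
ensure_guest_reachable(lambda: ssh_reachable(vm_ip), reboot_vm), where
vm_ip and reboot_vm are placeholders for whatever the test suite already
uses when the network does not come back.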

> We'll see whether it is actually useful once the disk_operations failure
> is fixed.  If it still causes problems then hotplug_cpu_guest_check
> introduced by the patch should be disabled until we can log into the
> guest OS reliably.
>
>> On Wed, Mar 28, 2018 at 9:23 AM, Milan Zamazal <mzamazal at redhat.com> wrote:
>>
>>> Milan Zamazal <mzamazal at redhat.com> writes:
>>>
>>> > Dafna Ron <dron at redhat.com> writes:
>>> >
>>> >> We have a failure that seems to be random and happening in several
>>> >> projects.
>>> >
>>> > Does this failure occur only recently or has it been present for ages?
>>> >
>>> >> from what I can see, we are failing due to a timing issue in the test
>>> >> itself because we are querying the vm after it's been destroyed in
>>> >> engine.
>>> >> looking at engine, I can see that the vm was actually shut down,
>>> >
>>> > No, the VM was shut down in another test.  It's already running again in
>>> > hotplug_cpu.
>>> >
>>> >> I would like to disable this test until we can fix the issue since it
>>> >> already failed about 7 different patches from different projects.
>>> >
>>> > I can see that Gal has already increased the timeout.  I think the test
>>> > could be split to reduce the waiting delay, I'll post a patch for that.
>>>
>>> BTW I think the primary cause of the trouble is the infamous Cirros
>>> networking recovery problems.
>>>

