[JIRA] (OVIRT-2821) OST: test for host being in up state should
have retries
by Ehud Yonasi (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2821?page=com.atlassian.jir... ]
Ehud Yonasi commented on OVIRT-2821:
------------------------------------
I agree, IMO it sounds like an issue with the API that needs to be checked.
> OST: test for host being in up state should have retries
> --------------------------------------------------------
>
> Key: OVIRT-2821
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2821
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Components: OST
> Reporter: Dusan Fodor
> Assignee: infra
>
> After installation the host first goes up, then falls back into the (iirc) unknown state, and only then goes up for good.
> This confuses multiple tests that require the host to be up (e.g. add storage domain). They wait for at least one host to be up, but once they get past that condition the host falls back to the unknown state, so the following operation, which requires the host to be up, fails. Then it tries fencing, but fencing cannot start because no host is up at that moment, which produces more errors...
> It's the same pattern in all such failures; I will paste the console output and a link to the job next time I see this.
> The resolution would be to wait until the host is REALLY up: e.g. ask if it's up, wait a bit, ask again, and if it's up both times, continue with the operation.
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100114)
[JIRA] (OVIRT-2821) OST: test for host being in up state should
have retries
by Dusan Fodor (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2821?page=com.atlassian.jir... ]
Dusan Fodor commented on OVIRT-2821:
------------------------------------
At the OST meeting we heard valid points against the suggested resolution, since it hides the actual bug (the host doesn’t stay up after first going up) instead of resolving it.
A compromise would probably be to report the bug in Bugzilla against engine and then work around it in the tests, so this failure wouldn’t mask other possible issues.
Thoughts are welcome.
Also note that the bug itself is not trivial: Marcin has already spent significant time trying to work out why the host doesn’t stay up for good.
[JIRA] (OVIRT-2821) OST: test for host being in up state should
have retries
by Dusan Fodor (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2821?page=com.atlassian.jir... ]
Dusan Fodor updated OVIRT-2821:
-------------------------------
Summary: OST: test for host being in up state should have retries (was: OST: test for host being in up state should have a retry)
[JIRA] (OVIRT-2821) OST: test for host being in up state should
have a retry
by Dusan Fodor (oVirt JIRA)
Dusan Fodor created OVIRT-2821:
----------------------------------
Summary: OST: test for host being in up state should have a retry
Key: OVIRT-2821
URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2821
Project: oVirt - virtualization made easy
Issue Type: Bug
Components: OST
Reporter: Dusan Fodor
Assignee: infra
After installation the host first goes up, then falls back into the (iirc) unknown state, and only then goes up for good.
This confuses multiple tests that require the host to be up (e.g. add storage domain). They wait for at least one host to be up, but once they get past that condition the host falls back to the unknown state, so the following operation, which requires the host to be up, fails. Then it tries fencing, but fencing cannot start because no host is up at that moment, which produces more errors...
It's the same pattern in all such failures; I will paste the console output and a link to the job next time I see this.
The resolution would be to wait until the host is REALLY up: e.g. ask if it's up, wait a bit, ask again, and if it's up both times, continue with the operation.
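To make the proposed check concrete, here is a minimal Python sketch of "ask, wait, ask again". It is only an illustration, not OST code: the function name wait_until_stably_up, the get_status callable, the "up" status string and all default values are assumptions. In a real test, get_status would wrap whatever call the suite already uses to read the host status from the engine.

```python
import time
from typing import Callable


def wait_until_stably_up(
    get_status: Callable[[], str],  # hypothetical: returns the host status as a string
    confirmations: int = 2,         # consecutive "up" answers required
    interval: float = 3.0,          # seconds to wait between checks (assumed value)
    timeout: float = 300.0,         # give up after this many seconds (assumed value)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until the host reports "up" on `confirmations` consecutive checks.

    Returns the number of polls that were needed and raises TimeoutError
    if the host never stays up for that many checks in a row.
    """
    deadline = clock() + timeout
    streak = 0
    polls = 0
    while True:
        polls += 1
        if get_status() == "up":
            streak += 1
            if streak >= confirmations:
                return polls
        else:
            # The host left the up state, so start counting again.
            streak = 0
        if clock() >= deadline:
            raise TimeoutError(
                f"host was not up {confirmations} times in a row within {timeout}s"
            )
        sleep(interval)


if __name__ == "__main__":
    # Simulated flapping host: up, back to unknown, then up for good.
    observed = iter(["up", "unknown", "up", "up"])
    polls = wait_until_stably_up(lambda: next(observed), interval=0)
    print(f"host considered stable after {polls} polls")
```

Resetting the counter whenever a non-up status is seen is what keeps the first, short-lived "up" from satisfying the wait. The check only narrows the window in which the host can flap; it does not close it.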