4.2.2.2-1 Starting hosted engine on all hosts

The last three 4.2.2 release candidates that I've tried have been starting the self-hosted engine on all physical hosts at the same time. The latest RC does the same. What logs do you need to investigate the problem?
Regards,
Brett

On Thu, Mar 15, 2018 at 8:50 AM, Maton, Brett <matonb@ltresources.co.uk> wrote:
> Same with the latest RC, what logs do you need to investigate the problem?
/var/log/ovirt-hosted-engine-ha/*
/var/log/sanlock.log
/var/log/vdsm/*
Adding Martin.
Thanks and best regards,
-- Didi
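The log locations requested above could be bundled on each host before sending them to the list. A minimal Python sketch, assuming it runs with enough privileges to read the files; the function name `collect_logs` and the archive path in the usage note are hypothetical and not part of any oVirt tooling.

```python
import glob
import os
import tarfile

# Log locations requested in the reply above.
LOG_PATTERNS = [
    "/var/log/ovirt-hosted-engine-ha/*",
    "/var/log/sanlock.log",
    "/var/log/vdsm/*",
]

def collect_logs(archive_path, patterns=LOG_PATTERNS):
    """Bundle every regular file matching the patterns into a gzip tarball.

    Returns the list of paths that were added, so an empty list shows
    that nothing matched on this host.
    """
    added = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for pattern in patterns:
            for path in sorted(glob.glob(pattern)):
                if os.path.isfile(path):  # skip subdirectories
                    tar.add(path)
                    added.append(path)
    return added
```

For example, `collect_logs("/tmp/hosted-engine-logs.tar.gz")` would be run once per host, and the resulting archives attached to the bug report or the reply.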

On Thu, Mar 15, 2018 at 8:18 AM, Yedidyah Bar David <didi@redhat.com> wrote:
> On Thu, Mar 15, 2018 at 8:50 AM, Maton, Brett <matonb@ltresources.co.uk> wrote:
> > Same with the latest RC, what logs do you need to investigate the problem?
It was this one: https://bugzilla.redhat.com/show_bug.cgi?id=1547479
It got fixed today, but the fix is still not available in RC4.
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

On 15 Mar 2018 6:34 PM, "Simone Tiraboschi" <stirabos@redhat.com> wrote:
> It was this one: https://bugzilla.redhat.com/show_bug.cgi?id=1547479
> It got fixed today but still not available in RC4.
If I understood correctly, this kind of risk is not present in 4.1.x or in 4.2.y, for every x and for y <= 1?

Ok cool, glad you already have enough information, as it's trashed my hosted-engine beyond recovery...
On 15 March 2018 at 17:47, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
> If I understood correctly, this kind of risk is not present in 4.1.x or in 4.2.y, for every x and for y <= 1?

On Mar 15, 2018 9:21 PM, "Maton, Brett" <matonb@ltresources.co.uk> wrote:
> Ok cool, glad you already have enough information as it's trashed my hosted-engine beyond recovery...
Why did it trash it?
Y.

The root (/) partition is now unreadable.
On 16 March 2018 at 10:04, Yaniv Kaul <ykaul@redhat.com> wrote:
> Why did it trash it?

On Fri, Mar 16, 2018 at 11:04 AM, Yaniv Kaul <ykaul@redhat.com> wrote:
> Why did it trash it?
Split brain and concurrent filesystem access...
The bug only occurred in 4.2.2 builds and was never part of an official release, only of development builds. It should be fixed now.
Martin

Yup, no big surprise really :)
On 16 March 2018 at 10:21, Martin Sivak <msivak@redhat.com> wrote:
> Split brain and concurrent filesystem access...

On Thu, Mar 15, 2018 at 6:47 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
> If I understood correctly, this kind of risk is not present in 4.1.x or in 4.2.y, for every x and for y <= 1?
Right. This issue was introduced by https://gerrit.ovirt.org/#/c/86435/, which first shipped in ovirt-hosted-engine-ha-2.2.5, so 4.2.1 is not affected, since it ships ovirt-hosted-engine-ha-2.2.4.
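One way to check whether a host predates the regression is to compare its installed ovirt-hosted-engine-ha version against 2.2.5, the first affected version according to the reply above. A minimal Python sketch: the function names are hypothetical, the rpm query assumes an RPM-based host and a purely numeric dotted version string, and the version that carries the fix is not stated in this thread, so only the lower bound is checked.

```python
import subprocess

# First affected ovirt-hosted-engine-ha version, per the reply above.
FIRST_AFFECTED = (2, 2, 5)

def parse_version(text):
    """Turn a dotted version such as '2.2.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def predates_regression(version):
    """True if the version is older than the first affected one."""
    return parse_version(version) < FIRST_AFFECTED

def installed_version(package="ovirt-hosted-engine-ha"):
    """Ask rpm for the installed package version (assumes an RPM-based host)."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{VERSION}", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a host, `predates_regression(installed_version())` returning False only means the package is 2.2.5 or newer; whether that build already contains the fix has to be checked against the bug report.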
participants (6)
- Gianluca Cecchi
- Martin Sivak
- Maton, Brett
- Simone Tiraboschi
- Yaniv Kaul
- Yedidyah Bar David