Thank you Strahil,

I guess there is the nuclear option... I was trying to avoid a full rebuild since the machine is a functioning gluster replicated node with NIC bonding, etc., which took me a while to get set up properly.  I'm not an expert at cleaning up data center hosts in oVirt either.  Would I rebuild the host with a new name and delete the failed one?

I'll look at the yum option first.  With regard to oVirt Node, if I had more hosts I would use that option, but I don't think I can use that install process on a gluster node... or is there a way to install oVirt Node through yum without breaking gluster?

The problem I'm having seems like a configuration issue... I would like to figure out what is going on, from a learning perspective, and maybe avoid all of the above.

Todd Barton


---- On Sat, 13 Apr 2019 05:03:25 -0400 Strahil <hunter86_bg@yahoo.com> wrote ----

The fastest (Windows-style) approach is to completely wipe the host and do a reinstall -> install vdsm and so on.
You should consider oVirt Node.

Another option that comes to mind is to do:
yum history
yum history rollback <id>
reboot
repeat the upgrade.
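
To find the <id>, something like this (just a sketch; the actual transaction will differ on your host):

yum history list all     # list transactions with IDs and dates
yum history info <id>    # confirm that transaction contains the ovirt/vdsm packages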

For now, I'm planning to take gluster snapshots (with all machines on the volume stopped) before major upgrades, as this is the fastest recovery approach.
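
A rough sketch of that, with placeholder volume/snapshot names (the bricks need to be on thin-provisioned LVM for gluster snapshots to work):

gluster snapshot create pre-upgrade myvol    # snapshot name gets a timestamp suffix by default
gluster snapshot list                        # note the full snapshot name
# to roll back after a bad upgrade, the volume has to be stopped first:
gluster volume stop myvol
gluster snapshot restore <full-snapshot-name>
gluster volume start myvol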

Actually, oVirt Node uses thin LVM snapshots to guarantee fast rollback.
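
If you go that route, you can see those layers on a Node host with something like this (assuming a standard oVirt Node NG install; output differs per host):

nodectl info    # shows the imgbased layers and which one is current
lvs             # the layers appear as thin LVs in the node's volume group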

Best Regards,
Strahil Nikolov

On Apr 13, 2019 06:44, Todd Barton <tcbarton@ipvoicedatasystems.com> wrote:
Looking for some help/suggestions to correct an issue I'm having.  I have a 3-host HA setup running a hosted engine and gluster storage.  The hosts are identical hardware configurations and have been running very solidly for several years.  I was performing an upgrade to 4.1.  The first host went fine.  The second upgrade didn't go well... on server reboot, it went into a kernel panic and I had to load the previous kernel to diagnose.

I couldn't get it out of the panic and had to revert the system to the previous kernel, which was a big PITA. I updated it to current and verified the installation of ovirt/vdsm.  Everything seemed to be OK, but vdsm won't start. Gluster is working fine.  It appears I have an authentication issue with libvirt.  I'm getting the message "libvirt: XML-RPC error : authentication failed: authentication failed", which seems to be the core issue.
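
From what I understand, vdsm talks to libvirt as a SASL user (vdsm@ovirt), so I assume the credential can be checked with something like this (paths are the stock libvirt defaults on EL7, so they may differ):

sasldblistusers2 -f /etc/libvirt/passwd.db    # should list vdsm@ovirt
virsh -c qemu:///system list --all            # prompts for that SASL user/password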

I've looked at all the past reports/resolutions of this issue and tried them, but I can't get it to work.  For example, when I run vdsm-tool configure --force I get this...

Checking configuration status...

abrt is already configured for vdsm
lvm is configured for vdsm
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Current revision of multipath.conf detected, preserving

Running configure...
Reconfiguration of abrt is done.
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 219, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/__init__.py", line 38, in wrapper
    func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 141, in configure
    _configure(c)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 88, in _configure
    getattr(module, 'configure', lambda: None)()
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/passwd.py", line 68, in configure
    configure_passwd()
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/passwd.py", line 98, in configure_passwd
    raise RuntimeError("Set password failed: %s" % (err,))
RuntimeError: Set password failed: ['saslpasswd2: invalid parameter supplied']
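
If I read the traceback right, the failing step is vdsm setting the password for its libvirt SASL user, which I think boils down to something like the following (my reconstruction only; <password> is a placeholder and the exact flags may differ by version):

echo -n '<password>' | saslpasswd2 -p -a libvirt vdsm@ovirt
# the mechanisms and sasldb libvirt uses are defined here:
cat /etc/sasl2/libvirt.conf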

Any help would be greatly appreciated.  I'm not a linux/ovirt expert by any means, but I desperately need to get this setup back to being stable.  This happened many months ago and I gave up on fixing it, but I really need to get this back online again.

Thank you 

Todd Barton