OK, finally got it. I had to get a terminal ready with the virsh command,
guess what the instance number would be, and then run suspend right after
starting with --vm-start-paused. That got it to really stay paused. I got
into the console, booted the old kernel, and have now been repairing a bad
yum transaction. I *think* I've finished that.
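
In case it helps anyone else, here is roughly the sequence that worked, as a
sketch rather than a tested recipe. It assumes the libvirt domain is named
HostedEngine (that is what the --console output shows; virsh also accepts the
numeric ID I was guessing at) and that virsh can authenticate on the host.
The run helper and the DRY_RUN variable are my own scaffolding, not part of
oVirt: by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: keep the engine VM paused long enough to attach a console.
# DRY_RUN and run() are illustrative helpers, not part of oVirt.
DRY_RUN="${DRY_RUN:-1}"
DOMAIN="${DOMAIN:-HostedEngine}"   # assumed domain name; a numeric ID also works

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # dry run: show the command only
    else
        "$@"
    fi
}

start_engine_paused() {
    run hosted-engine --vm-start-paused
    # Something resumed the VM almost immediately for me, so suspend it
    # again right away from virsh.
    run virsh suspend "$DOMAIN"
    # Set a VNC password so a console can be attached while it is paused.
    run hosted-engine --add-console-password
}

resume_engine() {
    # Once VNC is connected, let the VM run and watch the grub menu.
    run virsh resume "$DOMAIN"
}
```

Call start_engine_paused, connect VNC, then call resume_engine. Setting
DRY_RUN=0 makes it actually execute the commands.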
So, if I understand correctly, after the yum update, I should run
engine-setup? Do I run that inside the engine VM, or on the host it's
running on?
BTW: I did look up the upgrade procedures in the documentation for the
release. It links through two or three levels of other documents, then ends
in an error 404.
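
Here is my understanding of the order Darrell described further down the
thread, written out as a checklist. The "host" and "engine" labels are my
assumption about where each command is run (that is the part I'd like
confirmed), and upgrade_plan is just a name I picked. The script only prints
the steps; it executes nothing.

```shell
#!/bin/sh
# Prints the upgrade order as a checklist; it executes nothing.
# The "host"/"engine" labels are an assumption about where each
# command is run, not something confirmed in this thread.
upgrade_plan() {
    cat <<'EOF'
1. host:   hosted-engine --set-maintenance --mode=global
2. engine: yum update
3. engine: engine-setup
4. host:   hosted-engine --set-maintenance --mode=none
5. engine: shutdown -h now
6. host:   hosted-engine --vm-status
EOF
}
```

Running upgrade_plan prints the six steps in order; the last one is only
there to watch for HA bringing the engine back up after the shutdown.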
--Jim
On Mon, Sep 3, 2018 at 6:39 PM, Jim Kusznir <jim(a)palousetech.com> wrote:
Global maintenance mode is already on. hosted-engine --vm-start-paused
results in a non-paused VM being started. Of course, this is executed
after hosted-engine --vm-poweroff, with suitable time left to let things
shut down.
I just ran another test, and did in fact see the engine was briefly
paused, but then was quickly put in the running state. I don't know by
what, though. Global maintenance mode is definitely enabled; every run of
the hosted-engine command reminds me!
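
To catch whatever is un-pausing it, something like the following could log
the state changes right after the start. It is only a sketch: it assumes the
domain is named HostedEngine and uses virsh's read-only mode (-r), which I
believe works on the host without the SASL login. get_state and watch_state
are names I made up.

```shell
#!/bin/sh
# Sketch: log every change in the engine VM's libvirt state.
DOMAIN="${DOMAIN:-HostedEngine}"   # assumed libvirt domain name

get_state() {
    # Read-only connection; prints e.g. "running", "paused", "shut off".
    virsh -r domstate "$DOMAIN"
}

watch_state() {
    samples="$1"      # how many times to poll
    interval="$2"     # seconds to sleep between polls
    last="__none__"   # sentinel so the first sample is always printed
    i=0
    while [ "$i" -lt "$samples" ]; do
        state="$(get_state 2>/dev/null)"
        [ -n "$state" ] || state="unknown"
        if [ "$state" != "$last" ]; then
            echo "$(date '+%H:%M:%S') state: $state"
            last="$state"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, running watch_state 30 1 in a second terminal just before
hosted-engine --vm-start-paused would show when the VM goes from paused to
running.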
On Mon, Sep 3, 2018 at 11:12 AM, Darrell Budic <budic(a)onholyground.com>
wrote:
> Don’t know if there’s anything special, it’s been a while since I’ve
> needed to start it in paused mode. Try putting it in HA maintenance mode
> from the CLI and then start it in paused mode maybe?
>
> ------------------------------
> *From:* Jim Kusznir <jim(a)palousetech.com>
> *Subject:* Re: [ovirt-users] Upgraded host, engine now won't boot
> *Date:* September 3, 2018 at 1:08:27 PM CDT
>
> *To:* Darrell Budic
> *Cc:* users
>
> Unfortunately, I seem unable to get connected to the console early enough
> to actually see a kernel list.
>
> I've tried the hosted-engine --vm-start-paused command, but it just
> starts it (running mode, not paused). By the time I can get VNC connected,
> I have just that last line. ctrl-alt-del doesn't do anything with it,
> either. Sending a reset through virsh seems to just kill the VM (it
> doesn't respawn).
>
> HA seems to have some trouble with this too. Originally I allowed HA to
> start it, and it would take it a good long while before it gave up on the
> engine and reset it. It instantly booted to the same crashed state, and
> again waited a "good long while" (sorry, never timed it, but I know it was
> >5 min).
>
> My current thought is that I need to get the engine started in paused
> mode, connect vnc, then unpause it with virsh to catch what is happening.
> Is there any magic to getting it started in paused mode?
>
> On Mon, Sep 3, 2018 at 11:03 AM, Darrell Budic <budic(a)onholyground.com>
> wrote:
>
>> Send it a ctrl-alt-delete and see what happens. Possibly try an older
>> kernel at the grub boot menu. Could also try stopping it with hosted-engine
>> --vm-shutdown and let HA reboot it, see if it boots or get onto the console
>> quickly and try and watch more of the boot.
>>
>> Ssh and yum upgrade is fine for the OS, although it’s a good idea to
>> enable Global HA Maintenance first so the HA watchdogs don’t reboot it in
>> the middle of that. After that, run “engine-setup” again, at least if there
>> are new ovirt engine updates to be done. Then disable Global HA
>> Maintenance, and run “shutdown -h now” to stop the Engine VM (rebooting
>> seems to cause it to exit anyway, HA seems to run it as a single execution
>> VM. Or at least in the past, it seems to quit anyway on me and shutdown
>> triggered HA faster). Wait a few minutes, and HA will respawn it on a new
>> instance and you can log into your engine again.
>>
>> ------------------------------
>> *From:* Jim Kusznir <jim(a)palousetech.com>
>> *Subject:* Re: [ovirt-users] Upgraded host, engine now won't boot
>> *Date:* September 3, 2018 at 12:45:22 PM CDT
>> *To:* Darrell Budic
>> *Cc:* users
>>
>>
>> Thanks to Jayme who pointed me to the --add-console-password
>> hosted-engine command to set a password for vnc. Using that, I see only
>> the single line:
>>
>> Probing EDD (edd=off to disable)... ok
>>
>> --Jim
>>
>> On Mon, Sep 3, 2018 at 10:26 AM, Jim Kusznir <jim(a)palousetech.com>
>> wrote:
>>
>>> Is there a way to get a graphical console on boot of the engine vm so I
>>> can see what's causing the failure to boot?
>>>
>>> On Mon, Sep 3, 2018 at 10:23 AM, Jim Kusznir <jim(a)palousetech.com>
>>> wrote:
>>>
>>>> Thanks; I guess I didn't mention that I started there.
>>>>
>>>> The virsh list shows it in state running, and gluster is showing fully
>>>> online and healed. However, I cannot bring up a console of the engine VM
>>>> to see why it's not booting, even though it shows in running state.
>>>>
>>>> In any case, the hosts and engine were running happily. I applied the
>>>> latest updates on the host, and the engine went unstable. I thought, OK,
>>>> maybe there's an update to ovirt that also needs to be applied to the
>>>> engine, so I ssh'ed in and ran yum update (never did find clear
>>>> instructions on how one is supposed to maintain the engine, but I did see
>>>> that listed online). A while later, it reset and never booted again.
>>>>
>>>> --Jim
>>>>
>>>> On Sun, Sep 2, 2018 at 4:28 PM, Darrell Budic <budic(a)onholyground.com>
>>>> wrote:
>>>>
>>>>> It’s definitely not starting, you’ll have to see if you can figure
>>>>> out why. A couple things to try:
>>>>>
>>>>> - Check "virsh list" and see if it’s running, or paused for storage.
>>>>> (google "virsh saslpasswd2"
>>>>> <https://www.google.com/search?client=safari&rls=en&q=virsh+saslpa...>
>>>>> if you need to add a user to do this with, it’s per host)
>>>>> - It’s hyper converged, so check your gluster volume for healing
>>>>> and/or split brains and wait/resolve those.
>>>>> - Check “gluster peer status” on each host and make sure your
>>>>> gluster hosts are all talking. I’ve seen an upgrade screw up the
>>>>> firewall; an easy fix is to add a rule to allow the hosts to talk to
>>>>> each other on your gluster network, no questions asked (-j ACCEPT, no
>>>>> port, etc).
>>>>>
>>>>> Good luck!
>>>>>
>>>>> ------------------------------
>>>>> *From:* Jim Kusznir <jim(a)palousetech.com>
>>>>> *Subject:* [ovirt-users] Upgraded host, engine now won't boot
>>>>> *Date:* September 1, 2018 at 8:38:12 PM CDT
>>>>> *To:* users
>>>>>
>>>>> Hello:
>>>>>
>>>>> I saw that there were updates to my ovirt-4.2 3 node hyperconverged
>>>>> system, so I proceeded to apply them the usual way through the UI.
>>>>>
>>>>> At one point, the hosted engine was migrated to one of the upgraded
>>>>> hosts, and then went "unstable" on me. Now, the hosted engine appears to
>>>>> be crashed: It gets powered up, but it never boots up to the point where
>>>>> it responds to pings or allows logins. After a while, the hosted engine
>>>>> shows status (via console "hosted-engine --vm-status" command) "Powering
>>>>> Down". It stays there for a long time.
>>>>>
>>>>> I tried forcing a poweroff then powering it on, but again, it never
>>>>> gets up to where it will respond to pings. --vm-status shows bad health,
>>>>> but up.
>>>>>
>>>>> I tried running the hosted-engine --console command, but got:
>>>>>
>>>>> [root@ovirt1 ~]# hosted-engine --console
>>>>> The engine VM is running on this host
>>>>> Connected to domain HostedEngine
>>>>> Escape character is ^]
>>>>> error: internal error: cannot find character device <null>
>>>>>
>>>>> [root@ovirt1 ~]#
>>>>>
>>>>>
>>>>> I tried to run the hosted-engine --upgrade-appliance command, but it
>>>>> hangs at obtaining certificate (understandably, as the hosted-engine is
>>>>> not up).
>>>>>
>>>>> How do I recover from this? And what caused this?
>>>>>
>>>>> --Jim
>>>>> _______________________________________________
>>>>> Users mailing list -- users(a)ovirt.org
>>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>> oVirt Code of Conduct:
>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>> List Archives:
>>>>> https://lists.ovirt.org/archives/list/users(a)ovirt.org/message/XBNOOF4OA5C5AFGCT3KGUPUTRSOLIPXX/
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>