On Tue, Nov 6, 2018 at 4:16 PM Jarosław Prokopowski <jprokopowski@gmail.com> wrote:
Hi,

It looks like after a host restart my hosted engine VM is no longer accessible.
The storage is glusterfs. The gluster volume is healthy.


The VM status is:
{"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
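For readers scripting against this output, here is a minimal Python sketch that interprets the status line above. It assumes the status is available as a JSON string and that a healthy engine reports "health": "good"; the function name summarize_vm_status is hypothetical and not part of any oVirt tool.

```python
import json


def summarize_vm_status(raw):
    """Turn a hosted-engine status JSON string into a one-line summary."""
    status = json.loads(raw)
    vm_up = status.get("vm") == "up"
    # Assumption: a healthy engine reports "health": "good".
    healthy = status.get("health") == "good"
    if vm_up and healthy:
        return "VM is up and the engine passes its health check"
    if vm_up:
        # The VM process runs on the host, but the engine inside does not answer.
        reason = status.get("reason", "no reason given")
        return "VM is up but the engine is unhealthy: " + reason
    return "VM is not up (detail: %s)" % status.get("detail", "unknown")


RAW = '{"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}'
summary = summarize_vm_status(RAW)
```

Read this way, the status in the report suggests the VM process is running on the host while the engine inside the guest does not respond, which would fit a guest that does not complete its boot.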

hosted-engine --console
The engine VM is running on this host
Connected to domain HostedEngine
Escape character is ^]
error: internal error: cannot find character device <null>

The serial console requires a correctly working systemd instance on the guest side to spawn a getty process on tty connections.
The VNC console seems more reliable and useful for troubleshooting systems that fail during the boot process.
I'd suggest trying:
  hosted-engine --add-console-password
to set a temporary VNC password, and then connecting to the VM with a VNC client.
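Before starting a VNC client, it can help to confirm that something is listening on the VNC port at all. The Python sketch below only tests TCP reachability. The port number 5900 mentioned in the usage note is an assumption (the usual VNC default) and the helper name port_open is hypothetical; use whatever address the command above reports.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, port_open("127.0.0.1", 5900) run on the host itself would show whether a local VNC listener exists. A True result says nothing about the password or about the state of the guest.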


I tried to boot it from CD-ROM by changing the vm.conf file, but I'm not sure the syntax is correct:
1. I removed bootOrder:1 from the index:0 device
devices={index:0,iface:virtio,format:raw,address:{type:pci,slot:0x07,bus:0x00,domain:0x0000,function:0x0},volumeID:8d823e33-4260-4004-a468-cf477d7b1f5b,imageID:41342181-5c8f-4544-878b-44fdaa40dddc,readonly:false,domainID:beb954e7-61b7-4437-bd21-4b268e1a26e5,deviceId:41342181-5c8f-4544-878b-44fdaa40dddc,poolID:00000000-0000-0000-0000-000000000000,device:disk,shared:exclusive,propagateErrors:off,type:disk}

2. In the index:2 device I added bootOrder:1 and the path to the ISO image:
devices={index:2,iface:ide,shared:false,readonly:true,bootOrder:1,deviceId:8c3179ac-b322-4f5c-9449-c52e3665e0ae,address:{controller:0,target:0,unit:0,bus:1,type:drive},device:cdrom,path:/opt/iso/CentOS-7-x86_64-DVD-1804.iso,type:disk}
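One way to sanity-check the shape of such edits is to parse the devices lines and list which devices carry a bootOrder. The Python sketch below is not an official vm.conf parser: it assumes each line looks like devices={key:value,...} with optionally nested {...} values, as in the two lines above, and the names parse_device and boot_devices are hypothetical. DISK_LINE is a shortened copy of the index:0 line with the ID fields left out.

```python
def _parse_braces(text, pos):
    """Parse '{k:v,k:{...}}' starting at text[pos] == '{'; return (dict, next_pos)."""
    result = {}
    pos += 1  # skip the opening '{'
    while True:
        colon = text.index(":", pos)
        key = text[pos:colon]
        pos = colon + 1
        if text[pos] == "{":
            value, pos = _parse_braces(text, pos)  # nested block such as address:{...}
        else:
            end = pos
            while text[end] not in ",}":
                end += 1
            value = text[pos:end]
            pos = end
        result[key] = value
        if text[pos] == "}":
            return result, pos + 1
        pos += 1  # skip the ',' between entries


def parse_device(line):
    """Parse one 'devices={...}' line into a (possibly nested) dict of strings."""
    name, _, body = line.strip().partition("=")
    if name != "devices" or not (body.startswith("{") and body.endswith("}")):
        raise ValueError("not a devices={...} line")
    parsed, end = _parse_braces(body, 0)
    if end != len(body):
        raise ValueError("unexpected text after the closing brace")
    return parsed


def boot_devices(lines):
    """Return sorted (bootOrder, device, path) tuples for devices that set bootOrder."""
    devices = [parse_device(l) for l in lines if l.strip().startswith("devices=")]
    return sorted(
        (int(d["bootOrder"]), d.get("device", ""), d.get("path", ""))
        for d in devices
        if "bootOrder" in d
    )


# Shortened copy of the index:0 disk line (volumeID, imageID and similar fields omitted).
DISK_LINE = ("devices={index:0,iface:virtio,format:raw,address:{type:pci,slot:0x07,"
             "bus:0x00,domain:0x0000,function:0x0},device:disk,shared:exclusive,"
             "propagateErrors:off,type:disk}")
CDROM_LINE = ("devices={index:2,iface:ide,shared:false,readonly:true,bootOrder:1,"
              "deviceId:8c3179ac-b322-4f5c-9449-c52e3665e0ae,"
              "address:{controller:0,target:0,unit:0,bus:1,type:drive},"
              "device:cdrom,path:/opt/iso/CentOS-7-x86_64-DVD-1804.iso,type:disk}")

boot_list = boot_devices([DISK_LINE, CDROM_LINE])
```

A result listing only the cdrom entry with bootOrder 1 means the lines are at least well-formed in this key:value sense. It does not show whether the hosted-engine tooling accepts or keeps the edited file.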


The outcome is the same - no console connection.

Now I also get:

hosted-engine --vm-start
Command VM.getStats with args {'vmID': '1e3aa9cf-8708-40a0-bc86-1127df01047a'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': u'1e3aa9cf-8708-40a0-bc86-1127df01047a'})

Unfortunately I do not have any backup. Is there a way to redeploy the hosted engine and import the current configuration, or any other way to fix it?

My first advice is to try fixing it via a VNC connection.
If that is not feasible, we now have an Ansible procedure that automatically redeploys hosted-engine and restores a backup on the fly, but you still need a backup taken with engine-backup.
The last option is to deploy a new engine instance, shut down all the VMs at guest level, and try importing the other storage domains from the new engine.

 

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/HOD22UUEVQQDCSU5FYJVITIZOT7AQWKQ/