[ovirt-users] ovirt-3.6 : Hosted-engine crashed and can't restart

Alexis HAUSER alexis.hauser at telecom-bretagne.eu
Wed Jul 20 15:01:20 UTC 2016


After assigning an IP address to a VLAN network (it was using DHCP by default) that was on the same NIC as ovirtmgmt, my hosted engine crashed and won't start again. I have no idea how to fix this.
I had a similar issue some months ago, but with a different error. I tried restarting the HA agent, which seems to be linked to this error, and I also restarted the host. I also tried removing the _DIRECT_IO_ lockfile on the engine storage, which fixed my problem last time, but it didn't help.

Any ideas? Do you think manually editing the logical networks on the host and reverting them to how they were before the crash could help?

hosted-engine --vm-status
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 117, in <module>
    if not status_checker.print_status():
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 60, in print_status
    all_host_stats = ha_cli.get_all_host_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 160, in get_all_host_stats
    return self.get_all_stats(self.StatModes.HOST)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
    dom_type=dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
    .format(sd_type, options, e))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': 'e41807e5-ee68-40a2-a642-cc226ba0e82d'}: Request failed: <class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>


vdsClient -s 0 list

16450089-911e-4bad-a8b7-98e84a79ef3a
	Status = Down
	nicModel = rtl8139,pv
	statusTime = 4295559350
	exitMessage = Unable to get volume size for domain e41807e5-ee68-40a2-a642-cc226ba0e82d volume 053df3a6-db18-445a-8f75-61c630ab0003
	emulatedMachine = rhel6.5.0
	pid = 0
	vmName = HostedEngine
	devices = [{'index': '0', 'iface': 'virtio', 'format': 'raw', 'bootOrder': '1', 'address': {'slot': '0x06', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'volumeID': '053df3a6-db18-445a-8f75-61c630ab0003', 'imageID': 'b6daa50d-adad-46a5-8f5f-accfb155a1e1', 'readonly': 'false', 'domainID': 'e41807e5-ee68-40a2-a642-cc226ba0e82d', 'deviceId': 'b6daa50d-adad-46a5-8f5f-accfb155a1e1', 'poolID': '00000000-0000-0000-0000-000000000000', 'device': 'disk', 'shared': 'exclusive', 'propagateErrors': 'off', 'type': 'disk'}, {'nicModel': 'pv', 'macAddr': '00:16:3e:1c:4b:81', 'linkActive': 'true', 'network': 'ovirtmgmt', 'deviceId': '0aeaea2f-a419-43cc-92d7-8422f6aa9223', 'address': 'None', 'device': 'bridge', 'type': 'interface'}, {'index': '2', 'iface': 'ide', 'readonly': 'true', 'deviceId': '8c3179ac-b322-4f5c-9449-c52e3665e0ae', 'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0', 'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type': 'disk'}, {'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller', 'deviceId': '21db0c6e-071c-48ff-b905-95478b37c384', 'address': {'slot': '0x04', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}}, {'device': 'usb', 'type': 'controller', 'deviceId': 'c0384f68-d0c9-4ebb-a779-8dc9911ce2f8', 'address': {'slot': '0x01', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x2'}}, {'device': 'ide', 'type': 'controller', 'deviceId': 'd5a2dd13-138a-482b-9bc3-994b10ec4100', 'address': {'slot': '0x01', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x1'}}, {'device': 'virtio-serial', 'type': 'controller', 'deviceId': '9e695172-c9b0-47df-bc76-8170219dec28', 'address': {'slot': '0x05', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}}]
	guestDiskMapping = {}
	vmType = kvm
	displaySecurePort = -1
	exitReason = 1
	memSize = 6000
	displayPort = -1
	clientIp = 
	spiceSecureChannels = smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
	smp = 4
	displayIp = 0
	display = vnc
	exitCode = 1


systemctl status ovirt-ha-agent.service -l
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-07-20 14:56:22 UTC; 2min 29s ago
 Main PID: 20236 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─20236 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon

Jul 20 14:57:56 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Connecting storage server
Jul 20 14:57:57 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Connecting storage server
Jul 20 14:58:37 rhevserv ovirt-ha-agent[20236]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error: 'Connection to storage server failed' - trying to restart agent
Jul 20 14:58:37 rhevserv ovirt-ha-agent[20236]: ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Error: 'Connection to storage server failed' - trying to restart agent
Jul 20 14:58:42 rhevserv ovirt-ha-agent[20236]: WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt '2'
Jul 20 14:58:43 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Found certificate common name: rhev.mydomain.com
Jul 20 14:58:43 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Initializing VDSM
Jul 20 14:58:43 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Connecting the storage
Jul 20 14:58:43 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Connecting storage server
Jul 20 14:58:44 rhevserv ovirt-ha-agent[20236]: INFO:ovirt_hosted_engine_ha.lib.storage_server.StorageServer:Connecting storage server
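
Since the agent log above reports 'Connection to storage server failed', one quick check is whether the host can still open a TCP connection to the NFS server after the network change. Below is a minimal Python sketch, assuming the NFS server listens on the standard NFS port 2049. The function name `can_connect` and the hostname in the usage comment are placeholders for illustration, not values from this setup.

```python
import socket


def can_connect(host, port=2049, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 2049 is the standard NFS port; adjust it if the server differs.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    sock.close()
    return True


# Usage (hypothetical hostname, replace with the real storage server):
#   print(can_connect("nfs.example.com"))
```

A successful connection only shows that the route to the storage server works; it does not prove that the export can be mounted. A failure, on the other hand, would point at routing or addressing on the host rather than at the HA agent itself.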


