oVirt 4.3 - create windows2012 failed due to ovirt-imageio-proxy
by jingjie.jiang@oracle.com
Hi,
I tried to create a Windows 2012 VM on an NFS data domain, but the disk stayed locked.
I found the following error message:
Connection to ovirt-imageio-proxy service has failed. Make sure the service is installed, configured, and ovirt-engine certificate is registered as a valid CA in the browser.
Is this a known issue?
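A couple of quick checks on the engine machine usually narrow this down. The service name is standard; the port (54323) and the CA download URL below are my assumptions for a default 4.3 setup, so adjust them to your environment:
systemctl status ovirt-imageio-proxy   # is the proxy installed and running?
ss -tlnp | grep 54323                  # assumed default ovirt-imageio-proxy port
# Then import the engine CA into the browser; the usual download URL is:
#   https://<engine-fqdn>/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA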
Thanks,
Jingjie
Change cluster cpu type with hosted engine
by Fabrice SOLER
Hello,
I need to create a Windows 10 virtual machine, but I get an error.
I have a fresh oVirt installation (version 4.2.8) with a hosted engine.
During the hosted engine installation there was no question about the
cluster CPU type; it would be great if a future version asked for it.
To move a host to another cluster, the host needs to be in maintenance
mode, and the hosted engine has to be powered off.
I have created another cluster with the SandyBridge family CPU type, but
to move the hosted engine to this new cluster the engine would have to be
powered off.
Is there someone who can help?
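For what it's worth, the sequence usually suggested for this kind of change relies on global maintenance so the HA agents do not restart the engine VM while you work on it. This is only a rough sketch under that assumption; verify it against the documentation for your exact version before running it:
hosted-engine --set-maintenance --mode=global   # stop HA agents from restarting the engine VM
hosted-engine --vm-shutdown                     # power the engine VM off
# ... move the host / adjust the cluster in the Administration Portal ...
hosted-engine --vm-start                        # start the engine VM again
hosted-engine --set-maintenance --mode=none     # leave global maintenance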
Sincerely,
--
Re: Ovirt 4.3.1 problem with HA agent
by Strahil
>> >> 1.2 All bricks healed (gluster volume heal data info summary) and no split-brain
>> >
>> >
>> >
>> > gluster volume heal data info
>> >
>> > Brick node-msk-gluster203:/opt/gluster/data
>> > Status: Connected
>> > Number of entries: 0
>> >
>> > Brick node-msk-gluster205:/opt/gluster/data
>> > <gfid:18c78043-0943-48f8-a4fe-9b23e2ba3404>
>> > <gfid:b6f7d8e7-1746-471b-a49d-8d824db9fd72>
>> > <gfid:6db6a49e-2be2-4c4e-93cb-d76c32f8e422>
>> > <gfid:e39cb2a8-5698-4fd2-b49c-102e5ea0a008>
>> > <gfid:5fad58f8-4370-46ce-b976-ac22d2f680ee>
>> > <gfid:7d0b4104-6ad6-433f-9142-7843fd260c70>
>> > <gfid:706cd1d9-f4c9-4c89-aa4c-42ca91ab827e>
>> > Status: Connected
>> > Number of entries: 7
>> >
>> > Brick node-msk-gluster201:/opt/gluster/data
>> > <gfid:18c78043-0943-48f8-a4fe-9b23e2ba3404>
>> > <gfid:b6f7d8e7-1746-471b-a49d-8d824db9fd72>
>> > <gfid:6db6a49e-2be2-4c4e-93cb-d76c32f8e422>
>> > <gfid:e39cb2a8-5698-4fd2-b49c-102e5ea0a008>
>> > <gfid:5fad58f8-4370-46ce-b976-ac22d2f680ee>
>> > <gfid:7d0b4104-6ad6-433f-9142-7843fd260c70>
>> > <gfid:706cd1d9-f4c9-4c89-aa4c-42ca91ab827e>
>> > Status: Connected
>> > Number of entries: 7
>> >
>>
>> Data needs healing.
>> Run: cluster volume heal data full
>
> This does not work.
Yeah, that's because my phone autocorrects 'gluster' to 'cluster'.
Usually the gluster daemons detect the need for healing on their own, but with 'gluster volume heal data full && sleep 5 && gluster volume heal data info summary && sleep 5 && gluster volume heal data info summary' you can force the sync and see the result.
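For readability, that is the same sequence written out step by step (identical commands, nothing added):
gluster volume heal data full
sleep 5
gluster volume heal data info summary
sleep 5
gluster volume heal data info summary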
Let's see what happens with DNS.
Best Regards,
Strahil Nikolov
Hosted engine is down and cannot be restarted
by ada per
Hello everyone,
For some strange reason the hosted engine went down and I cannot restart it. I tried restarting it manually without any success; can you please advise?
For all the nodes the engine status is the same as the one below.
--== Host nodex. (id: 6) status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : nodex
Host ID : 6
Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 323a9f45
local_conf_timestamp : 2648874
Host timestamp : 2648874
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=2648874 (Tue Mar 19 12:25:44 2019)
host-id=6
score=3400
vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False
When I try the commands
root@node5# hosted-engine --vm-shutdown
I get the response:
root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'} failed:(code=1, message=Virtual machine does not exist)
But when I run : hosted-engine --vm-start
I get the response: VM exists and is down, cleaning up and restarting
Below you can see the journalctl -u ovirt-ha-agent logs:
Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled monitoring loop exception
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
self._monitoring_loop()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
for old_state, state, delay in self.fsm:
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
new_data = self.refresh(self._state.data)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
stats.update(self.hosted_engine.collect_stats())
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
all_stats = self._broker.get_stats_from_storage()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
result = self._proxy.get_stats()
File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
return self.__send(self.__name, args)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
verbose=self.__verbose
File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
self.send_content(h, request_body)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
connection.endheaders(request_body)
File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
self._send_output(message_body)
File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
self.send(msg)
File "/usr/lib64/python2.7/httplib.py", line 843, in send
self.connect()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
self.sock.connect(base64.b16decode(self.host))
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 2] No such file or directory
Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
return action(he)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
return he.start_monitoring()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
self.publish(stopped)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 337, in publish
self._push_to_storage(blocks)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 708, in _push_to_storage
self._broker.put_stats_on_storage(self.host_id, blocks)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 113, in put_stats_on_storage
self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
return self.__send(self.__name, args)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
verbose=self.__verbose
File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
self.send_content(h, request_body)
File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
connection.endheaders(request_body)
File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
self._send_output(message_body)
File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
self.send(msg)
File "/usr/lib64/python2.7/httplib.py", line 843, in send
self.connect()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
self.sock.connect(base64.b16decode(self.host))
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 2] No such file or directory
Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
Mar 14 12:04:42 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service failed.
Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
Mar 14 12:04:52 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
Mar 14 12:04:52 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
return action(he)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
return he.start_monitoring()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
self._initialize_broker()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
m.get('options', {}))
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
).format(t=type, o=options, e=e)
RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'ping', options: {'addr': '19
Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
Mar 14 12:04:52 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service failed.
Mar 14 12:05:02 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
Mar 14 12:05:02 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
Mar 14 12:05:02 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Co
(code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine VM: Command VM.destroy with args {'vmID': 'a492d2
(code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
Mar 15 14:28:16 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:28:36 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:29:00 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:29:22 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:29:44 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:30:06 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:30:28 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:30:50 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:31:12 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:31:33 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:31:56 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:32:18 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:32:40 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
Mar 15 14:33:02 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
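The repeated "[Errno 2] No such file or directory" in these tracebacks comes from the agent trying to reach ovirt-ha-broker over its unix socket, so the broker (or its socket) is the first thing worth checking on that host. A rough sketch; the socket location under /var/run/ovirt-hosted-engine-ha/ is my assumption for this release, adjust if yours differs:
systemctl status ovirt-ha-broker                    # is the broker actually running?
ls -l /var/run/ovirt-hosted-engine-ha/              # assumed location of broker.socket
systemctl restart ovirt-ha-broker ovirt-ha-agent    # restart the broker first, then the agent
journalctl -u ovirt-ha-broker -u ovirt-ha-agent -f  # watch both services while they come up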
Host unresponsive after upgrade 4.2.8 -> 4.3.2 failed
by Artem Tambovskiy
Hello,
I just started upgrading my small cluster from 4.2.8 to 4.3.2 and ended up in
a situation where one of the hosts is not working after the upgrade.
For some reason vdsmd is not starting up; I have tried to restart it
manually with no luck.
Any ideas on what could be the reason?
[root@ovirt2 log]# systemctl restart vdsmd
A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
[root@ovirt2 log]# journalctl -xe
-- Unit ovirt-ha-agent.service has finished shutting down.
Mar 19 15:47:47 ovirt2.domain.org systemd[1]: Starting Virtual Desktop Server Manager...
-- Subject: Unit vdsmd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit vdsmd.service has begun starting up.
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm: Running mkdirs
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm: Running configure_coredump
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm: Running configure_vdsm_logs
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm: Running wait_for_network
Mar 19 15:47:47 ovirt2.domain.org supervdsmd[56716]: Supervdsm failed to start: 'module' object has no attribute 'Accounting'
Mar 19 15:47:47 ovirt2.domain.org python2[56716]: detected unhandled Python exception in '/usr/share/vdsm/supervdsmd'
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: Duplicate: core backtrace
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: DUP_OF_DIR: /var/tmp/abrt/Python-2019-03-19-14:23:04-17292
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: Deleting problem directory Python-2019-03-19-15:47:47-56716 (dup of Python-2019-03-19-14:23:04-17292
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: Traceback (most recent call last):
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File "/usr/share/vdsm/supervdsmd", line 26, in <module>
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: supervdsm_server.main(sys.argv[1:])
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 294, in main
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: module_name))
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File "/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: __import__(name)
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_api/systemd.py", line 34, in <module>
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: cmdutils.Accounting.CPU,
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: AttributeError: 'module' object has no attribute 'Accounting'
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service: main process exited, code=exited, status=1/FAILURE
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Unit supervdsmd.service entered failed state.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service failed.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service holdoff time over, scheduling restart.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Cannot add dependency job for unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Stopped Auxiliary vdsm service for running helper functions as root.
-- Subject: Unit supervdsmd.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit supervdsmd.service has finished shutting down.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Started Auxiliary vdsm service for running helper functions as root.
-- Subject: Unit supervdsmd.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit supervdsmd.service has finished starting up.
--
-- The start-up result is done.
Mar 19 15:47:50 ovirt2.domain.org supervdsmd[56757]: Supervdsm failed to start: 'module' object has no attribute 'Accounting'
Mar 19 15:47:50 ovirt2.domain.org python2[56757]: detected unhandled Python exception in '/usr/share/vdsm/supervdsmd'
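The AttributeError on cmdutils.Accounting typically means the vdsm sub-packages on that host ended up at mixed versions after the upgrade, so supervdsm imports an old module against newer code. A quick way to check and bring them back in sync (the exact commands are illustrative, not taken from this thread):
rpm -qa | grep -i vdsm | sort    # every vdsm* package should report the same version
yum clean all
yum reinstall 'vdsm*'            # or 'yum update vdsm*' if only some packages lagged behind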
--
Regards,
Artem
Re: Ovirt 4.3.1 problem with HA agent
by Strahil
Hi Alexei,
>> 1.2 All bricks healed (gluster volume heal data info summary) and no split-brain
>
>
>
> gluster volume heal data info
>
> Brick node-msk-gluster203:/opt/gluster/data
> Status: Connected
> Number of entries: 0
>
> Brick node-msk-gluster205:/opt/gluster/data
> <gfid:18c78043-0943-48f8-a4fe-9b23e2ba3404>
> <gfid:b6f7d8e7-1746-471b-a49d-8d824db9fd72>
> <gfid:6db6a49e-2be2-4c4e-93cb-d76c32f8e422>
> <gfid:e39cb2a8-5698-4fd2-b49c-102e5ea0a008>
> <gfid:5fad58f8-4370-46ce-b976-ac22d2f680ee>
> <gfid:7d0b4104-6ad6-433f-9142-7843fd260c70>
> <gfid:706cd1d9-f4c9-4c89-aa4c-42ca91ab827e>
> Status: Connected
> Number of entries: 7
>
> Brick node-msk-gluster201:/opt/gluster/data
> <gfid:18c78043-0943-48f8-a4fe-9b23e2ba3404>
> <gfid:b6f7d8e7-1746-471b-a49d-8d824db9fd72>
> <gfid:6db6a49e-2be2-4c4e-93cb-d76c32f8e422>
> <gfid:e39cb2a8-5698-4fd2-b49c-102e5ea0a008>
> <gfid:5fad58f8-4370-46ce-b976-ac22d2f680ee>
> <gfid:7d0b4104-6ad6-433f-9142-7843fd260c70>
> <gfid:706cd1d9-f4c9-4c89-aa4c-42ca91ab827e>
> Status: Connected
> Number of entries: 7
>
Data needs healing.
Run: cluster volume heal data full
If it still doesn't heal (check in 5 min), go to /rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx_data
and run 'find . -exec stat {} \;' without the quotes.
As I understand it, the oVirt Hosted Engine is running and can be started on all nodes except one.
>>
>> 2. Go to the problematic host and check the mount point is there
>
>
>
> No mount point on problematic node /rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data
> If I create a mount point manually, it is deleted after the node is activated.
>
> Other nodes can mount this volume without problems. Only this node have connection problems after update.
>
> Here is a part of the log at the time of activation of the node:
>
> vdsm log
>
> 2019-03-18 16:46:00,548+0300 INFO (jsonrpc/5) [vds] Setting Hosted Engine HA local maintenance to False (API:1630)
> 2019-03-18 16:46:00,549+0300 INFO (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.setHaMaintenanceMode succeeded in 0.00 seconds (__init__:573)
> 2019-03-18 16:46:00,581+0300 INFO (jsonrpc/7) [vdsm.api] START connectStorageServer(domType=7, spUUID=u'5a5cca91-01f8-01af-0297-00000000025f', conList=[{u'id': u'5799806e-7969-45da-b17d-b47a63e6a8e4', u'connection': u'msk-gluster-facility.xxxx:/data', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '********', u'port': u''}], options=None) from=::ffff:10.77.253.210,56630, flow_id=81524ed, task_id=5f353993-95de-480d-afea-d32dc94fd146 (api:46)
> 2019-03-18 16:46:00,621+0300 INFO (jsonrpc/7) [storage.StorageServer.MountConnection] Creating directory u'/rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data' (storageServer:167)
> 2019-03-18 16:46:00,622+0300 INFO (jsonrpc/7) [storage.fileUtils] Creating directory: /rhev/data-center/mnt/glusterSD/msk-gluster-facility.xxxx:_data mode: None (fileUtils:197)
> 2019-03-18 16:46:00,622+0300 WARN (jsonrpc/7) [storage.StorageServer.MountConnection] gluster server u'msk-gluster-facility.xxxx' is not in bricks ['node-msk-gluster203', 'node-msk-gluster205', 'node-msk-gluster201'], possibly mounting duplicate servers (storageServer:317)
This seems very strange. As you have hidden the hostname, I'm not sure which one this is.
Check that DNS can be resolved from all hosts and that this host's hostname is resolvable.
Also check if it is in the peer list.
Try to manually mount the gluster volume:
mount -t glusterfs msk-gluster-facility.xxxx:/data /mnt
Is this a second FQDN/IP of this server?
If so, gluster accepts that via 'gluster peer probe <IP2>'.
>> 2.1. Check permissions (should be vdsm:kvm) and fix with chown -R if needed
>> 2.2. Check the OVF_STORE from the logs that it exists
>
>
> How can i do this?
Go to /rhev/data-center/mnt/glusterSD/host_engine and use find inside the domain UUID directory for files that are not owned by vdsm:kvm.
I usually run 'chown -R vdsm:kvm 823xx-xxxx-yyyy-zzz' and it will fix any misconfiguration.
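A concrete way to do that check, assuming the usual vdsm:kvm ownership (the mount name and domain UUID below are placeholders for your environment):
cd /rhev/data-center/mnt/glusterSD/<engine-mount>/<storage-domain-uuid>
find . -not \( -user vdsm -group kvm \) -exec ls -ld {} \;   # list anything with wrong ownership
chown -R vdsm:kvm .                                          # fix it in place if something shows up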
Best Regards,
Strahil Nikolov
Live migration failed
by Bong Shau Fui
Hi:
I deployed 2 oVirt hosts and an oVirt engine on a nested KVM server. I have a Windows VM set up and tried to perform a live migration, but it failed. I checked the hosts and found that they meet the live migration requirements, or at least that's what I thought. I took the requirements from the document below.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtuali...
The hosts, both source and destination, are quite empty, with only the hosted engine, one CentOS VM and the Windows VM in the cluster. I can live migrate the CentOS VM successfully. But when I tried to migrate the hosted-engine VM, it failed immediately with the message "No available host to migrate VMs to". When I tried to migrate the Windows VM, the dialog that lets me choose the destination host popped up, but the migration failed after a while.
I'd like to ask where I can get more information about live migration apart from /var/log/ovirt-engine/engine.log. I also checked the hosts' /var/log/vdsm/vdsm.log but found nothing pointing to the reason it failed.
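Besides engine.log, the most useful place for migration failures is usually the libvirt/qemu log on the destination host, since "VM destroyed during the startup" means qemu never came up there. A rough sketch of where to look; the file names follow the usual libvirt layout and the VM name/ID are taken from the log below:
# on the destination host (host3):
less /var/log/libvirt/qemu/Win_2016_1.log                         # qemu's own error for the failed incoming migration
journalctl -u libvirtd --since "2019-03-12 14:35"                 # libvirtd messages around the failure time
grep 5cad5c5f-5aab-46ec-a28e-d484abc0401d /var/log/vdsm/vdsm.log  # follow the VM id through vdsm on both hosts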
Below is the extract from /var/log/ovirt-engine/engine.log from when the live migration took place:
2019-03-12 14:37:58,159+08 INFO [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-131) [] User admin@internal successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2019-03-12 14:37:58,450+08 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-59) [77888830] Lock freed to object 'EngineLock:{exclusiveLocks='[d113be83-2740-4246-a1f2-b9344889c3cf=PROVIDER]', sharedLocks=''}'
2019-03-12 14:38:02,544+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-50) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:38:12,677+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-16) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:38:21,650+08 INFO [org.ovirt.engine.core.bll.aaa.SessionDataContainer] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [] Not removing session 'xDiHqqa6l+g8cngM26TTCfW7NeLN3WgWChsx28wUM391vAngSxwtyCkLbQxZR1AbJ5I+2bkPZNQijMUk0jLZcA==', session has running commands for user 'admin@internal-authz'.
2019-03-12 14:38:22,782+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-49) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:38:33,018+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-74) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:38:43,261+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-59) [77888830] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:38:53,528+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:39:03,759+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-43) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:39:14,011+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-60) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:39:21,660+08 INFO [org.ovirt.engine.core.bll.aaa.SessionDataContainer] (EE-ManagedThreadFactory-engineScheduled-Thread-79) [] Not removing session 'xDiHqqa6l+g8cngM26TTCfW7NeLN3WgWChsx28wUM391vAngSxwtyCkLbQxZR1AbJ5I+2bkPZNQijMUk0jLZcA==', session has running commands for user 'admin@internal-authz'.
2019-03-12 14:39:24,122+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-85) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:39:29,773+08 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] Lock Acquired to object 'EngineLock:{exclusiveLocks='[5cad5c5f-5aab-46ec-a28e-d484abc0401d=VM]', sharedLocks=''}'
2019-03-12 14:39:29,887+08 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] Running command: MigrateVmToServerCommand internal: false. Entities affected : ID: 5cad5c5f-5aab-46ec-a28e-d484abc0401d Type: VMAction group MIGRATE_VM with role type USER
2019-03-12 14:39:30,019+08 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='f9014bc4-485c-4eb0-a9bc-42d13ed68f41', vmId='5cad5c5f-5aab-46ec-a28e-d484abc0401d', srcHost='host2.xxxx.com', dstVdsId='1bc9b9e9-1e90-4570-9930-08416d1927cc', dstHost='host3.xxxx.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='null', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='192.168.138.135'}), log id: 7eeb678c
2019-03-12 14:39:30,022+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] START, MigrateBrokerVDSCommand(HostName = host2.xxxx.com, MigrateVDSCommandParameters:{hostId='f9014bc4-485c-4eb0-a9bc-42d13ed68f41', vmId='5cad5c5f-5aab-46ec-a28e-d484abc0401d', srcHost='host2.xxxx.com', dstVdsId='1bc9b9e9-1e90-4570-9930-08416d1927cc', dstHost='host3.xxxx.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='true', migrateCompressed='false', consoleAddress='null', maxBandwidth='null', enableGuestEvents='true', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='[init=[{name=setDowntime, params=[100]}], stalling=[{limit=1, action={name=setDowntime, params=[150]}}, {limit=2, action={name=setDowntime, params=[200]}}, {limit=3, action={name=setDowntime, params=[300]}}, {limit=4, action={name=setDowntime, params=[400]}}, {limit=6, action={name=setDowntime, params=[500]}}, {limit=-1, action={name=abort, params=[]}}]]', dstQemu='192.168.138.135'}), log id: 5cef4981
2019-03-12 14:39:30,039+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] FINISH, MigrateBrokerVDSCommand, return: , log id: 5cef4981
2019-03-12 14:39:30,048+08 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 7eeb678c
2019-03-12 14:39:30,067+08 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-131) [7f0cf113-55e8-4def-9c68-de3b91d6d641] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: Win_2016_1, Source: host2.xxxx.com, Destination: host3, User: admin@internal-authz).
2019-03-12 14:39:33,901+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] VM '5cad5c5f-5aab-46ec-a28e-d484abc0401d' was reported as Down on VDS '1bc9b9e9-1e90-4570-9930-08416d1927cc'(host3)
2019-03-12 14:39:33,903+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-3) [] START, DestroyVDSCommand(HostName = host3, DestroyVmVDSCommandParameters:{hostId='1bc9b9e9-1e90-4570-9930-08416d1927cc', vmId='5cad5c5f-5aab-46ec-a28e-d484abc0401d', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: c853ba5
2019-03-12 14:39:34,211+08 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-73) [] BaseAsyncTask::onTaskEndSuccess: Task '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-03-12 14:39:34,604+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-3) [] Failed to destroy VM '5cad5c5f-5aab-46ec-a28e-d484abc0401d' because VM does not exist, ignoring
2019-03-12 14:39:34,605+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-3) [] FINISH, DestroyVDSCommand, return: , log id: c853ba5
2019-03-12 14:39:34,605+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] VM '5cad5c5f-5aab-46ec-a28e-d484abc0401d'(Win_2016_1) was unexpectedly detected as 'Down' on VDS '1bc9b9e9-1e90-4570-9930-08416d1927cc'(ohost3) (expected on 'f9014bc4-485c-4eb0-a9bc-42d13ed68f41')
2019-03-12 14:39:34,605+08 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] Migration of VM 'Win_2016_1' to host 'host3' failed: VM destroyed during the startup.
2019-03-12 14:39:34,615+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-10) [] VM '5cad5c5f-5aab-46ec-a28e-d484abc0401d'(Win_2016_1) moved from 'MigratingFrom' --> 'Up'
2019-03-12 14:39:34,615+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-10) [] Adding VM '5cad5c5f-5aab-46ec-a28e-d484abc0401d'(Win_2016_1) to re-run list
2019-03-12 14:39:34,621+08 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-10) [] Rerun VM '5cad5c5f-5aab-46ec-a28e-d484abc0401d'. Called from VDS 'host2.xxxx.com'
2019-03-12 14:39:34,752+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-53959) [] START, MigrateStatusVDSCommand(HostName = host2.xxxx.com, MigrateStatusVDSCommandParameters:{hostId='f9014bc4-485c-4eb0-a9bc-42d13ed68f41', vmId='5cad5c5f-5aab-46ec-a28e-d484abc0401d'}), log id: 7ded4ad7
2019-03-12 14:39:34,760+08 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedThreadFactory-engine-Thread-53959) [] FINISH, MigrateStatusVDSCommand, return: , log id: 7ded4ad7
2019-03-12 14:39:34,786+08 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-53959) [] EVENT_ID: VM_MIGRATION_TO_SERVER_FAILED(120), Migration failed (VM: Win_2016_1, Source: host2.xxxx.com, Destination: host3).
Any help is greatly appreciated.
regards,
Bong SF
Where to find the hooks' print
by zodaoko@gmail.com
Hi there,
I created a before_set_num_of_cpus hook:
# more /usr/libexec/vdsm/hooks/before_set_num_of_cpus/before.py
#!/usr/bin/python
import os
import sys
if os.environ.has_key('never_existed'):
    sys.stderr.write('cantsetcpu: before_cpu_set: cannot set cpu.\n')
    sys.exit(2)
else:
    sys.stdout.write('hook ok.\n')
    sys.exit(0)
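As an aside, my understanding (not confirmed in this thread) is that vdsm logs a hook's stderr to /var/log/vdsm/vdsm.log while its stdout is consumed by the hooking mechanism itself, so a quick way to see whether the hook ran at all is to run it by hand and then search vdsm.log for the hook name:
python /usr/libexec/vdsm/hooks/before_set_num_of_cpus/before.py ; echo "exit=$?"   # run the hook manually
grep -i 'before_set_num_of_cpus\|hook' /var/log/vdsm/vdsm.log | tail               # look for hook-related log lines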
But I cannot find the message "hook ok" in engine.log or vdsm.log. Where can I find it? Thank you very much.
Thank you,
-Zhen
Host affinity hard rule doesn't work
by zodaoko@gmail.com
Hi there,
Here is my setup:
oVirt engine: 4.2.8
1. Create an affinity group as below:
VM affinity rule: positive + enforcing
Host affinity rule: disabled.
VMs: 2 VMs added
Hosts: No host selected.
2. Run the 2 VMs; they end up running on the same host, say host1.
3. Change the affinity group's host affinity:
Host affinity rule: positive + enforcing
Hosts: host2 added.
I expected the 2 VMs to migrate to host2, but that never happens. Is this expected?
Snippet of engine.log:
2019-03-13 07:47:05,747Z INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] Candidate host 'dub-svrfarm24' ('76b13e75-d01b-4dec-9298-1fad72b46525') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'VmAffinityGroups' (correlation id: null)
2019-03-13 07:47:05,747Z DEBUG [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] VM 822b37b7-5da3-453c-b775-d4192c2fdcae is NOT a viable candidate for solving the affinity group violation situation.
2019-03-13 07:47:05,747Z DEBUG [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] No vm to hosts soft-affinity group violation detected
2019-03-13 07:47:05,749Z DEBUG [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] No affinity group collision detected for cluster 8fe88b8c-966c-4c21-839d-e2437cc6b73d. Standing by.
2019-03-13 07:47:05,749Z DEBUG [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] No affinity group collision detected for cluster 3beac2ea-ed04-4f40-9ce3-5a9a67cebd8c. Standing by.
2019-03-13 07:47:05,750Z DEBUG [org.ovirt.engine.core.bll.scheduling.arem.AffinityRulesEnforcer] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] No affinity group collision detected for cluster da32d154-4303-11e9-9607-00163eaab080. Standing by.
Thank you,
-Zhen