
Hi All,

We have an odd setup in our environment: each storage data center has one host and one storage domain. We had an issue with the storage domain attached to a host. After the reboot I am seeing the following from vmrecovery in the vdsm logs, over and over again:

    2023-06-09 21:01:30,419+0000 INFO (periodic/2) [vdsm.api] START repoStats(domains=()) from=internal, task_id=40f5b198-cb82-4ba2-8c20-b8cee34a7f47 (api:48)
    2023-06-09 21:01:30,420+0000 INFO (periodic/2) [vdsm.api] FINISH repoStats return={} from=internal, task_id=40f5b198-cb82-4ba2-8c20-b8cee34a7f47 (api:54)
    2023-06-09 21:01:30,810+0000 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=74b1a1cf-fab1-4918-b0da-b3fd152d9d1a (api:48)
    2023-06-09 21:01:30,811+0000 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=74b1a1cf-fab1-4918-b0da-b3fd152d9d1a (api:54)
    2023-06-09 21:01:30,811+0000 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:723)

I've also checked the firewall and it is still disabled.
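In case it helps with diagnosis, the empty pool list can also be queried from the host itself with vdsm-client. A minimal check, assuming the two verbs below exist on this vdsm version (they should mirror the getConnectedStoragePoolsList / getStorageDomainsList API calls visible in the log above):

    # what vdsm thinks is connected right now (assumed vdsm-client verbs)
    vdsm-client Host getConnectedStoragePools
    vdsm-client Host getStorageDomains

If those also come back empty, it matches what the recovery thread keeps polling for.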
Both libvirtd and vdsmd show as active:

    systemctl status libvirtd
    ● libvirtd.service - Virtualization daemon
       Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
      Drop-In: /etc/systemd/system/libvirtd.service.d
               └─unlimited-core.conf
       Active: active (running) since Fri 2023-06-09 20:51:11 UTC; 16min ago
         Docs: man:libvirtd(8)
               https://libvirt.org
     Main PID: 4984 (libvirtd)
        Tasks: 17 (limit: 32768)
       Memory: 39.7M
       CGroup: /system.slice/libvirtd.service
               └─4984 /usr/sbin/libvirtd --listen

    Jun 09 20:51:11 hlkvm01 systemd[1]: Starting Virtualization daemon...
    Jun 09 20:51:11 hlkvm01 systemd[1]: Started Virtualization daemon.

    ● vdsmd.service - Virtual Desktop Server Manager
       Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
       Active: active (running) since Fri 2023-06-09 20:53:11 UTC; 14min ago
      Process: 10496 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
     Main PID: 10587 (vdsmd)
        Tasks: 39
       Memory: 79.5M
       CGroup: /system.slice/vdsmd.service
               └─10587 /usr/bin/python2 /usr/share/vdsm/vdsmd

The vdsmd journal is full of "Not ready yet" warnings, one per VM, all reporting status 'Down' with 'VM terminated with error':

    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|596001c3-33e7-44a4-bdf9-0b53ab1dd810' args={'596001c3-33e7-44a4-bdf9-0b53ab1dd810': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-2283890943663580625', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '596001c3-33e7-44a4-bdf9-0b53ab1dd810', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893978', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|87155499-1e10-4228-aa69-7c487007746e' args={'87155499-1e10-4228-aa69-7c487007746e': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-5453960159391982695', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '87155499-1e10-4228-aa69-7c487007746e', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893973', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|0ec7a66d-fac2-4a4a-a939-e05fc7b097b7' args={'0ec7a66d-fac2-4a4a-a939-e05fc7b097b7': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-1793949836195780752', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '0ec7a66d-fac2-4a4a-a939-e05fc7b097b7', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893976', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|9c8802c3-c7c9-473c-bbfb-abb0bd0f8fdb' args={'9c8802c3-c7c9-473c-bbfb-abb0bd0f8fdb': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-1144924804541449415', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '9c8802c3-c7c9-473c-bbfb-abb0bd0f8fdb', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893971', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|f799c326-9969-4892-8d67-3b1229baf0ef' args={'f799c326-9969-4892-8d67-3b1229baf0ef': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '5564598485369155833', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': 'f799c326-9969-4892-8d67-3b1229baf0ef', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893980', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|e9311d9f-d770-458b-b5ad-cdc2eb35f1bd' args={'e9311d9f-d770-458b-b5ad-cdc2eb35f1bd': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-5622951617346770490', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': 'e9311d9f-d770-458b-b5ad-cdc2eb35f1bd', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893972', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|1fc4ddad-203f-4cdf-9cb3-c3d66fb97c87' args={'1fc4ddad-203f-4cdf-9cb3-c3d66fb97c87': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-1397731328049024241', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '1fc4ddad-203f-4cdf-9cb3-c3d66fb97c87', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893981', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|321183ed-b0a6-42c7-bbee-2ad46a5f37ae' args={'321183ed-b0a6-42c7-bbee-2ad46a5f37ae': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '4398712824561987912', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '321183ed-b0a6-42c7-bbee-2ad46a5f37ae', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893970', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|731a11e8-62ba-4639-bdee-8c44b5790d82' args={'731a11e8-62ba-4639-bdee-8c44b5790d82': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-1278467655696539707', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '731a11e8-62ba-4639-bdee-8c44b5790d82', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893977', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
    Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|411a97e6-41c7-473e-819b-04aa10bc2bf0' args={'411a97e6-41c7-473e-819b-04aa10bc2bf0': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-11964682092647781', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '411a97e6-41c7-473e-819b-04aa10bc2bf0', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893975', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}
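Since every one of those events reports the VM as Down, I can also cross-check what the host actually sees. A quick sanity check (virsh in read-only mode works even while vdsm owns the libvirt connection; Host getVMList is assumed to be available via vdsm-client on this version):

    # read-only libvirt query: what libvirt actually has defined/running
    virsh -r list --all

    # which VMs vdsm itself is tracking (assumed vdsm-client verb)
    vdsm-client Host getVMList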
This has been going on for hours. On the management VM I am seeing the following over and over again:

    2023-06-09 13:59:25,129-07 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-5) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM hlkvm01 command Get Host Capabilities failed: Message timeout which can be caused by communication issues
    2023-06-09 13:59:25,129-07 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-5) [] Unable to RefreshCapabilities: VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues

I am able to restart the host from the management VM (web console). When I try to put the host into maintenance mode I get:

    Error while executing action. Cannot switch Host to Maintenance mode. Host still has running VMs on it and is in Non Responsive state.

If I try "Confirm host has been rebooted", I get an error saying that another power management action is already in progress.

Can someone please help me out here? Is there a way to set the status of all of the VMs to Down? And is there anything I can do to get the storage domain back up?

Thanks
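P.S. Since the engine errors point at a communication timeout rather than storage directly, here is what I can run on the host to narrow that down. This is just a sketch: 54321 is the default vdsm management port, and Host getCapabilities is assumed to be the vdsm-client equivalent of the "Get Host Capabilities" call the engine is timing out on:

    # is vdsm listening on its management port?
    ss -tlnp | grep 54321

    # ask vdsm locally for the same data the engine is requesting
    vdsm-client Host getCapabilities

If the local call hangs as well, that would suggest vdsm itself is stuck (perhaps blocked on the missing storage pool) rather than a network problem between the engine and the host.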