Hi All,
We have an odd setup in our environment: each storage data center has one host and one
storage domain. We had an issue with the storage domain attached to one of the hosts, and
after rebooting the host I am seeing the vmrecovery messages below repeating over and over
in the vdsm log:
2023-06-09 21:01:30,419+0000 INFO (periodic/2) [vdsm.api] START repoStats(domains=()) from=internal, task_id=40f5b198-cb82-4ba2-8c20-b8cee34a7f47 (api:48)
2023-06-09 21:01:30,420+0000 INFO (periodic/2) [vdsm.api] FINISH repoStats return={} from=internal, task_id=40f5b198-cb82-4ba2-8c20-b8cee34a7f47 (api:54)
2023-06-09 21:01:30,810+0000 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=74b1a1cf-fab1-4918-b0da-b3fd152d9d1a (api:48)
2023-06-09 21:01:30,811+0000 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=74b1a1cf-fab1-4918-b0da-b3fd152d9d1a (api:54)
2023-06-09 21:01:30,811+0000 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:723)
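In case it helps, this is how I have been checking the pool/domain state directly on the
host. I am assuming vdsm-client is available on this vdsm version; on older python2-era
vdsm like this host, the vdsClient equivalents below should work instead:

# ask vdsm which storage pools it believes it is connected to
vdsm-client Host getConnectedStoragePools
# and which storage domains it can see
vdsm-client Host getStorageDomains

# equivalents on older vdsm
vdsClient -s 0 getConnectedStoragePoolsList
vdsClient -s 0 getStorageDomainsList

If these also come back with an empty pool list, that matches the {'poollist': []} in the
log above.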
I've also checked the firewall (it is still disabled), and both libvirtd and vdsmd are up:
systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/libvirtd.service.d
└─unlimited-core.conf
Active: active (running) since Fri 2023-06-09 20:51:11 UTC; 16min ago
Docs: man:libvirtd(8)
https://libvirt.org
Main PID: 4984 (libvirtd)
Tasks: 17 (limit: 32768)
Memory: 39.7M
CGroup: /system.slice/libvirtd.service
└─4984 /usr/sbin/libvirtd --listen
Jun 09 20:51:11 hlkvm01 systemd[1]: Starting Virtualization daemon...
Jun 09 20:51:11 hlkvm01 systemd[1]: Started Virtualization daemon.
systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2023-06-09 20:53:11 UTC; 14min ago
  Process: 10496 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
Main PID: 10587 (vdsmd)
Tasks: 39
Memory: 79.5M
CGroup: /system.slice/vdsmd.service
└─10587 /usr/bin/python2 /usr/share/vdsm/vdsmd
Jun 09 20:53:14 hlkvm01 vdsm[10587]: WARN Not ready yet, ignoring event '|virt|VM_status|596001c3-33e7-44a4-bdf9-0b53ab1dd810' args={'596001c3-33e7-44a4-bdf9-0b53ab1dd810': {'status': 'Down', 'displayInfo': [{'tlsPort': '-1', 'ipAddress': '0', 'type': 'vnc', 'port': '-1'}], 'hash': '-2283890943663580625', 'exitMessage': 'VM terminated with error', 'cpuUser': '0.00', 'monitorResponse': '0', 'vmId': '596001c3-33e7-44a4-bdf9-0b53ab1dd810', 'exitReason': 1, 'cpuUsage': '0.00', 'elapsedTime': '893978', 'cpuSys': '0.00', 'timeOffset': '0', 'clientIp': '', 'exitCode': 1}}

The same "WARN Not ready yet, ignoring event" line is logged at 20:53:14 for nine more VMs,
all with status 'Down', exitReason 1, exitCode 1, and exitMessage 'VM terminated with
error':

87155499-1e10-4228-aa69-7c487007746e
0ec7a66d-fac2-4a4a-a939-e05fc7b097b7
9c8802c3-c7c9-473c-bbfb-abb0bd0f8fdb
f799c326-9969-4892-8d67-3b1229baf0ef
e9311d9f-d770-458b-b5ad-cdc2eb35f1bd
1fc4ddad-203f-4cdf-9cb3-c3d66fb97c87
321183ed-b0a6-42c7-bbee-2ad46a5f37ae
731a11e8-62ba-4639-bdee-8c44b5790d82
411a97e6-41c7-473e-819b-04aa10bc2bf0
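Since vdsm is just waiting for the storage pool, I want to rule out the storage layer
itself. These are the host-side checks I know of, depending on the domain type (a rough
list, not specific to any one setup):

# NFS/gluster domains: is the domain still mounted under /rhev/data-center/mnt?
mount | grep rhev
# iSCSI domains: are the sessions still logged in?
iscsiadm -m session
# FC/iSCSI block domains: does multipath still see the LUNs?
multipath -ll
# is sanlock holding, or failing to acquire, leases on the domain?
sanlock client status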
The vmrecovery loop has been going on for hours. On the management VM I am seeing the
following repeated over and over:
2023-06-09 13:59:25,129-07 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-5) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM hlkvm01 command Get Host Capabilities failed: Message timeout which can be caused by communication issues
2023-06-09 13:59:25,129-07 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-5) [] Unable to RefreshCapabilities: VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
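Because the engine keeps reporting message timeouts, I can at least confirm basic
reachability of vdsm from the engine VM (vdsm listens on TCP 54321 by default):

# basic TCP reachability of the vdsm port
nc -zv hlkvm01 54321
# TLS handshake against vdsm; this should print the host certificate
openssl s_client -connect hlkvm01:54321 </dev/null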
I am able to restart the host from the management VM (web console). When I try to put the
host into maintenance mode I get: "Error while executing action. Cannot switch Host to
Maintenance mode. Host still has running VMs on it and is in Non Responsive state." If I
try "Confirm 'Host has been Rebooted'" I get an error saying that another power management
action is already in progress. Can someone please help me out here? Is there a way to
force the state of all of these VMs to Down? And is there anything I can do to get the
storage domain back up?
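For forcing the VMs to Down, the only approach I have found is older list posts that
suggest marking them down directly in the engine database. Below is my sketch of that
workaround (table and column names as I understand the engine schema; 0 is the Down value
in the engine's VMStatus enum). I would stop ovirt-engine and take a backup first, and I
would really appreciate confirmation before running anything like this:

# on the engine VM: back up the engine and its database first
engine-backup --mode=backup --file=engine.backup --log=engine-backup.log
systemctl stop ovirt-engine

# then in psql as the postgres user (su - postgres; psql engine):
-- mark every VM the engine thinks is running on hlkvm01 as Down
UPDATE vm_dynamic
   SET status = 0, run_on_vds = NULL
 WHERE run_on_vds = (SELECT vds_id FROM vds_static WHERE vds_name = 'hlkvm01');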
Thanks