Restarted vdsmd on host:
mom.log:
2020-08-06 07:21:19,053 - mom.GuestManager - INFO - Guest Manager ending
2020-08-06 07:21:20,483 - mom.HostMonitor - INFO - Host Monitor ending
2020-08-06 07:21:24,795 - mom - INFO - MOM starting
2020-08-06 07:21:24,833 - mom - INFO - hypervisor interface
vdsmjsonrpcclient
2020-08-06 07:21:24,833 - mom.HostMonitor - INFO - Host Monitor starting
2020-08-06 07:21:24,880 - mom - ERROR - Failed to initialize MOM threads
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 29, in run
hypervisor_iface = self.get_hypervisor_interface()
File "/usr/lib/python2.7/site-packages/mom/__init__.py", line 217, in
get_hypervisor_interface
return module.instance(self.config)
File
"/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcclientInterface.py",
line 96, in instance
return JsonRpcVdsmClientInterface()
File
"/usr/lib/python2.7/site-packages/mom/HypervisorInterfaces/vdsmjsonrpcclientInterface.py",
line 31, in __init__
self._vdsm_api = client.connect(host="localhost")
File "/usr/lib/python2.7/site-packages/vdsm/client.py", line 157, in
connect
raise ConnectionError(host, port, use_tls, timeout, e)
ConnectionError: Connection to localhost:54321 with use_tls=True,
timeout=60 failed: [Errno 111] Connection refused
2020-08-06 07:21:30,085 - mom - INFO - MOM starting
2020-08-06 07:21:30,122 - mom.HostMonitor - INFO - Host Monitor starting
2020-08-06 07:21:30,123 - mom - INFO - hypervisor interface
vdsmjsonrpcclient
2020-08-06 07:21:30,217 - mom.HostMonitor - INFO - HostMonitor is ready
2020-08-06 07:21:30,221 - mom.GuestManager - INFO - Guest Manager starting:
multi-thread
2020-08-06 07:21:30,226 - mom.Policy - INFO - Loaded policy '00-defines'
2020-08-06 07:21:30,228 - mom.Policy - INFO - Loaded policy '01-parameters'
2020-08-06 07:21:30,241 - mom.Policy - INFO - Loaded policy '02-balloon'
2020-08-06 07:21:30,263 - mom.Policy - INFO - Loaded policy '03-ksm'
2020-08-06 07:21:30,290 - mom.Policy - INFO - Loaded policy '04-cputune'
2020-08-06 07:21:30,321 - mom.Policy - INFO - Loaded policy '05-iotune'
2020-08-06 07:21:30,321 - mom.PolicyEngine - INFO - Policy Engine starting
2020-08-06 07:21:30,322 - mom.RPCServer - INFO - Using unix socket
/var/run/vdsm/mom-vdsm.sock
2020-08-06 07:21:30,323 - mom.RPCServer - INFO - RPC Server starting
2020-08-06 07:21:40,692 - mom.RPCServer - INFO - ping()
2020-08-06 07:21:40,692 - mom.RPCServer - INFO - getStatistics()
2020-08-06 07:21:45,356 - mom.Controllers.KSM - INFO - Updating KSM
configuration: pages_to_scan:0 merge_across_nodes:1 run:0 sleep_millisecs:0
2020-08-06 07:21:55,838 - mom.RPCServer - INFO - ping()
2020-08-06 07:21:55,839 - mom.RPCServer - INFO - getStatistics()
supervdsm.log:
MainProcess|jsonrpc/3::DEBUG::2020-08-05
20:11:14,139::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper)
call ksmTune with ({u'run': 0, u'merge_across_nodes': 1},) {}
MainProcess|jsonrpc/3::DEBUG::2020-08-05
20:11:14,139::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper)
return ksmTune with None
MainProcess::DEBUG::2020-08-06
07:21:25,279::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper)
call multipath_status with (<vdsm.supervdsm_server._SuperVdsm object at
0x7fcbf8dd0450>,) {}
MainProcess::DEBUG::2020-08-06
07:21:25,279::logutils::319::root::(_report_stats) ThreadedHandler is ok in
the last 40234 seconds (max pending: 3)
MainProcess::DEBUG::2020-08-06
07:21:25,279::commands::198::storage.Misc.excCmd::(execCmd)
/usr/bin/taskset --cpu-list 0-95 /usr/sbin/dmsetup status --target
multipath (cwd None)
MainProcess::DEBUG::2020-08-06
07:21:25,283::commands::219::storage.Misc.excCmd::(execCmd) SUCCESS: <err>
= ''; <rc> = 0
MainProcess::DEBUG::2020-08-06
07:21:25,289::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper)
return multipath_status with {u'T1_58886_2121': [PathStatus(name=u'sdd',
status=u'A'), PathStatus(name=u'sdm', status=u'A')],
u'T0_someserver_boot_58886_20c2': [PathStatus(name=u'sdi',
status=u'A'),
PathStatus(name=u'sdr', status=u'A')],
u'T0_R4_UCS_MOB1P_DIGIT_58886_20c8':
[PathStatus(name=u'sdg', status=u'A'), PathStatus(name=u'sdp',
status=u'A')], u'T0_58886_215d': [PathStatus(name=u'sdb',
status=u'A'),
PathStatus(name=u'sdk', status=u'A')],
u'T0_R4_UCS_MOB1P_DIGIT_58886_20c7':
[PathStatus(name=u'sdf', status=u'A'), PathStatus(name=u'sdo',
status=u'A')], u'T0_58886_20b8': [PathStatus(name=u'sde',
status=u'A'),
PathStatus(name=u'sdn', status=u'A')], u'T0_58886_208a':
[PathStatus(name=u'sdc', status=u'A'), PathStatus(name=u'sdl',
status=u'A')], u'T0_58886_2124': [PathStatus(name=u'sdh',
status=u'A'),
PathStatus(name=u'sdq', status=u'A')], u'T0_58886_215c':
[PathStatus(name=u'sda', status=u'A'), PathStatus(name=u'sdj',
status=u'A')]}
MainProcess|hsm/init::DEBUG::2020-08-06
07:21:25,383::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper)
call hbaRescan with (<vdsm.supervdsm_server._SuperVdsm object at
0x7fcbf8dd0450>,) {}
MainProcess|hsm/init::DEBUG::2020-08-06
07:21:25,384::commands::198::storage.HBA::(execCmd) /usr/bin/taskset
--cpu-list 0-95 /usr/libexec/vdsm/fc-scan (cwd None)
MainProcess|hsm/init::DEBUG::2020-08-06
07:21:25,450::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper)
return hbaRescan with None
MainProcess|jsonrpc/3::DEBUG::2020-08-06
07:21:45,359::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper)
call ksmTune with ({u'run': 0, u'merge_across_nodes': 1},) {}
MainProcess|jsonrpc/3::DEBUG::2020-08-06
07:21:45,359::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper)
return ksmTune with None
vdsm.log
2020-08-06 07:21:22,529+0200 INFO (MainThread) [vds] Received signal 15,
shutting down (vdsmd:71)
[130/49579]
2020-08-06 07:21:22,672+0200 INFO (MainThread) [jsonrpc.JsonRpcServer]
Stopping JsonRPC Server (__init__:442)
2020-08-06 07:21:22,674+0200 INFO (MainThread) [vds] Stopping http server
(http:80)
2020-08-06 07:21:22,674+0200 INFO (http) [vds] Server stopped (http:70)
2020-08-06 07:21:22,674+0200 INFO (MainThread) [root] Unregistering all
secrets (secret:93)
2020-08-06 07:21:22,674+0200 INFO (MainThread) [vds] Stopping QEMU-GA
poller (qemuguestagent:158)
2020-08-06 07:21:22,675+0200 INFO (MainThread) [vdsm.api] START
prepareForShutdown(options=None) from=internal,
task_id=251a8da8-ef3c-4740-842a-42bca0541551 (api:48)
2020-08-06 07:21:22,732+0200 INFO (MainThread) [storage.Monitor] Shutting
down domain monitors (monitor:223)
2020-08-06 07:21:22,732+0200 INFO (MainThread) [storage.check] Stopping
check service (check:104)
2020-08-06 07:21:22,732+0200 INFO (check/loop) [storage.asyncevent]
Stopping <EventLoop running=False closed=False at 0x140458524164304>
(asyncevent:220)
2020-08-06 07:21:22,732+0200 INFO (MainThread) [storage.udev] Stopping
multipath event listener (udev:149)
2020-08-06 07:21:22,733+0200 INFO (MainThread) [vdsm.api] FINISH
prepareForShutdown return=None from=internal,
task_id=251a8da8-ef3c-4740-842a-42bca0541551 (api:54)
2020-08-06 07:21:22,733+0200 INFO (MainThread) [vds] Stopping threads
(vdsmd:163)
2020-08-06 07:21:22,733+0200 INFO (MainThread) [vds] Exiting (vdsmd:174)
2020-08-06 07:21:25,134+0200 INFO (MainThread) [vds] (PID: 217998) I am
the actual vdsm 4.30.43.1 someserver (3.10.0-1127.el7.x86_64) (vdsmd:152)
2020-08-06 07:21:25,135+0200 INFO (MainThread) [vds] VDSM will run with
cpu affinity: frozenset([1]) (vdsmd:264)
2020-08-06 07:21:25,138+0200 INFO (MainThread) [storage.HSM] START HSM
init (hsm:367)
2020-08-06 07:21:25,140+0200 INFO (MainThread) [storage.HSM] Creating
data-center mount directory '/rhev/data-center/mnt' (hsm:372)
2020-08-06 07:21:25,140+0200 INFO (MainThread) [storage.fileUtils]
Creating directory: /rhev/data-center/mnt mode: None (fileUtils:199)
2020-08-06 07:21:25,269+0200 INFO (MainThread) [storage.udev] Registering
multipath event monitor <vdsm.storage.mpathhealth.Monitor object at
0x7fa060286d50> (udev:182)
2020-08-06 07:21:25,269+0200 INFO (MainThread) [storage.udev] Starting
multipath event listener (udev:116)
2020-08-06 07:21:25,289+0200 INFO (MainThread) [storage.check] Starting
check service (check:91)
2020-08-06 07:21:25,293+0200 INFO (MainThread) [storage.Dispatcher]
Starting StorageDispatcher... (dispatcher:47)
2020-08-06 07:21:25,294+0200 INFO (check/loop) [storage.asyncevent]
Starting <EventLoop running=True closed=False at 0x140326784999632>
(asyncevent:125)
2020-08-06 07:21:25,316+0200 INFO (MainThread) [vdsm.api] START
registerDomainStateChangeCallback(callbackFunc=<functools.partial object at
0x7fa06026a628>) from=internal, task_id=c31419c5-486f-4bdd-b206-207e61
a64825 (api:48)
2020-08-06 07:21:25,316+0200 INFO (MainThread) [vdsm.api] FINISH
registerDomainStateChangeCallback return=None from=internal,
task_id=c31419c5-486f-4bdd-b206-207e61a64825 (api:54)
2020-08-06 07:21:25,316+0200 INFO (MainThread) [MOM] Preparing MOM
interface (momIF:45)
2020-08-06 07:21:25,316+0200 INFO (MainThread) [MOM] MOM: Using named unix
socket: /var/run/vdsm/mom-vdsm.sock (momIF:55)
2020-08-06 07:21:25,317+0200 INFO (MainThread) [root] Unregistering all
secrets (secret:93)
2020-08-06 07:21:25,330+0200 INFO (MainThread) [vds] Setting channels'
timeout to 30 seconds. (vmchannels:224)
2020-08-06 07:21:25,331+0200 INFO (MainThread) [vds] Starting QEMU-GA
poller (qemuguestagent:151)
2020-08-06 07:21:25,333+0200 INFO (MainThread) [vds.MultiProtocolAcceptor]
Listening at :::54321 (protocoldetector:183)
2020-08-06 07:21:25,402+0200 INFO (http) [vds] Server running (http:59)
2020-08-06 07:21:25,471+0200 INFO (vmrecovery) [vds] VM recovered:
4c044f9e-afdc-4ee7-bfa4-4b2b94115818 (recovery:97)
2020-08-06 07:21:25,472+0200 INFO (vmrecovery) [vds] recovery [1:1/3]:
recovered domain 4c044f9e-afdc-4ee7-bfa4-4b2b94115818 (recovery:123)
2020-08-06 07:21:25,480+0200 INFO (hsm/init) [storage.LVM] Overriding
read_only mode current=True override=False (lvm:402)
2020-08-06 07:21:25,507+0200 INFO (vm/4c044f9e) [virt.vm]
(vmId='4c044f9e-afdc-4ee7-bfa4-4b2b94115818') VM wrapper has started
(vm:2786)
2020-08-06 07:21:25,510+0200 INFO (vm/4c044f9e) [vdsm.api] START
getVolumeSize(sdUUID='531b2674-0bfa-4d71-bc58-85e280e867b9',
spUUID='a5f4c7eb-3640-49f9-b644-57f10e9f5279',
imgUUID='c7bab10b-9721-4713-84d4-5718
0f8ce2f4', volUUID='1ed27351-77f9-4b69-acbe-a9bc35ed147a', options=None)
from=internal, task_id=22cb50a2-fba6-4430-a47e-b1f6da73e7ed (api:48)
2020-08-06 07:21:25,516+0200 INFO (vmrecovery) [vds] VM recovered:
543765d5-daee-41da-a9d4-34f36c99be82 (recovery:97)
2020-08-06 07:21:25,517+0200 INFO (vmrecovery) [vds] recovery [1:2/3]:
recovered domain 543765d5-daee-41da-a9d4-34f36c99be82 (recovery:123)
2020-08-06 07:21:25,530+0200 INFO (vm/543765d5) [virt.vm]
(vmId='543765d5-daee-41da-a9d4-34f36c99be82') VM wrapper has started
(vm:2786)
2020-08-06 07:21:25,532+0200 INFO (vm/543765d5) [vdsm.api] START
getVolumeSize(sdUUID='531b2674-0bfa-4d71-bc58-85e280e867b9',
spUUID='a5f4c7eb-3640-49f9-b644-57f10e9f5279',
imgUUID='e4fcb0fc-3900-4422-a441-2147
bf8571f3', volUUID='12b3489b-90f5-4b94-aec1-1725296cd66e', options=None)
from=internal, task_id=a1c0460b-4195-4b1c-af8a-d34474c5fa2e (api:48)
2020-08-06 07:21:25,545+0200 INFO (vmrecovery) [vds] VM recovered:
4b472d29-5f40-438f-8bca-9b2affbd3d37 (recovery:97)
2020-08-06 07:21:25,546+0200 INFO (vmrecovery) [vds] recovery [1:3/3]:
recovered domain 4b472d29-5f40-438f-8bca-9b2affbd3d37 (recovery:123)
2020-08-06 07:21:25,546+0200 INFO (vmrecovery) [vds] recovery: waiting for
3 domains to go up (clientIF:717)
2020-08-06 07:21:25,546+0200 INFO (vm/4b472d29) [virt.vm]
(vmId='4b472d29-5f40-438f-8bca-9b2affbd3d37') VM wrapper has started
(vm:2786)
2020-08-06 07:21:25,546+0200 INFO (periodic/1) [vdsm.api] START
repoStats(domains=()) from=internal,
task_id=d8e1926b-d289-4dc1-a81e-ec1592083751 (api:48)
2020-08-06 07:21:25,547+0200 INFO (periodic/1) [vdsm.api] FINISH repoStats
return={} from=internal, task_id=d8e1926b-d289-4dc1-a81e-ec1592083751
(api:54)
2020-08-06 07:22:41,636+0200 INFO (vmrecovery) [vds] recovery: waiting for
storage pool to go up (clientIF:723)
2020-08-06 07:22:45,317+0200 INFO (jsonrpc/7) [api.host] START
getAllVmStats() from=::ffff:127.0.0.1,41540 (api:48)
2020-08-06 07:22:45,317+0200 INFO (jsonrpc/7) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done',
'code': 0},
'statsList': (suppressed)} from=::ffff:127.0.0.1,41540 (api:54)
2020-08-06 07:22:45,318+0200 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312)
2020-08-06 07:22:46,642+0200 INFO (vmrecovery) [vdsm.api] START
getConnectedStoragePoolsList(options=None) from=internal,
task_id=ad72b73f-d970-4367-a037-94bef589da57 (api:48)
2020-08-06 07:22:46,642+0200 INFO (vmrecovery) [vdsm.api] FINISH
getConnectedStoragePoolsList return={'poollist': []} from=internal,
task_id=ad72b73f-d970-4367-a037-94bef589da57 (api:54)
2020-08-06 07:22:46,642+0200 INFO (vmrecovery) [vds] recovery: waiting for
storage pool to go up (clientIF:723)
2020-08-06 07:22:51,648+0200 INFO (vmrecovery) [vdsm.api] START
getConnectedStoragePoolsList(options=None) from=internal,
task_id=f14156c0-a819-4fec-aef7-f5f5b1dd0144 (api:48)
2020-08-06 07:22:51,648+0200 INFO (vmrecovery) [vdsm.api] FINISH
getConnectedStoragePoolsList return={'poollist': []} from=internal,
task_id=f14156c0-a819-4fec-aef7-f5f5b1dd0144 (api:54)
2020-08-06 07:22:51,649+0200 INFO (vmrecovery) [vds] recovery: waiting for
storage pool to go up (clientIF:723)
2020-08-06 07:22:56,409+0200 INFO (periodic/1) [vdsm.api] START
repoStats(domains=()) from=internal,
task_id=75381d37-a06d-4174-b956-f5e7f108ea39 (api:48)
2020-08-06 07:22:56,409+0200 INFO (periodic/1) [vdsm.api] FINISH repoStats
return={} from=internal, task_id=75381d37-a06d-4174-b956-f5e7f108ea39
(api:54)
2020-08-06 07:22:56,654+0200 INFO (vmrecovery) [vdsm.api] START
getConnectedStoragePoolsList(options=None) from=internal,
task_id=8f98ae7e-a681-4af3-a4d8-12f0501c8f68 (api:48)
2020-08-06 07:22:56,654+0200 INFO (vmrecovery) [vdsm.api] FINISH
getConnectedStoragePoolsList return={'poollist': []} from=internal,
task_id=8f98ae7e-a681-4af3-a4d8-12f0501c8f68 (api:54)
2020-08-06 07:22:56,655+0200 INFO (vmrecovery) [vds] recovery: waiting for
storage pool to go up (clientIF:723)
2020-08-06 07:23:00,337+0200 INFO (jsonrpc/0) [api.host] START
getAllVmStats() from=::ffff:127.0.0.1,41540 (api:48)
2020-08-06 07:23:00,337+0200 INFO (jsonrpc/0) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done',
'code': 0},
'statsList': (suppressed)} from=::ffff:127.0.0.1,41540 (api:54)
2020-08-06 07:23:00,337+0200 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312)
2020-08-06 07:23:01,660+0200 INFO (vmrecovery) [vdsm.api] START
getConnectedStoragePoolsList(options=None) from=internal,
task_id=f984accc-8e20-48b6-ba24-e00418d28043 (api:48)
2020-08-06 07:23:01,660+0200 INFO (vmrecovery) [vdsm.api] FINISH
getConnectedStoragePoolsList return={'poollist': []} from=internal,
task_id=f984accc-8e20-48b6-ba24-e00418d28043 (api:54)
2020-08-06 07:23:01,660+0200 INFO (vmrecovery) [vds] recovery: waiting for
storage pool to go up (clientIF:723)
I see "[vds] recovery: waiting for storage pool to go up (clientIF:723)"
alot.
Regards
Nardus
On Thu, 6 Aug 2020 at 07:09, Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
Can you fheck for errors on the affected host. Most probably you need
the
vdsm logs.
Best Regards,
Strahil Nikolov
На 6 август 2020 г. 7:40:23 GMT+03:00, Nardus Geldenhuys <
nardusg(a)gmail.com> написа:
>Hi Strahil
>
>Hope you are well. I get the following error when I tried to confirm
>reboot:
>
>Error while executing action: Cannot confirm 'Host has been rebooted'
>Host.
>Valid Host statuses are "Non operational", "Maintenance" or
>"Connecting".
>
>And I can't put it in maintenance, only option is "restart" or
"stop".
>
>Regards
>
>Nar
>
>On Thu, 6 Aug 2020 at 06:16, Strahil Nikolov <hunter86_bg(a)yahoo.com>
>wrote:
>
>> After rebooting the node, have you "marked" it that it was rebooted ?
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> На 5 август 2020 г. 21:29:04 GMT+03:00, Nardus Geldenhuys <
>> nardusg(a)gmail.com> написа:
>> >Hi oVirt land
>> >
>> >Hope you are well. Got a bit of an issue, actually a big issue. We
>had
>> >some
>> >sort of dip of some sort. All the VM's is still running, but some of
>> >the
>> >hosts is show "Unassigned" or "NonResponsive". So all
the hosts was
>> >showing
>> >UP and was fine before our dip. So I did increase
>vdsHeartbeatInSecond
>> >to
>> >240, no luck.
>> >
>> >I still get a timeout on the engine lock even thou I can connect to
>> >that
>> >host from the engine using nc to test to port 54321. I also did
>restart
>> >vdsmd and also rebooted the host with no luck.
>> >
>> > nc -v someserver 54321
>> >Ncat: Version 7.50 (
https://nmap.org/ncat )
>> >Ncat: Connected to 172.40.2.172:54321.
>> >
>> >2020-08-05 20:20:34,256+02 ERROR
>>
>>[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> >(EE-ManagedThreadFactory-engineScheduled-Thread-70) [] EVENT_ID:
>> >VDS_BROKER_COMMAND_FAILURE(10,802), VDSM someserver command Get Host
>> >Capabilities failed: Message timeout which can be caused by
>> >communication
>> >issues
>> >
>> >Any troubleshoot ideas will be gladly appreciated.
>> >
>> >Regards
>> >
>> >Nar
>>