The engine was not starting until I downgraded to the 6.0.0 qemu rpms from the Advanced
Virtualization repository.
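Roughly, the downgrade looked like the following (the exact version string and package set
are illustrative; check what your Advanced Virtualization repo actually provides):
# dnf --showduplicates list qemu-kvm
# dnf downgrade qemu-kvm-6.0.0*    # version spec is illustrative, pick the 6.0.0 build offered by the repo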
Best Regards,
Strahil Nikolov
On Thursday, 6 January 2022 at 11:51:27 GMT+2, Strahil Nikolov via Users
<users(a)ovirt.org> wrote:
It seems that after the last attempt I managed to move forward:
systemctl start ovirt-ha-agent ovirt-ha-broker
Then I stopped the ovirt-ha-agent and ran "hosted-engine --reinitialize-lockspace".
Now the situation changed a little bit:
# sanlock client status
daemon 5f37f400-b865-11dc-a4f5-2c4d54502372
p -1 helper
p -1 listener
p 89795 HostedEngine
p -1 status
s hosted-engine:1:/run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1:0
s ca3807b9-5afc-4bcd-a557-aacbcc53c340:1:/rhev/data-center/mnt/glusterSD/ovirt2\:_engine44/ca3807b9-5afc-4bcd-a557-aacbcc53c340/dom_md/ids:0
r ca3807b9-5afc-4bcd-a557-aacbcc53c340:292c2cac-8dad-4229-a9a3-e64811f4b34e:/rhev/data-center/mnt/glusterSD/ovirt2\:_engine44/ca3807b9-5afc-4bcd-a557-aacbcc53c340/images/1deecc6a-0584-4758-8fbb-6386662a8075/292c2cac-8dad-4229-a9a3-e64811f4b34e.lease:0:1 p 89795
And the engine is running:
--== Host ovirt2.localdomain (id: 1) status ==--
Host ID : 1
Host timestamp : 31136
Score : 3400
Engine status : {"vm": "up", "health": "bad", "detail": "Up", "reason": "failed liveliness check"}
Hostname : ovirt2.localdomain
Local maintenance : False
stopped : False
crc32 : 5f5bbd94
conf_on_shared_storage : True
local_conf_timestamp : 31136
Status up-to-date : True
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=31136 (Thu Jan 6 11:46:23 2022)
host-id=1
score=3400
vm_conf_refresh_time=31136 (Thu Jan 6 11:46:23 2022)
conf_on_shared_storage=True
maintenance=False
state=EngineStarting
stopped=False
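The "failed liveliness check" part just means the HA agent cannot reach the engine's
health page yet, which is expected while the engine application is still starting up. If it
persists, the same check can be done by hand, roughly like this (FQDN is a placeholder and
the exact URL is from memory):
# curl -k https://engine.localdomain/ovirt-engine/services/health    # placeholder FQDN; should return a short health status string once the engine is really up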
I will leave it for a while before trying to troubleshoot.
Best Regards,
Strahil Nikolov
On Thursday, 6 January 2022 at 09:23:11 GMT+2, Strahil Nikolov via Users
<users(a)ovirt.org> wrote:
Hello All,
I was trying to upgrade my single node setup (actually it used to be a 2+1 arbiter setup,
but one of the data nodes died) from 4.3.10 to 4.4.?
The deployment failed on 'hosted-engine --reinitialize-lockspace --force' and it
seems that sanlock fails to obtain a lock:
# hosted-engine --reinitialize-lockspace --force
Traceback (most recent call last):
File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py",
line 30, in <module>
ha_cli.reset_lockspace(force)
File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line
286, in reset_lockspace
stats = broker.get_stats_from_storage()
File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 148, in get_stats_from_storage
result = self._proxy.get_stats()
File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
return self.__send(self.__name, args)
File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
verbose=self.__verbose
File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request
http_conn = self.send_request(host, handler, request_body, verbose)
File "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request
self.send_content(connection, request_body)
File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content
connection.endheaders(request_body)
File "/usr/lib64/python3.6/http/client.py", line 1268, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1044, in _send_output
self.send(msg)
File "/usr/lib64/python3.6/http/client.py", line 982, in send
self.connect()
File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py",
line 74, in connect
self.sock.connect(base64.b16decode(self.host))
FileNotFoundError: [Errno 2] No such file or directory
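That FileNotFoundError appears to be about the broker's UNIX socket being missing, i.e.
ovirt-ha-broker not running at that point, which would match why starting the broker later
helped. What I would check is roughly this (socket path is from memory and may differ
between versions):
# systemctl status ovirt-ha-broker
# ls -l /run/ovirt-hosted-engine-ha/broker.socket    # assumed socket path, adjust if your version uses a different one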
# grep sanlock /var/log/messages | tail
Jan 6 08:29:48 ovirt2 sanlock[1269]: 2022-01-06 08:29:48 19341 [77108]: s1777 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:29:49 ovirt2 sanlock[1269]: 2022-01-06 08:29:49 19342 [1310]: s1777 add_lockspace fail result -223
Jan 6 08:29:54 ovirt2 sanlock[1269]: 2022-01-06 08:29:54 19347 [77113]: s1778 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:29:55 ovirt2 sanlock[1269]: 2022-01-06 08:29:55 19348 [1310]: s1778 add_lockspace fail result -223
Jan 6 08:30:00 ovirt2 sanlock[1269]: 2022-01-06 08:30:00 19353 [77138]: s1779 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:30:01 ovirt2 sanlock[1269]: 2022-01-06 08:30:01 19354 [1311]: s1779 add_lockspace fail result -223
Jan 6 08:30:06 ovirt2 sanlock[1269]: 2022-01-06 08:30:06 19359 [77144]: s1780 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:30:07 ovirt2 sanlock[1269]: 2022-01-06 08:30:07 19360 [1310]: s1780 add_lockspace fail result -223
Jan 6 08:30:12 ovirt2 sanlock[1269]: 2022-01-06 08:30:12 19365 [77151]: s1781 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:30:13 ovirt2 sanlock[1269]: 2022-01-06 08:30:13 19366 [1310]: s1781 add_lockspace fail result -223
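To see what sanlock itself can read from that lease area, a direct dump of the path from the
log above can help (just a diagnostic I would try; the default offset should be fine for a
lockspace file):
# sanlock direct dump /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1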
# sanlock client status
daemon 5f37f400-b865-11dc-a4f5-2c4d54502372
p -1 helper
p -1 listener
p -1 status
s ca3807b9-5afc-4bcd-a557-aacbcc53c340:1:/rhev/data-center/mnt/glusterSD/ovirt2\:_engine44/ca3807b9-5afc-4bcd-a557-aacbcc53c340/dom_md/ids:0
Could it be related to the sector size of the Gluster brick?
# smartctl -a /dev/sdb | grep 'Sector Sizes'
Sector Sizes: 512 bytes logical, 4096 bytes physical
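For completeness, the sizes the kernel reports for the brick device can be cross-checked too
(just an extra data point, same device as above):
# blockdev --getss /dev/sdb      # logical sector size
# blockdev --getpbsz /dev/sdb    # physical block size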
Any hint would be helpful.
Best Regards,
Strahil Nikolov
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement:
https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QLPVW2IVWZF...