Hello All,
I was trying to upgrade my single-node setup (it used to be 2+1 arbiter, but one of the data nodes died) from 4.3.10 to 4.4.?
The deployment failed on 'hosted-engine --reinitialize-lockspace --force' and it seems that sanlock fails to obtain a lock:
# hosted-engine --reinitialize-lockspace --force
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py", line 30, in <module>
    ha_cli.reset_lockspace(force)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 286, in reset_lockspace
    stats = broker.get_stats_from_storage()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 148, in get_stats_from_storage
    result = self._proxy.get_stats()
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
    verbose=self.__verbose
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request
    http_conn = self.send_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request
    self.send_content(connection, request_body)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content
    connection.endheaders(request_body)
  File "/usr/lib64/python3.6/http/client.py", line 1268, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1044, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 982, in send
    self.connect()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 74, in connect
    self.sock.connect(base64.b16decode(self.host))
FileNotFoundError: [Errno 2] No such file or directory
# grep sanlock /var/log/messages | tail
Jan 6 08:29:48 ovirt2 sanlock[1269]: 2022-01-06 08:29:48 19341 [77108]: s1777 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:29:49 ovirt2 sanlock[1269]: 2022-01-06 08:29:49 19342 [1310]: s1777 add_lockspace fail result -223
Jan 6 08:29:54 ovirt2 sanlock[1269]: 2022-01-06 08:29:54 19347 [77113]: s1778 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:29:55 ovirt2 sanlock[1269]: 2022-01-06 08:29:55 19348 [1310]: s1778 add_lockspace fail result -223
Jan 6 08:30:00 ovirt2 sanlock[1269]: 2022-01-06 08:30:00 19353 [77138]: s1779 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:30:01 ovirt2 sanlock[1269]: 2022-01-06 08:30:01 19354 [1311]: s1779 add_lockspace fail result -223
Jan 6 08:30:06 ovirt2 sanlock[1269]: 2022-01-06 08:30:06 19359 [77144]: s1780 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:30:07 ovirt2 sanlock[1269]: 2022-01-06 08:30:07 19360 [1310]: s1780 add_lockspace fail result -223
Jan 6 08:30:12 ovirt2 sanlock[1269]: 2022-01-06 08:30:12 19365 [77151]: s1781 failed to read device to find sector size error -223 /run/vdsm/storage/ca3807b9-5afc-4bcd-a557-aacbcc53c340/39ee18b2-3d7b-4d48-8a0e-3ed7947b5038/d95ae3ee-b6d3-46c4-b6a2-75f96134c7f1
Jan 6 08:30:13 ovirt2 sanlock[1269]: 2022-01-06 08:30:13 19366 [1310]: s1781 add_lockspace fail result -223
# sanlock client status
daemon 5f37f400-b865-11dc-a4f5-2c4d54502372
p -1 helper
p -1 listener
p -1 status
s ca3807b9-5afc-4bcd-a557-aacbcc53c340:1:/rhev/data-center/mnt/glusterSD/ovirt2\:_engine44/ca3807b9-5afc-4bcd-a557-aacbcc53c340/dom_md/ids:0
Could it be related to the sector size of the Gluster brick?
# smartctl -a /dev/sdb | grep 'Sector Sizes'
Sector Sizes: 512 bytes logical, 4096 bytes physical
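smartctl shows what the drive itself advertises. As a cross-check, the kernel's view of the same two sizes can be read from sysfs. A minimal sketch, assuming the brick device is sdb as above; sector_sizes and SYSFS_ROOT are hypothetical names used only for illustration:

```shell
# Hypothetical helper: print the logical and physical block sizes the kernel
# reports for a block device (e.g. "sdb"). SYSFS_ROOT defaults to /sys and
# is overridable only so the function can be tried against a fake tree.
sector_sizes() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys}"
    queue="$root/block/$dev/queue"
    if [ ! -r "$queue/logical_block_size" ] || [ ! -r "$queue/physical_block_size" ]; then
        echo "no sysfs queue data for $dev" >&2
        return 1
    fi
    echo "logical=$(cat "$queue/logical_block_size") physical=$(cat "$queue/physical_block_size")"
}

# Example usage:
#   sector_sizes sdb
```

This only reports the disk underneath the brick. It does not show what sector size sanlock detects for the file on the Gluster mount, which is the path named in the log messages.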
Any hint would be helpful.
Best Regards,
Strahil Nikolov