After a reboot of node1, the bricks must always come up. Most probably VDO had to recover for a longer period, which blocked the bricks from coming up in time.
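To see quickly which bricks stayed down after the reboot, something like the following may help. It is only a minimal sketch: it assumes the XML produced by "gluster volume status <volume> --xml" contains one <node> element per brick with <hostname>, <path> and <status> children (status "1" meaning online). The function name, host names and brick paths below are made up for illustration and are not taken from your setup.

```python
import xml.etree.ElementTree as ET


def offline_bricks(xml_text):
    """Return (hostname, brick_path) pairs for bricks not reported online.

    Assumption: each brick appears as a <node> element with <hostname>,
    <path> and <status> children, and status "1" means the brick is up.
    """
    root = ET.fromstring(xml_text)
    down = []
    for node in root.iter("node"):
        host = node.findtext("hostname", default="")
        path = node.findtext("path", default="")
        status = node.findtext("status", default="")
        # Only entries whose path looks like a brick directory are bricks;
        # other daemons listed in the same output are skipped.
        if path.startswith("/") and status != "1":
            down.append((host, path))
    return down


# Illustrative input only (hypothetical hosts and paths), not real output.
SAMPLE_XML = """
<cliOutput>
  <volStatus>
    <volumes>
      <volume>
        <volName>data</volName>
        <node>
          <hostname>node1.example</hostname>
          <path>/gluster_bricks/data/data</path>
          <status>0</status>
        </node>
        <node>
          <hostname>node2.example</hostname>
          <path>/gluster_bricks/data/data</path>
          <status>1</status>
        </node>
      </volume>
    </volumes>
  </volStatus>
</cliOutput>
"""

print(offline_bricks(SAMPLE_XML))
```

On a real node you would feed the function the output of the gluster command instead of SAMPLE_XML, for example by capturing it with subprocess.run. If a brick shows up as down right after boot, that is the moment to check whether VDO was still recovering.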
Hi guys,

One strange thing is happening that I cannot understand. Today I installed the latest version, 4.4.7, on CentOS Stream: replica 3, deployed via Cockpit, with an internal LAN for sync. Everything seems OK, and rebooting all 3 nodes together is also OK. But if I reboot 1 node (and declare the node rebooted through the web UI), the bricks (engine and data) remain down on that node. The logs are clean, with no explicit indication of the problem except "Server quorum regained for volume data. Starting local bricks" in glusterd.log. After "systemctl restart glusterd", the bricks went down on another node. After "systemctl restart glusterd" on that node, everything is OK.

Where should I look?

Some errors from the logs that I found:

---------------------------------------------
bdocker.log:

statusStorageThread::ERROR::2021-07-12 22:17:02,899::storage_broker::223::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(put_stats) Failed to write metadata for ho$
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 215, in put_stats
    f = os.open(path, direct_flag | os.O_WRONLY | os.O_SYNC)
FileNotFoundError: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7eb6-d453-439a-8436-d3694d4b5179/de18b2cc-a4e1-4afc-9b5a-6063$
StatusStorageThread::ERROR::2021-07-12 22:17:02,899::status_broker::90::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run) Failed to update state.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 215, in put_stats
    f = os.open(path, direct_flag | os.O_WRONLY | os.O_SYNC)
FileNotFoundError: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7eb6-d453-439a-8436-d3694d4b5179/de18b2cc-a4e1-4afc-9b5a-6063$
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py", line 86, in run
    entry.data
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 225, in put_stats
    .format(str(e)))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: failed to write metadata: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7e$
StatusStorageThread::ERROR::2021-07-12 22:17:02,899::storage_broker::223::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(put_stats) Failed to write metadata for ho$
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 215, in put_stats
    f = os.open(path, direct_flag | os.O_WRONLY | os.O_SYNC)
FileNotFoundError: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7eb6-d453-439a-8436-d3694d4b5179/de18b2cc-a4e1-4afc-9b5a-6063$
StatusStorageThread::ERROR::2021-07-12 22:17:02,899::status_broker::90::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run) Failed to update state.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 215, in put_stats
    f = os.open(path, direct_flag | os.O_WRONLY | os.O_SYNC)
FileNotFoundError: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7eb6-d453-439a-8436-d3694d4b5179/de18b2cc-a4e1-4afc-9b5a-6063$
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py", line 86, in run
    entry.data
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 225, in put_stats
    .format(str(e)))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: failed to write metadata: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7e$
StatusStorageThread::ERROR::2021-07-12 22:17:02,902::status_broker::70::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(trigger_restart) Trying to restart the $
StatusStorageThread::ERROR::2021-07-12 22:17:02,902::storage_broker::173::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats) Failed to read metadata fro$
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 155, in get_raw_stats
    f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
FileNotFoundError: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7eb6-d453-439a-8436-d3694d4b5179/de18b2cc-a4e1-4afc-9b5a-6063$
StatusStorageThread::ERROR::2021-07-12 22:17:02,902::status_broker::98::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run) Failed to read state.
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 155, in get_raw_stats
    f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
FileNotFoundError: [Errno 2] No such file or directory: '/run/vdsm/storage/53b068c1-beb8-4048-a766-3a4e71ded624/d3df7eb6-d453-439a-8436-d3694d4b5179/de18b2cc-a4e1-4afc-9b5a-6063$
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py", line 94, in run
    self._storage_broker.get_raw_stats()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 175, in get_raw_stats
    .format(str(e)))

-------------------------------------------------------
supervdsm.log:

ainProcess|jsonrpc/0::DEBUG::2021-07-12 22:22:13,264::commands::211::root::(execCmd) /usr/bin/taskset --cpu-list 0-19 /usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/b$
MainProcess|jsonrpc/0::DEBUG::2021-07-12 22:22:15,083::commands::224::root::(execCmd) FAILED: <err> = b'Running scope as unit: run-r91d6411af8114090aa28933d562fa473.scope\nMount$
MainProcess|jsonrpc/0::ERROR::2021-07-12 22:22:15,083::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) Error in mount
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/supervdsm_server.py", line 97, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/supervdsm_server.py", line 135, in mount
    cgroup=cgroup)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/mount.py", line 280, in _mount
    _runcmd(cmd)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/mount.py", line 308, in _runcmd
    raise MountError(cmd, rc, out, err)
vdsm.storage.mount.MountError: Command ['/usr/bin/systemd-run', '--scope', '--slice=vdsm-glusterfs', '/usr/bin/mount', '-t', 'glusterfs', '-o', 'backup-volfile-servers=cluster2.$
netlink/events::DEBUG::2021-07-12 22:22:15,131::concurrent::261::root::(run) FINISH thread <Thread(netlink/events, stopped daemon 139867781396224)>
MainProcess|jsonrpc/4::DEBUG::2021-07-12 22:22:15,134::supervdsm_server::102::SuperVdsm.ServerCallback::(wrapper) return network_caps with {'networks': {'ovirtmgmt': {'ports': [$

---------------------------------------------------------
vdsm.log:

2021-07-12 22:17:08,718+0200 INFO (jsonrpc/7) [api.host] FINISH getStats return={'status': {'code': 0, 'message': 'Done'}, 'info': (suppressed)} from=::1,34946 (api:54)
2021-07-12 22:17:09,491+0200 ERROR (monitor/53b068c) [storage.Monitor] Error checking domain 53b068c1-beb8-4048-a766-3a4e71ded624 (monitor:451)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 432, in _checkDomainStatus
    self.domain.selftest()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 712, in selftest
    self.oop.os.statvfs(self.domaindir)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/outOfProcess.py", line 244, in statvfs
    return self._iop.statvfs(path)
  File "/usr/lib/python3.6/site-packages/ioprocess/__init__.py", line 510, in statvfs
    resdict = self._sendCommand("statvfs", {"path": path}, self.timeout)
  File "/usr/lib/python3.6/site-packages/ioprocess/__init__.py", line 479, in _sendCommand
    raise OSError(errcode, errstr)
FileNotFoundError: [Errno 2] No such file or directory
2021-07-12 22:17:09,619+0200 INFO (jsonrpc/0) [api.virt] START getStats() from=::1,34946, vmId=9167f682-3c82-4237-93bd-53f0ad32ffba (api:48)
2021-07-12 22:17:09,620+0200 INFO (jsonrpc/0) [api] FINISH getStats error=Virtual machine does not exist: {'vmId': '9167f682-3c82-4237-93bd-53f0ad32ffba'} (api:129)
2021-07-12 22:17:09,620+0200 INFO (jsonrpc/0) [api.virt] FINISH getStats return={'status': {'code': 1, 'message': "Virtual machine does not exist: {'vmId': '9167f682-3c82-4237-$
2021-07-12 22:17:09,620+0200 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.getStats failed (error 1) in 0.00 seconds (__init__:312)
2021-07-12 22:17:10,034+0200 INFO (jsonrpc/3) [vdsm.api] START repoStats(domains=['53b068c1-beb8-4048-a766-3a4e71ded624']) from=::1,34946, task_id=4e823c98-f95b-45f7-ad64-90f82$
2021-07-12 22:17:10,034+0200 INFO (jsonrpc/3) [vdsm.api] FINISH repoStats return={'53b068c1-beb8-4048-a766-3a4e71ded624': {'code': 2001, 'lastCheck': '0.5', 'delay': '0', 'vali$
2021-07-12 22:17:10,403+0200 INFO (health) [health] LVM cache hit ratio: 12.50% (hits: 1 misses: 7) (health:131)
2021-07-12 22:17:10,472+0200 INFO (MainThread) [vds] Received signal 15, shutting down (vdsmd:74)
2021-07-12 22:17:10,472+0200 INFO (MainThread) [root] Stopping DHCP monitor. (dhcp_monitor:106)
2021-07-12 22:17:10,473+0200 INFO (ioprocess/11056) [IOProcessClient] (53b068c1-beb8-4048-a766-3a4e71ded624) Poll error 16 on fd 74 (__init__:176)
2021-07-12 22:17:10,473+0200 INFO (ioprocess/11056) [IOProcessClient] (53b068c1-beb8-4048-a766-3a4e71ded624) ioprocess was terminated by signal 15 (__init__:200)
2021-07-12 22:17:10,476+0200 INFO (ioprocess/19109) [IOProcessClient] (e10cbd59-d32e-4b69-a4c1-d213e7bd8973) Poll error 16 on fd 75 (__init__:176)
2021-07-12 22:17:10,476+0200 INFO (ioprocess/19109) [IOProcessClient] (e10cbd59-d32e-4b69-a4c1-d213e7bd8973) ioprocess was terminated by signal 15 (__init__:200)
2021-07-12 22:17:10,513+0200 INFO (ioprocess/44046) [IOProcess] (e10cbd59-d32e-4b69-a4c1-d213e7bd8973) Starting ioprocess (__init__:465)
2021-07-12 22:17:10,513+0200 INFO (ioprocess/44045) [IOProcess] (53b068c1-beb8-4048-a766-3a4e71ded624) Starting ioprocess (__init__:465)
2021-07-12 22:17:10,519+0200 WARN (periodic/0) [root] Failed to retrieve Hosted Engine HA info: timed out (api:198)
2021-07-12 22:17:10,611+0200 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/cluster1.int:_data/e10cbd59-d32e-4b69-a4c1-d213e7bd8973/dom$
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 507, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/check.py", line 391, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
vdsm.storage.exception.MiscFileReadException: Internal file read failure: ('/rhev/data-center/mnt/glusterSD/cluster1.int:_data/e10cbd59-d32e-4b69-a4c1-d213e7bd8973/dom_md/metada$
2021-07-12 22:17:10,860+0200 INFO (MainThread) [root] Stopping Bond monitor. (bond_monitor:53)

Thanks in advance.
Best regards.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/