Hi,
We have a dev cluster running 4.2. It had to be powered down because the building was going to lose power. Since we brought it back up it has been massively unstable (hosts constantly switching state, VMs migrating all the time).
I now have one host running (with HE) and all others in maintenance mode. When I try to activate another host I see storage errors in vdsm.log:
2019-05-07 09:41:00,114+0000 ERROR (monitor/a98c0b4) [storage.Monitor] Error checking domain a98c0b42-47b9-4632-8b54-0ff3bd80d4c2 (monitor:424)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 416, in _checkDomainStatus
    masterStats = self.domain.validateMaster()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 941, in validateMaster
    if not self.validateMasterMount():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 1377, in validateMasterMount
    return mount.isMounted(self.getMasterDir())
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 161, in isMounted
    getMountFromTarget(target)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 173, in getMountFromTarget
    for rec in _iterMountRecords():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 143, in _iterMountRecords
    for rec in _iterKnownMounts():
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 139, in _iterKnownMounts
    yield _parseFstabLine(line)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 81, in _parseFstabLine
    fs_spec = fileUtils.normalize_path(_unescape_spaces(fs_spec))
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileUtils.py", line 94, in normalize_path
    host, tail = address.hosttail_split(path)
  File "/usr/lib/python2.7/site-packages/vdsm/common/network/address.py", line 43, in hosttail_split
    raise HosttailError('%s is not a valid hosttail address:' % hosttail)
HosttailError: :/ is not a valid hosttail address:
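Reading the traceback, the failure happens while vdsm parses a mount record whose device field is ":/", which looks like a "host:/path" spec with the host part missing. A minimal sketch for spotting such a record on the host follows. It assumes the records being parsed look like lines from /proc/mounts, which is my guess and not something I have confirmed, and find_empty_host_mounts is a hypothetical helper name, not part of vdsm.

```python
def find_empty_host_mounts(mounts_text):
    """Return (fs_spec, fs_file) pairs whose device starts with ':'.

    mounts_text is expected to be in /proc/mounts format (assumption):
    whitespace-separated fields, device first, mount point second.
    """
    suspects = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            # Skip blank or malformed lines.
            continue
        fs_spec, fs_file = fields[0], fields[1]
        # A network mount spec is normally 'host:/path'. An empty host
        # part leaves a spec that begins with ':', such as ':/'.
        if fs_spec.startswith(":"):
            suspects.append((fs_spec, fs_file))
    return suspects
```

Passing it the contents of /proc/mounts from the affected host should list any mount whose device begins with ":", along with its mount point. This only locates the odd entry; it does not explain how it got there.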
Not sure if it's related, but since the restart the hosted_storage domain has been elected the master domain.
I'm a bit stuck at the moment. My only idea is to remove HE and switch to a standalone Engine VM running outside the cluster.
Thanks,
Alan