[ovirt-users] Hosted engine VDSM issue with sanlock

Artem Tambovskiy artem.tambovskiy at gmail.com
Thu Mar 29 05:59:10 UTC 2018


Hi,

How many hosts you have? Check hosted-engine.conf on all hosts including
the one you have problem with and look if all host_id values are unique. It
might happen that you have several hosts with host_id=1

Regards,
Artem

ср, 28 мар. 2018 г., 20:49 Jamie Lawrence <jlawrence at squaretrade.com>:

> I still can't resolve this issue.
>
> I have a host that is stuck in a cycle; it will be marked non responsive,
> then come back up, ending with an "finished activation" message in the GUI.
> Then it repeats.
>
> The root cause seems to be sanlock.  I'm just unclear on why it started or
> how to resolve it. The only "approved" knob I'm aware of is
> --reinitialize-lockspace and the manual equivalent, neither of which fix
> anything.
>
> Anyone have a guess?
>
> -j
>
> - - - vdsm.log - - - -
>
> 2018-03-28 10:38:22,207-0700 INFO  (monitor/b41eb20) [storage.SANLock]
> Acquiring host id for domain b41eb20a-eafb-481b-9a50-a135cf42b15e (id=1,
> async=True) (clusterlock:284)
> 2018-03-28 10:38:22,208-0700 ERROR (monitor/b41eb20) [storage.Monitor]
> Error acquiring host id 1 for domain b41eb20a-eafb-481b-9a50-a135cf42b15e
> (monitor:568)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 565, in _acquireHostId
>     self.domain.acquireHostId(self.hostId, async=True)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 828, in
> acquireHostId
>     self._manifest.acquireHostId(hostId, async)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 453, in
> acquireHostId
>     self._domainLock.acquireHostId(hostId, async)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py",
> line 315, in acquireHostId
>     raise se.AcquireHostIdFailure(self._sdUUID, e)
> AcquireHostIdFailure: Cannot acquire host id:
> (u'b41eb20a-eafb-481b-9a50-a135cf42b15e', SanlockException(22, 'Sanlock
> lockspace add failure', 'Invalid argument'))
> 2018-03-28 10:38:23,078-0700 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
> call Host.ping2 succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:23,085-0700 INFO  (jsonrpc/6) [vdsm.api] START
> repoStats(domains=[u'b41eb20a-eafb-481b-9a50-a135cf42b15e'])
> from=::1,54450, task_id=186d7e8b-7b4e-485d-a9e0-c0cb46eed621 (api:46)
> 2018-03-28 10:38:23,085-0700 INFO  (jsonrpc/6) [vdsm.api] FINISH repoStats
> return={u'b41eb20a-eafb-481b-9a50-a135cf42b15e': {'code': 0, 'actual':
> True, 'version': 4, 'acquired': False, 'delay': '0.000812547', 'lastCheck':
> '0.4', 'valid': True}} from=::1,54450,
> task_id=186d7e8b-7b4e-485d-a9e0-c0cb46eed621 (api:52)
> 2018-03-28 10:38:23,086-0700 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
> call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:23,092-0700 WARN  (vdsm.Scheduler) [Executor] Worker
> blocked: <Worker name=periodic/6 running <Task <Operation
> action=<vdsm.virt.sampling.HostMonitor object at 0x1d44110> at 0x1d44150>
> timeout=15, duration=150 at 0x7f076c05fb90> task#=83985 at 0x7f082c08e510>,
> traceback:
> File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
>   self.__bootstrap_inner()
> File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
>   self.run()
> File: "/usr/lib64/python2.7/threading.py", line 765, in run
>   self.__target(*self.__args, **self.__kwargs)
> File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line
> 194, in run
>   ret = func(*args, **kwargs)
> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in
> _run
>   self._execute_task()
> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in
> _execute_task
>   task()
> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in
> __call__
>   self._callable()
> File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213,
> in __call__
>   self._func()
> File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 578,
> in __call__
>   stats = hostapi.get_stats(self._cif, self._samples.stats())
> File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 77, in
> get_stats
>   ret['haStats'] = _getHaInfo()
> File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in
> _getHaInfo
>   stats = instance.get_all_stats()
> File:
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
> line 93, in get_all_stats
>   stats = broker.get_stats_from_storage()
> File:
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 135, in get_stats_from_storage
>   result = self._proxy.get_stats()
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>   return self.__send(self.__name, args)
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1587, in __request
>   verbose=self.__verbose
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>   return self.single_request(host, handler, request_body, verbose)
> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
>   response = h.getresponse(buffering=True)
> File: "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
>   response.begin()
> File: "/usr/lib64/python2.7/httplib.py", line 444, in begin
>   version, status, reason = self._read_status()
> File: "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
>   line = self.fp.readline(_MAXLINE + 1)
> File: "/usr/lib64/python2.7/socket.py", line 476, in readline
>   data = self._sock.recv(self._rbufsize) (executor:363)
> 2018-03-28 10:38:23,274-0700 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
> call Host.ping2 succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:24,297-0700 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
> call Host.ping2 succeeded in 0.00 seconds (__init__:573)
>
> 2018-03-28 10:38:24,306-0700 INFO  (jsonrpc/2) [vdsm.api] START
> repoStats(domains=[u'b41eb20a-eafb-481b-9a50-a135cf42b15e'])
> from=::1,54450, task_id=6a60e316-e4d7-415d-970a-a998710a5899 (api:46)
> 2018-03-28 10:38:24,306-0700 INFO  (jsonrpc/2) [vdsm.api] FINISH repoStats
> return={u'b41eb20a-eafb-481b-9a50-a135cf42b15e': {'code': 0, 'actual':
> True, 'version': 4, 'acquired': False, 'delay': '0.000812547', 'lastCheck':
> '1.6', 'valid': True}} from=::1,54450,
> task_id=6a60e316-e4d7-415d-970a-a998710a5899 (api:52)
> 2018-03-28 10:38:24,307-0700 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC
> call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:24,374-0700 INFO  (jsonrpc/7) [api.host] START
> getAllVmStats() from=::ffff:10.181.26.150,46064 (api:46)
> 2018-03-28 10:38:24,377-0700 INFO  (jsonrpc/7) [api.host] FINISH
> getAllVmStats return={'status': {'message': 'Done', 'code': 0},
> 'statsList': (suppressed)} from=::ffff:10.181.26.150,46064 (api:52)
> 2018-03-28 10:38:24,379-0700 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
> call Host.getAllVmStats succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:24,529-0700 INFO  (jsonrpc/5) [api.host] START
> getAllVmStats() from=::1,54454 (api:46)
> 2018-03-28 10:38:24,532-0700 INFO  (jsonrpc/5) [api.host] FINISH
> getAllVmStats return={'status': {'message': 'Done', 'code': 0},
> 'statsList': (suppressed)} from=::1,54454 (api:52)
> 2018-03-28 10:38:24,533-0700 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC
> call Host.getAllVmStats succeeded in 0.01 seconds (__init__:573)
> 2018-03-28 10:38:24,545-0700 INFO  (jsonrpc/6) [api.host] START
> getAllVmIoTunePolicies() from=::1,54454 (api:46)
> 2018-03-28 10:38:24,546-0700 INFO  (jsonrpc/6) [api.host] FINISH
> getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0},
> 'io_tune_policies_dict': {'588a1394-4f28-4fb8-bcad-5b08d78ecd00':
> {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L,
> 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L,
> 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path':
> u'/var/run/vdsm/storage/b41eb20a-eafb-481b-9a50-a135cf42b15e/a9d01d59-f146-47e5-b514-d10f8867678e/8f0c9f7a-ae6a-476e-b6f3-a830dcb79e87',
> 'name': 'vda'}]}}} from=::1,54454 (api:52)
> 2018-03-28 10:38:24,547-0700 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC
> call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:29,319-0700 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
> call Host.ping2 succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:29,327-0700 INFO  (jsonrpc/0) [vdsm.api] START
> repoStats(domains=[u'b41eb20a-eafb-481b-9a50-a135cf42b15e'])
> from=::1,54450, task_id=c27c5e13-3b31-4182-9c14-11463c9b590a (api:46)
> 2018-03-28 10:38:29,327-0700 INFO  (jsonrpc/0) [vdsm.api] FINISH repoStats
> return={u'b41eb20a-eafb-481b-9a50-a135cf42b15e': {'code': 0, 'actual':
> True, 'version': 4, 'acquired': False, 'delay': '0.000812547', 'lastCheck':
> '6.6', 'valid': True}} from=::1,54450,
> task_id=c27c5e13-3b31-4182-9c14-11463c9b590a (api:52)
> 2018-03-28 10:38:29,328-0700 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
> call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:30,471-0700 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC
> call Host.ping2 succeeded in 0.00 seconds (__init__:573)
> 2018-03-28 10:38:30,475-0700 INFO  (jsonrpc/7) [api.host] START
> getCapabilities() from=::1,54450 (api:46)
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/users/attachments/20180329/7c53c08a/attachment.html>


More information about the Users mailing list