On Tue, Nov 27, 2018 at 9:34 AM Sahina Bose <sabose(a)redhat.com> wrote:
On Tue, Nov 13, 2018 at 4:46 PM fsoyer <fsoyer(a)systea.fr> wrote:
1, 'Read timeout') - indicates that there was no response from storage
within 32s (I think this is the sanlock read timeout? Denis? Nir?)
This:
2018-11-11 14:33:49,450+0100 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata (monitor:498)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 496, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 391, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata', 1, 'Read timeout')
This means that reading from storage timed out after 10 seconds.
See
https://github.com/oVirt/vdsm/blob/9e80801f05a3e4033f51eb8f629f62fe715d0c...
We immediately change the storage domain to INVALID:
2018-11-11 14:33:49,450+0100 INFO (check/loop) [storage.Monitor] Domain ffc53fd8-c5d1-4070-ae51-2e91835cd937 became INVALID (monitor:469)
When the next check succeeds, we move the status back to VALID and resume
the paused VMs that use this storage domain.
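The status flip described above can be sketched roughly as follows. This is
only an illustration of the behaviour, not the vdsm code: the class name
DomainMonitorSketch, its on_check_result method, and the events list are all
made up for this example.

```python
class DomainMonitorSketch(object):
    """Hypothetical model of the VALID/INVALID transitions (not vdsm code)."""

    def __init__(self, domain_id):
        self.domain_id = domain_id
        self.valid = True      # assume the domain starts out VALID
        self.events = []       # stands in for the "became ..." log lines

    def on_check_result(self, error):
        """error is None when the read succeeded, otherwise the failure."""
        now_valid = error is None
        if now_valid != self.valid:
            # Only a change of state is reported, not every check.
            self.valid = now_valid
            state = "VALID" if now_valid else "INVALID"
            self.events.append(
                "Domain %s became %s" % (self.domain_id, state))
            # On the transition back to VALID, the real system would also
            # resume the paused VMs; that part is not modelled here.
        return self.valid
```

With this sketch, a failed check followed by a successful one records exactly
two events, "became INVALID" and then "became VALID"; repeated failures in
between add nothing.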
Once we get a timeout, you will see this warning every 10 seconds until the
read completes:
2018-11-11 14:33:59,451+0100 WARN (check/loop) [storage.check] Checker u'/rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata' is blocked for 20.00 seconds (check:282)
See
https://github.com/oVirt/vdsm/blob/9e80801f05a3e4033f51eb8f629f62fe715d0c...
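To make the timing concrete, here is a small sketch of what you would expect
to see in the log for a read that stays blocked for a given time. It is
inferred only from the timestamps quoted in this thread (the error at
14:33:49, then the "blocked for 20.00 seconds" warning at 14:33:59), not from
the vdsm source; the function name blocked_reports and its parameters are
hypothetical.

```python
def blocked_reports(read_duration, interval=10.0):
    """Return (elapsed_seconds, kind) tuples for a single blocked read.

    Assumption: the first report, at one interval, is the 'Read timeout'
    error; every later interval while the read is still pending produces
    an 'is blocked for N seconds' warning.
    """
    reports = []
    elapsed = interval
    while elapsed < read_duration:
        kind = "timeout" if not reports else "blocked"
        reports.append((elapsed, kind))
        elapsed += interval
    return reports
```

For a read that completes after 35 seconds this gives a timeout at 10
seconds and blocked warnings at 20 and 30 seconds, which matches the shape of
the log above.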
These timeouts are not related to sanlock, but you will probably see similar
timeouts in sanlock.log, because both vdsm and sanlock use a read timeout of
10 seconds.
Nir