[ovirt-users] NFS IO timeout configuration

Yaniv Kaul ykaul at redhat.com
Tue Jan 12 12:15:16 UTC 2016


On Tue, Jan 12, 2016 at 9:32 AM, Markus Stockhausen <stockhausen at collogia.de> wrote:

> Hi there,
>
> We ran into a nasty situation yesterday in our oVirt 3.5.6 environment.
> We ran an LSM (live storage migration) that failed during the cleanup
> operation, specifically when the process deleted an image on the source
> NFS storage.
>

Can you share your NFS server details with us?
Is the NFS connection healthy? (You can check with nfsstat.)
Generally, a delete on NFS should be a pretty quick operation.
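
As a rough way to check how long a delete takes on the mount, here is a minimal sketch that creates a scratch file and times its removal. The helper name time_unlink and its parameters are hypothetical, not part of vdsm or oVirt. It assumes you point it at a scratch directory on the NFS mount that you are allowed to write to, not at a storage domain's own image directories. A small probe file will not reproduce the cost of deleting a large disk image, so treat the number as a health indicator only.

```python
import os
import tempfile
import time


def time_unlink(directory, size_bytes=1024 * 1024):
    """Create a scratch file in `directory` and return the seconds its unlink takes.

    Hypothetical helper for illustration; not part of vdsm.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="unlink-probe-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\0" * size_bytes)
            f.flush()
            os.fsync(f.fileno())  # push the data to the server before timing
    except Exception:
        os.remove(path)  # do not leave the probe file behind on failure
        raise
    start = time.monotonic()
    os.remove(path)
    return time.monotonic() - start
```

Calling time_unlink("/some/scratch/dir/on/the/mount") a few times while the storage is under load should show whether unlinks come anywhere near the 60 seconds seen in the traceback.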
Y.


>
> Engine log gives:
>
> 2016-01-11 20:49:45,120 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand]
> (org.ovirt.thread.pool-8-thread-14) [77277f0] START,
> DeleteImageGroupVDSCommand( storagePoolId =
> 94ed7a19-fade-4bd6-83f2-2cbb2f730b95, ignoreFailoverLimit = false,
> storageDomainId = 272ec473-6041-42ee-bd1a-732789dd18d4, imageGroupId =
> aed132ef-703a-44d0-b875-db8c0d2c1a92, postZeros = false, forceDelete =
> false), log id: b52d59c
> ...
> 2016-01-11 20:50:45,206 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand]
> (org.ovirt.thread.pool-8-thread-14) [77277f0] Failed in DeleteImageGroupVDS
> method
>
> VDSM (SPM) log gives:
>
> Thread-97::DEBUG::2016-01-11
> 20:49:45,737::fileSD::384::Storage.StorageDomain::(deleteImage) Removing
> file: /rhev/data-center/mnt/1.2.3.4:_var_nas2_OVirtIB/272ec473-6041-42ee-bd1a-732789dd18d4/images/_remojzBd1r/0d623afb-291e-4f4c-acba-caecb125c4ed
> ...
> Thread-97::ERROR::2016-01-11
> 20:50:45,737::task::866::Storage.TaskManager.Task::(_setError)
> Task=`cd477878-47b4-44b1-85a3-b5da19543a5e`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 1549, in deleteImage
>     pool.deleteImage(dom, imgUUID, volsByImg)
>   File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
>     return method(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 1884, in deleteImage
>     domain.deleteImage(domain.sdUUID, imgUUID, volsByImg)
>   File "/usr/share/vdsm/storage/fileSD.py", line 385, in deleteImage
>     self.oop.os.remove(volPath)
>   File "/usr/share/vdsm/storage/outOfProcess.py", line 245, in remove
>     self._iop.unlink(path)
>   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 455, in unlink
>     return self._sendCommand("unlink", {"path": path}, self.timeout)
>   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 385, in _sendCommand
>     raise Timeout(os.strerror(errno.ETIMEDOUT))
> Timeout: Connection timed out
>
> Reading the docs, I got the idea that vdsm's default 60-second timeout
> for IO operations can be changed in /etc/vdsm/vdsm.conf:
>
> [irs]
> process_pool_timeout = 180
>
> Can anyone confirm that this will solve the problem?
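
To relate the proposed setting to the logs above, here is a minimal sketch that parses the [irs] fragment and measures the gap between two log timestamps. The function names are hypothetical, and the 60-second default is taken from the message above rather than from the vdsm source. Whether process_pool_timeout actually governs the ioprocess timeout in this vdsm version is the open question here; the sketch only reads the value and does the arithmetic.

```python
import configparser
from datetime import datetime

DEFAULT_TIMEOUT = 60  # seconds; the default described in the message above


def read_irs_timeout(conf_text, default=DEFAULT_TIMEOUT):
    """Return [irs] process_pool_timeout from vdsm.conf-style text, or the default."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getint("irs", "process_pool_timeout", fallback=default)


def elapsed_seconds(start, end, fmt="%Y-%m-%d %H:%M:%S,%f"):
    """Seconds between two timestamps in the format used by the logs above."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


conf = "[irs]\nprocess_pool_timeout = 180\n"
gap = elapsed_seconds("2016-01-11 20:49:45,737", "2016-01-11 20:50:45,737")
print(read_irs_timeout(conf), gap)
```

The VDSM timestamps for "Removing file" and the Timeout error are exactly 60 seconds apart, which is consistent with the unlink being cut off by a 60-second limit rather than failing on its own.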
>
> Markus