[ovirt-users] Concerns with increasing vdsTimeout value on engine?

Shubhendu Tripathi shtripat at redhat.com
Mon Jul 13 03:57:33 UTC 2015


On 07/12/2015 09:53 PM, Omer Frenkel wrote:
>
> ----- Original Message -----
>> From: "Liron Aravot" <laravot at redhat.com>
>> To: "Ryan Groten" <Ryan.Groten at stantec.com>
>> Cc: users at ovirt.org
>> Sent: Sunday, July 12, 2015 5:44:28 PM
>> Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
>>
>>
>>
>> ----- Original Message -----
>>> From: "Ryan Groten" <Ryan.Groten at stantec.com>
>>> To: users at ovirt.org
>>> Sent: Friday, July 10, 2015 10:45:11 PM
>>> Subject: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
>>>
>>>
>>>
>>> When I try to attach new direct lun disks, the scan takes a very long time to
>>> complete because of the number of pvs presented to my hosts (there is
>>> already a bug on this, related to the pvcreate command taking a very long
>>> time - https://bugzilla.redhat.com/show_bug.cgi?id=1217401 )
>>>
>>>
>>>
>>> I discovered a workaround by setting the vdsTimeout value higher (it is 180
>>> seconds by default). I changed it to 300 seconds and now the direct lun scan
>>> returns properly, but I’m hoping someone can warn me if this workaround is
>>> safe or if it’ll cause other potential issues? I made this change yesterday
>>> and so far so good.
>>>
>> Hi, no serious issue can be caused by that.
>> Keep in mind, though, that the timeout was increased for all executions, so
>> any other operation will also have that amount of time to complete before
>> failing on timeout. This will cause delays before failing when not
>> everything is operational and up as expected (as it is most of the time).
>> I'd guess that an RFE could be opened to allow increasing the timeout of
>> specific operations, if a user wants to do that.
>>
>> thanks,
>> Liron.
> If you have HA VMs and use power management (fencing),
> this might cause longer downtime for HA VMs if the host has network timeouts:
> the engine will wait for 3 network failures before trying to fence the host,
> so in case of timeouts, with the value increased to 5 minutes,
> you should expect 15 minutes before the engine decides the host is non-responsive and fences it.
> So if you have an HA VM on this host, this will be the VM downtime as well,
> as the engine will restart HA VMs only after fencing.
>
> you can read more on
> http://www.ovirt.org/Automatic_Fencing
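
To make the arithmetic above concrete, here is a minimal sketch. It assumes, as described above, that the engine waits for 3 failed attempts and that each one takes the full vdsTimeout; the function name is only illustrative and is not part of oVirt.

```python
def fencing_delay_seconds(vds_timeout_s, failures_before_fence=3):
    """Rough worst-case wait before the engine tries to fence a host.

    Assumes every failed attempt runs into the full timeout, and that the
    engine waits for 3 failures, as described in the message above.
    """
    return vds_timeout_s * failures_before_fence


# Compare the default (180 s) with the increased value (300 s).
for timeout in (180, 300):
    minutes = fencing_delay_seconds(timeout) / 60
    print(f"vdsTimeout={timeout}s -> about {minutes:.0f} min before fencing")
```

With the default of 180 seconds this gives roughly 9 minutes, and with 300 seconds the 15 minutes mentioned above.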

I have a similar need: when I try to delete all 256 gluster volume 
snapshots using a single gluster CLI command, the engine times out.
So, as Liron suggested, it would be better if we were able to set the 
timeout at the VDSM verb level. That would be the better option, and the 
caller would need to use the feature judiciously :)
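
Just to illustrate the idea, a rough sketch of a per-verb timeout lookup could look like the following. This is not existing engine or VDSM code; the verb names and the override values are made up for the example, and only the 180 second default and the 300 second value come from this thread.

```python
# Global default, matching the vdsTimeout default mentioned in this thread.
DEFAULT_TIMEOUT_S = 180

# Hypothetical per-verb overrides; the verb names are placeholders and do
# not correspond to real VDSM verbs.
VERB_TIMEOUTS_S = {
    "scan_direct_luns": 300,
    "delete_all_volume_snapshots": 600,
}


def timeout_for(verb):
    """Return the override for a verb, or the global default if none is set."""
    return VERB_TIMEOUTS_S.get(verb, DEFAULT_TIMEOUT_S)


print(timeout_for("scan_direct_luns"))  # uses the override
print(timeout_for("get_vds_stats"))     # falls back to the default
```

That way only the slow operations get a longer timeout, and everything else, including the checks that lead to fencing, keeps the default.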

>
>>> Thanks,
>>>
>>> Ryan
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>



