[ovirt-users] Concerns with increasing vdsTimeout value on engine?

Piotr Kliczewski piotr.kliczewski at gmail.com
Mon Jul 13 08:12:47 UTC 2015


On Mon, Jul 13, 2015 at 5:57 AM, Shubhendu Tripathi <shtripat at redhat.com> wrote:
> On 07/12/2015 09:53 PM, Omer Frenkel wrote:
>>
>>
>> ----- Original Message -----
>>>
>>> From: "Liron Aravot" <laravot at redhat.com>
>>> To: "Ryan Groten" <Ryan.Groten at stantec.com>
>>> Cc: users at ovirt.org
>>> Sent: Sunday, July 12, 2015 5:44:28 PM
>>> Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on
>>> engine?
>>>
>>>
>>>
>>> ----- Original Message -----
>>>>
>>>> From: "Ryan Groten" <Ryan.Groten at stantec.com>
>>>> To: users at ovirt.org
>>>> Sent: Friday, July 10, 2015 10:45:11 PM
>>>> Subject: [ovirt-users] Concerns with increasing vdsTimeout value on
>>>> engine?
>>>>
>>>>
>>>>
>>>> When I try to attach new direct lun disks, the scan takes a very long time
>>>> to complete because of the number of pvs presented to my hosts (there is
>>>> already a bug on this, related to the pvcreate command taking a very long
>>>> time - https://bugzilla.redhat.com/show_bug.cgi?id=1217401 )
>>>>
>>>>
>>>>
>>>> I discovered a workaround by setting the vdsTimeout value higher (it is 180
>>>> seconds by default). I changed it to 300 seconds and now the direct lun scan
>>>> returns properly, but I’m hoping someone can warn me if this workaround is
>>>> safe or if it’ll cause other potential issues? I made this change yesterday
>>>> and so far so good.
>>>>
>>> Hi, no serious issue can be caused by that.
>>> Keep in mind though that every other operation will also get that amount of
>>> time to complete before failing on timeout, since the timeout was increased
>>> for all executions. When something is not up and operational as expected
>>> (unlike most of the time), failures will take longer to be reported.
>>> I'd guess that an RFE could be opened to allow increasing the timeout of
>>> specific operations if a user wants to do that.
>>>
>>> thanks,
>>> Liron.
>>
>> If you have HA VMs and use power management (fencing),
>> this might cause longer downtime for HA VMs if the host has network timeouts:
>> the engine waits for 3 network failures before trying to fence the host,
>> so with the timeout increased to 5 minutes, you should expect 15 minutes
>> before the engine decides the host is non-responsive and fences it.
>> If you have an HA VM on this host, that will be the VM's downtime as well,
>> as the engine restarts HA VMs only after fencing.
>>
>> you can read more on
>> http://www.ovirt.org/Automatic_Fencing
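
To make the arithmetic above concrete, here is a rough sketch of the worst case, assuming every one of the 3 network failures takes the full vdsTimeout before the engine fences. The function and parameter names are illustrative only, not engine settings or engine code.

```python
# Worst-case delay before fencing, using the numbers from this thread.
# Assumption: each of the 3 failed calls waits the full timeout.
# Names here are hypothetical and exist only for this example.

def worst_case_fence_delay(vds_timeout_s: int, failures_before_fence: int = 3) -> int:
    """Seconds the engine may wait before it decides to fence the host."""
    return vds_timeout_s * failures_before_fence


for timeout in (180, 300):
    delay = worst_case_fence_delay(timeout)
    print(f"vdsTimeout={timeout}s -> up to {delay // 60} min before fencing")
```

With the default of 180 seconds this gives 9 minutes, and with 300 seconds it gives the 15 minutes mentioned above.
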
>
>
> I have a similar need: when I try to delete all 256 gluster volume
> snapshots using a single gluster CLI command, the engine times out.
> So, as Liron suggested, it would be better if we could set the timeout at
> the VDSM verb level. The caller would need to use the feature
> judiciously :)
>

Please open an RFE for being able to set the operation timeout for a single
command call, with a description of the use cases for which
you would like to set the timeout.
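
As a starting point for such an RFE, the per-operation idea could be sketched roughly as below. This is only an illustration of the lookup logic, not how the engine or VDSM works today; the operation names and the override values are made up, and only the 180 second default comes from this thread.

```python
# Hypothetical per-operation timeout lookup with a global fallback.
# Nothing here is an existing engine or VDSM API.

DEFAULT_TIMEOUT_S = 180  # global default mentioned in this thread

# Made-up operation names and illustrative values.
OPERATION_TIMEOUTS_S = {
    "scan_direct_luns": 300,
    "delete_all_gluster_snapshots": 600,
}


def timeout_for(operation: str) -> int:
    """Return the override for an operation, or the global default."""
    return OPERATION_TIMEOUTS_S.get(operation, DEFAULT_TIMEOUT_S)


print(timeout_for("scan_direct_luns"))
print(timeout_for("some_other_operation"))
```

The point of this shape is that slow operations get more time while everything else, including the calls that feed into fencing decisions, keeps the shorter default.
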

>
>>
>>>> Thanks,
>>>>
>>>> Ryan
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
