[ovirt-users] Concerns with increasing vdsTimeout value on engine?

Shubhendu Tripathi shtripat at redhat.com
Mon Jul 13 08:25:15 UTC 2015


On 07/13/2015 01:42 PM, Piotr Kliczewski wrote:
> On Mon, Jul 13, 2015 at 5:57 AM, Shubhendu Tripathi <shtripat at redhat.com> wrote:
>> On 07/12/2015 09:53 PM, Omer Frenkel wrote:
>>>
>>> ----- Original Message -----
>>>> From: "Liron Aravot" <laravot at redhat.com>
>>>> To: "Ryan Groten" <Ryan.Groten at stantec.com>
>>>> Cc: users at ovirt.org
>>>> Sent: Sunday, July 12, 2015 5:44:28 PM
>>>> Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on
>>>> engine?
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: "Ryan Groten" <Ryan.Groten at stantec.com>
>>>>> To: users at ovirt.org
>>>>> Sent: Friday, July 10, 2015 10:45:11 PM
>>>>> Subject: [ovirt-users] Concerns with increasing vdsTimeout value on
>>>>> engine?
>>>>>
>>>>>
>>>>>
>>>>> When I try to attach new direct lun disks, the scan takes a very long
>>>>> time to complete because of the number of pvs presented to my hosts
>>>>> (there is already a bug on this, related to the pvcreate command taking
>>>>> a very long time:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1217401 )
>>>>>
>>>>>
>>>>>
>>>>> I discovered a workaround: setting the vdsTimeout value higher (it is
>>>>> 180 seconds by default). I changed it to 300 seconds and now the direct
>>>>> lun scan returns properly, but I’m hoping someone can tell me whether
>>>>> this workaround is safe or whether it’ll cause other issues. I made
>>>>> this change yesterday and so far so good.
>>>>>
>>>> Hi, no serious issue can be caused by that.
>>>> Keep in mind, though, that every other operation will now also have that
>>>> amount of time to complete before failing on timeout. Because the
>>>> timeout was increased for all executions, failures will take longer to
>>>> surface whenever something is not up and operational as expected (most
>>>> of the time everything is).
>>>> I'd guess that an RFE could be opened to allow increasing the timeout of
>>>> specific operations if a user wants to do that.
>>>>
>>>> thanks,
>>>> Liron.
>>> If you have HA VMs and use power management (fencing), this might cause
>>> longer downtime for HA VMs if the host has network timeouts.
>>> The engine waits for 3 network failures before trying to fence the host,
>>> so with the timeout increased to 5 minutes you should expect 15 minutes
>>> before the engine decides the host is non-responsive and fences it.
>>> If you have an HA VM on this host, that will be the VM downtime as well,
>>> because the engine restarts HA VMs only after fencing.
>>>
>>> you can read more on
>>> http://www.ovirt.org/Automatic_Fencing
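
The estimate above is simple arithmetic, sketched here in Python. The
function name and its parameters are made up for illustration; the only
inputs taken from this thread are the 3 network failures and the 180 s and
300 s timeout values. It ignores any other delay the engine may add before
fencing.

```python
def worst_case_fence_delay(vds_timeout_s: int, failures_before_fence: int = 3) -> int:
    """Seconds that may pass before the engine treats the host as non-responsive.

    Assumes each of the failed network calls runs into the full vdsTimeout.
    """
    return vds_timeout_s * failures_before_fence


if __name__ == "__main__":
    for timeout_s in (180, 300):
        delay_s = worst_case_fence_delay(timeout_s)
        print(f"vdsTimeout={timeout_s}s: up to {delay_s // 60} min before fencing")
```

With the default of 180 seconds this gives 9 minutes; with 300 seconds it
gives the 15 minutes mentioned above.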
>>
>> I have a similar need: when I try to delete all 256 gluster volume
>> snapshots using a single gluster CLI command, the engine times out.
>> So, as Liron suggested, it would be better if we could set the timeout at
>> the VDSM verb level. That would be the better option, and the caller
>> would need to use the feature judiciously :)
>>
> Please open an RFE for being able to set the operation timeout for a
> single command call, with a description of the use cases for which
> you would like to set the timeout.

Piotr,

I created an RFE BZ at https://bugzilla.redhat.com/show_bug.cgi?id=1242373.
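
To illustrate the idea only: a per-verb timeout could fall back to the
global value when no override is set. This is a rough Python sketch, not
engine or VDSM code; the verb names, the table and the function are all
hypothetical, and only the 180 second default comes from this thread.

```python
# Global default, matching the vdsTimeout default mentioned in this thread.
DEFAULT_TIMEOUT_S = 180

# Hypothetical per-verb overrides; these verb names are invented for
# illustration and are not real VDSM verbs.
VERB_TIMEOUTS_S = {
    "deleteAllVolumeSnapshots": 600,
    "scanDirectLuns": 300,
}


def timeout_for(verb: str) -> int:
    """Return the override for this verb, or the global default if none is set."""
    return VERB_TIMEOUTS_S.get(verb, DEFAULT_TIMEOUT_S)


if __name__ == "__main__":
    for verb in ("deleteAllVolumeSnapshots", "someOtherVerb"):
        print(f"{verb}: {timeout_for(verb)}s")
```

Only the long-running calls get more time, so the fencing delay for
everything else stays tied to the default.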

Thanks and Regards,
Shubhendu

>>>>> Thanks,
>>>>>
>>>>> Ryan
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users



