Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?

13 Jul 2015

      On 07/13/2015 01:42 PM, Piotr Kliczewski wrote:
...
On Mon, Jul 13, 2015 at 5:57 AM, Shubhendu Tripathi <shtripat@redhat.com> wrote:
...
On 07/12/2015 09:53 PM, Omer Frenkel wrote:
...
...
From: "Liron Aravot" <laravot@redhat.com>
To: "Ryan Groten" <Ryan.Groten@stantec.com>
Cc: users@ovirt.org
Sent: Sunday, July 12, 2015 5:44:28 PM
Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
----- Original Message -----
...
From: "Ryan Groten" <Ryan.Groten@stantec.com>
To: users@ovirt.org
Sent: Friday, July 10, 2015 10:45:11 PM
Subject: [ovirt-users] Concerns with increasing vdsTimeout value on engine?
When I try to attach new direct LUN disks, the scan takes a very long
time to complete because of the number of PVs presented to my hosts
(there is already a bug on this, related to the pvcreate command taking
a very long time: https://bugzilla.redhat.com/show_bug.cgi?id=1217401).
I discovered a workaround by setting the vdsTimeout value higher (it is
180 seconds by default). I changed it to 300 seconds and now the direct
LUN scan returns properly, but I'm hoping someone can tell me whether
this workaround is safe or whether it will cause other issues. I made
this change yesterday and so far so good.
Hi, no serious issue can be caused by that.
Keep in mind, though, that every other operation will also have that
amount of time to complete before failing on timeout. Because the
timeout was increased for all executions, failures will take longer to
surface when not everything is up and operational as expected (as it is
most of the time).
I'd guess that an RFE could be opened to allow increasing the timeout of
specific operations if a user wants to do that.
thanks,
Liron.
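To make the idea of a per-operation timeout concrete, here is a minimal sketch of a lookup that falls back to the global value. It is only an illustration: the operation name and the override table are hypothetical and are not part of the engine or VDSM. Only the 180-second default and the 300-second value come from this thread.

```python
# Global default, matching the vdsTimeout default mentioned in this thread.
DEFAULT_TIMEOUT_S = 180

# Hypothetical per-operation overrides; the key below is an illustrative
# name, not a real VDSM verb.
OPERATION_TIMEOUTS_S = {
    "scan_direct_luns": 300,
}


def timeout_for(operation: str) -> int:
    """Return the override for an operation, or the global default."""
    return OPERATION_TIMEOUTS_S.get(operation, DEFAULT_TIMEOUT_S)


print(timeout_for("scan_direct_luns"))
print(timeout_for("any_other_operation"))
```

With something along these lines, only the slow scan would get the longer limit, and every other call would still fail after the default timeout.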
If you have HA VMs and use power management (fencing), this might cause
longer downtime for HA VMs if the host has network timeouts:
the engine waits for 3 network failures before trying to fence the
host, so if the calls time out and the timeout is increased to 5 minutes,
you should expect 15 minutes before the engine decides the host is
non-responsive and fences it.
If you have an HA VM on this host, that will be the VM downtime as well,
as the engine restarts HA VMs only after fencing.
You can read more at
http://www.ovirt.org/Automatic_Fencing
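The arithmetic above can be sketched in a few lines. This assumes, as described in this message, that the engine waits for 3 failed attempts and that each attempt runs for the full vdsTimeout. The function name is made up for illustration, and real behaviour may include other delays not modelled here.

```python
def worst_case_fence_delay(vds_timeout_s: int, failures_before_fence: int = 3) -> int:
    """Seconds that may pass before fencing if every attempt times out."""
    return vds_timeout_s * failures_before_fence


# Compare the default (180 s) with the increased value (300 s).
for timeout in (180, 300):
    delay = worst_case_fence_delay(timeout)
    print(f"vdsTimeout={timeout}s -> up to {delay // 60} min before fencing")
```

Under these assumptions the default gives up to 9 minutes and the 300-second setting gives up to 15 minutes, which is also the expected HA VM downtime.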
I have a similar need: when I try to delete all 256 gluster volume
snapshots using a single gluster CLI command, the engine times out.
So, as Liron suggested, it would be better if we were able to set the
timeout at the VDSM verb level. That would be the better option, and the
caller needs to use the feature judiciously :)
Please open an RFE for being able to set the operation timeout for a
single command call, with a description of the use cases for which
you would like to set the timeout.
Piotr,

I created an RFE BZ at https://bugzilla.redhat.com/show_bug.cgi?id=1242373.

Thanks and Regards,
Shubhendu
Thanks,
Ryan
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users