Concerns with increasing vdsTimeout value on engine?

When I try to attach new direct lun disks, the scan takes a very long time to complete because of the number of pvs presented to my hosts (there is already a bug on this, related to the pvcreate command taking a very long time - https://bugzilla.redhat.com/show_bug.cgi?id=1217401)

I discovered a workaround by setting the vdsTimeout value higher (it is 180 seconds by default). I changed it to 300 seconds and now the direct lun scan returns properly, but I'm hoping someone can warn me if this workaround is safe or if it'll cause other potential issues? I made this change yesterday and so far so good.

Thanks,
Ryan

----- Original Message -----
From: "Ryan Groten" <Ryan.Groten@stantec.com>
To: users@ovirt.org
Sent: Friday, July 10, 2015 10:45:11 PM
Subject: [ovirt-users] Concerns with increasing vdsTimeout value on engine?

> When I try to attach new direct lun disks, the scan takes a very long time to complete because of the number of pvs presented to my hosts (there is already a bug on this, related to the pvcreate command taking a very long time - https://bugzilla.redhat.com/show_bug.cgi?id=1217401 )
>
> I discovered a workaround by setting the vdsTimeout value higher (it is 180 seconds by default). I changed it to 300 seconds and now the direct lun scan returns properly, but I’m hoping someone can warn me if this workaround is safe or if it’ll cause other potential issues? I made this change yesterday and so far so good.

Hi,
no serious issue can be caused by that. Keep in mind, though, that every other operation will also get that amount of time to complete before failing on timeout. Because the timeout was increased for all executions, failures will take longer to surface whenever not everything is operational and up as expected (as it is most of the time). I'd guess that an RFE could be opened to allow increasing the timeout of specific operations if a user wants to do that.

thanks,
Liron.

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

----- Original Message -----
From: "Liron Aravot" <laravot@redhat.com>
To: "Ryan Groten" <Ryan.Groten@stantec.com>
Cc: users@ovirt.org
Sent: Sunday, July 12, 2015 5:44:28 PM
Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?

> Hi, no serious issue can be caused by that. Keep in mind though that any other operation will have that amount of time to complete before failing on timeout - which will cause delays before failing (as the timeout was increased for all executions) when not everything is operational and up as expected (as in most of the time). I'd guess that a RFE could be opened to allow increasing the timeout of specific operations if a user want to do that.
>
> thanks, Liron.

If you have HA VMs and use power management (fencing), this might cause longer downtime for HA VMs if the host has network timeouts. The engine waits for 3 network failures before trying to fence the host, so with timeouts and the value increased to 5 minutes, you should expect about 15 minutes before the engine decides the host is non-responsive and fences it. If you have an HA VM on this host, this will be the VM downtime as well, because the engine restarts HA VMs only after fencing.

You can read more at http://www.ovirt.org/Automatic_Fencing
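
The arithmetic behind the 15 minute figure can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the message above: the engine waits for 3 failed calls before fencing, and each failed call takes the full vdsTimeout. The function name and its parameters are hypothetical and are not part of oVirt.

```python
def worst_case_fence_delay(vds_timeout_s: int, failures_before_fence: int = 3) -> int:
    """Seconds spent waiting on timed-out calls before fencing can start.

    Assumes every failed call runs into the full timeout (worst case).
    """
    return vds_timeout_s * failures_before_fence


# Compare the default (180 s) with the raised value from the thread (300 s).
for timeout in (180, 300):
    delay = worst_case_fence_delay(timeout)
    print(f"vdsTimeout={timeout}s: about {delay // 60} min before fencing")
```

With the values from the thread this gives roughly 9 minutes for the 180 second default and 15 minutes for 300 seconds, which is also the extra downtime an HA VM on that host would see.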

On 07/12/2015 09:53 PM, Omer Frenkel wrote:
> if you have HA vms and use power management (fencing), this might cause longer downtime for HA vms if host has network timeouts: the engine will wait for 3 network failures before trying to fence the host, so in case of timeouts, and increasing it to 5mins, you should expect 15mins before engine will decide host is non-responsive and fence, so if you have HA vm on this host, this will be the vm downtime as well, as the engine will restart HA vms only after fencing.
>
> you can read more on http://www.ovirt.org/Automatic_Fencing

I have a similar need: when I try to delete all 256 gluster volume snapshots using a single gluster CLI command, the engine times out. So, as Liron suggested, it would be better if we were able to set the timeout at the VDSM verb level. That would be the better option, and the caller needs to use the feature judiciously :)

On Mon, Jul 13, 2015 at 5:57 AM, Shubhendu Tripathi <shtripat@redhat.com> wrote:
> Even I am in a need where, I try to delete all the 256 gluster volume snapshots using a single gluster CLI command, and engine gets timed out. So, as Liron suggested it would be better if at VDSM verb level we are able to set timeout. That would be better option and caller needs to use the feature judicially :)

Please open an RFE for being able to set the operation timeout for a single command call, with a description of the use cases for which you would like to set the timeout.

On 07/13/2015 01:42 PM, Piotr Kliczewski wrote:
> Please open a RFE for being able to set operation timeout for single command call with description of use cases for which you would like to set the timeout.

Piotr,

I created an RFE BZ at https://bugzilla.redhat.com/show_bug.cgi?id=1242373.

Thanks and Regards,
Shubhendu

Thanks for the responses everyone and for the RFE. I do use HA in some places at the moment, but I do see another timeout value called vdsConnectionTimeout. Would HA use this value (set to 2 by default) or vdsTimeout when attempting to contact the host?

-----Original Message-----
From: Shubhendu Tripathi [mailto:shtripat@redhat.com]
Sent: Monday, July 13, 2015 2:25 AM
To: Piotr Kliczewski
Cc: Omer Frenkel; Groten, Ryan; users@ovirt.org
Subject: Re: [ovirt-users] Concerns with increasing vdsTimeout value on engine?

Piotr,

I created an RFE BZ at https://bugzilla.redhat.com/show_bug.cgi?id=1242373.

Thanks and Regards,
Shubhendu

On Mon, Jul 13, 2015 at 5:12 PM, Groten, Ryan <Ryan.Groten@stantec.com> wrote:
> Thanks for the responses everyone and for the RFE. I do use HA in some places at the moment, but I do see another timeout value called vdsConnectionTimeout. Would HA use this value (set to 2 by default) or vdsTimeout when attempting to contact the host?

There is a difference between the two:

vdsConnectionTimeout - a timeout used while connecting to a remote host. By default it is 2 seconds.
vdsTimeout - a high-level command invocation timeout used by all commands. By default it is 3 minutes.

As far as I understand, you are looking for a way to customize vdsTimeout for some of the commands.
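
To illustrate the distinction, here is a minimal Python sketch of a client that applies one short timeout while connecting and a separate, longer timeout while waiting for the reply to a command. It is not the engine's implementation. The function send_command, its parameters and the raw-bytes exchange are assumptions made for illustration; only the default values (2 seconds and 180 seconds) come from the thread.

```python
import socket


def send_command(host, port, payload, connect_timeout=2.0, command_timeout=180.0):
    """Send payload and return the reply, using two separate timeouts.

    connect_timeout plays the role of vdsConnectionTimeout,
    command_timeout plays the role of vdsTimeout (both hypothetical here).
    """
    # Connection phase: give up quickly if the host cannot be reached at all.
    with socket.create_connection((host, port), timeout=connect_timeout) as sock:
        # Command phase: a slow operation gets much longer to answer.
        sock.settimeout(command_timeout)
        sock.sendall(payload)
        # Raises a timeout error if no reply arrives within command_timeout.
        return sock.recv(4096)
```

In this sketch, raising command_timeout does not change how quickly an unreachable host is detected at connect time, but it does lengthen the wait for a host that accepts the connection and then never answers.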

On 07/14/2015 12:35 PM, Piotr Kliczewski wrote:
> There is a difference between the two:
>
> vdsConnectionTimeout - is a timeout used during connecting to a remote host. By default it is 2 seconds.
> vdsTimeout - high level command invocation timeout used by all commands. By default it is 3 minutes.
>
> As far as I understand you are looking for a possibility to customize vdsTimeout for some of the commands.

For me, yes, the case is to have an option to set a higher vdsTimeout value for a specific command.
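
The per-command timeout requested in the RFE could look roughly like the following Python sketch: a global default with optional overrides for individual commands. This is a guess at the shape of such a feature, not something that exists in the engine or VDSM. The command names and the 600 second value are made up for illustration; 180 and 300 are the values mentioned in the thread.

```python
# Global default, matching the vdsTimeout default quoted in the thread.
DEFAULT_VDS_TIMEOUT_S = 180

# Hypothetical per-command overrides; these command names are invented.
COMMAND_TIMEOUTS_S = {
    "scan_direct_luns": 300,
    "delete_all_gluster_snapshots": 600,
}


def timeout_for(command: str) -> int:
    """Return the override for a command, or the global default if none is set."""
    return COMMAND_TIMEOUTS_S.get(command, DEFAULT_VDS_TIMEOUT_S)


print(timeout_for("scan_direct_luns"))  # uses the override
print(timeout_for("get_vm_stats"))      # falls back to the default
```

With a lookup like this, only the slow operations wait longer, so the fencing delay for unresponsive hosts would stay tied to the unchanged default.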
participants (5)
- Groten, Ryan
- Liron Aravot
- Omer Frenkel
- Piotr Kliczewski
- Shubhendu Tripathi