EqualLogic SAN controller switchover

How well does oVirt handle an EqualLogic SAN controller switchover event? IIRC that can result in a short iSCSI "pause" (can't remember how long it takes) - I'm not sure what oVirt's threshold before VMs (including the hosted engine) get paused for storage timeouts. I've got a small setup where the active SAN controller's battery has gone bad, so I need to switch to the other controller, and I'm trying to figure out the impact - do I need to shut all VMs (including the engine) down first, will they just briefly pause and then continue, etc. -- Chris Adams <cma@cmadams.net>

Since EqualLogic arrays are active/passive only, it can be a pretty long pause. Dell recommends setting your path parameters to allow for 1 minute. (By the way, there's a reason why EqualLogic is dead now.)
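For a rough feel of what "allow for 1 minute" could mean in multipath terms, here is a minimal Python sketch. It assumes that multipathd keeps queueing I/O for roughly `no_path_retry` x `polling_interval` seconds once all paths are down; the 5-second polling interval is an assumed example value, not Dell's or oVirt's setting, and the helper names are made up for illustration.

```python
import math


def retries_for_window(window_s: float, polling_interval_s: float) -> int:
    """Smallest retry count whose approximate queueing time covers window_s.

    Assumption: queueing time ~= retries * polling_interval_s.
    """
    if window_s <= 0 or polling_interval_s <= 0:
        raise ValueError("both arguments must be positive")
    return math.ceil(window_s / polling_interval_s)


def queue_time_s(retries: int, polling_interval_s: float) -> float:
    """Approximate time I/O stays queued before errors are returned."""
    return retries * polling_interval_s


if __name__ == "__main__":
    polling = 5   # assumed polling interval in seconds (hypothetical)
    window = 60   # the 1 minute mentioned above
    needed = retries_for_window(window, polling)
    print(f"retries needed: {needed}, "
          f"queueing time: {queue_time_s(needed, polling)} s")
```

Check the values actually in effect on your hosts before relying on numbers like these; the sketch only shows the arithmetic.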

Hi Chris,
Do you have an idea how long it will take?
Keep in mind that if the domain is declared unavailable (once it reaches a threshold, which I don't know), all VMs using it will be paused, and oVirt will try to recover them once the storage is available again. For HA VMs the behaviour could be different (though I lack the experience to give you more details).
Best Regards,
Strahil Nikolov

Once upon a time, Strahil Nikolov <hunter86_bg@yahoo.com> said:
> Do you have an idea how long it will take?
No, it has been years and years since I had to do a switchover on an EqualLogic (they mostly just run). I know I've read of others using EqualLogics for oVirt, so I'm hoping for someone who's experienced a switchover...
> Keep in mind that if the domain is declared unavailable (once it reaches a threshold, which I don't know), all VMs using it will be paused, and oVirt will try to recover them once the storage is available again.
Right - with the hosted engine on the SAN, I am also curious how that is impacted (how will the engine HA tooling handle a pause).
-- Chris Adams <cma@cmadams.net>

I worked with EqualLogic in the past. Doesn't it have the ability to replicate to a partner EqualLogic? If so, replicate, and once that completes, fail over to the new one. That may be a simplistic approach, and I'm not sure whether it's a Dell best practice. Just a thought.
Eric Evans
Digital Data Services LLC.
304.660.9080
-----Original Message-----
From: Chris Adams <cma@cmadams.net>
Sent: Thursday, February 27, 2020 12:55 PM
To: users@ovirt.org
Subject: [ovirt-users] Re: EqualLogic SAN controller switchover
Once upon a time, Strahil Nikolov <hunter86_bg@yahoo.com> said:
> Do you have an idea how long it will take?
No, it has been years and years since I had to do a switchover on an EqualLogic (they mostly just run). I know I've read of others using EqualLogics for oVirt, so I'm hoping for someone who's experienced a switchover...
> Keep in mind that if the domain is declared unavailable (once it reaches a threshold, which I don't know), all VMs using it will be paused, and oVirt will try to recover them once the storage is available again.
Right - with the hosted engine on the SAN, I am also curious how that is impacted (how will the engine HA tooling handle a pause).
-- Chris Adams <cma@cmadams.net>
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/F44KYWA6SPMFUR...

This might help: https://www.dell.com/community/EqualLogic/Linux-multipath-conf-Multipathd/td...
What the first poster said about Dell's recommendations is accurate (I'm a long-time EqualLogic user).

On February 27, 2020 8:49:16 PM GMT+02:00, Christopher Cox <ccox@endlessnow.com> wrote:
> This might help: https://www.dell.com/community/EqualLogic/Linux-multipath-conf-Multipathd/td...
> What the first poster said about Dell's recommendations is accurate (I'm a long-time EqualLogic user).
I don't think that it will work. oVirt's internal mechanisms do not rely on multipath to propagate an error up the stack; instead, if oVirt cannot read or write within the defined threshold, it will still assume that the storage domain is down. At least that was discussed on the mailing list some time ago, and I could be wrong.
Best Regards,
Strahil Nikolov
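To illustrate the kind of check described above (this is not vdsm's actual code), here is a minimal Python sketch of a monitor that judges a storage domain by whether a probe I/O completes within a threshold, regardless of what multipath reports. The function name, the `probe` callable, and the threshold values are all hypothetical.

```python
import time
from typing import Callable


def domain_is_up(probe: Callable[[], None],
                 threshold_s: float,
                 clock: Callable[[], float] = time.monotonic) -> bool:
    """Return True if probe() completes without an I/O error within threshold_s.

    Note: this measures the elapsed time after probe() returns; it does not
    interrupt a probe that hangs. A real monitor would need its own timeout.
    """
    start = clock()
    try:
        probe()  # e.g. a small read from the storage domain (hypothetical)
    except OSError:
        return False  # an explicit I/O error counts as "down"
    return (clock() - start) <= threshold_s
```

The point of the sketch is that queued I/O which eventually succeeds can still be judged "down" if it takes longer than the monitor's threshold, which is why a long multipath queueing window alone may not keep the domain active.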

On February 27, 2020 8:34:07 PM GMT+02:00, eevans@digitaldatatechs.com wrote:
> I worked with EqualLogic in the past. Doesn't it have the ability to replicate to a partner EqualLogic? If so, replicate, and once that completes, fail over to the new one. That may be a simplistic approach, and I'm not sure whether it's a Dell best practice. Just a thought.
If libvirt cannot write to the VM's storage, the VM will be paused. Then, several minutes later, the ovirt-ha-agent instances will realize they can't reach the status page of the engine, and they will destroy and restart the HostedEngine VM. Then the engine will try to activate the storage domains, and I'm not sure whether it will be able to realize that the VMs were paused. At the least, it should find your VMs paused and try to unpause them (as the storage domain has recovered).
I think it will be safer, if you have another storage domain available, to migrate your VMs there temporarily via storage migration and, after the change, move them back if needed.
Best Regards,
Strahil Nikolov

I run oVirt in my homelab and have an EqualLogic PS6000. I did this the other week, and it took about 8 to 10 minutes for the VMs to start coming back. I run my engine off the SAN, and it had some trouble coming back because oVirt was in a lease-lock situation. The engine VM did pause, but the switchover took too long and it was killed. The other HA VMs were just killed and restarted after the storage issue. It did sort itself out, but the EqualLogic took its time to do the controller switchover.

Hi Chris,
see my considerations and my notes on EQL configuration in the archives, related to APD (All Paths Down), where one of the scenarios was indeed a controller switchover: https://lists.ovirt.org/archives/list/users@ovirt.org/message/2R2ITNSOC67YTJ...
I also pointed out some considerations about what vSphere does for disk timeouts on Linux when installing VMware Tools. Unfortunately there was no answer, but that was in August last year, so perhaps now we can revive the discussion and get some feedback ;-)
See also my further considerations and test scenarios from October, with Francesco Romani and Strahil: https://lists.ovirt.org/archives/list/users@ovirt.org/thread/BQA7NFCNREU5FX7... and https://lists.ovirt.org/archives/list/users@ovirt.org/thread/4DPJR7HGNDC45BJ...
No happiness... And in my case I have an external engine and not a hosted engine, which in this case of planned maintenance on EQL could suffer worse effects.
Recently I configured EQL replication, which was used during a planned datacenter maintenance activity.
Gianluca
Participants (6):
- Chris Adams
- Christopher Cox
- Daniel Rix
- eevans@digitaldatatechs.com
- Gianluca Cecchi
- Strahil Nikolov