[ovirt-users] Seamless SAN HA failovers with oVirt?

Alex Crow acrow at integrafin.co.uk
Tue Jun 6 19:11:55 UTC 2017


I use Open-E in production on standard Intel (Supermicro) hardware. It
can work in A/A mode (only from oVirt's point of view, i.e. one LUN is
normally active on one server while the other LUN normally stays on the
other node) or in A/P mode with multipath. Even in A/P mode it fails
over quickly enough to avoid VM pauses, using virtual IPs that float
between the nodes. These modes are supported for both iSCSI and NFS.

I've also successfully implemented the same kind of rapid failover using
standard Linux HA tools (Pacemaker and Corosync). I've had migration
times under 2s.
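
A rough way to check a failover time like this on your own setup is to
probe the floating IP while triggering a test failover. The following is
a minimal Python sketch under stated assumptions, not the method used
above: probe, watch and longest_outage are hypothetical helper names,
and the address and port in the usage note are placeholders.

```python
import socket
import time


def probe(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Return the longest outage, in seconds, seen in the samples.

    samples is a time-ordered list of (timestamp, ok) pairs. An outage is
    measured from the first failed probe to the next successful one; an
    outage still in progress at the end of the list is not counted.
    """
    longest = 0.0
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts          # outage starts here
        elif ok and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None        # service is back
    return longest


def watch(host, port, duration=60.0, interval=0.2):
    """Probe host:port repeatedly for about `duration` seconds."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), probe(host, port)))
        time.sleep(interval)
    return samples
```

For example, longest_outage(watch("192.0.2.10", 3260)) would watch an
iSCSI portal (TCP 3260) on a placeholder virtual IP for a minute. The
result is only as precise as the probe interval plus the connect
timeout, and a TCP connect succeeding does not prove that I/O has
resumed, so treat the number as a lower bound on what the VMs see.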

NFS has the added complication of filesystem locking. Maybe some of the
docs on the CTDB site will help, as they describe how to ensure that NFS
runs on the same ports on each host and that the locking DBs are shared
between the two hosts. I have no idea if TrueNAS supports CTDB or
similar distributed locking mechanisms.

Caveat: this is with iSCSI resources. I've not really run VMs in oVirt 
in anger against any kind of NFS storage yet. My boss wants to try 
Tintri, so I'll see how that works.

Cheers

Alex

On 06/06/17 18:45, Matthew Trent wrote:
> Thanks for the replies, all!
>
> Yep, Chris is right. TrueNAS HA is active/passive and there isn't a way around that when failing between heads.
>
> Sven: In my experience with iX support, they have directed me to reboot the active node to initiate failover. There are "hactl takeover" and "hactl giveback" commands, but reboot seems to be their preferred method.
>
> VMs going into a paused state and resuming when storage is back online sounds great. As long as oVirt's pause/resume isn't significantly slower than the 30-or-so seconds the TrueNAS takes to complete its failover, that's a pretty tolerable interruption for my needs. So my next questions are:
>
> 1) Assuming the SAN failover DOES work correctly, can anyone comment on their experience with oVirt pausing/thawing VMs in an NFS-based active/passive SAN failover scenario? Does it work reliably without intervention? Is it reasonably fast?
>
> 2) Is there anything else in the oVirt stack that might cause it to "freak out" rather than gracefully pause/unpause VMs?
>
> 2a) Particularly: I'm running hosted engine on the same TrueNAS storage. Does that change anything WRT timeouts and oVirt's HA and fencing and sanlock and such?
>
> 2b) Is there a limit to how long oVirt will wait for storage before doing something more drastic than just pausing VMs?
>
> --
> Matthew Trent
> Network Engineer
> Lewis County IT Services
> 360.740.1247 - Helpdesk
> 360.740.3343 - Direct line
>
> ________________________________________
> From: users-bounces at ovirt.org <users-bounces at ovirt.org> on behalf of Chris Adams <cma at cmadams.net>
> Sent: Tuesday, June 6, 2017 7:21 AM
> To: users at ovirt.org
> Subject: Re: [ovirt-users] Seamless SAN HA failovers with oVirt?
>
> Once upon a time, Juan Pablo <pablo.localhost at gmail.com> said:
>> Chris, if you have active-active with multipath: you upgrade one system,
>> reboot it, check it came active again, then upgrade the other.
> Yes, but that's still not how a TrueNAS (and most other low- to
> mid-range SANs) works, so it is not relevant.  The TrueNAS only has a
> single active node talking to the hard drives at a time, because having
> two nodes talking to the same storage at the same time is a hard problem
> to solve (typically requires custom hardware with active cache coherency
> and such).
>
> You can (and should) use multipath between servers and a TrueNAS, and
> that protects against NIC, cable, and switch failures, but does not help
> with a controller failure/reboot/upgrade.  Multipath is also used to
> provide better bandwidth sharing between links than ethernet LAGs.
>
> --
> Chris Adams <cma at cmadams.net>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users


