From arsene.gschwind at unibas.ch Mon Jun 12 10:02:07 2017
From: Arsène Gschwind <arsene.gschwind at unibas.ch>
To: users at ovirt.org
Subject: [ovirt-users] SPM in case of Failure
Date: Mon, 12 Jun 2017 12:02:03 +0200
Message-ID: <558f2c75-9ed9-0cb4-7411-1e555ebeee94@unibas.ch>

Hi

Our setup looks like:

- 2 clusters at 2 different sites, connected by a 10 Gbit LAN
- Storage based on an FC SAN, replicated across both sites and available to both sites (the LUNs are reachable over 4 paths, 2 from each site)

My observation:

If one site goes down and that site owned the SPM, it is not possible to move or force the SPM to the second site.
For the site that is down, it's possible to reset all crashed VMs using the "Confirm Host rebooted" menu on the oVirt Host, but this does not reset the SPM.
The only solution I found was to bring the Host that owned the SPM back up, so that I could move the SPM to the other site and then reactivate the storage domains.

Is this normal behavior?
Is there any way to force an SPM re-election?

Thanks for any help or ideas.

Regards,
Arsène

--

Arsène Gschwind
Fa. Sapify AG im Auftrag der Universität Basel
IT Services
Klingelbergstr. 70 | CH-4056 Basel | Switzerland
Tel. +41 79 449 25 63 | http://its.unibas.ch
ITS-ServiceDesk: support-its(a)unibas.ch | +41 61 267 14 11

From mlipchuk at redhat.com Tue Jun 13 13:11:19 2017
From: Maor Lipchuk
To: users at ovirt.org
Subject: Re: [ovirt-users] SPM in case of Failure
Date: Tue, 13 Jun 2017 16:11:18 +0300
In-Reply-To: 558f2c75-9ed9-0cb4-7411-1e555ebeee94@unibas.ch

Hi Arsène,

See my comments inline.

On Mon, Jun 12, 2017 at 1:02 PM, Arsène Gschwind wrote:
> Hi
>
> Our setup looks like:
>
> - 2 clusters in 2 different site connected with 10GBit LAN
> - Storage based on FC SAN replicated on both site and available for both
> site (The LUNs are available over 4 pathes, 2 from each site)
>
> My observation:
>
> In case one site goes down and this site owned SPM is it not possible to
> move or force SPM on the second site.

It could be a sanlock issue.
The SPM uses sanlock on the storage domain, so once the SPM host has been
rebooted and its sanlock lock on the storage domain has been released
(after 80 seconds, if I'm not mistaken), another Host can obtain a lock on
that storage domain and become the new SPM.
What message do you get in the logs when you try to do that?

> On the site which is down it's possible to reset all VMs that crashed using
> the "Confirm Host rebooted" menu on the oVirt Host but this does not reset
> SPM.
> The only solution I found was to bring the Host which owned SPM up again to
> be able to move it to the other site and then reactivate the storage
> domains.

I would try to attach the storage domain (detach it first if it is
already attached) so that you can register any VMs/Templates/Disks that
were added in the original environment.

>
> Is this a normal behavior?
> Is there any way to force SPM reelection ?
>
> Thanks for your help or idea...
>
> Regards,
> Arsène
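
The lock behaviour described in the reply above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not sanlock's API or on-disk protocol, the class and method names (DomainLease, acquire) are hypothetical, and the 80-second timeout is simply the figure quoted in the reply ("if I'm not mistaken"), not a verified constant.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the 80 s figure comes from the reply in this thread and is unverified.
LEASE_TIMEOUT_S = 80.0


@dataclass
class DomainLease:
    """Toy model of an exclusive, time-limited lock on one storage domain."""

    owner: Optional[str] = None  # host currently holding the lock (the SPM)
    last_renewal: float = 0.0    # time in seconds of the owner's last renewal

    def is_free(self, now: float) -> bool:
        """The lock is free if nobody holds it or the holder stopped renewing."""
        return self.owner is None or (now - self.last_renewal) >= LEASE_TIMEOUT_S

    def acquire(self, host: str, now: float) -> bool:
        """Take or renew the lock; refuse while another host's lock is still valid."""
        if self.owner == host or self.is_free(now):
            self.owner = host
            self.last_renewal = now
            return True
        return False


lease = DomainLease()
lease.acquire("host-a", now=0.0)   # host-a takes the lock and acts as SPM
lease.acquire("host-b", now=30.0)  # refused: host-a's lock has not expired yet
lease.acquire("host-b", now=80.0)  # accepted: host-a has not renewed for 80 s
```

In this model a host on the surviving site can only take over once the failed host's lock has timed out, which is why the actual log messages from the failed takeover attempt matter for telling an expired lock apart from some other problem.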