
This about matches what we were thinking, thank you!

To answer your questions:

We do not have power management configured because it caused a cascading failure early in our deployment. The host was not fenced and "confirm host rebooted" was never used. The VMs were powered on via virsh (this shouldn't have happened).

Our thought is that the way they were powered on is most likely why they were corrupted.

Logan

On September 20, 2017 at 12:03 PM Michal Skrivanek <michal.skrivanek@redhat.com> wrote:

> On 20 Sep 2017, at 18:06, Logan Kuhn <support@jac-properties.com> wrote:
>
> > We had an incident where a VM host's disk filled up. The VMs all went unknown in the web console, but were fully functional if you were to log in or use the services of one.
>
> Hi,
> yes, that can happen, since the VM's storage is on NAS, whereas the server itself is non-functional because the management and all other local processes use local resources.
>
> > We couldn't migrate them, so we powered them down on that host, powered them up, and let oVirt choose the host for them, same as always.
>
> That's a mistake. The host should be fenced in that case; you likely do not have power management configured, do you? Even when you do not have a fencing device available, it should have been resolved manually by rebooting the host (after fixing the disk problem), or in case of permanent damage (e.g. the server needs to be replaced, that takes a week, and you need to run those VMs elsewhere in the meantime) it should have been powered off and the VM states reset by the "confirm host has been rebooted" manual action.
>
> Normally you should not be able to run those VMs while the status of the host is still Not Responding. Was that not the case? How exactly did you get to the situation where you were able to power up the VMs?
>
> > However, the disk images on a few of them were corrupted because once we fixed the host with the full disk, it still thought it should be running the VM, which promptly corrupted the disk. The error seems to be this in the logs:
>
> This can only happen for VMs flagged as HA. Is that the case?
>
> Thanks,
> michal
>
> > 2017-09-19 21:59:11,058 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler3) [36c806f6] VM '70cf75c7-0fc2-4bbe-958e-7d0095f70960'(testhub) is running in db and not running on VDS 'ef6dc2a3-af6e-4e00-aa40-493b31263417'(vm-int7)
> >
> > We upgraded to 4.1.6 from 4.0.6 earlier in the day. I don't really think it's anything more than coincidence, but it's worrying enough to send to the community.
> >
> > Regards,
> > Logan
> > _______________________________________________
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
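The engine log message quoted in the thread ("VM ... is running in db and not running on VDS ...") can be searched for mechanically when checking which VMs the engine believes are in that state. The following is a minimal Python sketch, assuming the message always has the shape of the single line quoted in the thread; other engine versions may word it differently. The names `Mismatch`, `MISMATCH_RE`, `find_mismatches` and `SAMPLE` are hypothetical helpers made up for this illustration and are not part of oVirt.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Mismatch(NamedTuple):
    """One 'running in db and not running on VDS' report (hypothetical type)."""
    timestamp: str
    vm_id: str
    vm_name: str
    host_id: str
    host_name: str


# Pattern modelled on the one log line quoted in the thread. This is an
# assumption: it is not taken from oVirt documentation or source code.
MISMATCH_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+INFO\s+"
    r"\[org\.ovirt\.engine\.core\.vdsbroker\.monitoring\.VmAnalyzer\].*?"
    r"VM '(?P<vm_id>[0-9a-f-]+)'\((?P<vm_name>[^)]*)\) is running in db "
    r"and not running on VDS '(?P<host_id>[0-9a-f-]+)'\((?P<host_name>[^)]*)\)"
)

# The log line from the thread, used here as sample input.
SAMPLE = (
    "2017-09-19 21:59:11,058 INFO  "
    "[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] "
    "(DefaultQuartzScheduler3) [36c806f6] "
    "VM '70cf75c7-0fc2-4bbe-958e-7d0095f70960'(testhub) is running in db "
    "and not running on VDS 'ef6dc2a3-af6e-4e00-aa40-493b31263417'(vm-int7)"
)


def find_mismatches(lines: Iterable[str]) -> Iterator[Mismatch]:
    """Yield one Mismatch for every line that matches MISMATCH_RE."""
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            yield Mismatch(**match.groupdict())


if __name__ == "__main__":
    for hit in find_mismatches([SAMPLE, "an unrelated log line"]):
        print(f"{hit.timestamp}: VM {hit.vm_name} not running on host {hit.host_name}")
```

Passing an open log file instead of the sample list works the same way, since the function only needs an iterable of lines. A hit only shows what the engine reported; it does not by itself say whether the VM is actually running somewhere else.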