[Users] Cluster, hosts, everything down and totally stuck

My cluster is totally down and I'm completely stuck.

I cannot put a host into maintenance or stop a VM, because the VM is in an unknown state, and so is the host it runs on. A newly added node cannot find the storage because everything is reported as down, although the NFS is up.

On the node that is "unknown" all interfaces are up; on the newly added host, oVirt reports all interfaces as down, but they are actually up.

Because of this I'm stuck between 2 or 4 walls and I can't see a way to fix this.

What are my options? I'm thinking of setting the VM that is stuck on my existing host to off by editing the oVirt DB.

I hope we can sort this out.

Cheers,

Matt

----- Original Message -----
From: "Matt ." <yamakasi.014@gmail.com>
To: "users" <users@ovirt.org>
Sent: Thursday, March 7, 2013 12:03:02 PM
Subject: [Users] Cluster, hosts, everything down and totally stuck

Hi,

It sounds like your host is non-responsive. Can you ping that host from the engine machine? Does vdsm run OK on the host? (You can check by running this on the host: vdsClient 0 getVdsCapabilities)
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
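
The two checks suggested above (can the engine reach the host, and does VDSM answer on the host) could be scripted roughly as follows. This is only a minimal sketch, not an oVirt tool: the function names and the `runner` parameter are made up for illustration, `ping -c` assumes a Linux ping, and the vdsClient command line is copied as written in the message above. It also assumes vdsClient exits with a non-zero status when it cannot talk to VDSM; on hosts where VDSM uses SSL the client may need additional options.

```python
import subprocess


def command_ok(argv, runner=subprocess.run):
    """Return True if argv runs and exits with status 0.

    `runner` is a hypothetical hook so the check can be tested
    without actually executing anything.
    """
    try:
        result = runner(argv, stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing or command hung: count it as a failed check.
        return False
    return result.returncode == 0


def ping_argv(host, count=3):
    """Build the ping command to run on the engine machine."""
    return ["ping", "-c", str(count), host]


# To be run on the host itself; copied from the message above.
# Assumption: a non-zero exit status means VDSM did not answer.
VDSM_CHECK_ARGV = ["vdsClient", "0", "getVdsCapabilities"]
```

On the engine machine one would call `command_ok(ping_argv("<host address>"))`, and on the host `command_ok(VDSM_CHECK_ARGV)`; a `False` from the second call points at VDSM rather than at the network.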

> does vdsm run ok on the host (you can check by running this in the host: vdsClient 0 getVdsCapabilities)

I've been seeing this issue on my PoC box I've been tinkering on - for some reason the VDSM daemon is not starting up at boot (need to look into the cause of this), but once I manually start/restart it, the rest seems to carry on OK.

- J
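
A rough way to tell "VDSM is not enabled at boot" apart from "VDSM is enabled but not running" is sketched below. It assumes a systemd-based host where the daemon is managed as the `vdsmd` service; on SysV-init hosts the equivalent tools would be `chkconfig` and `service`. The helper names are invented for this example, and the sketch does not explain why a service that is enabled still fails to come up.

```python
import subprocess

SERVICE = "vdsmd"  # assumed service name for the VDSM daemon


def systemctl_argv(action, service=SERVICE):
    """Build a systemctl command line, e.g. systemctl is-active vdsmd."""
    return ["systemctl", action, service]


def service_state(runner=subprocess.run, service=SERVICE):
    """Return (enabled_at_boot, running_now) as two booleans.

    Relies on `systemctl is-enabled` / `is-active` exiting with 0
    when the answer is yes.
    """
    def ok(action):
        try:
            result = runner(systemctl_argv(action, service),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
        except OSError:
            # systemctl not available on this machine
            return False
        return result.returncode == 0

    return ok("is-enabled"), ok("is-active")


def suggested_commands(enabled, active, service=SERVICE):
    """List the commands one would probably run next (as root)."""
    commands = []
    if not enabled:
        commands.append(systemctl_argv("enable", service))
    if not active:
        commands.append(systemctl_argv("start", service))
    return commands
```

If `service_state()` reports the service as enabled but not running after a reboot, the next place to look would be the service's own logs rather than its boot configuration.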

Jaco <ubuntumuntu@gmail.com> wrote:

> > does vdsm run ok on the host (you can check by running this in the host: vdsClient 0 getVdsCapabilities)
>
> I've been seeing this issue on my PoC box I've been tinkering on - for some reason the VDSM daemon is not starting up at boot (need to look into the cause of this), but once I manually start/restart it, the rest seems to carry on OK
>
> - J

Hi Jaco,

Check if unused NICs are using DHCP?

Joop
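
One way to run that check is to look for interface configuration files that request DHCP. The sketch below assumes the host keeps its network configuration in `ifcfg-*` files under /etc/sysconfig/network-scripts (the usual location on Fedora and EL hosts, but an assumption here) and that DHCP is selected with a `BOOTPROTO=dhcp` line. The function names are made up for illustration.

```python
import glob
import os


def uses_dhcp(ifcfg_text):
    """Return True if ifcfg-style text ends up with BOOTPROTO=dhcp."""
    dhcp = False
    for line in ifcfg_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().upper() == "BOOTPROTO":
            # A later assignment overrides an earlier one.
            dhcp = value.strip().strip("\"'").lower() == "dhcp"
    return dhcp


def dhcp_interfaces(directory="/etc/sysconfig/network-scripts"):
    """Return the names of interfaces whose ifcfg file asks for DHCP."""
    found = []
    for path in sorted(glob.glob(os.path.join(directory, "ifcfg-*"))):
        with open(path, errors="replace") as handle:
            if uses_dhcp(handle.read()):
                found.append(os.path.basename(path)[len("ifcfg-"):])
    return found
```

Any interface returned by `dhcp_interfaces()` that has no cable or no DHCP server behind it would be a candidate to switch to a static or disabled configuration.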

> Check if unused nics are using dhcp?

Nope - I have 2 NICs, both statically configured. The 1st NIC is for the bridge, the 2nd is dedicated to storage traffic; direct UTP to my NAS.

I have a suspicion that there's an issue with the storage not coming up exactly as expected. I'm considering addressing this by using local storage rather than iSCSI, and then using LVM to mirror the local disk to the iSCSI storage.
(^ detailed in another message)

- J
participants (4)
- Jaco
- Joop
- Matt .
- Omer Frenkel