all hosts non-operational

Hello,

After testing replacing a failed Gluster brick (shared oVirt/Gluster), ALL hosts in the cluster go non-responsive, storage drops off, etc. Now gluster peer status fails, we can't set any volume options, and the volume randomly drops out of oVirt (it was created from oVirt); the log in the oVirt dashboard shows an entry saying the volume was deleted (but it is still there). Any gluster commands just hang.

The combination of oVirt & Gluster seems stable until there's a problem, then literally everything just grinds to a halt. All VMs go down, the datacenter & hosts go non-responsive and the whole thing is broken.

Any ideas on what we should be looking for?

Looks like there is some network connectivity issue between the hosts. Because of this, some hosts might not be aware of the existence of the volume, hence it is removed from oVirt. Do you see any errors in the gluster log file when gluster commands fail?

Thanks,
Kanagaraj

On 12/22/2014 01:50 PM, Brent Hartzell wrote:
Hello,
After testing replacing a failed Gluster brick (shared ovirt/gluster) ALL hosts in the cluster go non-responsive, storage drops off etc. Now, gluster peer status fails, can't set any volume options, the volume randomly drops out of oVirt (was created from oVirt), log in oVirt dashboard shows the entry that the volume was deleted (but is there). Any gluster commands just hang. The combination of Ovirt & Gluster seems stable until there's a problem, then literally everything just grinds to a halt. All VMs go down, datacenter & hosts go non-responsive and the whole thing is broke.. Any ideas on what we should be looking for?
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
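For anyone following up on the two checks suggested in the reply (connectivity between hosts, and errors in the gluster log), here is a rough Python sketch, not something posted in the thread. It assumes glusterd listens on its default management port 24007 and that log lines begin with a bracketed timestamp followed by a one-letter severity (E for error, C for critical). The function names, the peer host names and the log path passed to `report` are illustrative assumptions and should be adapted to the actual setup.

```python
import re
import socket

GLUSTERD_PORT = 24007  # assumed: default glusterd management port

# Assumed log line shape:
# [2014-12-22 08:20:15.123456] E [socket.c:2161:socket_connect_finish] 0-management: ...
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+(?P<rest>.*)$")


def peer_reachable(host, port=GLUSTERD_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def error_lines(lines, levels=("E", "C")):
    """Yield (timestamp, level, message) for log lines at the given severities."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            yield match.group("ts"), match.group("level"), match.group("rest")


def report(peers, log_path):
    """Print reachability for each peer, then error/critical lines from one log file."""
    for peer in peers:
        state = "reachable" if peer_reachable(peer) else "NOT reachable"
        print(f"{peer}:{GLUSTERD_PORT} {state}")
    try:
        with open(log_path, errors="replace") as handle:
            for ts, level, message in error_lines(handle):
                print(ts, level, message)
    except OSError as exc:
        print(f"could not read {log_path}: {exc}")
```

A call such as `report(["host1.example.com", "host2.example.com"], "/var/log/glusterfs/etc-glusterfs-glusterd.vol.log")` would be run on each host in turn; both host names are placeholders and the log path is an assumption that may differ by Gluster version and distribution. A peer that is unreachable from only some hosts would be consistent with the partial-visibility explanation given in the reply.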
participants (2)
-
Brent Hartzell
-
Kanagaraj