From rabshear at citytwist.net Fri Jan 23 14:22:47 2015
From: Rob Abshear
To: users at ovirt.org
Subject: [ovirt-users] Host remains Non-Responsive after reboot
Date: Fri, 23 Jan 2015 19:22:42 +0000
Message-ID: <1422040964161-a917b55f-a2d6b5d4-1ba0de5e@citytwist.net>

I am running oVirt Engine Version 3.5.0.1-1.el6. I have 4 hosts in the cluster. Each host has a drac5 and it is configured and working. I am trying to simulate a node failure. I am running one HA VM on one of the hosts for testing. I simulate the failure by powering off the host with the VM running.

Here is what is happening.

 * Host is powered off
 * ~4 minutes pass and the host is recognized as not responding
 * Automatic fence runs and the VM migrates. Another host in the node is chosen as a proxy to execute Status command on the host.
 * Same host is chosen as proxy to execute Start command on the host.
 * Same host is chosen as proxy to execute Status command on the host.
 * The host DOES physically start.
 * The host never shows status of UP.
 * I select “confirm host has been rebooted” and I see a manual fence start.
 * Host stays non-responsive.
 * I put the host in maintenance and then activate it.
 * Host still non-responsive
 * I put the host in maintenance and do a reinstall
 * Reinstall finishes and host becomes UP

So, everything seems to go fine with the HA functionality, but the host never recovers without being reinstalled. Please let me know which logs you need to look at to help me out with this.

Thanks

Sent with Mixmax [https://mixmax.com/r/S6cJAfQTLnw8QGtnD]
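
The fencing sequence listed in the report above (Status, Start, Status) could be replayed by hand from another host in the cluster, which may help separate a power-management problem from a host-side one. The following is only a sketch under assumptions: `fence_drac5` is assumed to be installed on the host used as proxy, and the DRAC is assumed to accept SSH logins (`-x`). `FENCE_CMD`, `DRAC_IP`, `DRAC_USER`, `DRAC_PASS` and the two function names are hypothetical placeholders, not oVirt settings, and this is not a description of how the engine itself issues the commands.

```shell
#!/bin/sh
# Sketch only: replay the Status -> Start -> Status sequence from the report.
# FENCE_CMD, DRAC_IP, DRAC_USER and DRAC_PASS are hypothetical placeholders
# that must be set by the caller; they are not oVirt configuration names.
FENCE_CMD=${FENCE_CMD:-fence_drac5}

# Run one fence action against the DRAC.
# -a address, -l login, -p password, -x use SSH, -o action
fence_action() {
    "$FENCE_CMD" -a "$DRAC_IP" -l "$DRAC_USER" -p "$DRAC_PASS" -x -o "$1"
}

# Issue the three actions in the order given in the report and print the
# exit code of each one, without interpreting it.
replay_fence_sequence() {
    for action in status on status; do
        echo "== $action =="
        fence_action "$action"
        echo "exit code: $?"
    done
}
```

After setting the three `DRAC_*` variables, calling `replay_fence_sequence` runs the three actions in order. Note that `-o on` powers the machine on, so this is only suitable for a test host such as the one in the report.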

From istein at redhat.com Sun Jan 25 01:56:25 2015
From: ILanit Stein
To: users at ovirt.org
Subject: Re: [ovirt-users] Host remains Non-Responsive after reboot
Date: Sun, 25 Jan 2015 01:56:23 -0500
Message-ID: <575609724.112249.1422168983026.JavaMail.zimbra@redhat.com>
In-Reply-To: 1422040964161-a917b55f-a2d6b5d4-1ba0de5e@citytwist.net

Hi Rob,

Thanks for this report.

Would you please provide these logs, covering the time frame in which the host failure occurred:
1. oVirt Engine: /var/log/ovirt-engine/engine.log
2. host: /var/log/vdsm/vdsm.log

If it is reproducible, please add this info as well.

You can also check the vdsm service status on the host, while the host is reported as Non Responsive, by running 'service vdsmd status' on the host.
There might be some problem that prevented the vdsm service from coming up on the host.

Ilanit.

----- Original Message -----
From: "Rob Abshear"
To: users(a)ovirt.org
Sent: Friday, January 23, 2015 9:22:42 PM
Subject: [ovirt-users] Host remains Non-Responsive after reboot

I am running oVirt Engine Version 3.5.0.1-1.el6. I have 4 hosts in the cluster. Each host has a drac5 and it is configured and working. I am trying to simulate a node failure. I am running one HA VM on one of the hosts for testing. I simulate the failure by powering off the host with the VM running.

Here is what is happening.

    * Host is powered off
    * ~4 minutes pass and the host is recognized as not responding
    * Automatic fence runs and the VM migrates. Another host in the node is chosen as a proxy to execute Status command on the host.
    * Same host is chosen as proxy to execute Start command on the host.
    * Same host is chosen as proxy to execute Status command on the host.
    * The host DOES physically start.
    * The host never shows status of UP.
    * I select “confirm host has been rebooted” and I see a manual fence start.
    * Host stays non-responsive.
    * I put the host in maintenance and then activate it.
    * Host still non-responsive
    * I put the host in maintenance and do a reinstall
    * Reinstall finishes and host becomes UP

So, everything seems to go fine with the HA functionality, but the host never recovers without being reinstalled. Please let me know which logs you need to look at to help me out with this.

Thanks

Sent with Mixmax

_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

From rabshear at citytwist.net Mon Jan 26 14:43:17 2015
From: Rob Abshear
To: users at ovirt.org
Subject: Re: [ovirt-users] Host remains Non-Responsive after reboot
Date: Mon, 26 Jan 2015 14:43:14 -0500
Message-ID:
In-Reply-To: 575609724.112249.1422168983026.JavaMail.zimbra@redhat.com

I have done a bit more investigating on this matter. If I restart the node from within oVirt using the power management option "restart", then the node restarts and vdsmd DOES NOT start. If I go into the DRAC and issue the command to power cycle the machine, then the machine restarts and vdsmd DOES start. I can run the following command from another node in the cluster:

fence_drac5 -a 192.168.200.105 -l root -p <password> -x -o reboot

and the node restarts and vdsmd DOES start.

On Sun, Jan 25, 2015 at 1:56 AM, ILanit Stein wrote:

> Hi Rob,
>
> Thanks for this report.
>
> Would you please provide these logs, at the time frame, the host failure
> occur:
> 1. oVirt Engine: /var/log/ovirt-engine/engine.log
> 2. host: /var/log/vdsm/vdsm.log
>
> If it is reproducible, please add this info as well.
>
> You can also check vdsm service status, on host, while host reported as
> Non responsive,
> by running on host 'service vdsmd status'
> There might some problem, that might have prevented from vdsm service to
> come up, on host.
>
> Ilanit.
>
> ----- Original Message -----
> From: "Rob Abshear"
> To: users(a)ovirt.org
> Sent: Friday, January 23, 2015 9:22:42 PM
> Subject: [ovirt-users] Host remains Non-Responsive after reboot
>
>
> I am running oVirt Engine Version 3.5.0.1-1.el6. I have 4 hosts in the
> cluster. Each host has a drac5 and it is configured and working. I am
> trying to simulate a node failure. I am running one HA VM on one of the
> hosts for testing. I simulate the failure by powering off the host with the
> VM running.
>
> Here is what is happening.
>
>
>     * Host is powered off
>     * ~4 minutes pass and the host is recognized as not responding
>     * Automatic fence runs and the VM migrates. Another host in the node
> is chosen as a proxy to execute Status command on the host.
>     * Same host is chosen as proxy to execute Start command on the host.
>     * Same host is chosen as proxy to execute Status command on the host.
>     * The host DOES physically start.
>     * The host never shows status of UP.
>     * I select “confirm host has been rebooted” and I see a manual fence
> start.
>     * Host stays non-responsive.
>     * I put the host in maintenance and then activate it.
>     * Host still non-responsive
>     * I put the host in maintenance and do a reinstall
>     * Reinstall finishes and host becomes UP
>
> So, everything seems to go fine with the HA functionality, but the host
> never recovers without being reinstalled. Please let me know which logs you
> need to look at to help me out with this.
>
> Thanks
>
>
> Sent with Mixmax
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>

From istein at redhat.com Tue Jan 27 02:05:20 2015
From: ILanit Stein
To: users at ovirt.org
Subject: Re: [ovirt-users] Host remains Non-Responsive after reboot
Date: Tue, 27 Jan 2015 02:05:18 -0500
Message-ID: <33146762.833246.1422342318200.JavaMail.zimbra@redhat.com>
In-Reply-To: CAPrjHUkUNoLtgo0T=pw-YCjwg1WiDyR_uYWAhJWYNFCB8D20Jw@mail.gmail.com

It might be a bug.
Would you please attach the logs I mentioned below, which can bring more details on the failure?

Adding Eli, who may want to give some input on this issue.

Thanks,
Ilanit.

----- Original Message -----
From: "Rob Abshear"
To: "ILanit Stein"
Cc: users(a)ovirt.org
Sent: Monday, January 26, 2015 9:43:14 PM
Subject: Re: [ovirt-users] Host remains Non-Responsive after reboot

I have done a bit more investigating on this matter. If I restart the node from within oVirt using the power management option "restart", then the node restarts and vdsmd DOES NOT start. If I go into the DRAC and issue the command to power cycle the machine, then the machine restarts and vdsmd DOES start. I can run the following command from another node in the cluster:

fence_drac5 -a 192.168.200.105 -l root -p -x -o reboot

and the node restarts and vdsmd DOES start.

On Sun, Jan 25, 2015 at 1:56 AM, ILanit Stein wrote:

> Hi Rob,
>
> Thanks for this report.
>
> Would you please provide these logs, at the time frame, the host failure
> occur:
> 1. oVirt Engine: /var/log/ovirt-engine/engine.log
> 2. host: /var/log/vdsm/vdsm.log
>
> If it is reproducible, please add this info as well.
>
> You can also check vdsm service status, on host, while host reported as
> Non responsive,
> by running on host 'service vdsmd status'
> There might some problem, that might have prevented from vdsm service to
> come up, on host.
>
> Ilanit.
>
> ----- Original Message -----
> From: "Rob Abshear"
> To: users(a)ovirt.org
> Sent: Friday, January 23, 2015 9:22:42 PM
> Subject: [ovirt-users] Host remains Non-Responsive after reboot
>
>
> I am running oVirt Engine Version 3.5.0.1-1.el6. I have 4 hosts in the
> cluster. Each host has a drac5 and it is configured and working. I am
> trying to simulate a node failure. I am running one HA VM on one of the
> hosts for testing. I simulate the failure by powering off the host with the
> VM running.
>
> Here is what is happening.
>
>
> * Host is powered off
> * ~4 minutes pass and the host is recognized as not responding
> * Automatic fence runs and the VM migrates. Another host in the node
> is chosen as a proxy to execute Status command on the host.
> * Same host is chosen as proxy to execute Start command on the host.
> * Same host is chosen as proxy to execute Status command on the host.
> * The host DOES physically start.
> * The host never shows status of UP.
> * I select “confirm host has been rebooted” and I see a manual fence
> start.
> * Host stays non-responsive.
> * I put the host in maintenance and then activate it.
> * Host still non-responsive
> * I put the host in maintenance and do a reinstall
> * Reinstall finishes and host becomes UP
>
> So, everything seems to go fine with the HA functionality, but the host
> never recovers without being reinstalled. Please let me know which logs you
> need to look at to help me out with this.
>
> Thanks
>
>
> Sent with Mixmax
>
> _______________________________________________
> Users mailing list
> Users(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
--===============7776408874470703909==--
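
The thread asks for /var/log/ovirt-engine/engine.log (on the engine) and /var/log/vdsm/vdsm.log (on the host) for the time frame of the failure. One possible way to cut those logs down before attaching them is sketched below. This is only a rough sketch, not an oVirt tool: the helper name lines_in_window, the 10-minute window, and the failure time in the example are hypothetical, and it assumes each log entry carries a timestamp of the form "2015-01-23 14:22:47" somewhere on its first line, which should be checked against the actual files.

```python
import re
from datetime import datetime, timedelta

# Assumption: each log entry carries a timestamp like "2015-01-23 14:22:47"
# somewhere on its first line; verify this against the real logs.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def lines_in_window(lines, center, minutes=10):
    """Return the lines whose timestamp lies within +/- `minutes` of `center`.

    Lines without a timestamp (for example traceback continuations) follow
    the decision made for the most recent timestamped line.
    """
    start = center - timedelta(minutes=minutes)
    end = center + timedelta(minutes=minutes)
    keep = False
    selected = []
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            stamp = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")
            keep = start <= stamp <= end
        if keep:
            selected.append(line)
    return selected


if __name__ == "__main__":
    # Hypothetical failure time; replace with when the host was powered off.
    failure = datetime(2015, 1, 23, 19, 0, 0)
    # The two logs live on different machines (engine and host), so on any
    # one machine at most one of these paths is expected to exist.
    for path in ("/var/log/ovirt-engine/engine.log", "/var/log/vdsm/vdsm.log"):
        try:
            with open(path, errors="replace") as handle:
                for line in lines_in_window(handle, failure):
                    print(line, end="")
        except OSError as error:
            print(f"could not read {path}: {error}")
```

Run on the engine and on the affected host separately, redirecting the output to a file. Comparing the host-side excerpt after an oVirt "restart" with the one after a manual fence_drac5 reboot may help show where vdsmd fails to come up.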