Unable to revive host after a reboot

Hi,
I recently had to shut down my oVirt system, which contains both the engine and the host. After it came back up, the host could no longer initialize from the oVirt UI.
First it kept getting stuck in a loop with this task: "Executing. Started: Jul 30, 2018, 9:06:21 AM. Setting Host pc331 to Non-Operational mode."
This was repeated over and over again. It also gave this error: Host pc331 does not comply with the cluster Default networks, the following networks are missing on host: 'ovirtmgmt'
This happens even though I am using that interface (ovirtmgmt) to access the system, and 'ip addr show' shows that it is set up properly.
I tried some more things, including upgrading the host. When I do that, the status changes to 'Installing' and later to 'Install Failed'.
Here are the engine.log, vdsm.log and ovirt-host-deploy*.log:
https://drive.google.com/open?id=13vIUDVjPynmAK0pnFRLabDfUlG1lmkpI
https://drive.google.com/open?id=10Zm2dDpxM2k5A2bEs_l6gOv_yWv4cjlN
https://drive.google.com/open?id=1AieWMzRuA0gZj3x5yDH1AGnsT2E3VtOM
Any idea what is going wrong (what I'm doing wrong) and how to solve it? Thanks in advance!
Best regards,
Julius

Hi Julius,
cc'd some people that might have some ideas :)
Greg
On Mon, Jul 30, 2018 at 11:07 AM Julius Schwartzenberg <julius.schwartzenberg@gmail.com> wrote:
-- GREG SHEREMETA SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX Red Hat NA <https://www.redhat.com/> gshereme@redhat.com IRC: gshereme <https://red.ht/sig>

Widening the net a bit :) Can anyone assist Julius?
Greg
On Mon, Jul 30, 2018 at 11:09 AM Greg Sheremeta <gshereme@redhat.com> wrote:

Hi,
The output of this command, run on the host, might be useful:
    tree /var/lib/vdsm
I noticed the following error in host-deploy/engine.log: "Cannot find a valid baseurl for repo: base/7/x86_64"
Are you able to test whether the host can reach the baseurl of the repo? https://www.centos.org/forums/viewtopic.php?t=63349
Traceback (most recent call last):
  File "/tmp/ovirt-OZFEUm2MU7/pythonlib/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File "/tmp/ovirt-OZFEUm2MU7/otopi-plugins/ovirt-host-deploy/kdump/packages.py", line 216, in _customization
    self._kexec_tools_version_supported()
  File "/tmp/ovirt-OZFEUm2MU7/otopi-plugins/ovirt-host-deploy/kdump/packages.py", line 148, in _kexec_tools_version_supported
    patterns=(self._KEXEC_TOOLS_PKG,),
  File "/tmp/ovirt-OZFEUm2MU7/otopi-plugins/otopi/packagers/yumpackager.py", line 320, in queryPackages
    showdups=listAll
  File "/tmp/ovirt-OZFEUm2MU7/pythonlib/otopi/miniyum.py", line 995, in queryPackages
    showdups=showdups,
  File "/tmp/ovirt-OZFEUm2MU7/pythonlib/otopi/miniyum.py", line 433, in _queryProvides
    for po in self._yb.searchPackageProvides(args=packages):
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 3461, in searchPackageProvides
    where = self.returnPackagesByDep(arg)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 4287, in returnPackagesByDep
    return self.pkgSack.searchProvides(depstring)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 1074, in <lambda>
    pkgSack = property(fget=lambda self: self._getSacks(),
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 778, in _getSacks
    self.repos.populateSack(which=repos)
  File "/usr/lib/python2.7/site-packages/yum/repos.py", line 347, in populateSack
    self.doSetup()
  File "/usr/lib/python2.7/site-packages/yum/repos.py", line 122, in doSetup
    self.ayum.plugins.run('prereposetup')
  File "/usr/lib/python2.7/site-packages/yum/plugins.py", line 188, in run
    func(conduitcls(self, self.base, conf, **kwargs))
  File "/usr/lib/yum-plugins/fastestmirror.py", line 197, in prereposetup_hook
    if downgrade_ftp and _len_non_ftp(repo.urls) == 1:
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 881, in <lambda>
    urls = property(fget=lambda self: self._geturls(),
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 878, in _geturls
    self._baseurlSetup()
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 844, in _baseurlSetup
    self.check()
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 562, in check
    'Cannot find a valid baseurl for repo: %s' % self.ui_id
RepoError: Cannot find a valid baseurl for repo: base/7/x86_64
2018-07-30 13:59:49,785+0200 ERROR otopi.context context._executeMethod:152 Failed to execute stage 'Environment customization': Cannot find a valid baseurl for repo: base/7/x86_64
On Wed, Aug 1, 2018 at 2:28 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/2QPGCWTKRXGFW2W3PB44LTEXSH3SLMN7/
-- Cheers Douglas
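
The reachability test suggested above could be sketched roughly as follows. This is a minimal sketch and not part of oVirt or yum: it assumes Python 3 is available (the host in the traceback runs Python 2.7), and the function name `can_reach` and its `proxy` and `opener` parameters are made up for illustration. The URL to test would be whatever `baseurl` or `mirrorlist` value the host's repo file (typically under /etc/yum.repos.d/) contains.

```python
import urllib.error
import urllib.request


def can_reach(url, proxy=None, timeout=10, opener=None):
    """Return True if an HTTP GET on `url` succeeds (2xx/3xx), else False.

    `proxy` is optional (e.g. "http://proxy.example:3128", a placeholder).
    `opener` is a hypothetical hook so the function can be tested offline.
    """
    if opener is None:
        handlers = []
        if proxy:
            # Route both http and https requests through the given proxy.
            handlers.append(
                urllib.request.ProxyHandler({"http": proxy, "https": proxy})
            )
        opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or an HTTP error status.
        return False
```

Calling `can_reach(url)` once without a proxy and once with `proxy=...` would show whether the repo is only reachable through the proxy, which would be consistent with the "Cannot find a valid baseurl" error.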

Please provide the supervdsm.log; most networking-related logs are there.
On Thu, Aug 2, 2018 at 12:18 AM, Douglas Landgraf <dlandgra@redhat.com> wrote:

Here is the output of the tree command:
[root@pc305052 ~]# tree /var/lib/vdsm
/var/lib/vdsm
├── netconfback
├── persistence
│   ├── netconf -> /var/lib/vdsm/persistence/netconf.fxEXfkKx
│   └── netconf.fxEXfkKx
│       ├── bonds
│       ├── devices
│       └── nets
├── staging
│   ├── netconf -> /var/lib/vdsm/staging/netconf.JtRmRH7X
│   └── netconf.JtRmRH7X
│       ├── bonds
│       ├── devices
│       └── nets
├── transient
└── upgrade
15 directories, 0 files
I tried removing the files below persistence (with the services stopped), but that did not help.
Here is the supervdsm.log: https://drive.google.com/open?id=1uTCEjXchZ1BJJQuGWln83BR-uxZYcVUJ
The repository works when I set up the proxy, but the proxy is not always set up. Should I try enabling it and doing a reinstall?
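
The "15 directories, 0 files" line above means the `nets`, `bonds` and `devices` directories hold no entries. A small sketch for checking this without reading tree output by eye might look like the following. The directory names come from the listing above; the function names are hypothetical, and treating each file name under `nets` as a network name is an assumption about how VDSM stores its persistent config.

```python
import os

# Subdirectories seen under netconf in the tree output above.
KINDS = ("nets", "bonds", "devices")


def persisted_entries(netconf_dir="/var/lib/vdsm/persistence/netconf"):
    """Map each kind (nets/bonds/devices) to the sorted entry names it holds."""
    entries = {}
    for kind in KINDS:
        path = os.path.join(netconf_dir, kind)
        # A missing directory is reported the same way as an empty one.
        entries[kind] = sorted(os.listdir(path)) if os.path.isdir(path) else []
    return entries


def is_empty_config(netconf_dir="/var/lib/vdsm/persistence/netconf"):
    """True when no network, bond or device entry is persisted at all."""
    return not any(persisted_entries(netconf_dir).values())
```

On a host in the state shown above, `is_empty_config()` would be expected to return True, while a host that still has its management network persisted would presumably list `ovirtmgmt` under `nets`.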

It seems the persistent configuration was erased and VDSM restored an empty config, thereby removing ovirtmgmt.
At 2018-07-29 22:07:29,037, the persistent config existed at: /var/lib/vdsm/persistence/...
But when VDSM (or the whole host?) was restarted at 2018-07-29 22:17:21,558, it was no longer there, and therefore ovirtmgmt was erased.
I could not see any hint as to why the persistent config was erased.
I think the easiest way to restore the host is to define DHCP manually on the eno1 interface and re-add the host from Engine.
On Thu, Aug 2, 2018 at 2:49 PM, <julius.schwartzenberg@gmail.com> wrote:

But when I do 'ip addr show', the ovirtmgmt interface is there, and 'brctl show' shows that it's bridged with eno1. This suggests to me that ovirtmgmt was not removed.
The host is still listed in the engine, so I cannot re-add it. The remove button is grayed out, so I cannot remove it either.
I'm still not sure how I should proceed now. Would it be possible to describe the steps to get back to a working state?

On Fri, Aug 3, 2018 at 10:22 AM, <julius.schwartzenberg@gmail.com> wrote:
> But when I do 'ip addr show', the ovirtmgmt interface is there, and 'brctl show' shows that it's bridged with eno1. This suggests to me that ovirtmgmt was not removed.
Nevertheless, it is no longer owned by oVirt/VDSM.
> The host is still listed in the engine, so I cannot re-add it. The remove button is grayed out, so I cannot remove it either.
> I'm still not sure how I should proceed now. Would it be possible to describe the steps to get back to a working state?
Can you try, in the Setup Networks window, dragging the ovirtmgmt network over the NIC and applying?
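
The distinction drawn above, a bridge that exists in the kernel but is no longer owned by oVirt/VDSM, can be illustrated with a rough sketch. It relies on two assumptions: that a Linux bridge exposes a `bridge` subdirectory under /sys/class/net/<name>, and that VDSM keeps one entry per network, named after the network, under the persistence `nets` directory seen earlier in the thread. All function names are hypothetical.

```python
import os


def kernel_bridges(sysfs_root="/sys/class/net"):
    """Names of interfaces the kernel reports as bridges."""
    return {
        name
        for name in os.listdir(sysfs_root)
        # Bridge devices expose a 'bridge' subdirectory in sysfs.
        if os.path.isdir(os.path.join(sysfs_root, name, "bridge"))
    }


def persisted_networks(nets_dir="/var/lib/vdsm/persistence/netconf/nets"):
    """Network names found in VDSM's persistent config (assumed layout)."""
    return set(os.listdir(nets_dir)) if os.path.isdir(nets_dir) else set()


def unowned_bridges(sysfs_root="/sys/class/net",
                    nets_dir="/var/lib/vdsm/persistence/netconf/nets"):
    """Bridges present in the kernel that have no persisted VDSM entry."""
    return sorted(kernel_bridges(sysfs_root) - persisted_networks(nets_dir))
```

A bridge in this list is not necessarily a problem, since bridges created outside oVirt would also appear. In the situation described here, though, `ovirtmgmt` showing up would match the "present but not owned" state.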

I just noticed that ovirtmgmt had indeed moved to the right part there. Dragging it to the left allowed me to bring up the host again!
Then I got this error when trying to run a VM: "Cannot run VM. Unknown Data Center status." When I checked the status, it said 'Non Responsive'.
It turns out that I was just impatient, and at some point I could start my VMs again.
I'm not sure what exactly happened and why I couldn't see ovirtmgmt on the right before.
In any case, thank you very much for the help!!
participants (5)
- Douglas Landgraf
- Edward Haas
- Greg Sheremeta
- Julius Schwartzenberg
- julius.schwartzenberg@gmail.com