[ovirt-users] 3 strikes....
Michal Skrivanek
michal.skrivanek at redhat.com
Thu Dec 28 11:32:08 UTC 2017
> On 28 Dec 2017, at 00:02, Blaster <Blaster at 556nato.com> wrote:
>
> Well, I've spent the last 2.5 days trying to get oVirt 4.2 up and running.
>
> I sneeze on it, vdsm has a conniption and there appears to be no way to recover from it.
>
> 1) Install 4.2. Everything looks good. Start copying over some data... accidentally wipe out the master storage domain. It's gone. The only method Google could suggest was to re-initialize the data center. Great, I'd love to! It's greyed out. Can't get it back. Tried several hosted-engine uninstall methods, including
> /usr/sbin/ovirt-hosted-engine-cleanup and wiping out the storage.
>
> re-run hosted-engine --deploy
> All I get, over and over, in the vdsm log file while waiting for vdsm to become operational is:
> 2017-12-27 16:36:22,150-0600 ERROR (periodic/3) [virt.periodic.Operation] <vdsm.virt.sampling.VMBulkstatsMonitor object at 0x397b250> operation failed (periodic:215)
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213, in __call__
> self._func()
> File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 522, in __call__
> self._send_metrics()
> File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 538, in _send_metrics
> vm_sample.interval)
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 45, in produce
> networks(vm, stats, first_sample, last_sample, interval)
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 322, in networks
> if nic.name.startswith('hostdev'):
> AttributeError: name
Not relevant to your issue, just FYI: this error is not significant, and it is fixed by commit 9a2f73a4384e1d72c3285ef88876e404ec8228ff now.
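For context, the AttributeError above is a plain missing-attribute access on a NIC stats object. A defensive sketch of the kind of guard such a fix adds (hypothetical names, not the actual vdsm patch):

```python
# Hypothetical sketch: skip NICs whose stats object lacks a usable 'name'
# attribute instead of letting nic.name raise AttributeError, as in the
# traceback above. Not the real vdsm code.
def hostdev_nics(nics):
    """Return the NICs whose name starts with 'hostdev', tolerating
    stats objects that have no 'name' attribute at all."""
    result = []
    for nic in nics:
        name = getattr(nic, 'name', None)  # original code did nic.name directly
        if name is not None and name.startswith('hostdev'):
            result.append(nic)
    return result
```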
> 2017-12-27 16:36:22,620-0600 INFO (periodic/1) [vdsm.api] START repoStats(domains=()) from=internal, task_id=94688cf1-a991-433e-9e22-7065ed5dc1bf (api:46)
> 2017-12-27 16:36:22,620-0600 INFO (periodic/1) [vdsm.api] FINISH repoStats return={} from=internal, task_id=94688cf1-a991-433e-9e22-7065ed5dc1bf (api:52)
> 2017-12-27 16:36:22,621-0600 INFO (periodic/1) [vdsm.api] START multipath_health() from=internal, task_id=9c680369-8f2a-439e-8fe5-b2a1e33c0706 (api:46)
> 2017-12-27 16:36:22,622-0600 INFO (periodic/1) [vdsm.api] FINISH multipath_health return={} from=internal, task_id=9c680369-8f2a-439e-8fe5-b2a1e33c0706 (api:52)
> 2017-12-27 16:36:22,633-0600 ERROR (periodic/1) [root] failed to retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted Engine setup finished? (api:196)
> 2017-12-27 16:36:23,178-0600 INFO (vmrecovery) [vdsm.api] START getConnectedStoragePoolsList(options=None) from=internal, task_id=a7e48a2f-8cb7-4ec5-acd7-452c8f0c522b (api:46)
> 2017-12-27 16:36:23,179-0600 INFO (vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList return={'poollist': []} from=internal, task_id=a7e48a2f-8cb7-4ec5-acd7-452c8f0c522b (api:52)
> 2017-12-27 16:36:23,179-0600 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:643)
>
> Sigh... reinstall 7.4 and do it all over again.
>
> 2) Copying data to the master storage pool. Didn't wipe it out this time, but filled the volume instead. The environment freezes.
> vdsm can't start: infinite loop waiting for the storage pool again. Tried clean-up and redeploy. Same problem as above.
> 7.4 reinstall #2, here we go...
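The "recovery: waiting for storage pool to go up" messages in the vdsm log are an unbounded poll. For illustration only (this is not vdsm's code), a bounded version of that kind of loop looks like:

```python
import time

def wait_for_pools(list_pools, timeout=300.0, interval=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll list_pools() until it returns a non-empty pool list or the
    timeout expires; raise TimeoutError on expiry.

    Illustrative sketch only -- vdsm's actual vmrecovery loop keeps
    retrying forever, which is the 'infinite loop' seen in the log.
    clock/sleep are injectable to keep the sketch testable."""
    deadline = clock() + timeout
    while True:
        pools = list_pools()
        if pools:
            return pools
        if clock() >= deadline:
            raise TimeoutError('no storage pool came up within %.0fs' % timeout)
        sleep(interval)
```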
>
> 3) Up and running again. Forgot to add my NIC card. Shut it down. Boot back up. vdsm sees the new network interfaces.
> For some reason, it switches ovirtmgmt over to one of the new interfaces, which doesn't have a cable
> attached to it. Clean up the ifcfg- files and reboot. ifcfg-ovirtmgmt is now gone. Recreate it and reboot. The interface
> comes alive, but vdsm is not starting.
> supervdsm log shows:
> Multiple southbound ports per network detected, ignoring this network for the QoS report (network: ovirtmgmt, ports: ['enp3s0', 'enp4s0'])
> restore-net::DEBUG::2017-12-27 13:10:39,815::cmdutils::150::root::(exec_cmd) /usr/share/openvswitch/scripts/ovs-ctl status (cwd None)
> restore-net::DEBUG::2017-12-27 13:10:39,856::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> restore-net::DEBUG::2017-12-27 13:10:39,863::vsctl::58::root::(commit) Executing commands: /usr/bin/ovs-vsctl --oneline --format=json -- list Bridge -- list Port -- list Interface
> restore-net::DEBUG::2017-12-27 13:10:39,864::cmdutils::150::root::(exec_cmd) /usr/bin/ovs-vsctl --oneline --format=json -- list Bridge -- list Port -- list Interface (cwd None)
> restore-net::DEBUG::2017-12-27 13:10:39,944::cmdutils::158::root::(exec_cmd) SUCCESS: <err> = ''; <rc> = 0
> restore-net::ERROR::2017-12-27 13:10:39,954::restore_net_config::454::root::(restore) unified restoration failed.
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", line 448, in restore
> unified_restoration()
> File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", line 131, in unified_restoration
> classified_conf = _classify_nets_bonds_config(available_config)
> File "/usr/lib/python2.7/site-packages/vdsm/network/restore_net_config.py", line 260, in _classify_nets_bonds_config
> current_config = kernelconfig.KernelConfig(net_info)
> File "/usr/lib/python2.7/site-packages/vdsm/network/kernelconfig.py", line 44, in __init__
> for net, net_attr in self._analyze_netinfo_nets(netinfo):
> File "/usr/lib/python2.7/site-packages/vdsm/network/kernelconfig.py", line 57, in _analyze_netinfo_nets
> attrs = _translate_netinfo_net(net, net_attr, netinfo, _routes)
> File "/usr/lib/python2.7/site-packages/vdsm/network/kernelconfig.py", line 99, in _translate_netinfo_net
> raise MultipleSouthBoundNicsPerNetworkError(net, nics)
> MultipleSouthBoundNicsPerNetworkError: ('ovirtmgmt', ['enp3s0', 'enp4s0'])
>
> Remove the new NIC. Reboot. vdsm is once again stuck waiting for the storage pool to come up.
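The traceback above is vdsm refusing to classify a bridge that ended up with two southbound physical ports, which is the state the host was left in once the unplugged NIC was also enslaved to ovirtmgmt. The check amounts to something like this (simplified sketch, not vdsm's real implementation):

```python
# Simplified sketch of the single-southbound-port invariant behind
# MultipleSouthBoundNicsPerNetworkError; not the actual vdsm code.
class MultipleSouthBoundNicsPerNetworkError(Exception):
    pass

def southbound_nic(net, ports):
    """Return the single southbound NIC attached to a network bridge.

    A vdsm-managed network is expected to sit on exactly one physical
    port; if the bridge has accumulated more than one (as ovirtmgmt did
    with enp3s0 and enp4s0 above), refuse to classify it."""
    if len(ports) > 1:
        raise MultipleSouthBoundNicsPerNetworkError(net, ports)
    return ports[0] if ports else None
```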
I'm not clear on what you are actually trying to do. You do a clean install with a new SD, and then you write over it? With what? Data from your former installation, just a plain file-level copy? Why?
>
> So this is where I'm at now. Stuck, once again.
>
> I've been running 3.6.3 All In One for many years because I've been concerned about the complexity of the self-hosted
> configuration. Guess I was right.
>
> Google shows lots of other people also concerned about the stability of oVirt. It's great when it runs, but
> hit any little issue and you're basically reinstalling from scratch.
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users