[ovirt-users] Network issues on install

Brett I. Holcomb biholcomb at l1049h.com
Sat Apr 2 17:14:55 UTC 2016


On Fri, 2016-04-01 at 21:39 -0400, Brett I. Holcomb wrote:
> Two items here.
> 
> oVirt version 3.6.4  Fresh install, not an upgrade.
> 
> 
> First, I noticed this issue when I did an install on a test machine
> but I didn't have the data to present.  Because of that and some
> other posts dealing with the network issue I kept notes when I
> installed on my production system.   I'm doing a hosted-engine setup.
> 
> As part of the preparation I did the following before installing and
> deploying.
> 
> * Removed NetworkManager with yum remove NetworkManager
> * The NIC that will be used for the oVirt management NIC is connected
> to a switch port expecting VLAN 50 so I set up a VLAN50 ifcfg file.
> * The IP address of the server, prefix, gateway, and TWO DNS servers
> were set up in the ifcfg file, and name resolution worked.  I could
> ping the host by name.  The oVirt Engine VM was also in DNS, so its
> name resolved, but obviously nothing would reply yet.  Other
> servers and workstations could resolve the host and engine names.
> 
> 1.  On the host I ran hosted-engine --deploy and installed the OS
> (CentOS 7 (1511)) on the Engine VM.  I rebooted the Engine VM and told
> the deployment that the Engine VM was running.  It then continued
> and told me to install the engine on the Engine VM.
> 2.  I updated the Engine VM via yum update, installed the oVirt
> repositories, and ran engine-setup, which completed successfully.
> 3.  I then went back to the host and told it the Engine was set up,
> and at this point things went bad.  The deployment started whining
> about not being able to resolve the myenginevm.mydomain.com host, did
> cleanup, pre-termination, termination, and said the deployment failed
> and the system was unreliable, fix it, whine, whine, whine.
> 4.  I tried a ping on myenginevm.mydomain.com and it failed.
> 
> What I found was that when the bridge was created (ifcfg-ovirtmgmt)
> the DNS servers were left out!  They were in the original NIC ifcfg
> file but it appears the deployment didn't bother to bring them over
> to the bridge ifcfg.  I find this very puzzling since the deployment
> insists on FQDNs so it should be smart enough to bring over the DNS
> server settings and not leave them out.  My /etc/resolv.conf file
> also had no DNS servers in it.
> 
> I added the DNS server to the bridge ifcfg file, did a systemctl
> restart network and all is well again.  The host can ping the VM! 
> 
> However, the deployment thinks it failed and I cannot restart the
> Engine VM.  I tried a reboot and made sure the ovirt daemons were
> running, but if I try to do anything such as hosted-engine --vm-start
> I get "Unable to read vm.conf, please check ovirt-ha-agent logs".
> 
> Second, I think that having the deployment fail simply because it
> cannot contact the Engine VM is a huge error/bug/whatever - it's
> silly.  The deployment went well, the VM exists and is running, but
> because the deployment messed up the DNS servers it just can't find
> it.  The deployment should first handle the name server setup
> correctly and second fail gracefully.
> 
> I rebooted the server but still get the error about not being able to
> read vm.conf.  At this point I have to run through the entire
> deployment again just because one phase messed up, unless there is a
> way to work around this.  In the work that I've done with oVirt I've
> noticed the deployment is not very robust and gives up on errors that
> it should be able to recover from.  I suggest that consideration be
> given to making the deployment smarter and more robust.
> 
> 
More info.

This gets broken during the first phase of hosted-engine --deploy
(before the OS is installed on the Engine VM), which makes sense
because I assume that's when the bridge is created.

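A quick way to confirm whether an ifcfg file carries any DNS entries is
sketched below.  This is only a rough check, assuming the usual
KEY=value ifcfg layout with DNS servers named DNS1, DNS2, and so on.
The function name, the device names, and the addresses in the sample
contents are made up for illustration and are not from my real files.

```python
def dns_servers_in_ifcfg(text):
    """Return the DNS server values (DNS1, DNS2, ...) found in ifcfg-style text."""
    servers = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Assumption: DNS servers are stored under DNS1, DNS2, ...
        if key.startswith("DNS") and key[3:].isdigit():
            servers.append(value.strip().strip("\"'"))
    return servers


# Hypothetical file contents, for illustration only.
nic_cfg = """DEVICE=eth0.50
VLAN=yes
DNS1=192.0.2.1
DNS2=192.0.2.2
"""
bridge_cfg = """DEVICE=ovirtmgmt
TYPE=Bridge
ONBOOT=yes
"""

print(dns_servers_in_ifcfg(nic_cfg))     # DNS entries present on the NIC
print(dns_servers_in_ifcfg(bridge_cfg))  # none on the bridge
```

Run against the real files, the text would come from reading the NIC
and bridge ifcfg files instead of the sample strings.
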
I added another logical network with a VLAN tag and this broke name
resolution again.  I had to run systemctl restart network again and
then name resolution was back.

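Since /etc/resolv.conf ended up with no DNS servers the first time, a
small check like the following could be run after each network change
to see whether the nameserver lines survived.  It is only a sketch,
assuming the standard resolv.conf layout of "nameserver <address>"
lines.  The function name and sample contents are hypothetical.

```python
def nameservers(resolv_text):
    """Return the addresses from 'nameserver' lines in resolv.conf-style text."""
    found = []
    for raw in resolv_text.splitlines():
        line = raw.strip()
        # resolv.conf comments start with '#' or ';'.
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            found.append(parts[1])
    return found


# Hypothetical contents, for illustration only.
sample_ok = "search mydomain.com\nnameserver 192.0.2.1\nnameserver 192.0.2.2\n"
sample_broken = "# hypothetical contents after the bridge was created\nsearch mydomain.com\n"

print(nameservers(sample_ok))
print(nameservers(sample_broken))
```

On the host itself the input would be the contents of
/etc/resolv.conf, and an empty list would mean name resolution is
about to break again.
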
I'm attempting to use the web portal but it's very, very slow.  When I
select the admin portal it can take 5+ minutes before the login page
displays, if it displays at all instead of timing out.  Once I get the
Admin login things go pretty quickly.  I'm using Firefox 45.0.1 on
Fedora 23.  Any reason for this?  From what I see the message about not
supporting the browser is bogus.  My host has 64 GB of memory and an
E2620-v3 processor.