
On 02/04, Eyal Edri wrote:

----- Original Message -----
From: "David Caro" <dcaroest@redhat.com>
To: "Infra" <infra@ovirt.org>
Cc: "Max Kovgan" <mkovgan@redhat.com>
Sent: Tuesday, February 3, 2015 5:32:52 PM
Subject: Re: PHX lab downtime

On 02/03, David Caro wrote:

It took more than one hour :S

Current status is that all the vms are up and running and all the
services are working, but we have one host down: ovirt-srv02 is out of
the pool of hosts, with a strange issue resolving names.

When running ping it can't resolve names, but with dig it works ok.
That's usually a misconfiguration in the nsswitch.conf file, but that
file is ok. I also tried selinux and iptables.

Did a strace of ping, and I can see it does open a socket to the
nameserver and sends the query, but nothing goes out the interface...
(had tcpdump open in another screen)

    socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4 <0.000786>
    connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16) = 0 <0.000028>
    poll([{fd=4, events=POLLOUT}], 1, 0) = 1 ([{fd=4, revents=POLLOUT}]) <0.000016>
    sendto(4, "ck\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000052>

Any idea is welcome, I'm going to get some sleep now that all the
services are back up...

Mystery solved! Thanks Max!

The issue was that iproute2 can manage multiple routing tables, and the
usual command 'ip route show' only shows the routes in the default
kernel table, while newer vdsm adds a new table for the routing aside
from it. When I modified the gateway and netmask in the routing tables
of the hosts, the extra side table was not updated. That led to the
strange behavior of ping: the udp requests to the dns server were being
routed through the old gateway, while icmp was being routed through the
new table.

You can find more info about routing tables and rules here:

    http://linux-ip.net/html/routing-tables.html
    http://linux-ip.net/html/routing-rpdb.html

and some examples here:

    http://linux-ip.net/html/adv-multi-internet.html

Just for future reference, you can see all the routing for all the
routing tables with:

    ip route show table all

Do we have the network info documented on the infra page?
It would be worth documenting it, or enforcing a standard configuration
via puppet if possible.
The network setup is there, but the first time you run vdsm it messes
everything up and changes configurations and such, so I don't think it's
ok to change it with puppet when vdsm is already managing it (not always
ok it seems). That will lead to a race condition between vdsm and puppet
changing the network configuration. But I'll add the troubleshooting
tips to the docs.
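As a starting point for those troubleshooting tips, the check that would
have caught this (comparing the default gateway in every routing table,
not only the main one) could be sketched roughly as below. This is only
an illustration in Python, not something taken from vdsm or from our
puppet setup. The function names are made up, the sample text is
hypothetical output in the style of 'ip route show table all' using
documentation address ranges, and the sketch assumes that routes outside
the main table carry a 'table <name>' field in that output.

```python
from collections import defaultdict


def default_gateways_by_table(ip_route_output):
    """Map each routing table name to the set of default gateways in it.

    Assumes the text format of 'ip route show table all', where routes
    that are not in the main table include a 'table <name>' field.
    """
    gateways = defaultdict(set)
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Only default routes that name a gateway are of interest here.
        if not fields or fields[0] != "default" or "via" not in fields[:-1]:
            continue
        gateway = fields[fields.index("via") + 1]
        if "table" in fields[:-1]:
            table = fields[fields.index("table") + 1]
        else:
            table = "main"
        gateways[table].add(gateway)
    return dict(gateways)


def stale_tables(gateways, expected_gateway):
    """Return the tables whose default gateway is not the expected one."""
    return sorted(
        table for table, found in gateways.items()
        if found != {expected_gateway}
    )


# Hypothetical sample: the side table still points at the old gateway,
# while the main table was already updated to the new one.
SAMPLE = """\
default via 192.0.2.1 dev ovirtmgmt table 3232235777
192.0.2.0/24 via 192.0.2.10 dev ovirtmgmt table 3232235777
default via 198.51.100.1 dev ovirtmgmt
198.51.100.0/24 dev ovirtmgmt proto kernel scope link src 198.51.100.10
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
"""

gateways = default_gateways_by_table(SAMPLE)
print(stale_tables(gateways, "198.51.100.1"))
```

On a real host the input could come from running 'ip route show table
all' (for example through subprocess.run with capture_output=True and
text=True), and 'ip rule show' lists the rules that decide which table a
given packet is looked up in.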

ps. ovirt-srv02 was also the hosted engine master, and was the host that
broke during the upgrade the last time. Maybe it was related to this
issue, or this issue is related to that...

See you tomorrow!

On 02/02, David Caro wrote:

Hi all,

We are having a downtime on some of the vms and hosts on the phx lab.
It's caused by an unexpected issue with the engine and dhcp after
changing the gateway of the machines to adapt to the new ip range.

It's almost fixed, but we (I) should really get the environment to a
really stable status again and finish the upgrade.

There are still some issues, but most of them are already fixed, will
fix the rest in less than one hour.

--
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Tel.: +420 532 294 605
Email: dcaro@redhat.com
Web: www.redhat.com
RHT Global #: 82-62605

_______________________________________________
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra

--
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Tel.: +420 532 294 605
Email: dcaro@redhat.com
Web: www.redhat.com
RHT Global #: 82-62605