--1E1Oui4vdubnXi3o
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
On 02/04, Eyal Edri wrote:

----- Original Message -----
> From: "David Caro" <dcaroest(a)redhat.com>
> To: "Infra" <infra(a)ovirt.org>
> Cc: "Max Kovgan" <mkovgan(a)redhat.com>
> Sent: Tuesday, February 3, 2015 5:32:52 PM
> Subject: Re: PHX lab downtime
>
> On 02/03, David Caro wrote:
> >
> > It took more than one hour :S
> >
> > Current status is that all the vms are up and running, all the services are
> > working, but we have one host down: ovirt-srv02 is out of the pool of hosts,
> > with a strange issue resolving names.
> >
> > When running ping it can't resolve names, but with dig it works ok. That's
> > usually a misconfiguration in the nsswitch.conf file, but it's ok. I also
> > tried selinux and iptables.
> >
> > Did a strace of ping, and I can see it does open a socket to the nameserver
> > and sends the query, but nothing goes out the interface... (had tcpdump open
> > in another screen)
> >
> > socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4 <0.000786>
> > connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16) = 0 <0.000028>
> > poll([{fd=4, events=POLLOUT}], 1, 0) = 1 ([{fd=4, revents=POLLOUT}]) <0.000016>
> > sendto(4, "ck\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000052>
> >
> > Any idea is welcome, I'm going to get some sleep now that all the services
> > are back up...
>
> Mystery solved! Thanks Max!
>
> The issue was that iproute2 can manage multiple routing tables, and the usual
> command 'ip route show' only shows the routes in the default kernel table
> (main). Newer vdsm adds an extra routing table aside from it, and when I
> modified the gateway and netmask in the routing tables of the hosts, that
> extra side table was not updated. That led to the strange behavior: ping's
> udp requests to the dns server were being routed through the old gateway,
> while icmp was being routed through the new table.
>
> You can find more info about routing tables and rules here:
>
> http://linux-ip.net/html/routing-tables.html
> http://linux-ip.net/html/routing-rpdb.html
>
> and some examples here:
>
> http://linux-ip.net/html/adv-multi-internet.html
>
> Just for future reference, you can see the routes in all the routing tables
> with:
>
> ip route show table all
>
Do we have the network info documented on the infra page?
It's worth documenting it, or enforcing a standard configuration via puppet if
possible.

The network setup is there, but the first time you run vdsm it messes
everything up and changes configurations and such, so I don't think it's ok to
change it with puppet when vdsm is already managing it (not always well, it
seems). That would lead to a race condition between vdsm and puppet changing
the network configuration.
But I'll add the troubleshooting tips to the docs.
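As a first cut at those tips, here is a minimal sketch of a shell helper that
lists the default gateway of every routing table, so a stale side table stands
out. The function name default_gateways_by_table is made up for this example,
and it assumes the usual 'ip route show table all' output format, where routes
outside the main table carry a 'table NAME' field:

```shell
# Hypothetical helper (name made up for this example): reads the output of
# "ip route show table all" on stdin and prints "<table> <gateway>" for
# every default route. Routes printed without a "table" field are assumed
# to belong to the main table; "-" means the route has no "via" gateway.
default_gateways_by_table() {
    awk '$1 == "default" {
        table = "main"; gw = "-"
        for (i = 2; i < NF; i++) {
            if ($i == "via")   gw = $(i + 1)
            if ($i == "table") table = $(i + 1)
        }
        print table, gw
    }'
}
```

Run it as 'ip route show table all | default_gateways_by_table'; a table that
still points at the old gateway is the one to fix. 'ip rule show' lists the
rules that decide which table a given packet is looked up in.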
>
> >
> > ps. ovirt-srv02 was also the hosted engine master, and was the host that
> > broke during the upgrade the last time. Maybe it was related to this issue,
> > or this issue is related to that...
> >
> > See you tomorrow!
> >
> > On 02/02, David Caro wrote:
> > >
> > > Hi all,
> > >
> > > We are having a downtime on some of the vms and hosts on the phx lab. It's
> > > caused by an unexpected issue with the engine and dhcp after changing the
> > > gateway of the machines to adapt to the new ip range.
> > >
> > > It's almost fixed, but we (I) should really get the environment to a really
> > > stable status again and finish the upgrade.
> > >
> > > There are still some issues, but most of them are already fixed, will fix
> > > the rest in less than one hour.
> > >
> > > --
> > > David Caro
> > >
> > > Red Hat S.L.
> > > Continuous Integration Engineer - EMEA ENG Virtualization R&D
> > >
> > > Tel.: +420 532 294 605
> > > Email: dcaro(a)redhat.com
> > > Web: www.redhat.com
> > > RHT Global #: 82-62605
> >
> > > _______________________________________________
> > > Infra mailing list
> > > Infra(a)ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/infra
--
David Caro
Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D
Tel.: +420 532 294 605
Email: dcaro(a)redhat.com
Web: www.redhat.com
RHT Global #: 82-62605
--1E1Oui4vdubnXi3o--