PHX lab downtime

David Caro dcaroest at redhat.com
Tue Feb 3 00:26:38 UTC 2015


It took more than one hour :S

Current status: all the vms are up and running and all the services are
working, but one host is down. ovirt-srv02 is out of the pool of hosts, with a
strange issue resolving names.

When running ping it can't resolve names, but with dig it works ok. That's
usually a misconfiguration in the nsswitch.conf file, but that file looks ok.
I also tried selinux and iptables.
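
To make the ping/dig difference easier to reproduce, here is a minimal Python
sketch. It assumes that ping resolves names through the libc resolver (the path
governed by nsswitch.conf), while dig builds and sends DNS packets itself. The
function names, the 8.8.8.8 default and the 2 second timeout are illustrative
choices, not something taken from the host's configuration.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.strip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def resolve_via_libc(name):
    """Resolve through getaddrinfo(), i.e. the nsswitch.conf-driven path."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET)
    except socket.gaierror as exc:
        return "failed: %s" % exc
    return sorted({info[4][0] for info in infos})


def resolve_via_udp(name, server="8.8.8.8", timeout=2.0):
    """Send a raw DNS query over UDP, bypassing the libc resolver."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError as exc:  # includes timeouts
            return "failed: %s" % exc
    if len(reply) < 12:
        return "failed: short reply"
    rcode = reply[3] & 0x0F
    ancount = struct.unpack("!H", reply[6:8])[0]
    return "rcode=%d answers=%d" % (rcode, ancount)
```

Calling resolve_via_libc("www.google.com") and
resolve_via_udp("www.google.com") on the affected host and comparing the two
results should show whether only the libc path is failing.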

Did a strace of ping, and I can see it does open a socket to the nameserver and
sends the query, but nothing goes out the interface... (had tcpdump open in
another screen)

socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4 <0.000786>
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16) = 0 <0.000028>
poll([{fd=4, events=POLLOUT}], 1, 0)    = 1 ([{fd=4, revents=POLLOUT}]) <0.000016>
sendto(4, "ck\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000052>
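
As a sanity check on the trace above, the 32 bytes passed to sendto() can be
decoded as a DNS message. This is a small Python sketch; decode_query and the
dictionary keys are illustrative names, and the payload is the strace string
rewritten with Python byte escapes.

```python
import struct

# Payload from the sendto() line, with strace's octal escapes as \x.. bytes.
PAYLOAD = (b"ck\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00"
           b"\x03www\x06google\x03com\x00\x00\x01\x00\x01")


def decode_query(data):
    """Decode the header and the single question of a DNS query packet."""
    qid, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!HHHHHH", data[:12])
    labels = []
    pos = 12
    while data[pos] != 0:          # a zero length byte ends the name
        length = data[pos]
        labels.append(data[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!HH", data[pos + 1:pos + 5])
    return {
        "id": qid,
        "is_query": (flags >> 15) == 0,
        "recursion_desired": bool(flags & 0x0100),
        "questions": qdcount,
        "name": ".".join(labels),
        "qtype": qtype,    # 1 = A
        "qclass": qclass,  # 1 = IN
    }


print(decode_query(PAYLOAD))
```

It decodes to an ordinary recursive A/IN query for www.google.com, so the
packet handed to the kernel looks well formed. That would point at something
between the socket and the interface rather than at the resolver itself.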

Any idea is welcome, I'm going to get some sleep now that all the services are
back up...


ps. ovirt-srv02 was also the hosted engine master, and was the host that broke
during the upgrade the last time. Maybe it was related to this issue, or this
issue is related to that...

See you tomorrow!


On 02/02, David Caro wrote:
> 
> Hi all,
> 
> We are having a downtime on some of the vms and hosts on the phx lab. It's
> caused by an unexpected issue with the engine and dhcp after changing the
> gateway of the machines to adapt to the new ip range.
> 
> It's almost fixed, but we (I) should really get the environment to a really
> stable status again and finish the upgrade.
> 
> There are still some issues, but most of them are already fixed, will fix the
> rest in less than one hour.
> 
> 
> 
> -- 
> David Caro
> 
> Red Hat S.L.
> Continuous Integration Engineer - EMEA ENG Virtualization R&D
> 
> Tel.: +420 532 294 605
> Email: dcaro at redhat.com
> Web: www.redhat.com
> RHT Global #: 82-62605




