[ovirt-users] 3.6 loses network on reboot
David LeVene
David.LeVene at blackboard.com
Fri Mar 4 00:40:36 EST 2016
-----Original Message-----
From: Dan Kenigsberg [mailto:danken at redhat.com]
Sent: Thursday, March 03, 2016 17:36
To: David LeVene <David.LeVene at blackboard.com>
Cc: edwardh at redhat.com; sabose at redhat.com; users at ovirt.org
Subject: Re: [ovirt-users] 3.6 loses network on reboot
On Thu, Mar 03, 2016 at 12:54:25AM +0000, David LeVene wrote:
>
> Can you check our patches? They should resolve the problem we saw in
> the log: https://gerrit.ovirt.org/#/c/54237 (based on oVirt-3.6.3)
>
> -- I've manually applied the patch to the node I was testing on and
> the networking comes online correctly. Now I'm encountering a gluster
> issue: "cannot find master domain".
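For anyone following along, applying a Gerrit change by hand usually amounts to a fetch and cherry-pick. A rough sketch, assuming patchset 1 of change 54237 in the vdsm repo (the patchset number here is a placeholder):

    git clone https://gerrit.ovirt.org/vdsm && cd vdsm
    # Gerrit change refs are refs/changes/<last 2 digits>/<change>/<patchset>
    git fetch https://gerrit.ovirt.org/vdsm refs/changes/37/54237/1
    git cherry-pick FETCH_HEAD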
You are most welcome to share your logs (preferably on a different thread, to avoid confusion)
-- Will do. I'll start a new thread after I've done some more investigation with the pointers given so far in this thread and by Nir in the next reply.
>
> Without the fixes, as a workaround, I would suggest (if possible) disabling IPv6 on your host's boot line and checking whether all works out for you.
> -- Ok, but as I can manually apply the patch it's good now. Do you
> know which version this is expected to land in? I won't perform an
> ovirt/vdsm update until it's part of the upstream RPMs.
The fix has been proposed to ovirt-3.6.4. I'll make sure it's accepted.
-- Great, thanks!
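For reference, the boot-line workaround mentioned above typically means passing a kernel parameter; a minimal sketch, assuming a GRUB2-based EL7 host (the existing parameters shown are placeholders):

    # /etc/default/grub: append ipv6.disable=1 to GRUB_CMDLINE_LINUX
    GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet ipv6.disable=1"
    # then regenerate the grub config and reboot
    grub2-mkconfig -o /boot/grub2/grub.cfg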
>
> Do you need IPv6 connectivity? If so, you'll need to use a vdsm hook or another interface that is not controlled by oVirt.
> -- Ideally I'd prefer not to have it, but the way our network has
> been configured some hosts are IPv6-only, so at a minimum the guests
> need it; the hypervisors not so much.
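A minimal sketch of the hook route, assuming vdsm's after_network_setup hook point is available on this version, and a spare NIC (em2 here is hypothetical) that oVirt does not manage:

    #!/bin/sh
    # Installed as /usr/libexec/vdsm/hooks/after_network_setup/50_ipv6 (mode 0755).
    # Re-applies a static IPv6 address (documentation prefix, placeholder)
    # on a NIC that oVirt does not control, after vdsm finishes its setup.
    ip -6 addr replace 2001:db8::10/64 dev em2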
May I tap into your IPv6 experience? (Only if you feel comfortable sharing this publicly.) What do these IPv6-only servers do? What do the guests do with them?
-- The group I work with has implemented dual stack, and if a v4 address is not required it's not given. The IPv6-only servers will run an application of some kind (a web server, anything really). As they sit behind LBs, the LBs handle v4 if required. My personal opinion only: it's too much hassle for what it's worth at this point, and I'd prefer we used v4/v6 addresses only at the entry points and v4 internally, or dual stack everywhere.
-- My opinion is based on the fact that I've come across too many pieces of software that require tweaking, additional configuration and/or special compilation to enable it. It also adds an additional layer of complication when troubleshooting applications if v6 isn't working and there's no v4 address to test on.
-- A recent example: downloads.ceph.com advertises both v4 and v6 addresses. On an IPv6-only host it fails to connect to the repo because the v6 address fails, and since a v4 address is advertised, NAT64 isn't performed. This breaks yum. Workaround: fd64::<IPv4 address>.
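Concretely, that workaround amounts to pinning the repo host to a NAT64-mapped address in /etc/hosts; a sketch, with a placeholder IPv4 (203.0.113.25) embedded in the fd64::/96 prefix:

    # /etc/hosts on the IPv6-only host; replace 203.0.113.25 with the
    # repo's real IPv4 address so the traffic is carried via NAT64
    fd64::203.0.113.25    downloads.ceph.com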
>
> -- I've now hit an issue with it not starting up the master (gluster)
> storage domain - as it's a separate issue I'll review the mailing
> lists & create a new item if it's related. I've attached the
> supervdsm.log in case you can save me some time and point me in the
> right direction!
All I see is this:
MainProcess|jsonrpc.Executor/4::ERROR::2016-03-03 11:15:04,699::supervdsmServer::118::SuperVdsm.ServerCallback::(wrapper) Error in wrapper
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 116, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 531, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/cli.py", line 496, in volumeInfo
    xmltree = _execGlusterXml(command)
  File "/usr/share/vdsm/gluster/cli.py", line 108, in _execGlusterXml
    raise ge.GlusterCmdExecFailedException(rc, out, err)
GlusterCmdExecFailedException: Command execution failed return code: 2
which tells me very little. Please share your vdsm.log and gluster logs (possibly /var/log/messages as well) to understand what has happened.
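While gathering those, it may also help to reproduce the failing call outside vdsm; roughly what _execGlusterXml wraps (assuming the CLI flags match this vdsm version):

    gluster --mode=script volume info --xml
    echo "rc=$?"   # vdsm raised on return code 2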
Make sure to include sabose at redhat.com on the thread.
In the past we heard that network disconnections cause glusterd to crash, so it might be the case again.
-- I'll investigate further when I have time and post again under a different topic. Cheers for the tips on which logs to review.
Regards,
Dan.