
On Mon, Jun 25, 2012 at 10:57:47AM +0100, jose garcia wrote:
Good Monday morning,
Installed Fedora 17 and tried to install the node to a 3.1 engine.
I'm getting a VDS Network exception on the engine side:
in /var/log/ovirt-engine/engine:
2012-06-25 10:15:34,132 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (QuartzScheduler_Worker-96) ResourceManager::refreshVdsRunTimeInfo::Failed to refresh VDS , vds = 2e9929c6-bea6-11e1-bfdd-ff11f39c80eb : ovirt-node2.smb.eurotux.local, VDS Network Error, continuing. VDSNetworkException:
2012-06-25 10:15:36,143 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (QuartzScheduler_Worker-20) VDS::handleNetworkException Server failed to respond, vds_id = 2e9929c6-bea6-11e1-bfdd-ff11f39c80eb, vds_name = ovirt-node2.smb.eurotux.local, error = VDSNetworkException:
2012-06-25 10:15:36,181 INFO [org.ovirt.engine.core.bll.VdsEventListener] (pool-3-thread-49) ResourceManager::vdsNotResponding entered for Host 2e9929c6-bea6-11e1-bfdd-ff11f39c80eb, 10.10.30.177
2012-06-25 10:15:36,214 ERROR [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (pool-3-thread-49) [1afd4b89] Failed to run Fence script on vds:ovirt-node2.smb.eurotux.local, VMs moved to UnKnown instead.
Meanwhile, on the node, vdsmd fails to sample the nics:
in /var/log/vdsm/vdsm.log:
    nf = netinfo.NetInfo()
  File "/usr/share/vdsm/netinfo.py", line 268, in __init__
    _netinfo = get()
  File "/usr/share/vdsm/netinfo.py", line 220, in get
    for nic in nics() ])
KeyError: 'p36p1'

MainThread::INFO::2012-06-25 10:45:09,110::vdsm::76::vds::(run) VDSM main thread ended. Waiting for 1 other threads...
MainThread::INFO::2012-06-25 10:45:09,111::vdsm::79::vds::(run) <_MainThread(MainThread, started 140567823243072)>
MainThread::INFO::2012-06-25 10:45:09,111::vdsm::79::vds::(run) <Thread(libvirtEventLoop, started daemon 140567752681216)>
in /var/log/messages there are a lot of "died too quickly" messages for vdsm:
Jun 25 10:45:08 ovirt-node2 respawn: slave '/usr/share/vdsm/vdsm' died too quickly, respawning slave
Jun 25 10:45:08 ovirt-node2 respawn: slave '/usr/share/vdsm/vdsm' died too quickly, respawning slave
Jun 25 10:45:09 ovirt-node2 respawn: slave '/usr/share/vdsm/vdsm' died too quickly for more than 30 seconds, master sleeping for 900 seconds
I don't know why Fedora 17 calls p36p1 what was eth0 in Fedora 16, but I tried configuring an ovirtmgmt bridge, and the only difference is that the KeyError becomes 'ovirtmgmt'.
The nic renaming may have happened due to biosdevname. Do you have it installed? Do any of the /etc/sysconfig/network-scripts/ifcfg-* files refer to an old nic name?

Which version of vdsm are you running? It seems that it is pre-v4.9.4-61-g24f8627, which is too old to run on f17: the output of ifconfig has changed. Please retry with the latest beta version:

https://koji.fedoraproject.org/koji/buildinfo?buildID=327015

If the problem persists, could you run vdsm manually, with

# su - vdsm -s /bin/bash
# cd /usr/share/vdsm
# ./vdsm

Maybe it would give a hint about the crash.

regards,
Dan.
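
The check for ifcfg-* files that still refer to an old nic name can be scripted. What follows is only a minimal sketch in Python, not part of vdsm: the function names ifcfg_devices and stale_configs are made up for illustration, and it assumes that each ifcfg file names its interface on a DEVICE= line and that every interface currently known to the kernel has an entry under /sys/class/net.

```python
import os
import re

# Hypothetical helper (not part of vdsm): collect the DEVICE= value
# declared by each ifcfg-* file in the given directory.
def ifcfg_devices(scripts_dir):
    devices = {}
    for name in sorted(os.listdir(scripts_dir)):
        if not name.startswith("ifcfg-"):
            continue
        device = None
        with open(os.path.join(scripts_dir, name)) as f:
            for line in f:
                # Accept DEVICE=eth0 as well as DEVICE="eth0"
                m = re.match(r'\s*DEVICE\s*=\s*["\']?([^"\'\s#]+)', line)
                if m:
                    device = m.group(1)
        devices[name] = device
    return devices

# Hypothetical helper: report the ifcfg files whose DEVICE is not an
# interface currently present on the system.
def stale_configs(scripts_dir="/etc/sysconfig/network-scripts",
                  sysfs_net="/sys/class/net"):
    present = set(os.listdir(sysfs_net))
    return dict((name, dev)
                for name, dev in ifcfg_devices(scripts_dir).items()
                if dev is not None and dev not in present)
```

Calling stale_configs() with no arguments on the node returns a dict mapping file names to the device names they mention but which do not exist right now. A leftover ifcfg-eth0 on a host where the nic is now called p36p1 would show up there. A bridge that is configured but not yet created would show up as well, so the result is a list of candidates to look at rather than a verdict.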