[ovirt-users] gfapi, 3.5.1

Alex Crow acrow at integrafin.co.uk
Tue Dec 9 10:33:42 UTC 2014


Hi,

Will the vdsm patches that properly enable libgfapi storage for VMs (and 
the matching refactored code in the hosted-engine setup scripts) make it 
into 3.5.1? They don't seem to be in the snapshots yet.

I notice the support is in the master/3.6 snapshot, but something stops the 
HA agent in self-hosted setups from connecting storage:

From a master test setup:
/var/log/ovirt-hosted-engine-ha/broker.log

MainThread::INFO::2014-12-08 
19:22:56,287::hosted_engine::222::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) 
Found certificate common name: 172.17.10.50
MainThread::WARNING::2014-12-08 
19:22:56,395::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:23:11,501::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:23:26,610::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:23:41,717::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:23:56,824::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::ERROR::2014-12-08 
19:24:11,840::hosted_engine::500::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed trying to connect storage:
MainThread::ERROR::2014-12-08 
19:24:11,840::agent::173::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) 
Error: 'Failed trying to connect storage' - trying to restart agent
MainThread::WARNING::2014-12-08 
19:24:16,845::agent::176::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) 
Restarting agent, attempt '8'
MainThread::INFO::2014-12-08 
19:24:16,855::hosted_engine::222::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) 
Found certificate common name: 172.17.10.50
MainThread::WARNING::2014-12-08 
19:24:16,962::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:24:32,069::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:24:47,181::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:25:02,288::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::WARNING::2014-12-08 
19:25:17,389::hosted_engine::497::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed to connect storage, waiting '15' seconds before the next attempt
MainThread::ERROR::2014-12-08 
19:25:32,404::hosted_engine::500::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) 
Failed trying to connect storage:
MainThread::ERROR::2014-12-08 
19:25:32,404::agent::173::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) 
Error: 'Failed trying to connect storage' - trying to restart agent
MainThread::WARNING::2014-12-08 
19:25:37,409::agent::176::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) 
Restarting agent, attempt '9'
MainThread::ERROR::2014-12-08 
19:25:37,409::agent::178::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) 
Too many errors occurred, giving up. Please review the log and consider 
filing a bug.
MainThread::INFO::2014-12-08 
19:25:37,409::agent::118::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent 
shutting down

vdsm.log:

Detector thread::DEBUG::2014-12-08 
19:20:45,458::protocoldetector::214::vds.MultiProtocolAcceptor::(_remove_connection) 
Removing connection 127.0.0.1:53083
Detector thread::DEBUG::2014-12-08 
19:20:45,458::BindingXMLRPC::1193::XmlDetector::(handleSocket) xml over 
http detected from ('127.0.0.1', 53083)
Thread-44::DEBUG::2014-12-08 
19:20:45,459::BindingXMLRPC::318::vds::(wrapper) client [127.0.0.1]
Thread-44::DEBUG::2014-12-08 
19:20:45,460::task::592::Storage.TaskManager.Task::(_updateState) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::moving from state init -> 
state preparing
Thread-44::INFO::2014-12-08 
19:20:45,460::logUtils::48::dispatcher::(wrapper) Run and protect: 
connectStorageServer(domType=1, 
spUUID='ab2b5ee7-9aa7-426f-9d58-5e7d3840ad81', conList=[{'connection': 
'zebulon.ifa.net:/engine', 'iqn': ',', 'protocol_version': '3'
, 'kvm': 'password', '=': 'user', ',': '='}], options=None)
Thread-44::DEBUG::2014-12-08 
19:20:45,461::hsm::2384::Storage.HSM::(__prefetchDomains) nfs local 
path: /rhev/data-center/mnt/zebulon.ifa.net:_engine
Thread-44::DEBUG::2014-12-08 
19:20:45,462::hsm::2408::Storage.HSM::(__prefetchDomains) Found SD 
uuids: (u'd3240928-dae9-4ed0-8a28-7ab552455063',)
Thread-44::DEBUG::2014-12-08 
19:20:45,463::hsm::2464::Storage.HSM::(connectStorageServer) knownSDs: 
{d3240928-dae9-4ed0-8a28-7ab552455063: storage.nfsSD.findDomain}
Thread-44::ERROR::2014-12-08 
19:20:45,463::task::863::Storage.TaskManager.Task::(_setError) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::Unexpected error
Traceback (most recent call last):
   File "/usr/share/vdsm/storage/task.py", line 870, in _run
     return fn(*args, **kargs)
   File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
     res = f(*args, **kwargs)
   File "/usr/share/vdsm/storage/hsm.py", line 2466, in connectStorageServer
     res.append({'id': conDef["id"], 'status': status})
KeyError: 'id'
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::882::Storage.TaskManager.Task::(_run) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::Task._run: 
b5accf8f-014a-412d-9fb8-9e9447d49b72 (1, 
'ab2b5ee7-9aa7-426f-9d58-5e7d3840ad81', [{'kvm': 'password', ',': '=', 
'connection': 'zebulon.ifa.net:/engine', 'iqn': ',', 'protocol_version': '3', 
'=': 'user'}]) {} failed - stopping task
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::1214::Storage.TaskManager.Task::(stop) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::stopping in state preparing 
(force False)
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::990::Storage.TaskManager.Task::(_decref) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::ref 1 aborting True
Thread-44::INFO::2014-12-08 
19:20:45,463::task::1168::Storage.TaskManager.Task::(prepare) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::aborting: Task is aborted: 
u"'id'" - code 100
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::1173::Storage.TaskManager.Task::(prepare) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::Prepare: aborted: 'id'
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::990::Storage.TaskManager.Task::(_decref) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::ref 0 aborting True
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::925::Storage.TaskManager.Task::(_doAbort) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::Task._doAbort: force False
Thread-44::DEBUG::2014-12-08 
19:20:45,463::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
Owner.cancelAll requests {}
Thread-44::DEBUG::2014-12-08 
19:20:45,463::task::592::Storage.TaskManager.Task::(_updateState) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::moving from state preparing 
-> state aborting
Thread-44::DEBUG::2014-12-08 
19:20:45,464::task::547::Storage.TaskManager.Task::(__state_aborting) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::_aborting: recover policy none
Thread-44::DEBUG::2014-12-08 
19:20:45,464::task::592::Storage.TaskManager.Task::(_updateState) 
Task=`b5accf8f-014a-412d-9fb8-9e9447d49b72`::moving from state aborting 
-> state failed
Thread-44::DEBUG::2014-12-08 
19:20:45,464::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) 
Owner.releaseAll requests {} resources {}
Thread-44::DEBUG::2014-12-08 
19:20:45,464::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
Owner.cancelAll requests {}
Thread-44::ERROR::2014-12-08 
19:20:45,464::dispatcher::79::Storage.Dispatcher::(wrapper) 'id'
Traceback (most recent call last):
   File "/usr/share/vdsm/storage/dispatcher.py", line 71, in wrapper
     result = ctask.prepare(func, *args, **kwargs)
   File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
     return m(self, *a, **kw)
   File "/usr/share/vdsm/storage/task.py", line 1176, in prepare
     raise self.error
KeyError: 'id'
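
The KeyError above comes from connectStorageServer() in hsm.py: the quoted 
line res.append({'id': conDef["id"], 'status': status}) assumes every 
connection definition carries an 'id' key, but the conList logged just 
before it has no 'id' at all, and its remaining keys ('kvm': 'password', 
'=': 'user', ',': '=') look as if the key/value pairs were scrambled on the 
way in. A minimal sketch of that failure mode (the "good" dict and the 
append_result() helper are only my guesses at a sane NFS connection 
definition, not taken from vdsm):

def append_result(res, conDef, status=0):
    # same shape as hsm.py line 2466 from the traceback:
    # res.append({'id': conDef["id"], 'status': status})
    res.append({'id': conDef["id"], 'status': status})

# guessed well-formed entry (assumption, not from vdsm)
good = {'id': '00000000-0000-0000-0000-000000000000',
        'connection': 'zebulon.ifa.net:/engine',
        'user': '', 'password': '', 'iqn': '',
        'protocol_version': '3'}

# the entry exactly as logged above - note there is no 'id' key
bad = {'connection': 'zebulon.ifa.net:/engine', 'iqn': ',',
       'protocol_version': '3', 'kvm': 'password', '=': 'user', ',': '='}

res = []
append_result(res, good)      # fine, 'id' is present
try:
    append_result(res, bad)   # KeyError: 'id', as in vdsm.log above
except KeyError as err:
    print('KeyError: %s' % err)

Further down in the same vdsm.log, libvirt itself also refuses the 
connection:
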
clientIFinit::ERROR::2014-12-08 
19:20:48,190::clientIF::460::vds::(_recoverExistingVms) Vm's recovery failed
Traceback (most recent call last):
   File "/usr/share/vdsm/clientIF.py", line 404, in _recoverExistingVms
     caps.CpuTopology().cores())
   File "/usr/share/vdsm/caps.py", line 200, in __init__
     self._topology = _getCpuTopology(capabilities)
   File "/usr/share/vdsm/caps.py", line 232, in _getCpuTopology
     capabilities = _getFreshCapsXMLStr()
   File "/usr/share/vdsm/caps.py", line 222, in _getFreshCapsXMLStr
     return libvirtconnection.get().getCapabilities()
   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", 
line 157, in get
     passwd)
   File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", 
line 102, in open_connection
     return utils.retry(libvirtOpen, timeout=10, sleep=0.2)
   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 935, in retry
     return func()
   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 102, in 
openAuth
     if ret is None:raise libvirtError('virConnectOpenAuth() failed')
libvirtError: authentication failed: polkit: 
polkit\56retains_authorization_after_challenge=1
Authorization requires authentication but no agent is available.
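
That last traceback suggests vdsm can't even open its libvirt connection: 
openAuth() fails with a polkit error ("Authorization requires authentication 
but no agent is available"), which usually means libvirtd fell back to 
polkit instead of accepting vdsm's SASL credentials. A quick way to test 
that outside vdsm (the user name and password file below are just the usual 
vdsm defaults on my hosts, so treat them as assumptions):

import libvirt

def request_cred(credentials, user_data):
    # fill in the SASL credentials libvirtd asks for
    for cred in credentials:
        if cred[0] == libvirt.VIR_CRED_AUTHNAME:
            cred[4] = 'vdsm@ovirt'   # assumption: default vdsm SASL user
        elif cred[0] == libvirt.VIR_CRED_PASSPHRASE:
            # assumption: usual location of the vdsm libvirt password
            cred[4] = open('/etc/pki/vdsm/keys/libvirt_password').read().strip()
    return 0

auth = [[libvirt.VIR_CRED_AUTHNAME, libvirt.VIR_CRED_PASSPHRASE],
        request_cred, None]
conn = libvirt.openAuth('qemu:///system', auth, 0)
print(conn.getCapabilities()[:120])   # same call the traceback dies in
conn.close()

If openAuth() succeeds there, the polkit error is probably separate from the 
mangled conList/KeyError problem above.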

