[Users] NFS Domains down because of single node failure

Markus Stockhausen stockhausen at collogia.de
Tue Sep 17 07:01:55 UTC 2013


Hello,

maybe a stupid question, but ...

When I create an (NFS) storage domain I have to provide a node host that makes
the initial contact. All other node hosts will connect directly to that domain,
so there are no bottlenecks.

Today I stopped one of my two nodes outside of ovirt-engine. For simplicity,
let's assume the node crashed. The machine is up right now but VDSM is down.

All domains that were set up with this host are "down" now (red arrow down).
After searching the web interface I found "Data center" -> "Select your DC" ->
"Storage" -> "Activate". Trying to activate only results in a failure message.
To ensure that I can recover from such situations in the future, I'd like to
know what this node binding is all about and what to do next.

Logs attached & thanks in advance

Markus

2013-09-17 08:54:54,985 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-6-thread-50) [1be325f3] spm vds is non responsive, stopping spm selection.
2013-09-17 08:54:54,986 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ActivateStorageDomainVDSCommand] (pool-6-thread-50) [1be325f3] FINISH, ActivateStorageDomainVDSCommand, log id: 4c38e98
2013-09-17 08:54:54,987 ERROR [org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand] (pool-6-thread-50) [1be325f3] Command org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand throw Vdc Bll exception. With error message VdcBLLException: Cannot allocate IRS server (Failed with VDSM error IRS_REPOSITORY_NOT_FOUND and code 5009)
2013-09-17 08:54:54,989 INFO  [org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand] (pool-6-thread-50) [1be325f3] Command [id=a0dbe909-fbb1-40ff-b77a-8e43bd075ace]: Compensating CHANGED_STATUS_ONLY of org.ovirt.engine.core.common.businessentities.StoragePoolIsoMap; snapshot: EntityStatusSnapshot [id=storagePoolId = b054727d-fe4a-41ed-8393-a81e36b8a1af, storageId = ecf7f507-b0fa-47ee-a8b2-d621fbd7b8bf, status=Unknown].
2013-09-17 08:54:55,004 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-99) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
2013-09-17 08:54:55,008 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-6-thread-50) [1be325f3] Correlation ID: 1be325f3, Job ID: c88c00ba-0298-4f42-bc0e-a720d79c5f49, Call Stack: null, Custom Event ID: -1, Message: Failed to activate Storage Domain NAS5_IB (Data Center Collogia) by admin at internal
2013-09-17 08:54:56,263 INFO  [org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand] (ajp--127.0.0.1-8702-2) [5c6218c1] Lock Acquired to object EngineLock [exclusiveLocks= key: ecf7f507-b0fa-47ee-a8b2-d621fbd7b8bf value: STORAGE
, sharedLocks= ]
2013-09-17 08:54:56,272 INFO  [org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand] (pool-6-thread-50) [5c6218c1] Running command: ActivateStorageDomainCommand internal: false. Entities affected :  ID: ecf7f507-b0fa-47ee-a8b2-d621fbd7b8bf Type: Storage
2013-09-17 08:54:56,291 INFO  [org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand] (pool-6-thread-50) [5c6218c1] Lock freed to object EngineLock [exclusiveLocks= key: ecf7f507-b0fa-47ee-a8b2-d621fbd7b8bf value: STORAGE
, sharedLocks= ]
2013-09-17 08:54:56,292 INFO  [org.ovirt.engine.core.bll.storage.ActivateStorageDomainCommand] (pool-6-thread-50) [5c6218c1] ActivateStorage Domain. Before Connect all hosts to pool. Time:9/17/13 8:54 AM
2013-09-17 08:54:56,296 INFO  [org.ovirt.engine.core.bll.storage.ConnectStorageToVdsCommand] (pool-6-thread-47) Running command: ConnectStorageToVdsCommand internal: true. Entities affected :  ID: aaa00000-0000-0000-0000-123456789aaa Type: System
2013-09-17 08:54:56,299 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-6-thread-47) START, ConnectStorageServerVDSCommand(HostName = colovn3, HostId = 0fdccd63-f5d7-41e4-8350-5941bbc29270, storagePoolId = 00000000-0000-0000-0000-000000000000, storageType = NFS, connectionList = [{ id: 68c31a49-0e37-4438-a8fe-fc28be62cd3f, connection: 10.10.30.251:/var/nas5/ovirt, iqn: null, vfsType: null, mountOptions: null, nfsVersion: null, nfsRetrans: null, nfsTimeo: null };]), log id: 75a9c6a0
2013-09-17 08:54:56,317 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-6-thread-47) FINISH, ConnectStorageServerVDSCommand, return: {68c31a49-0e37-4438-a8fe-fc28be62cd3f=0}, log id: 75a9c6a0
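
For anyone trying to follow an excerpt like the one above, here is a minimal Python sketch that pulls the WARN/ERROR entries out of engine.log lines and optionally filters them by the bracketed correlation ID (for example 1be325f3). The line layout is assumed from the excerpt only; the names LINE_RE, Entry, parse and problems are hypothetical helpers, not part of oVirt.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Optional

# Layout assumed from the excerpt above:
#   <timestamp> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>
# The correlation id is optional; some lines do not carry one.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"(?:\[(?P<corr>[0-9a-f]+)\]\s+)?"
    r"(?P<msg>.*)$"
)


class Entry(NamedTuple):
    ts: str
    level: str
    logger: str
    thread: str
    corr: Optional[str]
    msg: str


def parse(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per line that matches the assumed layout.

    Continuation lines (such as ', sharedLocks= ]') do not match
    and are skipped.
    """
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m:
            yield Entry(**m.groupdict())


def problems(lines: Iterable[str], corr: Optional[str] = None) -> List[Entry]:
    """Return WARN/ERROR entries, optionally limited to one correlation id."""
    return [
        e for e in parse(lines)
        if e.level in ("WARN", "ERROR") and (corr is None or e.corr == corr)
    ]


# Three lines taken from the excerpt above (the last one shortened).
SAMPLE = [
    "2013-09-17 08:54:54,985 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-6-thread-50) [1be325f3] spm vds is non responsive, stopping spm selection.",
    "2013-09-17 08:54:54,986 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.ActivateStorageDomainVDSCommand] (pool-6-thread-50) [1be325f3] FINISH, ActivateStorageDomainVDSCommand, log id: 4c38e98",
    "2013-09-17 08:54:55,004 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-99) Command GetCapabilitiesVDS execution failed.",
]

if __name__ == "__main__":
    for entry in problems(SAMPLE):
        print(entry.ts, entry.level, entry.corr, entry.msg)
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. problems(open(path), corr="1be325f3"), where path is wherever engine.log lives on your engine machine.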
