[ovirt-users] Auto-SOLVED, but read anyway : Invalid status on Data Center. Setting status to Non Responsive.

Nicolas Ecarnot nicolas at ecarnot.net
Fri May 9 13:55:48 UTC 2014


On our second oVirt setup in 3.4.0-1.el6 (that was running fine), I did 
a yum upgrade on the engine (...sigh...).
Then rebooted the engine.
This machine is hosting the NFS export domain.
Though the VMs are still running, the storage domain is in an invalid 
status. You'll find the engine.log below.

At first sight, I thought it was the same issue as :
because it looked very similar.
But the NFS export domain connection seemed OK (tested).
I tried every trick I could think of: restarting, checking everything...
Our cluster stayed in a broken state.

On second sight, I saw that after rebooting the engine, the NFS export 
domain was not mounted correctly (I had written a static /dev/sd-something 
in fstab, and the iSCSI manager changed the letter. Next time, I'll use LVM 
or a label).
So the directory served over NFS was void/empty/a black hole.
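
For the record, here is a minimal sketch of the kind of sanity check that 
would have caught this, assuming a Linux box with Python available. The 
function names and the export path are hypothetical, not anything oVirt 
provides. It flags fstab entries that rely on a bare /dev/sdX name, and 
it checks that the exported directory is really a mount point with 
something in it.

```python
import os
import re

# Kernel-assigned names such as /dev/sdb or /dev/sdc1 can change between
# boots (for example when iSCSI disks are discovered in a different order).
UNSTABLE_DEVICE = re.compile(r"^/dev/sd[a-z]+\d*$")


def unstable_fstab_entries(fstab_text):
    """Return (device, mount_point) pairs that use a bare /dev/sdX name."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and UNSTABLE_DEVICE.match(fields[0]):
            entries.append((fields[0], fields[1]))
    return entries


def export_looks_sane(path):
    """True only if path is a real mount point that contains something."""
    # ismount() is False for a missing or unmounted directory, so listdir()
    # is only reached when the path exists and is mounted.
    return os.path.ismount(path) and len(os.listdir(path)) > 0


if __name__ == "__main__":
    # Sample fstab content for illustration; read /etc/fstab on a real host.
    sample = (
        "/dev/sdb1    /exports/ovirt-export  ext4  defaults  0 0\n"
        "LABEL=data   /srv/data              ext4  defaults  0 0\n"
    )
    for device, mount_point in unstable_fstab_entries(sample):
        print("unstable device name:", device, "on", mount_point)

    export_dir = "/exports/ovirt-export"  # hypothetical export path
    if not export_looks_sane(export_dir):
        print("WARNING:", export_dir, "is not mounted or is empty")
```

This is only a rough check: a freshly formatted filesystem may already 
contain a lost+found directory, so "not empty" does not prove the export 
domain data is there. It would still have caught the unmounted case 
described above.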

I just realized all the above, and spent my afternoon in cold sweat.
Correcting the NFS mounting and restarting the engine did the trick.
What still disturbs me is that the unavailability of the NFS export 
domain should NOT be a reason for the MASTER storage domain to break!

Following the URL above and the BZ opened by the user 
(https://bugzilla.redhat.com/show_bug.cgi?id=1072900), I see this has 
been corrected in 3.4.1. But what happens with an NFS export domain that 
is perfectly connected, but empty?

PS: I see no 3.4.1 update in the CentOS repo.



The engine log :

2014-05-09 14:40:37,767 INFO 
(DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling started: 
taskId = 6d612398-fdad-49f2-9874-5f32a9bf87e2 
2014-05-09 14:40:40,848 ERROR 
(DefaultQuartzScheduler_Worker-28) [f685ea4] Failed in 
HSMGetTaskStatusVDS method 
2014-05-09 14:40:40,850 INFO 
(DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling ended: 
taskId = 6d612398-fdad-49f2-9874-5f32a9bf87e2 task status = finished 
2014-05-09 14:40:40,850 ERROR 
(DefaultQuartzScheduler_Worker-28) [f685ea4] Start SPM Task failed - 
result: cleanSuccess, message: VDSGenericException: VDSErrorException: 
Failed to HSMGetTaskStatusVDS, error = Storage domain does not exist, 
code = 358
2014-05-09 14:40:40,913 INFO 
(DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling ended, spm 
status: Free 
2014-05-09 14:40:40,932 INFO 
(DefaultQuartzScheduler_Worker-28) [f685ea4] START, 
HSMClearTaskVDSCommand(HostName = serv-vm-adm17, HostId = 
taskId=6d612398-fdad-49f2-9874-5f32a9bf87e2), log id: 5cfdc8ce 
2014-05-09 14:40:40,982 INFO 
(DefaultQuartzScheduler_Worker-28) [f685ea4] FINISH, 
HSMClearTaskVDSCommand, log id: 5cfdc8ce 
2014-05-09 14:40:40,983 INFO 
(DefaultQuartzScheduler_Worker-28) [f685ea4] FINISH, SpmStartVDSCommand, 
org.ovirt.engine.core.common.businessentities.SpmStatusResult@39471ba9, 
log id: 58ec77ee 
2014-05-09 14:40:40,985 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] Running command: 
SetStoragePoolStatusCommand internal: true. Entities affected :  ID: 
5849b030-626e-47cb-ad90-3ce782d831b3 Type: StoragePool 
2014-05-09 14:40:41,009 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] Correlation ID: 6b69119f, 
Call Stack: null, Custom Event ID: -1, Message: Invalid status on Data 
Center Etat-Major3. Setting status to Non Responsive. 
2014-05-09 14:40:41,017 ERROR 
(DefaultQuartzScheduler_Worker-28) [6b69119f] 
IrsBroker::Failed::GetStoragePoolInfoVDS due to: 
IrsSpmStartFailedException: IRSGenericException: IRSErrorException: 
SpmStart failed 
2014-05-09 14:40:41,112 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] Irs placed on server 
049943eb-2bcc-4167-a780-7ef76a1f95e9 failed. Proceed Failover 
2014-05-09 14:40:41,206 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] hostFromVds::selectedVds - 
serv-vm-adm16, spmStatus Free, storage pool Etat-Major3 
2014-05-09 14:40:41,209 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] starting spm on vds 
serv-vm-adm16, storage pool Etat-Major3, prevId -1, LVER -1 
2014-05-09 14:40:41,227 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] START, 
SpmStartVDSCommand(HostName = serv-vm-adm16, HostId = 
13a2bc0a-979a-4fcd-8597-06131030d9a0, storagePoolId = 
5849b030-626e-47cb-ad90-3ce782d831b3, prevId=-1, prevLVER=-1, 
storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), 
log id: 67d013a4 
2014-05-09 14:40:41,292 INFO 
(DefaultQuartzScheduler_Worker-28) [6b69119f] spmStart polling started: 
taskId = 1046fd3e-71e4-4fcd-bbd0-f17cd6dc08e4 
2014-05-09 14:40:44,438 ERROR 
(DefaultQuartzScheduler_Worker-28) [6b69119f] Failed in 
HSMGetTaskStatusVDS method 

Nicolas Ecarnot
