Hi,
On our second oVirt setup, running 3.4.0-1.el6 (and running fine until then), I did a yum upgrade on the engine (...sigh...).
Then rebooted the engine.
This machine is hosting the NFS export domain.
Though the VMs are still running, the storage domain is in an invalid status. You'll find the engine.log below.
At first sight, I thought it was the same issue as:
http://lists.ovirt.org/pipermail/users/2014-March/022161.html
because it looked very similar.
But the NFS export domain connection seemed OK (tested).
I tried every trick I could think of: restarting, checking everything...
Our cluster stayed in a broken state.
On a second look, I saw that after the engine rebooted, the NFS export domain was not mounted correctly (I had written a static /dev/sd-something in fstab, and the iSCSI manager changed the letter. Next time, I'll use LVM or a label).
So the NFS share being served was void/empty/a black hole.
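To catch this kind of fstab entry before the next reboot, here is a minimal sketch in Python. It is not part of oVirt; the function name and the regular expression are my own assumptions, and it only flags kernel-assigned /dev/sdX names, which can change when disks are discovered in a different order.

```python
import re

# Kernel-assigned SCSI/iSCSI disk names such as /dev/sdb or /dev/sdb1.
# These are not stable across reboots; LABEL=, UUID= or LVM paths are.
UNSTABLE_DEVICE = re.compile(r"^/dev/sd[a-z]+\d*$")


def find_unstable_fstab_entries(fstab_text):
    """Return (device, mount_point) pairs whose device is a raw /dev/sdX name."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if UNSTABLE_DEVICE.match(device):
            entries.append((device, mount_point))
    return entries
```

Feeding it the contents of /etc/fstab gives the list of mount points that depend on a drive letter and should probably be switched to a label, a UUID or an LVM volume.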
I just realized all of the above, and spent my afternoon in a cold sweat.
Correcting the NFS mount and restarting the engine did the trick.
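A check like the following could have warned me that the exported directory was just an empty, unmounted mount point. It is only a sketch under my own assumptions (the function name is hypothetical, and "not a mount point or empty" is my definition of "broken"), using nothing but the Python standard library.

```python
import os


def export_dir_looks_sane(path):
    """Return True if path is a mount point that contains at least one entry."""
    # An unmounted mount point is a plain directory on the parent filesystem.
    if not os.path.ismount(path):
        return False
    # A mounted but empty filesystem is also suspicious for an export domain.
    with os.scandir(path) as entries:
        return any(True for _ in entries)
```

Run against the directory exported over NFS before starting the NFS server or the engine, it returns False in exactly the situation described above.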
What still disturbs me is that the unavailability of the NFS export domain should NOT be a reason for the MASTER storage domain to break!
Following the URL above and the BZ opened by that user (https://bugzilla.redhat.com/show_bug.cgi?id=1072900), I see this has been corrected in 3.4.1. But what happens with an NFS export domain that is perfectly connected, but empty?
PS: I see no 3.4.1 update in the CentOS repo.
Regards,
--------------------------
The engine log:
2014-05-09 14:40:37,767 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling started: taskId = 6d612398-fdad-49f2-9874-5f32a9bf87e2
2014-05-09 14:40:40,848 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] Failed in HSMGetTaskStatusVDS method
2014-05-09 14:40:40,850 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling ended: taskId = 6d612398-fdad-49f2-9874-5f32a9bf87e2 task status = finished
2014-05-09 14:40:40,850 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Storage domain does not exist, code = 358
2014-05-09 14:40:40,913 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] spmStart polling ended, spm status: Free
2014-05-09 14:40:40,932 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] START, HSMClearTaskVDSCommand(HostName = serv-vm-adm17, HostId = 049943eb-2bcc-4167-a780-7ef76a1f95e9, taskId=6d612398-fdad-49f2-9874-5f32a9bf87e2), log id: 5cfdc8ce
2014-05-09 14:40:40,982 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] FINISH, HSMClearTaskVDSCommand, log id: 5cfdc8ce
2014-05-09 14:40:40,983 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [f685ea4] FINISH, SpmStartVDSCommand, return: org.ovirt.engine.core.common.businessentities.SpmStatusResult@39471ba9, log id: 58ec77ee
2014-05-09 14:40:40,985 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: 5849b030-626e-47cb-ad90-3ce782d831b3 Type: StoragePool
2014-05-09 14:40:41,009 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-28) [6b69119f] Correlation ID: 6b69119f, Call Stack: null, Custom Event ID: -1, Message: Invalid status on Data Center Etat-Major3. Setting status to Non Responsive.
2014-05-09 14:40:41,017 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed
2014-05-09 14:40:41,112 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] Irs placed on server 049943eb-2bcc-4167-a780-7ef76a1f95e9 failed. Proceed Failover
2014-05-09 14:40:41,206 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] hostFromVds::selectedVds - serv-vm-adm16, spmStatus Free, storage pool Etat-Major3
2014-05-09 14:40:41,209 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] starting spm on vds serv-vm-adm16, storage pool Etat-Major3, prevId -1, LVER -1
2014-05-09 14:40:41,227 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] START, SpmStartVDSCommand(HostName = serv-vm-adm16, HostId = 13a2bc0a-979a-4fcd-8597-06131030d9a0, storagePoolId = 5849b030-626e-47cb-ad90-3ce782d831b3, prevId=-1, prevLVER=-1, storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 67d013a4
2014-05-09 14:40:41,292 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] spmStart polling started: taskId = 1046fd3e-71e4-4fcd-bbd0-f17cd6dc08e4
2014-05-09 14:40:44,438 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] (DefaultQuartzScheduler_Worker-28) [6b69119f] Failed in HSMGetTaskStatusVDS method
--
Nicolas Ecarnot