[ovirt-users] Datacenter unresponsive: recovery procedure?
Jorick Astrego
j.astrego at netbulae.eu
Thu Apr 16 09:39:42 UTC 2015
On 04/16/2015 11:12 AM, Andrea Ghelardi wrote:
>
> (sorry: resending as I wasn’t part of the list, yet)
>
>
>
> Hi,
>
> This is my first post, so hello all and thank you for reading.
>
> I have an issue with my production oVirt environment (3.5.1.1-1.el6).
>
> My system consists of several datacenters.
>
> Two of them are connected to an iSCSI SAN and were working fine,
> until I had the bad idea of deleting a SAN volume from the SAN manager
> before deleting the associated storage in oVirt. From that moment, the
> DC where this storage was mounted became non-responsive: it cannot
> attach the master storage (or any other).
>
> I tried the following, but none of it recovered the situation:
>
> 1) manually destroying the offending storage (select -> destroy)
>
> 2) right-clicking on the master storage and activating it
>
> 3) re-initializing the datacenter using an NFS storage from the
> working sister DC
>
>
>
> All Hosts are still running even though their status is "unknown".
>
> All VMs are still running even though their status is "not responding".
>
>
>
> I partly resolved the issue by manually restarting the host where the
> datastore was originally mounted. This cleared the orphaned multipath.
>
> However, the SPM still does not come up.
>
> This is an extract of the log:
>
> 2015-04-16 03:51:48,069 WARN
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand]
> (DefaultQuartzScheduler_Worker-14) [61a44b19] could not stop spm of
> pool 00000002-0002-0002-0002-00000000009c on vds
> 89254f23-8748-402a-afc9-08438dca0975 - reason:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> VDSGenericException: VDSNetworkException: Message timeout which can be
> caused by communication issues
>
> 2015-04-16 03:51:48,072 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand]
> (DefaultQuartzScheduler_Worker-14) [61a44b19] FINISH,
> SpmStopVDSCommand, log id: 4354cf46
>
> 2015-04-16 03:51:48,072 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
> (DefaultQuartzScheduler_Worker-14) [61a44b19] spm stop on spm failed,
> stopping spm selection!
>
> 2015-04-16 03:51:58,223 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
> (DefaultQuartzScheduler_Worker-4) [4ca2d938] hostFromVds::selectedVds
> - Brachetto, spmStatus Free, storage pool IRDC-INTEL
>
> 2015-04-16 03:51:58,225 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
> (DefaultQuartzScheduler_Worker-4) [4ca2d938] SPM Init: could not find
> reported vds or not up - pool:IRDC-INTEL vds_spm_id: 3
>
> 2015-04-16 03:51:58,239 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
> (DefaultQuartzScheduler_Worker-4) [4ca2d938] SPM selection - vds seems
> as spm sovana
>
> 2015-04-16 03:51:58,252 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand]
> (DefaultQuartzScheduler_Worker-4) [4ca2d938] START,
> SpmStopVDSCommand(HostName = sovana, HostId =
> 89254f23-8748-402a-afc9-08438dca0975, storagePoolId =
> 00000002-0002-0002-0002-00000000009c), log id: 63a17687
>
>
>
> storagePoolId = 00000002-0002-0002-0002-00000000009c is (was)
> hertz-dstore2, which no longer exists on the SAN or in oVirt.
>
> HostId 89254f23-8748-402a-afc9-08438dca0975 is the sovana server
> (current SPM).
>
> I’m thinking about doing the following:
>
> 1) Put the hosted engine host into Maintenance
>
> 2) Shut down oVirt Manager
>
> 3) Reboot the SPM server
>
> 4) Restart oVirt Manager
>
> 5) Take the hosted engine host out of Maintenance
>
> Any help or clue is highly welcome, with cheers and beers.
>
> Thank you!
>
>
I had comparable issues nearly a year ago after a failed iSCSI failover
that ended in a split brain. I wasn't able to recover from it:
https://bugzilla.redhat.com/show_bug.cgi?id=1108576
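
For anyone triaging a similar situation, it can help to pull only the
SPM-related warnings and errors out of the engine log before posting.
The following is a minimal sketch, not an oVirt tool: the record layout
(timestamp, level, [logger], (thread), [correlation id], message) is an
assumption inferred solely from the extract quoted above, and the names
rejoin, parse_events and spm_problems are hypothetical helpers made up
for this example.

```python
import re
from typing import Iterator, List, NamedTuple

# Assumed record layout, inferred only from the extract quoted above:
# <date> <time>,<ms> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
RECORD = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"(?:\[(?P<corr>[^\]]*)\]\s+)?"
    r"(?P<msg>.*)"
)


class Event(NamedTuple):
    ts: str
    level: str
    logger: str  # short class name only, e.g. "IrsProxyData"
    msg: str


def rejoin(text: str) -> List[str]:
    """Merge wrapped lines so that each record sits on a single line."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # A line that starts with a timestamp opens a new record.
        if TIMESTAMP.match(line) and current:
            records.append(" ".join(current))
            current = []
        current.append(line)
    if current:
        records.append(" ".join(current))
    return records


def parse_events(text: str) -> Iterator[Event]:
    """Yield an Event for every record that fits the assumed layout."""
    for record in rejoin(text):
        m = RECORD.match(record)
        if m:
            short_logger = m.group("logger").rsplit(".", 1)[-1]
            yield Event(m.group("ts"), m.group("level"), short_logger, m.group("msg"))


def spm_problems(text: str) -> List[Event]:
    """Keep only WARN/ERROR events whose message mentions the SPM."""
    return [
        e for e in parse_events(text)
        if e.level in ("WARN", "ERROR") and "spm" in e.msg.lower()
    ]
```

To use it, read the engine log into a string (on the engine machine it
usually lives at /var/log/ovirt-engine/engine.log, but check your own
installation) and pass that string to spm_problems. Records that do not
fit the assumed layout are silently skipped, so treat the result as a
filter for reading, not as a complete picture.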
Met vriendelijke groet, With kind regards,
Jorick Astrego
Netbulae Virtualization Experts
----------------
Tel: 053 20 30 270 info at netbulae.eu Staalsteden 4-3A KvK 08198180
Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
----------------