[Users] Data Center Non Responsive / Contending

Liron Aravot laravot at redhat.com
Tue Mar 4 15:37:18 UTC 2014



----- Original Message -----
> From: "Giorgio Bersano" <giorgio.bersano at gmail.com>
> To: "Liron Aravot" <laravot at redhat.com>
> Cc: "Meital Bourvine" <mbourvin at redhat.com>, "users at ovirt.org" <Users at ovirt.org>, fsimonce at redhat.com
> Sent: Tuesday, March 4, 2014 5:31:01 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 16:03 GMT+01:00 Liron Aravot <laravot at redhat.com>:
> > Hi Giorgio,
> > Apparently the issue is caused because there is no connectivity to the
> > export domain and then we fail on spmStart - that's obviously a bug that
> > shouldn't happen.
> 
> Hi Liron,
> we are reaching the same conclusion.
> 
> > can you open a bug for the issue?
> Surely I will
> 
> > in the meanwhile, as it seems to still exist - seems to me like the way for
> > solving it would be either to fix the connectivity issue between vdsm and
> > the storage domain or to downgrade your vdsm version to before this issue
> > was introduced.
> 
> 
> I have some problems with your suggestion(s):
> - I cannot fix the connectivity between vdsm and the storage domain
> because, as I already said, it is exposed by a VM in this very same
> DataCenter and if the DC doesn't come up, the NFS server can't either.
> - I don't understand what it means to downgrade vdsm: to which
> point in time?
> 
> It seems I've put myself - again - in a "chicken or the egg"
> situation, where the SD depends on THIS export domain but
> the export domain isn't available if the DC isn't running.
> 
> This export domain isn't that important to me. I can throw it away
> without any problem.
> 
> What if we edit the DB and remove any instances related to it? Any
> adverse consequences?
> 

Ok, please perform a full db backup before attempting the following:
1. right click on the domain and choose "Destroy"
2. move all hosts to maintenance
3. log into the database and run the following sql command:
update storage_pool set master_domain_version = master_domain_version + 1 where id = '{your id goes here}';
4. activate a host.
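
For reference, here is a minimal sketch of what the update in step 3 does, run against an in-memory SQLite table instead of the real engine database. The table is a stand-in with only the two columns named in the command, and the ids "pool-a" and "pool-b" are made-up placeholders, so treat it as an illustration of the statement form (SET before WHERE, one row changed) and not as something to run against your setup.

```python
import sqlite3

# Stand-in for the engine database: only the two columns the update touches.
# The real storage_pool table has more columns; this schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE storage_pool ("
    "id TEXT PRIMARY KEY, master_domain_version INTEGER NOT NULL)"
)
# Hypothetical pool ids and versions, for illustration only.
conn.executemany(
    "INSERT INTO storage_pool VALUES (?, ?)",
    [("pool-a", 3), ("pool-b", 7)],
)


def bump_master_version(conn, pool_id):
    """Increment master_domain_version for one pool; return rows changed."""
    cur = conn.execute(
        "UPDATE storage_pool "
        "SET master_domain_version = master_domain_version + 1 "
        "WHERE id = ?",
        (pool_id,),
    )
    conn.commit()
    return cur.rowcount


changed = bump_master_version(conn, "pool-a")
versions = dict(conn.execute("SELECT id, master_domain_version FROM storage_pool"))
```

Checking that exactly one row changed is a cheap way to catch a mistyped id before activating a host.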
> 
> 
> >
> > 6a519e95-62ef-445b-9a98-f05c81592c85::WARNING::2014-03-04 13:05:31,489::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "1810e5eb-9eb6-4797-ac50-8023a939f312" not found', '  Skipping volume group 1810e5eb-9eb6-4797-ac50-8023a939f312']
> > 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,499::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 1810e5eb-9eb6-4797-ac50-8023a939f312 not found
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> >     dom = findMethod(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> >     raise se.StorageDomainDoesNotExist(sdUUID)
> > StorageDomainDoesNotExist: Storage domain does not exist:
> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> > 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04
> > 13:05:31,500::sp::329::Storage.StoragePool::(startSpm) Unexpected error
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
> >     self._updateDomainsRole()
> >   File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
> >     return method(self, *args, **kwargs)
> >   File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
> >     domain = sdCache.produce(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
> >     domain.getRealDomain()
> >   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> >     return self._cache._realProduce(self._sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
> >     domain = self._findDomain(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> >     dom = findMethod(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> >     raise se.StorageDomainDoesNotExist(sdUUID)
> >
> >
> >
> >
> > ----- Original Message -----
> >> From: "Giorgio Bersano" <giorgio.bersano at gmail.com>
> >> To: "Meital Bourvine" <mbourvin at redhat.com>
> >> Cc: "users at ovirt.org" <Users at ovirt.org>
> >> Sent: Tuesday, March 4, 2014 4:35:07 PM
> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >>
> >> 2014-03-04 15:23 GMT+01:00 Meital Bourvine <mbourvin at redhat.com>:
> >> > Master data domain must be reachable in order for the DC to be up.
> >> > Export domain shouldn't affect the dc status.
> >> > Are you sure that you've created the export domain as an export domain,
> >> > and
> >> > not as a regular nfs?
> >> >
> >>
> >> Yes, I am.
> >>
> >> Don't know how to extract this info from the DB, but in webadmin, in the
> >> storage list, I have this info:
> >>
> >> Domain Name: nfs02EXPORT
> >> Domain Type: Export
> >> Storage Type: NFS
> >> Format: V1
> >> Cross Data-Center Status: Inactive
> >> Total Space: [N/A]
> >> Free Space: [N/A]
> >>
> >> ATM my only "Data" Domain is based on iSCSI, no NFS.
> >>
> >>
> >>
> >>
> >>
> >> > ----- Original Message -----
> >> >> From: "Giorgio Bersano" <giorgio.bersano at gmail.com>
> >> >> To: "Meital Bourvine" <mbourvin at redhat.com>
> >> >> Cc: "users at ovirt.org" <Users at ovirt.org>
> >> >> Sent: Tuesday, March 4, 2014 4:16:19 PM
> >> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >> >>
> >> >> 2014-03-04 14:48 GMT+01:00 Meital Bourvine <mbourvin at redhat.com>:
> >> >> > StorageDomainDoesNotExist: Storage domain does not exist:
> >> >> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> >> >> >
> >> >> > What's the output of:
> >> >> > lvs
> >> >> > vdsClient -s 0 getStorageDomainsList
> >> >> >
> >> >> > If it exists in the list, please run:
> >> >> > vdsClient -s 0 getStorageDomainInfo
> >> >> > 1810e5eb-9eb6-4797-ac50-8023a939f312
> >> >> >
> >> >>
> >> >> I'm attaching a compressed archive to avoid mangling by googlemail
> >> >> client.
> >> >>
> >> >> Indeed the NFS storage with that id is not in the list of available
> >> >> storage as it is brought up by a VM that has to be run in this very
> >> >> same cluster. Obviously it isn't running at the moment.
> >> >>
> >> >> You find this in the DB:
> >> >>
> >> >> COPY storage_domain_static (id, storage, storage_name,
> >> >> storage_domain_type, storage_type, storage_domain_format_type,
> >> >> _create_date, _update_date, recoverable, last_time_used_as_master,
> >> >> storage_description, storage_comment) FROM stdin;
> >> >> ...
> >> >> 1810e5eb-9eb6-4797-ac50-8023a939f312
> >> >> 11d4972d-f227-49ed-b997-f33cf4b2aa26    nfs02EXPORT     3       1
> >> >>  0       2014-02-28 18:11:23.17092+01    \N      t       0       \N
> >> >>   \N
> >> >> ...
> >> >>
> >> >> Also, disks for that VM are carved from the Master Data Domain that is
> >> >> not available ATM.
> >> >>
> >> >> In other words: I thought that availability of an export domain
> >> >> wasn't critical to switch on a Data Center. Am I wrong?
> >> >>
> >> >> Thanks,
> >> >> Giorgio.
> >> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >>
> 


