[Users] How to rescue storage domain structure

Alex Leonhardt alex.tuxx at gmail.com
Sat Apr 27 09:08:47 UTC 2013


Don't forget to back up the cert(s) too...
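
For reference, here is a rough sketch of what backing up both the database (as suggested in the quoted reply below) and the certs could look like. It only builds the command lines; it does not run them. The database name "engine", the "postgres" role, the PKI directory /etc/pki/ovirt-engine and the backup directory are all assumptions that are not stated in this thread, so check them against your own installation first. The helper name build_backup_commands is made up for this example.

```python
import shlex
from datetime import date


def build_backup_commands(backup_dir, db_name="engine", db_user="postgres",
                          pki_dir="/etc/pki/ovirt-engine", stamp=None):
    """Return the argv lists for a DB dump and a PKI tarball.

    All defaults are assumptions about a typical engine install,
    not values confirmed in this thread.
    """
    stamp = stamp or date.today().strftime("%Y%m%d")
    db_dump = f"{backup_dir}/{db_name}-{stamp}.dump"
    pki_tar = f"{backup_dir}/pki-{stamp}.tar.gz"
    return [
        # Custom-format dump, restorable later with pg_restore.
        ["pg_dump", "-U", db_user, "-F", "c", "-f", db_dump, db_name],
        # Archive the PKI directory relative to / to avoid absolute paths.
        ["tar", "-czf", pki_tar, "-C", "/", pki_dir.lstrip("/")],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running by hand.
    for cmd in build_backup_commands("/root/engine-backup", stamp="20130427"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them is deliberate: on a possibly corrupted engine install it seems safer to review each one first.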

On 04/27/2013 06:23 AM, Itamar Heim wrote:
> On 04/22/2013 08:23 PM, Chris Smith wrote:
>> List,
>>
>> I have lost the ability to manage the hosts or VMs using the ovirt-engine
>> web interface.  The data center is offline, and I can't actually perform
>> any operations with the hosts or VMs.  I don't think that there are any
>> actions I can perform in the web interface at all.
>>
>> What's odd is that I can tell the host to go into maintenance mode
>> using the ovirt-engine web interface and it seems to go into
>> maintenance mode.  It even shows the wrench icon next to the host.  I
>> can also try to activate it after it supposedly goes into maintenance
>> mode, and it states that the host was activated, but the host never
>> actually comes up or contends for SPM status, and the data center
>> never comes online.
>>
>> From the logs it seems that at least PKI is broken between the engine
>> and the hosts, as I see numerous certificate errors on both the
>> ovirt-engine and clients.
>>
>> vdsm.log shows:
>>
>> Traceback (most recent call last):
>>    File "/usr/lib64/python2.7/SocketServer.py", line 582, in process_request_thread
>>      self.finish_request(request, client_address)
>>    File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 66, in finish_request
>>      request.do_handshake()
>>    File "/usr/lib64/python2.7/ssl.py", line 305, in do_handshake
>>      self._sslobj.do_handshake()
>> SSLError: [Errno 1] _ssl.c:504: error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown
>>
>> and engine.log shows:
>>
>> 2013-04-18 18:42:43,632 ERROR [org.ovirt.engine.core.engineencryptutils.EncryptionUtils] (QuartzScheduler_Worker-68) Failed to decryptData must start with zero
>> 2013-04-18 18:42:43,642 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (QuartzScheduler_Worker-68) XML RPC error in command
>>
>>
>> Alon Bar-Lev was able to offer several good pointers in another thread
>> titled "Certificates and PKI seem to be broken after yum update" and
>> eventually concluded that the installation seems to be corrupted more
>> than just the certificates, truststore, and keystore, and suggested
>> that I start a new thread to ask about how to rescue the storage
>> domain structure.
>>
>> The storage used for the data center is iSCSI, which is intact and
>> working.  In fact, 2 of the VMs are still online and running on one of
>> the original FC17 host systems.
>>
>> I'm not able to reinstall any of the existing hosts from the
>> ovirt-engine web interface.  I attempted to reinstall one of the hosts
>> (not the SPM), which failed.
>>
>> I also tried to bring up a new, third host and add it to the cluster.
>> I set up another Fedora 17 box and tried to add it to the
>> cluster, but it states that there are no available servers in the
>> cluster to probe the new host.
>>
>> This is a test environment that I would like to fix, but I'm also
>> willing to just run engine cleanup and start over.
>>
>> That said, there are 3 VMs that I would like to keep.  Two are online
>> and running, and I'm able to see them with virsh on that host.  I was
>> wondering about using virsh to back up these VMs.
>>
>> The third VM exists in the database, and was set to run on the host
>> that I attempted to reinstall, but that VM isn't running.  When I
>> use virsh on its host, virsh can't seem to find it when I perform
>> the list commands, and I can't start it with virsh <vm-name>.
>>
>> What is the best way to proceed?  It seems like it would be easier to
>> export the VMs using virsh from the host that they run on if
>> possible, then update oVirt to the latest version, recreate everything,
>> and then import the VMs back into the new environment.
>>
>> Will this work?  Is there a procedure I can follow to do this?
>>
>> Here's some additional information about the installed ovirt packages
>> on the ovirt-engine
>>
>> [root@reliant yum.repos.d]# yum list installed | grep ovirt
>> ovirt-engine.noarch                       3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-backend.noarch               3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-cli.noarch                   3.2.0.5-1.fc17          @updates
>> ovirt-engine-config.noarch                3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-dbscripts.noarch             3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-genericapi.noarch            3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-notification-service.noarch  3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-restapi.noarch               3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-sdk.noarch                   3.2.0.2-1.fc17          @updates
>> ovirt-engine-setup.noarch                 3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-tools-common.noarch          3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-userportal.noarch            3.1.0-4.fc17            @ovirt-stable
>> ovirt-engine-webadmin-portal.noarch       3.1.0-4.fc17            @ovirt-stable
>> ovirt-image-uploader.noarch               3.1.0-0.git9c42c8.fc17  @ovirt-stable
>> ovirt-iso-uploader.noarch                 3.1.0-0.git1841d9.fc17  @ovirt-stable
>> ovirt-log-collector.noarch                3.1.0-0.git10d719.fc17  @ovirt-stable
>> ovirt-release-fedora.noarch               4-2                     @/ovirt-release-fedora.noarch
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>
> What type of storage domain is this (NFS, iSCSI, etc.)?
> You can also try backing up the DB, reinstalling the engine, restoring
> the DB, and then reinstalling the hosts.
>