
Don't forget to back up the cert(s) too...

On 04/27/2013 06:23 AM, Itamar Heim wrote:
On 04/22/2013 08:23 PM, Chris Smith wrote:
List,
I have lost the ability to manage the hosts or VM's using the ovirt-engine web interface. The data center is offline, and I can't perform any operations on the hosts or VM's. I don't think there are any actions I can perform in the web interface at all.
What's odd is that I can tell a host to go into maintenance mode using the ovirt-engine web interface, and it seems to do so. It even shows the wrench icon next to the host. I can also try to activate it after it supposedly goes into maintenance mode, and the interface states that the host was activated, but the host never actually comes up or contends for SPM status, and the data center never comes online.
From the logs it seems that at least PKI is broken between the engine and the hosts as I see numerous certificate errors on both the ovirt-engine and clients.
vdsm.log shows:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/SocketServer.py", line 582, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 66, in finish_request
    request.do_handshake()
  File "/usr/lib64/python2.7/ssl.py", line 305, in do_handshake
    self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:504: error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown
and engine.log shows:
2013-04-18 18:42:43,632 ERROR [org.ovirt.engine.core.engineencryptutils.EncryptionUtils] (QuartzScheduler_Worker-68) Failed to decryptData must start with zero
2013-04-18 18:42:43,642 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (QuartzScheduler_Worker-68) XML RPC error in command
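The hex code in the vdsm.log SSLError above can be unpacked to see which TLS alert was involved. The following is a minimal sketch, assuming the packed error-code layout used by OpenSSL 1.0.x (8-bit library, 12-bit function, 12-bit reason) and OpenSSL's convention of adding 1000 to a TLS alert number to form the reason code. The function name `decode_openssl_error` is hypothetical and is not part of vdsm or oVirt.

```python
import re

# Assumption: OpenSSL 1.0.x packs error codes as
# (library << 24) | (function << 12) | reason.
# Assumption: reason codes for received TLS alerts are 1000 + alert number.
SSL_AD_REASON_OFFSET = 1000


def decode_openssl_error(line):
    """Split an 'error:XXXXXXXX:lib:func:reason' string into its fields.

    Returns None when the line contains no OpenSSL error string.
    """
    match = re.search(r"error:([0-9A-Fa-f]{8}):([^:]*):([^:]*):(.*)", line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    reason = code & 0xFFF
    alert = None
    if reason > SSL_AD_REASON_OFFSET:
        # The reason encodes a TLS alert number received from the peer.
        alert = reason - SSL_AD_REASON_OFFSET
    return {
        "library": (code >> 24) & 0xFF,
        "function": (code >> 12) & 0xFFF,
        "reason": reason,
        "alert": alert,
        "library_name": match.group(2),
        "function_name": match.group(3),
        "text": match.group(4).strip(),
    }


if __name__ == "__main__":
    log_line = ("SSLError: [Errno 1] _ssl.c:504: error:14094416:"
                "SSL routines:SSL3_READ_BYTES:sslv3 alert certificate unknown")
    print(decode_openssl_error(log_line))
```

For the line quoted above, the reason field is 1046, which corresponds to TLS alert 46 (certificate_unknown). Under these assumptions, this suggests that the side connecting to vdsm rejected the certificate the host presented, which would be consistent with the PKI problem described in this message.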
Alon Bar-Lev was able to offer several good pointers in another thread titled "Certificates and PKI seem to be broken after yum update" and eventually concluded that the installation seems to be corrupted more than just the certificates, truststore, and keystore, and suggested that I start a new thread to ask about how to rescue the storage domain structure.
The storage used for the data center is iSCSI, which is intact and working. In fact, 2 of the VM's are still online and running on one of the original FC17 host systems.
I'm not able to reinstall any of the existing hosts from the ovirt-engine web interface. I attempted to reinstall one of the hosts (not the SPM) which failed.
I also tried to bring up a new, third host and add it to the cluster. I set up another Fedora 17 box and tried to add it to the cluster, but the engine states that there are no available servers in the cluster to probe the new host.
This is a test environment that I would like to fix, but I'm also willing to just run engine cleanup and start over.
That said, there are 3 VM's that I would like to keep. Two are online and running, and I'm able to see them with virsh on that host. I was wondering about using virsh to back up these VM's.
The third VM exists in the database and was set to run on the host that I attempted to reinstall, but that VM isn't running. When I use virsh on its host, virsh can't seem to find it when I perform the list commands, and I can't start it with virsh <vm-name>
What is the best way to proceed? It seems like it would be easier to export the VM's using virsh from the host that they run on if possible, then update ovirt to the latest version, recreate everything, and then import the VM's back into the new environment.
Will this work? Is there a procedure I can follow to do this?
Here's some additional information about the installed ovirt packages on the ovirt-engine:
[root@reliant yum.repos.d]# yum list installed | grep ovirt
ovirt-engine.noarch                        3.1.0-4.fc17            @ovirt-stable
ovirt-engine-backend.noarch                3.1.0-4.fc17            @ovirt-stable
ovirt-engine-cli.noarch                    3.2.0.5-1.fc17          @updates
ovirt-engine-config.noarch                 3.1.0-4.fc17            @ovirt-stable
ovirt-engine-dbscripts.noarch              3.1.0-4.fc17            @ovirt-stable
ovirt-engine-genericapi.noarch             3.1.0-4.fc17            @ovirt-stable
ovirt-engine-notification-service.noarch   3.1.0-4.fc17            @ovirt-stable
ovirt-engine-restapi.noarch                3.1.0-4.fc17            @ovirt-stable
ovirt-engine-sdk.noarch                    3.2.0.2-1.fc17          @updates
ovirt-engine-setup.noarch                  3.1.0-4.fc17            @ovirt-stable
ovirt-engine-tools-common.noarch           3.1.0-4.fc17            @ovirt-stable
ovirt-engine-userportal.noarch             3.1.0-4.fc17            @ovirt-stable
ovirt-engine-webadmin-portal.noarch        3.1.0-4.fc17            @ovirt-stable
ovirt-image-uploader.noarch                3.1.0-0.git9c42c8.fc17  @ovirt-stable
ovirt-iso-uploader.noarch                  3.1.0-0.git1841d9.fc17  @ovirt-stable
ovirt-log-collector.noarch                 3.1.0-0.git10d719.fc17  @ovirt-stable
ovirt-release-fedora.noarch                4-2                     @/ovirt-release-fedora.noarch
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
What type of storage domain is this (NFS, iSCSI, etc.)?
You can also try backing up the db, re-installing the engine, restoring the db, and then re-installing the hosts.
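The backup step suggested here, together with the reminder to back up the certificates, could be sketched roughly as follows. This only builds the command lines and does not run anything. The database name `engine`, the `postgres` user, and the PKI directory `/etc/pki/ovirt-engine` are assumptions that are not stated in this thread and should be checked against the actual installation; the function name `build_backup_commands` is hypothetical.

```python
import os


def build_backup_commands(backup_dir, stamp,
                          db_name="engine",                  # assumption: engine database name
                          db_user="postgres",                # assumption: database superuser
                          pki_dir="/etc/pki/ovirt-engine"):  # assumption: certificate location
    """Return the command lines for a database dump and a certificate archive.

    Nothing is executed here; the caller decides whether to run them.
    """
    db_dump = os.path.join(backup_dir, "engine-db-%s.dump" % stamp)
    pki_tar = os.path.join(backup_dir, "engine-pki-%s.tar.gz" % stamp)
    return [
        # pg_dump in custom format (-F c), written to a file (-f).
        ["pg_dump", "-U", db_user, "-F", "c", "-f", db_dump, db_name],
        # Archive the certificate directory alongside the database dump.
        ["tar", "-czf", pki_tar, pki_dir],
    ]


if __name__ == "__main__":
    for command in build_backup_commands("/root/engine-backup", "20130427"):
        print(" ".join(command))
```

After reviewing the printed commands, each list could be passed to `subprocess.check_call` on the engine machine. Keeping the dump and the certificate archive under the same timestamp makes it easier to restore a matching pair later.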