[ovirt-users] hosted engine vm.conf missing

Simone Tiraboschi stirabos at redhat.com
Tue Dec 1 12:51:23 EST 2015


On Tue, Dec 1, 2015 at 4:13 PM, Thomas Scofield <tscofield at gmail.com> wrote:

> There was an error message on the setup, but everything continued on and
> the hosted engine was available so I didn't think much of it.
>
I opened a bug about that:
https://bugzilla.redhat.com/show_bug.cgi?id=1287159


> If there is some way of recovering this setup I would like to give that a
> try, if you can provide some instructions that would be great, thanks.
>

OK, I recreated the archive from your log file; I'm attaching it.
Please extract it into an empty temporary directory and then delete the
archive file, so that it is not picked up by the tar command below.
Check that everything looks OK to you and then, from inside that directory:

sdUUID_line=$(grep sdUUID /etc/ovirt-hosted-engine/hosted-engine.conf)
sdUUID=${sdUUID_line:7:36}
conf_volume_UUID_line=$(grep conf_volume_UUID /etc/ovirt-hosted-engine/hosted-engine.conf)
conf_volume_UUID=${conf_volume_UUID_line:17:36}
conf_image_UUID_line=$(grep conf_image_UUID /etc/ovirt-hosted-engine/hosted-engine.conf)
conf_image_UUID=${conf_image_UUID_line:16:36}
tar -cO * | dd of=/rhev/data-center/mnt/blockSD/$sdUUID/images/$conf_image_UUID/$conf_volume_UUID oflag=direct
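
The commands above slice the grep output at fixed character offsets. A
minimal sketch of the same lookup done by key, assuming hosted-engine.conf
holds plain KEY=VALUE lines (as the offsets imply), might look like this.
conf_value is a hypothetical helper, not part of oVirt, and the final step
only reads the volume back to list what is stored on it:

```shell
# Hypothetical helper (not part of oVirt): print the value of KEY from a
# KEY=VALUE file instead of slicing grep output at fixed offsets.
conf_value() {
    local key=$1 file=$2
    # Keep only lines starting with "KEY=", strip the key, take the first hit.
    sed -n "s/^${key}=//p" "$file" | head -n 1
}

conf=/etc/ovirt-hosted-engine/hosted-engine.conf
if [ -r "$conf" ]; then
    sdUUID=$(conf_value sdUUID "$conf")
    conf_image_UUID=$(conf_value conf_image_UUID "$conf")
    conf_volume_UUID=$(conf_value conf_volume_UUID "$conf")
    # Same block-storage path as in the dd command above.
    volume=/rhev/data-center/mnt/blockSD/$sdUUID/images/$conf_image_UUID/$conf_volume_UUID
    echo "configuration volume: $volume"
    # Read-only check: list the archive currently stored on the volume.
    if [ -e "$volume" ]; then
        dd if="$volume" 2>/dev/null | tar -tvf -
    fi
fi
```

If the volume holds no archive, tar should report "This does not look like
a tar archive", as seen earlier in this thread; after a successful write it
should list the restored files instead.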

Let me know.


> On Dec 1, 2015 9:36 AM, "Simone Tiraboschi" <stirabos at redhat.com> wrote:
>
>>
>>
>> On Tue, Dec 1, 2015 at 3:24 PM, Thomas Scofield <tscofield at gmail.com>
>> wrote:
>>
>>> I did import the lun into the engine.  I had the hosted engine running
>>> for a few days and I was able to restart it a number of times.  It wasn't
>>> until I rebooted the physical box that it was running on that I encountered
>>> the problem.
>>>
>> hosted-engine-setup probably failed to add the LUN to the engine because it
>> was already there, or something similar; we need the engine logs to
>> understand it better, but the result is that it didn't finish creating the
>> storage volume.
>> It's probably not worth fixing, because we are going to remove that step once
>> we have completed the procedure to auto-import the hosted-engine storage domain.
>> Didn't you notice that it failed? Wasn't the output clear enough?
>>
>> Then the setup generates a temp file under
>> /var/run/ovirt-hosted-engine-ha/vm.conf to use for the first run, and it
>> copies it to the shared storage for other hosts and further runs.
>> But it failed, because of the issue above, before it could copy the file to
>> the shared storage.
>>
>> Your local  /var/run/ovirt-hosted-engine-ha/vm.conf  simply survived till
>> you rebooted the host and so you were able to reboot the engine VM. As
>> expected it disappeared once you rebooted.
>> The agent should probably validate the conf volume more strictly and fail
>> more clearly in this case, so you'd notice it earlier.
>> I'll open a bug against that.
>>
>
Done:
https://bugzilla.redhat.com/show_bug.cgi?id=1287195



>
>> If you want to fix it manually in order to recover your setup, I can give
>> you the instructions to manually create the configuration volume on the
>> shared storage.
>> If it was just an evaluation, I'd suggest redeploying it from scratch.
>>
>>
>>
>>> On Dec 1, 2015 9:14 AM, "Simone Tiraboschi" <stirabos at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Dec 1, 2015 at 2:07 PM, Thomas Scofield <tscofield at gmail.com>
>>>> wrote:
>>>>
>>>>> Are there different instructions for an iscsi domain?  I was able to
>>>>> find the proper path at
>>>>> /rhev/data-center/mnt/blockSD/$sdUUID/images/$conf_image_UUID/$conf_volume_UUID
>>>>> but the tar command failed
>>>>>
>>>>> tar: This does not look like a tar archive
>>>>> tar: Exiting with failure status due to previous errors
>>>>>
>>>>> This was a relatively recent fresh install, the setup log is attached.
>>>>>
>>>>
>>>> The conf volume is not there because ovirt-hosted-engine-setup failed
>>>> before creating it.
>>>>
>>>> Indeed we have:
>>>> 2015-11-22 16:30:27 DEBUG
>>>> otopi.plugins.ovirt_hosted_engine_setup.engine.add_disk
>>>> add_disk._closeup:153 Connecting to the Engine
>>>> 2015-11-22 16:30:27 DEBUG otopi.context context._executeMethod:156
>>>> method exception
>>>> Traceback (most recent call last):
>>>>   File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146,
>>>> in _executeMethod
>>>>     method['method']()
>>>>   File
>>>> "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/engine/add_disk.py",
>>>> line 182, in _closeup
>>>>     lun_list = check_lun_storage.get_logical_unit()
>>>> AttributeError: 'NoneType' object has no attribute 'get_logical_unit'
>>>> 2015-11-22 16:30:27 ERROR otopi.context context._executeMethod:165
>>>> Failed to execute stage 'Closing up': 'NoneType' object has no attribute
>>>> 'get_logical_unit'
>>>>
>>>> and:
>>>> 2015-11-22 16:30:28 DEBUG otopi.context context._executeMethod:142
>>>> Stage terminate METHOD otopi.plugins.otopi.dialog.machine.Plugin._terminate
>>>> 2015-11-22 16:30:28 DEBUG otopi.context context._executeMethod:148
>>>> condition False
>>>> 2015-11-22 16:30:28 DEBUG otopi.context context._executeMethod:142
>>>> Stage terminate METHOD otopi.plugins.otopi.core.log.Plugin._terminate
>>>>
>>>> You probably hit a different issue.
>>>> Did you take any action, like importing the hosted-engine
>>>> storage domain LUN or something similar on the engine, before continuing
>>>> with engine setup?
>>>>
>>>>
>>>>
>>>>> On Tue, Dec 1, 2015 at 5:54 AM, Simone Tiraboschi <stirabos at redhat.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Dec 1, 2015 at 12:55 AM, Thomas Scofield <tscofield at gmail.com> wrote:
>>>>>>
>>>>>>> That file is missing
>>>>>>>
>>>>>>> [root at ovirt01 vdsm]# ls -l /var/run/ovirt-hosted-engine-ha/vm.conf
>>>>>>> ls: cannot access /var/run/ovirt-hosted-engine-ha/vm.conf: No such
>>>>>>> file or directory
>>>>>>> [root at ovirt01 vdsm]#
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Was it a fresh install or an upgrade from 3.5?
>>>>>> If you still have it, can you please attach the setup logs?
>>>>>>
>>>>>> Can you please try to manually check vm.conf on the shared storage?
>>>>>> This should do the job, extracting the files into a local directory
>>>>>> (please substitute '192.168.1.115:_Virtual_ext35u36' with the mount
>>>>>> point of the hosted-engine storage domain on your system).
>>>>>>
>>>>>> mntpoint=/rhev/data-center/mnt/192.168.1.115:_Virtual_ext35u36
>>>>>> dir=`mktemp -d` && cd $dir
>>>>>> sdUUID_line=$(grep sdUUID /etc/ovirt-hosted-engine/hosted-engine.conf)
>>>>>> sdUUID=${sdUUID_line:7:36}
>>>>>> conf_volume_UUID_line=$(grep conf_volume_UUID /etc/ovirt-hosted-engine/hosted-engine.conf)
>>>>>> conf_volume_UUID=${conf_volume_UUID_line:17:36}
>>>>>> conf_image_UUID_line=$(grep conf_image_UUID /etc/ovirt-hosted-engine/hosted-engine.conf)
>>>>>> conf_image_UUID=${conf_image_UUID_line:16:36}
>>>>>> dd if=$mntpoint/$sdUUID/images/$conf_image_UUID/$conf_volume_UUID 2>/dev/null | tar -xvf -
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 30, 2015 at 4:04 AM, Simone Tiraboschi <
>>>>>>> stirabos at redhat.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Nov 27, 2015 at 9:48 PM, Thomas Scofield <
>>>>>>>> tscofield at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I have recently set up an oVirt hosted engine on iscsi storage and
>>>>>>>>> after a reboot of the system I am unable to start the hosted engine.  The
>>>>>>>>> agent.log gives errors indicating there is a missing value in the vm.conf
>>>>>>>>> file, but the vm.conf file does not appear in the location indicated.
>>>>>>>>> There is no error indicated when the agent attempts to reload the vm.conf.
>>>>>>>>> Any ideas on how to get the hosted engine up and running?
>>>>>>>>>
>>>>>>>>> MainThread::INFO::2015-11-26
>>>>>>>>> 21:31:13,071::hosted_engine::699::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
>>>>>>>>> Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file:
>>>>>>>>> /var/run/vdsm/storage/ddf4a26b-61ff-49a4-81db-9f82da35e44b/6ed6d868-aaf3-4b3f-bdf0-a4ad262709ae/1fe5b7fc-eae7-4f07-a2fe-5a082e14c876)
>>>>>>>>> MainThread::INFO::2015-11-26
>>>>>>>>> 21:31:13,072::upgrade::836::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade)
>>>>>>>>> Host configuration is already up-to-date
>>>>>>>>> MainThread::INFO::2015-11-26
>>>>>>>>> 21:31:13,072::hosted_engine::422::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>>>>> Reloading vm.conf from the shared storage domain
>>>>>>>>> MainThread::ERROR::2015-11-26
>>>>>>>>> 21:31:13,100::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>>>>>>> Error: ''Configuration value not found: file=/var/run/ovirt-hosted-engine-ha/vm.conf,
>>>>>>>>> key=memSize'' - trying to restart agent
>>>>>>>>> MainThread::WARNING::2015-11-26
>>>>>>>>> 21:31:18,105::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>>>>>>> Restarting agent, attempt '9'
>>>>>>>>> MainThread::ERROR::2015-11-26
>>>>>>>>> 21:31:18,106::agent::210::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>>>>>>> Too many errors occurred, giving up. Please review the log and consider
>>>>>>>>> filing a bug.
>>>>>>>>> MainThread::INFO::2015-11-26
>>>>>>>>> 21:31:18,106::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
>>>>>>>>> Agent shutting down
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Thomas,
>>>>>>>> could you please attach also your /var/run/ovirt-hosted-engine-ha/vm.conf
>>>>>>>> ?
>>>>>>>>
>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scofield_restore.tar
Type: application/x-tar
Size: 20480 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20151201/e2ed7e0d/attachment-0001.tar>

