After letting this sit for a few days, does anyone have any ideas as to how
to deal with my situation? Would anyone like me to send the SOS report
directly to them? It's a 9MB file.
If nothing comes up, I'm going to try and sift through the SOS report
tonight, but I won't know what I'm trying to find.
Thank you for any and all help.
On Thu, Jun 1, 2017 at 1:15 AM, Sandro Bonazzola <sbonazzo(a)redhat.com>
wrote:
On Thu, Jun 1, 2017 at 6:36 AM, Brendan Hartzell <mrrex4(a)gmail.com> wrote:
> Ran the 4 commands listed above, no errors on the screen.
>
> Started the hosted-engine standard setup from the web-UI.
>
> Using iSCSI for the storage.
>
> Using mostly default options, I got these errors in the web-UI.
>
> Error creating Volume Group: Failed to initialize physical device:
> ("[u'/dev/mapper/36589cfc000000de7482638fcfcebbbb4']",)
> Failed to execute stage 'Misc configuration': Failed to initialize
> physical device: ("[u'/dev/mapper/36589cfc000000de7482638fcfcebbbb4']",)
> Hosted Engine deployment failed: this system is not reliable, please
> check the issue,fix and redeploy
>
> I rebuilt my iSCSI (I don't think I cleaned it up from a previous
> install).
> Re-ran the above 4 commands.
> Restarted hosted engine standard setup from web-UI.
> Install moved past "Connecting Storage Pool" so I believe the above was
> my fault.
>
> These are the last messages displayed on the web-UI.
> Creating Storage Pool
> Connecting Storage Pool
> Verifying sanlock lockspace initialization
> Creating Image for 'hosted-engine.lockspace' ...
> Image for 'hosted-engine.lockspace' created successfully
> Creating Image for 'hosted-engine.metadata' ...
> Image for 'hosted-engine.metadata' created successfully
> Creating VM Image
> Extracting disk image from OVF archive (could take a few minutes
> depending on archive size)
> Validating pre-allocated volume size
> Uploading volume to data domain (could take a few minutes depending on
> archive size)
>
> At the host terminal, I got the error "watchdog watchdog0: watchdog did
> not stop!"
> Then the host restarted.
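
For anyone trying to narrow down the reboot: a minimal sketch, assuming you can save the log of the boot that ended in the restart to a text file. It only filters for lines that mention the watchdog, wdmd, or sanlock. The helper name and the keyword list are my own assumptions, not an official oVirt procedure.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines of a saved log that mention
# the watchdog, wdmd, or sanlock, so the minutes before the reboot are
# easier to read. The keyword list is an assumption, not an official filter.
watchdog_events() {
    # $1: path to a plain-text log (for example a journal export)
    grep -Ei 'watchdog|wdmd|sanlock' "$1"
}

# Example usage on the host (the previous boot is only available if the
# journal is persistent):
#   journalctl -b -1 --no-pager > /tmp/previous-boot.log
#   watchdog_events /tmp/previous-boot.log
```

The same filter can be pointed at any plain-text log extracted from the SOS report.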
>
Simone, can you help here?
>
> This is as far as I've gotten in previous attempts.
>
> Attaching the hosted-engine-setup log.
>
> The SOS report is 9MB and the ovirt users group will drop the email.
>
> On Wed, May 31, 2017 at 6:59 AM, Sandro Bonazzola <sbonazzo(a)redhat.com>
> wrote:
>
>>
>>
>> On Wed, May 31, 2017 at 3:10 PM, Brendan Hartzell <mrrex4(a)gmail.com>
>> wrote:
>>
>>> Now that you have identified the problem, should I run the following
>>> commands and send you another SOS?
>>>
>>> ovirt-hosted-engine-cleanup
>>> vdsm-tool configure --force
>>> systemctl restart libvirtd
>>> systemctl restart vdsmd
>>>
>>> Or is there a different plan in mind?
>>>
>>
>> I would have expected someone from the virt team to follow up for
>> further investigation :-)
>> The above commands should work.
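
A small sketch of running those four steps in order and stopping at the first failure. The DRY_RUN switch and the function names are hypothetical additions for illustration, and I am assuming the VDSM unit is named vdsmd, as in the first message of this thread.

```shell
#!/bin/sh
# Sketch: run the reset steps from this thread in order and stop at the
# first failure. DRY_RUN is a hypothetical switch added for illustration;
# it defaults to 1 so nothing is executed unless you set DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        echo "running: $*"
        "$@"
    fi
}

reset_host() {
    # Assumption: the VDSM systemd unit is vdsmd.
    run_step ovirt-hosted-engine-cleanup &&
    run_step vdsm-tool configure --force &&
    run_step systemctl restart libvirtd &&
    run_step systemctl restart vdsmd
}

# Usage (as root on the host):
#   DRY_RUN=0 reset_host
```

With the default DRY_RUN=1, reset_host only prints the commands, which is a cheap way to check the sequence before running it.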
>>
>>
>>
>>>
>>> Thank you,
>>>
>>> Brendan
>>>
>>> On Tue, May 30, 2017 at 11:42 PM, Sandro Bonazzola
>>> <sbonazzo(a)redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Wed, May 31, 2017 at 4:45 AM, Brendan Hartzell <mrrex4(a)gmail.com>
>>>> wrote:
>>>>
>>>>> Can you please elaborate about the failure you see here and how are
>>>>> you trying to manually partition the host?
>>>>>
>>>>> Sure, I will start from the beginning.
>>>>> - Using: ovirt-node-ng-installer-ovirt-4.1-2017052604.iso
>>>>> - During installation I set up one of the two interfaces and check the
>>>>> box to automatically use the connection.
>>>>> - I'm currently providing a host name of node-1.test.net until I
>>>>> have a successful process.
>>>>> - I configure date and time for my timezone and to use an internal
>>>>> NTP server.
>>>>> - On Installation Destination, I pick my 128GB USB3.0 SanDisk flash
>>>>> drive, check the box that I would like to make additional space, and
>>>>> click done. In the reclaim disk space window, I click delete all, and
>>>>> then reclaim space. I go back into the Installation Destination, select
>>>>> that I will configure partitioning, and click done. The Manual
>>>>> Partitioning window opens, and I use the option to automatically create
>>>>> mount points.
>>>>>
>>>>
>>>> In this screen, please change partitioning scheme from LVM to LVM Thin
>>>> Provisioning: it should solve your following error.
>>>>
>>>>
>>>>
>>>>
>>>>> At this point, /boot is 1024MB, /var is 15GB, / is 88.11 GB, and
>>>>> swap is 11.57GB. I then change / to 23.11 GB, update settings,
>>>>> change /var to 80GB, update settings again, and click done. I accept
>>>>> the changes and begin installation.
>>>>>
>>>>> I tried these changes based on this article:
>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Deploying_Self-Hosted_Engine/
>>>>>
>>>>> The article does say that you can specify a different directory than
>>>>> /var/tmp, but I don't recall seeing that option.
>>>>>
>>>>
>>>> If the setup detects that /var/tmp does not have enough space to
>>>> extract the appliance, it will ask for a different directory.
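
To see in advance whether that prompt is likely, the free space on /var/tmp can be checked before starting the setup. A minimal sketch using POSIX df; the function name is made up, and the 5 GiB threshold in the usage comment is a placeholder, not the figure hosted-engine setup actually checks for.

```shell
#!/bin/sh
# has_free_kb DIR KB: succeed if DIR's filesystem has at least KB
# kibibytes available. Uses POSIX "df -Pk", whose 4th column is the
# available space in 1024-byte blocks.
has_free_kb() {
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$2" ]
}

# Example: the 5 GiB threshold below is a placeholder, not the value
# hosted-engine setup really uses.
#   if has_free_kb /var/tmp $((5 * 1024 * 1024)); then
#       echo "/var/tmp looks large enough"
#   else
#       echo "/var/tmp is small; expect setup to ask for another directory"
#   fi
```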
>>>>
>>>>
>>>>
>>>>>
>>>>> After some time, I get the following error:
>>>>> There was an error running the kickstart script at line 7. This is a
>>>>> fatal error and installation will be aborted. The details of this
>>>>> error are:
>>>>>
>>>>> [INFO] Trying to create a manageable base from '/'
>>>>> [ERROR] LVM Thin Provisioning partitioning scheme is required. For
>>>>> autoinstall via Kickstart with LVM Thin Provisioning check options
>>>>> --thinpool and --grow. Please consult documentation for details.
>>>>>
>>>>
>>>>
>>>> ^^ this one should be solved by the LVM Thin Provisioning scheme
>>>> mentioned above..
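
On a system that did install, one way to confirm the layout is thin-provisioned is to look at the segment type that lvs reports. A sketch, assuming the output shape of `lvs --noheadings -o lv_name,segtype`; the helper name is hypothetical.

```shell
#!/bin/sh
# has_thin_pool: read "lv_name segtype" pairs on stdin (the shape of
# `lvs --noheadings -o lv_name,segtype`) and succeed if any logical
# volume has the segment type "thin-pool".
has_thin_pool() {
    awk '$2 == "thin-pool" { found = 1 } END { exit !found }'
}

# Usage on an installed or rescue system (as root):
#   if lvs --noheadings -o lv_name,segtype | has_thin_pool; then
#       echo "thin pool present"
#   else
#       echo "no thin pool found"
#   fi
```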
>>>>
>>>>
>>>>
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>>>>>     "__main__", fname, loader, pkg_name)
>>>>>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>>>>>     exec code in run_globals
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/__main__.py", line 51, in <module>
>>>>>     CliApplication()
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/__init__.py", line 82, in CliApplication()
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
>>>>>     cb(self.context, *args)
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/plugins/core.py", line 169, in post_argparse
>>>>>     layout.initialize(args.source, args.init_nvr)
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/plugins/core.py", line 216, in initialize
>>>>>     self.app.imgbase.init_layout_from(source, init_nvr)
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 271, in init_layout_from
>>>>>     self.init_tags_on(existing_lv)
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 243, in init_tags_on
>>>>>     pool = lv.thinpool()
>>>>>   File "/usr/lib/python2.7/site-packages/imgbased/lvm.py", line 250, in thinpool
>>>>>     raise MissingLvmThinPool()
>>>>> imgbased.lvm.MissingLvmThinPool
>>>>>
>>>>> At this point, the only option is to exit the installer.
>>>>>
>>>>> ****************************
>>>>>
>>>>> Since this is a new install, please use 4.1. oVirt 4.0 is not
>>>>> supported anymore.
>>>>>
>>>>> Not a problem.
>>>>>
>>>>> ****************************
>>>>>
>>>>> Can you please provide hosted engine setup logs or better a full sos
>>>>> report? (sosreport -a)
>>>>>
>>>>> Again, the process I'm following:
>>>>> - Using: ovirt-node-ng-installer-ovirt-4.1-2017052604.iso
>>>>> - During installation I set up one of the two interfaces and check the
>>>>> box to automatically use the connection.
>>>>> - I'm currently providing a host name of node-1.test.net until I
>>>>> have a successful process.
>>>>> - I configure date and time for my timezone and to use an internal
>>>>> NTP server.
>>>>> - On Installation Destination, I pick my 128GB USB3.0 SanDisk flash
>>>>> drive, check the box that I would like to make additional space, and
>>>>> click done. In the reclaim disk space window, I click delete all, and
>>>>> then reclaim space.
>>>>> - Begin Installation and set a root password.
>>>>> - Perform a yum update - no packages marked for update (as expected)
>>>>> - Use vi to update /etc/hosts with a reference for node-1.test.net
>>>>> and engine.test.net
>>>>> - First attempt at hosted-engine from web-UI
>>>>> - Setup downloads and installs
>>>>> ovirt-engine-appliance-4.1-20170523.1.el7.centos.noarch.rpm
>>>>> *Failed to execute stage 'Environment setup': Failed to reconfigure
>>>>> libvirt for VDSM
>>>>> *Hosted Engine deployment failed
>>>>> - Attached SOS report
>>>>> The checksum is: aa56097edc0b63c49caaf1a1fde021bc
>>>>>
>>>>> At this point, I would run ovirt-hosted-engine-cleanup and I would
>>>>> get further along in the install process. However, because this is a
>>>>> fresh install, I'm going to leave things here for now so you can
>>>>> review the SOS.
>>>>>
>>>>
>>>> Thanks for the SOS report!
>>>> Hosted Engine setup fails on:
>>>>
>>>> 2017-05-30 19:24:39 DEBUG otopi.plugins.gr_he_setup.system.vdsmenv
>>>> plugin.execute:921 execute-output: ('/bin/vdsm-tool', 'configure',
>>>> '--force') stdout:
>>>>
>>>> Checking configuration status...
>>>>
>>>> Current revision of multipath.conf detected, preserving
>>>> lvm is configured for vdsm
>>>> libvirt is already configured for vdsm
>>>> SUCCESS: ssl configured to true. No conflicts
>>>>
>>>> Running configure...
>>>> Reconfiguration of libvirt is done.
>>>>
>>>> 2017-05-30 19:24:39 DEBUG otopi.plugins.gr_he_setup.system.vdsmenv
>>>> plugin.execute:926 execute-output: ('/bin/vdsm-tool', 'configure',
>>>> '--force') stderr:
>>>> Error: ServiceOperationError: _systemctlStart failed
>>>> Job for libvirtd.service failed because the control process exited
>>>> with error code. See "systemctl status libvirtd.service" and
>>>> "journalctl -xe" for details.
>>>>
>>>> At the same time journalctl shows:
>>>>
>>>> May 30 19:24:39 node-1.test.net libvirtd[20954]: libvirt version:
>>>> 2.0.0, package: 10.el7_3.5 (CentOS BuildSystem <http://bugs.centos.org>,
>>>> 2017-03-03-02:09:45, c1bm.rdu2.centos.org)
>>>> May 30 19:24:39 node-1.test.net libvirtd[20954]: hostname:
>>>> node-1.test.net
>>>> May 30 19:24:39 node-1.test.net libvirtd[20954]: The server
>>>> certificate /etc/pki/vdsm/certs/vdsmcert.pem is not yet active
>>>> May 30 19:24:39 node-1.test.net systemd[1]: libvirtd.service: main
>>>> process exited, code=exited, status=6/NOTCONFIGURED
>>>> May 30 19:24:39 node-1.test.net systemd[1]: Failed to start
>>>> Virtualization daemon.
>>>> May 30 19:24:39 node-1.test.net systemd[1]: Unit libvirtd.service
>>>> entered failed state.
>>>> May 30 19:24:39 node-1.test.net systemd[1]: libvirtd.service failed.
>>>> May 30 19:24:39 node-1.test.net systemd[1]: libvirtd.service holdoff
>>>> time over, scheduling restart.
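
The "is not yet active" line means the certificate's start date is later than the time libvirtd sees on the host, so comparing the two is a reasonable first check. A minimal sketch, assuming GNU date (for -d) and the notBefore=... line that openssl x509 -startdate prints; the function name is made up, and this only detects the condition, it does not say why the clock and the certificate disagree.

```shell
#!/bin/sh
# cert_not_yet_active NOT_BEFORE NOW_EPOCH
# Succeeds when the certificate's start date is later than NOW_EPOCH,
# which is the condition libvirtd reports as "not yet active".
# Relies on GNU date's -d option to parse the openssl date string.
cert_not_yet_active() {
    start=$(date -u -d "$1" +%s) || return 2
    [ "$start" -gt "$2" ]
}

# Usage on the host:
#   nb=$(openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -noout -startdate)
#   nb=${nb#notBefore=}
#   if cert_not_yet_active "$nb" "$(date +%s)"; then
#       echo "certificate starts in the future: check the host clock and timezone"
#   fi
```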
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> ****************************
>>>>>
>>>>> I'd like to understand the issues you faced before suggesting to
>>>>> restart from scratch.
>>>>>
>>>>> Too late... I did two re-installs to get a more accurate account of
>>>>> my install process for above.
>>>>>
>>>>> ****************************
>>>>>
>>>>> Thank you for your help!
>>>>>
>>>>> Brendan
>>>>>
>>>>> On Tue, May 30, 2017 at 12:17 AM, Sandro Bonazzola
>>>>> <sbonazzo(a)redhat.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, May 30, 2017 at 6:49 AM, Brendan Hartzell <mrrex4(a)gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> oVirt users list,
>>>>>>>
>>>>>>> Long story short, I've been spending weeks on this project for my
>>>>>>> home lab with no success.
>>>>>>>
>>>>>>> I would like to successfully install two nodes that host a highly
>>>>>>> available engine with an iSCSI storage back-end.
>>>>>>>
>>>>>>> I have read through most, if not all, of the guides on ovirt.org
>>>>>>> with no substantial help.
>>>>>>>
>>>>>>> Successfully, I have done the following:
>>>>>>> Install oVirt Engine on a bare metal system, added a node, and
>>>>>>> started exploring - not desired.
>>>>>>> Install oVirt Node 4.0.6 on a bare metal system - fails if
>>>>>>> partitions are not done automatically.
>>>>>>> Install oVirt Node 4.1.2 on a bare metal system - fails if
>>>>>>> partitions are not done automatically.
>>>>>>>
>>>>>>
>>>>>> Can you please elaborate about the failure you see here and how are
>>>>>> you trying to manually partition the host?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> My process after installing a Node:
>>>>>>> Run a yum update - just to be sure, but I am using latest iso
>>>>>>> images from downloads section.
>>>>>>> Edit /etc/hosts for local name resolution - the goal is to host DNS
>>>>>>> as a virtual machine, eventually.
>>>>>>> On 4.1 if I install ovirt-engine-appliance from yum, it does
>>>>>>> simplify one step in the hosted engine setup. If I do this on 4.0 it
>>>>>>> discards the image and uses the default.
>>>>>>>
>>>>>>
>>>>>> Since this is a new install, please use 4.1. oVirt 4.0 is not
>>>>>> supported anymore.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 4.1 the hosted engine setup fails immediately unless I run the
>>>>>>> hosted engine cleanup from the shell.
>>>>>>>
>>>>>>
>>>>>> Can you please provide hosted engine setup logs or better a full sos
>>>>>> report? (sosreport -a)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> If I do this, I can typically get to the point of installing.
>>>>>>>
>>>>>>> When I do get to the installation phase, I get to a point just
>>>>>>> after extracting the OVA that I get a message on the shell saying
>>>>>>> something about the watchdog running the whole time and then the node
>>>>>>> reboots.
>>>>>>>
>>>>>>> I found one email thread that sounded like my issue and suggested
>>>>>>> the following commands:
>>>>>>> vdsm-tool configure --force
>>>>>>> systemctl restart libvirtd
>>>>>>> systemctl restart vdsmd
>>>>>>>
>>>>>>> Unfortunately, these commands did not help my situation like the
>>>>>>> other individual.
>>>>>>>
>>>>>>> What log file would everyone like to see first? Given that I still
>>>>>>> consider myself relatively new to Linux, please identify the path for
>>>>>>> the log file requested.
>>>>>>>
>>>>>>
>>>>>> See above
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Also, because I plan on performing a clean install for this thread
>>>>>>> using my process from above (I'm not expecting my outcome to be any
>>>>>>> different), are there any tips and tricks that might result in a
>>>>>>> success?
>>>>>>>
>>>>>>
>>>>>> I'd like to understand the issues you faced before suggesting to
>>>>>> restart from scratch.
>>>>>> Adding some people who may help as well.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thank you for any and all help,
>>>>>>> Brendan
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list
>>>>>>> Users(a)ovirt.org
>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> SANDRO BONAZZOLA
>>>>>>
>>>>>> ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
>>>>>>
>>>>>> Red Hat EMEA <https://www.redhat.com/>
>>>>>> <https://red.ht/sig>
>>>>>> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>