Unable to set up oVirt 4.0 HE using glusterfs storage

Hello,

I thought I would ask the list if anyone is aware of this issue (or if I am doing something obviously wrong) before I submit a bug report.

It looks like I am not able to choose glusterfs as a storage option for deploying a Hosted Engine in oVirt 4.0 Node Next. I believe it is failing to execute the following command:

/sbin/gluster --mode=script --xml volume info mgmttank --remote-host=storage1.nullaxiom.com

When I run the command manually, I get a blank line as output, and checking in /var/log/glusterfs/cli.log, it seems to be exiting with error code -2.

I have tried disabling the firewall on both the host and the gluster server, but it did not make a difference. I am able to manually mount the gluster volume on the host, and I had a working oVirt 3.6 installation using the exact same gluster server.

Console output below:

[ INFO  ] Stage: Initializing
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
          During customization use CTRL-D to abort.
          Continuing will configure this host for serving as hypervisor and create a VM where you have to install the engine afterwards.
          Are you sure you want to continue? (Yes, No)[Yes]:
          It has been detected that this program is executed through an SSH connection without using screen.
          Continuing with the installation may lead to broken installation if the network connection fails.
          It is highly recommended to abort the installation and run it inside a screen session using command "screen".
          Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO  ] Hardware supports virtualization
          Configuration files: []
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160630225930-ef5wd4.log
          Version: otopi-1.5.0 (otopi-1.5.0-1.el7.centos)
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization

          --== STORAGE CONFIGURATION ==--

          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: glusterfs
[ INFO  ] Please note that Replica 3 support is required for the shared storage.
          Please specify the full shared storage connection path to use (example: host:/path): storage1.nullaxiom.com:/mgmttank
[ ERROR ] Cannot access storage connection storage1.nullaxiom.com:/mgmttank: Command '/sbin/gluster' failed to execute
          Please specify the full shared storage connection path to use (example: host:/path):
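For anyone debugging the same failure, a few checks that narrow down a failing gluster CLI call (a sketch; the log path is the one above, and 24007 is gluster's standard management port):

  # confirm the client CLI runs and note its version
  /sbin/gluster --version | head -n 1

  # check that glusterd's management port on the server is reachable
  timeout 3 bash -c '</dev/tcp/storage1.nullaxiom.com/24007' && echo "24007 open"

  # inspect the CLI's own log for the exit code and any errors
  tail -n 20 /var/log/glusterfs/cli.log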

What's the output of the below from the node?

# gluster volume info mgmttank --remote-host=storage1.nullaxiom.com

On 07/01/2016 08:52 AM, Kevin Hung wrote:
[...]

The same as the other command, two blank lines.

On 7/1/2016 12:41 AM, Sahina Bose wrote:
What's the output of the below from the node?

# gluster volume info mgmttank --remote-host=storage1.nullaxiom.com

Sorry, missed that you already ran it. Version of glusterfs on node and server?

On 07/01/2016 10:15 AM, Kevin Hung wrote:
The same as the other command, two blank lines.

This is a compatibility issue in executing CLI commands. Is it possible to upgrade your glusterfs server to a higher version, or to downgrade glusterfs-cli on the node to 3.7.6?

Please log a bug against HE deploy to handle this error so that installation can proceed. (The --remote-host option throws errors when there's a version mismatch.)
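A quick way to compare the two sides (a sketch; the package query assumes an RPM-based node):

  # on the oVirt node (client side)
  rpm -q glusterfs-cli

  # on the gluster storage server
  gluster --version | head -n 1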
On 07/01/2016 10:25 AM, Kevin Hung wrote:

Version 3.7.11 on the node and version 3.7.6 on the server. The node was set up using the ovirt-node-ng-installer-ovirt-4.0-2016062412 ISO.

Thank you Sahina, that was the issue. I upgraded my glusterfs server to 3.7.11 and I was able to continue with the deployment. I am seeing other issues with deployment, but I will look into those myself first. A bug has been logged [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1352165

On 7/1/2016 1:16 AM, Sahina Bose wrote:
[...]

It looks like I'm now getting an error when the deployment tries to configure the management bridge. Setup log:

2016-07-01 20:29:47 INFO otopi.plugins.gr_he_common.network.bridge bridge._misc:372 Configuring the management bridge
2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:384 networks: {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}
2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:385 bonds: {}
2016-07-01 20:29:48 DEBUG otopi.plugins.gr_he_common.network.bridge bridge._misc:386 options: {'connectivityCheck': False}
2016-07-01 20:29:48 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py", line 387, in _misc
    _setupNetworks(conn, networks, bonds, options)
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-common/network/bridge.py", line 405, in _setupNetworks
    'message: "%s"' % (networks, code, message))
RuntimeError: Failed to setup networks {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error: ('Traceback (most recent call last):\n  File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in <module>\n    from vdsm.netconfpersistence import RunningConfig\nImportError: No module named netconfpersistence\n',)"
2016-07-01 20:29:48 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Misc configuration': Failed to setup networks {'ovirtmgmt': {'nic': 'eno1', 'ipaddr': u'192.168.1.211', 'netmask': u'255.255.255.0', 'bootproto': u'none', 'gateway': u'192.168.1.1', 'defaultRoute': True}}. Error code: "78" message: "Hook error: Hook Error: ('Traceback (most recent call last):\n  File "/usr/libexec/vdsm/hooks/before_network_setup/50_fcoe", line 18, in <module>\n    from vdsm.netconfpersistence import RunningConfig\nImportError: No module named netconfpersistence\n',)"
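The failure at the bottom is an ImportError inside vdsm's fcoe hook. One way to confirm the hook alone is at fault, plus a possible workaround (a sketch; moving the hook aside disables fcoe configuration until it is restored):

  # reproduce the failing import with the system interpreter
  python -c 'from vdsm.netconfpersistence import RunningConfig'

  # if that fails, move the broken hook aside so setup can proceed
  mv /usr/libexec/vdsm/hooks/before_network_setup/50_fcoe /root/50_fcoe.disabled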
On 7/1/2016 5:21 PM, Kevin Hung wrote:

[...]

Looks like there still needs to be some work done on oVirt 4.0 Node and ovirt-hosted-engine-setup before it's ready for general consumption. I have spent days trying to get this to work, and only got it running (on one host) after encountering 8 serious issues (7 below and the initial glusterfs one). I have not been able to successfully deploy a second host (see issue 7 below). I will be moving back to deploying hosts using CentOS (with either oVirt 4.0 or oVirt 3.6) as I need a working oVirt deployment up and running.

In case anyone is interested in reproducing the issues, I used the Node ISO here [1] and the latest (7/2/2016) engine appliance OVA here [2]. Those seem to be the "official" files as far as I can tell (which is difficult, as the documentation is not clear).

List of issues:

1. The error I mentioned seems to be a problem with the code. I bypassed it by deleting /usr/libexec/vdsm/hooks/before_network_setup/50_fcoe.

2. ovirt-hosted-engine-setup is unable to connect to the vdsm service if the FQDN of the node is not resolvable (i.e. if a DNS server is not entered in the initial setup). This should be checked in either the initial oVirt Node setup process or the beginning of ovirt-hosted-engine-setup.

3. The management bridge does not get created properly when the server is set up with a manually configured DNS server and running NetworkManager (the default on Node). It seems a bug was filed for this back in 2014. [3]

4. Using cloud-init with default values to customize the engine appliance can fail on the line "Creating/refreshing DWH database schema" if it takes longer than 600 seconds to return output. This may apply to any other step that takes a long time to complete. The VM no longer appears to exist after the setup exits, so I am unable to debug.

5. Without using cloud-init, the setup creates an engine VM that I cannot log into (it does not seem to use the engine admin password or a blank password).

6. Destroying the VM (option 4) leaves the files intact on the shared storage, so I cannot restart setup without deleting those first. This may be intentional, but the use of kvm terminology (destroy for power off) is not common, not to mention that "virsh -r list --all" does not list the VM anymore.

7. Unable to deploy a second host through the web UI (error "Failed to configure management network on host node2 due to setup networks failure.") or using ovirt-hosted-engine-setup (it looks like it can't connect to or doesn't start the broker service).

8. Random errors to stderr: "vcpu0 unhandled rdmsr" (this seems to be an upstream bug) and "multipath: error getting device" (this has been an issue for years with oVirt and seems to be due to multipathing being on by default even for systems where it does not apply; a possible mitigation is sketched after the links below).

[1] http://resources.ovirt.org/pub/ovirt-4.0/iso/ovirt-node-ng-installer/ovirt-n...
[2] http://jenkins.ovirt.org/view/All/job/ovirt-appliance_ovirt-4.0_build-artifa...
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1160423
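For the multipath noise in issue 8, blacklisting the local disks in /etc/multipath.conf is a common mitigation on hosts with no multipath storage; a minimal sketch (the device name is an example, and note that vdsm manages this file, so a version-dependent "private" marker may be needed for edits to survive):

  cat >> /etc/multipath.conf <<'EOF'
  blacklist {
      devnode "^sda$"    # example: the local system disk
  }
  EOF
  systemctl restart multipathd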
On 7/1/2016 8:37 PM, Kevin Hung wrote:

[...]

On Sun, Jul 3, 2016 at 5:57 AM, Kevin Hung <khung@nullaxiom.com> wrote:
[...]

4. Using cloud-init with default values to customize the engine appliance can fail on the line "Creating/refreshing DWH database schema" if it takes longer than 600 seconds to return output. This may apply to any other step that takes a long time to complete. The VM no longer appears to exist after the setup exits, so I am unable to debug.
600 seconds seems like more than enough time to create an empty DB; if it requires more than 10 minutes for a simple/short operation, there is probably something strange with the storage.
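A simple sanity check of the shared storage write path from the host (a sketch; the target directory is an example and should sit on the mounted hosted-engine storage):

  # conv=fsync makes dd flush before reporting the rate
  dd if=/dev/zero of=/mnt/he-test/ddtest.img bs=1M count=512 conv=fsync
  rm -f /mnt/he-test/ddtest.img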
5. Without using cloud-init, the setup creates an engine VM that I cannot log into (it does not seem to use the engine admin password or a blank password).
Yes, the engine VM host-name and its root password are configured via cloud-init, and there is no default password. If you want to avoid using cloud-init, you have to reset the root password of the engine VM as for any el7 machine.
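For reference, a sketch of the standard el7 reset procedure from the VM console:

  # at the grub menu, press 'e', append rd.break to the linux16 line,
  # and boot with Ctrl-x to get an emergency shell; then:
  mount -o remount,rw /sysroot
  chroot /sysroot
  passwd root
  touch /.autorelabel    # trigger an SELinux relabel on the next boot
  exit
  exit                   # continue booting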
6. Destroying the VM (option 4) leaves the files intact on the shared storage so I cannot restart setup without deleting those first. This may be intentional, but the use of kvm terminology (destroy for power off) is not common, not to mention that "virsh -r list --all" does not list the VM anymore.
On failures, there is not just the engine VM disk but a whole hosted-engine storage domain, which also contains ancillary disks. Re-deploying over dirty storage is not supported, so please clean up the whole storage domain on failures.
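A sketch of wiping a glusterfs-backed hosted-engine domain between attempts, using the volume from this thread (destructive, so only on a volume dedicated to hosted-engine, with any agent/broker services stopped):

  mount -t glusterfs storage1.nullaxiom.com:/mgmttank /mnt
  rm -rf /mnt/*          # removes the failed storage domain contents
  umount /mnt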
7. Unable to deploy a second host through the web UI (error "Failed to configure management network on host node2 due to setup networks failure.") or using
This is not hosted-engine specific: https://bugzilla.redhat.com/show_bug.cgi?id=1350763
ovirt-hosted-engine-setup (it looks like it can't connect to or doesn't start the broker service).

8. Random errors to stderr: "vcpu0 unhandled rdmsr" (this seems to be an
Are you running in a nested env?
upstream bug) and "multipath: error getting device" (this has been an issue for years with oVirt and seems to be due to multipathing being on by default even for systems where that does not apply).

On 7/4/2016 4:38 AM, Simone Tiraboschi wrote:

600 seconds seems like more than enough time to create an empty DB; if it requires more than 10 minutes for a simple/short operation, there is probably something strange with the storage.

I monitored the host RAM/CPU usage and the utilization of the Ethernet interface on the shared storage. RAM and CPU usage were minimal, and there was barely anything going through the network interface. I can confirm that the network interface is fine, as it is heavily utilized when the engine setup uses it to copy the image to the storage. I'm not sure what other statistics I should monitor to see if there's a bottleneck.

[...]

This is not hosted-engine specific: https://bugzilla.redhat.com/show_bug.cgi?id=1350763

Thanks for pointing out the BZ ticket. I'm not entirely certain that's the same issue I was seeing. Of course, I have no way of verifying anymore, as I have already re-deployed using CentOS instead of Node.

Are you running in a nested env?

No, this is bare-metal, not a nested environment.
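One quick way to confirm that (a sketch; systemd-detect-virt ships with systemd on el7):

  # prints the hypervisor name in a guest, or "none" on bare metal
  systemd-detect-virt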
Participants (3):
- Kevin Hung
- Sahina Bose
- Simone Tiraboschi