Live storage migration is failing in 4.2.8
by Ladislav Humenik
Hello, we have recently updated a few oVirt setups from 4.2.5 to 4.2.8
(9 oVirt engine nodes in total), and live storage migration has stopped
working, leaving an auto-generated snapshot behind.
If we power the guest VM down, the migration works as expected. Is there
a known bug for this, or shall we open a new one?
Setup:
ovirt - Dell PowerEdge R630
- CentOS Linux release 7.6.1810 (Core)
- ovirt-engine-4.2.8.2-1.el7.noarch
- kernel-3.10.0-957.10.1.el7.x86_64
hypervisors - Dell PowerEdge R640
- CentOS Linux release 7.6.1810 (Core)
- kernel-3.10.0-957.10.1.el7.x86_64
- vdsm-4.20.46-1.el7.x86_64
- libvirt-5.0.0-1.el7.x86_64
- qemu-kvm-ev-2.12.0-18.el7_6.3.1.x86_64
storage domain - netapp NFS share
logs are attached
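For reference, this is roughly how we spot the failed live merge and the
leftover snapshot (log paths are the defaults; the VM name is a placeholder):
# on the engine
$ grep -iE 'live merge|auto-generated' /var/log/ovirt-engine/engine.log | tail -n 40
# on the SPM / source host
$ grep -i 'merge' /var/log/vdsm/vdsm.log | grep -i 'error' | tail -n 40
# confirm the VM is still running on the extra image in the chain
$ virsh -r list
$ virsh -r domblklist <vm-name>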
--
Ladislav Humenik
System administrator
5 years, 8 months
Global maintenance and fencing of hosts
by Andreas Elvers
I am wondering whether global maintenance inhibits fencing of non-responsive hosts. Is this so?
Background: I plan on migrating the engine from one cluster to another. I understand this means backing up and restoring the engine. While the engine is being migrated it is shut down, and all VMs will continue running. This is good. When starting the engine in its new location, I really don't want it to fence any host on its own, for reasons I cannot yet foresee.
So is global maintenance enough to suppress fencing, or do I have to deactivate fencing on all hosts?
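For context, this is what I plan to run; the engine-config key at the end is
only from memory of the docs, so please correct me if it's wrong:
# before shutting the engine down for the move
$ hosted-engine --set-maintenance --mode=global
$ hosted-engine --vm-status          # should report global maintenance
# (assumption) grace period before the engine fences anything after startup
$ engine-config -g DisableFenceAtStartupInSec
# once the engine is healthy in the new location
$ hosted-engine --set-maintenance --mode=none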
5 years, 8 months
VDSM command CreateStoragePoolVDS failed
by fangjian@linuxtrend.cn
Hi, I tried to set up oVirt in my environment but encountered some problems with data domain creation.
oVirt Manager Version 4.3.2.1-1.el7
oVirt Node Version 4.3.2
1. Create new data center & cluster; ---- successful
2. Create new host 'centos-node-01' in cluster; ---- successful
3. Prepare the NFS storage; ---- successful
4. Tried to create a new data domain and attach the NFS storage to host 'centos-node-01', but it failed.
Message: VDSM centos-node-01 command CreateStoragePoolVDS failed: Cannot acquire host id: (u'ec225640-f05e-4a9d-bdc4-5219065704ec', SanlockException(2, 'Sanlock lockspace add failure', 'No such file or directory'))
How can I fix this and attach the NFS storage to the new data domain?
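Would checking the export from the host like this help narrow it down? The
server name and paths below are placeholders for my environment:
# verify the export is visible and mountable from centos-node-01
$ showmount -e nfs-server.example.com
$ mount -t nfs nfs-server.example.com:/export/ovirt /mnt/nfstest
# sanlock needs to read/write <domain-uuid>/dom_md/ids, and the domain
# must be owned by vdsm:kvm (36:36)
$ ls -ln /mnt/nfstest
$ ls -ln /mnt/nfstest/*/dom_md/ 2>/dev/null
$ chown -R 36:36 /mnt/nfstest      # only if ownership is wrong
$ umount /mnt/nfstest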
5 years, 8 months
Re: Expand existing gluster storage in ovirt 4.2/4.3
by Strahil
Just add new servers to the Gluster cluster (in sets of 3).
Then install them as hosts from the oVirt Hosted Engine.
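A rough sketch of the Gluster side (volume name, hostnames and brick paths
below are only examples for a replica-3 volume; adapt to your layout):
# from an existing gluster node, add the three new peers
$ gluster peer probe newhost1.example.com
$ gluster peer probe newhost2.example.com
$ gluster peer probe newhost3.example.com
# extend the replica-3 volume with one brick per new host; the new bricks
# form an additional replica set (distributed-replicate)
$ gluster volume add-brick data replica 3 \
    newhost1.example.com:/gluster_bricks/data/data \
    newhost2.example.com:/gluster_bricks/data/data \
    newhost3.example.com:/gluster_bricks/data/data
# then add the new hosts to the cluster from the oVirt Administration Portal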
Best Regards,
Strahil Nikolov
On Apr 11, 2019 15:49, adrianquintero(a)gmail.com wrote:
>
> Would you know of any documentation that I could follow for that type of setup?
>
> I have read quite a bit about oVirt and hyperconverged setups, but I have only found 3-node examples; nobody seems to go past that, or at least I've yet to find one. I already have a 3-node hyperconverged setup working as it should, with no issues and with tested failure scenarios, but I need an environment with at least 12 servers that maintains the correct failure scenarios for storage (Gluster). However, I can't seem to figure out the proper steps to achieve this.
>
> So if anyone can point me in the right direction it will help a lot, and once I test I should be able to provide back any knowledge that I gain with such type of setup.
>
>
> thanks again.
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/PJIRZGP6NQ4...
5 years, 8 months
Host reinstall: ModuleNotFoundError: No module named 'rpmUtils'
by John Florian
After mucking around trying to use a jumbo MTU for my iSCSI storage networks
(which apparently I can't do, because my Cisco 3560 switch only supports a
1500-byte MTU on its VLAN interfaces), I got one of my hosts screwed up. I
likely could rebuild it from scratch but I suspect that's overkill. I
simply tried to do a reinstall via the GUI. That fails. Looking at the
ovirt-host-deploy log I see several tracebacks with $SUBJECT. Since
Python pays my bills I figure this is an easy fix. Except ... I see
this on the host:
$ rpm -qf /usr/lib/python2.7/site-packages/rpmUtils/
yum-3.4.3-161.el7.centos.noarch
$ python
Python 2.7.5 (default, Oct 30 2018, 23:45:53)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Tab completion has been enabled.
>>> import rpmUtils
>>>
I'm guessing this must mean the tracebacks are from Python 3 since I can
clearly see the module doesn't exist for either Python 3.4 or 3.6. So
this smells like a packaging bug somehow related to upgrading from 4.2.
I mean, I can't imagine a brand new install fails this blatantly.
Either that or this import error has nothing to do with my reinstall
failure.
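For what it's worth, this is how I convinced myself rpmUtils only exists for
Python 2 on this host (the deploy-log path is where my engine keeps them):
# yum's rpmUtils imports fine under Python 2
$ python2 -c 'import rpmUtils'
# but not under Python 3.4/3.6, which gives exactly this ModuleNotFoundError
$ python3 -c 'import rpmUtils'
# and the interpreter used by host-deploy shows up near the traceback
$ grep -B 5 "No module named 'rpmUtils'" /var/log/ovirt-engine/host-deploy/*.log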
--
John Florian
5 years, 8 months
Bug 1666795 - Related? - VMs don't start after shutdown on FCP
by nardusg@gmail.com
Hi There
Wonder if this issue is related to our problem and if there is a way around it. We upgraded from 4.2.8 to 4.3.2, and now some of the VMs fail to start. We have to detach the disks, create a new VM, reattach the disks to the new VM, and then the new VM starts.
Thanks
Nar
5 years, 8 months
Unable to start vdsm, upgrade 4.0 to 4.1
by Todd Barton
Looking for some help/suggestions to correct an issue I'm having. I have a 3-host HA setup running a hosted engine and Gluster storage. The hosts are identical hardware configurations and have been running very solidly for several years. I was performing an upgrade to 4.1. The first host went fine. The second upgrade didn't go well... On reboot, the server went into a kernel panic and I had to load a previous kernel to diagnose it.
I couldn't get it out of the panic, so I had to revert the system to the previous kernel, which was a big PITA. I then updated it to current and verified the installation of ovirt/vdsm. Everything seemed to be OK, but vdsm won't start. Gluster is working fine. It appears I have an authentication issue with libvirt. I'm getting the message "libvirt: XML-RPC error : authentication failed: authentication failed", which seems to be the core issue.
I've looked at all the past issues/resolutions to this issue and tried them, but I can't get it to work. For example, I do a vdsm-tool configure --force and I get this...
Checking configuration status...
abrt is already configured for vdsm
lvm is configured for vdsm
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts
Current revision of multipath.conf detected, preserving
Running configure...
Reconfiguration of abrt is done.
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 219, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/__init__.py", line 38, in wrapper
    func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 141, in configure
    _configure(c)
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 88, in _configure
    getattr(module, 'configure', lambda: None)()
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/passwd.py", line 68, in configure
    configure_passwd()
  File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/passwd.py", line 98, in configure_passwd
    raise RuntimeError("Set password failed: %s" % (err,))
RuntimeError: Set password failed: ['saslpasswd2: invalid parameter supplied']
...any help would be greatly appreciated. I'm not a Linux/oVirt expert by any means, but I desperately need to get this setup back to being stable. This happened many months ago and I gave up on fixing it, but I really need to get this back online again.
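In case it helps with the diagnosis, the failing step appears to be vdsm
setting the libvirt SASL password, so I tried roughly the same thing by hand.
The database path and user below are what this host uses, so treat them as
assumptions for other setups:
# the SASL database libvirt points at
$ grep -i sasldb /etc/sasl2/libvirt.conf
# set the vdsm@ovirt password manually to see saslpasswd2's own error
$ saslpasswd2 -a libvirt -f /etc/libvirt/passwd.db vdsm@ovirt
# then re-run the configurator and restart the services
$ vdsm-tool configure --force
$ systemctl restart libvirtd vdsmd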
Thank you
Todd Barton
5 years, 8 months
Re: Unable to start vdsm, upgrade 4.0 to 4.1
by Strahil
The fastest (Windows-style) approach is to completely wipe the host and do a reinstall -> install vdsm, and so on.
You should consider oVirt Node.
Another one that comes to my mind is to do:
yum history
yum history rollback <id>
reboot
repeat the upgrade.
For now, I'm planning to do Gluster snapshots (with all machines on the volume stopped) before major upgrades, as this is the fastest recovery approach.
Actually, oVirt Node uses thin LVM snapshots to guarantee fast rollback.
Best Regards,
Strahil Nikolov
On Apr 13, 2019 06:44, Todd Barton <tcbarton(a)ipvoicedatasystems.com> wrote:
>
> Looking for some help/suggestions to correct an issue I'm having. I have a 3-host HA setup running a hosted engine and Gluster storage. The hosts are identical hardware configurations and have been running very solidly for several years. I was performing an upgrade to 4.1. The first host went fine. The second upgrade didn't go well... On reboot, the server went into a kernel panic and I had to load a previous kernel to diagnose it.
>
> I couldn't get it out of the panic, so I had to revert the system to the previous kernel, which was a big PITA. I then updated it to current and verified the installation of ovirt/vdsm. Everything seemed to be OK, but vdsm won't start. Gluster is working fine. It appears I have an authentication issue with libvirt. I'm getting the message "libvirt: XML-RPC error : authentication failed: authentication failed", which seems to be the core issue.
>
> I've looked at all the past issues/resolutions to this issue and tried them, but I can't get it to work. For example, I do a vdsm-tool configure --force and I get this...
>
> Checking configuration status...
>
> abrt is already configured for vdsm
> lvm is configured for vdsm
> libvirt is already configured for vdsm
> SUCCESS: ssl configured to true. No conflicts
> Current revision of multipath.conf detected, preserving
>
> Running configure...
> Reconfiguration of abrt is done.
> Traceback (most recent call last):
>   File "/usr/bin/vdsm-tool", line 219, in main
>     return tool_command[cmd]["command"](*args)
>   File "/usr/lib/python2.7/site-packages/vdsm/tool/__init__.py", line 38, in wrapper
>     func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 141, in configure
>     _configure(c)
>   File "/usr/lib/python2.7/site-packages/vdsm/tool/configurator.py", line 88, in _configure
>     getattr(module, 'configure', lambda: None)()
>   File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/passwd.py", line 68, in configure
>     configure_passwd()
>   File "/usr/lib/python2.7/site-packages/vdsm/tool/configurators/passwd.py", line 98, in configure_passwd
>     raise RuntimeError("Set password failed: %s" % (err,))
> RuntimeError: Set password failed: ['saslpasswd2: invalid parameter supplied']
>
> ...any help would be greatly appreciated. I'm not a Linux/oVirt expert by any means, but I desperately need to get this setup back to being stable. This happened many months ago and I gave up on fixing it, but I really need to get this back online again.
>
> Thank you
>
> Todd Barton
>
>
>
>
>
5 years, 8 months
Tuning Gluster Writes
by Alex McWhirter
I have 8 machines acting as Gluster servers. They each have 12 drives in
RAID 50 (3 sets of 4 drives in RAID 5, then striped together as one).
They connect to the compute hosts and to each other over LACP'd 10Gb
connections split across two Cisco Nexus switches with vPC.
Gluster has the following options set:
performance.write-behind-window-size: 4MB
performance.flush-behind: on
performance.stat-prefetch: on
server.event-threads: 4
client.event-threads: 8
performance.io-thread-count: 32
network.ping-timeout: 30
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
storage.owner-gid: 36
storage.owner-uid: 36
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: off
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: on
I have the following sysctl values on the Gluster clients and servers, using
libgfapi and MTU 9K:
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_congestion_control = htcp
Reads with this setup are perfect, benchmarked in a VM at about 770MB/s
sequential, with disk access times of < 1ms. Writes, on the other hand, are
all over the place. They peak around 320MB/s sequential write, which is
what I expect, but it seems as if there is some blocking going on.
During the write test I will hit 320MB/s briefly, then 0MB/s as the disk
access time shoots to over 3000ms, then back to 320MB/s. It averages out
to about 110MB/s in the end.
Gluster version is 3.12.15; oVirt is 4.2.7.5.
Any ideas on what I could tune to eliminate or minimize that blocking?
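For context, this is roughly how I'm benchmarking, and the first knobs I was
planning to experiment with (the values are guesses to try one at a time,
not recommendations):
# inside the VM: sequential write with the page cache bypassed
$ dd if=/dev/zero of=/root/testfile bs=1M count=4096 oflag=direct conv=fsync
# same test directly on one brick's filesystem, to rule the RAID 50 arrays
# in or out of the stalls
$ dd if=/dev/zero of=/gluster_bricks/<vol>/testfile bs=1M count=4096 oflag=direct
# candidate options (current values are listed above)
$ gluster volume set <vol> performance.write-behind-window-size 8MB
$ gluster volume set <vol> server.event-threads 8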
5 years, 8 months
Migrate self-hosted engine between clusters
by raul.caballero.girol@gmail.com
Hello everybody,
I'm new to oVirt and I have a problem. I have deployed a new environment with a self-hosted engine. Now that I am learning about oVirt, I think my engine should be in another cluster. How can I move my engine from one cluster to the other?
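From what I have read so far, it looks like this involves backing up the
engine and redeploying it from that backup on a host in the target cluster,
roughly like this (file names are examples and the restore flag is what I
found in the 4.2/4.3 documentation, so please correct me if I'm wrong):
# on the current engine VM
$ engine-backup --mode=backup --file=engine-backup.tar.gz --log=engine-backup.log
# copy the backup off, enable global maintenance, shut the old engine down,
# then on a host in the target cluster:
$ hosted-engine --deploy --restore-from-file=engine-backup.tar.gz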
Cheers
5 years, 8 months