A possible bug on Fedora 27
by Valentin Bajrami
Hi Community,
Recently we discovered that our VMs became unstable after upgrading
from Fedora 26 to Fedora 27. The journalctl log shows the following:
Jan 29 20:03:28 host1.project.local libvirtd[2741]: 2018-01-29
19:03:28.789+0000: 2741: error : qemuMonitorIO:705 : internal error: End
of file from qemu monitor
Jan 29 20:09:14 host1.project.local libvirtd[2741]: 2018-01-29
19:09:14.111+0000: 2741: error : qemuMonitorIO:705 : internal error: End
of file from qemu monitor
Jan 29 20:10:29 host1.project.local libvirtd[2741]: 2018-01-29
19:10:29.584+0000: 2741: error : qemuMonitorIO:705 : internal error: End
of file from qemu monitor
A similar bug report is already present here:
https://bugzilla.redhat.com/show_bug.cgi?id=1523314 but doesn't reflect
our problem entirely. That bug seems to be triggered only when a VM is
shut down gracefully; in our case it is triggered without any attempt
to shut down a VM. Again, this is causing the VMs to become unstable,
and eventually they shut down by themselves.
Do you have any clue what could be causing this?
--
Kind regards,
Valentin Bajrami
qemu-kvm images corruption
by Nicolas Ecarnot
TL;DR:
How to avoid images corruption?
Hello,
On two of our old 3.6 DCs, a recent series of VM migrations led to some
issues:
- I'm putting a host into maintenance mode
- most of the VMs are migrating nicely
- one remaining VM never migrates, and the logs show:
* engine.log : "...VM has been paused due to I/O error..."
* vdsm.log : "...Improbable extension request for volume..."
After digging through the RH BZ tickets, I saved the day by:
- stopping the VM
- lvchange -ay the adequate /dev/...
- qemu-img check [-r all] /rhev/blahblah
- lvchange -an...
- boot the VM
- enjoy!
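Spelled out, the sequence was roughly as follows (the VG/LV names and
the /rhev path are placeholders for our actual volumes):

# lvchange -ay /dev/<vg>/<lv>                  # activate the image LV
# qemu-img check -r all /rhev/<path-to-image>  # check and repair the qcow2
# lvchange -an /dev/<vg>/<lv>                  # deactivate before booting the VM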
Yesterday this worked for a VM where only one error occurred on the qemu
image, and the repair was easily done by qemu-img.
Today, facing the same issue on another VM, the repair failed because
the errors were very numerous, and also because of this message:
[...]
Rebuilding refcount structure
ERROR writing refblock: No space left on device
qemu-img: Check failed: No space left on device
[...]
The PV/VG/LV are far from being full, so I'm not sure where to look.
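For what it's worth, this is the kind of check I ran to compare the LV
sizes with what the image itself reports (names and paths are
placeholders):

# lvs --units g <vg-name>              # LV sizes in GB
# qemu-img info /rhev/<path-to-image>  # virtual size vs. disk size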
I tried many ways to solve it, but I'm not at all comfortable with qemu
image corruption and repair, so I ended up exporting this VM (to an NFS
export domain) and importing it into another DC: this had the side
effect of running qemu-img convert from qcow2 to qcow2, and (maybe?) of
fixing some errors.
I also copied it into another qcow2 file using the same qemu-img
convert approach, and that too produced a clean qcow2 image without
errors.
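For reference, the manual copy was done along these lines (paths are
placeholders):

# qemu-img convert -p -O qcow2 /rhev/<old-image> /mnt/<new-image>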
I saw that some VM migration bugs are fixed in 4.x, but that is not the
point here.
I checked my SANs, my network layers, my blades, the OS (CentOS 7.2) of
my hosts, but I see nothing special.
The real reason behind my message is not to learn how to repair
anything, but rather to understand what could have led to this
situation. Where should I keep a keen eye?
--
Nicolas ECARNOT
ovirt 3.6, we had the ovirt manager go down in a bad way and all VMs for one node marked Unknown and Not Responding while up
by Christopher Cox
Like the subject says... I tried to clear the status in the vm_dynamic
table for a VM, but it just goes back to 8.
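For reference, this is the kind of update I tried against the engine
database (the UUID is a placeholder, and I'm not certain 0 is the right
"cleared" status value; both are guesses on my part):

# sudo -u postgres psql engine -c \
  "UPDATE vm_dynamic SET status = 0 WHERE vm_guid = '<vm-uuid>';"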
Any hints on how to get things back to a known state?
I tried marking the node in maintenance, but it can't move the
"Unknown" VMs, so that doesn't work. I tried rebooting a VM; that
doesn't work either.
The state of the VMs is up, and I think they are running on the node
they say they are running on; we just have the Unknown problem with VMs
on that one node. So we can't move them, and rebooting VMs doesn't fix
it. Any trick to restoring state so that oVirt is OK?
(what a mess)
VM paused due to unknown storage error
by Misak Khachatryan
Hi,
After upgrading to 4.2 I'm getting "VM paused due to unknown storage
error". While upgrading I had a Gluster problem with one of the hosts,
which I fixed by re-adding it to the Gluster peers. Now I see something
weird in the brick configuration, see attachment: one of the bricks
uses 0% of its space.
How can I diagnose this? I see nothing wrong in the logs.
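A few commands that usually help narrow this kind of thing down (the
volume name is a placeholder):

# gluster peer status                # are all peers connected?
# gluster volume status <volume>     # are all bricks online?
# gluster volume heal <volume> info  # pending heals per brick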
Best regards,
Misak Khachatryan
Using upstream QEMU
by Harry Mallon
Hello all,
Has anyone used oVirt with non-oVirt provided QEMU versions?
I need a feature provided by upstream QEMU, but it is disabled in the oVirt/CentOS7 QEMU RPM.
I have two possible methods to avoid the issue:
1. Fedora has a more recent QEMU which is closer to 'stock'. I see that oVirt 4.2 has no Fedora support, but is it possible to install the host onto a Fedora machine? I am trying to use the master-branch RPMs as recommended in the "No Fedora Support" note, with no luck so far.
2. Is it safe/sensible to use oVirt with a CentOS7 host running an upstream QEMU version?
Thanks,
Harry
Harry Mallon
CODEX | Senior Software Engineer
60 Poland Street | London | England | W1F 7NT
E harry.mallon(a)codex.online | T +44 203 7000 989
ovirt 4.2.1 pre hosted engine deploy failure
by Gianluca Cecchi
Hello,
at the end of the command
hosted-engine --deploy
I get:
[ INFO ] TASK [Detect ovirt-hosted-engine-ha version]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Set ha_version]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Create configuration templates]
[ INFO ] TASK [Create configuration archive]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Create ovirt-hosted-engine-ha run directory]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Copy configuration files to the right location on host]
[ INFO ] TASK [Copy configuration archive to storage]
[ ERROR ] [WARNING]: Failure using method (v2_runner_on_failed) in
callback plugin
[ ERROR ] (<ansible.plugins.callback.1_otopi_json.CallbackModule object at
0x2dd7d90>):
[ ERROR ] 'ascii' codec can't encode character u'\u2018' in position 496:
ordinal not in
[ ERROR ] range(128)
[ ERROR ] Failed to execute stage 'Closing up': Failed executing
ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] Generating answer file
'/var/lib/ovirt-hosted-engine-setup/answers/answers-20180129164431.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: this system is not reliable,
please check the issue,fix and redeploy
Log file is located at
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20180129160956-a7itm9.log
[root@ov42 ~]#
Is there any known bug for this?
In the log file I have:
2018-01-29 16:44:28,159+0100 ERROR
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:173
[WARNING]: Failure using method (v2_runner_on_failed) in callback plugin
2018-01-29 16:44:28,160+0100 DEBUG otopi.plugins.otopi.dialog.human
human.format:69 newline sent to logger
2018-01-29 16:44:28,160+0100 ERROR
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:173
(<ansible.plugins.callback.1_otopi_json.CallbackModule object at
0x2dd7d90>):
2018-01-29 16:44:28,160+0100 DEBUG otopi.plugins.otopi.dialog.human
human.format:69 newline sent to logger
2018-01-29 16:44:28,160+0100 ERROR
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:173 'ascii'
codec can't encode character u'\u2018' in position 496: ordinal not in
2018-01-29 16:44:28,161+0100 DEBUG otopi.plugins.otopi.dialog.human
human.format:69 newline sent to logger
2018-01-29 16:44:28,161+0100 ERROR
otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:173
range(128)
2018-01-29 16:44:28,161+0100 DEBUG otopi.plugins.otopi.dialog.human
human.format:69 newline sent to logger
2018-01-29 16:44:28,161+0100 DEBUG otopi.context context._executeMethod:143
method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 133, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-ansiblesetup/core/target_vm.py", line 193, in _closeup
    r = ah.run()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/ansible_utils.py", line 175, in run
    raise RuntimeError(_('Failed executing ansible-playbook'))
RuntimeError: Failed executing ansible-playbook
2018-01-29 16:44:28,162+0100 ERROR otopi.context context._executeMethod:152
Failed to execute stage 'Closing up': Failed executing ansible-playbook
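The failure itself looks like the classic Python 2 UnicodeEncodeError:
somewhere in the callback path a string containing a typographic quote
(U+2018) is being encoded as ASCII. The error message can be reproduced
with a one-liner (purely an illustration, not the actual setup code
path):

# python2 -c "u'\u2018'.encode('ascii')"
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in
position 0: ordinal not in range(128)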
I'm testing a nested self-hosted-engine deployment, with the HE storage on NFS.
Thanks,
Gianluca
Node network setup
by spfma.tech@e.mail.fr
Hi,
I am trying to set up a cluster of two nodes, with a self-hosted
Engine. Things went fine for the first machine, but it was rather messy
with the second one. I would like to have load balancing and failover
for both the management network and the storage (NFS repository).
So what exactly should I do to get a working network stack which can be
recognized when I try to add this host to the cluster?
I have tried configuring bonds and bridges using Cockpit and using
manual "ifcfg" files, but every time I see the bridges and the bonds
not linked in the Engine interface, so the new host cannot be enrolled.
If I try to link "ovirtmgmt" to the associated bond, I get a
connectivity loss because it is the management device, and I have to
restart the network services. As the management configuration is not
OK, I can't set up the storage connection.
And if I just try to activate the host, it will install and configure
things and then complain about missing "ovirtmgmt" and "nfs" networks,
which both exist and work at the CentOS level.
The interface, bond and bridge names are copy/pasted from the first
server.

# brctl show ovirtmgmt
bridge name     bridge id               STP enabled     interfaces
ovirtmgmt       8000.44a842394200       no              bond0
# ip addr show bond0
33: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovirtmgmt state UP qlen 1000
    link/ether 44:a8:42:39:42:00 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::46a8:42ff:fe39:4200/64 scope link
       valid_lft forever preferred_lft forever
# ip addr show em1
2: em1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 44:a8:42:39:42:00 brd ff:ff:ff:ff:ff:ff
# ip addr show em3
4: em3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 44:a8:42:39:42:00 brd ff:ff:ff:ff:ff:ff

By the way, is it mandatory to stop and disable NetworkManager or not?
Thanks for any kind of help :-)
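For completeness, the kind of ifcfg files I have been trying look
roughly like this (the bond mode and options are examples, not
necessarily my exact settings):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=active-backup miimon=100"
BRIDGE=ovirtmgmt
ONBOOT=yes
NM_CONTROLLED=no

# /etc/sysconfig/network-scripts/ifcfg-em1 (em3 is analogous)
DEVICE=em1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
NM_CONTROLLED=no

# /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
DEVICE=ovirtmgmt
TYPE=Bridge
BOOTPROTO=none
ONBOOT=yes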
Upgrade via reinstall?
by Jamie Lawrence
Hello,
I currently have an oVirt 4.1.8 installation with a hosted engine, using Gluster for storage, with the DBs hosted on a dedicated PG cluster.
For reasons[1], it seems possibly simpler to move to the new version by reinstalling rather than upgrading in place. In this case, I can happily bring down the running VMs and otherwise do things that one normally can't.
Is there any technical reason I can't/shouldn't rebuild from bare-metal, including creating a fresh hosted engine, without losing anything? I suppose a different way of asking this is, is there anything on the engine/host filesystems that I should preserve/restore for this to work?
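(For what it's worth, as a precaution I'll take an engine backup first,
along these lines; the file names are placeholders, and for a clean
reinstall I may never restore it:

# engine-backup --mode=backup --file=engine-backup.tar.bz2 --log=engine-backup.log
)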
Thanks,
-j
[1] If this isn't an option, I'll go in to them in order to figure out a plan B; just avoiding a lot of backstory that isn't needed for the question.
engine add hosts
by 李强华
Hello! I want to add hosts, but my hosts are offline and cannot connect
to the internet. When the engine adds the hosts, the message is:
installing host node failed. My engine was set up successfully (I used
engine-setup --offline). Please help!