oVirt 4.1 hosted engine hyperconverged on GlusterFS 3.8.10: "engine" storage domain always complains about "unsynced" elements
by yayo (j)
Hi all,
We have a hyperconverged oVirt cluster with a hosted engine on 3 fully
replicated nodes. This cluster has 2 Gluster volumes:
- data: volume for the Data (Master) Domain (for VMs)
- engine: volume for the hosted_storage Domain (for the hosted engine)
We have this problem: the "engine" Gluster volume always has unsynced elements
and we can't fix the problem. On the command line we have tried to use the
"heal" command, but the elements always remain unsynced.
Below is the heal command "status":
[root@node01 ~]# gluster volume heal engine info
Brick node01:/gluster/engine/brick
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.48
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.64
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.60
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.2
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.68
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.61
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.1
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
/.shard/8aa74564-6740-403e-ad51-f56d9ca5d7a7.20
/__DIRECT_IO_TEST__
Status: Connected
Number of entries: 12
Brick node02:/gluster/engine/brick
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/19d71267-52a4-42a3-bb1e-e3145361c0c2/7a215635-02f3-47db-80db-8b689c6a8f01
<gfid:9a601373-bbaa-44d8-b396-f0b9b12c026f>
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/dom_md/ids
<gfid:1e309376-c62e-424f-9857-f9a0c3a729bf>
<gfid:e3565b50-1495-4e5b-ae88-3bceca47b7d9>
<gfid:4e33ac33-dddb-4e29-b4a3-51770b81166a>
/__DIRECT_IO_TEST__
<gfid:67606789-1f34-4c15-86b8-c0d05b07f187>
<gfid:9ef88647-cfe6-4a35-a38c-a5173c9e8fc0>
/8f215dd2-8531-4a4f-b6ed-ea789dd8821b/images/88d41053-a257-4272-9e2e-2f3de0743b81/6573ed08-d3ed-4d12-9227-2c95941e1ad6
<gfid:9ad720b2-507d-4830-8294-ec8adee6d384>
<gfid:d9853e5d-a2bf-4cee-8b39-7781a98033cf>
Status: Connected
Number of entries: 12
Brick node04:/gluster/engine/brick
Status: Connected
Number of entries: 0
running the "gluster volume heal engine" don't solve the problem...
Some extra info:
We recently changed the Gluster layout from 2 (fully replicated) + 1 arbiter
to a 3-node fully replicated cluster, but I don't know if this is the problem...
The "data" volume is good and healthy and has no unsynced entries.
oVirt refuses to put node02 and node01 into "maintenance mode" and
complains about "unsynced elements".
How can I fix this?
Thank you
oVirt Node / NetworkManager / oVirt 4.1
by Devin Acosta
I noticed that when I run oVirt Node on my hosts and disable NetworkManager,
it keeps turning itself back on, and my understanding is that NetworkManager
should be disabled. I had to forcibly remove NetworkManager in order to get
it to stay disabled. I have been seeing strangeness with my VMs where they
disconnect, and where hosted-engine keeps trying to migrate non-stop to
another host. I just wanted to first confirm the expected NetworkManager
state on the oVirt Node image.
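For reference, a minimal sketch of keeping NetworkManager down without
removing the package (assuming a systemd-based oVirt Node / EL7 image;
whether something re-enables it afterwards is exactly the question here):

# Stop it now and prevent it from being started or enabled again
systemctl stop NetworkManager
systemctl disable NetworkManager
systemctl mask NetworkManager    # symlinks the unit to /dev/null

# Verify it stays down
systemctl status NetworkManager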
--
Devin Acosta
Red Hat Certified Architect
Problems with the ovirtmgmt network used to connect VMs
by FERNANDO FREDIANI
Has anyone had problems when using the ovirtmgmt bridge to connect VMs?
I am still facing a bizarre problem where some VMs connected to this
bridge stop passing traffic. Checking the problem further, I see their MAC
addresses stop being learned by the bridge and the problem is resolved
only with a VM reboot.
When I last saw the problem I ran brctl showmacs ovirtmgmt and it showed
the VM's MAC address with an ageing timer of 200.19. After the VM reboot I
see the same MAC with an ageing timer of 0.00.
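For anyone wanting to reproduce the check, a minimal sketch using standard
bridge tooling (the 300-second default ageing time is the usual default,
not something confirmed on this setup):

# Dump the learned MAC table; a large ageing timer means the bridge has
# not seen traffic from that MAC for a while
brctl showmacs ovirtmgmt

# Current ageing time, typically in 1/100 s (30000 = 300 seconds)
cat /sys/class/net/ovirtmgmt/bridge/ageing_time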
I don't see it in another environment where the ovirtmgmt is not used
for VMs.
Does anyone have any clue about this type of behavior?
Fernando
supervdsmd IOError to /dev/stdout
by Richard Chan
After an upgrade to 4.0 I have a single host that cannot start supervdsmd
because of an IOError on /dev/stdout. All other hosts upgraded correctly.
In the systemd unit I have had to hack in StandardOutput=null.
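For the record, rather than editing the packaged unit file directly, the same
workaround can be carried in a drop-in override so it survives package
updates (a minimal sketch using standard systemd tooling; whether
StandardOutput=null is the right long-term fix is a separate question):

# Creates /etc/systemd/system/supervdsmd.service.d/override.conf
mkdir -p /etc/systemd/system/supervdsmd.service.d
cat > /etc/systemd/system/supervdsmd.service.d/override.conf <<'EOF'
[Service]
StandardOutput=null
EOF

systemctl daemon-reload
systemctl restart supervdsmd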
Anything I have overlooked? The hosts are all identical and it is just
this one that has this weird behaviour.
--
Richard Chan
sanlock ids file broken after server crash
by Johan Bernhardsson
Hello,
The ids file for sanlock is broken on one setup. The first host id in
the file is wrong.
From the logfile I have:
verify_leader 1 wrong space name 0924ff77-ef51-435b-b90d-50bfbf2e�ke7
0924ff77-ef51-435b-b90d-50bfbf2e8de7 /rhev/data-center/mnt/glusterSD/
Note the broken char in the space name.
This also appears, and it seems the host id in the ids file is broken
too:
leader4 sn 0924ff77-ef51-435b-b90d-50bfbf2e�ke7 rn ��7afa5-3a91-415b-
a04c-221d3e060163.vbgkvm01.a ts 4351980 cs eefa4dd7
Note the broken chars there as well.
If I check the ids file with less or strings, the first row, where my
vbgkvm01 host is, has the broken chars.
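For reference, a minimal sketch of how one might inspect the lease area
before deciding anything (sanlock's "direct dump" reads the on-disk leases
without going through the daemon; the ids path is a placeholder since the
full storage-domain path is truncated above):

# Dump the delta leases stored in the ids file (placeholder path)
sanlock direct dump /rhev/data-center/mnt/glusterSD/<server:_volume>/<SD_UUID>/dom_md/ids

# Raw view of the first records, to see exactly which bytes are garbled
hexdump -C /rhev/data-center/mnt/glusterSD/<server:_volume>/<SD_UUID>/dom_md/ids | head -40

# What the running sanlock daemon thinks it currently holds
sanlock client status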
Can this be repaired in some way without taking down all the virtual
machines on that storage?
/Johan
VDSM Command failed: Heartbeat Exceeded
by Neil
Hi guys,
Please could someone assist me: my DC seems to be trying to re-negotiate
SPM and apparently it's failing. I tried to delete an old autogenerated
snapshot and shortly after that the issue seemed to start; however, after
about an hour the snapshot was reported as successfully deleted, and SPM
was then negotiated again, albeit for a short period before it started
trying to re-negotiate again.
Last week I upgraded from oVirt 3.5 to 3.6. I also upgraded one of my 4
hosts using the 3.6 repo to the latest available from that repo and did a
yum update too.
I have 4 nodes and my oVirt engine is a KVM guest on another physical
machine on the network. I'm using an FC SAN with ATTO HBAs and recently
we've started seeing some degraded IO. The SAN appears to be alright and
the disks all seem to check out, but we are having rather slow IOPS at the
moment, which we are trying to track down.
ovirt engine CentOS release 6.9 (Final)
ebay-cors-filter-1.0.1-0.1.ovirt.el6.noarch
ovirt-engine-3.6.7.5-1.el6.noarch
ovirt-engine-backend-3.6.7.5-1.el6.noarch
ovirt-engine-cli-3.6.2.0-1.el6.noarch
ovirt-engine-dbscripts-3.6.7.5-1.el6.noarch
ovirt-engine-extension-aaa-jdbc-1.0.7-1.el6.noarch
ovirt-engine-extensions-api-impl-3.6.7.5-1.el6.noarch
ovirt-engine-jboss-as-7.1.1-1.el6.x86_64
ovirt-engine-lib-3.6.7.5-1.el6.noarch
ovirt-engine-restapi-3.6.7.5-1.el6.noarch
ovirt-engine-sdk-python-3.6.7.0-1.el6.noarch
ovirt-engine-setup-3.6.7.5-1.el6.noarch
ovirt-engine-setup-base-3.6.7.5-1.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.7.5-1.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.7.5-1.el6.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.7.5-1.el6.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.7.5-1.el6.noarch
ovirt-engine-tools-3.6.7.5-1.el6.noarch
ovirt-engine-tools-backup-3.6.7.5-1.el6.noarch
ovirt-engine-userportal-3.6.7.5-1.el6.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.7.5-1.el6.noarch
ovirt-engine-webadmin-portal-3.6.7.5-1.el6.noarch
ovirt-engine-websocket-proxy-3.6.7.5-1.el6.noarch
ovirt-engine-wildfly-8.2.1-1.el6.x86_64
ovirt-engine-wildfly-overlay-8.0.5-1.el6.noarch
ovirt-host-deploy-1.4.1-1.el6.noarch
ovirt-host-deploy-java-1.4.1-1.el6.noarch
ovirt-image-uploader-3.6.0-1.el6.noarch
ovirt-iso-uploader-3.6.0-1.el6.noarch
ovirt-release34-1.0.3-1.noarch
ovirt-release35-006-1.noarch
ovirt-release36-3.6.7-1.noarch
ovirt-setup-lib-1.0.1-1.el6.noarch
ovirt-vmconsole-1.0.2-1.el6.noarch
ovirt-vmconsole-proxy-1.0.2-1.el6.noarch
node01 (CentOS 6.9)
vdsm-4.16.30-0.el6.x86_64
vdsm-cli-4.16.30-0.el6.noarch
vdsm-jsonrpc-4.16.30-0.el6.noarch
vdsm-python-4.16.30-0.el6.noarch
vdsm-python-zombiereaper-4.16.30-0.el6.noarch
vdsm-xmlrpc-4.16.30-0.el6.noarch
vdsm-yajsonrpc-4.16.30-0.el6.noarch
gpxe-roms-qemu-0.9.7-6.16.el6.noarch
qemu-img-rhev-0.12.1.2-2.479.el6_7.2.x86_64
qemu-kvm-rhev-0.12.1.2-2.479.el6_7.2.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.479.el6_7.2.x86_64
libvirt-0.10.2-62.el6.x86_64
libvirt-client-0.10.2-62.el6.x86_64
libvirt-lock-sanlock-0.10.2-62.el6.x86_64
libvirt-python-0.10.2-62.el6.x86_64
node01 was upgraded out of desperation after I tried changing my DC and
cluster version to 3.6, but then found that none of my hosts could be
activated out of maintenance due to an incompatibility with 3.6 (I'm still
not sure why, as searching seemed to indicate CentOS 6.x was compatible). I
then had to remove all 4 hosts, change the cluster version back to 3.5,
and then re-add them. When I tried changing the cluster version to 3.6 I
did get a complaint about using the "legacy protocol", so on each host,
under Advanced, I changed them to use the JSON protocol, and this seemed to
resolve it; however, after changing the DC/cluster back to 3.5 the option
to change the protocol back to Legacy is no longer shown.
node02 (CentOS 6.7)
vdsm-4.16.30-0.el6.x86_64
vdsm-cli-4.16.30-0.el6.noarch
vdsm-jsonrpc-4.16.30-0.el6.noarch
vdsm-python-4.16.30-0.el6.noarch
vdsm-python-zombiereaper-4.16.30-0.el6.noarch
vdsm-xmlrpc-4.16.30-0.el6.noarch
vdsm-yajsonrpc-4.16.30-0.el6.noarch
gpxe-roms-qemu-0.9.7-6.14.el6.noarch
qemu-img-rhev-0.12.1.2-2.479.el6_7.2.x86_64
qemu-kvm-rhev-0.12.1.2-2.479.el6_7.2.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.479.el6_7.2.x86_64
libvirt-0.10.2-54.el6_7.6.x86_64
libvirt-client-0.10.2-54.el6_7.6.x86_64
libvirt-lock-sanlock-0.10.2-54.el6_7.6.x86_64
libvirt-python-0.10.2-54.el6_7.6.x86_64
node03 (CentOS 6.7)
vdsm-4.16.30-0.el6.x86_64
vdsm-cli-4.16.30-0.el6.noarch
vdsm-jsonrpc-4.16.30-0.el6.noarch
vdsm-python-4.16.30-0.el6.noarch
vdsm-python-zombiereaper-4.16.30-0.el6.noarch
vdsm-xmlrpc-4.16.30-0.el6.noarch
vdsm-yajsonrpc-4.16.30-0.el6.noarch
gpxe-roms-qemu-0.9.7-6.14.el6.noarch
qemu-img-rhev-0.12.1.2-2.479.el6_7.2.x86_64
qemu-kvm-rhev-0.12.1.2-2.479.el6_7.2.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.479.el6_7.2.x86_64
libvirt-0.10.2-54.el6_7.6.x86_64
libvirt-client-0.10.2-54.el6_7.6.x86_64
libvirt-lock-sanlock-0.10.2-54.el6_7.6.x86_64
libvirt-python-0.10.2-54.el6_7.6.x86_64
node04 (CentOS 6.7)
vdsm-4.16.20-1.git3a90f62.el6.x86_64
vdsm-cli-4.16.20-1.git3a90f62.el6.noarch
vdsm-jsonrpc-4.16.20-1.git3a90f62.el6.noarch
vdsm-python-4.16.20-1.git3a90f62.el6.noarch
vdsm-python-zombiereaper-4.16.20-1.git3a90f62.el6.noarch
vdsm-xmlrpc-4.16.20-1.git3a90f62.el6.noarch
vdsm-yajsonrpc-4.16.20-1.git3a90f62.el6.noarch
gpxe-roms-qemu-0.9.7-6.15.el6.noarch
qemu-img-0.12.1.2-2.491.el6_8.1.x86_64
qemu-kvm-0.12.1.2-2.491.el6_8.1.x86_64
qemu-kvm-tools-0.12.1.2-2.503.el6_9.3.x86_64
libvirt-0.10.2-60.el6.x86_64
libvirt-client-0.10.2-60.el6.x86_64
libvirt-lock-sanlock-0.10.2-60.el6.x86_64
libvirt-python-0.10.2-60.el6.x86_64
I'm seeing a rather confusing error in /var/log/messages on all 4 hosts,
as follows...
Jul 31 16:41:36 node01 multipathd: 36001b4d80001c80d0000000000000000: sdb -
directio checker reports path is down
Jul 31 16:41:41 node01 kernel: sd 7:0:0:0: [sdb] Result:
hostbyte=DID_ERROR driverbyte=DRIVER_OK
Jul 31 16:41:41 node01 kernel: sd 7:0:0:0: [sdb] CDB: Read(10): 28 00 00 00
00 00 00 00 01 00
Jul 31 16:41:41 node01 kernel: end_request: I/O error, dev sdb, sector 0
I say confusing because I don't have a 3000GB LUN:
[root@node01 ~]# fdisk -l | grep 3000
Disk /dev/sdb: 3000.0 GB, 2999999528960 bytes
I did have one last Friday, but I trashed it and changed it to a 1500GB
LUN instead, so I'm not sure if this error perhaps comes from the hosts
still trying to connect to the old LUN?
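If it is the stale LUN, a minimal sketch of cleaning up the leftover
multipath map and SCSI device is below (the WWID, sdb and host 7 are taken
from the log lines above; double-check them with multipath -ll before
removing anything, since deleting the wrong device is destructive):

# Confirm which multipath map and paths belong to the old LUN
multipath -ll

# Flush the stale multipath map (WWID from the multipathd message above)
multipath -f 36001b4d80001c80d0000000000000000

# Remove the stale SCSI device (sdb on host 7 per the kernel messages)
echo 1 > /sys/block/sdb/device/delete

# Rescan the HBA so the current LUNs are re-detected
echo "- - -" > /sys/class/scsi_host/host7/scan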
My LUNs are as follows...
Disk /dev/sdb: 3000.0 GB, 2999999528960 bytes (this one doesn't actually
exist anymore)
Disk /dev/sdc: 1000.0 GB, 999999668224 bytes
Disk /dev/sdd: 1000.0 GB, 999999668224 bytes
Disk /dev/sde: 1000.0 GB, 999999668224 bytes
Disk /dev/sdf: 1000.0 GB, 999999668224 bytes
Disk /dev/sdg: 1000.0 GB, 999999668224 bytes
Disk /dev/sdh: 1000.0 GB, 999999668224 bytes
Disk /dev/sdi: 1000.0 GB, 999999668224 bytes
Disk /dev/sdj: 1000.0 GB, 999999668224 bytes
Disk /dev/sdk: 1000.0 GB, 999999668224 bytes
Disk /dev/sdm: 1000.0 GB, 999999668224 bytes
Disk /dev/sdl: 1000.0 GB, 999999668224 bytes
Disk /dev/sdn: 1000.0 GB, 999999668224 bytes
Disk /dev/sdo: 1000.0 GB, 999999668224 bytes
Disk /dev/sdp: 1000.0 GB, 999999668224 bytes
Disk /dev/sdq: 1000.0 GB, 999999668224 bytes
Disk /dev/sdr: 1000.0 GB, 999988133888 bytes
Disk /dev/sds: 1500.0 GB, 1499999764480 bytes
Disk /dev/sdt: 1500.0 GB, 1499999502336 bytes
I'm quite low on SAN disk space currently, so I'm a little hesitant to
migrate VMs around for fear of the migrations creating too many snapshots
and filling up my SAN. We are in the process of expanding the SAN array
too, but we are trying to get to the bottom of the bad IOPS at the moment
before adding on additional overhead.
Ping tests between hosts and engine all look alright, so I don't suspect
network issues.
I know this is very vague; everything is currently operational, however,
as you can see in the attached logs, I'm getting lots of ERROR messages.
Any help or guidance is greatly appreciated.
Thanks.
Regards.
Neil Wilson.
oVirt and Foreman
by Davide Ferrari
Hello list,
Is anybody successfully using oVirt + Foreman for VM creation +
provisioning?
I'm using Foreman (latest version, 1.15.2) with the latest oVirt version
(4.1.3) but I'm encountering several problems, especially related to
disks. For example:
- I cannot create a VM with multiple disks through the Foreman CLI (hammer)
- if I create a multi-disk VM from Foreman, the second disk always gets
the "bootable" flag instead of the primary image, making the VMs not
bootable at all.
Any other Foreman user sharing the pain here? Foreman's list is not so
useful, so I'm trying to ask here. How do you programmatically create
virtual machines with oVirt and Foreman? Should I switch to using the
oVirt API directly?
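As a possible workaround for the bootable-flag issue, the flag can be fixed
up after creation through the oVirt REST API; the following is only a
minimal sketch (the engine FQDN, credentials and the VM / disk-attachment
IDs are placeholders, and the element names should be checked against the
4.1 API documentation):

# List the VM's disk attachments to find their IDs
curl -k -u 'admin@internal:PASSWORD' \
  https://engine.example.com/ovirt-engine/api/vms/VM_ID/diskattachments

# Mark the intended primary disk attachment as bootable
curl -k -u 'admin@internal:PASSWORD' \
  -X PUT -H 'Content-Type: application/xml' \
  -d '<disk_attachment><bootable>true</bootable></disk_attachment>' \
  https://engine.example.com/ovirt-engine/api/vms/VM_ID/diskattachments/ATTACHMENT_ID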
Thanks in advance
Davide
Deploying training lab
by Andy Michielsen
Hello all,
I don't know if this is the right place to ask this, but I would like to
set up a training lab with oVirt.
I have deployed an engine and a host with local storage and want to run 1
server and 5 desktops off it.
But the desktops will be used on thin clients or old laptops with some
minimal OS installation running a SPICE client or a web browser.
I was wondering if anyone can give me pointers on how to set up a minimal
laptop which only needs to run a SPICE client.
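For what it's worth, a minimal sketch of what such a laptop needs (assuming
a minimal CentOS/Fedora install; the package name is the usual EL7 one, and
console.vv is the file the oVirt user portal hands out when you open a
console):

# SPICE client: remote-viewer comes from virt-viewer
yum install -y virt-viewer

# Open a console file downloaded from the oVirt user portal
remote-viewer console.vv

# Or connect directly if you know the host and SPICE port
remote-viewer spice://HOST:PORT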
Kind regards.