On Tue, Nov 28, 2017 at 6:36 PM, Kasturi Narra <knarra@redhat.com> wrote:
Hello,
I have an environment with 3 hosts and gluster HCI on 4.1.3.
I'm following this link to take it to 4.1.7


[snip] 
 
7. Exit the global maintenance mode: within a few minutes the engine VM should migrate to the freshly upgraded host because it will get a higher score

One note: exiting from global maintenance doesn't imply that the host previously put into maintenance also exits from it, correct?

[kasturi] - you are right. Global maintenance's main use is to allow the administrator to start / stop / modify the engine VM without any worry of interference from the HA agents.

So probably one step between 6. and 7. has to be added:

. Exit the hosted-engine host from maintenance (Select Host -> Management -> Activate)
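For reference, the global maintenance flag can also be toggled from the shell of any HA host; a minimal sketch of the two distinct operations (the host activation itself is the web admin action above):

# Check the current HA state and host scores first
hosted-engine --vm-status

# Step 7: exit global maintenance so the HA agents resume
# scoring hosts and can migrate the engine VM
hosted-engine --set-maintenance --mode=none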


Then, after exiting from global maintenance, I don't see the engine VM migrating to it.
[kasturi] - which is expected. 

Reading the documents, I thought it should have migrated to the "higher" version host...
Perhaps this applies only when there is a cluster version upgrade in the datacenter, such as 3.6 -> 4.0 or 4.0 -> 4.1, and it is not true in general?
 
Can I manually migrate the engine VM to ovirt03?

[kasturi]

yes, definitely. You should be able to migrate. 
hosted-engine --vm-status looks fine


Yes, it worked as expected


On ovirt03:

[root@ovirt03 ~]# gluster volume info engine
 
Volume Name: engine
Type: Replicate
Volume ID: 6e2bd1d7-9c8e-4c54-9d85-f36e1b871771
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt01.localdomain.local:/gluster/brick1/engine
Brick2: ovirt02.localdomain.local:/gluster/brick1/engine
Brick3: ovirt03.localdomain.local:/gluster/brick1/engine (arbiter)
Options Reconfigured:
performance.strict-o-direct: on
nfs.disable: on
user.cifs: off
network.ping-timeout: 30
cluster.shd-max-threads: 6
cluster.shd-wait-qlength: 10000
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.readdir-ahead: on
transport.address-family: inet
[root@ovirt03 ~]# 
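(By the way, before moving to the next host during the rolling upgrade I also verify that no heals are pending on the replica; a minimal check with the standard gluster CLI:)

# All bricks should be online and no entries left to heal
gluster volume heal engine info
gluster volume status engine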


[snip]
 

[kasturi] - By the way, any reason why the engine volume is configured as an arbiter volume? We always recommend the engine volume to be a fully replicated (replica 3) volume to maintain high availability of the Hosted Engine VM.



In general I can mix arbiter volumes and fully replicated volumes in the same infrastructure, correct?

Actually this particular system is based on a single NUC6i5SYH with 32GB of RAM and 2xSSD disks, where I have ESXi 6.0U2 installed.
The 3 oVirt HCI hosts are 3 vSphere VMs, so the engine VM is an L2 guest.
Moreover, in oVirt I have another CentOS 6 VM (L2 guest) configured, and there is also a CentOS 7 vSphere VM running side by side.
Without the arbiter it would have been too cruel... ;-)
I hadn't touched it for 4 months and found it rock solid and active, so I decided to verify the update from 4.1.3 to 4.1.7, and it all went OK.
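For completeness, the arbiter layout ("1 x (2 + 1) = 3" above) versus a fully replicated one is decided at volume creation time; an illustrative sketch with the same brick paths (only one of the two forms can actually be used for a given volume):

# Fully replicated, as recommended for the engine volume:
gluster volume create engine replica 3 \
    ovirt01.localdomain.local:/gluster/brick1/engine \
    ovirt02.localdomain.local:/gluster/brick1/engine \
    ovirt03.localdomain.local:/gluster/brick1/engine

# Replica 3 with arbiter (my setup): the third brick holds only
# metadata, so it needs almost no disk space
gluster volume create engine replica 3 arbiter 1 \
    ovirt01.localdomain.local:/gluster/brick1/engine \
    ovirt02.localdomain.local:/gluster/brick1/engine \
    ovirt03.localdomain.local:/gluster/brick1/engine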

The remaining points I'm going to investigate further are:

- edit of the running options for the engine VM (see the vm.conf sketch after this list).
Right now I'm forced to manually start the engine in my particular nested environment with
hosted-engine --vm-start --vm-conf=/root/alternate_engine_vm.conf
where I have
emulatedMachine=pc-i440fx-rhel7.2.0
because with 7.3 and 7.4 it doesn't start, as described in this thread:
http://lists.ovirt.org/pipermail/users/2017-July/083149.html

Still, it seems I cannot set it from the web admin GUI in 4.1.7, and the same happens for other engine VM parameters. I don't know if there will be any improvement in managing this in 4.2.


- migrate from fuse to libgfapi (see the engine-config sketch after this list)

- migrate the gluster volume traffic from ovirtmgmt to another defined logical network (see the reset-brick sketch after this list).
I tried it (after updating to Gluster 3.10) in 4.1.3 for an export domain, with some problems. It seems more critical for the data and engine storage domains.
See also this thread with my attempts at that time:
http://lists.ovirt.org/pipermail/users/2017-July/083077.html
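For the first point, the alternate file is just an edited copy of the vm.conf the HA agent extracts on the host; a sketch assuming the usual 4.1 path /var/run/ovirt-hosted-engine-ha/vm.conf:

# Copy the extracted vm.conf and pin the emulated machine type
# that works in my nested environment
cp /var/run/ovirt-hosted-engine-ha/vm.conf /root/alternate_engine_vm.conf
sed -i 's/^emulatedMachine=.*/emulatedMachine=pc-i440fx-rhel7.2.0/' \
    /root/alternate_engine_vm.conf

# Start the engine VM with the override
hosted-engine --vm-start --vm-conf=/root/alternate_engine_vm.conf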
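For the fuse -> libgfapi point, the switch I've seen referenced is an engine-config option (assuming it is already exposed at this 4.1.x level; running VMs pick it up only after a power cycle):

# On the engine VM: enable libgfapi disk access for 4.1 clusters
engine-config -s LibgfApiSupported=true --cver=4.1
systemctl restart ovirt-engine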
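And for the network migration point, the flow I was testing is gluster's reset-brick, re-attaching each brick under a hostname that resolves on the dedicated gluster network (the -gluster names below are illustrative assumptions):

# Repeat per brick, one at a time, letting heals finish in between
gluster volume reset-brick engine \
    ovirt01.localdomain.local:/gluster/brick1/engine start
gluster volume reset-brick engine \
    ovirt01.localdomain.local:/gluster/brick1/engine \
    ovirt01-gluster.localdomain.local:/gluster/brick1/engine commit force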

Thanks for your time,
Gianluca