Hello,

Progress: I finally tried to migrate the machine to other hosts in the cluster. For one of them this actually worked!

See attached vdsm.log. The migration to host microcloud25 worked as expected, and migrating back to the initial host microcloud22 did too. The other hosts (microcloud21, microcloud23, microcloud24) did not work at all as migration targets.

Perhaps the working ones were the two that I rebooted after upgrading all hosts to oVirt 4.1.5. I'll reboot another host and try again. Perhaps some other daemon (libvirtd, supervdsmd, or something else) has to be restarted as well.

Bye.

On 25.08.2017 at 14:14, Ralf Schenk wrote:

Hello,

setting the DNS name glusterfs.rxmgmt.databay.de to only one IP didn't change anything.

[root@microcloud22 ~]# dig glusterfs.rxmgmt.databay.de

; <<>> DiG 9.9.4-RedHat-9.9.4-50.el7_3.1 <<>> glusterfs.rxmgmt.databay.de
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35135
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 6

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;glusterfs.rxmgmt.databay.de.   IN      A

;; ANSWER SECTION:
glusterfs.rxmgmt.databay.de. 84600 IN   A       172.16.252.121

;; AUTHORITY SECTION:
rxmgmt.databay.de.      84600   IN      NS      ns3.databay.de.
rxmgmt.databay.de.      84600   IN      NS      ns.databay.de.
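For cross-checking the resolution on every host independently of dig, a minimal Python sketch (the hostname in the demo call is a stand-in; substitute glusterfs.rxmgmt.databay.de on the actual hosts):

```python
import socket

def resolve_ipv4(hostname):
    """Return the sorted list of distinct IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

# Substitute "glusterfs.rxmgmt.databay.de" here to verify the name now
# resolves to exactly one address, matching the dig output above.
print(resolve_ipv4("localhost"))
```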

vdsm.log still shows:
2017-08-25 14:02:38,476+0200 INFO  (periodic/0) [vdsm.api] FINISH repoStats return={u'7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000295126', 'lastCheck': '0.8', 'valid': True}, u'2b2a44fc-f2bd-47cd-b7af-00be59e30a35': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000611748', 'lastCheck': '3.6', 'valid': True}, u'5d99af76-33b5-47d8-99da-1f32413c7bb0': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000324379', 'lastCheck': '3.6', 'valid': True}, u'a7fbaaad-7043-4391-9523-3bedcdc4fb0d': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000718626', 'lastCheck': '4.1', 'valid': True}} from=internal, task_id=ec205bf0-ff00-4fac-97f0-e6a7f3f99492 (api:52)
2017-08-25 14:02:38,584+0200 ERROR (migsrc/ffb71f79) [virt.vm] (vmId='ffb71f79-54cd-4f0e-b6b5-3670236cb497') failed to initialize gluster connection (src=0x7fd82001fc30 priv=0x7fd820003ac0): Success (migration:287)
2017-08-25 14:02:38,619+0200 ERROR (migsrc/ffb71f79) [virt.vm] (vmId='ffb71f79-54cd-4f0e-b6b5-3670236cb497') Failed to migrate (migration:429)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 411, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 487, in _startUnderlyingMigration
    self._perform_with_conv_schedule(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 563, in _perform_with_conv_schedule
    self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 529, in _perform_migration
    self._vm._dom.migrateToURI3(duri, params, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 944, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1939, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: failed to initialize gluster connection (src=0x7fd82001fc30 priv=0x7fd820003ac0): Success


One thing I noticed in destination vdsm.log:
2017-08-25 10:38:03,413+0200 ERROR (jsonrpc/7) [virt.vm] (vmId='ffb71f79-54cd-4f0e-b6b5-3670236cb497') Alias not found for device type disk during migration at destination host (vm:4587)
2017-08-25 10:38:03,478+0200 INFO  (jsonrpc/7) [root]  (hooks:108)
2017-08-25 10:38:03,492+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.migrationCreate succeeded in 0.51 seconds (__init__:539)
2017-08-25 10:38:03,669+0200 INFO  (jsonrpc/2) [vdsm.api] START destroy(gracefulAttempts=1) from=::ffff:172.16.252.122,45736 (api:46)
2017-08-25 10:38:03,669+0200 INFO  (jsonrpc/2) [virt.vm] (vmId='ffb71f79-54cd-4f0e-b6b5-3670236cb497') Release VM resources (vm:4254)
2017-08-25 10:38:03,670+0200 INFO  (jsonrpc/2) [virt.vm] (vmId='ffb71f79-54cd-4f0e-b6b5-3670236cb497') Stopping connection (guestagent:430)
2017-08-25 10:38:03,671+0200 INFO  (jsonrpc/2) [vdsm.api] START teardownImage(sdUUID=u'5d99af76-33b5-47d8-99da-1f32413c7bb0', spUUID=u'00000001-0001-0001-0001-0000000000b9', imgUUID=u'9c007b27-0ab7-4474-9317-a294fd04c65f', volUUID=None) from=::ffff:172.16.252.122,45736, task_id=4878dd0c-54e9-4bef-9ec7-446b110c9d8b (api:46)
2017-08-25 10:38:03,671+0200 INFO  (jsonrpc/2) [vdsm.api] FINISH teardownImage return=None from=::ffff:172.16.252.122,45736, task_id=4878dd0c-54e9-4bef-9ec7-446b110c9d8b (api:52)
2017-08-25 10:38:03,672+0200 INFO  (jsonrpc/2) [virt.vm] (vmId='ffb71f79-54cd-4f0e-b6b5-3670236cb497') Stopping connection (guestagent:430)




On 25.08.2017 at 14:03, Denis Chaplygin wrote:
Hello!

On Fri, Aug 25, 2017 at 1:40 PM, Ralf Schenk <rs@databay.de> wrote:

Hello,

I've been using the DNS-balanced gluster hostname for years now, not only with oVirt, and no software so far had a problem with it. Pointing the hostname at only one host of course defeats one advantage of a distributed/replicated cluster filesystem: load balancing the connections to the storage and failover if one host is down. In earlier oVirt versions it wasn't possible to specify something like "backupvolfile-server" for a highly available hosted-engine rollout (which I use).


As far as I know, backup-volfile-servers is the recommended way to keep your filesystem mountable in case of a server failure. While the fs is mounted, gluster will provide failover automatically. And you can definitely specify backup-volfile-servers in the storage domain configuration.
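For reference, a sketch of what that storage domain configuration could look like; the host and volume names below are illustrative, not taken from this cluster:

```
# "Mount Options" field of the oVirt storage domain:
backup-volfile-servers=microcloud23:microcloud24

# Equivalent manual mount for testing the same behavior:
# mount -t glusterfs -o backup-volfile-servers=microcloud23:microcloud24 \
#       microcloud21:/data /mnt/test
```

With this set, the gluster client fetches the volfile from a backup server if the primary named in the mount source is unreachable.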
 

I have already used live migration in such a setup. This was done with a pure libvirt/virsh setup and later using OpenNebula.


Yes, but that was based on accessing the gluster volume as a mounted filesystem, not directly... and I would like to exclude that from the list of possible causes.


--


Ralf Schenk
fon +49 (0) 24 05 / 40 83 70
fax +49 (0) 24 05 / 40 83 759
mail rs@databay.de
 
Databay AG
Jens-Otto-Krag-Straße 11
D-52146 Würselen
www.databay.de

Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm. Philipp Hermanns
Aufsichtsratsvorsitzender: Wilhelm Dohmen



_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
