<div>Hi Simone, I was reviewing the 3.6.6 changelog at the link below, but I was not able to find the bug (<span style="font-size:13px"><a href="https://bugzilla.redhat.com/1327516">https://bugzilla.redhat.com/1327516</a>) listed as fixed. According to Bugzilla, the target really is 3.6.6, so what's wrong?</span></div><div><span style="font-size:13px"><br></span></div><div><span style="font-size:13px"><br></span></div><a href="http://www.ovirt.org/release/3.6.6/">http://www.ovirt.org/release/3.6.6/</a><div><br></div><div><br></div><div>Thanks</div><div>Luiz</div><div><br><div class="gmail_quote"><div dir="ltr">On Thu, Apr 28, 2016 at 11:33, Luiz Claudio Prazeres Goncalves <<a href="mailto:luizcpg@gmail.com">luizcpg@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small">Nice! So I'll live with these issues a bit longer, until version 3.6.6 is released.</div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small">Thanks</div></div><div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small">-Luiz</div></div><div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">2016-04-28 4:50 GMT-03:00 Simone Tiraboschi <span dir="ltr"><<a href="mailto:stirabos@redhat.com" target="_blank">stirabos@redhat.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span>On Thu, Apr 28, 2016 at 8:32 AM, Sahina Bose <<a href="mailto:sabose@redhat.com" target="_blank">sabose@redhat.com</a>> wrote:<br>
> This seems like the issue reported in<br>
> <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1327121" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1327121</a><br>
><br>
> Nir, Simone?<br>
<br>
</span>The issue is here:<br>
MainThread::INFO::2016-04-27<br>
03:26:27,185::storage_server::229::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(disconnect_storage_server)<br>
Disconnecting storage server<br>
MainThread::INFO::2016-04-27<br>
03:26:27,816::upgrade::983::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(fix_storage_path)<br>
Fixing storage path in conf file<br>
<br>
And it's tracked here: <a href="https://bugzilla.redhat.com/1327516" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/1327516</a><br>
<br>
We already have a patch; it will be fixed in 3.6.6.<br>
<br>
As far as I can see, this issue only causes a lot of noise in the logs<br>
and some false alerts, but it's basically harmless.<br>
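<br>
To check how often this pattern shows up in a log, the entries can be split into fields. The following is a minimal sketch, assuming every entry follows the layout seen in the excerpts in this thread (thread::LEVEL::timestamp::module::line::logger::(function) message) and that wrapped entries have been rejoined into one line first. The field names and the parse_line helper are illustrative labels, not part of vdsm or ovirt-hosted-engine-ha.<br>
<pre>
```python
import re
from datetime import datetime

# Field labels are hypothetical; they only name the "::"-separated parts
# visible in the log excerpts quoted in this thread.
FIELDS = ("thread", "level", "timestamp", "module",
          "lineno", "logger", "func", "message")

# One log entry, already rejoined into a single line.
LOG_LINE = re.compile(
    r"^([^:]+)::([A-Z]+)::"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"([^:]+)::(\d+)::([^:]+)::\(([^)]+)\)\s*(.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line is not a log entry
    (for example a traceback continuation line)."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = dict(zip(FIELDS, match.groups()))
    record["lineno"] = int(record["lineno"])
    record["timestamp"] = datetime.strptime(
        record["timestamp"], "%Y-%m-%d %H:%M:%S,%f")
    return record


def entries_for(lines, func):
    """Yield parsed entries logged from the given function name."""
    for line in lines:
        record = parse_line(line)
        if record is not None and record["func"] == func:
            yield record
```
</pre>
For example, passing the agent log lines and "fix_storage_path" to entries_for would list each time the upgrade path logged "Fixing storage path in conf file", together with its timestamp.<br>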
<div><div><br>
> On 04/28/2016 05:35 AM, Luiz Claudio Prazeres Goncalves wrote:<br>
><br>
><br>
> Hi everyone,<br>
><br>
> Until today my environment was fully updated (3.6.5 + CentOS 7.2) with 3 nodes<br>
> (hosts kvm1, kvm2 and kvm3). I also have 3 external gluster nodes<br>
> (hosts gluster-root1, gluster1 and gluster2), replica 3, on top of which the engine<br>
> storage domain sits (3.7.11, fully updated, + CentOS 7.2).<br>
><br>
> For some weird reason I've been receiving emails from oVirt with<br>
> EngineUnexpectedDown (attached picture) more or less daily, but<br>
> the engine seems to be working fine and my VMs are up and running normally.<br>
> I've never had any issue accessing the User Interface to manage the VMs.<br>
><br>
> Today I ran "yum update" on the nodes and realised that vdsm was outdated,<br>
> so I updated the kvm hosts and they are now fully updated again.<br>
><br>
><br>
> Reviewing the logs, it seems to be an intermittent connectivity issue when<br>
> trying to access the gluster engine storage domain, as you can see below. I<br>
> don't have any network issue in place and I'm 100% sure about it. I have<br>
> another oVirt cluster using the same network, with an engine storage<br>
> domain on top of an iSCSI storage array, with no issues.<br>
><br>
> Here seems to be the issue:<br>
><br>
> Thread-1111::INFO::2016-04-27<br>
> 23:01:27,864::fileSD::357::Storage.StorageDomain::(validate)<br>
> sdUUID=03926733-1872-4f85-bb21-18dc320560db<br>
><br>
> Thread-1111::DEBUG::2016-04-27<br>
> 23:01:27,865::persistentDict::234::Storage.PersistentDict::(refresh) read<br>
> lines (FileMetadataRW)=[]<br>
><br>
> Thread-1111::DEBUG::2016-04-27<br>
> 23:01:27,865::persistentDict::252::Storage.PersistentDict::(refresh) Empty<br>
> metadata<br>
><br>
> Thread-1111::ERROR::2016-04-27<br>
> 23:01:27,865::task::866::Storage.TaskManager.Task::(_setError)<br>
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Unexpected error<br>
><br>
> Traceback (most recent call last):<br>
><br>
> File "/usr/share/vdsm/storage/task.py", line 873, in _run<br>
><br>
> return fn(*args, **kargs)<br>
><br>
> File "/usr/share/vdsm/logUtils.py", line 49, in wrapper<br>
><br>
> res = f(*args, **kwargs)<br>
><br>
> File "/usr/share/vdsm/storage/hsm.py", line 2835, in getStorageDomainInfo<br>
><br>
> dom = self.validateSdUUID(sdUUID)<br>
><br>
> File "/usr/share/vdsm/storage/hsm.py", line 278, in validateSdUUID<br>
><br>
> sdDom.validate()<br>
><br>
> File "/usr/share/vdsm/storage/fileSD.py", line 360, in validate<br>
><br>
> raise se.StorageDomainAccessError(self.sdUUID)<br>
><br>
> StorageDomainAccessError: Domain is either partially accessible or entirely<br>
> inaccessible: (u'03926733-1872-4f85-bb21-18dc320560db',)<br>
><br>
> Thread-1111::DEBUG::2016-04-27<br>
> 23:01:27,865::task::885::Storage.TaskManager.Task::(_run)<br>
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Task._run:<br>
> d2acf575-1a60-4fa0-a5bb-cd4363636b94<br>
> ('03926733-1872-4f85-bb21-18dc320560db',) {} failed - stopping task<br>
><br>
> Thread-1111::DEBUG::2016-04-27<br>
> 23:01:27,865::task::1246::Storage.TaskManager.Task::(stop)<br>
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::stopping in state preparing<br>
> (force False)<br>
><br>
> Thread-1111::DEBUG::2016-04-27<br>
> 23:01:27,865::task::993::Storage.TaskManager.Task::(_decref)<br>
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::ref 1 aborting True<br>
><br>
> Thread-1111::INFO::2016-04-27<br>
> 23:01:27,865::task::1171::Storage.TaskManager.Task::(prepare)<br>
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::aborting: Task is aborted:<br>
> 'Domain is either partially accessible or entirely inaccessible' - code 379<br>
><br>
> Thread-1111::DEBUG::2016-04-27<br>
> 23:01:27,866::task::1176::Storage.TaskManager.Task::(prepare)<br>
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Prepare: aborted: Domain is<br>
> either partially accessible or entirely inaccessible<br>
><br>
><br>
> Question: Does anyone know what might be happening? I have several gluster<br>
> options set, as you can see below. All the storage domains use the same<br>
> settings.<br>
><br>
><br>
> More information:<br>
><br>
> I have the "engine" storage domain, "vmos1" storage domain and "master"<br>
> storage domain, so everything looks good.<br>
><br>
> [root@kvm1 vdsm]# vdsClient -s 0 getStorageDomainsList<br>
><br>
> 03926733-1872-4f85-bb21-18dc320560db<br>
><br>
> 35021ff4-fb95-43d7-92a3-f538273a3c2e<br>
><br>
> e306e54e-ca98-468d-bb04-3e8900f8840c<br>
><br>
><br>
> Gluster config:<br>
><br>
> [root@gluster-root1 ~]# gluster volume info<br>
><br>
><br>
><br>
> Volume Name: engine<br>
><br>
> Type: Replicate<br>
><br>
> Volume ID: 64b413d2-c42e-40fd-b356-3e6975e941b0<br>
><br>
> Status: Started<br>
><br>
> Number of Bricks: 1 x 3 = 3<br>
><br>
> Transport-type: tcp<br>
><br>
> Bricks:<br>
><br>
> Brick1: gluster1.xyz.com:/gluster/engine/brick1<br>
><br>
> Brick2: gluster2.xyz.com:/gluster/engine/brick1<br>
><br>
> Brick3: gluster-root1.xyz.com:/gluster/engine/brick1<br>
><br>
> Options Reconfigured:<br>
><br>
> performance.cache-size: 1GB<br>
><br>
> performance.write-behind-window-size: 4MB<br>
><br>
> performance.write-behind: off<br>
><br>
> performance.quick-read: off<br>
><br>
> performance.read-ahead: off<br>
><br>
> performance.io-cache: off<br>
><br>
> performance.stat-prefetch: off<br>
><br>
> cluster.eager-lock: enable<br>
><br>
> cluster.quorum-type: auto<br>
><br>
> network.remote-dio: enable<br>
><br>
> cluster.server-quorum-type: server<br>
><br>
> cluster.data-self-heal-algorithm: full<br>
><br>
> performance.low-prio-threads: 32<br>
><br>
> features.shard-block-size: 512MB<br>
><br>
> features.shard: on<br>
><br>
> storage.owner-gid: 36<br>
><br>
> storage.owner-uid: 36<br>
><br>
> performance.readdir-ahead: on<br>
><br>
><br>
> Volume Name: master<br>
><br>
> Type: Replicate<br>
><br>
> Volume ID: 20164808-7bbe-4eeb-8770-d222c0e0b830<br>
><br>
> Status: Started<br>
><br>
> Number of Bricks: 1 x 3 = 3<br>
><br>
> Transport-type: tcp<br>
><br>
> Bricks:<br>
><br>
> Brick1: gluster1.xyz.com:/home/storage/master/brick1<br>
><br>
> Brick2: gluster2.xyz.com:/home/storage/master/brick1<br>
><br>
> Brick3: gluster-root1.xyz.com:/home/storage/master/brick1<br>
><br>
> Options Reconfigured:<br>
><br>
> performance.readdir-ahead: on<br>
><br>
> performance.quick-read: off<br>
><br>
> performance.read-ahead: off<br>
><br>
> performance.io-cache: off<br>
><br>
> performance.stat-prefetch: off<br>
><br>
> cluster.eager-lock: enable<br>
><br>
> network.remote-dio: enable<br>
><br>
> cluster.quorum-type: auto<br>
><br>
> cluster.server-quorum-type: server<br>
><br>
> storage.owner-uid: 36<br>
><br>
> storage.owner-gid: 36<br>
><br>
> features.shard: on<br>
><br>
> features.shard-block-size: 512MB<br>
><br>
> performance.low-prio-threads: 32<br>
><br>
> cluster.data-self-heal-algorithm: full<br>
><br>
> performance.write-behind: off<br>
><br>
> performance.write-behind-window-size: 4MB<br>
><br>
> performance.cache-size: 1GB<br>
><br>
><br>
> Volume Name: vmos1<br>
><br>
> Type: Replicate<br>
><br>
> Volume ID: ea8fb50e-7bc8-4de3-b775-f3976b6b4f13<br>
><br>
> Status: Started<br>
><br>
> Number of Bricks: 1 x 3 = 3<br>
><br>
> Transport-type: tcp<br>
><br>
> Bricks:<br>
><br>
> Brick1: gluster1.xyz.com:/gluster/vmos1/brick1<br>
><br>
> Brick2: gluster2.xyz.com:/gluster/vmos1/brick1<br>
><br>
> Brick3: gluster-root1.xyz.com:/gluster/vmos1/brick1<br>
><br>
> Options Reconfigured:<br>
><br>
> network.ping-timeout: 60<br>
><br>
> performance.readdir-ahead: on<br>
><br>
> performance.quick-read: off<br>
><br>
> performance.read-ahead: off<br>
><br>
> performance.io-cache: off<br>
><br>
> performance.stat-prefetch: off<br>
><br>
> cluster.eager-lock: enable<br>
><br>
> network.remote-dio: enable<br>
><br>
> cluster.quorum-type: auto<br>
><br>
> cluster.server-quorum-type: server<br>
><br>
> storage.owner-uid: 36<br>
><br>
> storage.owner-gid: 36<br>
><br>
> features.shard: on<br>
><br>
> features.shard-block-size: 512MB<br>
><br>
> performance.low-prio-threads: 32<br>
><br>
> cluster.data-self-heal-algorithm: full<br>
><br>
> performance.write-behind: off<br>
><br>
> performance.write-behind-window-size: 4MB<br>
><br>
> performance.cache-size: 1GB<br>
><br>
><br>
><br>
> All the logs are attached.<br>
><br>
><br>
><br>
> Thanks<br>
><br>
> -Luiz<br>
><br>
><br>
><br>
><br>
> _______________________________________________<br>
> Users mailing list<br>
> <a href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br>
> <a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a><br>
><br>
><br>
</div></div></blockquote></div><br></div></div></blockquote></div></div>
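<div>The original message says all storage domains use the same gluster settings. One way to check that claim is to compare the "Options Reconfigured" sections of the "gluster volume info" output quoted above. The following is a minimal sketch, assuming the text layout shown in this thread (a "Volume Name:" line starts each volume and "Options Reconfigured:" starts its option list). The helpers parse_volume_info and diff_options are hypothetical names, not part of the gluster tooling.</div>
<pre>
```python
def parse_volume_info(text):
    """Return {volume_name: {option: value}} from `gluster volume info` text.

    Leading "> " quote markers and blank lines are ignored, so text pasted
    from a quoted email works as well as raw command output.
    """
    volumes = {}
    options = None
    in_options = False
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Volume Name":
            options = volumes.setdefault(value, {})
            in_options = False
        elif key == "Options Reconfigured":
            in_options = True
        elif in_options and options is not None:
            options[key] = value
    return volumes


def diff_options(volumes):
    """Return {option: {volume: value or None}} for every option whose
    value is not identical on all volumes (None means "not set")."""
    all_keys = set()
    for opts in volumes.values():
        all_keys.update(opts)
    differing = {}
    for key in sorted(all_keys):
        values = {name: opts.get(key) for name, opts in volumes.items()}
        if len(set(values.values())) != 1:
            differing[key] = values
    return differing
```
</pre>
<div>Applied to the three listings quoted above, this should report only "network.ping-timeout", which is set to 60 on vmos1 and absent from engine and master. The other reconfigured options match across the three volumes.</div>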