<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small">Nice!... so, I&#39;ll survive a bit more with these issues until the version 3.6.6 gets released...</div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small">Thanks</div><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small">-Luiz</div><div class="gmail_extra"><br><div class="gmail_quote">2016-04-28 4:50 GMT-03:00 Simone Tiraboschi <span dir="ltr">&lt;<a href="mailto:stirabos@redhat.com" target="_blank">stirabos@redhat.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span class="">On Thu, Apr 28, 2016 at 8:32 AM, Sahina Bose &lt;<a href="mailto:sabose@redhat.com">sabose@redhat.com</a>&gt; wrote:<br>
> This seems like the issue reported in
> https://bugzilla.redhat.com/show_bug.cgi?id=1327121
>
> Nir, Simone?

The issue is here:

MainThread::INFO::2016-04-27
03:26:27,185::storage_server::229::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(disconnect_storage_server)
Disconnecting storage server
MainThread::INFO::2016-04-27
03:26:27,816::upgrade::983::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(fix_storage_path)
Fixing storage path in conf file

And it's tracked here: https://bugzilla.redhat.com/1327516

We already have a patch; it will be fixed in 3.6.6.

As far as I saw, this issue only causes a lot of noise in the logs
and some false alerts, but it's basically harmless.
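
To confirm you are hitting this issue and not a real storage problem,
a quick log check should show the pattern (a minimal sketch, assuming
the default hosted-engine HA agent log path on the hosts):

  grep -E 'disconnect_storage_server|fix_storage_path' \
      /var/log/ovirt-hosted-engine-ha/agent.log | tail -n 20

If only these repeating disconnect/fix_storage_path pairs show up
around the alert times, it's the harmless pattern tracked in the bug
above.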
<div class=""><div class="h5"><br>
> On 04/28/2016 05:35 AM, Luiz Claudio Prazeres Goncalves wrote:
>
> Hi everyone,
>
> Until today my environment was fully updated (3.6.5 + CentOS 7.2) with 3
> nodes (the kvm1, kvm2 and kvm3 hosts). I also have 3 external gluster nodes
> (the gluster-root1, gluster1 and gluster2 hosts), replica 3, on which the
> engine storage domain sits (3.7.11, fully updated, CentOS 7.2).
>
> For some weird reason I've been receiving emails from oVirt with
> EngineUnexpectedDown (picture attached) on a more or less daily basis, but
> the engine seems to be working fine and my VMs are up and running normally.
> I've never had any issue accessing the user interface to manage the VMs.
>
> Today I ran "yum update" on the nodes and realised that vdsm was outdated,
> so I updated the kvm hosts and they are now, again, fully updated.
>
> Reviewing the logs, it seems to be an intermittent connectivity issue when
> trying to access the gluster engine storage domain, as you can see below.
> I don't have any network issue in place and I'm 100% sure about it. I have
> another oVirt cluster using the same network, with an engine storage domain
> on top of an iSCSI storage array, and it has no issues.
>
> Here seems to be the issue:
>
> Thread-1111::INFO::2016-04-27
> 23:01:27,864::fileSD::357::Storage.StorageDomain::(validate)
> sdUUID=03926733-1872-4f85-bb21-18dc320560db
> Thread-1111::DEBUG::2016-04-27
> 23:01:27,865::persistentDict::234::Storage.PersistentDict::(refresh) read
> lines (FileMetadataRW)=[]
> Thread-1111::DEBUG::2016-04-27
> 23:01:27,865::persistentDict::252::Storage.PersistentDict::(refresh) Empty
> metadata
> Thread-1111::ERROR::2016-04-27
> 23:01:27,865::task::866::Storage.TaskManager.Task::(_setError)
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 2835, in getStorageDomainInfo
>     dom = self.validateSdUUID(sdUUID)
>   File "/usr/share/vdsm/storage/hsm.py", line 278, in validateSdUUID
>     sdDom.validate()
>   File "/usr/share/vdsm/storage/fileSD.py", line 360, in validate
>     raise se.StorageDomainAccessError(self.sdUUID)
> StorageDomainAccessError: Domain is either partially accessible or entirely
> inaccessible: (u'03926733-1872-4f85-bb21-18dc320560db',)
> Thread-1111::DEBUG::2016-04-27
> 23:01:27,865::task::885::Storage.TaskManager.Task::(_run)
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Task._run:
> d2acf575-1a60-4fa0-a5bb-cd4363636b94
> ('03926733-1872-4f85-bb21-18dc320560db',) {} failed - stopping task
> Thread-1111::DEBUG::2016-04-27
> 23:01:27,865::task::1246::Storage.TaskManager.Task::(stop)
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::stopping in state preparing
> (force False)
> Thread-1111::DEBUG::2016-04-27
> 23:01:27,865::task::993::Storage.TaskManager.Task::(_decref)
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::ref 1 aborting True
> Thread-1111::INFO::2016-04-27
> 23:01:27,865::task::1171::Storage.TaskManager.Task::(prepare)
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::aborting: Task is aborted:
> 'Domain is either partially accessible or entirely inaccessible' - code 379
> Thread-1111::DEBUG::2016-04-27
> 23:01:27,866::task::1176::Storage.TaskManager.Task::(prepare)
> Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Prepare: aborted: Domain is
> either partially accessible or entirely inaccessible
>
> Question: does anyone know what might be happening? I have several gluster
> configs, as you can see below. All the storage domains are using the same
> configs.
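>
> To rule out a real access problem from the host side, a couple of quick
> checks can be run (a minimal sketch; the mount path below assumes the
> default glusterSD mount layout and may differ on your hosts):
>
>   # Is the engine storage domain reachable from the host?
>   ls /rhev/data-center/mnt/glusterSD/gluster1.xyz.com:_engine/
>
>   # Any pending self-heal entries on the engine volume?
>   gluster volume heal engine info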
>
> More information:
>
> I have the "engine" storage domain, the "vmos1" storage domain and the
> "master" storage domain, so everything looks good.
>
> [root@kvm1 vdsm]# vdsClient -s 0 getStorageDomainsList
> 03926733-1872-4f85-bb21-18dc320560db
> 35021ff4-fb95-43d7-92a3-f538273a3c2e
> e306e54e-ca98-468d-bb04-3e8900f8840c
>
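> To map each UUID back to a domain, the same getStorageDomainInfo verb
> that fails in the traceback above can be queried directly, e.g.:
>
>   [root@kvm1 vdsm]# vdsClient -s 0 getStorageDomainInfo 03926733-1872-4f85-bb21-18dc320560db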
>
> Gluster config:
>
> [root@gluster-root1 ~]# gluster volume info
>
> Volume Name: engine
> Type: Replicate
> Volume ID: 64b413d2-c42e-40fd-b356-3e6975e941b0
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1.xyz.com:/gluster/engine/brick1
> Brick2: gluster2.xyz.com:/gluster/engine/brick1
> Brick3: gluster-root1.xyz.com:/gluster/engine/brick1
> Options Reconfigured:
> performance.cache-size: 1GB
> performance.write-behind-window-size: 4MB
> performance.write-behind: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> performance.low-prio-threads: 32
> features.shard-block-size: 512MB
> features.shard: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.readdir-ahead: on
>
> Volume Name: master
> Type: Replicate
> Volume ID: 20164808-7bbe-4eeb-8770-d222c0e0b830
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1.xyz.com:/home/storage/master/brick1
> Brick2: gluster2.xyz.com:/home/storage/master/brick1
> Brick3: gluster-root1.xyz.com:/home/storage/master/brick1
> Options Reconfigured:
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> performance.write-behind: off
> performance.write-behind-window-size: 4MB
> performance.cache-size: 1GB
>
> Volume Name: vmos1
> Type: Replicate
> Volume ID: ea8fb50e-7bc8-4de3-b775-f3976b6b4f13
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1.xyz.com:/gluster/vmos1/brick1
> Brick2: gluster2.xyz.com:/gluster/vmos1/brick1
> Brick3: gluster-root1.xyz.com:/gluster/vmos1/brick1
> Options Reconfigured:
> network.ping-timeout: 60
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> performance.write-behind: off
> performance.write-behind-window-size: 4MB
> performance.cache-size: 1GB
>
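> Since all three volumes use quorum (cluster.quorum-type: auto plus
> server-side quorum), a brick or peer dropping briefly would be enough to
> make the domain flap, so it's worth checking brick and peer state around
> the alert times (standard gluster commands, shown here as a sketch):
>
>   gluster volume status engine
>   gluster peer status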
>
> Attached are all the logs...
>
> Thanks
>
> -Luiz
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users