<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;font-size:small"><br></div><div class="gmail_quote"><div dir="ltr"><div style="font-family:verdana,sans-serif;font-size:small">Hi everyone,</div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small">Until today my environment was fully up to date (oVirt 3.6.5 + CentOS 7.2), with 3 nodes (hosts kvm1, kvm2 and kvm3). I also have 3 external Gluster nodes (hosts gluster-root1, gluster1 and gluster2), replica 3, on top of which the engine storage domain sits (Gluster 3.7.11 fully updated + CentOS 7.2).</div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small"><font color="#ff0000">For some weird reason I&#39;ve been receiving EngineUnexpectedDown emails from oVirt (picture attached) more or less daily, but the engine seems to be working fine and my VMs are up and running normally. I&#39;ve never had any trouble accessing the user interface to manage the VMs.</font></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small">Today I ran &quot;yum update&quot; on the nodes and realised that vdsm was outdated, so I updated the KVM hosts and they are now, again, fully up to date.</div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif"><font size="4">Reviewing the logs, it seems to be an intermittent connectivity issue when accessing the Gluster engine storage domain, as you can see below. I don&#39;t have any network issue in place and I&#39;m 100% sure about it:
another oVirt cluster on the same network, with its engine storage domain on top of an iSCSI storage array, has no issues.</font></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif"><b><font size="4">Here seems to be the issue:</font></b></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div>







<p><span>Thread-1111::INFO::2016-04-27 23:01:27,864::fileSD::357::Storage.StorageDomain::(validate) sdUUID=03926733-1872-4f85-bb21-18dc320560db</span></p>
<p><span>Thread-1111::DEBUG::2016-04-27 23:01:27,865::persistentDict::234::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=[]</span></p>
<p><span>Thread-1111::DEBUG::2016-04-27 23:01:27,865::persistentDict::252::Storage.PersistentDict::(refresh) Empty metadata</span></p>
<p><span>Thread-1111::</span><span>ERROR</span><span>::2016-04-27 23:01:27,865::task::866::Storage.TaskManager.Task::(_setError) Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Unexpected error</span></p>
<p><span>Traceback (most recent call last):</span></p>
<p><span>  File &quot;/usr/share/vdsm/storage/task.py&quot;, line 873, in _run</span></p>
<p><span>    return fn(*args, **kargs)</span></p>
<p><span>  File &quot;/usr/share/vdsm/logUtils.py&quot;, line 49, in wrapper</span></p>
<p><span>    res = f(*args, **kwargs)</span></p>
<p><span>  File &quot;/usr/share/vdsm/storage/hsm.py&quot;, line 2835, in getStorageDomainInfo</span></p>
<p><span>    dom = self.validateSdUUID(sdUUID)</span></p>
<p><span>  File &quot;/usr/share/vdsm/storage/hsm.py&quot;, line 278, in validateSdUUID</span></p>
<p><span>    sdDom.validate()</span></p>
<p><span>  File &quot;/usr/share/vdsm/storage/fileSD.py&quot;, line 360, in validate</span></p>
<p><span>    raise se.StorageDomainAccessError(self.sdUUID)</span></p>
<p><span>StorageDomainAccessError: Domain is either partially accessible or entirely inaccessible: (u&#39;03926733-1872-4f85-bb21-18dc320560db&#39;,)</span></p>
<p><span>Thread-1111::DEBUG::2016-04-27 23:01:27,865::task::885::Storage.TaskManager.Task::(_run) Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Task._run: d2acf575-1a60-4fa0-a5bb-cd4363636b94 (&#39;03926733-1872-4f85-bb21-18dc320560db&#39;,) {} failed - stopping task</span></p>
<p><span>Thread-1111::DEBUG::2016-04-27 23:01:27,865::task::1246::Storage.TaskManager.Task::(stop) Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::stopping in state preparing (force False)</span></p>
<p><span>Thread-1111::DEBUG::2016-04-27 23:01:27,865::task::993::Storage.TaskManager.Task::(_decref) Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::ref 1 aborting True</span></p>
<p><span>Thread-1111::INFO::2016-04-27 23:01:27,865::task::1171::Storage.TaskManager.Task::(prepare) Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::aborting: Task is aborted: &#39;Domain is either partially accessible or entirely inaccessible&#39; - code 379</span></p>
<div style="font-family:verdana,sans-serif;font-size:small"><span style="font-family:arial,sans-serif">Thread-1111::DEBUG::2016-04-27 23:01:27,866::task::1176::Storage.TaskManager.Task::(prepare) Task=`d2acf575-1a60-4fa0-a5bb-cd4363636b94`::Prepare: aborted: Domain is either partially accessible or entirely inaccessible</span> </div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif"><font size="4"><b>Question: <font color="#ff0000">Does anyone know what might be happening?</font> I have several Gluster options configured, as you can see below; all the storage domains use the same settings.</b></font></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif"><b><font size="4">More information:</font></b></div><div style="font-family:verdana,sans-serif;font-size:small"><br></div><div style="font-family:verdana,sans-serif;font-size:small">I have the &quot;engine&quot;, &quot;vmos1&quot; and &quot;master&quot; storage domains, so everything looks good.</div><div style="font-family:verdana,sans-serif">
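In case it helps with debugging, this is a minimal sketch of how I'm pulling the ERROR lines out of vdsm.log so their timestamps can be lined up against the gluster mount logs (the `vdsm_errors` helper name is mine; the field layout is the one in the excerpt above):

```shell
# Minimal sketch: print timestamp + message for every ERROR line in a vdsm log,
# so the bursts can be correlated with the gluster client logs.
vdsm_errors() {
    # Layout per line: Thread::LEVEL::timestamp::module::lineno::logger::...::message
    grep '::ERROR::' "$1" | awk -F'::' '{print $3, $NF}'
}

# Usage on a host (default vdsm log location):
# vdsm_errors /var/log/vdsm/vdsm.log
```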







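One thing I noticed while pasting the configs below: &quot;vmos1&quot; has network.ping-timeout set to 60, but the &quot;engine&quot; volume does not, so the engine mount is presumably running on the default timeout. Here is a sketch of what I plan to check and try on a gluster node (whether raising the timeout actually cures the EngineUnexpectedDown events is only my assumption):

```shell
# Any bricks offline, or heals pending, on the engine volume?
gluster volume status engine
gluster volume heal engine info

# Assumption on my part: matching vmos1's setting helps the engine mount
# ride out brief brick disconnects (the gluster default is 42 seconds).
gluster volume set engine network.ping-timeout 60
```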
<p style="font-size:small"><span>[root@kvm1 vdsm]# vdsClient -s 0 getStorageDomainsList</span></p>
<p style="font-size:small"><span>03926733-1872-4f85-bb21-18dc320560db</span></p>
<p style="font-size:small"><span>35021ff4-fb95-43d7-92a3-f538273a3c2e</span></p>
<p style="font-size:small"><span>e306e54e-ca98-468d-bb04-3e8900f8840c</span></p><p style="font-size:small"><span><br></span></p><p><span><b><font size="4">Gluster config:</font></b></span></p><p style="font-size:small"><span>[root@gluster-root1 ~]# gluster volume info</span></p><p style="font-size:small"><span> </span></p><p style="font-size:small"><span>Volume Name: engine</span></p><p style="font-size:small"><span>Type: Replicate</span></p><p style="font-size:small"><span>Volume ID: 64b413d2-c42e-40fd-b356-3e6975e941b0</span></p><p style="font-size:small"><span>Status: Started</span></p><p style="font-size:small"><span>Number of Bricks: 1 x 3 = 3</span></p><p style="font-size:small"><span>Transport-type: tcp</span></p><p style="font-size:small"><span>Bricks:</span></p><p style="font-size:small"><span>Brick1: gluster1.xyz.com:/gluster/engine/brick1</span></p><p style="font-size:small"><span>Brick2: gluster2.xyz.com:/gluster/engine/brick1</span></p><p style="font-size:small"><span>Brick3: gluster-root1.xyz.com:/gluster/engine/brick1</span></p><p style="font-size:small"><span>Options Reconfigured:</span></p><p style="font-size:small"><span>performance.cache-size: 1GB</span></p><p style="font-size:small"><span>performance.write-behind-window-size: 4MB</span></p><p style="font-size:small"><span>performance.write-behind: off</span></p><p style="font-size:small"><span>performance.quick-read: off</span></p><p style="font-size:small"><span>performance.read-ahead: off</span></p><p style="font-size:small"><span>performance.io-cache: off</span></p><p style="font-size:small"><span>performance.stat-prefetch: off</span></p><p style="font-size:small"><span>cluster.eager-lock: enable</span></p><p style="font-size:small"><span>cluster.quorum-type: auto</span></p><p style="font-size:small"><span>network.remote-dio: enable</span></p><p style="font-size:small"><span>cluster.server-quorum-type: server</span></p><p style="font-size:small"><span>cluster.data-self-heal-algorithm: 
full</span></p><p style="font-size:small"><span>performance.low-prio-threads: 32</span></p><p style="font-size:small"><span>features.shard-block-size: 512MB</span></p><p style="font-size:small"><span>features.shard: on</span></p><p style="font-size:small"><span>storage.owner-gid: 36</span></p><p style="font-size:small"><span>storage.owner-uid: 36</span></p><p style="font-size:small"><span>





































</span></p><p style="font-size:small"><span>performance.readdir-ahead: on</span></p><p style="font-size:small"><span><br></span></p><p style="font-size:small"><span>Volume Name: master</span></p><p style="font-size:small"><span>Type: Replicate</span></p><p style="font-size:small"><span>Volume ID: 20164808-7bbe-4eeb-8770-d222c0e0b830</span></p><p style="font-size:small"><span>Status: Started</span></p><p style="font-size:small"><span>Number of Bricks: 1 x 3 = 3</span></p><p style="font-size:small"><span>Transport-type: tcp</span></p><p style="font-size:small"><span>Bricks:</span></p><p style="font-size:small"><span>Brick1: gluster1.xyz.com:/home/storage/master/brick1</span></p><p style="font-size:small"><span>Brick2: gluster2.xyz.com:/home/storage/master/brick1</span></p><p style="font-size:small"><span>Brick3: gluster-root1.xyz.com:/home/storage/master/brick1</span></p><p style="font-size:small"><span>Options Reconfigured:</span></p><p style="font-size:small"><span>performance.readdir-ahead: on</span></p><p style="font-size:small"><span>performance.quick-read: off</span></p><p style="font-size:small"><span>performance.read-ahead: off</span></p><p style="font-size:small"><span>performance.io-cache: off</span></p><p style="font-size:small"><span>performance.stat-prefetch: off</span></p><p style="font-size:small"><span>cluster.eager-lock: enable</span></p><p style="font-size:small"><span>network.remote-dio: enable</span></p><p style="font-size:small"><span>cluster.quorum-type: auto</span></p><p style="font-size:small"><span>cluster.server-quorum-type: server</span></p><p style="font-size:small"><span>storage.owner-uid: 36</span></p><p style="font-size:small"><span>storage.owner-gid: 36</span></p><p style="font-size:small"><span>features.shard: on</span></p><p style="font-size:small"><span>features.shard-block-size: 512MB</span></p><p style="font-size:small"><span>performance.low-prio-threads: 32</span></p><p 
style="font-size:small"><span>cluster.data-self-heal-algorithm: full</span></p><p style="font-size:small"><span>performance.write-behind: off</span></p><p style="font-size:small"><span>performance.write-behind-window-size: 4MB</span></p><p style="font-size:small"><span>



































</span></p><p style="font-size:small"><span>performance.cache-size: 1GB</span></p><p style="font-size:small"><span><br></span></p><p style="font-size:small"><span>Volume Name: vmos1</span></p><p style="font-size:small"><span>Type: Replicate</span></p><p style="font-size:small"><span>Volume ID: ea8fb50e-7bc8-4de3-b775-f3976b6b4f13</span></p><p style="font-size:small"><span>Status: Started</span></p><p style="font-size:small"><span>Number of Bricks: 1 x 3 = 3</span></p><p style="font-size:small"><span>Transport-type: tcp</span></p><p style="font-size:small"><span>Bricks:</span></p><p style="font-size:small"><span>Brick1: gluster1.xyz.com:/gluster/vmos1/brick1</span></p><p style="font-size:small"><span>Brick2: gluster2.xyz.com:/gluster/vmos1/brick1</span></p><p style="font-size:small"><span>Brick3: gluster-root1.xyz.com:/gluster/vmos1/brick1</span></p><p style="font-size:small"><span>Options Reconfigured:</span></p><p style="font-size:small"><span>network.ping-timeout: 60</span></p><p style="font-size:small"><span>performance.readdir-ahead: on</span></p><p style="font-size:small"><span>performance.quick-read: off</span></p><p style="font-size:small"><span>performance.read-ahead: off</span></p><p style="font-size:small"><span>performance.io-cache: off</span></p><p style="font-size:small"><span>performance.stat-prefetch: off</span></p><p style="font-size:small"><span>cluster.eager-lock: enable</span></p><p style="font-size:small"><span>network.remote-dio: enable</span></p><p style="font-size:small"><span>cluster.quorum-type: auto</span></p><p style="font-size:small"><span>cluster.server-quorum-type: server</span></p><p style="font-size:small"><span>storage.owner-uid: 36</span></p><p style="font-size:small"><span>storage.owner-gid: 36</span></p><p style="font-size:small"><span>features.shard: on</span></p><p style="font-size:small"><span>features.shard-block-size: 512MB</span></p><p style="font-size:small"><span>performance.low-prio-threads: 32</span></p><p 
style="font-size:small"><span>cluster.data-self-heal-algorithm: full</span></p><p style="font-size:small"><span>performance.write-behind: off</span></p><p style="font-size:small"><span>performance.write-behind-window-size: 4MB</span></p><p style="font-size:small"><span>




































</span></p><p style="font-size:small"><span>performance.cache-size: 1GB</span></p><p style="font-size:small"><span><br></span></p><p style="font-size:small"><span><br></span></p><p style="font-size:small"><span>All the logs are attached...</span></p><p style="font-size:small"><span><br></span></p><p style="font-size:small"><span><br></span></p><p style="font-size:small"><span>Thanks</span></p><span class="HOEnZb"><font color="#888888"><p style="font-size:small"><span>-Luiz</span></p></font></span></div></div>
</div><br></div>