<div dir="ltr"><div><div>Hi Bill,<br><br></div>After glusterfs 3.7.11, four or five bugs in the sharding and replicate modules were found and fixed, and some of them caused VMs to pause. Could you share the glusterfs client logs from around the time the issue was seen? This will help me confirm whether it&#39;s one of those issues, or debug further if it turns out to be a new one.<br><br></div><div>-Krutika<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 24, 2016 at 10:02 AM, Sahina Bose <span dir="ltr">&lt;<a href="mailto:sabose@redhat.com" target="_blank">sabose@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    Can you post the gluster mount logs from the node where the paused VM
    was running (under
    /var/log/glusterfs/rhev-data-center-mnt-glusterSD&lt;mount-path&gt;.log)?
    <br>
    Which version of glusterfs are you running?<div><div class="h5"><br>
    <br>
    <div>On 06/24/2016 07:49 AM, Bill Bill
      wrote:<br>
    </div>
    </div></div><blockquote type="cite"><div><div class="h5">
      
      
      
      <div>
        <p class="MsoNormal">Hello,</p>
        <p class="MsoNormal"><u></u> <u></u></p>
        <p class="MsoNormal">I have 3 nodes running both oVirt and Gluster
          on 4 SSDs each. At the moment, there are two physical NICs:
          one has public internet access and the other is on a non-routable
          network used for ovirtmgmt &amp; gluster.</p>
        <p class="MsoNormal"><u></u> <u></u></p>
        <p class="MsoNormal">In the logical networks, I have selected
          gluster for the non-routable network running ovirtmgmt and
          gluster. However, two VMs randomly pause for what seems like
          no reason. They can both be resumed without issue.</p>
        <p class="MsoNormal"><u></u> <u></u></p>
        <p class="MsoNormal">One test VM has 4GB of memory and a small
          disk – no problems with this one. Two others have 800GB disks
          and 32GB of RAM – both VMs exhibit the same issue.</p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>I also see these in the oVirt dashboard:<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>Failed to update OVF disks
            9e60328d-29af-4533-84f9-633d87f548a7, OVF data isn&#39;t updated
            on those OVF stores (Data Center xxxxx, Storage Domain
            sr-volume01).<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>Jun 23, 2016 9:54:03 PM<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>VDSM command failed: Could not acquire
            resource. Probably resource factory threw an exception.: ()<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>///////////////<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>VM xxxxx has been paused due to unknown
            storage error.<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>///////////////<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>In the error log on the engine, I see
            these:<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>ERROR
            [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
            (ForkJoinPool-1-worker-7) [10caf93e] Correlation ID: null,
            Call Stack: null, Custom Event ID: -1, Message: VM xxxxxx
            has been paused due to unknown storage error.<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>INFO 
            [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
            (ForkJoinPool-1-worker-11) [10caf93e] Correlation ID: null,
            Call Stack: null, Custom Event ID: -1, Message: VM xxxxxx
            has recovered from paused back to up.<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>///////////////<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>Hostnames are all defined locally in /etc/hosts on
            all servers, and they resolve without issue from each
            host.<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>//////////////<u></u><u></u></span></p>
        <p class="MsoNormal"><span><u></u> <u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,611 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3435:/mnt/data/sr-volume01&#39; of volume
            &#39;93e36cdc-ab1b-41ec-ac7f-966cf3856b59&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,614 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3637:/mnt/data/sr-volume01&#39; of volume
            &#39;93e36cdc-ab1b-41ec-ac7f-966cf3856b59&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,616 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3839:/mnt/data/sr-volume01&#39; of volume
            &#39;93e36cdc-ab1b-41ec-ac7f-966cf3856b59&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,618 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3435:/mnt/data/distributed&#39; of volume
            &#39;b887b05e-2ea6-496e-9552-155d658eeaa6&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,620 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3637:/mnt/data/distributed&#39; of volume
            &#39;b887b05e-2ea6-496e-9552-155d658eeaa6&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,622 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3839:/mnt/data/distributed&#39; of volume
            &#39;b887b05e-2ea6-496e-9552-155d658eeaa6&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,624 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3435:/mnt/data/iso&#39; of volume
            &#39;89f32457-c8c3-490e-b491-16dd27de0073&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,626 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3637:/mnt/data/iso&#39; of volume
            &#39;89f32457-c8c3-490e-b491-16dd27de0073&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,628 WARN 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] Could not
            associate brick &#39;ovirt3839:/mnt/data/iso&#39; of volume
            &#39;89f32457-c8c3-490e-b491-16dd27de0073&#39; with correct network
            as no gluster network found in cluster
            &#39;75bd64de-04b2-4a99-9cd0-b63e919b9aca&#39;<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:08:59,629 INFO 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
            (DefaultQuartzScheduler_Worker-76) [1c1cf4f] FINISH,
            GlusterVolumesListVDSCommand, return:
            {b887b05e-2ea6-496e-9552-155d658eeaa6=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@d8c9039b,
            93e36cdc-ab1b-41ec-ac7f-966cf3856b59=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@fe5ab019,
            89f32457-c8c3-490e-b491-16dd27de0073=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@9a56d633},

            log id: 485a0611<u></u><u></u></span></p>
        <p class="MsoNormal"><span>2016-06-23 22:09:04,645 INFO 
            [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
            (DefaultQuartzScheduler_Worker-89) [] START,
            GlusterVolumesListVDSCommand(HostName = ovirt3839,
            GlusterVolumesListVDSParameters:{runAsync=&#39;true&#39;,
            hostId=&#39;32c500e5-268d-426a-9a4a-108535e67722&#39;}), log id:
            41b6479d<u></u><u></u></span></p>
      </div>
      <br>
      <br>
      </div></div><pre>_______________________________________________
Users mailing list
<a href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a>
<a href="http://lists.ovirt.org/mailman/listinfo/users" target="_blank">http://lists.ovirt.org/mailman/listinfo/users</a>
</pre>
    </blockquote>
    <br>
  </div>

<br></blockquote></div><br></div>