Strahil,
Thanks for the follow-up!
How did you copy the data to another volume?
I have set up another storage domain GLCLNEW1 with a new volume imgnew1.
How would you copy all of the data from the problematic domain GLCL3 with
volume images3 to GLCLNEW1 and volume imgnew1, and preserve all the VMs, VM
disks, settings, etc.?
Remember, all of the regular oVirt disk copy, disk move, and VM export tools
are failing, and my VMs and disks are trapped on domain GLCL3 and volume
images3 right now.
Please let me know.
Thank You For Your Help !
On Sun, Jun 21, 2020 at 8:27 AM Strahil Nikolov <hunter86_bg(a)yahoo.com>
wrote:
Sorry to hear that.
I can say that for me 6.5 was working, while 6.6 didn't, and I upgraded
to 7.0.
In the end, I ended up creating a fresh new volume and physically copying
the data there, then I detached the storage domains and attached the new
ones (which held the old data), but I could afford the downtime.
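(For illustration only, a rough sketch of that "physical copy" step, assuming
both volumes can be FUSE-mounted on one host, nothing writes to the old
volume while the copy runs, and that the mount points below are hypothetical;
adjust server and volume names to your setup:

mkdir -p /mnt/oldvol /mnt/newvol
mount -t glusterfs 192.168.24.18:/images3 /mnt/oldvol
mount -t glusterfs 192.168.24.18:/imgnew1 /mnt/newvol
# -a keeps ownership (vdsm:kvm), -H hard links, -A ACLs, -X xattrs, -S sparse
rsync -aHAXS --progress /mnt/oldvol/ /mnt/newvol/

After a copy like that, the old storage domain gets detached and the new one
attached/imported, as described above.)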
Also, I can say that v7.0 (but not 7.1 or anything later) also worked
without the ACL issue, but it causes some trouble in oVirt - so avoid that
unless you have no other options.
Best Regards,
Strahil Nikolov
On 21 June 2020 at 4:39:46 GMT+03:00, C Williams <cwilliams3320(a)gmail.com>
wrote:
>Hello,
>
>Upgrading didn't help
>
>Still getting ACL errors when trying to use a Virtual Disk from a VM
>
>[root@ov06 bricks]# tail bricks-brick04-images3.log | grep acl
>[2020-06-21 01:33:45.665888] I [MSGID: 139001]
>[posix-acl.c:263:posix_acl_log_permit_denied] 0-images3-access-control:
>client:
>CTX_ID:3697a7f1-44fb-4258-96b0-98cb4137d195-GRAPH_ID:0-PID:6706-HOST:ov06.ntc.srcle.com-PC_NAME:images3-client-0-RECON_NO:-0,
>gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>req(uid:107,gid:107,perm:1,ngrps:3),
>ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>[Permission denied]
>The message "I [MSGID: 139001]
>[posix-acl.c:263:posix_acl_log_permit_denied] 0-images3-access-control:
>client:
>CTX_ID:3697a7f1-44fb-4258-96b0-98cb4137d195-GRAPH_ID:0-PID:6706-HOST:ov06.ntc.srcle.com-PC_NAME:images3-client-0-RECON_NO:-0,
>gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>req(uid:107,gid:107,perm:1,ngrps:3),
>ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>[Permission denied]" repeated 2 times between [2020-06-21
>01:33:45.665888]
>and [2020-06-21 01:33:45.806779]
>
>Thank You For Your Help !
>
>On Sat, Jun 20, 2020 at 8:59 PM C Williams <cwilliams3320(a)gmail.com>
>wrote:
>
>> Hello,
>>
>> Based on the situation, I am planning to upgrade the 3 affected
>hosts.
>>
>> My reasoning is that the hosts/bricks were attached to 6.9 at one
>time.
>>
>> Thanks For Your Help !
>>
>> On Sat, Jun 20, 2020 at 8:38 PM C Williams <cwilliams3320(a)gmail.com>
>> wrote:
>>
>>> Strahil,
>>>
>>> The gluster version on the current 3 gluster hosts is 6.7 (last
>update
>>> 2/26). These 3 hosts provide 1 brick each for the replica 3 volume.
>>>
>>> Earlier I had tried to add 6 additional hosts to the cluster. Those
>new
>>> hosts were 6.9 gluster.
>>>
>>> I attempted to make a new separate volume with 3 bricks provided by
>the 3
>>> new gluster 6.9 hosts. After having many errors from the oVirt
>interface,
>>> I gave up and removed the 6 new hosts from the cluster. That is
>where the
>>> problems started. The intent was to expand the gluster cluster while
>making
>>> 2 new volumes for that cluster. The ovirt compute cluster would
>allow for
>>> efficient VM migration between 9 hosts -- while having separate
>gluster
>>> volumes for safety purposes.
>>>
>>> Looking at the brick logs, I see where there are acl errors starting
>from
>>> the time of the removal of the 6 new hosts.
>>>
>>> Please check out the attached brick log from 6/14-18. The events
>started
>>> on 6/17.
>>>
>>> I wish I had a downgrade path.
>>>
>>> Thank You For The Help !!
>>>
>>> On Sat, Jun 20, 2020 at 7:47 PM Strahil Nikolov
><hunter86_bg(a)yahoo.com>
>>> wrote:
>>>
>>>> Hi ,
>>>>
>>>>
>>>> This one really looks like the ACL bug I was hit with when I
>updated
>>>> from Gluster v6.5 to 6.6 and later from 7.0 to 7.2.
>>>>
>>>> Did you update your setup recently ? Did you upgrade gluster also ?
>>>>
>>>> You have to check the gluster logs in order to verify that, so you
>can
>>>> try:
>>>>
>>>> 1. Set Gluster logs to trace level (for details check:
>>>>
>
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/h...
>>>> )
>>>> 2. Power up a VM that was already off , or retry the procedure from
>the
>>>> logs you sent.
>>>> 3. Stop the trace level of the logs
>>>> 4. Check libvirt logs on the host that was supposed to power up the
>VM
>>>> (in case a VM was powered on)
>>>> 5. Check the gluster brick logs on all nodes for ACL errors.
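>>>> (A rough sketch of what steps 1, 3 and 5 can look like from the CLI,
>>>> assuming the affected volume is images3 and the default INFO log levels;
>>>> the exact option names are in the document linked above:
>>>>
>>>> gluster volume set images3 diagnostics.brick-log-level TRACE
>>>> gluster volume set images3 diagnostics.client-log-level TRACE
>>>> # reproduce the failure here (power on the VM or retry the copy)
>>>> gluster volume set images3 diagnostics.brick-log-level INFO
>>>> gluster volume set images3 diagnostics.client-log-level INFO
>>>> grep -i posix_acl /var/log/glusterfs/bricks/*.log
>>>>
>>>> The grep at the end is just one way to spot the ACL denials in step 5.)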
>>>> Here is a sample from my old logs:
>>>>
>>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>13:19:41.489047] I
>>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>>>>
>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>>>> [Permission denied]
>>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>13:22:51.818796] I
>>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>>>>
>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>>>> [Permission denied]
>>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>13:24:43.732856] I
>>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>>>>
>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>>>> [Permission denied]
>>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>13:26:50.758178] I
>>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>>>>
>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>>>> [Permission denied]
>>>>
>>>>
>>>> In my case, the workaround was to downgrade the gluster packages on all
>>>> nodes (and reboot each node one by one) if the major version is the same;
>>>> but if you upgraded to v7.X, then you can try v7.0.
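>>>> (Only as an example of what that downgrade can look like on CentOS 7; the
>>>> package glob and target version below are placeholders for whatever your
>>>> repositories actually provide:
>>>>
>>>> yum downgrade 'glusterfs*-6.5*'
>>>> systemctl restart glusterd        # or reboot the node
>>>> gluster volume heal images3 info  # wait until no unhealed entries remain
>>>>
>>>> Do one node at a time and let the heals finish before touching the next.)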
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Saturday, 20 June 2020 at 18:48:42 GMT+3, C Williams <
>>>> cwilliams3320(a)gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Hello,
>>>>
>>>> Here are additional log files as well as a tree of the problematic
>>>> Gluster storage domain. During this time I attempted to copy a
>virtual disk
>>>> to another domain, move a virtual disk to another domain and run a
>VM where
>>>> the virtual hard disk would be used.
>>>>
>>>> The copies/moves failed and the VM went into pause mode when the
>virtual
>>>> HDD was involved.
>>>>
>>>> Please check these out.
>>>>
>>>> Thank You For Your Help !
>>>>
>>>> On Sat, Jun 20, 2020 at 9:54 AM C Williams
><cwilliams3320(a)gmail.com>
>>>> wrote:
>>>> > Strahil,
>>>> >
>>>> > I understand. Please keep me posted.
>>>> >
>>>> > Thanks For The Help !
>>>> >
>>>> > On Sat, Jun 20, 2020 at 4:36 AM Strahil Nikolov
><hunter86_bg(a)yahoo.com>
>>>> wrote:
>>>> >> Hey C Williams,
>>>> >>
>>>> >> sorry for the delay, but I couldn't get some time to check your
>>>> >> logs. Will try a little bit later.
>>>> >>
>>>> >> Best Regards,
>>>> >> Strahil Nikolov
>>>> >>
>>>> >> On 20 June 2020 at 2:37:22 GMT+03:00, C Williams <
>>>> >> cwilliams3320(a)gmail.com> wrote:
>>>> >>>Hello,
>>>> >>>
>>>> >>>Was wanting to follow up on this issue. Users are impacted.
>>>> >>>
>>>> >>>Thank You
>>>> >>>
>>>> >>>On Fri, Jun 19, 2020 at 9:20 AM C Williams
><cwilliams3320(a)gmail.com>
>>>> >>>wrote:
>>>> >>>
>>>> >>>> Hello,
>>>> >>>>
>>>> >>>> Here are the logs (some IPs are changed )
>>>> >>>>
>>>> >>>> ov05 is the SPM
>>>> >>>>
>>>> >>>> Thank You For Your Help !
>>>> >>>>
>>>> >>>> On Thu, Jun 18, 2020 at 11:31 PM Strahil Nikolov
>>>> >>><hunter86_bg(a)yahoo.com>
>>>> >>>> wrote:
>>>> >>>>
>>>> >>>>> Check on the hosts tab, which is your current SPM (last column in
>>>> >>>>> Admin UI).
>>>> >>>>> Then open the /var/log/vdsm/vdsm.log and repeat the operation.
>>>> >>>>> Then provide the log from that host and the engine's log (on the
>>>> >>>>> HostedEngine VM or on your standalone engine).
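>>>> >>>>> (For example, something along these lines, with vdsm.log taken on
>>>> >>>>> the SPM host and engine.log on the engine machine; these are the
>>>> >>>>> usual default locations and may differ on your install:
>>>> >>>>>
>>>> >>>>> tail -f /var/log/vdsm/vdsm.log
>>>> >>>>> tail -f /var/log/ovirt-engine/engine.log
>>>> >>>>>
>>>> >>>>> Keep both running while you repeat the failing operation.)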
>>>> >>>>>
>>>> >>>>> Best Regards,
>>>> >>>>> Strahil Nikolov
>>>> >>>>>
>>>> >>>>> On 18 June 2020 at 23:59:36 GMT+03:00, C Williams
>>>> >>><cwilliams3320(a)gmail.com>
>>>> >>>>> wrote:
>>>> >>>>> >Resending to eliminate email issues
>>>> >>>>> >
>>>> >>>>> >---------- Forwarded message ---------
>>>> >>>>> >From: C Williams
<cwilliams3320(a)gmail.com>
>>>> >>>>> >Date: Thu, Jun 18, 2020 at 4:01 PM
>>>> >>>>> >Subject: Re: [ovirt-users] Fwd: Issues with
Gluster Domain
>>>> >>>>> >To: Strahil Nikolov
<hunter86_bg(a)yahoo.com>
>>>> >>>>> >
>>>> >>>>> >
>>>> >>>>> >Here is output from mount
>>>> >>>>> >
>>>> >>>>> >192.168.24.12:/stor/import0 on
>>>> >>>>>
>/rhev/data-center/mnt/192.168.24.12:_stor_import0
>>>> >>>>> >type nfs4
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.12)
>>>> >>>>> >192.168.24.13:/stor/import1 on
>>>> >>>>>
>/rhev/data-center/mnt/192.168.24.13:_stor_import1
>>>> >>>>> >type nfs4
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.13)
>>>> >>>>> >192.168.24.13:/stor/iso1 on
>>>> >>>>> >/rhev/data-center/mnt/192.168.24.13:_stor_iso1
>>>> >>>>> >type nfs4
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.13)
>>>> >>>>> >192.168.24.13:/stor/export0 on
>>>> >>>>>
>/rhev/data-center/mnt/192.168.24.13:_stor_export0
>>>> >>>>> >type nfs4
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.13)
>>>> >>>>> >192.168.24.15:/images on
>>>> >>>>>
>/rhev/data-center/mnt/glusterSD/192.168.24.15:_images
>>>> >>>>> >type fuse.glusterfs
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>>>> >>>>> >192.168.24.18:/images3 on
>>>> >>>>>
>/rhev/data-center/mnt/glusterSD/192.168.24.18:_images3
>>>> >>>>> >type fuse.glusterfs
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>>>> >>>>> >tmpfs on /run/user/0 type tmpfs
>>>> >>>>>
>(rw,nosuid,nodev,relatime,seclabel,size=13198392k,mode=700)
>>>> >>>>> >[root@ov06 glusterfs]#
>>>> >>>>> >
>>>> >>>>> >Also here is a screenshot of the console
>>>> >>>>> >
>>>> >>>>> >[image: image.png]
>>>> >>>>> >The other domains are up
>>>> >>>>> >
>>>> >>>>> >Import0 and Import1 are NFS . GLCL0 is gluster.
They all are
>>>> >>>running
>>>> >>>>> >VMs
>>>> >>>>> >
>>>> >>>>> >Thank You For Your Help !
>>>> >>>>> >
>>>> >>>>> >On Thu, Jun 18, 2020 at 3:51 PM Strahil
Nikolov
>>>> >>><hunter86_bg(a)yahoo.com>
>>>> >>>>> >wrote:
>>>> >>>>> >
>>>> >>>>> >> I don't see
>'/rhev/data-center/mnt/192.168.24.13:_stor_import1'
>>>> >>>>> >mounted
>>>> >>>>> >> at all .
>>>> >>>>> >> What is the status of all storage domains
?
>>>> >>>>> >>
>>>> >>>>> >> Best Regards,
>>>> >>>>> >> Strahil Nikolov
>>>> >>>>> >>
>>>> >>>>> >> On 18 June 2020 at 21:43:44 GMT+03:00, C
Williams
>>>> >>>>> ><cwilliams3320(a)gmail.com>
>>>> >>>>> >> wrote:
>>>> >>>>> >> > Resending to deal with possible
email issues
>>>> >>>>> >> >
>>>> >>>>> >> >---------- Forwarded message
---------
>>>> >>>>> >> >From: C Williams
<cwilliams3320(a)gmail.com>
>>>> >>>>> >> >Date: Thu, Jun 18, 2020 at 2:07 PM
>>>> >>>>> >> >Subject: Re: [ovirt-users] Issues with
Gluster Domain
>>>> >>>>> >> >To: Strahil Nikolov
<hunter86_bg(a)yahoo.com>
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >> >More
>>>> >>>>> >> >
>>>> >>>>> >> >[root@ov06 ~]# for i in $(gluster
volume list); do echo
>>>> >>>$i;echo;
>>>> >>>>> >> >gluster
>>>> >>>>> >> >volume info $i; echo;echo;gluster
volume status
>>>> >>>>> >$i;echo;echo;echo;done
>>>> >>>>> >> >images3
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >> >Volume Name: images3
>>>> >>>>> >> >Type: Replicate
>>>> >>>>> >> >Volume ID:
0243d439-1b29-47d0-ab39-d61c2f15ae8b
>>>> >>>>> >> >Status: Started
>>>> >>>>> >> >Snapshot Count: 0
>>>> >>>>> >> >Number of Bricks: 1 x 3 = 3
>>>> >>>>> >> >Transport-type: tcp
>>>> >>>>> >> >Bricks:
>>>> >>>>> >> >Brick1:
192.168.24.18:/bricks/brick04/images3
>>>> >>>>> >> >Brick2:
192.168.24.19:/bricks/brick05/images3
>>>> >>>>> >> >Brick3:
192.168.24.20:/bricks/brick06/images3
>>>> >>>>> >> >Options Reconfigured:
>>>> >>>>> >> >performance.client-io-threads: on
>>>> >>>>> >> >nfs.disable: on
>>>> >>>>> >> >transport.address-family: inet
>>>> >>>>> >> >user.cifs: off
>>>> >>>>> >> >auth.allow: *
>>>> >>>>> >> >performance.quick-read: off
>>>> >>>>> >> >performance.read-ahead: off
>>>> >>>>> >> >performance.io-cache: off
>>>> >>>>> >> >performance.low-prio-threads: 32
>>>> >>>>> >> >network.remote-dio: off
>>>> >>>>> >> >cluster.eager-lock: enable
>>>> >>>>> >> >cluster.quorum-type: auto
>>>> >>>>> >> >cluster.server-quorum-type: server
>>>> >>>>> >> >cluster.data-self-heal-algorithm:
full
>>>> >>>>> >> >cluster.locking-scheme: granular
>>>> >>>>> >> >cluster.shd-max-threads: 8
>>>> >>>>> >> >cluster.shd-wait-qlength: 10000
>>>> >>>>> >> >features.shard: on
>>>> >>>>> >> >cluster.choose-local: off
>>>> >>>>> >> >client.event-threads: 4
>>>> >>>>> >> >server.event-threads: 4
>>>> >>>>> >> >storage.owner-uid: 36
>>>> >>>>> >> >storage.owner-gid: 36
>>>> >>>>> >> >performance.strict-o-direct: on
>>>> >>>>> >> >network.ping-timeout: 30
>>>> >>>>> >> >cluster.granular-entry-heal: enable
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >> >Status of volume: images3
>>>> >>>>> >> >Gluster process
TCP Port
>RDMA Port
>>>> >>>>> >Online
>>>> >>>>> >> > Pid
>>>> >>>>> >>
>>>> >>>>> >>
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>>------------------------------------------------------------------------------
>>>> >>>>> >> >Brick
192.168.24.18:/bricks/brick04/images3 49152 0
>>>>
>>>> >>>Y
>>>> >>>>> >> >6666
>>>> >>>>> >> >Brick
192.168.24.19:/bricks/brick05/images3 49152 0
>>>>
>>>> >>>Y
>>>> >>>>> >> >6779
>>>> >>>>> >> >Brick
192.168.24.20:/bricks/brick06/images3 49152 0
>>>>
>>>> >>>Y
>>>> >>>>> >> >7227
>>>> >>>>> >> >Self-heal Daemon on localhost
N/A N/A
>>>>
>>>> >>>Y
>>>> >>>>> >> >6689
>>>> >>>>> >> >Self-heal Daemon on
ov07.ntc.srcle.com
N/A N/A
>>>>
>>>> >>>Y
>>>> >>>>> >> >6802
>>>> >>>>> >> >Self-heal Daemon on
ov08.ntc.srcle.com
N/A N/A
>>>>
>>>> >>>Y
>>>> >>>>> >> >7250
>>>> >>>>> >> >
>>>> >>>>> >> >Task Status of Volume images3
>>>> >>>>> >>
>>>> >>>>> >>
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>>------------------------------------------------------------------------------
>>>> >>>>> >> >There are no active volume tasks
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >> >
>>>> >>>>> >> >[root@ov06 ~]# ls -l
/rhev/data-center/mnt/glusterSD/
>>>> >>>>> >> >total 16
>>>> >>>>> >> >drwxr-xr-x. 5 vdsm kvm 8192 Jun 18
14:04
>192.168.24.15:_images
>>>> >>>>> >> >drwxr-xr-x. 5 vdsm kvm 8192 Jun 18
14:05 192.168.24.18:
>>>> _images3
>>>> >>>>> >> >[root@ov06 ~]#
>>>> >>>>> >> >
>>>> >>>>> >> >On Thu, Jun 18, 2020 at 2:03 PM C
Williams
>>>> >>><cwilliams3320(a)gmail.com>
>>>> >>>>> >> >wrote:
>>>> >>>>> >> >
>>>> >>>>> >> >> Strahil,
>>>> >>>>> >> >>
>>>> >>>>> >> >> Here you go -- Thank You For Your
Help !
>>>> >>>>> >> >>
>>>> >>>>> >> >> BTW -- I can write a test file to
gluster and it
>replicates
>>>> >>>>> >properly.
>>>> >>>>> >> >> Thinking something about the
oVirt Storage Domain ?
>>>> >>>>> >> >>
>>>> >>>>> >> >> [root@ov08 ~]# gluster pool list
>>>> >>>>> >> >> UUID
Hostname
>>>> >>>>> >State
>>>> >>>>> >> >>
5b40c659-d9ab-43c3-9af8-18b074ea0b83 ov06
>>>> >>>>> >> >Connected
>>>> >>>>> >> >>
36ce5a00-6f65-4926-8438-696944ebadb5
>ov07.ntc.srcle.com
>>>> >>>>> >> >Connected
>>>> >>>>> >> >>
c7e7abdb-a8f4-4842-924c-e227f0db1b29 localhost
>>>> >>>>> >> >Connected
>>>> >>>>> >> >> [root@ov08 ~]# gluster volume
list
>>>> >>>>> >> >> images3
>>>> >>>>> >> >>
>>>> >>>>> >> >> On Thu, Jun 18, 2020 at 1:13 PM
Strahil Nikolov
>>>> >>>>> >> ><hunter86_bg(a)yahoo.com>
>>>> >>>>> >> >> wrote:
>>>> >>>>> >> >>
>>>> >>>>> >> >>> Log to the oVirt cluster and
provide the output of:
>>>> >>>>> >> >>> gluster pool list
>>>> >>>>> >> >>> gluster volume list
>>>> >>>>> >> >>> for i in $(gluster volume
list); do echo $i;echo;
>gluster
>>>> >>>>> >volume
>>>> >>>>> >> >info
>>>> >>>>> >> >>> $i; echo;echo;gluster volume
status
>$i;echo;echo;echo;done
>>>> >>>>> >> >>>
>>>> >>>>> >> >>> ls -l
/rhev/data-center/mnt/glusterSD/
>>>> >>>>> >> >>>
>>>> >>>>> >> >>> Best Regards,
>>>> >>>>> >> >>> Strahil Nikolov
>>>> >>>>> >> >>>
>>>> >>>>> >> >>>
>>>> >>>>> >> >>> On 18 June 2020 at 19:17:46
GMT+03:00, C Williams
>>>> >>>>> >> ><cwilliams3320(a)gmail.com>
>>>> >>>>> >> >>> wrote:
>>>> >>>>> >> >>> >Hello,
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >I recently added 6 hosts
to an existing oVirt
>>>> >>>compute/gluster
>>>> >>>>> >> >cluster.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >Prior to this attempted
addition, my cluster had 3
>>>> >>>Hypervisor
>>>> >>>>> >hosts
>>>> >>>>> >> >and
>>>> >>>>> >> >>> >3
>>>> >>>>> >> >>> >gluster bricks which made
up a single gluster volume
>>>> >>>(replica 3
>>>> >>>>> >> >volume)
>>>> >>>>> >> >>> >. I
>>>> >>>>> >> >>> >added the additional
hosts and made a brick on 3 of
>the new
>>>> >>>>> >hosts
>>>> >>>>> >> >and
>>>> >>>>> >> >>> >attempted to make a new
replica 3 volume. I had
>difficulty
>>>> >>>>> >> >creating
>>>> >>>>> >> >>> >the
>>>> >>>>> >> >>> >new volume. So, I decided
that I would make a new
>>>> >>>>> >compute/gluster
>>>> >>>>> >> >>> >cluster
>>>> >>>>> >> >>> >for each set of 3 new
hosts.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >I removed the 6 new hosts
from the existing oVirt
>>>> >>>>> >Compute/Gluster
>>>> >>>>> >> >>> >Cluster
>>>> >>>>> >> >>> >leaving the 3 original
hosts in place with their
>bricks. At
>>>> >>>that
>>>> >>>>> >> >point
>>>> >>>>> >> >>> >my
>>>> >>>>> >> >>> >original bricks went down
and came back up . The
>volume
>>>> >>>showed
>>>> >>>>> >> >entries
>>>> >>>>> >> >>> >that
>>>> >>>>> >> >>> >needed healing. At that
point I ran gluster volume
>heal
>>>> >>>images3
>>>> >>>>> >> >full,
>>>> >>>>> >> >>> >etc.
>>>> >>>>> >> >>> >The volume shows no
unhealed entries. I also
>corrected some
>>>> >>>peer
>>>> >>>>> >> >>> >errors.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >However, I am unable to
copy disks, move disks to
>another
>>>> >>>>> >domain,
>>>> >>>>> >> >>> >export
>>>> >>>>> >> >>> >disks, etc. It appears
that the engine cannot locate
>disks
>>>> >>>>> >properly
>>>> >>>>> >> >and
>>>> >>>>> >> >>> >I
>>>> >>>>> >> >>> >get storage I/O errors.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >I have detached and
removed the oVirt Storage Domain.
>I
>>>> >>>>> >reimported
>>>> >>>>> >> >the
>>>> >>>>> >> >>> >domain and imported 2
VMs, But the VM disks exhibit
>the
>>>> same
>>>> >>>>> >> >behaviour
>>>> >>>>> >> >>> >and
>>>> >>>>> >> >>> >won't run from the
hard disk.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >I get errors such as
this
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >VDSM ov05 command
HSMGetAllTasksStatusesVDS failed:
>low
>>>> >>>level
>>>> >>>>> >Image
>>>> >>>>> >> >>> >copy
>>>> >>>>> >> >>> >failed: ("Command
['/usr/bin/qemu-img', 'convert',
>'-p',
>>>> >>>'-t',
>>>> >>>>> >> >'none',
>>>> >>>>> >> >>> >'-T',
'none', '-f', 'raw',
>>>> >>>>> >> >>>
>u'/rhev/data-center/mnt/glusterSD/192.168.24.18:
>>>> >>>>> >> >>>
>>>> >>>>> >>
>>>> >>>>> >>
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>>_images3/5fe3ad3f-2d21-404c-832e-4dc7318ca10d/images/3ea5afbd-0fe0-4c09-8d39-e556c66a8b3d/fe6eab63-3b22-4815-bfe6-4a0ade292510',
>>>> >>>>> >> >>> >'-O',
'raw',
>>>> >>>>> >> >>>
>u'/rhev/data-center/mnt/192.168.24.13:
>>>> >>>>> >> >>>
>>>> >>>>> >>
>>>> >>>>> >>
>>>> >>>>>
>>>> >>>>>
>>>>
>>>>
>>>>>>_stor_import1/1ab89386-a2ba-448b-90ab-bc816f55a328/images/f707a218-9db7-4e23-8bbd-9b12972012b6/d6591ec5-3ede-443d-bd40-93119ca7c7d5']
>>>> >>>>> >> >>> >failed with rc=1
out='' err=bytearray(b'qemu-img:
>error
>>>> >>>while
>>>> >>>>> >> >reading
>>>> >>>>> >> >>> >sector 135168: Transport
endpoint is not
>>>> >>>connected\\nqemu-img:
>>>> >>>>> >> >error
>>>> >>>>> >> >>> >while
>>>> >>>>> >> >>> >reading sector 131072:
Transport endpoint is not
>>>> >>>>> >> >connected\\nqemu-img:
>>>> >>>>> >> >>> >error while reading
sector 139264: Transport endpoint
>is
>>>> not
>>>> >>>>> >> >>> >connected\\nqemu-img:
error while reading sector
>143360:
>>>> >>>>> >Transport
>>>> >>>>> >> >>> >endpoint
>>>> >>>>> >> >>> >is not
connected\\nqemu-img: error while reading
>sector
>>>> >>>147456:
>>>> >>>>> >> >>> >Transport
>>>> >>>>> >> >>> >endpoint is not
connected\\nqemu-img: error while
>reading
>>>> >>>sector
>>>> >>>>> >> >>> >155648:
>>>> >>>>> >> >>> >Transport endpoint is not
connected\\nqemu-img: error
>while
>>>> >>>>> >reading
>>>> >>>>> >> >>> >sector
>>>> >>>>> >> >>> >151552: Transport
endpoint is not
>connected\\nqemu-img:
>>>> >>>error
>>>> >>>>> >while
>>>> >>>>> >> >>> >reading
>>>> >>>>> >> >>> >sector 159744: Transport
endpoint is not
>connected\\n')",)
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >oVirt version is
4.3.82-1.el7
>>>> >>>>> >> >>> >OS CentOS Linux release
7.7.1908 (Core)
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >The Gluster Cluster has
been working very well until
>this
>>>> >>>>> >incident.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >Please help.
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >Thank You
>>>> >>>>> >> >>> >
>>>> >>>>> >> >>> >Charles Williams
>>>> >>>>> >> >>>
>>>> >>>>> >> >>
>>>> >>>>> >>
>>>> >>>>>
>>>> >>>>
>>>> >>
>>>> >
>>>> _______________________________________________
>>>> Users mailing list -- users(a)ovirt.org
>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>> Privacy Statement:
https://www.ovirt.org/privacy-policy.html
>>>> oVirt Code of Conduct:
>>>>
https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>>
>
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YY3VUKEJLI7...
>>>>
>>>