On Sun, Jun 21, 2020 at 3:08 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:

I created a fresh volume (which is not an ovirt sgorage domain), set the original storage domain in maintenance and detached it.
Then I 'cp -a ' the data from the old to the new volume. Next, I just added the new storage domain (the old one was a kind of a 'backup') - pointing to the new volume name.

If you observe issues , I would recommend you to downgrade gluster packages one node at a time . Then you might be able to restore your oVirt operations.

Best Regards,
Strahil Nikolov

На 21 юни 2020 г. 18:01:31 GMT+03:00, C Williams <cwilliams3320@gmail.com> написа:
>Strahil,
>
>Thanks for the follow up !
>
>How did you copy the data to another volume ?
>
>I have set up another storage domain GLCLNEW1 with a new volume imgnew1
>.
>How would you copy all of the data from the problematic domain GLCL3
>with
>volume images3 to GLCLNEW1 and volume imgnew1 and preserve all the VMs,
>VM
>disks, settings, etc. ?
>
>Remember all of the regular ovirt disk copy, disk move, VM export
>tools
>are failing and my VMs and disks are trapped on domain GLCL3 and volume
>images3 right now.
>
>Please let me know
>
>Thank You For Your Help !
>
>
>
>
>
>On Sun, Jun 21, 2020 at 8:27 AM Strahil Nikolov <hunter86_bg@yahoo.com>
>wrote:
>
>> Sorry to hear that.
>> I can say that for me 6.5 was working, while 6.6 didn't and I
>upgraded
>> to 7.0 .
>> In the ended , I have ended with creating a new fresh volume and
>> physically copying the data there, then I detached the storage
>domains and
>> attached to the new ones (which holded the old data), but I
>could
>> afford the downtime.
>> Also, I can say that v7.0 ( but not 7.1 or anything later) also
>> worked without the ACL issue, but it causes some trouble in oVirt
>- so
>> avoid that unless you have no other options.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>
>>
>>
>> На 21 юни 2020 г. 4:39:46 GMT+03:00, C Williams
><cwilliams3320@gmail.com>
>> написа:
>> >Hello,
>> >
>> >Upgrading diidn't help
>> >
>> >Still acl errors trying to use a Virtual Disk from a VM
>> >
>> >[root@ov06 bricks]# tail bricks-brick04-images3.log | grep acl
>> >[2020-06-21 01:33:45.665888] I [MSGID: 139001]
>> >[posix-acl.c:263:posix_acl_log_permit_denied]
>0-images3-access-control:
>> >client:
>>
>>
>>CTX_ID:3697a7f1-44fb-4258-96b0-98cb4137d195-GRAPH_ID:0-PID:6706-HOST:ov06.ntc.srcle.com-PC_NAME:images3-client-0-RECON_NO:-0,
>> >gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>> >req(uid:107,gid:107,perm:1,ngrps:3),
>> >ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>> >[Permission denied]
>> >The message "I [MSGID: 139001]
>> >[posix-acl.c:263:posix_acl_log_permit_denied]
>0-images3-access-control:
>> >client:
>>
>>
>>CTX_ID:3697a7f1-44fb-4258-96b0-98cb4137d195-GRAPH_ID:0-PID:6706-HOST:ov06.ntc.srcle.com-PC_NAME:images3-client-0-RECON_NO:-0,
>> >gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>> >req(uid:107,gid:107,perm:1,ngrps:3),
>> >ctx(uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>> >[Permission denied]" repeated 2 times between [2020-06-21
>> >01:33:45.665888]
>> >and [2020-06-21 01:33:45.806779]
>> >
>> >Thank You For Your Help !
>> >
>> >On Sat, Jun 20, 2020 at 8:59 PM C Williams <cwilliams3320@gmail.com>
>> >wrote:
>> >
>> >> Hello,
>> >>
>> >> Based on the situation, I am planning to upgrade the 3 affected
>> >hosts.
>> >>
>> >> My reasoning is that the hosts/bricks were attached to 6.9 at one
>> >time.
>> >>
>> >> Thanks For Your Help !
>> >>
>> >> On Sat, Jun 20, 2020 at 8:38 PM C Williams
><cwilliams3320@gmail.com>
>> >> wrote:
>> >>
>> >>> Strahil,
>> >>>
>> >>> The gluster version on the current 3 gluster hosts is 6.7 (last
>> >update
>> >>> 2/26). These 3 hosts provide 1 brick each for the replica 3
>volume.
>> >>>
>> >>> Earlier I had tried to add 6 additional hosts to the cluster.
>Those
>> >new
>> >>> hosts were 6.9 gluster.
>> >>>
>> >>> I attempted to make a new separate volume with 3 bricks provided
>by
>> >the 3
>> >>> new gluster 6.9 hosts. After having many errors from the oVirt
>> >interface,
>> >>> I gave up and removed the 6 new hosts from the cluster. That is
>> >where the
>> >>> problems started. The intent was to expand the gluster cluster
>while
>> >making
>> >>> 2 new volumes for that cluster. The ovirt compute cluster would
>> >allow for
>> >>> efficient VM migration between 9 hosts -- while having separate
>> >gluster
>> >>> volumes for safety purposes.
>> >>>
>> >>> Looking at the brick logs, I see where there are acl errors
>starting
>> >from
>> >>> the time of the removal of the 6 new hosts.
>> >>>
>> >>> Please check out the attached brick log from 6/14-18. The events
>> >started
>> >>> on 6/17.
>> >>>
>> >>> I wish I had a downgrade path.
>> >>>
>> >>> Thank You For The Help !!
>> >>>
>> >>> On Sat, Jun 20, 2020 at 7:47 PM Strahil Nikolov
>> ><hunter86_bg@yahoo.com>
>> >>> wrote:
>> >>>
>> >>>> Hi ,
>> >>>>
>> >>>>
>> >>>> This one really looks like the ACL bug I was hit with when I
>> >updated
>> >>>> from Gluster v6.5 to 6.6 and later from 7.0 to 7.2.
>> >>>>
>> >>>> Did you update your setup recently ? Did you upgrade gluster
>also ?
>> >>>>
>> >>>> You have to check the gluster logs in order to verify that, so
>you
>> >can
>> >>>> try:
>> >>>>
>> >>>> 1. Set Gluster logs to trace level (for details check:
>> >>>>
>> >
>>
>https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/configuring_the_log_level
>> >>>> )
>> >>>> 2. Power up a VM that was already off , or retry the procedure
>from
>> >the
>> >>>> logs you sent.
>> >>>> 3. Stop the trace level of the logs
>> >>>> 4. Check libvirt logs on the host that was supposed to power up
>the
>> >VM
>> >>>> (in case a VM was powered on)
>> >>>> 5. Check the gluster brick logs on all nodes for ACL errors.
>> >>>> Here is a sample from my old logs:
>> >>>>
>> >>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>> >13:19:41.489047] I
>> >>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>> >>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>> >>>>
>>
>>
>>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>> >>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>> >>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>> >>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>> >>>> [Permission denied]
>> >>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>> >13:22:51.818796] I
>> >>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>> >>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>> >>>>
>>
>>
>>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>> >>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>> >>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>> >>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>> >>>> [Permission denied]
>> >>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>> >13:24:43.732856] I
>> >>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>> >>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>> >>>>
>>
>>
>>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>> >>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>> >>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>> >>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>> >>>> [Permission denied]
>> >>>> gluster_bricks-data_fast4-data_fast4.log:[2020-03-18
>> >13:26:50.758178] I
>> >>>> [MSGID: 139001] [posix-acl.c:262:posix_acl_log_permit_denied]
>> >>>> 0-data_fast4-access-control: client: CTX_ID:4a654305-d2e4-
>> >>>>
>>
>>
>>4a10-bad9-58d670d99a97-GRAPH_ID:0-PID:32412-HOST:ovirt1.localdomain-PC_NAME:data_fast4-client-0-RECON_NO:-19,
>> >>>> gfid: be318638-e8a0-4c6d-977d-7a937aa84806,
>> >>>> req(uid:36,gid:36,perm:1,ngrps:3), ctx
>> >>>> (uid:0,gid:0,in-groups:0,perm:000,updated-fop:INVALID, acl:-)
>> >>>> [Permission denied]
>> >>>>
>> >>>>
>> >>>> In my case , the workaround was to downgrade the gluster
>packages
>> >on all
>> >>>> nodes (and reboot each node 1 by 1 ) if the major version is the
>> >same, but
>> >>>> if you upgraded to v7.X - then you can try the v7.0 .
>> >>>>
>> >>>> Best Regards,
>> >>>> Strahil Nikolov
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> В събота, 20 юни 2020 г., 18:48:42 ч. Гринуич+3, C Williams <
>> >>>> cwilliams3320@gmail.com> написа:
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> Hello,
>> >>>>
>> >>>> Here are additional log tiles as well as a tree of the
>problematic
>> >>>> Gluster storage domain. During this time I attempted to copy a
>> >virtual disk
>> >>>> to another domain, move a virtual disk to another domain and run
>a
>> >VM where
>> >>>> the virtual hard disk would be used.
>> >>>>
>> >>>> The copies/moves failed and the VM went into pause mode when the
>> >virtual
>> >>>> HDD was involved.
>> >>>>
>> >>>> Please check these out.
>> >>>>
>> >>>> Thank You For Your Help !
>> >>>>
>> >>>> On Sat, Jun 20, 2020 at 9:54 AM C Williams
>> ><cwilliams3320@gmail.com>
>> >>>> wrote:
>> >>>> > Strahil,
>> >>>> >
>> >>>> > I understand. Please keep me posted.
>> >>>> >
>> >>>> > Thanks For The Help !
>> >>>> >
>> >>>> > On Sat, Jun 20, 2020 at 4:36 AM Strahil Nikolov
>> ><hunter86_bg@yahoo.com>
>> >>>> wrote:
>> >>>> >> Hey C Williams,
>> >>>> >>
>> >>>> >> sorry for the delay, but I couldn't get somw time to check
>your
>> >>>> logs. Will try a little bit later.
>> >>>> >>
>> >>>> >> Best Regards,
>> >>>> >> Strahil Nikolov
>> >>>> >>
>> >>>> >> На 20 юни 2020 г. 2:37:22 GMT+03:00, C Williams <
>> >>>> cwilliams3320@gmail.com> написа:
>> >>>> >>>Hello,
>> >>>> >>>
>> >>>> >>>Was wanting to follow up on this issue. Users are impacted.
>> >>>> >>>
>> >>>> >>>Thank You
>> >>>> >>>
>> >>>> >>>On Fri, Jun 19, 2020 at 9:20 AM C Williams
>> ><cwilliams3320@gmail.com>
>> >>>> >>>wrote:
>> >>>> >>>
>> >>>> >>>> Hello,
>> >>>> >>>>
>> >>>> >>>> Here are the logs (some IPs are changed )
>> >>>> >>>>
>> >>>> >>>> ov05 is the SPM
>> >>>> >>>>
>> >>>> >>>> Thank You For Your Help !
>> >>>> >>>>
>> >>>> >>>> On Thu, Jun 18, 2020 at 11:31 PM Strahil Nikolov
>> >>>> >>><hunter86_bg@yahoo.com>
>> >>>> >>>> wrote:
>> >>>> >>>>
>> >>>> >>>>> Check on the hosts tab , which is your current SPM (last
>> >column in
>> >>>> >>>Admin
>> >>>> >>>>> UI).
>> >>>> >>>>> Then open the /var/log/vdsm/vdsm.log and repeat the
>> >operation.
>> >>>> >>>>> Then provide the log from that host and the engine's log
>(on
>> >the
>> >>>> >>>>> HostedEngine VM or on your standalone engine).
>> >>>> >>>>>
>> >>>> >>>>> Best Regards,
>> >>>> >>>>> Strahil Nikolov
>> >>>> >>>>>
>> >>>> >>>>> На 18 юни 2020 г. 23:59:36 GMT+03:00, C Williams
>> >>>> >>><cwilliams3320@gmail.com>
>> >>>> >>>>> написа:
>> >>>> >>>>> >Resending to eliminate email issues
>> >>>> >>>>> >
>> >>>> >>>>> >---------- Forwarded message ---------
>> >>>> >>>>> >From: C Williams <cwilliams3320@gmail.com>
>> >>>> >>>>> >Date: Thu, Jun 18, 2020 at 4:01 PM
>> >>>> >>>>> >Subject: Re: [ovirt-users] Fwd: Issues with Gluster
>Domain
>> >>>> >>>>> >To: Strahil Nikolov <hunter86_bg@yahoo.com>
>> >>>> >>>>> >
>> >>>> >>>>> >
>> >>>> >>>>> >Here is output from mount
>> >>>> >>>>> >
>> >>>> >>>>> >192.168.24.12:/stor/import0 on
>> >>>> >>>>> >/rhev/data-center/mnt/192.168.24.12:_stor_import0
>> >>>> >>>>> >type nfs4
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.12)
>> >>>> >>>>> >192.168.24.13:/stor/import1 on
>> >>>> >>>>> >/rhev/data-center/mnt/192.168.24.13:_stor_import1
>> >>>> >>>>> >type nfs4
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.13)
>> >>>> >>>>> >192.168.24.13:/stor/iso1 on
>> >>>> >>>>> >/rhev/data-center/mnt/192.168.24.13:_stor_iso1
>> >>>> >>>>> >type nfs4
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.13)
>> >>>> >>>>> >192.168.24.13:/stor/export0 on
>> >>>> >>>>> >/rhev/data-center/mnt/192.168.24.13:_stor_export0
>> >>>> >>>>> >type nfs4
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.24.18,local_lock=none,addr=192.168.24.13)
>> >>>> >>>>> >192.168.24.15:/images on
>> >>>> >>>>> >/rhev/data-center/mnt/glusterSD/192.168.24.15:_images
>> >>>> >>>>> >type fuse.glusterfs
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>> >>>> >>>>> >192.168.24.18:/images3 on
>> >>>> >>>>> >/rhev/data-center/mnt/glusterSD/192.168.24.18:_images3
>> >>>> >>>>> >type fuse.glusterfs
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>> >>>> >>>>> >tmpfs on /run/user/0 type tmpfs
>> >>>> >>>>>
>>(rw,nosuid,nodev,relatime,seclabel,size=13198392k,mode=700)
>> >>>> >>>>> >[root@ov06 glusterfs]#
>> >>>> >>>>> >
>> >>>> >>>>> >Also here is a screenshot of the console
>> >>>> >>>>> >
>> >>>> >>>>> >[image: image.png]
>> >>>> >>>>> >The other domains are up
>> >>>> >>>>> >
>> >>>> >>>>> >Import0 and Import1 are NFS . GLCL0 is gluster. They all
>are
>> >>>> >>>running
>> >>>> >>>>> >VMs
>> >>>> >>>>> >
>> >>>> >>>>> >Thank You For Your Help !
>> >>>> >>>>> >
>> >>>> >>>>> >On Thu, Jun 18, 2020 at 3:51 PM Strahil Nikolov
>> >>>> >>><hunter86_bg@yahoo.com>
>> >>>> >>>>> >wrote:
>> >>>> >>>>> >
>> >>>> >>>>> >> I don't see
>> >'/rhev/data-center/mnt/192.168.24.13:_stor_import1'
>> >>>> >>>>> >mounted
>> >>>> >>>>> >> at all .
>> >>>> >>>>> >> What is the status of all storage domains ?
>> >>>> >>>>> >>
>> >>>> >>>>> >> Best Regards,
>> >>>> >>>>> >> Strahil Nikolov
>> >>>> >>>>> >>
>> >>>> >>>>> >> На 18 юни 2020 г. 21:43:44 GMT+03:00, C Williams
>> >>>> >>>>> ><cwilliams3320@gmail.com>
>> >>>> >>>>> >> написа:
>> >>>> >>>>> >> > Resending to deal with possible email issues
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >---------- Forwarded message ---------
>> >>>> >>>>> >> >From: C Williams <cwilliams3320@gmail.com>
>> >>>> >>>>> >> >Date: Thu, Jun 18, 2020 at 2:07 PM
>> >>>> >>>>> >> >Subject: Re: [ovirt-users] Issues with Gluster Domain
>> >>>> >>>>> >> >To: Strahil Nikolov <hunter86_bg@yahoo.com>
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >More
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >[root@ov06 ~]# for i in $(gluster volume list); do
>echo
>> >>>> >>>$i;echo;
>> >>>> >>>>> >> >gluster
>> >>>> >>>>> >> >volume info $i; echo;echo;gluster volume status
>> >>>> >>>>> >$i;echo;echo;echo;done
>> >>>> >>>>> >> >images3
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >Volume Name: images3
>> >>>> >>>>> >> >Type: Replicate
>> >>>> >>>>> >> >Volume ID: 0243d439-1b29-47d0-ab39-d61c2f15ae8b
>> >>>> >>>>> >> >Status: Started
>> >>>> >>>>> >> >Snapshot Count: 0
>> >>>> >>>>> >> >Number of Bricks: 1 x 3 = 3
>> >>>> >>>>> >> >Transport-type: tcp
>> >>>> >>>>> >> >Bricks:
>> >>>> >>>>> >> >Brick1: 192.168.24.18:/bricks/brick04/images3
>> >>>> >>>>> >> >Brick2: 192.168.24.19:/bricks/brick05/images3
>> >>>> >>>>> >> >Brick3: 192.168.24.20:/bricks/brick06/images3
>> >>>> >>>>> >> >Options Reconfigured:
>> >>>> >>>>> >> >performance.client-io-threads: on
>> >>>> >>>>> >> >nfs.disable: on
>> >>>> >>>>> >> >transport.address-family: inet
>> >>>> >>>>> >> >user.cifs: off
>> >>>> >>>>> >> >auth.allow: *
>> >>>> >>>>> >> >performance.quick-read: off
>> >>>> >>>>> >> >performance.read-ahead: off
>> >>>> >>>>> >> >performance.io-cache: off
>> >>>> >>>>> >> >performance.low-prio-threads: 32
>> >>>> >>>>> >> >network.remote-dio: off
>> >>>> >>>>> >> >cluster.eager-lock: enable
>> >>>> >>>>> >> >cluster.quorum-type: auto
>> >>>> >>>>> >> >cluster.server-quorum-type: server
>> >>>> >>>>> >> >cluster.data-self-heal-algorithm: full
>> >>>> >>>>> >> >cluster.locking-scheme: granular
>> >>>> >>>>> >> >cluster.shd-max-threads: 8
>> >>>> >>>>> >> >cluster.shd-wait-qlength: 10000
>> >>>> >>>>> >> >features.shard: on
>> >>>> >>>>> >> >cluster.choose-local: off
>> >>>> >>>>> >> >client.event-threads: 4
>> >>>> >>>>> >> >server.event-threads: 4
>> >>>> >>>>> >> >storage.owner-uid: 36
>> >>>> >>>>> >> >storage.owner-gid: 36
>> >>>> >>>>> >> >performance.strict-o-direct: on
>> >>>> >>>>> >> >network.ping-timeout: 30
>> >>>> >>>>> >> >cluster.granular-entry-heal: enable
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >Status of volume: images3
>> >>>> >>>>> >> >Gluster process TCP Port
>> >RDMA Port
>> >>>> >>>>> >Online
>> >>>> >>>>> >> > Pid
>> >>>> >>>>> >>
>> >>>> >>>>> >>
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>>------------------------------------------------------------------------------
>> >>>> >>>>> >> >Brick 192.168.24.18:/bricks/brick04/images3 49152
>0
>> >>>>
>> >>>> >>>Y
>> >>>> >>>>> >> >6666
>> >>>> >>>>> >> >Brick 192.168.24.19:/bricks/brick05/images3 49152
>0
>> >>>>
>> >>>> >>>Y
>> >>>> >>>>> >> >6779
>> >>>> >>>>> >> >Brick 192.168.24.20:/bricks/brick06/images3 49152
>0
>> >>>>
>> >>>> >>>Y
>> >>>> >>>>> >> >7227
>> >>>> >>>>> >> >Self-heal Daemon on localhost N/A
>N/A
>> >>>>
>> >>>> >>>Y
>> >>>> >>>>> >> >6689
>> >>>> >>>>> >> >Self-heal Daemon on ov07.ntc.srcle.com N/A
>N/A
>> >>>>
>> >>>> >>>Y
>> >>>> >>>>> >> >6802
>> >>>> >>>>> >> >Self-heal Daemon on ov08.ntc.srcle.com N/A
>N/A
>> >>>>
>> >>>> >>>Y
>> >>>> >>>>> >> >7250
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >Task Status of Volume images3
>> >>>> >>>>> >>
>> >>>> >>>>> >>
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>>------------------------------------------------------------------------------
>> >>>> >>>>> >> >There are no active volume tasks
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >[root@ov06 ~]# ls -l /rhev/data-center/mnt/glusterSD/
>> >>>> >>>>> >> >total 16
>> >>>> >>>>> >> >drwxr-xr-x. 5 vdsm kvm 8192 Jun 18 14:04
>> >192.168.24.15:_images
>> >>>> >>>>> >> >drwxr-xr-x. 5 vdsm kvm 8192 Jun 18 14:05
>192.168.24.18:
>> >>>> _images3
>> >>>> >>>>> >> >[root@ov06 ~]#
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >On Thu, Jun 18, 2020 at 2:03 PM C Williams
>> >>>> >>><cwilliams3320@gmail.com>
>> >>>> >>>>> >> >wrote:
>> >>>> >>>>> >> >
>> >>>> >>>>> >> >> Strahil,
>> >>>> >>>>> >> >>
>> >>>> >>>>> >> >> Here you go -- Thank You For Your Help !
>> >>>> >>>>> >> >>
>> >>>> >>>>> >> >> BTW -- I can write a test file to gluster and it
>> >replicates
>> >>>> >>>>> >properly.
>> >>>> >>>>> >> >> Thinking something about the oVirt Storage Domain ?
>> >>>> >>>>> >> >>
>> >>>> >>>>> >> >> [root@ov08 ~]# gluster pool list
>> >>>> >>>>> >> >> UUID Hostname
>> >>>> >>>>> >State
>> >>>> >>>>> >> >> 5b40c659-d9ab-43c3-9af8-18b074ea0b83 ov06
>> >>>> >>>>> >> >Connected
>> >>>> >>>>> >> >> 36ce5a00-6f65-4926-8438-696944ebadb5
>> >ov07.ntc.srcle.com
>> >>>> >>>>> >> >Connected
>> >>>> >>>>> >> >> c7e7abdb-a8f4-4842-924c-e227f0db1b29 localhost
>> >>>> >>>>> >> >Connected
>> >>>> >>>>> >> >> [root@ov08 ~]# gluster volume list
>> >>>> >>>>> >> >> images3
>> >>>> >>>>> >> >>
>> >>>> >>>>> >> >> On Thu, Jun 18, 2020 at 1:13 PM Strahil Nikolov
>> >>>> >>>>> >> ><hunter86_bg@yahoo.com>
>> >>>> >>>>> >> >> wrote:
>> >>>> >>>>> >> >>
>> >>>> >>>>> >> >>> Log to the oVirt cluster and provide the output of:
>> >>>> >>>>> >> >>> gluster pool list
>> >>>> >>>>> >> >>> gluster volume list
>> >>>> >>>>> >> >>> for i in $(gluster volume list); do echo $i;echo;
>> >gluster
>> >>>> >>>>> >volume
>> >>>> >>>>> >> >info
>> >>>> >>>>> >> >>> $i; echo;echo;gluster volume status
>> >$i;echo;echo;echo;done
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >> >>> ls -l /rhev/data-center/mnt/glusterSD/
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >> >>> Best Regards,
>> >>>> >>>>> >> >>> Strahil Nikolov
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >> >>> На 18 юни 2020 г. 19:17:46 GMT+03:00, C Williams
>> >>>> >>>>> >> ><cwilliams3320@gmail.com>
>> >>>> >>>>> >> >>> написа:
>> >>>> >>>>> >> >>> >Hello,
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >I recently added 6 hosts to an existing oVirt
>> >>>> >>>compute/gluster
>> >>>> >>>>> >> >cluster.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >Prior to this attempted addition, my cluster had 3
>> >>>> >>>Hypervisor
>> >>>> >>>>> >hosts
>> >>>> >>>>> >> >and
>> >>>> >>>>> >> >>> >3
>> >>>> >>>>> >> >>> >gluster bricks which made up a single gluster
>volume
>> >>>> >>>(replica 3
>> >>>> >>>>> >> >volume)
>> >>>> >>>>> >> >>> >. I
>> >>>> >>>>> >> >>> >added the additional hosts and made a brick on 3
>of
>> >the new
>> >>>> >>>>> >hosts
>> >>>> >>>>> >> >and
>> >>>> >>>>> >> >>> >attempted to make a new replica 3 volume. I had
>> >difficulty
>> >>>> >>>>> >> >creating
>> >>>> >>>>> >> >>> >the
>> >>>> >>>>> >> >>> >new volume. So, I decided that I would make a new
>> >>>> >>>>> >compute/gluster
>> >>>> >>>>> >> >>> >cluster
>> >>>> >>>>> >> >>> >for each set of 3 new hosts.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >I removed the 6 new hosts from the existing oVirt
>> >>>> >>>>> >Compute/Gluster
>> >>>> >>>>> >> >>> >Cluster
>> >>>> >>>>> >> >>> >leaving the 3 original hosts in place with their
>> >bricks. At
>> >>>> >>>that
>> >>>> >>>>> >> >point
>> >>>> >>>>> >> >>> >my
>> >>>> >>>>> >> >>> >original bricks went down and came back up . The
>> >volume
>> >>>> >>>showed
>> >>>> >>>>> >> >entries
>> >>>> >>>>> >> >>> >that
>> >>>> >>>>> >> >>> >needed healing. At that point I ran gluster volume
>> >heal
>> >>>> >>>images3
>> >>>> >>>>> >> >full,
>> >>>> >>>>> >> >>> >etc.
>> >>>> >>>>> >> >>> >The volume shows no unhealed entries. I also
>> >corrected some
>> >>>> >>>peer
>> >>>> >>>>> >> >>> >errors.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >However, I am unable to copy disks, move disks to
>> >another
>> >>>> >>>>> >domain,
>> >>>> >>>>> >> >>> >export
>> >>>> >>>>> >> >>> >disks, etc. It appears that the engine cannot
>locate
>> >disks
>> >>>> >>>>> >properly
>> >>>> >>>>> >> >and
>> >>>> >>>>> >> >>> >I
>> >>>> >>>>> >> >>> >get storage I/O errors.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >I have detached and removed the oVirt Storage
>Domain.
>> >I
>> >>>> >>>>> >reimported
>> >>>> >>>>> >> >the
>> >>>> >>>>> >> >>> >domain and imported 2 VMs, But the VM disks
>exhibit
>> >the
>> >>>> same
>> >>>> >>>>> >> >behaviour
>> >>>> >>>>> >> >>> >and
>> >>>> >>>>> >> >>> >won't run from the hard disk.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >I get errors such as this
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >VDSM ov05 command HSMGetAllTasksStatusesVDS
>failed:
>> >low
>> >>>> >>>level
>> >>>> >>>>> >Image
>> >>>> >>>>> >> >>> >copy
>> >>>> >>>>> >> >>> >failed: ("Command ['/usr/bin/qemu-img', 'convert',
>> >'-p',
>> >>>> >>>'-t',
>> >>>> >>>>> >> >'none',
>> >>>> >>>>> >> >>> >'-T', 'none', '-f', 'raw',
>> >>>> >>>>> >> >>> >u'/rhev/data-center/mnt/glusterSD/192.168.24.18:
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >>
>> >>>> >>>>> >>
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>>_images3/5fe3ad3f-2d21-404c-832e-4dc7318ca10d/images/3ea5afbd-0fe0-4c09-8d39-e556c66a8b3d/fe6eab63-3b22-4815-bfe6-4a0ade292510',
>> >>>> >>>>> >> >>> >'-O', 'raw',
>> >>>> >>>>> >> >>> >u'/rhev/data-center/mnt/192.168.24.13:
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >>
>> >>>> >>>>> >>
>> >>>> >>>>>
>> >>>> >>>>>
>> >>>>
>> >>>>
>>
>>
>>>>>>>_stor_import1/1ab89386-a2ba-448b-90ab-bc816f55a328/images/f707a218-9db7-4e23-8bbd-9b12972012b6/d6591ec5-3ede-443d-bd40-93119ca7c7d5']
>> >>>> >>>>> >> >>> >failed with rc=1 out='' err=bytearray(b'qemu-img:
>> >error
>> >>>> >>>while
>> >>>> >>>>> >> >reading
>> >>>> >>>>> >> >>> >sector 135168: Transport endpoint is not
>> >>>> >>>connected\\nqemu-img:
>> >>>> >>>>> >> >error
>> >>>> >>>>> >> >>> >while
>> >>>> >>>>> >> >>> >reading sector 131072: Transport endpoint is not
>> >>>> >>>>> >> >connected\\nqemu-img:
>> >>>> >>>>> >> >>> >error while reading sector 139264: Transport
>endpoint
>> >is
>> >>>> not
>> >>>> >>>>> >> >>> >connected\\nqemu-img: error while reading sector
>> >143360:
>> >>>> >>>>> >Transport
>> >>>> >>>>> >> >>> >endpoint
>> >>>> >>>>> >> >>> >is not connected\\nqemu-img: error while reading
>> >sector
>> >>>> >>>147456:
>> >>>> >>>>> >> >>> >Transport
>> >>>> >>>>> >> >>> >endpoint is not connected\\nqemu-img: error while
>> >reading
>> >>>> >>>sector
>> >>>> >>>>> >> >>> >155648:
>> >>>> >>>>> >> >>> >Transport endpoint is not connected\\nqemu-img:
>error
>> >while
>> >>>> >>>>> >reading
>> >>>> >>>>> >> >>> >sector
>> >>>> >>>>> >> >>> >151552: Transport endpoint is not
>> >connected\\nqemu-img:
>> >>>> >>>error
>> >>>> >>>>> >while
>> >>>> >>>>> >> >>> >reading
>> >>>> >>>>> >> >>> >sector 159744: Transport endpoint is not
>> >connected\\n')",)
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >oVirt version is 4.3.82-1.el7
>> >>>> >>>>> >> >>> >OS CentOS Linux release 7.7.1908 (Core)
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >The Gluster Cluster has been working very well
>until
>> >this
>> >>>> >>>>> >incident.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >Please help.
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >Thank You
>> >>>> >>>>> >> >>> >
>> >>>> >>>>> >> >>> >Charles Williams
>> >>>> >>>>> >> >>>
>> >>>> >>>>> >> >>
>> >>>> >>>>> >>
>> >>>> >>>>>
>> >>>> >>>>
>> >>>> >>
>> >>>> >
>> >>>> _______________________________________________
>> >>>> Users mailing list -- users@ovirt.org
>> >>>> To unsubscribe send an email to users-leave@ovirt.org
>> >>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> >>>> oVirt Code of Conduct:
>> >>>> https://www.ovirt.org/community/about/community-guidelines/
>> >>>> List Archives:
>> >>>>
>> >
>>
>https://lists.ovirt.org/archives/list/users@ovirt.org/message/YY3VUKEJLI7MRWXF627EHQAMH36UJ5BQ/
>> >>>>
>> >>>
>>