
On Mon, Jul 9, 2018 at 3:52 PM Sahina Bose <sabose@redhat.com> wrote:
On Mon, Jul 9, 2018 at 5:41 PM, Hesham Ahmed <hsahmed@gmail.com> wrote:
Thanks Sahina for the update,
I am using gluster geo-replication for DR in a different installation; however, I was not aware that Gluster snapshots are not recommended in a hyperconverged setup. A warning on the Gluster snapshot UI would be helpful. Are gluster volume snapshots for volumes hosting VM images a work in progress with a bug tracker, or is this something not expected to change?
Agreed on the warning - can you log a bz?
There's no specific bz tracking support for volume snapshots w.r.t. the VM store use case. If you have a specific scenario where the geo-rep-based DR is not sufficient, please log a bug.
On Mon, Jul 9, 2018 at 2:58 PM Sahina Bose <sabose@redhat.com> wrote:
On Sun, Jul 8, 2018 at 3:29 PM, Hesham Ahmed <hsahmed@gmail.com> wrote:
I also noticed that Gluster snapshots have the SAME UUID as the main LV, and if UUID is used in fstab, the snapshot device is sometimes mounted instead of the primary LV.
For instance: /etc/fstab contains the following line:
UUID=a0b85d33-7150-448a-9a70-6391750b90ad /gluster_bricks/gv01_data01 auto inode64,noatime,nodiratime,x-parent=dMeNGb-34lY-wFVL-WF42-hlpE-TteI-lMhvvt 0 0
# lvdisplay gluster00/lv01_data01
  --- Logical volume ---
  LV Path                /dev/gluster00/lv01_data01
  LV Name                lv01_data01
  VG Name                gluster00
# mount
/dev/mapper/gluster00-55e97e7412bf48db99bb389bb708edb8_0 on /gluster_bricks/gv01_data01 type xfs (rw,noatime,nodiratime,seclabel,attr2,inode64,sunit=1024,swidth=2048,noquota)
Notice that the device mounted at the brick mountpoint above is not /dev/gluster00/lv01_data01 but one of the snapshot devices of that LV:
# blkid
/dev/mapper/gluster00-lv01_shaker_com_sa: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-55e97e7412bf48db99bb389bb708edb8_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-4ca8eef409ec4932828279efb91339de_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-59992b6c14644f13b5531a054d2aa75c_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-362b50c994b04284b1664b2e2eb49d09_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-0b3cc414f4cb4cddb6e81f162cdb7efe_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-da98ce5efda549039cf45a18e4eacbaf_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-4ea5cce4be704dd7b29986ae6698a666_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
Notice that the UUID of the LV and its snapshots is the same, causing systemd to mount one of the snapshot devices instead of the LV, which results in the following gluster error:
gluster> volume start gv01_data01 force
volume start: gv01_data01: failed: Volume id mismatch for brick vhost03:/gluster_bricks/gv01_data01/gv. Expected volume id be6bc69b-c6ed-4329-b300-3b9044f375e1, volume id 55e97e74-12bf-48db-99bb-389bb708edb8 found
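For reference, this is roughly how I confirm which device should actually be mounted and put the origin LV back by hand (untested sketch; the VG, LV and mountpoint names are the ones from the lvdisplay example above):

List the origin LV and its snapshots (snapshot LVs show the parent in the Origin column):
# lvs -o lv_name,origin,lv_attr gluster00

Unmount whatever snapshot device systemd picked and mount the origin LV instead, then force-start the volume again:
# umount /gluster_bricks/gv01_data01
# mount /dev/gluster00/lv01_data01 /gluster_bricks/gv01_data01
# gluster volume start gv01_data01 force

A longer-term workaround might be to mount the brick by device path instead of by filesystem UUID in /etc/fstab, since the XFS UUID is duplicated on every snapshot, e.g. (untested):
/dev/mapper/gluster00-lv01_data01 /gluster_bricks/gv01_data01 xfs inode64,noatime,nodiratime 0 0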
We do not recommend gluster volume snapshots for volumes hosting VM images. Please look at the https://ovirt.org/develop/release-management/features/gluster/gluster-dr/ as an alternative.
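At its core the DR feature is geo-replication of the storage domain volume to a remote site. Roughly, and only as a sketch with hypothetical volume and host names (assuming a slave volume gv01_backup already exists on backuphost):

# gluster system:: execute gsec_create
# gluster volume geo-replication gv01_data01 backuphost::gv01_backup create push-pem
# gluster volume geo-replication gv01_data01 backuphost::gv01_backup start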
On Sun, Jul 8, 2018 at 12:32 PM <hsahmed@gmail.com> wrote:
I have been facing this issue since version 4.1 up to the latest 4.2.4: once we enable gluster snapshots and accumulate some (as few as 15 snapshots per server), we start having trouble booting the server. The server enters the emergency shell on boot after timing out waiting for the snapshot devices; waiting a few minutes and pressing Ctrl-D then boots the server normally. With a very large number of snapshots (600+), it can take days before the server will boot. I am attaching the journal log; let me know if you need any other logs.
Details of the setup:
3-node hyperconverged oVirt setup (64 GB RAM, 8-core E5 Xeon)
oVirt 4.2.4
oVirt Node 4.2.4
10Gb interface
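In case it helps to reproduce, this is roughly how I check how many snapshot LVs exist and where the boot time goes (diagnostic commands only, nothing gluster-specific):

Count snapshot LVs per volume group (snapshot LVs have a non-empty Origin field):
# lvs -o vg_name,lv_name,origin --noheadings | awk '$3 != "" {n[$1]++} END {for (vg in n) print vg, n[vg]}'

See which units dominate boot time:
# systemd-analyze blame | head -20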
Thanks,
Hesham S. Ahmed
Bug report created for the warning: https://bugzilla.redhat.com/show_bug.cgi?id=1599365
We were not using gluster snapshots for DR but rather as a quick way to go back in time (although we never planned how to use the snapshots). Maybe scheduling ability should be added for VM snapshots as well.