I also noticed that Gluster snapshots have the SAME filesystem UUID as the
main LV, so if the brick is mounted by UUID in fstab, a snapshot device is
sometimes mounted instead of the primary LV.
For instance:
/etc/fstab contains the following line:
UUID=a0b85d33-7150-448a-9a70-6391750b90ad /gluster_bricks/gv01_data01 auto inode64,noatime,nodiratime,x-parent=dMeNGb-34lY-wFVL-WF42-hlpE-TteI-lMhvvt 0 0
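
As a possible workaround (just a sketch, I have not confirmed it), mounting
the brick by its LV device path instead of the filesystem UUID should avoid
the ambiguity, since the path is unique even though the UUID is not, e.g.:

/dev/gluster00/lv01_data01 /gluster_bricks/gv01_data01 xfs inode64,noatime,nodiratime,x-parent=dMeNGb-34lY-wFVL-WF42-hlpE-TteI-lMhvvt 0 0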
# lvdisplay gluster00/lv01_data01
--- Logical volume ---
LV Path /dev/gluster00/lv01_data01
LV Name lv01_data01
VG Name gluster00
# mount
/dev/mapper/gluster00-55e97e7412bf48db99bb389bb708edb8_0 on /gluster_bricks/gv01_data01 type xfs (rw,noatime,nodiratime,seclabel,attr2,inode64,sunit=1024,swidth=2048,noquota)
Notice that the device mounted at the brick mountpoint above is not
/dev/gluster00/lv01_data01 but one of the snapshot devices of that LV.
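
A quick way to double-check which device is actually backing the brick
mountpoint (just a verification command, not part of the original output):

# findmnt -no SOURCE /gluster_bricks/gv01_data01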
# blkid
/dev/mapper/gluster00-lv01_shaker_com_sa: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-55e97e7412bf48db99bb389bb708edb8_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-4ca8eef409ec4932828279efb91339de_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-59992b6c14644f13b5531a054d2aa75c_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-362b50c994b04284b1664b2e2eb49d09_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-0b3cc414f4cb4cddb6e81f162cdb7efe_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-da98ce5efda549039cf45a18e4eacbaf_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-4ea5cce4be704dd7b29986ae6698a666_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
Notice that the UUID of the LV and its snapshots is identical, causing
systemd to mount one of the snapshot devices instead of the LV, which
results in the following Gluster error:
gluster> volume start gv01_data01 force
volume start: gv01_data01: failed: Volume id mismatch for brick
vhost03:/gluster_bricks/gv01_data01/gv. Expected volume id
be6bc69b-c6ed-4329-b300-3b9044f375e1, volume id
55e97e74-12bf-48db-99bb-389bb708edb8 found
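
Note that the "found" volume id 55e97e74-12bf-48db-99bb-389bb708edb8
corresponds to the snapshot device gluster00-55e97e7412bf48db99bb389bb708edb8_0
that got mounted above. The same can be verified on the brick itself by
reading the volume-id xattr Gluster stores there (just a verification
command, not part of the original output):

# getfattr -n trusted.glusterfs.volume-id -e hex /gluster_bricks/gv01_data01/gv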
On Sun, Jul 8, 2018 at 12:32 PM <hsahmed@gmail.com> wrote:
>
> I have been facing this trouble since version 4.1 up to the latest 4.2.4:
> once we enable gluster snapshots and accumulate some snapshots (as few as
> 15 per server), we start having trouble booting the server. The server
> enters the emergency shell upon boot after timing out waiting for snapshot
> devices. Waiting a few minutes and pressing Ctrl-D then boots the server
> normally. With a very large number of snapshots (600+), it can take days
> before the server will boot. I am attaching the journal log; let me know
> if you need any other logs.
>
> Details of the setup:
>
> 3 node hyperconverged oVirt setup (64GB RAM, 8-Core E5 Xeon)
> oVirt 4.2.4
> oVirt Node 4.2.4
> 10Gb Interface
>
> Thanks,
>
> Hesham S. Ahmed