
On Mon, Jul 9, 2018 at 3:52 PM Sahina Bose <sabose@redhat.com> wrote:
On Mon, Jul 9, 2018 at 5:41 PM, Hesham Ahmed <hsahmed@gmail.com> wrote:
Thanks Sahina for the update,
I am using gluster geo-replication for DR in a different installation; however, I was not aware that Gluster snapshots are not recommended in a hyperconverged setup. A warning on the Gluster snapshot UI would be helpful. Are gluster volume snapshots for volumes hosting VM images a work in progress with a bug tracker, or is this something not expected to change?
Agreed on the warning - can you log a bz?
There's no specific bz tracking support for volume snapshots w.r.t. the VM store use case. If you have a specific scenario where the geo-rep-based DR is not sufficient, please log a bug.
On Mon, Jul 9, 2018 at 2:58 PM Sahina Bose <sabose@redhat.com> wrote:
On Sun, Jul 8, 2018 at 3:29 PM, Hesham Ahmed <hsahmed@gmail.com> wrote:
I also noticed that Gluster snapshots have the SAME UUID as the main LV, and if UUID is used in fstab, the snapshot device is sometimes mounted instead of the primary LV.
For instance: /etc/fstab contains the following line:
UUID=a0b85d33-7150-448a-9a70-6391750b90ad /gluster_bricks/gv01_data01 auto inode64,noatime,nodiratime,x-parent=dMeNGb-34lY-wFVL-WF42-hlpE-TteI-lMhvvt 0 0
# lvdisplay gluster00/lv01_data01
  --- Logical volume ---
  LV Path                /dev/gluster00/lv01_data01
  LV Name                lv01_data01
  VG Name                gluster00
# mount
/dev/mapper/gluster00-55e97e7412bf48db99bb389bb708edb8_0 on /gluster_bricks/gv01_data01 type xfs (rw,noatime,nodiratime,seclabel,attr2,inode64,sunit=1024,swidth=2048,noquota)
Notice that the device mounted at the brick mountpoint above is not /dev/gluster00/lv01_data01 but one of the snapshot devices of that LV:
# blkid
/dev/mapper/gluster00-lv01_shaker_com_sa: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-55e97e7412bf48db99bb389bb708edb8_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-4ca8eef409ec4932828279efb91339de_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-59992b6c14644f13b5531a054d2aa75c_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-362b50c994b04284b1664b2e2eb49d09_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-0b3cc414f4cb4cddb6e81f162cdb7efe_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-da98ce5efda549039cf45a18e4eacbaf_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
/dev/mapper/gluster00-4ea5cce4be704dd7b29986ae6698a666_0: UUID="a0b85d33-7150-448a-9a70-6391750b90ad" TYPE="xfs"
Notice that the UUID of the LV and its snapshots is the same, causing systemd to mount one of the snapshot devices instead of the LV, which results in the following gluster error:
gluster> volume start gv01_data01 force
volume start: gv01_data01: failed: Volume id mismatch for brick vhost03:/gluster_bricks/gv01_data01/gv. Expected volume id be6bc69b-c6ed-4329-b300-3b9044f375e1, volume id 55e97e74-12bf-48db-99bb-389bb708edb8 found
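For reference, this is roughly how I confirm which device should actually be mounted and put the origin LV back by hand (untested sketch; the VG, LV and mountpoint names are the ones from the lvdisplay example above):

List the origin LV and its snapshots (snapshot LVs show the parent in the Origin column):
# lvs -o lv_name,origin,lv_attr gluster00

Unmount whatever snapshot device systemd picked and mount the origin LV instead, then force-start the volume again:
# umount /gluster_bricks/gv01_data01
# mount /dev/gluster00/lv01_data01 /gluster_bricks/gv01_data01
# gluster volume start gv01_data01 force

A longer-term workaround might be to mount the brick by device path instead of by filesystem UUID in /etc/fstab, since the XFS UUID is duplicated on every snapshot, e.g. (untested):
/dev/mapper/gluster00-lv01_data01 /gluster_bricks/gv01_data01 xfs inode64,noatime,nodiratime 0 0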
We do not recommend gluster volume snapshots for volumes hosting VM images. Please look at the https://ovirt.org/develop/release-management/features/gluster/gluster-dr/ as an alternative.
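At its core the DR feature is geo-replication of the storage domain volume to a remote site. Roughly, and only as a sketch with hypothetical volume and host names (assuming a slave volume gv01_backup already exists on backuphost):

# gluster system:: execute gsec_create
# gluster volume geo-replication gv01_data01 backuphost::gv01_backup create push-pem
# gluster volume geo-replication gv01_data01 backuphost::gv01_backup start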
On Sun, Jul 8, 2018 at 12:32 PM <hsahmed@gmail.com> wrote:
I have been facing this issue since version 4.1 up to the latest 4.2.4: once we enable gluster snapshots and accumulate some (as few as 15 snapshots per server), we start having trouble booting the server. The server enters the emergency shell on boot after timing out waiting for the snapshot devices; waiting a few minutes and pressing Ctrl-D then boots the server normally. With a very large number of snapshots (600+), it can take days before the server will boot. I am attaching the journal log; let me know if you need any other logs.
Details of the setup:
3-node hyperconverged oVirt setup (64 GB RAM, 8-core E5 Xeon)
oVirt 4.2.4
oVirt Node 4.2.4
10Gb interface
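In case it helps to reproduce, this is roughly how I check how many snapshot LVs exist and where the boot time goes (diagnostic commands only, nothing gluster-specific):

Count snapshot LVs per volume group (snapshot LVs have a non-empty Origin field):
# lvs -o vg_name,lv_name,origin --noheadings | awk '$3 != "" {n[$1]++} END {for (vg in n) print vg, n[vg]}'

See which units dominate boot time:
# systemd-analyze blame | head -20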
Thanks,
Hesham S. Ahmed
Bug report created for the warning: https://bugzilla.redhat.com/show_bug.cgi?id=1599365
We were not using gluster snapshots for DR but rather as a quick way to go back in time (although we never planned how to use the snapshots). Maybe scheduling ability should be added for VM snapshots as well.