On Wed, Sep 4, 2019 at 9:27 PM <thomas(a)hoberg.net> wrote:
After some weeks of experimentation I am seeing more successes than failures
at creating single- and triple-node hyperconverged setups, so I am branching
out to additional features: in this case, the ability to use SSDs as cache
media for hard disks.
I first tried a single node that combined caching and compression, and that
fails during the creation of the LVs.
I tried again without the VDO compression, but the results were identical,
while VDO compression without the LV cache worked fine.
I tried various combinations, using less space etc., but the results are
always the same and unfortunately rather cryptic (I have substituted the
physical disk label with {disklabel}):
TASK [gluster.infra/roles/backend_setup : Extend volume group] *****************
failed: [{hostname}] (item={u'vgname': u'gluster_vg_{disklabel}p1',
  u'cachethinpoolname': u'gluster_thinpool_gluster_vg_{disklabel}p1',
  u'cachelvname': u'cachelv_gluster_thinpool_gluster_vg_{disklabel}p1',
  u'cachedisk': u'/dev/sda4',
  u'cachemetalvname': u'cache_gluster_thinpool_gluster_vg_{disklabel}p1',
  u'cachemode': u'writeback', u'cachemetalvsize': u'70G',
  u'cachelvsize': u'630G'}) =>
  {"ansible_loop_var": "item", "changed": false,
   "err": " Physical volume \"/dev/mapper/vdo_{disklabel}p1\" still in use\n",
   "item": {"cachedisk": "/dev/sda4",
    "cachelvname": "cachelv_gluster_thinpool_gluster_vg_{disklabel}p1",
    "cachelvsize": "630G",
    "cachemetalvname": "cache_gluster_thinpool_gluster_vg_{disklabel}p1",
    "cachemetalvsize": "70G", "cachemode": "writeback",
    "cachethinpoolname": "gluster_thinpool_gluster_vg_{disklabel}p1",
    "vgname": "gluster_vg_{disklabel}p1"},
   "msg": "Unable to reduce gluster_vg_{disklabel}p1 by /dev/dm-15.", "rc": 5}
Somewhere in there I see something that points to a race condition
("still in use").
Unfortunately I have not been able to pinpoint the raw logs used at that
stage, so I could not obtain more information.
At this point quite a bit of the storage setup is already done, so rolling
back for a clean new attempt can be a bit complicated, with reboots needed to
reconcile the kernel's view with the data on disk.
I don't actually believe it's related to the single-node setup, and I'd be
quite happy to move the creation of the SSD cache to a later stage, but in a
VDO setup this looks slightly complex to someone without intimate knowledge
of LVs-with-cache-and-perhaps-thin/VDO/Gluster all thrown into one.
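If I read lvmcache(7) correctly, doing it by hand afterwards would be roughly
the following; this is untested guesswork on my part, the VG/LV names are the
ones from my setup and the sizes are the ones from my failed attempt:

# Sketch only: attach an SSD cache to the existing Gluster thin pool after
# deployment. Untested; names/sizes are placeholders from my setup.
- name: Add the SSD partition to the existing volume group
  command: vgextend gluster_vg_{disklabel}p1 /dev/sda4

- name: Create the cache data LV on the SSD
  command: lvcreate -L 630G -n cachelv gluster_vg_{disklabel}p1 /dev/sda4

- name: Create the cache metadata LV on the SSD
  command: lvcreate -L 70G -n cachelv_meta gluster_vg_{disklabel}p1 /dev/sda4

- name: Combine data and metadata LVs into a cache pool
  command: >-
    lvconvert -y --type cache-pool
    --poolmetadata gluster_vg_{disklabel}p1/cachelv_meta
    gluster_vg_{disklabel}p1/cachelv

- name: Attach the cache pool to the Gluster thin pool
  command: >-
    lvconvert -y --type cache --cachemode writeback
    --cachepool gluster_vg_{disklabel}p1/cachelv
    gluster_vg_{disklabel}p1/gluster_thinpool_gluster_vg_{disklabel}p1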
Needless to say, the feature set (SSD caching & compressed dedup) sounds
terribly attractive, but when things don't just work it becomes rather
terrifying instead.
Hi Thomas,
The way we have to write the variables for setting up the cache has changed
with Ansible 2.8. Currently we write something like this:
>>>
gluster_infra_cache_vars:
  - vgname: vg_sdb2
    cachedisk: /dev/sdb3
    cachelvname: cachelv_thinpool_vg_sdb2
    cachethinpoolname: thinpool_vg_sdb2
    cachelvsize: '10G'
    cachemetalvsize: '2G'
    cachemetalvname: cache_thinpool_vg_sdb2
    cachemode: writethrough
===================
Note that cachedisk is provided as /dev/sdb3, which is used to extend the vg
vg_sdb2 ... this works well.
The module will take care of extending the vg with /dev/sdb3.
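Under the hood the role extends the VG through Ansible's lvg module; a
simplified illustration (not the literal task from the role) of what that
amounts to with the example above:
>>>
# Simplified illustration, not the actual task from gluster.infra: with
# Ansible 2.7, listing only the new cache disk in "pvs" simply adds it to
# the volume group.
- name: Extend vg_sdb2 with the cache disk
  lvg:
    vg: vg_sdb2
    pvs: /dev/sdb3
    state: present
=====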
However, with Ansible 2.8 we cannot provide it like this; we have to be more
explicit and also mention the PV underlying this volume group vg_sdb2. So,
with respect to 2.8, we have to write that variable like:
>>>
gluster_infra_cache_vars:
  - vgname: vg_sdb2
    cachedisk: '/dev/sdb2,/dev/sdb3'
    cachelvname: cachelv_thinpool_vg_sdb2
    cachethinpoolname: thinpool_vg_sdb2
    cachelvsize: '10G'
    cachemetalvsize: '2G'
    cachemetalvname: cache_thinpool_vg_sdb2
    cachemode: writethrough
=====================
Note that I have mentioned both /dev/sdb2 and /dev/sdb3.
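Applied to your setup, the variable would look roughly like this; the names
and sizes are copied from your failed task output, and listing the VDO device
as the existing PV is my assumption based on the "still in use" message:
>>>
gluster_infra_cache_vars:
  - vgname: gluster_vg_{disklabel}p1
    # list the existing PV (the VDO device) first, then the SSD partition
    cachedisk: '/dev/mapper/vdo_{disklabel}p1,/dev/sda4'
    cachelvname: cachelv_gluster_thinpool_gluster_vg_{disklabel}p1
    cachethinpoolname: gluster_thinpool_gluster_vg_{disklabel}p1
    cachelvsize: '630G'
    cachemetalvsize: '70G'
    cachemetalvname: cache_gluster_thinpool_gluster_vg_{disklabel}p1
    cachemode: writeback
=====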
This change is backward compatible, that is, it works with 2.7 as well. I
have also raised an issue with Ansible, which can be found here:
https://github.com/ansible/ansible/issues/56501
However, @olafbuitelaar has fixed this in gluster-ansible-infra, and the
patch is merged into master.
If you can check out the master branch, you should be fine.