
Hello,

I'm still coping with my qemu image corruption, and I'm following some Red Hat guidelines that explain the way to go:

- Start the VM
- Identify the host
- On this host, run the ps command to identify the disk image location:
# ps ax|grep qemu-kvm|grep vm_name
- Look for "-drive file=/rhev/data-center/00000001-0001-0001-0001-00000000033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75" (YMMV)
- Resolve this symbolic link:
# ls -la /rhev/data-center/00000001-0001-0001-0001-00000000033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75
lrwxrwxrwx 1 vdsm kvm 78 Oct 3 2016 /rhev/data-center/00000001-0001-0001-0001-00000000033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75 -> /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- Shut down the VM
- On the SPM, activate the logical volume:
# lvchange -ay /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- Verify the state of the qemu image:
# qemu-img check /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- If needed, attempt a repair:
# qemu-img check -r all /dev/...
- In any case, deactivate the LV:
# lvchange -an /dev/...

I have followed these steps tens of times, and finding the LV and activating it was straightforward and successful. Since yesterday, I have been finding some VMs on which these steps do not work: I can identify the symbolic link, but neither the SPM nor the host is able to find the LV device, and thus cannot activate it:

# lvchange -ay /dev/de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
Failed to find logical volume "de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d"

Either I need two more coffees, or I am missing a step or something to check. Looking at the SPM's /dev/disk/* structure, it looks very sound (I can see the dm-name-* series of links for my three storage domains).

As the VM can be run and stopped without problems, does the host activate something more before it is launched?
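For the record, the whole sequence can be condensed into a rough script. This is only a sketch: the VM name is a placeholder, and the parsing assumes the "-drive file=/rhev/..." form shown above.

#!/bin/bash
# Sketch of the check/repair procedure above -- adapt before use.
VM_NAME="vm_name"   # placeholder

# 1. While the VM is running, pull the image path out of the
#    qemu-kvm command line.
IMG=$(ps ax | grep qemu-kvm | grep "$VM_NAME" \
      | grep -o 'file=/rhev/[^, ]*' | head -1 | cut -d= -f2)

# 2. Resolve the /rhev/... symbolic link to the underlying LV device.
LV=$(readlink -f "$IMG")
echo "image: $IMG -> $LV"

# 3. Shut down the VM first; then, on the SPM:
lvchange -ay "$LV"
qemu-img check "$LV"            # read-only check
# qemu-img check -r all "$LV"   # only if a repair is needed
lvchange -an "$LV"              # deactivate in any case

-- Nicolas ECARNOT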

Does this report an error on the host where you are having problems activating logical volumes?

lvs -a -o +devices

Also, do the lvm commands succeed when you explicitly disable lvmetad, i.e.

lvchange --config 'global {use_lvmetad=0}' -ay ...

On Wed, Sep 20, 2017 at 3:38 AM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Hello,
I'm still coping with my qemu image corruption, and I'm following some Red Hat guidelines that explain the way to go:
- Start the VM
- Identify the host
- On this host, run the ps command to identify the disk image location:
# ps ax|grep qemu-kvm|grep vm_name
- Look for "-drive file=/rhev/data-center/0000000 1-0001-0001-0001-00000000033e/b72773dc-c99c-472a-9548- 503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/ e7174214-3c2b-4353-98fd-2e504de72c75" (YMMV)
- Resolve this symbolic link:
# ls -la /rhev/data-center/00000001-0001-0001-0001-00000000033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75
lrwxrwxrwx 1 vdsm kvm 78 Oct 3 2016 /rhev/data-center/00000001-0001-0001-0001-00000000033e/b72773dc-c99c-472a-9548-503c122baa0b/images/91bfb2b4-5194-4ab3-90c8-3c172959f712/e7174214-3c2b-4353-98fd-2e504de72c75 -> /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- Shut down the VM
- On the SPM, activate the logical volume:
# lvchange -ay /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- Verify the state of the qemu image:
# qemu-img check /dev/b72773dc-c99c-472a-9548-503c122baa0b/e7174214-3c2b-4353-98fd-2e504de72c75
- If needed, attempt a repair:
# qemu-img check -r all /dev/...
- In any case, deactivate the LV:
# lvchange -an /dev/...
I have followed these steps tens of times, and finding the LV and activating it was straightforward and successful. Since yesterday, I have been finding some VMs on which these steps do not work: I can identify the symbolic link, but neither the SPM nor the host is able to find the LV device, and thus cannot activate it:
# lvchange -ay /dev/de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
Failed to find logical volume "de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d"
Either I need two more coffees, or I am missing a step or something to check. Looking at the SPM's /dev/disk/* structure, it looks very sound (I can see the dm-name-* series of links for my three storage domains).
As the VM can be run and stopped without problems, does the host activate something more before it is launched?
-- Nicolas ECARNOT
-- Adam Litke

Adam,

TL;DR: You nailed it!

On 03/10/2017 at 18:12, Adam Litke wrote:
Does this report an error on the host where you are having problems activating logical volumes?
lvs -a -o +devices
On the hosts where I can't activate an LV, this command returns nothing interesting:

root@serv-hv-prd03:~# lvs -a -o +devices
  LV   VG Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  home cl -wi-ao---- 56,25g                                                     /dev/sda2(1024)
  root cl -wi-ao---- 50,00g                                                     /dev/sda2(15423)
  swap cl -wi-ao----  4,00g                                                     /dev/sda2(0)

The same goes for pvs and vgs.
Also, do the lvm commands succeed when you explicitly disable lvmetad, i.e.
lvchange --config 'global {use_lvmetad=0}' -ay ...
Disabling lvmetad usage allows the activation to succeed.

Having understood that, I tried to run some usual LVM commands like pvs, vgs, lvs, pvscan, vgscan, lvscan, lvmdiskscan, and they all returned quite empty answers (in short: only the local LVs).

Having understood the role of lvmetad, I ran pvscan --cache, and all of a sudden it filled in the LVM information: I found all my oVirt LVM storage domains again, as I could see on other hosts.

Things to note:
- Trying to run a VM with an empty LVM cache was nonetheless successful.
- Before filling the lvmetad cache, I checked that this daemon was running, and it was.
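In case it helps others hitting the same symptom, the sequence boils down to this sketch (the VG/LV names are the ones from my failing host; adapt to yours):

# pvs
(answered from the stale lvmetad cache: only the local PVs are listed)
# pvs --config 'global {use_lvmetad=0}'
(bypasses the daemon and reads the disks directly: the storage domain PVs appear)
# pvscan --cache
(rescans all devices and repopulates the lvmetad cache)
# lvchange -ay de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
(the activation now succeeds without any --config override)

-- Nicolas ECARNOT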

On Wed, Oct 4, 2017 at 4:12 AM Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Adam,
TL;DR: You nailed it!
Great! Glad you're back up and running. One additional note about LVM commands: it's dangerous to use lvmetad for some commands while vdsm is running, since vdsm itself does not use lvmetad, and you could end up with conflicting operations. In general it's safest not to issue any lvm commands while the host is activated, but if you must, don't forget to disable lvmetad for all commands.
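For example, the manual check/activate/deactivate cycle from this thread would look like this (a sketch; the point is the --config override on every manual lvm command):

# lvs --config 'global {use_lvmetad=0}' -a -o +devices
# lvchange --config 'global {use_lvmetad=0}' -ay de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
# qemu-img check /dev/de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d
# lvchange --config 'global {use_lvmetad=0}' -an de2fdaa0-6e09-4dd2-beeb-1812318eb893/ce13d349-151e-4631-b600-c42b82106a8d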
On 03/10/2017 at 18:12, Adam Litke wrote:
Does this report an error on the host where you are having problems activating logical volumes?
lvs -a -o +devices
On the hosts where I can't activate an LV, this command returns nothing interesting:
root@serv-hv-prd03:~# lvs -a -o +devices
  LV   VG Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  home cl -wi-ao---- 56,25g                                                     /dev/sda2(1024)
  root cl -wi-ao---- 50,00g                                                     /dev/sda2(15423)
  swap cl -wi-ao----  4,00g                                                     /dev/sda2(0)
The same goes for pvs and vgs.
Also, do the lvm commands succeed when you explicitly disable lvmetad,
i.e.
lvchange --config 'global {use_lvmetad=0}' -ay ...
Disabling lvmetad usage allows the activation to succeed.
Having understood that, I tried to run some usual LVM commands like pvs, vgs, lvs, pvscan, vgscan, lvscan, lvmdiskscan, and they all returned quite empty answers (in short: only the local LVs).
Having understood the role of lvmetad, I ran pvscan --cache, and all of a sudden it filled in the LVM information: I found all my oVirt LVM storage domains again, as I could see on other hosts.
Things to note:
- Trying to run a VM with an empty LVM cache was nonetheless successful.
- Before filling the lvmetad cache, I checked that this daemon was running, and it was.
-- Nicolas ECARNOT
-- Adam Litke

On 04/10/2017 at 15:30, Adam Litke wrote:
On Wed, Oct 4, 2017 at 4:12 AM Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Adam,
TL;DR: You nailed it!
Great! Glad you're back up and running. One additional note about LVM commands: it's dangerous to use lvmetad for some commands while vdsm is running, since vdsm itself does not use lvmetad, and you could end up with conflicting operations. In general it's safest not to issue any lvm commands while the host is activated, but if you must, don't forget to disable lvmetad for all commands.
OK.

Is it worth trying to understand why, amongst our 32 hosts in 2 DCs, all on the same versions (OS, vdsm, qemu packages...), some show they're using lvmetad and some do not?

-- Nicolas ECARNOT

Sure. vdsm-tool should be disabling lvmetad on the host automatically. Maybe some of the hosts were freshly installed and others were upgraded from older versions? In any case, you should be able to run, on any host in maintenance mode:

sudo vdsm-tool configure --force

And this should edit the lvm.conf file to disable lvmetad globally and also prevent the lvmetad service from starting.

On Wed, Oct 4, 2017 at 10:02 AM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
On 04/10/2017 at 15:30, Adam Litke wrote:
On Wed, Oct 4, 2017 at 4:12 AM Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Adam,
TL;DR: You nailed it!
Great! Glad you're back up and running. One additional note about LVM commands: it's dangerous to use lvmetad for some commands while vdsm is running, since vdsm itself does not use lvmetad, and you could end up with conflicting operations. In general it's safest not to issue any lvm commands while the host is activated, but if you must, don't forget to disable lvmetad for all commands.
OK.
Is it worth trying to understand why, amongst our 32 hosts in 2 DCs, all on the same versions (OS, vdsm, qemu packages...), some show they're using lvmetad and some do not?
-- Nicolas ECARNOT
-- Adam Litke

Hi Adam,

On 04/10/2017 at 16:48, Adam Litke wrote:
Sure. vdsm-tool should be disabling lvmetad on the host automatically. Maybe some of the hosts were freshly installed and others were upgraded from older versions? In any case, you should be able to run, on any host in maintenance mode:
sudo vdsm-tool configure --force
And this should edit the lvm.conf file to disable lvmetad globally and also prevent the lvmetad service from starting.
Sorry, but nope.

# vdsm-tool configure --force
Checking configuration status...

Current revision of multipath.conf detected, preserving
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts

Running configure...
Reconfiguration of sebool is done.
Reconfiguration of libvirt is done.

Done configuring modules to VDSM.

# grep use_lvmetad /etc/lvm/lvm.conf | grep -v '#'
use_lvmetad = 1

Actually, as you found a workaround, it's not a big deal, especially if this point has been fixed in versions greater than 3.6.7. It's just to let people know.
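For anyone stuck on the same version, the manual equivalent seems to be the following (a sketch only, not an official procedure; run it on a host in maintenance mode, and note the lvm2-lvmetad unit names are the standard EL7 ones):

# sed -i 's/use_lvmetad = 1/use_lvmetad = 0/' /etc/lvm/lvm.conf
# systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
# systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket
# grep use_lvmetad /etc/lvm/lvm.conf | grep -v '#'
use_lvmetad = 0

-- Nicolas ECARNOT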

On Thu, Oct 5, 2017 at 11:39, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Hi Adam,
On 04/10/2017 at 16:48, Adam Litke wrote:
Sure. vdsm-tool should be disabling lvmetad on the host automatically. Maybe some of the hosts were freshly installed and others were upgraded from older versions? In any case, you should be able to run, on any host in maintenance mode:
sudo vdsm-tool configure --force
And this should edit the lvm.conf file to disable lvmetad globally and also prevent the lvmetad service from starting.
Sorry, but nope.
# vdsm-tool configure --force
Checking configuration status...
Current revision of multipath.conf detected, preserving
libvirt is already configured for vdsm
SUCCESS: ssl configured to true. No conflicts

Running configure...
Reconfiguration of sebool is done.
Reconfiguration of libvirt is done.
Done configuring modules to VDSM.
# grep use_lvmetad /etc/lvm/lvm.conf | grep -v '#'
use_lvmetad = 1
vdsm-tool disables lvmetad in the local lvm config (/etc/lvm/lvmlocal.conf), which is now owned by vdsm, similar to the way the multipath configuration is managed by vdsm.
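To check whether a host got the new configuration, something like this should do (a sketch; the lvm2-lvmetad unit names are the standard EL7 ones):

# grep use_lvmetad /etc/lvm/lvmlocal.conf | grep -v '#'
# systemctl is-enabled lvm2-lvmetad.service lvm2-lvmetad.socket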
Actually, as you found a workaround, it's not a big deal, especially if this point has been fixed in versions greater than 3.6.7.
But this was added in 4.0.7. You should upgrade to 4.1, which includes many other features and fixes.

Nir
It's just to let people know.
-- Nicolas ECARNOT
Participants (3):
- Adam Litke
- Nicolas Ecarnot
- Nir Soffer