
Hello Nir,

On Wed Mar 5 14:37:17 2014, Nir Soffer wrote:
----- Original Message -----
From: "Boyan Tabakov" <blade@alslayer.net>
To: "Nir Soffer" <nsoffer@redhat.com>
Cc: users@ovirt.org
Sent: Tuesday, March 4, 2014 3:53:24 PM
Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
On Tue Mar 4 14:46:33 2014, Nir Soffer wrote:
----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com>
To: "Boyan Tabakov" <blade@alslayer.net>
Cc: users@ovirt.org, "Zdenek Kabelac" <zkabelac@redhat.com>
Sent: Monday, March 3, 2014 9:39:47 PM
Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
Hi Zdenek, can you look into this strange incident?
When a user creates a disk on one host (creating a new lv), the lv is not seen on another host in the cluster.
Calling multipath -r causes the new lv to appear on the other host.
Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but unusual.

>>>>> As far as I could track quickly the vdsm code, there is only a call to lvs
>>>>> and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.

lvs should see any change on the shared storage.

>>>>> The only workaround so far has been to restart VDSM on host2, which
>>>>> makes it refresh all LVM data properly.

When vdsm starts, it calls multipath -r, which ensures that we see all physical volumes.

>>>>> When is host2 supposed to pick up any newly created LVs in the SD VG?
>>>>> Any suggestions where the problem might be?
>>>>
>>>> When you create a new lv on the shared storage, the new lv should be
>>>> visible on the other host. Let's start by verifying that you do see
>>>> the new lv after a disk was created.
>>>>
>>>> Try this:
>>>>
>>>> 1. Create a new disk, and check the disk uuid in the engine ui=
----- Original Message -----
From: "Boyan Tabakov" <blade@alslayer.net>
To: "Nir Soffer" <nsoffer@redhat.com>
Cc: users@ovirt.org
Sent: Monday, March 3, 2014 9:51:05 AM
Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
>>>>> Consequently, when creating/booting
>>>>> a VM with the said disk attached, the VM fails to start on host2,
>>>>> because host2 can't see the LV. Similarly, if the VM is started on
>>>>> host1, it fails to migrate to host2. Extract from host2 log is in
>>>>> the end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe32=
>>>> 2. On another machine, run this command:
>>>>
>>>> lvs -o vg_name,lv_name,tags
>>>>
>>>> You can identify the new lv using tags, which should contain the new
>>>> disk uuid.
>>>>
>>>> If you don't see the new lv from the other host, please provide
>>>> /var/log/messages and /var/log/sanlock.log.
>>>
>>> Just tried that. The disk is not visible on the non-SPM node.
>>
>> This means that storage is not accessible from this host.
>
> Generally, the storage seems accessible ok. For example, if I restart
> the vdsmd, all volumes get picked up correctly (become visible in lvs
> output and VMs can be started with them).
Let's repeat this test, but now, if you do not see the new lv, please run:
multipath -r
And report the results.
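
The test above can be sketched as a small shell helper. This is only an illustration, not part of vdsm: the function name find_lv_by_disk_uuid and the DISK_UUID variable are hypothetical, and it assumes the disk uuid shown in the engine UI appears verbatim in the lv tags, as described in the instructions above.

```shell
#!/bin/sh
# Hypothetical helper (not part of vdsm): reads `lvs` output on stdin and
# prints the rows that mention the given disk uuid. Exits non-zero when
# no row matches, i.e. when the new lv is not visible on this host.
find_lv_by_disk_uuid() {
    # -F matches the uuid as a fixed string, not as a regular expression.
    grep -F -- "$1"
}

# Intended use on the second host (needs root and LVM, so it is left
# commented out here):
#   lvs --noheadings -o vg_name,lv_name,tags | find_lv_by_disk_uuid "$DISK_UUID"
# If that prints nothing, run `multipath -r` and repeat the same check.
```

Keeping the filter separate from the lvs call makes it possible to try the matching logic on saved lvs output before running anything on a live host.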
Running multipath -r helped and the disk was properly picked up by the second host.
Is running multipath -r safe while host is not in maintenance mode?
It should be safe; vdsm uses it in some cases.
If yes, as a temporary workaround I can patch vdsmd to run multipath -r
when e.g. monitoring the storage domain.
I suggested running multipath as a debugging aid; normally this is not needed.
You should see the lv on the shared storage without running multipath.
Zdenek, can you explain this?
> One warning that I keep seeing in vdsm logs on both nodes is this:
>
> Thread-1617881::WARNING::2014-02-24
> 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
> 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
> critical size: mdasize=134217728 mdafree=0
Can you share the output of the command below?
lvs -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
Here's the output for both hosts.
host1:
[root@host1 ~]# lvs -o uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
  LV UUID                                LV                                   Attr      LSize VFree   Ext     #Ext Free LV Tags                                                                              VMdaSize VMdaFree #LV #PV
  jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL 3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao--- 2.00g 114.62g 128.00m 1596 917  IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465 128.00m  0        13  2
This looks wrong - your vg_mda_free is zero - as vdsm complains.
Patch http://gerrit.ovirt.org/25408 should solve this issue.
It may also solve the other issue with the missing lv - I could not reproduce it yet.
Can you try to apply this patch and report the results?
Thanks, Nir
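
The vg_mda_free check discussed above could be scripted roughly as follows. This is a sketch under assumptions: report_full_mda is a hypothetical name, it expects rows of "vg_name vg_mda_size vg_mda_free" on stdin, and it only flags a free value of exactly zero, which is not the threshold logic vdsm uses for its warning.

```shell
#!/bin/sh
# Hypothetical helper: reads "vg_name vg_mda_size vg_mda_free" rows on
# stdin and prints every VG whose free metadata space is zero.
report_full_mda() {
    # NF >= 3 skips blank or short lines; $3 + 0 forces a numeric compare.
    awk 'NF >= 3 && $3 + 0 == 0 {
        printf "%s: mdasize=%s mdafree=%s\n", $1, $2, $3
    }'
}

# Intended use (needs root and LVM, so it is left commented out here):
#   vgs --noheadings --units b --nosuffix \
#       -o vg_name,vg_mda_size,vg_mda_free | report_full_mda
```

For the VG in this thread, such a check would be expected to report mdasize=134217728 and mdafree=0, matching the vdsm warning quoted earlier.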
This patch helped, indeed! I tried it on the non-SPM node (as that's
the node that I can currently easily put in maintenance) and the node
started picking up newly created volumes correctly. I also set
use_lvmetad to 0 in the main lvm.conf, because without it manually
running e.g. lvs was still using the metadata daemon.

I can't confirm yet that this helps with the metadata volume warning,
as that warning appears only on the SPM. I'll be able to put the SPM
node in maintenance soon and will report later.

This issue on Fedora makes me think - is Fedora still a fully supported
platform?

Best regards,
Boyan