On Wed, May 07, 2014 at 09:56:03AM +0200, Jiri Moskovcak wrote:
> On 05/07/2014 09:28 AM, Nir Soffer wrote:
>> ----- Original Message -----
>>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>>> To: "Nir Soffer" <nsoffer(a)redhat.com>
>>> Cc: devel(a)ovirt.org, "Federico Simoncelli" <fsimonce(a)redhat.com>,
>>> "Allon Mureinik" <amureini(a)redhat.com>, "Greg Padgett" <gpadgett(a)redhat.com>,
>>> "Doron Fediuck" <dfediuck(a)redhat.com>
>>> Sent: Wednesday, May 7, 2014 10:21:28 AM
>>> Subject: Re: [ovirt-devel] vdsm disabling logical volumes
>>>
>>> On 05/05/2014 03:19 PM, Nir Soffer wrote:
>>>> ----- Original Message -----
>>>>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>>>>> To: "Nir Soffer" <nsoffer(a)redhat.com>
>>>>> Cc: devel(a)ovirt.org, "Federico Simoncelli" <fsimonce(a)redhat.com>,
>>>>> "Allon Mureinik" <amureini(a)redhat.com>, "Greg Padgett" <gpadgett(a)redhat.com>
>>>>> Sent: Monday, May 5, 2014 3:44:21 PM
>>>>> Subject: Re: [ovirt-devel] vdsm disabling logical volumes
>>>>>
>>>>> On 05/05/2014 02:37 PM, Nir Soffer wrote:
>>>>>> ----- Original Message -----
>>>>>>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>>>>>>> To: "Nir Soffer" <nsoffer(a)redhat.com>
>>>>>>> Cc: devel(a)ovirt.org, "Federico Simoncelli" <fsimonce(a)redhat.com>,
>>>>>>> "Allon Mureinik" <amureini(a)redhat.com>, "Greg Padgett" <gpadgett(a)redhat.com>
>>>>>>> Sent: Monday, May 5, 2014 3:16:37 PM
>>>>>>> Subject: Re: [ovirt-devel] vdsm disabling logical volumes
>>>>>>>
>>>>>>> On 05/05/2014 12:01 AM, Nir Soffer wrote:
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>>>>>>>>> To: "Nir Soffer" <nsoffer(a)redhat.com>
>>>>>>>>> Cc: devel(a)ovirt.org
>>>>>>>>> Sent: Sunday, May 4, 2014 9:23:49 PM
>>>>>>>>> Subject: Re: [ovirt-devel] vdsm disabling logical volumes
>>>>>>>>>
>>>>>>>>> On 05/04/2014 07:57 PM, Nir Soffer wrote:
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>>>>>>>>>>> To: devel(a)ovirt.org
>>>>>>>>>>> Sent: Sunday, May 4, 2014 8:08:33 PM
>>>>>>>>>>> Subject: [ovirt-devel] vdsm disabling logical volumes
>>>>>>>>>>>
>>>>>>>>>>> Greetings vdsm developers!
>>>>>>>>>>>
>>>>>>>>>>> While working on adding iSCSI support to the hosted engine tools, I
>>>>>>>>>>> ran into a problem with vdsm. It seems that when stopped, vdsm
>>>>>>>>>>> deactivates ALL logical volumes in its volume group, and when it
>>>>>>>>>>> starts it reactivates only specific logical volumes. This is a
>>>>>>>>>>> problem for the hosted engine tools, as they create logical volumes
>>>>>>>>>>> in the same volume group, and when vdsm deactivates the LVs the
>>>>>>>>>>> hosted engine tools don't have a way to reactivate them, because the
>>>>>>>>>>> services drop root permissions and run as vdsm, and apparently
>>>>>>>>>>> only root can activate LVs.
>>>>>>>>>>
>>>>>>>>>> Can you describe what volumes you are creating, and why?
>>>>>>>>>
>>>>>>>>> We create hosted-engine.lockspace (for sanlock) and
>>>>>>>>> hosted-engine.metadata (keeps data about hosted engine hosts)
>>>>>>>>
>>>>>>>> Do you create these lvs in every vdsm vg?
>>>>>>>
>>>>>>> - only in the first vg created by vdsm while deploying hosted-engine
>>>>
>>>> It seems that the hosted engine has a single point of failure - the random
>>>> vg that contains hosted engine data.
>>>>
>>>>>>>
>>>>>>>> Is this part of the domain structure used by hosted engine, or
>>>>>>>> does it have nothing to do with the storage domain?
>>>>>>>
>>>>>>> - sorry, I don't understand this question. How can I tell if it has
>>>>>>> something to do with the storage domain? It's for storing data about
>>>>>>> hosts set up to run the hosted-engine and data about the state of the
>>>>>>> engine and the state of the VM running the engine.
>>>>>>
>>>>>> Can you tell us exactly what lvs you are creating, and on which vg?
>>>>>>
>>>>>> And how are you creating those lvs - I guess through vdsm?
>>>>>>
>>>>>
>>>>> - no, the hosted-engine tools do that by calling:
>>>>>
>>>>> lvc = popen(stdin=subprocess.PIPE, stdout=subprocess.PIPE,
>>>>>             stderr=subprocess.PIPE,
>>>>>             args=["lvm", "lvcreate", "-L", str(size_bytes)+"B",
>>>>>                   "-n", lv_name, vg_uuid])
>>>>> ..
>>>>
>>>> How do you ensure that another host is not modifying the same vg at the
>>>> same time?
>>>>
>>>> If you are not ensuring this, you will corrupt this vg sooner or later.
>>>>
>>>> When a storage domain is detached from a host, for example when the host
>>>> is in maintenance mode, lvs on the shared storage may be deleted,
>>>> invalidating the device mapper maps for these devices. If you write to
>>>> an lv with wrong maps, you may be writing to an extent belonging to
>>>> another lv, corrupting that lv's data, or even worse, corrupting the
>>>> engine vg data.
>>>>
>>>> How do you ensure that the lvs are not deleted while you are using them?
>>>>
>>>>>
>>>>>> The output of the lvs command on a host with hosted engine installed
>>>>>> will help us to understand what you are doing, and then we can think
>>>>>> more clearly about what would be the best way to support this in vdsm.
>>>>>
>>>>> The output of lvs: http://fpaste.org/99196/93619139/
>>>>>
>>>>> HE created these two LVs:
>>>>> ha_agent-hosted-engine.lockspace
>>>>> ha_agent-hosted-engine.metadata
>>>>
>>>> Why do you create these lvs on a vg owned by vdsm?
>>>>
>>>> If you want total control of these lvs, I suggest that you create your
>>>> own vg and put whatever lvs you like there.
>>>>
>>>
>>> I would rather not go this way (at least not for 3.5) as it's too many
>>> code changes in hosted-engine. On the other hand the logic in vdsm seems
>>> wrong because it's not complementary (disabling all LVs and then
>>> enabling just some of them) and should be fixed anyway. This problem is
>>> blocking one of our 3.5 features so I've created rhbz#1094657 to track it.
>>
>> Can you elaborate on this? How should vdsm behave better, and why?
>
> Sure. So far I didn't hear any reason why it behaves like this and
> it seems not logical to disable *all* and then enable just *some*.
>
> How: Disabling and enabling operations should be complementary.
> Why: To be less surprising.
There is an asymmetry between activation and deactivation of an LV. A
mistakenly-active LV can cause data corruption. Making sure that this
does not happen is more important than a new feature.

- just out of curiosity, how can a mistakenly-active LV cause data
corruption? Something like a stale LV which refers to a volume which
doesn't exist anymore?

We do not want to deactivate and then re-activate the same set of LVs.
That would be illogical. We intentionally deactivate LVs that are no
longer used on the specific host - that's important if a qemu died while
Vdsm was down, leaving a stale LV behind.

Design-wise, Vdsm would very much like to keep its ownership of
Vdsm-created storage domains. Let us discuss how your feature can be
implemented without this breach of ownership.

Ok, I agree that this should have been discussed with the storage team
at the design phase, so let's start from the beginning and try to come
up with a better solution.

My problem is that I need storage for the hosted-engine data which
is accessible from all hosts. It seems logical to use the same physical
storage as we use for "the storage". For NFS it's just a file in
/rhev/data-center/mnt/<IP>\mountpoint/<UUID>/ha_agent/. So where/how do
you suggest storing such data when using lvm (iscsi in this
case)? Can we use vdsm to set it up, or do we have to duplicate the lvm
code and handle it ourselves?

Thanks,
Jirka
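
For reference, the lvcreate call quoted earlier in the thread could be
written as a self-contained sketch along the following lines. The helper
names build_lvcreate_args and create_lv are hypothetical, not taken from
the hosted-engine code. The sketch assumes the lvm binary is on PATH and
that the caller has enough privileges to create LVs, and it takes no
cluster-wide lock, so the concern about concurrent modification of the vg
raised in the thread still applies.

```python
import subprocess


def build_lvcreate_args(size_bytes, lv_name, vg_name):
    """Return the argv list for creating an LV of size_bytes in vg_name.

    Hypothetical helper: it only builds the command line, mirroring the
    arguments of the call quoted in the thread.
    """
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    # The "B" suffix tells lvcreate that the size is given in bytes.
    return ["lvm", "lvcreate", "-L", "%dB" % size_bytes,
            "-n", lv_name, vg_name]


def create_lv(size_bytes, lv_name, vg_name):
    """Run lvcreate and raise RuntimeError if it exits non-zero.

    Assumption: the process is privileged enough to create LVs. No
    locking is done here, so nothing prevents another host from
    modifying the same vg at the same time.
    """
    args = build_lvcreate_args(size_bytes, lv_name, vg_name)
    proc = subprocess.Popen(args, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError("lvcreate failed (rc=%d): %s"
                           % (proc.returncode, err.decode(errors="replace")))
    return out
```

Keeping the argument construction separate from the process call makes it
possible to check the command line without touching real storage.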