On Mon, May 12, 2014 at 01:26:33PM +0200, Jiri Moskovcak wrote:
> On 05/12/2014 11:22 AM, Dan Kenigsberg wrote:
>> On Wed, May 07, 2014 at 02:16:34PM +0200, Jiri Moskovcak wrote:
>>> On 05/07/2014 11:37 AM, Dan Kenigsberg wrote:
>>>> On Wed, May 07, 2014 at 09:56:03AM +0200, Jiri Moskovcak wrote:
>>>>> On 05/07/2014 09:28 AM, Nir Soffer wrote:
>>>>>> ----- Original Message -----
>>>>>>> From: "Jiri Moskovcak" <jmoskovc(a)redhat.com>
>>>>>>> To: "Nir Soffer" <nsoffer(a)redhat.com>
>>>>>>> Cc: devel(a)ovirt.org, "Federico Simoncelli"
<fsimonce(a)redhat.com>, "Allon Mureinik" <amureini(a)redhat.com>,
"Greg
>>>>>>> Padgett" <gpadgett(a)redhat.com>, "Doron
Fediuck" <dfediuck(a)redhat.com>
>>>>>>> Sent: Wednesday, May 7, 2014 10:21:28 AM
>>>>>>> Subject: Re: [ovirt-devel] vdsm disabling logical volumes
>>>>>>>
>>>>>>> On 05/05/2014 03:19 PM, Nir Soffer wrote:
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Jiri Moskovcak"
<jmoskovc(a)redhat.com>
>>>>>>>>> To: "Nir Soffer"
<nsoffer(a)redhat.com>
>>>>>>>>> Cc: devel(a)ovirt.org, "Federico Simoncelli"
<fsimonce(a)redhat.com>, "Allon
>>>>>>>>> Mureinik" <amureini(a)redhat.com>,
"Greg
>>>>>>>>> Padgett" <gpadgett(a)redhat.com>
>>>>>>>>> Sent: Monday, May 5, 2014 3:44:21 PM
>>>>>>>>> Subject: Re: [ovirt-devel] vdsm disabling logical
volumes
>>>>>>>>>
>>>>>>>>> On 05/05/2014 02:37 PM, Nir Soffer wrote:
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: "Jiri Moskovcak"
<jmoskovc(a)redhat.com>
>>>>>>>>>>> To: "Nir Soffer"
<nsoffer(a)redhat.com>
>>>>>>>>>>> Cc: devel(a)ovirt.org, "Federico
Simoncelli" <fsimonce(a)redhat.com>, "Allon
>>>>>>>>>>> Mureinik" <amureini(a)redhat.com>,
"Greg
>>>>>>>>>>> Padgett" <gpadgett(a)redhat.com>
>>>>>>>>>>> Sent: Monday, May 5, 2014 3:16:37 PM
>>>>>>>>>>> Subject: Re: [ovirt-devel] vdsm disabling
logical volumes
>>>>>>>>>>>
>>>>>>>>>>> On 05/05/2014 12:01 AM, Nir Soffer wrote:
>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>> From: "Jiri Moskovcak"
<jmoskovc(a)redhat.com>
>>>>>>>>>>>>> To: "Nir Soffer"
<nsoffer(a)redhat.com>
>>>>>>>>>>>>> Cc: devel(a)ovirt.org
>>>>>>>>>>>>> Sent: Sunday, May 4, 2014 9:23:49 PM
>>>>>>>>>>>>> Subject: Re: [ovirt-devel] vdsm
disabling logical volumes
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 05/04/2014 07:57 PM, Nir Soffer wrote:
>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>> From: "Jiri
Moskovcak" <jmoskovc(a)redhat.com>
>>>>>>>>>>>>>>> To: devel(a)ovirt.org
>>>>>>>>>>>>>>> Sent: Sunday, May 4, 2014
8:08:33 PM
>>>>>>>>>>>>>>> Subject: [ovirt-devel] vdsm
disabling logical volumes
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Greetings vdsm developers!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> While working on adding iSCSI support to the hosted engine tools, I ran
>>>>>>>>>>>>>>> into a problem with vdsm. It seems that when stopped, vdsm deactivates
>>>>>>>>>>>>>>> ALL logical volumes in its volume group, and when it starts it
>>>>>>>>>>>>>>> reactivates only specific logical volumes. This is a problem for the
>>>>>>>>>>>>>>> hosted engine tools, as they create logical volumes in the same volume
>>>>>>>>>>>>>>> group, and when vdsm deactivates the LVs the hosted engine tools have
>>>>>>>>>>>>>>> no way to reactivate them, because the services drop root permissions
>>>>>>>>>>>>>>> and run as vdsm, and apparently only root can activate LVs.
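For reference, a minimal sketch (not vdsm's or the hosted-engine tools'
actual code) of the LVM-level operations being discussed; the vg/lv names
are placeholders, and both operations require root privileges:

    import subprocess

    def set_lv_active(vg_name, lv_name, active):
        # "lvchange -ay" activates the LV (creates its device node),
        # "lvchange -an" deactivates it; both require root.
        flag = "-ay" if active else "-an"
        subprocess.check_call(
            ["lvm", "lvchange", flag, "%s/%s" % (vg_name, lv_name)])

    # e.g. what the HA services would need to do after a vdsm restart:
    # set_lv_active("<vdsm-vg-uuid>", "ha_agent-hosted-engine.lockspace", True)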
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you describe what volumes you are creating, and why?
>>>>>>>>>>>>>
>>>>>>>>>>>>> We create hosted-engine.lockspace (for sanlock) and
>>>>>>>>>>>>> hosted-engine.metadata (keeps data about hosted engine hosts)
>>>>>>>>>>>>
>>>>>>>>>>>> Do you create these lvs in every vdsm vg?
>>>>>>>>>>>
>>>>>>>>>>> - only in the first vg created by vdsm while deploying hosted-engine
>>>>>>>>
>>>>>>>> It seems that the hosted engine has a single point of failure - the
>>>>>>>> random vg that contains the hosted engine data.
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Is this part of the domain structure used by hosted engine, or does
>>>>>>>>>>>> it have nothing to do with the storage domain?
>>>>>>>>>>>
>>>>>>>>>>> - sorry, I don't understand this question. How can I tell if it has
>>>>>>>>>>> something to do with the storage domain? It's for storing data about
>>>>>>>>>>> the hosts set up to run the hosted-engine, the state of the engine,
>>>>>>>>>>> and the state of the VM running the engine.
>>>>>>>>>>
>>>>>>>>>> Can you tell us exactly what lvs you are creating, and on which vg?
>>>>>>>>>>
>>>>>>>>>> And how are you creating those lvs - I guess through vdsm?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> - no, the hosted-engine tools do that by calling:
>>>>>>>>>
>>>>>>>>> lvc = popen(stdin=subprocess.PIPE, stdout=subprocess.PIPE,
>>>>>>>>>             stderr=subprocess.PIPE,
>>>>>>>>>             args=["lvm", "lvcreate", "-L", str(size_bytes)+"B",
>>>>>>>>>                   "-n", lv_name, vg_uuid])
>>>>>>>>> ..
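A self-contained sketch of roughly what this call does, assuming popen is a
thin wrapper around subprocess.Popen; size_bytes, lv_name and vg_uuid are
supplied by the caller, and the error handling here is illustrative, not
taken from the hosted-engine code:

    import subprocess

    def create_lv(vg_uuid, lv_name, size_bytes):
        # Invoke "lvm lvcreate" and wait for it; requires root privileges.
        p = subprocess.Popen(
            ["lvm", "lvcreate", "-L", str(size_bytes) + "B",
             "-n", lv_name, vg_uuid],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            stderr=subprocess.PIPE)
        out, err = p.communicate()
        if p.returncode != 0:
            raise RuntimeError("lvcreate failed: %s" % err)
        return out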
>>>>>>>>
>>>>>>>> How do you ensure that another host is not modifying the same vg at
>>>>>>>> the same time?
>>>>>>>>
>>>>>>>> If you are not ensuring this, you will corrupt this vg sooner or later.
>>>>>>>>
>>>>>>>> When a storage domain is detached from a host, for example when the
>>>>>>>> host is in maintenance mode, lvs on the shared storage may be deleted,
>>>>>>>> invalidating the device mapper maps for these devices. If you write to
>>>>>>>> an lv with wrong maps, you may be writing to an extent belonging to
>>>>>>>> another lv, corrupting that lv's data, or even worse corrupting the
>>>>>>>> engine vg data.
>>>>>>>>
>>>>>>>> How do you ensure that the lvs are not deleted while you are using them?
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> The output of the lvs command on a host with hosted engine installed
>>>>>>>>>> will help us understand what you are doing, and then we can think
>>>>>>>>>> more clearly about the best way to support this in vdsm.
>>>>>>>>>
>>>>>>>>> The output of lvs: http://fpaste.org/99196/93619139/
>>>>>>>>>
>>>>>>>>> HE created these two LVs:
>>>>>>>>> ha_agent-hosted-engine.lockspace
>>>>>>>>> ha_agent-hosted-engine.metadata
>>>>>>>>
>>>>>>>> Why do you create these lvs on a vg owned by vdsm?
>>>>>>>>
>>>>>>>> If you want total control of these lvs, I suggest that you create your
>>>>>>>> own vg and put whatever lvs you like there.
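For illustration, a minimal sketch of that suggestion, assuming a dedicated
volume group just for the hosted-engine data; the device path, vg name and
sizes are placeholders, not values taken from this thread:

    import subprocess

    def run_lvm(*args):
        # Thin helper around the lvm CLI; requires root privileges.
        subprocess.check_call(("lvm",) + args)

    # Hypothetical dedicated vg holding only the hosted-engine LVs.
    run_lvm("vgcreate", "hosted-engine-vg", "/dev/mapper/<lun-device>")
    run_lvm("lvcreate", "-L", "128M", "-n", "hosted-engine.lockspace",
            "hosted-engine-vg")
    run_lvm("lvcreate", "-L", "128M", "-n", "hosted-engine.metadata",
            "hosted-engine-vg")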
>>>>>>>>
>>>>>>>
>>>>>>> I would rather not go this way (at least not for 3.5), as it requires
>>>>>>> too many code changes in hosted-engine. On the other hand, the logic in
>>>>>>> vdsm seems wrong because it's not complementary (disabling all LVs and
>>>>>>> then enabling just some of them) and should be fixed anyway. This
>>>>>>> problem is blocking one of our 3.5 features, so I've created
>>>>>>> rhbz#1094657 to track it.
>>>>>>
>>>>>> Can you elaborate on this? How should vdsm behave better, and why?
>>>>>
>>>>> Sure. So far I haven't heard any reason why it behaves like this, and it
>>>>> seems illogical to disable *all* and then enable just *some*.
>>>>>
>>>>> How: Disabling and enabling operations should be complementary.
>>>>> Why: To be less surprising.
>>>>
>>>> There is an asymmetry between activation and deactivation of an LV. A
>>>> mistakenly-active LV can cause data corruption. Making sure that this
>>>> does not happen is more important than a new feature.
>>>
>>> - just out of curiosity, how can a mistakenly-active LV cause data
>>> corruption? Something like a stale LV which refers to a volume that
>>> doesn't exist anymore?
>>>
>>>>
>>>> We do not want to deactivate and then re-activate the same set of LVs.
>>>> That would be illogical. We intentionally deactivate LVs that are no
>>>> longer used on the specific host - that's important if a qemu died while
>>>> Vdsm was down, leaving a stale LV behind.
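To illustrate the asymmetry being described (this is not vdsm's actual code),
a sketch that deactivates only LVs whose device is not currently open, using
the sixth lv_attr character; the vg name is a placeholder and root privileges
are assumed:

    import subprocess

    def deactivate_unused_lvs(vg_name):
        # The 6th lv_attr character is 'o' while the LV's device is open.
        out = subprocess.check_output(
            ["lvm", "lvs", "--noheadings", "-o", "lv_name,lv_attr",
             vg_name]).decode()
        for line in out.splitlines():
            lv_name, lv_attr = line.split()
            if lv_attr[5] != "o":  # not open by any process or VM
                subprocess.check_call(
                    ["lvm", "lvchange", "-an", "%s/%s" % (vg_name, lv_name)])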
>>>>
>>>> Design-wise, Vdsm would very much like to keep its ownership of
>>>> Vdsm-created storage domains. Let us discuss how your feature can be
>>>> implemented without this breach of ownership.
>>>>
>>>
>>> Ok, I agree that this should have been discussed with the storage
>>> team at the design phase, so let's start from the beginning and try
>>> to come up with a better solution.
>>> My problem is that I need storage for the hosted-engine data which is
>>> accessible from all hosts. It seems logical to use the same physical
>>> storage as we use for "the storage". For NFS it's just a file in
>>> /rhev/data-center/mnt/<IP>\mountpoint/<UUID>/ha_agent/. So where/how do
>>> you suggest to store such data in the case of using lvm (iscsi in this
>>> case)? Can we use vdsm to set it up, or do we have to duplicate the lvm
>>> code and handle it ourselves?
>>
>> I think that for this to happen, we need to define a Vdsm verb that
>> creates a volume on a storage domain that is unrelated to any pool. Such
>> a verb is in planning; Federico, can its implementation be hastened in
>> favor of hosted engine?
>>
>> On its own, this would not solve the problem of Vdsm deactivating all
>> unused LVs.
>>
>> Jiri, could you describe why you keep your LV active, but not open?
>>
>
> - the setup flow goes approximately like this:
>
> 1. create the LVs for the hosted-engine
> 2. install the engine into the VM
> 3. add the host to the engine
>     - this causes a re-deploy of vdsm, which deactivates the LVs
> 4. start the ha-agent and ha-broker services which use the LVs
>
> - I guess we could move the creation of the LVs to after vdsm is
> re-deployed, just before the HE services are started, but it won't
> fix the problem if vdsm is restarted
It's not going to solve our design problem, but can the users of that LV
activate it just before they open it? This way there's only a small
window where vdsm can crash, restart, and deactivate the LV.
- no, it runs as the vdsm user, and it seems only root can activate LVs;
during startup, while it still has root privileges, it doesn't know anything
about the storage - the storage is connected on a client request, when it no
longer has root privileges
- we could use vdsm to activate the LV if it provides an API which allows that
--Jirka
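For context, a minimal sketch of the activate-just-before-open pattern
suggested above. It assumes the caller is allowed to run lvchange as root
(for example via a sudo rule), which, as noted, the HA services currently
are not; the vg/lv names are placeholders:

    import subprocess
    from contextlib import contextmanager

    @contextmanager
    def activated_lv(vg_name, lv_name):
        # Activate the LV right before use and deactivate it afterwards,
        # shrinking the window in which a vdsm restart leaves it inactive.
        path = "%s/%s" % (vg_name, lv_name)
        subprocess.check_call(["sudo", "-n", "lvm", "lvchange", "-ay", path])
        try:
            yield "/dev/%s" % path
        finally:
            subprocess.check_call(["sudo", "-n", "lvm", "lvchange", "-an", path])

    # Hypothetical usage by the HA broker:
    # with activated_lv("<vdsm-vg-uuid>", "ha_agent-hosted-engine.metadata") as dev:
    #     with open(dev, "rb") as f:
    #         data = f.read(512)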