[Engine-devel] Design wiki page for trusted compute pools integration with oVirt has been updated

Itamar Heim iheim at redhat.com
Sun Apr 28 09:35:05 UTC 2013


On 04/28/2013 11:34 AM, Doron Fediuck wrote:
> Hi Dave,
>
> Just to make sure I fully understand, I'll repeat your basic arguments;
>
> 1. It takes time to query a big number of hosts (hundreds).
>
> 2. When backend is booting, a user may start a VM on a host which was
> hacked during the downtime of the engine.
>
> If the above is your concern, it shouldn't be so.
> The reason is, that no host will become operational before you get a response
> from the attestation server and allow it to become operational. So a user
> cannot start a new VM on a non-operational host.

i'd do the queries per cluster, so hosts get unblocked cluster-by-cluster.
a cluster without an attestation service shouldn't block on this, of course.
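
A rough sketch of the per-cluster idea, in plain Java. Everything here is hypothetical: ClusterAttestation, Host and the attest callback are illustration names, not engine classes, and the callback only stands in for one aggregated query to the attestation server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: none of these types exist in the oVirt engine.
final class ClusterAttestation {

    // Minimal stand-in for a host record.
    static final class Host {
        final String name;
        final String cluster;
        final boolean clusterNeedsAttestation;

        Host(String name, String cluster, boolean clusterNeedsAttestation) {
            this.name = name;
            this.cluster = cluster;
            this.clusterNeedsAttestation = clusterNeedsAttestation;
        }
    }

    // Returns host names in the order they may become operational.
    // 'attest' stands in for one aggregated query: it receives the host
    // names of a single cluster and returns the trusted subset.
    static List<String> unblock(List<Host> hosts,
                                Function<List<String>, Set<String>> attest) {
        List<String> operational = new ArrayList<>();
        Map<String, List<String>> pending = new LinkedHashMap<>();

        for (Host h : hosts) {
            if (!h.clusterNeedsAttestation) {
                // Clusters without attestation are never held back.
                operational.add(h.name);
            } else {
                pending.computeIfAbsent(h.cluster, k -> new ArrayList<>())
                       .add(h.name);
            }
        }

        // One aggregated query per cluster; a cluster's hosts are released
        // as soon as that cluster's own answer is in.
        for (List<String> names : pending.values()) {
            Set<String> trusted = attest.apply(names);
            for (String name : names) {
                if (trusted.contains(name)) {
                    operational.add(name);
                }
            }
        }
        return operational;
    }
}
```

Hosts the callback does not report as trusted simply stay out of the result, which matches the point above that nothing becomes operational before an answer arrives.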

>
> What this means is that your thread may need to update the user by sending
> a periodic event that a large scale attestation operation is in progress.
> Other than that, maybe your thread can work in smaller groups if it gets
> better results? i.e., instead of one query for 300 hosts, maybe you can run
> 3 serialized queries for 100 hosts each?
> If this does not help, maybe you can run a short query for something like
> 10 hosts, which should get an answer relatively fast. Then you can issue a
> query for the other 290 hosts which will take longer. In this way the system
> may get 10 hosts to work with quite fast, and later on the other 290 hosts
> will join... So this can actually be configurable to a 2-phase process;
> a short query and a longer one. The admin can choose the short query size
> based on his setup, and the longer query can pick up all the other hosts.
> What do you think?
>
> Doron
>
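
The two batching options suggested above (serialized fixed-size queries, or a short query followed by one for the rest) could be sketched roughly as below. This is only an illustration: HostBatches and its methods are made-up names, and the short query size is the value the admin would configure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a host list into the batches that would be
// sent to the attestation server one after another.
final class HostBatches {

    // Two-phase plan: a short first batch of 'shortQuerySize' hosts,
    // then a single batch with all remaining hosts.
    static List<List<String>> twoPhase(List<String> hosts, int shortQuerySize) {
        if (shortQuerySize < 0) {
            throw new IllegalArgumentException("shortQuerySize must not be negative");
        }
        int cut = Math.min(shortQuerySize, hosts.size());
        List<List<String>> batches = new ArrayList<>();
        if (cut > 0) {
            batches.add(new ArrayList<>(hosts.subList(0, cut)));
        }
        if (cut < hosts.size()) {
            batches.add(new ArrayList<>(hosts.subList(cut, hosts.size())));
        }
        return batches;
    }

    // Serialized plan: consecutive batches of at most 'size' hosts each.
    static List<List<String>> chunks(List<String> hosts, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < hosts.size(); i += size) {
            int end = Math.min(i + size, hosts.size());
            batches.add(new ArrayList<>(hosts.subList(i, end)));
        }
        return batches;
    }
}
```

With 300 hosts, twoPhase(hosts, 10) gives one batch of 10 and one of 290, while chunks(hosts, 100) gives three batches of 100.
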
> ----- Original Message -----
>> From: "Wei D Chen" <wei.d.chen at intel.com>
>> To: "Doron Fediuck" <dfediuck at redhat.com>
>> Cc: "Oved Ourfalli" <ovedo at redhat.com>, engine-devel at ovirt.org
>> Sent: Saturday, April 27, 2013 9:36:44 AM
>> Subject: Re: [Engine-devel] Design wiki page for trusted compute pools integration with oVirt has been updated
>>
>> Hi,
>>
>> Our current plan is to add a new thread on the engine side that attests all
>> hosts at once (an aggregated query to the attestation server) whenever the
>> engine reboots. There is still one potential issue under extreme
>> conditions: with hundreds of nodes in a datacenter, attesting all hosts
>> may still take a couple of minutes, and a hacked, untrusted node may be
>> considered a trusted host until its latest status is received. So the worst
>> case in a datacenter with hundreds of nodes is:
>> 1. The engine goes down for some reason and boots up again; some trusted
>> nodes may be hacked and rebooted during this period.
>> 2. Our thread runs to get the status of every node (trusted/untrusted),
>> which may take a couple of minutes in a large datacenter.
>> 3. A user creates VMs on these hacked nodes, believing they are trusted
>> VMs launched on trusted nodes.
>> 4. Our thread gets the correct status of these untrusted nodes and sets
>> them to non-operational.
>> 5. All of the "trusted" VMs running on these untrusted nodes are expected
>> to migrate to other trusted nodes.
>>
>> So the question is: in a trusted cluster with hundreds of nodes, some VMs
>> that are expected to be created on trusted nodes may actually be created on
>> untrusted nodes instead, and this window may last a couple of minutes (the
>> worst case, in my view, is 10 minutes with 1000 nodes).
>> Is this acceptable from your point of view? Or do you have any other suggestion?
>>
>>
>> Best Regards,
>> Dave Chen
>>
>>
>>> -----Original Message-----
>>> From: Doron Fediuck [mailto:dfediuck at redhat.com]
>>> Sent: Sunday, April 21, 2013 11:58 PM
>>> To: Chen, Wei D
>>> Cc: Ofri Masad; Oved Ourfalli; engine-devel at ovirt.org
>>> Subject: Re: [Engine-devel] Design wiki page for trusted compute pools
>>> integration with oVirt has been updated
>>>
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Wei D Chen" <wei.d.chen at intel.com>
>>>> To: "Ofri Masad" <omasad at redhat.com>
>>>> Cc: "Oved Ourfalli" <ovedo at redhat.com>, engine-devel at ovirt.org
>>>> Sent: Sunday, April 21, 2013 4:00:55 PM
>>>> Subject: Re: [Engine-devel] Design wiki page for trusted compute pools
>>>> integration with oVirt has been updated
>>>>
>>>> Ofri,
>>>>
>>>> Absolutely right, an aggregated query is significantly faster than
>>>> separate queries. I agree with running an aggregated query when the engine
>>>> starts. Is it possible to invoke the attestation service in the engine's
>>>> initialization code block instead of a "quartz job"? Is there any class
>>>> similar to "InitVdsOnUpCommand" for the engine's initialization?
>>>>
>>>> Best Regards,
>>>> Dave Chen
>>>>
>>> org.ovirt.engine.core.bll.Backend.Initialize()
>>>
>>> Note you cannot block this method while waiting for results.
>>> Instead I suggest you fire a one-time background request from this method.
>>>
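
The "fire a one-time background request" advice might look roughly like the sketch below. It is not the actual Backend code: StartupAttestation, fireOnce and both callbacks are hypothetical names, and the aggregated query is represented by a plain Supplier. The initialization path would call fireOnce and return immediately, while the query and the state update run on a separate daemon thread.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of a non-blocking, one-time startup attestation.
final class StartupAttestation {

    // Single daemon worker so the background request never blocks
    // initialization and never keeps the JVM alive on shutdown.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "startup-attestation");
        t.setDaemon(true);
        return t;
    });

    private CompletableFuture<Void> inFlight;

    // Starts the aggregated query in the background on the first call and
    // returns at once; later calls return the same future without starting
    // a second query.
    synchronized CompletableFuture<Void> fireOnce(
            Supplier<Map<String, Boolean>> aggregatedQuery,
            Consumer<Map<String, Boolean>> applyTrustStates) {
        if (inFlight == null) {
            inFlight = CompletableFuture
                    .supplyAsync(aggregatedQuery, worker)
                    .thenAccept(applyTrustStates);
        }
        return inFlight;
    }
}
```

The returned future is only there so a caller can log or react to completion; the initialization method itself would not wait on it.
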
>>>
>>>> -----Original Message-----
>>>> From: Ofri Masad [mailto:omasad at redhat.com]
>>>> Sent: Sunday, April 21, 2013 3:29 PM
>>>> To: Chen, Wei D
>>>> Cc: Oved Ourfalli; engine-devel at ovirt.org; Itamar Heim
>>>> Subject: Re: [Engine-devel] Design wiki page for trusted compute pools
>>>> integration with oVirt has been updated
>>>>
>>>> Dave,
>>>>
>>>> If I'm not mistaken, there is a big difference between separate
>>>> queries to the attestation server and an aggregated one.
>>>> Is that true?
>>>>
>>>> Thanks,
>>>> Ofri
>>>>
>>>> ----- Original Message -----
>>>>> From: "Itamar Heim" <iheim at redhat.com>
>>>>> To: "Ofri Masad" <omasad at redhat.com>
>>>>> Cc: "Oved Ourfalli" <ovedo at redhat.com>, "Wei D Chen"
>>>>> <wei.d.chen at intel.com>, engine-devel at ovirt.org
>>>>> Sent: Sunday, April 21, 2013 10:20:17 AM
>>>>> Subject: Re: [Engine-devel] Design wiki page for trusted compute
>>>>> pools integration with oVirt has been updated
>>>>>
>>>>> On 04/21/2013 10:13 AM, Ofri Masad wrote:
>>>>>> Hi,
>>>>>> One more thing we need to think about for the second approach -
>>>>>> aggregated query. On engine start we need to determine the trust
>>>>>> state of all the hosts. Sending a separate query for each host
>>>>>> will overload the attestation host and the network. An initial
>>>>>> aggregated query needs to be sent when the engine starts.
>>>>>> The same thing can happen after a management network failure and so on.
>>>>>> Maybe we can run a quartz job every x minutes, checking if a large
>>>>>> part of the hosts in the cluster (like 30%) are untrusted - in
>>>>>> that case run the aggregated query.
>>>>>
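
The periodic check proposed above could be sketched as follows. To keep the example self-contained it uses the JDK's ScheduledExecutorService in place of a Quartz job, and all names (UntrustedRatioCheck, needsAggregatedQuery, schedule) are hypothetical. Treating "a large part" as "at or above the threshold" is also an assumption; the 30% figure comes from the suggestion itself.

```java
import java.util.Collection;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: decide periodically whether an aggregated
// attestation query is worth running for a cluster.
final class UntrustedRatioCheck {

    // trustFlags holds one entry per host: true = trusted, false = untrusted.
    // Returns true when the untrusted share reaches the threshold (e.g. 0.30).
    static boolean needsAggregatedQuery(Collection<Boolean> trustFlags,
                                        double threshold) {
        if (trustFlags.isEmpty()) {
            return false;
        }
        long untrusted = trustFlags.stream().filter(t -> !t).count();
        return (double) untrusted / trustFlags.size() >= threshold;
    }

    // Runs the check every 'periodMinutes' minutes and triggers the
    // aggregated query only when the threshold is reached. The caller
    // is responsible for shutting the returned scheduler down.
    static ScheduledExecutorService schedule(
            Supplier<Collection<Boolean>> currentTrustFlags,
            Runnable aggregatedQuery,
            long periodMinutes,
            double threshold) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            if (needsAggregatedQuery(currentTrustFlags.get(), threshold)) {
                aggregatedQuery.run();
            }
        }, periodMinutes, periodMinutes, TimeUnit.MINUTES);
        return timer;
    }
}
```

Whether this is needed at all still depends on the question that follows, i.e. how heavy a single call to the attestation service really is.
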
>>>>> are we sure this optimization is needed?
>>>>> how heavy/latent is the call to the attestation service?
>>>>>
>>>> _______________________________________________
>>>> Engine-devel mailing list
>>>> Engine-devel at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/engine-devel
>>>>



