[ovirt-users] storage redundancy in Ovirt
Adam Litke
alitke at redhat.com
Mon Apr 17 14:28:50 UTC 2017
On Mon, Apr 17, 2017 at 9:12 AM, FERNANDO FREDIANI <
fernando.frediani at upx.com> wrote:
> Should I understand, then, that fencing is not mandatory, just advised, and
> that there is no downtime for the rest of the cluster while a manual
> action is pending to confirm that a host has been rebooted, so that the SPM
> can move to another live host? Perhaps the only effect is not being able to
> add new hosts and connect them to storage, or something like that?
>
A missing SPM will not affect existing VMs. Running VMs (on hosts other
than the SPM host) will continue to run normally, and you can even start
existing VMs. Without an SPM you will not be able to manipulate the
storage (add/remove domains, add/remove disks, migrate disks, etc.).
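
To make the rules discussed in this thread concrete, here is a minimal
Python sketch. It is only an illustration of the behaviour described above,
not oVirt engine code; the operation names and both functions are
hypothetical.

```python
# Illustrative model only; not oVirt engine code. All names are hypothetical.

# Operations that keep working without an SPM, per the thread.
SPM_NOT_REQUIRED = {"keep_vm_running", "start_existing_vm"}

# Storage manipulation that needs an SPM, per the thread.
SPM_REQUIRED = {
    "add_domain",
    "remove_domain",
    "add_disk",
    "remove_disk",
    "migrate_disk",
}


def operation_allowed(operation: str, spm_available: bool) -> bool:
    """Return True if the operation can proceed in this simplified model."""
    if operation in SPM_NOT_REQUIRED:
        return True
    if operation in SPM_REQUIRED:
        return spm_available
    raise ValueError(f"unknown operation: {operation}")


def can_reassign_spm(
    spm_host_responsive: bool,
    fencing_succeeded: bool,
    manually_confirmed_rebooted: bool,
) -> bool:
    """Return True if the SPM role may move to another host.

    A non-responsive SPM host must either be fenced successfully or be
    manually confirmed as rebooted before another host may take over.
    """
    if spm_host_responsive:
        # The current SPM host still holds the role; nothing to move.
        return False
    return fencing_succeeded or manually_confirmed_rebooted


if __name__ == "__main__":
    # No power management configured and no manual confirmation yet:
    print(can_reassign_spm(False, False, False))
    # VMs can still be started while the SPM is missing:
    print(operation_allowed("start_existing_vm", spm_available=False))
```

The second function mirrors Nir's point quoted below: without working power
management, the only way forward is the manual confirmation.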
> Fernando
> On 17/04/2017 10:06, Nir Soffer wrote:
>
> On Mon, Apr 17, 2017 at 8:24 AM Konstantin Raskoshnyi <konrasko at gmail.com>
> wrote:
>
>> But actually, it didn't work well. After the main SPM host went down I
>> see this:
>> [image: Screen Shot 2017-04-16 at 10.22.00 PM.png]
>>
>> 2017-04-17 05:23:15,554Z ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
>> (DefaultQuartzScheduler5) [4dcc033d-26bf-49bb-bfaa-03a970dbbec1] SPM
>> Init: could not find reported vds or not up - pool: 'STG' vds_spm_id: '1'
>> 2017-04-17 05:23:15,567Z INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
>> (DefaultQuartzScheduler5) [4dcc033d-26bf-49bb-bfaa-03a970dbbec1] SPM
>> selection - vds seems as spm 'tank5'
>> 2017-04-17 05:23:15,567Z WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
>> (DefaultQuartzScheduler5) [4dcc033d-26bf-49bb-bfaa-03a970dbbec1] spm vds
>> is non responsive, stopping spm selection.
>>
>> So does that mean the SPM host can be switched automatically only if
>> the BMC is up?
>>
>
> BMC?
>
> If your SPM is not responsive, the system will try to fence it. Did you
> configure power management for all hosts? Did you check that it
> works? How did you simulate a non-responsive host?
>
> If power management is not configured or fails, the system cannot
> move the SPM to another host unless you manually confirm that the
> SPM host was rebooted.
>
> Nir
>
>
>>
>> Thanks
>>
>> On Sun, Apr 16, 2017 at 8:29 PM, Konstantin Raskoshnyi <
>> konrasko at gmail.com> wrote:
>>
>>> Oh, fence agent works fine if I select ilo4,
>>> Thank you for your help!
>>>
>>> On Sun, Apr 16, 2017 at 8:22 PM Dan Yasny <dyasny at gmail.com> wrote:
>>>
>>>> On Sun, Apr 16, 2017 at 11:19 PM, Konstantin Raskoshnyi <
>>>> konrasko at gmail.com> wrote:
>>>>
>>>>> Makes sense.
>>>>> I was trying to set it up, but it doesn't work with our staging hardware.
>>>>> We have old ilo100; I'll try again.
>>>>> Thanks!
>>>>>
>>>>>
>>>> It is absolutely necessary for any HA to work properly. There is, of
>>>> course, the "confirm host has been shutdown" option, which serves as an
>>>> override for the fence command, but it's manual.
>>>>
>>>>
>>>>> On Sun, Apr 16, 2017 at 8:18 PM Dan Yasny <dyasny at gmail.com> wrote:
>>>>>
>>>>>> On Sun, Apr 16, 2017 at 11:15 PM, Konstantin Raskoshnyi <
>>>>>> konrasko at gmail.com> wrote:
>>>>>>
>>>>>>> Fence agent under each node?
>>>>>>>
>>>>>>
>>>>>> When you configure a host, there's the power management tab, where
>>>>>> you need to enter the BMC details for the host. If you don't have fencing
>>>>>> enabled, how do you expect the system to make sure a host running a service
>>>>>> is actually down (and it is safe to start HA services elsewhere), and not,
>>>>>> for example, just unreachable by the engine? How do you avoid a split brain
>>>>>> -> SBA?
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Sun, Apr 16, 2017 at 8:14 PM Dan Yasny <dyasny at gmail.com> wrote:
>>>>>>>
>>>>>>>> On Sun, Apr 16, 2017 at 11:13 PM, Konstantin Raskoshnyi <
>>>>>>>> konrasko at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> "Corner cases"?
>>>>>>>>> I tried to simulate a crash of the SPM server, and oVirt kept trying to
>>>>>>>>> re-establish the connection to the failed node.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Did you configure fencing?
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Apr 16, 2017 at 8:10 PM Dan Yasny <dyasny at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On Sun, Apr 16, 2017 at 7:29 AM, Nir Soffer <nsoffer at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Sun, Apr 16, 2017 at 2:05 PM Dan Yasny <dyasny at redhat.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Apr 16, 2017 7:01 AM, "Nir Soffer" <nsoffer at redhat.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Apr 16, 2017 at 4:17 AM Dan Yasny <dyasny at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> When you set up a storage domain, you need to specify a host
>>>>>>>>>>>>> to perform the initial storage operations, but once the SD is defined, its
>>>>>>>>>>>>> details are in the engine database, and all the hosts get connected to it
>>>>>>>>>>>>> directly. If the first host you used to define the SD goes down, all other
>>>>>>>>>>>>> hosts will still remain connected and work. SPM is an HA service, and if
>>>>>>>>>>>>> the current SPM host goes down, SPM gets started on another host in the DC.
>>>>>>>>>>>>> In short, unless your actual NFS exporting host goes down, there is no
>>>>>>>>>>>>> outage.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> There is no storage outage, but if you shut down the spm host,
>>>>>>>>>>>> the spm role will not move to a new host until the spm host is
>>>>>>>>>>>> online again, or you confirm manually that the spm host was
>>>>>>>>>>>> rebooted.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In a properly configured setup the SBA should take care of
>>>>>>>>>>>> that. That's the whole point of HA services
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In some cases like power loss or hardware failure, there is no
>>>>>>>>>>> way to start
>>>>>>>>>>> the spm host, and the system cannot recover automatically.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> There are always corner cases, no doubt. But in a normal
>>>>>>>>>> situation, where an SPM host goes down because of a hardware failure, it
>>>>>>>>>> gets fenced, other hosts contend for SPM and start it. No surprises there.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Nir
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Nir
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Apr 15, 2017 at 1:53 PM, Konstantin Raskoshnyi <
>>>>>>>>>>>>> konrasko at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Fernando,
>>>>>>>>>>>>>> I see each host has a direct NFS mount, but yes, if the
>>>>>>>>>>>>>> main host through which I connected the NFS storage goes down, the storage
>>>>>>>>>>>>>> becomes unavailable and all VMs are down.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Apr 15, 2017 at 10:37 AM FERNANDO FREDIANI <
>>>>>>>>>>>>>> fernando.frediani at upx.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello Konstantin.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It doesn't make much sense to make a whole cluster depend on
>>>>>>>>>>>>>>> a single host. From what I know, every host talks directly to the NFS
>>>>>>>>>>>>>>> storage array or whatever other shared storage you have.
>>>>>>>>>>>>>>> Have you tested whether that host going down affects the
>>>>>>>>>>>>>>> others when NFS is mounted directly from an NFS storage array?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fernando
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-04-15 12:42 GMT-03:00 Konstantin Raskoshnyi <
>>>>>>>>>>>>>>> konrasko at gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In oVirt you have to attach storage through a specific host.
>>>>>>>>>>>>>>>> If that host goes down, storage is not available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Apr 15, 2017 at 7:31 AM FERNANDO FREDIANI <
>>>>>>>>>>>>>>>> fernando.frediani at upx.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Well, make it not go through host1: dedicate a storage
>>>>>>>>>>>>>>>>> server to running NFS and make both hosts connect to it.
>>>>>>>>>>>>>>>>> In my view NFS is much easier to manage than any other
>>>>>>>>>>>>>>>>> type of storage, especially FC and iSCSI, and performance is pretty much
>>>>>>>>>>>>>>>>> the same, so you won't get better results, other than in management, by
>>>>>>>>>>>>>>>>> going to another type.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fernando
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-04-15 5:25 GMT-03:00 Konstantin Raskoshnyi <
>>>>>>>>>>>>>>>>> konrasko at gmail.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi guys,
>>>>>>>>>>>>>>>>>> I have one nfs storage,
>>>>>>>>>>>>>>>>>> it's connected through host1.
>>>>>>>>>>>>>>>>>> host2 also has access to it, I can easily migrate
>>>>>>>>>>>>>>>>>> vms between them.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The question is - if host1 is down - all infrastructure
>>>>>>>>>>>>>>>>>> is down, since all traffic goes through host1,
>>>>>>>>>>>>>>>>>> is there any way in oVirt to use redundant storage?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Only glusterfs?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> Users mailing list
>>>>>>>>>>>>>>>>>> Users at ovirt.org
>>>>>>>>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
--
Adam Litke
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 33452 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20170417/75932cb7/attachment-0001.png>