----- Original Message -----
From: "Itamar Heim" <iheim(a)redhat.com>
To: "Sven Knohsalla" <s.knohsalla(a)netbiscuits.com>
Cc: users(a)ovirt.org
Sent: Monday, October 15, 2012 8:36:07 PM
Subject: Re: [Users] ITA-2967 URGENT: ovirt Node turns status to "non operational" STORAGE_DOMAIN_UNREACHABLE
On 10/15/2012 03:56 PM, Sven Knohsalla wrote:
> Hi,
>
> Sometimes one hypervisor's status turns to "Non-operational" with the
> error "STORAGE_DOMAIN_UNREACHABLE" and live migration (enabled for all
> VMs) starts.
>
> I currently don't know why the ovirt-node turns to this status,
> because the connected iSCSI SAN is available the whole time (checked
> via the iscsi session and lsblk); I'm also able to read/write on the
> SAN during that time.
>
> We can simply activate this ovirt-node and it comes up again. The
> migration process then runs from scratch and hits the same error
> -> a reboot of the ovirt-node is necessary!
>
> When a hypervisor turns to "Non-operational" status, live migration
> starts and tries to migrate ~25 VMs (~100 GB of RAM to migrate).
>
> During that process the network load goes to 100%, some VMs get
> migrated, and then the destination host also turns to
> "Non-operational" status with the error "STORAGE_DOMAIN_UNREACHABLE".
>
> Many VMs are still running on their origin host, some are paused, and
> some are showing "migration from" status.
>
> After a reboot of the origin host, the VMs of course turn into an
> unknown state.
>
> So the whole cluster is down :/
>
> For this problem I have some questions:
>
> -Does ovirt-engine use only the ovirtmgmt network for migration/HA?
Yes.
>
> -If so, is there any possibility to *add* or switch to another
> network for migration/HA?
You can bond; you cannot yet add another one.
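
For reference, a rough sketch of what a bonded ovirtmgmt uplink can
look like on the node side with the standard ifcfg network scripts
(the NIC names em1/em2, the bonding mode and the IP address are
placeholders, not taken from your setup; in practice the bond is
normally created via the engine's host network setup rather than by
hand):

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    ONBOOT=yes
    BONDING_OPTS="mode=4 miimon=100"   # 802.3ad, needs matching switch config
    BRIDGE=ovirtmgmt

    # /etc/sysconfig/network-scripts/ifcfg-em1 (em2 analogous)
    DEVICE=em1
    ONBOOT=yes
    MASTER=bond0
    SLAVE=yes

    # /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
    DEVICE=ovirtmgmt
    TYPE=Bridge
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=192.0.2.10        # placeholder address
    NETMASK=255.255.255.0

If the switch side is not set up for LACP, mode=1 (active-backup) is
the safer choice.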
>
> -Is the way we are using live migration not recommended?
>
> -Which engine module checks the availability of the storage domain
> for the ovirt-nodes?
The engine.
>
> -Is there any timeout/cache option we can set/increase to avoid this
> problem?
Well, it's not clear what the problem is.
Also, vdsm is supposed to throttle live migration to 3 VMs in
parallel, IIRC.
Also, at the cluster level you can configure it not to live-migrate
VMs on non-operational status.
>
> -Is there any known problem with the versions we are using?
> (Migration to ovirt-engine 3.1 is not possible atm.)
Oh, the cluster-level migration policy on non-operational may be a 3.1
feature, not sure.
AFAIR, it's in 3.0.
>
> -Is it possible to modify the migration queue so that it migrates a
> maximum of, for example, 4 VMs at the same time?
Yes, there is a vdsm config option for that. I am pretty sure 3 is the
default, though?
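
As a sketch only (option name from memory, please verify against the
vdsm version you are running), the throttle sits in /etc/vdsm/vdsm.conf
on each node:

    # /etc/vdsm/vdsm.conf
    [vars]
    # maximum number of outgoing live migrations vdsm runs in parallel
    max_outgoing_migrations = 4

followed by a restart of vdsmd on that node (service vdsmd restart) for
the change to take effect.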
>
> _ovirt-engine:_
>
> FC 16: 3.3.6-3.fc16.x86_64
>
> Engine: 3.0.0_0001-1.6.fc16
>
> KVM based VM: 2 vCPU, 4 GB RAM
>
> 1 NIC for ssh/https access
> 1 NIC for ovirtmgmt network access
> engine source: dreyou repo
>
> _ovirt-node:_
> Node: 2.3.0
> 2 bonded NICs -> Frontend Network
> 4 Multipath NICs -> SAN connection
>
> Attached some relevant logfiles.
>
> Thanks in advance, I really appreciate your help!
>
> Best,
>
> Sven Knohsalla | System Administration
>
> Office +49 631 68036 433 | Fax +49 631 68036 111 |
> E-Mail s.knohsalla(a)netbiscuits.com <mailto:s.knohsalla@netbiscuits.com> |
> Skype: Netbiscuits.admin
>
> Netbiscuits GmbH | Europaallee 10 | 67657 | GERMANY
>
>
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users