Hi Sven,
Can you attach the full logs from the second host (the problematic one)? I guess it's
"deovn-a01".
2012-10-15 11:13:38,197 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(QuartzScheduler_Worker-33) domain ccaa4e7a-fa89-46a6-a6e0-07dfe78d1bd5 in problem. vds:
deovn-a01
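If it helps to narrow things down, a small script along these lines could pull every "domain in problem" warning for a given host out of engine.log. This is only a sketch: the function name and the regex are my own assumptions, modeled on the single WARN line above (unwrapped onto one line), and the log format may differ between engine versions.

```python
import re

# Assumed pattern, derived only from the one WARN line quoted above.
# Real engine.log lines may differ between versions.
PROBLEM_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+WARN\s+"
    r"\[(?P<logger>[^\]]+)\]\s+\((?P<thread>[^)]+)\)\s+"
    r"domain (?P<domain>[0-9a-f-]{36}) in problem\. vds: (?P<host>\S+)"
)


def find_domain_problems(lines, host=None):
    """Return (timestamp, domain_id, host) for each matching WARN line.

    `host` is an optional filter; None means "report all hosts".
    """
    hits = []
    for line in lines:
        m = PROBLEM_RE.match(line.strip())
        if m and (host is None or m.group("host") == host):
            hits.append((m.group("ts"), m.group("domain"), m.group("host")))
    return hits
```

You would feed it the open log file, e.g. `find_domain_problems(open(path), host="deovn-a01")`, where `path` is wherever engine.log lives on your engine machine. The timestamps can then be compared against the vdsm log on that host.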
----- Original Message -----
From: "Omer Frenkel" <ofrenkel(a)redhat.com>
To: "Itamar Heim" <iheim(a)redhat.com>, "Sven Knohsalla"
<s.knohsalla(a)netbiscuits.com>
Cc: users(a)ovirt.org
Sent: Tuesday, October 16, 2012 2:02:50 PM
Subject: Re: [Users] ITA-2967 URGENT: ovirt Node turns status to "non
operational" STORAGE_DOMAIN_UNREACHABLE
----- Original Message -----
> From: "Itamar Heim" <iheim(a)redhat.com>
> To: "Sven Knohsalla" <s.knohsalla(a)netbiscuits.com>
> Cc: users(a)ovirt.org
> Sent: Monday, October 15, 2012 8:36:07 PM
> Subject: Re: [Users] ITA-2967 URGENT: ovirt Node turns status to
> "non operational" STORAGE_DOMAIN_UNREACHABLE
>
> On 10/15/2012 03:56 PM, Sven Knohsalla wrote:
> > Hi,
> >
> > sometimes one hypervisor's status turns to „Non-operational“ with error
> > “STORAGE_DOMAIN_UNREACHABLE” and the live migration (activated for all
> > VMs) starts.
> >
> > I don’t currently know why the ovirt-node turns to this status, because
> > the connected iSCSI SAN is available all the time (checked via iscsi
> > session and lsblk); I’m also able to r/w on the SAN during that time.
> >
> > We can simply activate this ovirt-node and it comes up again. The
> > migration process then starts from scratch and hits the same error
> > → Reboot of ovirt-node necessary!
> >
> > When a hypervisor turns to “non-operational” status, the live migration
> > starts and tries to migrate ~25 VMs (~100 GB RAM to migrate).
> >
> > During that process the network workload goes to 100%, some VMs are
> > migrated, then the destination host also turns to “non-operational”
> > status with error “STORAGE_DOMAIN_UNREACHABLE”.
> >
> > Many VMs are still running on their origin host, some are paused, some
> > are showing “migration from” status.
> >
> > After a reboot of the origin host, the VMs of course turn into an
> > unknown state.
> >
> > So the whole cluster is down :/
> >
> > For this problem I have some questions:
> >
> > -Does ovirt engine just use the ovirt-mgmt network for
> > migration/HA?
>
> yes.
>
> >
> > -If so, is there any possibility to *add*/switch a network for
> > migration/HA?
>
> you can bond, not yet add another one.
>
> >
> > -Is the kind of way we are using the live-migration not
> > recommended?
> >
> > -Which engine module checks the availability of the storage domain for
> > the ovirt-nodes?
>
> the engine.
>
> >
> > -Is there any timeout/cache option we can set/increase to avoid this
> > problem?
>
> Well, it's not clear what the problem is.
> Also, vdsm is supposed to throttle live migration to 3 VMs in parallel,
> IIRC.
> Also, at the cluster level you can configure VMs not to be live migrated
> on non-operational status.
>
> >
> > -Is there any known problem with the versions we are using? (Migration
> > to ovirt-engine 3.1 is not possible atm)
>
> Oh, the cluster-level migration policy on non-operational may be a 3.1
> feature, not sure.
>
AFAIR, it's in 3.0
> >
> > -Is it possible to modify the migration queue to just migrate a max. of
> > 4 VMs at the same time, for example?
>
> Yes, there is a vdsm config for that. I am pretty sure 3 is the default,
> though?
>
> >
> > _ovirt-engine:_
> >
> > FC 16: 3.3.6-3.fc16.x86_64
> >
> > Engine: 3.0.0_0001-1.6.fc16
> >
> > KVM based VM: 2 vCPU, 4 GB RAM
> >
> > 1 NIC for ssh/https access
> > 1 NIC for ovirtmgmt network access
> > engine source: dreyou repo
> >
> > _ovirt-node:_
> > Node: 2.3.0
> > 2 bonded NICs -> Frontend Network
> > 4 Multipath NICs -> SAN connection
> >
> > Attached some relevant logfiles.
> >
> > Thanks in advance, I really appreciate your help!
> >
> > Best,
> >
> > Sven Knohsalla |System Administration
> >
> > Office +49 631 68036 433 | Fax +49 631 68036 111
> > | E-Mail s.knohsalla(a)netbiscuits.com
> > Skype: Netbiscuits.admin
> >
> > Netbiscuits GmbH | Europaallee 10 | 67657 | GERMANY
> >
> >
> >
> > _______________________________________________
> > Users mailing list
> > Users(a)ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>
>