On 02/03/2014 01:25 PM, Andrew Lau wrote:
On Mon, Feb 3, 2014 at 11:23 PM, Itamar Heim <iheim@redhat.com> wrote:
On 02/03/2014 01:19 PM, Andrew Lau wrote:
The issue was a split-brain issue on the dom_md/ids file causing an
input/output error, thanks!
is this with gluster?
Yup, a 2-brick replicated gluster volume serving the NFS server. Sorry,
I meant to say I resolved it too.
you have to use gluster with quorum, or this will happen often
On Mon, Feb 3, 2014 at 10:43 PM, Andrew Lau <andrew@andrewklau.com> wrote:
On Mon, Feb 3, 2014 at 10:40 PM, Doron Fediuck <dfediuck@redhat.com> wrote:
----- Original Message -----
> From: "Andrew Lau" <andrew@andrewklau.com>
> To: "Doron Fediuck" <dfediuck@redhat.com>
> Cc: "users" <users@ovirt.org>, "Jiri Moskovcak" <jmoskovc@redhat.com>, "Greg Padgett" <gpadgett@redhat.com>
> Sent: Monday, February 3, 2014 1:35:01 PM
> Subject: Re: [Users] Hosted Engine always reports "unknown stale-data"
>
> On Mon, Feb 3, 2014 at 9:53 PM, Doron Fediuck <dfediuck@redhat.com> wrote:
>
> >
> >
> > ----- Original Message -----
> > > From: "Andrew Lau" <andrew@andrewklau.com>
> > > To: "users" <users@ovirt.org>
> > > Sent: Monday, February 3, 2014 12:32:45 PM
> > > Subject: [Users] Hosted Engine always reports "unknown stale-data"
> > >
> > > Hi,
> > >
> > > I was wondering if anyone has this same notice when they run:
> > > hosted-engine --vm-status
> > >
> > > The "engine status" will always be "unknown stale-data" even when the VM is
> > > powered on and the engine is online. engine-health will actually report the
> > > correct status.
> > >
> > > eg.
> > >
> > > --== Host 1 status ==--
> > >
> > > Status up-to-date : False
> > > Hostname : 172.16.0.11
> > > Host ID : 1
> > > Engine status : unknown stale-data
> > >
> > > Is it some sort of blocked port causing this or is this by design?
> > >
> > > Thanks,
> > > Andrew
> > >
> > > _________________________________________________
> > > Users mailing list
> > > Users@ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/users
> > >
> >
> > Hi Andrew,
> > it looks like an issue with the time stamp.
> > Which time stamp do you have? How relevant is it?
> >
>
> timestamps seem to be outdated by a lot, interesting error in the broker.log
>
> Thread-24::INFO::2014-02-03 22:33:14,801::engine_health::90::engine_health.CpuLoadNoEngine::(action) VM not running on this host, status down
> Thread-22::INFO::2014-02-03 22:33:14,834::mem_free::53::mem_free.MemFree::(action) memFree: 27382
> Thread-23::ERROR::2014-02-03 22:33:14,922::cpu_load_no_engine::156::cpu_load_no_engine.EngineHealth::(update_stat_file) Failed to getVmStats: 'pid'
> Thread-23::INFO::2014-02-03 22:33:14,923::cpu_load_no_engine::121::cpu_load_no_engine.EngineHealth::(calculate_load) System load total=0.0124, engine=0.0000, non-engine=0.0124
>
> I'm assuming that update_stat_file is the metadata file the vm-status is
> getting pulled from?
>
Yep.
Can you please verify the time your host actually has?
I.e., we have a known issue with time, since we assume all
hosts are in sync. So if one of your hosts has a time sync
issue, this can explain the problem you see.
--== Host 1 status ==--
Status up-to-date : False
Hostname : 172.16.0.11
Host ID : 1
Engine status : unknown stale-data
Score : 0
Local maintenance : False
Host timestamp : 1391417611
--== Host 2 status ==--
Status up-to-date : False
Hostname : 172.16.0.12
Host ID : 2
Engine status : unknown stale-data
Score : 0
Local maintenance : False
Host timestamp : 1391417171
[root@hv01 ~]# date +%s
1391427754
[root@hv02 ~]# date +%s
1391427755
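
For anyone comparing these numbers later, here is a rough sketch of the arithmetic. The figures are copied from the output above. It assumes that Host 1 (172.16.0.11) is hv01 and Host 2 (172.16.0.12) is hv02, and that the two `date +%s` commands were run at about the same moment. The function and variable names are only illustrative and are not part of hosted-engine.

```python
# "Host timestamp" values from the hosted-engine --vm-status output above.
# Assumption: Host 1 is hv01 and Host 2 is hv02.
metadata_ts = {"hv01": 1391417611, "hv02": 1391417171}

# Output of `date +%s` on each host, as pasted above.
host_clock = {"hv01": 1391427754, "hv02": 1391427755}


def metadata_age(host):
    """Seconds between the host's current clock and its last metadata timestamp."""
    return host_clock[host] - metadata_ts[host]


def clock_skew():
    """Largest difference between the host clocks, in seconds."""
    return max(host_clock.values()) - min(host_clock.values())


for host in sorted(metadata_ts):
    age = metadata_age(host)
    print(f"{host}: metadata is {age} s old (~{age / 3600:.1f} h)")
print(f"clock skew between hosts: {clock_skew()} s")
```

Read this way, the two clocks agree to within a second, while the metadata timestamps on both hosts are nearly three hours behind. That would suggest the metadata is simply not being refreshed, rather than the hosts being out of sync.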