[Users] ovirt-guest-agent issue/troubleshooting

Dan Kenigsberg danken at redhat.com
Wed Oct 30 09:45:15 EDT 2013


On Wed, Oct 30, 2013 at 03:24:56PM +0200, Itamar Heim wrote:
> On 10/30/2013 03:10 PM, Karli Sjöberg wrote:
> >On Wed, 2013-10-30 at 15:05 +0200, Itamar Heim wrote:
> >>On 10/30/2013 11:12 AM, Karli Sjöberg wrote:
> >>> On Wed, 2013-10-30 at 09:16 +0100, Vinzenz Feenstra wrote:
> >>>> On 10/30/2013 07:09 AM, Karli Sjöberg wrote:
> >>>>
> >>>>> On Tue, 2013-10-29 at 15:48 +0100, Vinzenz Feenstra wrote:
> >>>>>> On 10/29/2013 03:06 PM, Karli Sjöberg wrote:
> >>>>>>
> >>>>>>> On Tue, 2013-10-29 at 14:48 +0100, Vinzenz Feenstra wrote:
> >>>>>>>> On 10/29/2013 02:37 PM, Karli Sjöberg wrote:
> >>>>>>>>
> >>>>>>>>> On Tue, 2013-10-29 at 14:30 +0100, René Koch (ovido) wrote:
> >>>>>>>>>> On Tue, 2013-10-29 at 13:23 +0000, Karli Sjöberg wrote:
> >>>>>>>>>> > On Tue, 2013-10-29 at 14:15 +0100, René Koch (ovido) wrote:
> >>>>>>>>>> > > Hi,
> >>>>>>>>>> > >
> >>>>>>>>>> > > I have some issues with ovirt-guest-agent as no information is reported
> >>>>>>>>>> > > from guest-agent to oVirt webadmin anymore.
> >>>>>>>>>> > > I'm unsure when I lost the information - I know for sure it worked last
> >>>>>>>>>> > > month when I tested the guest agent packages for Debian and Ubuntu.
> >>>>>>>>>> > >
> >>>>>>>>>> > > Status now is that I don't receive any data.
> >>>>>>>>>> > > I have the following test vms:
> >>>>>>>>>> > > * RHEL 6 with rhevm-guest-agent 1.0.7 (RHEV repository)
> >>>>>>>>>> > > * RHEL 6 with ovirt-guest-agent 1.0.8 (EPEL)
> >>>>>>>>>> > > * openSUSE 12.3 with ovirt-guest-agent 1.0.8.1 (self compiled)
> >>>>>>>>>> > >
> >>>>>>>>>> > > Both RHEL servers reported information (memory, ip-address,...)
> >>>>>>>>>> > > previously.
> >>>>>>>>>> > > The only changes which could have broken the guest agent communication
> >>>>>>>>>> > > are updates on the CentOS host and the engine.
> >>>>>>>>>> > >
> >>>>>>>>>> > > Can you give me some hints how to troubleshoot the guest agent? I can't
> >>>>>>>>>> > > find any information (or don't know the right pattern to search for) in
> >>>>>>>>>> > > vdsm.log and engine.log. Guest agent is running in the vms, but doesn't
> >>>>>>>>>> > > log anything except start and stop of the service (can I change the
> >>>>>>>>>> > > handler_logfile args for more debugging and if yes how?):
> >>>>>>>>>> > >
> >>>>>>>>>> > > # tail /var/log/ovirt-guest-agent/ovirt-guest-agent.log
> >>>>>>>>>> > > MainThread::INFO::2013-10-29
> >>>>>>>>>> > > 11:00:15,340::ovirt-guest-agent::37::root::Starting oVirt guest agent
> >>>>>>>>>> > >
> >>>>>>>>>> > > Btw, I'm running oVirt 3.2.3...
> >>>>>>>>>> >
> >>>>>>>>>> > This has happened for me in the past and putting the Host in
> >>>>>>>>>> > maintenance and then restarting the vdsmd solved it.
> >>>>>>>>>> >
> >>>>>>>>>> > /Karli
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks a lot Karli - restarting vdsmd did the trick!
> >>>>>>>>>
> >>>>>>>>> Well, even a broken clock can be right. Even twice a day:)
> >>>>>>>>>
> >>>>>>>>> @developers
> >>>>>>>>> Is this something you have noticed as well? I mean, that
> >>>>>>>>> sometimes vdsmd needs this manual kick? Where do you start
> >>>>>>>>> debugging when this issue occurs? Restarting the daemon once in a
> >>>>>>>>> while is just the quick fix; I'd like to solve it once and for all.
> >>>>>>>> Yeah, multiple fixes have already been applied in that regard:
> >>>>>>>>
> >>>>>>>> https://github.com/oVirt/vdsm/commit/5b5c58580e20ffaf3ceff7193f4c28cbadd8c42f
> >>>>>>>> and
> >>>>>>>> https://github.com/oVirt/vdsm/commit/26bfc74765aed35af6d17cfad1ed8115eef650f1
> >>>>>>>>
> >>>>>>>> So it looks like both of you are using oVirt 3.1, which
> >>>>>>>> contains neither of those two fixes. AFAIK only the first one is
> >>>>>>>> in oVirt 3.2 and the second one is in oVirt 3.3.
> >>>>>>>
> >>>>>>> Um:
> >>>>>>> ovirt-engine-3.2.2-1.1.43.el6.noarch
> >>>>>>>
> >>>>>>> And René already stated running 3.2.3
> >>>>>> Hmm interesting, but that VDSM version is not 3.2? Did you upgrade
> >>>>>> your hypervisors?
> >>>>>
> >>>>> All systems are as upgraded as they can be, and dc/cluster is running
> >>>>> in 3.2-mode. Probably since I'm running dreyou's repo (and perhaps
> >>>>> René as well), the versions may be different from what's in yours?
> >>>> Sorry, my bad, I messed up the correlation with the versions. I thought
> >>>> 4.10.3 was 3.1, but it is really 3.2.
> >>>> So these issues are fixed in 3.3, though I would hold off on the upgrade
> >>>> until it has finally stabilized.
> >>>
> >>> Yepp yepp, that's what I was thinking as well. Roughly when would you
> >>> expect 3.3 to have stabilized, with 3.3.1?
> >>
> >>a few days hopefully.
> >
> >"a few days"™ ;)
> >
> >>note that since the bug is in vdsm, upgrading vdsm should resolve it,
> >>regardless of upgrading engine, dc/cluster levels, etc.
> >
> >That's good to know, thanks! And you are completely confident that there
> >is proper backwards compatibility? Has anyone tested?
> 
> vdsm is supposed to always keep backward compatibility. I think one
> issue was found in 3.3 around live migration in a 3.2 cluster (i.e.,
> mixed versions of 3.2 vdsm and 3.3 vdsm).
> danken - was this fixed in 3.3.1 vdsm?

There was a breakage of migration between master and ovirt-3.3.0.
Fixing it was my gating issue for the ovirt-3.3 rebase, and that has been
done for a couple of weeks now.

I tested 3.3.0<->3.3.1 migration at that time, and I am not aware of any
more recent problem. (However, when it comes to migration, I'm sure a
bug or two is lurking somewhere...)
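
For anyone who hits the original symptom from this thread (guest agent
running, but its log showing nothing except service start and stop), a
minimal Python sketch for checking what the agent log actually contains
might look like the following. It assumes the "::"-separated layout of the
single log line René quoted, with the wrapped date and time rejoined by a
space. The names parse_agent_log_line and LogEntry and the field names are
hypothetical; they are not part of ovirt-guest-agent.

```python
from collections import namedtuple

# Field names are illustrative guesses based on the one line quoted in the thread.
LogEntry = namedtuple(
    "LogEntry", "thread level timestamp module lineno logger message"
)


def parse_agent_log_line(line):
    """Split one '::'-separated log line into fields.

    Returns None when the line does not have the assumed seven fields.
    """
    parts = line.strip().split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, timestamp, module, lineno, logger, message = parts
    if not lineno.isdigit():
        return None
    return LogEntry(thread, level, timestamp, module, int(lineno), logger, message)


# The log line from the original report, rejoined onto one line.
sample = (
    "MainThread::INFO::2013-10-29 11:00:15,340::"
    "ovirt-guest-agent::37::root::Starting oVirt guest agent"
)
entry = parse_agent_log_line(sample)
print(entry.level, entry.timestamp, entry.message)
```

Running this over every line of
/var/log/ovirt-guest-agent/ovirt-guest-agent.log and counting entries by
level or message would show quickly whether the agent writes anything
beyond its start and stop messages.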

