[ovirt-devel] [vdsm] logging levels and noise in the log
Dan Kenigsberg
danken at redhat.com
Thu Jul 10 16:07:09 UTC 2014
On Thu, Jul 10, 2014 at 10:15:23AM -0400, Nir Soffer wrote:
> ----- Original Message -----
> > From: "Martin Sivak" <msivak at redhat.com>
> > To: devel at ovirt.org
> > Sent: Thursday, July 10, 2014 1:09:14 PM
> > Subject: [ovirt-devel] [vdsm] logging levels and noise in the log
> >
> > Hi,
> >
> > we discussed the right amount of logging with Nir and Francesco while
> > reviewing my patches. Francesco was against one DEBUG log message that could
> > potentially flood the logs. But the message in question was important for me
> > and SLA, because it was logging the VDSM changes in response to MoM command.
> >
> > Since DEBUG really is meant to be used for debug messages, I have one
> > proposal:
> >
> > 1) Change the default log level of vdsm.log to INFO
>
> Not good enough - our log are full or tracebacks logged in error
> and warning levels, which are not helpful to the administrator.
>
> So this will create log that is not useful for anything.
>
> > 2) Log (only) DEBUG messages to a separate file vdsm-debug.log
>
> The debug log should include *all* log levels, basically being the same as
> the current vdsm log. When you need the logs, you want all the information in
> the same file.
>
> > 3) Make the vdsm-debug.log rotate faster (every hour, keep only last couple
> > of hours?) so it does not grow too much
>
> I just debugged a problem in production, where the mailbox stopped working
> 86 logs ago - so having only couple of hours is too little.
>
> However when using proper logging, This issue should be clear even
> in info log level, which show error, warnings, and info message such
> as mailbox stopping.
>
> >
> > This way the customer would be able to monitor INFO logs (much smaller)
> > without all the noise and we would be able to collect the DEBUG part in case
> > something happens.
>
> I think you got it right with the dual log approach. We should have two logs:
>
> 1. Admin log (events.log?)
>
> This log should show important events and errors, but it should not contain
> any tracebacks, thread name, file names etc. Basically, something like the
> events log in the engine.
>
> If we take storage for example, this log can contain storage domain state
> change, or configuration changes triggered by the engine, or events like
> disk that becoming too full.
I have recently introduced a connectivity.log, that tracks changes in
host network devices. Adding parallel logs for other important
admin-understandable changes makes sense.
I'd be happy to hear which bits of information do oVirt admins wish to
see in this "events.log".
>
> 2. Developer log (vdsm.log)
>
> There is lot of work to to here:
>
> - Disable library logs - we have lot of infrastructure code that log useless
> messages. This code should be verified by tests, not by looking at the logs.
>
> - Removing junk log messages:
> - new ioprocess logs: https://bugzilla.redhat.com/1114908
> - new jsonrpc logs: same issue - no bug yet
>
> - Adding useful info like VDSM changes in response to MoM command.
> I did not look at the patch, but if vdsm changes state, it must be logged
>
> - Improving log format
> http://gerrit.ovirt.org/28869
The only problem is to agree which format is better. I personally like
condensed format, that is possibly machine-readable. Spaces, "in
function" and repeating module name and its full path, are suboptimal
imo.
>
> - Removing duplicate errors/tracebacks for same error, caused by code that does
>
> try:
> something()
> except:
> log.error('something failed...')
> raise
+1, I hate those.
>
> Should be:
>
> something()
>
> And let the upper level handle the failure, logging a trackback.
>
> Or:
>
> try:
> something()
> Except LowLevelError:
> raise FlowError("description...")
>
> FlowError can log a traceback of both errors by using sys.exec_info.
> See http://hg.moinmo.in/moin/1.9/file/274a7f675261/MoinMoin/error.py
>
> - Disabling this log by default?
> Only if an admin can easily enable or disable the log, ideally from the ui.
It all too often that a user comes up with a bug or a question. Seeing
his logs usually gives a good hint regarding the solution, or at least
who is to blame. It is important to me to keep this tool handy.
>
> So for example, when flow fails, the event log will contain the a single error message
> related to the flow, while the debug log will contains the same error message with
> a traceback and useful debugging messages during the flow.
>
> Lets make this a feature for 3.6?
Vdsm has a setLogLevel; it must have rotton out of disuse. But I'd love
to see an Engine option for "cluster debug level", that is overridable
per-host. Each admin should decide what does they care more about:
debugablility by developers, or the bliss of quiet logs.
Dan.
More information about the Devel
mailing list