[ovirt-devel] [vdsm] logging levels and noise in the log

Nir Soffer nsoffer at redhat.com
Thu Jul 10 14:15:23 UTC 2014


----- Original Message -----
> From: "Martin Sivak" <msivak at redhat.com>
> To: devel at ovirt.org
> Sent: Thursday, July 10, 2014 1:09:14 PM
> Subject: [ovirt-devel] [vdsm] logging levels and noise in the log
> 
> Hi,
> 
> we discussed the right amount of logging with Nir and Francesco while
> reviewing my patches. Francesco was against one DEBUG log message that could
> potentially flood the logs. But the message in question was important for me
> and SLA, because it was logging the VDSM changes in response to MoM command.
> 
> Since DEBUG really is meant to be used for debug messages, I have one
> proposal:
> 
> 1) Change the default log level of vdsm.log to INFO

Not good enough - our log are full or tracebacks logged in error
and warning levels, which are not helpful to the administrator.

So this will create log that is not useful for anything.

> 2) Log (only) DEBUG messages to a separate file vdsm-debug.log

The debug log should include *all* log levels, basically being the same as
the current vdsm log. When you need the logs, you want all the information in
the same file.

> 3) Make the vdsm-debug.log rotate faster (every hour, keep only last couple
> of hours?) so it does not grow too much

I just debugged a problem in production, where the mailbox stopped working
86 logs ago - so having only couple of hours is too little.

However when using proper logging, This issue should be clear even
in info log level, which show error, warnings, and info message such
as mailbox stopping.

> 
> This way the customer would be able to monitor INFO logs (much smaller)
> without all the noise and we would be able to collect the DEBUG part in case
> something happens.

I think you got it right with the dual log approach. We should have two logs:

1. Admin log (events.log?)

   This log should show important events and errors, but it should not contain
   any tracebacks, thread name, file names etc. Basically, something like the
   events log in the engine.

   If we take storage for example, this log can contain storage domain state
   change, or configuration changes triggered by the engine, or events like
   disk that becoming too full.

2. Developer log (vdsm.log)

   There is lot of work to to here:

   - Disable library logs - we have lot of infrastructure code that log useless
     messages. This code should be verified by tests, not by looking at the logs. 

   - Removing junk log messages:
     - new ioprocess logs: https://bugzilla.redhat.com/1114908
     - new jsonrpc logs: same issue - no bug yet

   - Adding useful info like VDSM changes in response to MoM command.
     I did not look at the patch, but if vdsm changes state, it must be logged
 
   - Improving log format
     http://gerrit.ovirt.org/28869

   - Removing duplicate errors/tracebacks for same error, caused by code that does
     
     try:
        something()
     except:
        log.error('something failed...')
        raise
     
     Should be:

     something()

     And let the upper level handle the failure, logging a trackback.

     Or:

     try:
         something()
     Except LowLevelError:
         raise FlowError("description...")

    FlowError can log a traceback of both errors by using sys.exec_info.
    See http://hg.moinmo.in/moin/1.9/file/274a7f675261/MoinMoin/error.py

   - Disabling this log by default?
     Only if an admin can easily enable or disable the log, ideally from the ui.

So for example, when flow fails, the event log will contain the a single error message
related to the flow, while the debug log will contains the same error message with
a traceback and useful debugging messages during the flow.

Lets make this a feature for 3.6?

Nir



More information about the Devel mailing list