Thanks!! Very helpful

 

 


Gregory King | Software Development Manager | +1.303.272.2427

Oracle Virtualization Sustaining Engineering

500 Eldorado Boulevard Build 5 | Broomfield Colorado 80021

Mobile: +1.303.968.8169 | Fax: +1.303.272.2427

 

From: Nir Soffer [mailto:nsoffer@redhat.com]
Sent: Monday, March 22, 2021 8:53 AM
To: Greg King <greg.king@oracle.com>
Cc: devel@ovirt.org; Ales Musil <amusil@redhat.com>; Milan Zamazal <mzamazal@redhat.com>; Vojtech Juranek <vjuranek@redhat.com>
Subject: [External] : Re: [ovirt-devel] docs: pointers to more in-depth internals?

 

On Tue, Mar 16, 2021 at 10:47 PM Greg King <greg.king@oracle.com> wrote:

I am new to vdsm and trying to understand the architecture/internals much better

 

Welcome to vdsm Greg!

 

The ovirt documentation for architecture I have found so far seems to be relatively high level

 

And it is mostly outdated, but we don't have anything better.

 

My effort to understand the architecture by walking through the vdsm code using pdb/rpdb is slow and probably not all that efficient

 

Does anyone have pointers to documentation that might explain the vdsm modules, classes and internals a little more in depth?

 

I don't think we have more detailed documentation, but there are a lot of

talks and slide decks that give more info on specific topics, and they are

usually more up to date:

https://www.ovirt.org/community/archived_conferences_presentations.html

 

There is also a lot of content on YouTube. Here are some examples that I could

find easily:

- [oVirt 3.6 deep dive] - live storage migration between mixed domains

  https://www.youtube.com/watch?v=BPy29Q__VV4

- oVirt 4.1 deep dive - VM leases

  https://www.youtube.com/watch?v=MVa-4fQo2V8

- Back to the future – incremental backup in oVirt

  https://www.youtube.com/watch?v=X-xHD9ddN6s

- oVirt 4k - teaching an old dog new tricks

  https://www.youtube.com/watch?v=Q1VQxjYEzDY

 

 

I’d also like to understand where I might be able to add rpdb.set_trace() so I can step through functions being called in libvirt.py

 

I don't think using a debugger is very helpful with vdsm, since vdsm is not

designed for stopping a thread for unlimited time. In some cases the system

will log a warning and a traceback every 60 seconds about a blocked worker.

In other cases monitoring code may fail to update stats, which may cause

the engine to deactivate a host or migrate VMs, or lead to other trouble.

 

The best way to debug and understand vdsm is to follow the logs, and add

more logs when needed. The main advantage compared with a debugger is

that the time spent with the logs will pay back when you have to debug real

issues in a user setup, when logs are the only available resource.

 

Having said that, being able to follow the entire flow by printing a traceback

is a great way to understand how the system works.
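
As a rough illustration of the idea, here is a minimal sketch that uses only

the Python standard library. It is not vdsm code; the names

format_thread_stack and interesting_function are invented for the example,

and the logger name "trace" is arbitrary:

```python
import logging
import sys
import threading
import traceback

log = logging.getLogger("trace")


def format_thread_stack(ident):
    """Return the current call stack of the thread with this ident as text."""
    # sys._current_frames() maps thread idents to their topmost frame.
    frame = sys._current_frames().get(ident)
    if frame is None:
        return "(no such thread)"
    return "".join(traceback.format_stack(frame))


def interesting_function():
    # Temporary debugging aid: log how we got here, then do the real work.
    log.info("interesting_function called from:\n%s",
             format_thread_stack(threading.get_ident()))
    return 42
```

Unlike a breakpoint, this only adds a log line and does not stop the thread.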

 

You can use vdsm.common.concurrent.format_traceback:

https://github.com/oVirt/vdsm/blob/114121ab122a0cd5e529807b938b3506f247f42b/lib/vdsm/common/concurrent.py#L367

 

to print a traceback at interesting points. To trace functions from the libvirt

Python bindings, you can modify libvirtconnection.py:

https://github.com/oVirt/vdsm/blob/114121ab122a0cd5e529807b938b3506f247f42b/lib/vdsm/common/libvirtconnection.py#L127

 

This module creates a connection, and wraps libvirt.virDomain with a wrapper

that panics on fatal errors. You can modify the wrapper to log a traceback

for all or some of libvirt.virDomain functions.
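
A hedged sketch of what such a change could look like, written against a

stand-in class rather than the real vdsm wrapper. FakeDomain and log_caller

are hypothetical names, and the real code in libvirtconnection.py is

structured differently:

```python
import functools
import logging
import traceback

log = logging.getLogger("trace")


def log_caller(func):
    """Wrap func so every call logs the stack that led to it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # format_stack() returns the current stack as a list of strings.
        stack = "".join(traceback.format_stack())
        log.debug("Calling %s from:\n%s", func.__name__, stack)
        return func(*args, **kwargs)
    return wrapper


class FakeDomain:
    """Hypothetical stand-in for libvirt.virDomain."""

    def XMLDesc(self, flags=0):
        return "<domain/>"


# Wrap only the methods of interest to keep the logs readable.
for name in ("XMLDesc",):
    setattr(FakeDomain, name, log_caller(getattr(FakeDomain, name)))
```

Limiting the wrapped names to a few methods keeps the log volume manageable.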

 

Another option is to modify the virDomain wrapper to log a traceback:

https://github.com/oVirt/vdsm/blob/114121ab122a0cd5e529807b938b3506f247f42b/lib/vdsm/virt/virdomain.py#L82

 

For example here:

https://github.com/oVirt/vdsm/blob/114121ab122a0cd5e529807b938b3506f247f42b/lib/vdsm/virt/virdomain.py#L99

 

Good luck with your vdsm ride!

 

Nir