[ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls

Francesco Romani fromani at redhat.com
Wed Jul 9 12:22:05 UTC 2014


----- Original Message -----
> From: "Nir Soffer" <nsoffer at redhat.com>
> To: "Francesco Romani" <fromani at redhat.com>
> Cc: devel at ovirt.org, "Federico Simoncelli" <fsimonce at redhat.com>, "Michal Skrivanek" <mskrivan at redhat.com>, "Adam
> Litke" <alitke at redhat.com>
> Sent: Monday, July 7, 2014 4:53:29 PM
> Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls

> > > This will also allow task to change the sampling policy depending on the
> > > results.
> > > If some calls always fails, maybe we can run it less often.
> > 
> > This looks like a feature worth having
> 
> But let's take a note and forget about this - we should focus on a simple
> solution

Agreed. Stuff for future work.
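
Just to capture the idea before we forget it - a rough sketch of what such a
policy could look like (every name here is invented, nothing of this exists
in VDSM today):

    # rough sketch of an adaptive policy - all names are invented
    class SamplingCall(object):
        def __init__(self, func, interval):
            self._func = func
            self._interval = interval
            self._failures = 0

        @property
        def interval(self):
            # back off exponentially on repeated failures, capped at 8x
            return self._interval * min(2 ** self._failures, 8)

        def __call__(self, vm):
            try:
                self._func(vm)
                self._failures = 0
            except Exception:
                self._failures += 1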

> > Yes. In all the reported cases it was directly or indirectly storage being
> > unavailable.
> > Directly: libvirt does some access (e.g. stat()) and gets stuck.
> > An NFS soft mount doesn't always solve this.
> > Indirectly: qemu gets stuck in D state because it was doing I/O and the
> > storage disappeared underneath its feet.
> 
> When this happens - can all calls related to a specific VM block,
> or is it always the dangerous calls you listed below?

As Michal said, the danger comes mostly from the listed calls and, in general,
from storage-related calls. To make things even more complex:

https://www.redhat.com/archives/libvir-list/2014-July/msg00478.html

Relevant quote for the lazy people (:))

> If there is a prior call to libvirt that involves that guest domain
> which has blocked on storage, then this can prevent subsequent calls
> from completing, since the prior call may hold a lock.

However, this is more or less what we are looking for - don't spam libvirt with
related calls if one gets stuck - so it seems it's not that bad for us, I believe.
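
In code terms, I picture a tiny per-VM guard along these lines (hypothetical
sketch, one instance per VM, not actual VDSM code):

    # skip the sample if the previous call for the same VM has not
    # returned yet, instead of piling up on libvirt
    import threading

    class VmSampler(object):
        def __init__(self):
            self._busy = threading.Lock()

        def sample(self, vm, call):
            if not self._busy.acquire(False):  # non-blocking try-lock
                return  # previous call still stuck inside libvirt
            try:
                call(vm)
            finally:
                self._busy.release()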

> > > If one vm may stop responding, causing all libvirt calls for this vm to
> > > block, then a thread pool with one connection per worker thread can lead
> > > to a failure when all connection happen to run a request that blocks
> > > the thread. If this is the case, then each task related to one vm must
> > > depend on other tasks and should not be skipped until the previous task
> > > returned, simulating the current threading model without creating 100's
> > > of threads.
> > 
> > Agreed, we should introduce this concept and this is lacking in my
> > threadpool
> > proposal.
> 
> So basically the current threading model is the behavior we want?
> 
> If some call gets stuck, stop sampling this vm. Continue when the
> call returns.
> 
> Michal? Federico?

Yep - but with fewer threads, and surely with a constant number of them.
Your schedule library (review in my queue at very high priority) is indeed
a nice step in this direction.

Waiting for Federico's ack.
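
To spell out what I have in mind - a sketch under the above assumptions, with
made-up names and sizes:

    import threading
    import Queue  # 'queue' on Python 3

    class SamplingExecutor(object):
        """Fixed number of workers; calls for one VM never overlap."""

        def __init__(self, workers=4):
            self._tasks = Queue.Queue()
            self._inflight = set()  # ids of VMs with a call in progress
            self._lock = threading.Lock()
            for _ in range(workers):
                worker = threading.Thread(target=self._run)
                worker.daemon = True
                worker.start()

        def dispatch(self, vm_id, func):
            with self._lock:
                if vm_id in self._inflight:
                    return  # previous call for this VM still pending: skip
                self._inflight.add(vm_id)
            self._tasks.put((vm_id, func))

        def _run(self):
            while True:
                vm_id, func = self._tasks.get()
                try:
                    func()
                finally:
                    with self._lock:
                        self._inflight.discard(vm_id)

The obvious drawback is that a call stuck forever still pins one worker
thread, but the number of threads stays constant and the stuck VM simply
stops being sampled until the call returns - which is the behaviour we
agreed on above.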

> > > You can block access to storage using iptables, which may cause the
> > > block-related calls to get stuck, and try to close the connection after
> > > a few seconds from another thread.
> > 
> > Sure, it is near the top of my TODO list.
> 
> Let's wait with this. If we can continue to send other requests on the same
> connection while one call never returns, I don't see a need to close it.

Work in progress...
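
For the record, the test goes more or less along these lines (just an
outline; NFS_SERVER_IP and the VM name are placeholders):

    # cut the storage off, then check whether an unrelated call on the
    # same libvirt connection still completes
    import subprocess
    import threading
    import libvirt

    NFS_SERVER_IP = '192.0.2.1'  # placeholder

    subprocess.check_call(['iptables', '-A', 'OUTPUT',
                           '-d', NFS_SERVER_IP, '-j', 'DROP'])
    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('test-vm')  # placeholder

    t = threading.Thread(target=dom.blockInfo, args=('vda', 0))
    t.start()  # expected to hang on the unreachable storage

    print(conn.getLibVersion())  # does this still return promptly?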

> > - block-based devices: libvirt enters the QEMU monitor to ask for the last
> > block extent
> 
> This is our only usage.

Yep, but a very important one indeed.

> > 
> > * _sampleCpu uses virDomainGetCPUStats, which uses cgroups. Should be safe.
> 
> Is "should be safe" really safe? Can you check with the libvirt developers
> about that?

Done: https://www.redhat.com/archives/libvir-list/2014-July/msg00477.html

Waiting for (more) answers.
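
For reference, the call in question boils down to this bit of libvirt-python
(the VM name is a placeholder):

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('a-vm')  # placeholder
    stats = dom.getCPUStats(True)    # aggregate stats, read from cgroups
    print(stats[0].get('cpu_time'), stats[0].get('system_time'))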

> > * _sampleNet uses virDomainInterfaceStats, which uses /proc/net/dev. Should
> > be safe.
> > 
> > * _sampleBalloon uses virDomainGetInfo, which uses /proc, but also needs to
> > enter
> >   the domain monitor.
> 
> And the domain monitor may be stuck when storage is not available?

Just to be clear, here I mean the QEMU domain (aka VM) monitor.
This is a very good question. I think yes, it can get stuck, but I don't have
hard evidence here, yet.
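
For comparison, the two calls side by side (libvirt-python; the VM and device
names are placeholders):

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('a-vm')  # placeholder
    # _sampleNet: reads /proc/net/dev only, no monitor involved
    rx_bytes = dom.interfaceStats('vnet0')[0]  # placeholder device
    # _sampleBalloon: may also need to enter the QEMU monitor
    state, maxmem, mem, ncpus, cputime = dom.info()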

> > * storage/VM health (prevent a VM misbehaving when storage is exhausted):
> > _highWrite
> > * storage status: _updateVolume, _sampleDisk, _sampleDiskLatency.
> > We can optimize the latter two; it looks wasteful to ask for data, discard
> > some, then ask again for the missing data. Need to check carefully if the
> > API allows doing both in one go, but it looks feasible.
> > * balloon stats: _sampleBalloon: needs to enter the QEMU monitor
> > * everything else: does not touch disk nor qemu, so no special treatment
> > here.
> 
> If we separate storage calls so vm sampling continues even if storage
> causes qemu to block - do we gain anything? I guess this vm is
> not useful if its storage is not available, and will soon
> pause anyway.
> 
> Looks like the simplest solution is the best, and no fancy separation
> of sampling calls is needed.

You are right about a VM being useless without storage.
Separation is probably not the solution here; however, we must consider some kind of isolation.

If a VM gets stuck on storage (D-state etc.) for whatever reason, it's fine
to block all further sampling of that VM; however, the other VMs should continue
to be sampled until they block individually.

We shouldn't assume that if storage is unresponsive for one VM, it will be
unresponsive for all the other VMs on the host.

Please note that I'm *not* implying we are assuming the above; this is just to highlight
a requirement for sampling in general, and to preserve the current (good) behaviour specifically.
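
Concretely, any watchdog we add should flag VMs one by one, something like
this sketch (every name here is invented):

    # report only the VMs whose last sampling call completed too long
    # ago; the healthy ones are left alone
    import time

    TIMEOUT = 30.0  # seconds, made-up threshold

    def find_unresponsive(last_reply):
        """last_reply: vm_id -> time.time() of the last completed call."""
        now = time.time()
        return [vm_id for vm_id, stamp in last_reply.items()
                if now - stamp > TIMEOUT]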

Bests,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani


