[ovirt-devel] Reporting vm disk stats "truesize" and "apparentsize"?

Michal Skrivanek michal.skrivanek at redhat.com
Mon Sep 19 08:50:39 UTC 2016


> On 18 Sep 2016, at 18:16, Nir Soffer <nsoffer at redhat.com> wrote:
> 
> On Sat, Sep 17, 2016 at 3:14 AM, Nir Soffer <nsoffer at redhat.com> wrote:
> On Fri, Sep 16, 2016 at 10:49 AM, Michal Skrivanek <michal.skrivanek at redhat.com> wrote:
> 
> > On 15 Sep 2016, at 18:18, Nir Soffer <nsoffer at redhat.com> wrote:
> >
> > Hi all,
> >
> > Vdsm reports apparentsize and truesize disk stats [1] when getting
> > vms stats (every 15 seconds?). These values are updated every
> > 60 seconds in vdsm.
> >
> > To collect the values, we run risky storage apis in the vdsm virt thread
> > pool, and we want to avoid this [2], since one slow or broken domain
> > can cause the entire virt thread pool to get stuck and cause vms
> > using other (healthy) storage domains to become non-responsive.
> >
> > This can also break thin provisioned disks on block storage, since they
> > also depend on the virt thread pool. So one bad NFS storage domain
> > can cause vms using only block storage to be paused.
> >
> > If both of these values are not used by anyone, we would like to
> > stop reporting them.
> 
> there are only 2 consumers of the monitoring: engine and mom.
> git grep reveals that "apparentsize" is used only for importing HE.
> "truesize" too, but additionally it is used in engine storage code as the actual size of the disk
> 
> >
> > If the values are used, we need to find a safer way to report them,
> > probably in storage thread pool, or maybe we can get these values
> > from libvirt using bulk sampling.
> 
> you might have been able to drop apparentsize right away, but the HE import code is expecting that field and won’t be happy if it is missing
> The monitoring code would work but fill in 0 for the actual size
> 
> Always returning 0 could be a nice prank for the storage team :-)

what about that apparentsize? that seems to be unused except for HE import

> 
> Looking in bulk stats, we already have the required info from libvirt:
> 
>  {'bcfa00d3-78a7-40c9-990e-5ffac8886ce0': {'balloon.current': 1048576L,
>                                           'balloon.maximum': 1048576L,
>                                           'block.0.allocation': 0L,
>                                           'block.0.fl.reqs': 0L,
>                                           'block.0.fl.times': 0L,
>                                           'block.0.name': 'hdc',
>                                           'block.0.physical': 0L,
>                                           'block.0.rd.bytes': 152L,
>                                           'block.0.rd.reqs': 4L,
>                                           'block.0.rd.times': 539801L,
>                                           'block.0.wr.bytes': 0L,
>                                           'block.0.wr.reqs': 0L,
>                                           'block.0.wr.times': 0L,
>                                           'block.1.allocation': 131005952L,
>                                           'block.1.capacity': 8589934592L,
>                                           'block.1.fl.reqs': 68L,
>                                           'block.1.fl.times': 1894725112L,
>                                           'block.1.name': 'vda',
>                                           'block.1.path': '/rhev/data-center/f9374c0e-ae24-4bc1-a596-f61d5f05bc5f/5f35b5c0-17d7-4475-9125-e97f1cdb06f9/images/c54e7894-b1dc-4f23-9ff5-1836259adc6d/133db162-6c6a-4e82-baae-9ae0e7e3885d',
>                                           'block.1.physical': 1073741824L,
>                                           'block.1.rd.bytes': 123849728L,
>                                           'block.1.rd.reqs': 7979L,
>                                           'block.1.rd.times': 10655381303L,
>                                           'block.1.wr.bytes': 16762880L,
>                                           'block.1.wr.reqs': 455L,
>                                           'block.1.wr.times': 6021639149L,
>                                           'block.2.allocation': 0L,
>                                           'block.2.capacity': 21474836480L,
>                                           'block.2.fl.reqs': 0L,
>                                           'block.2.fl.times': 0L,
>                                           'block.2.name': 'vdb',
>                                           'block.2.path': '/rhev/data-center/f9374c0e-ae24-4bc1-a596-f61d5f05bc5f/bb85ee2f-d674-489f-9377-3eb1f176e8fb/images/b59304f3-d19d-40dd-9f04-8c2df37ef6d3/4df47a96-8a1b-436e-8a3e-3a638f119b48',
>                                           'block.2.physical': 21474836480L,
>                                           'block.2.rd.bytes': 1389056L,
>                                           'block.2.rd.reqs': 331L,
>                                           'block.2.rd.times': 160943568L,
>                                           'block.2.wr.bytes': 0L,
>                                           'block.2.wr.reqs': 0L,
>                                           'block.2.wr.times': 0L,
>                                           'block.count': 3,
>                                           'cpu.system': 19090000000L,
>                                           'cpu.time': 53480823390L,
>                                           'cpu.user': 4650000000L,
>                                           'net.0.name': 'vnet0',
>                                           'net.0.rx.bytes': 2595857L,
>                                           'net.0.rx.drop': 0L,
>                                           'net.0.rx.errs': 0L,
>                                           'net.0.rx.pkts': 39957L,
>                                           'net.0.tx.bytes': 17041L,
>                                           'net.0.tx.drop': 0L,
>                                           'net.0.tx.errs': 0L,
>                                           'net.0.tx.pkts': 177L,
>                                           'net.count': 1,
>                                           'state.reason': 1,
>                                           'state.state': 1,
>                                           'vcpu.0.state': 1,
>                                           'vcpu.0.time': 43040000000L,
>                                           'vcpu.0.wait': 0L,
>                                           'vcpu.current': 1,
>                                           'vcpu.maximum': 16}}
> 
> So we can extract the values from the stats cache, matching them by drive.path.
> 
> We are already doing this for block.*.rd.bytes etc.
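
A minimal sketch of that lookup, assuming the flat per-vm dict shown above.
The helper name block_stats_by_path and the sample path are made up for
illustration; this is not the vdsm code:

```python
def block_stats_by_path(stats):
    """Group the flat 'block.<index>.<field>' keys of one vm by drive path."""
    by_path = {}
    for index in range(stats.get('block.count', 0)):
        prefix = 'block.%d.' % index
        path = stats.get(prefix + 'path')
        if path is None:
            # Drives without a path (the 'hdc' entry above) cannot be matched.
            continue
        by_path[path] = {
            key[len(prefix):]: value
            for key, value in stats.items()
            if key.startswith(prefix)
        }
    return by_path


# Trimmed-down sample in the same shape as the bulk stats above.
vm_stats = {
    'block.count': 2,
    'block.0.name': 'hdc',
    'block.0.allocation': 0,
    'block.0.physical': 0,
    'block.1.name': 'vda',
    'block.1.path': '/path/to/volume',  # placeholder, not a real volume path
    'block.1.allocation': 131005952,
    'block.1.capacity': 8589934592,
    'block.1.physical': 1073741824,
}

drive_stats = block_stats_by_path(vm_stats)['/path/to/volume']
print(drive_stats['allocation'], drive_stats['physical'])
```

A drive object would then look up its own entry by its path and read
'allocation', 'capacity' and 'physical' from it.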
> 
> Francesco, what do you think?
> 
> I checked this in https://gerrit.ovirt.org/64093.
> 
> Unfortunately, we cannot use it, since libvirt allocation value
> is not compatible with truesize.
> 
> truesize is:
> - file storage: number of blocks * block size
> - block storage: size of lv
> 
> Also, allocation is available only if qemu has written something
> to a volume. When starting a vm with a chain of volumes, all
> volumes have allocation=0 except the top volume in the boot
> disk, which is not very useful.
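
For the file storage case, the difference between the two values can be
sketched with plain os.stat(). This assumes a POSIX system, where st_blocks
counts 512-byte units; the function names are made up for illustration, and
the block storage case (size of the lv) is not covered:

```python
import os
import tempfile


def apparent_size(path):
    """File size as seen by a reader (st_size), in bytes."""
    return os.stat(path).st_size


def true_size(path):
    """Allocated size in bytes; POSIX st_blocks counts 512-byte units."""
    return os.stat(path).st_blocks * 512


# Demo on a sparse 1 MiB file with no data written to it.
with tempfile.NamedTemporaryFile() as f:
    f.truncate(1024 * 1024)
    f.flush()
    sparse_apparent = apparent_size(f.name)
    sparse_true = true_size(f.name)

print(sparse_apparent, sparse_true)
```

On a filesystem that supports sparse files, the second number is expected to
be much smaller than the first, which is the gap that truesize reports.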
> 
> So we will have to use the storage apis that do the right thing
> for the storage type, but run them in a way that cannot affect
> unrelated vms.
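
One possible shape for that isolation, as a rough sketch only: give every
storage domain its own single worker, and skip a domain while its previous
call is still stuck. PerDomainSampler and its method are hypothetical names,
not existing vdsm code:

```python
import concurrent.futures
import time


class PerDomainSampler:
    """Hypothetical helper: one single-thread executor per storage domain."""

    def __init__(self):
        self._executors = {}
        self._pending = {}

    def sample(self, domain_id, func, timeout):
        """Run func() on the domain's own worker; return None on timeout."""
        pending = self._pending.get(domain_id)
        if pending is not None and not pending.done():
            # The previous call for this domain is still running (or stuck),
            # so do not queue more work behind it.
            return None
        executor = self._executors.get(domain_id)
        if executor is None:
            executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
            self._executors[domain_id] = executor
        future = executor.submit(func)
        self._pending[domain_id] = future
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            return None


def slow_call():
    # Stands in for a storage api call on a slow domain.
    time.sleep(0.5)
    return 1


sampler = PerDomainSampler()
slow_result = sampler.sample('slow-domain', slow_call, timeout=0.05)
fast_result = sampler.sample('healthy-domain', lambda: 42, timeout=1)
print(slow_result, fast_result)
```

The caller still waits up to the timeout for each call, so a real
implementation would more likely read the last cached value instead of
waiting; the sketch only shows that a slow domain occupies its own worker
and nobody else's.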
> 
> Nir
> >
> > Please update if these values are used in engine/dwh.
> >
> > [1] https://github.com/oVirt/vdsm/blob/master/lib/vdsm/virt/vmstats.py#L364
> > [2] https://gerrit.ovirt.org/59801
> >
> > Nir
> > _______________________________________________
> > Devel mailing list
> > Devel at ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/devel