<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Sep 17, 2016 at 3:14 AM, Nir Soffer <span dir="ltr">&lt;<a href="mailto:nsoffer@redhat.com" target="_blank">nsoffer@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="gmail-">On Fri, Sep 16, 2016 at 10:49 AM, Michal Skrivanek <span dir="ltr">&lt;<a href="mailto:michal.skrivanek@redhat.com" target="_blank">michal.skrivanek@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span><br>
&gt; On 15 Sep 2016, at 18:18, Nir Soffer &lt;<a href="mailto:nsoffer@redhat.com" target="_blank">nsoffer@redhat.com</a>&gt; wrote:<br>
&gt;<br>
&gt; Hi all,<br>
&gt;<br>
&gt; Vdsm reports apparentsize and truesize disk stats [1] when getting<br>
&gt; vms stats (every 15 seconds?). These values are updated every<br>
&gt; 60 seconds in vdsm.<br>
&gt;<br>
&gt; To collect the values, we run risky storage apis in vdsm virt thread<br>
&gt; pool, and we want to avoid this [2] since one slow or broken domain<br>
&gt; can cause the entire virt thread pool to get stuck and cause vms<br>
&gt; using other (healthy) storage domains to become unresponsive.<br>
&gt;<br>
&gt; These can also break thin provisioned disks on block storage, since they<br>
&gt; also depend on the virt thread pool. So one bad NFS storage domain<br>
&gt; can cause vms using only block storage to be paused.<br>
&gt;<br>
&gt; If both of these values are not used by anyone, we would like to<br>
&gt; stop reporting them.<br>
<br>
</span>There are only 2 consumers of the monitoring: engine and mom.<br>
git grep reveals that &quot;apparentsize&quot; is used only for importing HE.<br>
&quot;truesize&quot; is used there too, but it is additionally used in the engine storage code as the actual size of the disk.<br>
<span><br>
&gt;<br>
&gt; If the values are used, we need to find a safer way to report them,<br>
&gt; probably in storage thread pool, or maybe we can get these values<br>
&gt; from libvirt using bulk sampling.<br>
<br>
</span>You could have dropped apparentsize right away, but the HE import code expects that field and won’t be happy if it is missing.<br>
The monitoring code would still work, but it would fill in 0 for the actual size<br></blockquote><div><br></div></span><div>Always returning 0 could be a nice prank for the storage team :-)</div><div><br></div><div>Looking at the bulk stats, we already have the required info from libvirt:</div><div><br></div><div><div> {&#39;bcfa00d3-78a7-40c9-990e-<wbr>5ffac8886ce0&#39;: {&#39;balloon.current&#39;: 1048576L,</div><div>                                          &#39;balloon.maximum&#39;: 1048576L,</div><div>                                          &#39;block.0.allocation&#39;: 0L,</div><div>                                          &#39;block.0.fl.reqs&#39;: 0L,</div><div>                                          &#39;block.0.fl.times&#39;: 0L,</div><div>                                          &#39;block.0.name&#39;: &#39;hdc&#39;,</div><div>                                          &#39;block.0.physical&#39;: 0L,</div><div>                                          &#39;block.0.rd.bytes&#39;: 152L,</div><div>                                          &#39;block.0.rd.reqs&#39;: 4L,</div><div>                                          &#39;block.0.rd.times&#39;: 539801L,</div><div>                                          &#39;block.0.wr.bytes&#39;: 0L,</div><div>                                          &#39;block.0.wr.reqs&#39;: 0L,</div><div>                                          &#39;block.0.wr.times&#39;: 0L,</div><div>                                          &#39;block.1.allocation&#39;: 131005952L,</div><div>                                          &#39;block.1.capacity&#39;: 8589934592L,</div><div>                                          &#39;block.1.fl.reqs&#39;: 68L,</div><div>                                          &#39;block.1.fl.times&#39;: 1894725112L,</div><div>                                          &#39;block.1.name&#39;: &#39;vda&#39;,</div><div>                    
                      &#39;block.1.path&#39;: &#39;/rhev/data-center/f9374c0e-<wbr>ae24-4bc1-a596-f61d5f05bc5f/<wbr>5f35b5c0-17d7-4475-9125-<wbr>e97f1cdb06f9/images/c54e7894-<wbr>b1dc-4f23-9ff5-1836259adc6d/<wbr>133db162-6c6a-4e82-baae-<wbr>9ae0e7e3885d&#39;,</div><div>                                          &#39;block.1.physical&#39;: 1073741824L,</div><div>                                          &#39;block.1.rd.bytes&#39;: 123849728L,</div><div>                                          &#39;block.1.rd.reqs&#39;: 7979L,</div><div>                                          &#39;block.1.rd.times&#39;: 10655381303L,</div><div>                                          &#39;block.1.wr.bytes&#39;: 16762880L,</div><div>                                          &#39;block.1.wr.reqs&#39;: 455L,</div><div>                                          &#39;block.1.wr.times&#39;: 6021639149L,</div><div>                                          &#39;block.2.allocation&#39;: 0L,</div><div>                                          &#39;block.2.capacity&#39;: 21474836480L,</div><div>                                          &#39;block.2.fl.reqs&#39;: 0L,</div><div>                                          &#39;block.2.fl.times&#39;: 0L,</div><div>                                          &#39;block.2.name&#39;: &#39;vdb&#39;,</div><div>                                          &#39;block.2.path&#39;: &#39;/rhev/data-center/f9374c0e-<wbr>ae24-4bc1-a596-f61d5f05bc5f/<wbr>bb85ee2f-d674-489f-9377-<wbr>3eb1f176e8fb/images/b59304f3-<wbr>d19d-40dd-9f04-8c2df37ef6d3/<wbr>4df47a96-8a1b-436e-8a3e-<wbr>3a638f119b48&#39;,</div><div>                                          &#39;block.2.physical&#39;: 21474836480L,</div><div>                                          &#39;block.2.rd.bytes&#39;: 1389056L,</div><div>                                          &#39;block.2.rd.reqs&#39;: 331L,</div><div>                                          
&#39;block.2.rd.times&#39;: 160943568L,</div><div>                                          &#39;block.2.wr.bytes&#39;: 0L,</div><div>                                          &#39;block.2.wr.reqs&#39;: 0L,</div><div>                                          &#39;block.2.wr.times&#39;: 0L,</div><div>                                          &#39;block.count&#39;: 3,</div><div>                                          &#39;cpu.system&#39;: 19090000000L,</div><div>                                          &#39;cpu.time&#39;: 53480823390L,</div><div>                                          &#39;cpu.user&#39;: 4650000000L,</div><div>                                          &#39;net.0.name&#39;: &#39;vnet0&#39;,</div><div>                                          &#39;net.0.rx.bytes&#39;: 2595857L,</div><div>                                          &#39;net.0.rx.drop&#39;: 0L,</div><div>                                          &#39;net.0.rx.errs&#39;: 0L,</div><div>                                          &#39;net.0.rx.pkts&#39;: 39957L,</div><div>                                          &#39;net.0.tx.bytes&#39;: 17041L,</div><div>                                          &#39;net.0.tx.drop&#39;: 0L,</div><div>                                          &#39;net.0.tx.errs&#39;: 0L,</div><div>                                          &#39;net.0.tx.pkts&#39;: 177L,</div><div>                                          &#39;net.count&#39;: 1,</div><div>                                          &#39;state.reason&#39;: 1,</div><div>                                          &#39;state.state&#39;: 1,</div><div>                                          &#39;vcpu.0.state&#39;: 1,</div><div>                                          &#39;vcpu.0.time&#39;: 43040000000L,</div><div>                                          &#39;vcpu.0.wait&#39;: 0L,</div><div>                                          &#39;vcpu.current&#39;: 
1,</div><div>                                          &#39;vcpu.maximum&#39;: 16}}</div></div><div><br></div><div>So we can extract the values from the stats cache, matching them by drive.path.</div><div><br></div><div>We are already doing this for block.*.rd.bytes etc.</div><div><br></div><div>Francesco, what do you think?</div></div></div></div></blockquote><div><br></div><div>I checked this in <a href="https://gerrit.ovirt.org/64093">https://gerrit.ovirt.org/64093</a>.</div><div><br></div><div>Unfortunately, we cannot use it, since the libvirt allocation value</div><div>is not compatible with truesize.</div><div><br></div><div>truesize is:</div><div>- file storage: number of blocks * block size</div><div>- block storage: size of lv</div><div><br></div><div>Also, allocation is available only if qemu has written something</div><div>to a volume. When starting a vm with a chain of volumes, all</div><div>volumes have allocation=0 except the top volume in the boot</div><div>disk, which is not very useful.</div><div><br></div><div>So we will have to use the storage APIs that do the right thing</div><div>for the storage type, but run them in a way that cannot affect</div><div>unrelated vms.</div><div><br></div><div>Nir</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class="gmail-"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<span><br>
&gt;<br>
&gt; Please update if these values are used in engine/dwh.<br>
&gt;<br>
&gt; [1] <a href="https://github.com/oVirt/vdsm/blob/master/lib/vdsm/virt/vmstats.py#L364" rel="noreferrer" target="_blank">https://github.com/oVirt/vdsm/<wbr>blob/master/lib/vdsm/virt/vmst<wbr>ats.py#L364</a><br>
&gt; [2] <a href="https://gerrit.ovirt.org/59801" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/59801</a><br>
&gt;<br>
&gt; Nir<br>
</span>
<br>
</blockquote></span></div><br></div></div>
</blockquote></div><br></div></div>
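<div>For reference, the file storage case of truesize described above (number of blocks * block size) can be sketched roughly as follows. This is only an illustration, not the vdsm implementation: the names file_volume_sizes, demo and STAT_BLOCK_SIZE are hypothetical, and it assumes st_blocks is counted in 512-byte units, as it is on Linux.</div>
<pre>
```python
import os
import tempfile

# Assumption: st_blocks is counted in 512-byte units, as on Linux.
STAT_BLOCK_SIZE = 512


def file_volume_sizes(path):
    """Return (apparentsize, truesize) in bytes for a file based volume.

    Hypothetical helper: apparentsize is the logical file length,
    truesize is the space actually allocated on the filesystem.
    """
    st = os.stat(path)
    apparentsize = st.st_size
    truesize = st.st_blocks * STAT_BLOCK_SIZE
    return apparentsize, truesize


def demo():
    # Create a sparse 1 MiB file. How much of it is really allocated
    # depends on the underlying filesystem.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(1024 * 1024)
        path = f.name
    try:
        return file_volume_sizes(path)
    finally:
        os.unlink(path)
```
</pre>
<div>On a filesystem that supports sparse files, demo() would typically report a truesize much smaller than the apparentsize, which is a different quantity from what qemu reports as allocation.</div>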