[ovirt-devel] [vdsm] strange network test failure on FC23

Mon Nov 30 08:52:59 UTC 2015

> On 29 Nov 2015, at 17:34, Nir Soffer <nsoffer at redhat.com> wrote:
> 
> On Sun, Nov 29, 2015 at 6:01 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
> > On Sun, Nov 29, 2015 at 5:37 PM, Nir Soffer <nsoffer at redhat.com> wrote:
> >>
> >> On Sun, Nov 29, 2015 at 10:37 AM, Yaniv Kaul <ykaul at redhat.com> wrote:
> >> >
> >> > On Fri, Nov 27, 2015 at 6:55 PM, Francesco Romani <fromani at redhat.com>
> >> > wrote:
> >> >>
> >> >> Using taskset, the ip command now takes a little longer to complete.

I could not find the original reference for this.
Why does it take longer? Is it purely the additional taskset executable invocation? On a busy system we see delays like this all the time, with lvm, etc., so I don’t think it’s significant.
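
One way to answer that would be to time the same command with and without the taskset prefix. A minimal Python 3 sketch follows; wrap_with_taskset and average_runtime are hypothetical helpers written for illustration, not vdsm code, and the only taskset option assumed is --cpu-list:

```python
import subprocess
import time


def wrap_with_taskset(argv, cpu_count):
    """Prefix argv so the child may run on CPUs 0..cpu_count-1."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # Builds the "0-<last cpu index>" string mentioned in the thread.
    cpu_list = "0" if cpu_count == 1 else "0-%d" % (cpu_count - 1)
    return ["taskset", "--cpu-list", cpu_list] + list(argv)


def average_runtime(argv, repeat=20):
    """Return the mean wall-clock seconds of running argv `repeat` times."""
    start = time.perf_counter()
    for _ in range(repeat):
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    return (time.perf_counter() - start) / repeat
```

Comparing average_runtime(cmd) with average_runtime(wrap_with_taskset(cmd, 2)) for a short-lived command such as ["ip", "link", "show"] should show roughly how much the extra exec costs on a given host.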

> >> >
> >> >
> >> > Since we always use the same set of CPUs, I assume using a mask (for 0 &
> >> > 1, just use 0x3, as the man page suggests) might be a tiny fraction
> >> > faster to execute taskset with, instead of needing to translate the
> >> > numeric CPU list.
> >>
> >> Creating the string "0-<last cpu index>" is one line in vdsm. The code
> >> handling this in taskset is written in C, so the parsing time is
> >> practically zero. Even if it were non-zero, this code runs once when we
> >> run a child process, so the cost is insignificant.
> >
> >
> > I think it's easier to just have it as a mask in a config item somewhere,
> > without the need to create it or parse it anywhere.
> > For us and for the user.
> 
> We have this option in /etc/vdsm/vdsm.conf:
> 
>     # Comma separated whitelist of CPU cores on which VDSM is allowed to
>     # run. The default is "", meaning VDSM can be scheduled by  the OS to
>     # run on any core. Valid examples: "1", "0,1", "0,2,3"
>     # cpu_affinity = 1
> 
> I think this is the easiest option for users.

+1
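
For reference, turning a value in that format into either a CPU set or a taskset-style mask is only a few lines. A rough Python sketch, assuming the value has the form shown in the comment above; parse_cpu_affinity and cpus_to_mask are hypothetical names, not the actual vdsm functions, and stripping the surrounding quotes is my own assumption:

```python
def parse_cpu_affinity(value):
    """Parse a value such as "0,2,3" into a frozenset of CPU indexes.

    An empty value returns None, meaning "any core" (the documented default).
    """
    # Assumption: tolerate quotes around the value, as in the examples.
    value = value.strip().strip('"')
    if not value:
        return None
    cpus = frozenset(int(item) for item in value.split(","))
    if any(cpu < 0 for cpu in cpus):
        raise ValueError("negative CPU index in %r" % value)
    return cpus


def cpus_to_mask(cpus):
    """Return the bitmask with one bit set per CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask
```

With this, "0,1" maps to the mask 0x3 discussed earlier in the thread, so the list form loses nothing compared to a mask and stays readable for users.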

> 
> >> > However, the real concern is making sure CPUs 0 & 1 are not really too
> >> > busy with stuff (including interrupt handling, etc.)
> >>
> >> This code is used when we run a child process, to allow the child
> >> process to run on all cpus (in this case, cpu 0 and cpu 1). So I think
> >> there is no concern here.
> >>
> >> Vdsm itself runs by default on cpu 1, which should be less busy
> >> than cpu 0.
> >
> >
> > I assume those are cores, which in a multi-socket machine will probably all
> > be in the first socket.
> > There's a good chance that the FC and/or network cards will also bind their
> > interrupts to core 0 & core 1 (check /proc/interrupts) on the same socket.
> > From my poor laptop (1s, 4c):
> > 42:    1487104       9329       4042       3598  IR-PCI-MSI 512000-edge    0000:00:1f.2
> > (my SATA controller)
> >
> > 43:   14664923         34         18         13  IR-PCI-MSI 327680-edge    xhci_hcd
> > (my docking station connector)
> >
> > 45:    6754579       4437       2501       2419  IR-PCI-MSI 32768-edge    i915
> > (GPU)
> >
> > 47:     187409      11627       1235       1259  IR-PCI-MSI 2097152-edge    iwlwifi
> > (NIC, wifi)
> 
> Interesting, here is an example from an 8-core machine running my VMs:
> 
> [nsoffer at jumbo ~]$ cat /proc/interrupts 
>            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
>   0:         31          0          0          0          0          0          0          0  IR-IO-APIC-edge      timer
>   1:          2          0          0          1          0          0          0          0  IR-IO-APIC-edge      i8042
>   8:          0          0          0          0          0          0          0          1  IR-IO-APIC-edge      rtc0
>   9:          0          0          0          0          0          0          0          0  IR-IO-APIC-fasteoi   acpi
>  12:          3          0          0          0          0          0          1          0  IR-IO-APIC-edge      i8042
>  16:          4          4          9          0          9          1          1          3  IR-IO-APIC  16-fasteoi   ehci_hcd:usb3
>  23:         13          1          5          0         12          1          1          0  IR-IO-APIC  23-fasteoi   ehci_hcd:usb4
>  24:          0          0          0          0          0          0          0          0  DMAR_MSI-edge      dmar0
>  25:          0          0          0          0          0          0          0          0  DMAR_MSI-edge      dmar1
>  26:       3670        354        215    9062370        491        124        169         54  IR-PCI-MSI-edge      0000:00:1f.2
>  27:          0          0          0          0          0          0          0          0  IR-PCI-MSI-edge      xhci_hcd
>  28:  166285414          0          3          0          4          0          0          0  IR-PCI-MSI-edge      em1
>  29:         18          0          0          0          4          3          0          0  IR-PCI-MSI-edge      mei_me
>  30:          1        151         17          0          3        169         26         94  IR-PCI-MSI-edge      snd_hda_intel
> NMI:       2508       2296       2317       2356        867        918        912        903   Non-maskable interrupts
> LOC:  302996116  312923350  312295375  312089303   86282447   94046427   90847792   91761277   Local timer interrupts
> SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
> PMI:       2508       2296       2317       2356        867        918        912        903   Performance monitoring interrupts
> IWI:          1          0          0          5          0          0          0          0   IRQ work interrupts
> RTR:          0          0          0          0          0          0          0          0   APIC ICR read retries
> RES:   34480637   12953645   13139863   14309885    8881861   10110753    9709070    9703933   Rescheduling interrupts
> CAL:    7387779    7682087    7283716    7135792    2771105    1785528    1887493    1843734   Function call interrupts
> TLB:      11121      16458      17923      15216       8534       8173       8639       7837   TLB shootdowns
> TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
> THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
> MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
> MCP:       7789       7789       7789       7789       7789       7789       7789       7789   Machine check polls
> HYP:          0          0          0          0          0          0          0          0   Hypervisor callback interrupts
> ERR:          0
> MIS:          0
> 
> It seems that our default (CPU1) is fine.

I think it’s safe enough.
The numbers above (and I checked the same on ppc, with a similar pattern) are for a reasonably empty system. We can get a different picture when vdsm is busy. In general I think it’s indeed best to use the second online CPU for vdsm and all CPUs for child processes.
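
To compare hosts without eyeballing the whole table, the counts can be summed per CPU column. A minimal Python sketch, assuming it is fed the raw, non-empty text of /proc/interrupts (without the mail quote prefixes); per_cpu_interrupt_totals is a hypothetical helper, not something from vdsm:

```python
def per_cpu_interrupt_totals(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = text.strip("\n").splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * cpu_count
    for line in lines[1:]:
        # Drop the "26:" / "NMI:" label, keep the counters and description.
        _, _, rest = line.partition(":")
        fields = rest.split()[:cpu_count]
        # Rows such as ERR and MIS carry one global counter; skip them.
        # (On a single-CPU host they cannot be told apart this way.)
        if len(fields) < cpu_count or not all(f.isdigit() for f in fields):
            continue
        for cpu, count in enumerate(fields):
            totals[cpu] += int(count)
    return totals
```

The totals include the local timer and rescheduling rows, so they give only a rough idea of how busy each CPU is with interrupts; filtering to device rows would be a small change.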

Regarding exposing this to users in the UI: I think that’s way too low level. vdsm.conf is good enough.

Thanks,
michal

> 
> Francesco, what do you think?
> 
> Nir