----- Original Message -----
From: "Michal Skrivanek" <mskrivan(a)redhat.com>
To: "Nir Soffer" <nsoffer(a)redhat.com>, "Francesco Romani"
<fromani(a)redhat.com>
Cc: "Yaniv Kaul" <ykaul(a)redhat.com>, "infra"
<infra(a)ovirt.org>, "devel" <devel(a)ovirt.org>
Sent: Monday, November 30, 2015 9:52:59 AM
Subject: Re: [ovirt-devel] [vdsm] strange network test failure on FC23
> On 29 Nov 2015, at 17:34, Nir Soffer <nsoffer(a)redhat.com> wrote:
>
> On Sun, Nov 29, 2015 at 6:01 PM, Yaniv Kaul <ykaul(a)redhat.com> wrote:
> > On Sun, Nov 29, 2015 at 5:37 PM, Nir Soffer <nsoffer(a)redhat.com> wrote:
> >>
> >> On Sun, Nov 29, 2015 at 10:37 AM, Yaniv Kaul <ykaul(a)redhat.com> wrote:
> >> >
> >> > On Fri, Nov 27, 2015 at 6:55 PM, Francesco Romani <fromani(a)redhat.com> wrote:
> >> >>
> >> >> Using taskset, the ip command now takes a little longer to complete.
I fail to find the original reference for this.
Why does it take longer? Is it purely the additional taskset executable
invocation? On a busy system we do have these issues all the time, with
lvm, etc., so I don’t think it’s significant.
Yep, that's only the overhead of the taskset executable.
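A crude way to put a number on that overhead, if anyone cares (a quick
sketch, not a proper benchmark; assumes taskset and /bin/true are
available):

    import subprocess
    import time

    N = 100

    def avg_exec_time(cmd):
        # average wall-clock time of one fork+exec of cmd
        start = time.time()
        for _ in range(N):
            subprocess.call(cmd)
        return (time.time() - start) / N

    plain = avg_exec_time(["true"])
    wrapped = avg_exec_time(["taskset", "--cpu-list", "0-1", "true"])
    print("taskset overhead: ~%.2f ms per exec" % ((wrapped - plain) * 1000))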
> >> > Since we always use the same set of CPUs, I assume using a mask
> >> > (for 0 & 1, just use 0x3, as the man page suggests) might be a tiny
> >> > fraction faster to execute taskset with, instead of needing to
> >> > translate the numeric CPU list.
> >>
> >> Creating the string "0-<last cpu index>" is one line in vdsm. The
> >> code handling this in taskset is written in C, so the parsing time is
> >> practically zero. Even if it were non-zero, this code runs once per
> >> child process we spawn, so the cost is insignificant.
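For illustration, the whole thing boils down to something like this
(a sketch, not the literal vdsm code; the taskset path and the example
child command are made up):

    import multiprocessing
    import subprocess

    # "0-<last cpu index>", e.g. "0-3" on a 4-cpu box
    cpu_list = "0-%d" % (multiprocessing.cpu_count() - 1)

    # give the child process all cpus back, regardless of vdsm's own
    # affinity; the list form "--cpu-list 0-3" and the mask form "0xf"
    # are equivalent for taskset
    subprocess.call(["/usr/bin/taskset", "--cpu-list", cpu_list,
                     "ip", "link", "show"])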
> >
> >
> > I think it's easier just to have it as a mask in a config item
> > somewhere, without the need to create or parse it anywhere.
> > For us and for the user.
>
> We have this option in /etc/vdsm/vdsm.conf:
>
> # Comma separated whitelist of CPU cores on which VDSM is allowed to
> # run. The default is "", meaning VDSM can be scheduled by the OS to
> # run on any core. Valid examples: "1", "0,1", "0,2,3"
> # cpu_affinity = 1
>
> I think this is the easiest option for users.
+1
+1, modulo the changes we need to fix
https://bugzilla.redhat.com/show_bug.cgi?id=1286462
(patch is coming)
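For completeness, a minimal sketch of how such a whitelist can be
applied, using python 3's os.sched_setaffinity (not necessarily how
vdsm implements it):

    import os

    # value as read from the cpu_affinity option in /etc/vdsm/vdsm.conf
    cpu_affinity = "1"

    if cpu_affinity:
        cpus = {int(c) for c in cpu_affinity.split(",")}
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process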
> > I assume those are cores, which in a multi-socket system will
> > probably be in the first socket only.
> > There's a good chance that the FC and/or network cards will also bind
> > their interrupts to core0 & core1 (check /proc/interrupts) on the
> > same socket.
> > From my poor laptop (1s, 4c):
Yes, especially core0 (since 0 is the nice default). This was the rationale
behind the choice of cpu #1 in the first place.
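A quick way to eyeball the interrupt distribution, if anyone wants to
reproduce the check (plain python, nothing vdsm-specific; "eth" and
"virtio" are just example interface names):

    # print the per-cpu header plus the lines for network interrupts
    with open("/proc/interrupts") as f:
        lines = f.readlines()
    print(lines[0].rstrip())  # the "CPU0 CPU1 ..." header
    for line in lines[1:]:
        if "eth" in line or "virtio" in line:
            print(line.rstrip())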
> It seems that our default (CPU1) is fine.
I think it’s safe enough.
The numbers above (and I checked the same on ppc, with a similar pattern)
are for a reasonably empty system. We can get a different picture when vdsm
is busy. In general I think it’s indeed best to use the second online CPU
for vdsm and all CPUs for child processes.
Agreed - except for cases like bz1286462 - but let's discuss this on gerrit/bz
Regarding exposing this to users in the UI - I think that’s way too low
level. vdsm.conf is good enough.
Agreed. This is one thing that "just works".
Bests,
--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani