ovirt 3.5 Test day 1 - Import Storage Domain
by ybronhei
Hey,
I assigned myself to the Import Storage Domain feature [1] in the first
ovirt 3.5 test day.
So far I have checked that when setting OvfUpdateIntervalInMinutes (in
vdc_options) to 2 minutes, the OVF files are saved as expected. After
setting up a new environment and importing the same NFS path that I had used
as a storage domain, ovirt recovered my setup properly.
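For anyone repeating this on their own setup, here is a minimal sketch of shortening the OVF update interval; it assumes the key is exposed through the stock engine-config tool and an EL6-style service restart (the report itself set the value directly in vdc_options):

import subprocess

def set_ovf_update_interval(minutes=2):
    # Assumption: OvfUpdateIntervalInMinutes is settable via engine-config;
    # the value ends up in the vdc_options table either way.
    subprocess.check_call(
        ['engine-config', '-s', 'OvfUpdateIntervalInMinutes=%d' % minutes])
    # The engine only re-reads vdc_options on restart.
    subprocess.check_call(['service', 'ovirt-engine', 'restart'])

set_ovf_update_interval(2)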
I'll play with it more on the second test day.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1083307
--
Yaniv Bronhaim.
ovirt test day report - ovirt - foreman advanced integration feature
by Moti Asayag
Hi,
I assigned myself to the ovirt-foreman advanced integration feature [1] in the first
ovirt 3.5 test day.
The setup required for testing this feature is a bit complex (foreman, an isolated
network, extending foreman with extra plugins). I faced environmental
issues most of the day, mainly regarding foreman installation and configuration.
I provided specific feedback to Yaniv, the feature owner, about the steps missing
from the documented setup, and he will incorporate them into the feature page.
I haven't reached a point where I can provision a host via foreman into ovirt, but
I plan to get back to it on the second ovirt test day (or even before).
[1] http://www.ovirt.org/Features/AdvancedForemanIntegration
Regards,
Moti
Fwd: [ovirt-users] ovirt 3.5 TestDay Results
by Vinzenz Feenstra
-------- Original Message --------
Subject: [ovirt-users] ovirt 3.5 TestDay Results
Date: Thu, 03 Jul 2014 12:36:07 +0200
From: Vinzenz Feenstra <vfeenstr(a)redhat.com>
Organization: Red Hat
To: Users(a)ovirt.org List <Users(a)ovirt.org>
Hi,
I had to test *Bug 1080987*
<https://bugzilla.redhat.com/show_bug.cgi?id=1080987> - OVIRT35 - [RFE]
Support ethtool_opts functionality within oVirt
Installation of the engine went very smoothly without a single issue; even
adding a previously used host through host deploy went smoothly and
upgraded VDSM without any issues.
The feature worked as expected without any problems.
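For readers who have not seen the feature page: roughly speaking, the ethtool_opts value is an ethtool option string that gets applied to the host NIC by a VDSM hook. A rough sketch of the idea (my own simplification, not the hook's actual code; the NIC name and options below are just examples):

import subprocess

def apply_ethtool_opts(opts):
    # opts is an ethtool_opts-style string, e.g.
    # '--coalesce em1 rx-usecs 14 --offload em1 rx on'
    for chunk in opts.split('--'):
        args = chunk.split()
        if args:
            # becomes e.g.: ethtool --coalesce em1 rx-usecs 14
            subprocess.check_call(['ethtool', '--' + args[0]] + args[1:])

apply_ethtool_opts('--coalesce em1 rx-usecs 14 --offload em1 rx on')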
All in all, I have to say that this time I had the best beta experience
since I started working with oVirt.
Good Job guys! :-)
--
Regards,
Vinzenz Feenstra | Senior Software Engineer
RedHat Engineering Virtualization R & D
Phone: +420 532 294 625
IRC: vfeenstr or evilissimo
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
Test day 1 results
by Martin Betak
Hi all,
During the first test day I tested BZ 1108602 "Implement REST API for oVirt scheduler".
Most basic CRUD operations via REST worked well, but there was a problem with accessing
the subcollections 'filters', 'weights' and 'balances' of the /api/schedulingpolicies
resource by id.
For this, a bug was filed: https://bugzilla.redhat.com/show_bug.cgi?id=1115071
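For reference, a minimal sketch of the calls exercised, using python-requests; the engine URL and credentials are placeholders:

import requests

BASE = 'https://engine.example.com/api'    # placeholder engine URL
AUTH = ('admin@internal', 'password')       # placeholder credentials
HEADERS = {'Accept': 'application/xml'}

# listing and basic CRUD on the collection worked fine
resp = requests.get(BASE + '/schedulingpolicies',
                    auth=AUTH, headers=HEADERS, verify=False)
resp.raise_for_status()

policy_id = '...'  # an id taken from the listing above

# addressing the subcollections of a specific policy by id is what failed
for sub in ('filters', 'weights', 'balances'):
    requests.get('%s/schedulingpolicies/%s/%s' % (BASE, policy_id, sub),
                 auth=AUTH, headers=HEADERS, verify=False)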
Martin
Re: [ovirt-devel] [libvirt] [RFC][scale] new API for querying domains stats
by Francesco Romani
----- Original Message -----
> From: "Daniel P. Berrange" <berrange(a)redhat.com>
> To: "Francesco Romani" <fromani(a)redhat.com>
> Cc: libvir-list(a)redhat.com
> Sent: Tuesday, July 1, 2014 10:35:21 AM
> Subject: Re: [libvirt] [RFC][scale] new API for querying domains stats
>
[...]
> > We [in VDSM] currently use these APIs for our sampling:
> > virDomainBlockInfo
> > virDomainGetInfo
> > virDomainGetCPUStats
> > virDomainBlockStats
> > virDomainBlockStatsFlags
> > virDomainInterfaceStats
> > virDomainGetVcpusFlags
> > virDomainGetMetadata
>
> Why do you need to call virDomainGetMetadata so often ? That merely contains
> a opaque data blob that can only have come from VDSM itself, so I'm surprised
> you need to call that at all frequently.
We store some QoS info in the domain metadata. Actually, we can drop this API call
from the list and fix our code to make smarter use of it.
> > please note that we are much more concerned about thread reduction than
> > about performance numbers. We have had reports of the thread count becoming a
> > real problem, while performance so far is not yet a concern
> > (https://bugzilla.redhat.com/show_bug.cgi?id=1102147#c54)
> >
> > * bulk APIs for querying domain stats
> > (https://bugzilla.redhat.com/show_bug.cgi?id=1113116)
> > would be really welcome as well. It is quite independent from the
> > previous bullet point
> > and would help us greatly with scale.
>
> If we did the first bullet point, we'd be adding another ~10 APIs for
> async variants. If we then did the second bullet point we'd be adding
> another ~10 APIs for bulk querying. So while you're right that they
> are independent, it would be desirable to address them both at the
> same time, so we only need to add 10 new APIs in total, not 20.
I'm fine with this approach.
> For the async API design, I could see two potential designs
>
> 1. A custom callback to run per API
>
> typedef void (*virDomainBlockInfoCallback)(virDomainPtr dom,
>                                            bool isError,
>                                            virDomainBlockInfoPtr info,
>                                            void *opaque);
>
> int virDomainGetBlockInfoAsync(virDomainPtr dom,
>                                const char *disk,
>                                virDomainBlockInfoCallback cb,
>                                void *opaque,
>                                unsigned int flags);
>
>
> 2. A standard callback and a pair of APIs
>
> typedef void *virDomainAsyncResult;
> typedef void (*virDomainAsyncCallback)(virDomainPtr dom,
>                                        virDomainAsyncResult res);
>
> void virDomainGetBlockInfoAsync(virDomainPtr dom,
>                                 const char *disk,
>                                 virDomainAsyncCallback cb,
>                                 void *opaque,
>                                 unsigned int flags);
> int virDomainGetBlockInfoFinish(virDomainPtr dom,
>                                 virDomainAsyncResult res,
>                                 virDomainBlockInfoPtr info);
>
> This second approach is the way GIO works (see example in this page
> https://developer.gnome.org/gio/stable/GAsyncResult.html ). The main
> difference between them really is probably the way you get error
> reporting from the APIs. In the first example, libvirt would raise
> an error before it invoked the callback, with isError set to True.
> In the second example, the Finish() func would raise the error and
> return -1.
I need to check in more detail and sync up with the other VDSM developers,
but I have a feeling that the first approach is a bit easier for VDSM to consume.
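For context, here is a minimal sketch of the per-domain polling pattern that drives our thread count today (my own simplification using the standard libvirt-python bindings, not actual VDSM code; 'vnet0' is just an example NIC):

import threading
import time

import libvirt

def sample_domain(dom, interval=15):
    # each cycle issues several synchronous libvirt calls for one VM
    while True:
        try:
            dom.info()                    # virDomainGetInfo
            dom.getCPUStats(True)         # virDomainGetCPUStats
            dom.interfaceStats('vnet0')   # virDomainInterfaceStats, per NIC
        except libvirt.libvirtError:
            pass                          # the domain may have gone away
        time.sleep(interval)

conn = libvirt.open('qemu:///system')
for dom in conn.listAllDomains():
    # one sampling thread per VM: 100+ VMs means 100+ threads for this alone
    t = threading.Thread(target=sample_domain, args=(dom,))
    t.daemon = True
    t.start()

Either async or bulk APIs would let a single thread (or a small pool) drive all of this.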
Bests,
--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani
few notes from the scale setup
by Michal Skrivanek
Hi,
I'd like to share a couple of observations on the scale system, just a few chaotic notes, mostly for documentation purposes :-)
4 sockets, 8 cores, HT -> 64 CPUs
samples have been taken under +/- stable conditions
1) running 100 VMs; top sample with collapsed process usage (all threads summed up)
top - 15:00:48 up 5 days, 6:31, 1 user, load average: 2.25, 1.90, 2.05
Tasks: 1989 total, 4 running, 1983 sleeping, 1 stopped, 1 zombie
Cpu(s): 4.7%us, 1.9%sy, 0.0%ni, 93.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 396900992k total, 80656384k used, 316244608k free, 156632k buffers
Swap: 41148408k total, 0k used, 41148408k free, 12225520k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9057 vdsm 0 -20 17.5g 482m 9m S 162.7 0.1 446:46.28 vdsm
8726 root 20 0 979m 17m 7996 R 135.6 0.0 68:56.42 libvirtd
36937 qemu 20 0 1615m 296m 6328 S 59.2 0.1 0:19.18 qemu-kvm
38174 root 20 0 16496 2724 880 R 32.1 0.0 0:00.54 top
2458 qemu 20 0 1735m 533m 6328 S 11.1 0.1 2:38.87 qemu-kvm
10203 qemu 20 0 1736m 511m 6328 S 11.1 0.1 2:32.53 qemu-kvm
27774 qemu 20 0 1730m 523m 6328 S 11.1 0.1 2:22.36 qemu-kvm
25208 qemu 20 0 1733m 514m 6328 S 9.9 0.1 2:22.47 qemu-kvm
51594 qemu 20 0 1733m 650m 6328 S 9.9 0.2 3:53.42 qemu-kvm
… etc
[ this one's not from stable conditions, unfortunately, as can be seen by PID 36937]
----------------------------
2) running 185 VMs - all threads summed up. VDSM has 411 threads; the load is much higher; also note the high sys time
top - 07:10:28 up 5 days, 22:41, 1 user, load average: 19.10, 14.28, 13.17
Tasks: 2318 total, 9 running, 2308 sleeping, 0 stopped, 1 zombie
Cpu(s): 10.8%us, 21.0%sy, 0.0%ni, 67.8%id, 0.1%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 396900992k total, 157267616k used, 239633376k free, 175700k buffers
Swap: 41148408k total, 0k used, 41148408k free, 12669856k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9057 vdsm 0 -20 30.1g 856m 10m S 883.4 0.2 9818:59 vdsm
8726 root 20 0 975m 19m 7996 R 142.4 0.0 1370:15 libvirtd
19542 root 20 0 16700 3108 1020 R 15.1 0.0 0:18.11 top
17614 qemu 20 0 1730m 692m 6328 S 6.5 0.2 49:05.53 qemu-kvm
55545 qemu 20 0 1732m 708m 6328 S 6.3 0.2 48:42.01 qemu-kvm
28542 qemu 20 0 1724m 696m 6328 S 6.2 0.2 44:44.50 qemu-kvm
12482 qemu 20 0 1738m 822m 6328 S 6.0 0.2 51:02.71 qemu-kvm
… etc
breakdown per thread:
top - 07:05:43 up 5 days, 22:36, 1 user, load average: 12.50, 11.15, 12.00
Tasks: 3357 total, 35 running, 3321 sleeping, 0 stopped, 1 zombie
Cpu(s): 11.3%us, 16.0%sy, 0.0%ni, 72.5%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 396900992k total, 157238240k used, 239662752k free, 175700k buffers
Swap: 41148408k total, 0k used, 41148408k free, 12669856k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8726 root 20 0 979m 19m 7996 R 81.6 0.0 860:14.21 libvirtd
9103 vdsm 0 -20 30.1g 855m 10m R 63.7 0.2 502:03.90 vdsm
9144 vdsm 0 -20 30.1g 855m 10m R 59.5 0.2 276:23.93 vdsm
11150 vdsm 0 -20 30.1g 855m 10m R 59.0 0.2 399:37.86 vdsm
9100 vdsm 0 -20 30.1g 855m 10m R 42.7 0.2 354:41.75 vdsm
18053 root 20 0 17708 3968 1020 R 17.2 0.0 0:17.46 top
11845 vdsm 0 -20 30.1g 855m 10m S 15.4 0.2 114:48.08 vdsm
8755 root 20 0 979m 19m 7996 S 13.0 0.0 81:25.64 libvirtd
8753 root 20 0 979m 19m 7996 S 12.7 0.0 81:21.16 libvirtd
64396 root 20 0 979m 19m 7996 S 12.4 0.0 80:03.68 libvirtd
8754 root 20 0 979m 19m 7996 R 10.0 0.0 81:26.52 libvirtd
8751 root 20 0 979m 19m 7996 S 9.9 0.0 81:18.83 libvirtd
8752 root 20 0 979m 19m 7996 R 9.7 0.0 81:28.07 libvirtd
52567 vdsm 0 -20 30.1g 855m 10m S 4.9 0.2 9:27.75 vdsm
30617 vdsm 0 -20 30.1g 855m 10m S 3.8 0.2 34:40.75 vdsm
40621 vdsm 0 -20 30.1g 855m 10m S 3.8 0.2 34:12.40 vdsm
8952 vdsm 0 -20 30.1g 855m 10m S 3.8 0.2 24:03.79 vdsm
29818 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 9:21.33 vdsm
31418 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 35:09.03 vdsm
6858 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 34:10.79 vdsm
18513 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 34:44.03 vdsm
46247 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 34:16.65 vdsm
50759 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 34:04.86 vdsm
58612 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 31:58.11 vdsm
25872 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 31:03.81 vdsm
31599 vdsm 0 -20 30.1g 855m 10m S 3.7 0.2 31:10.85 vdsm
… etc
overall network usage:
0.5-3 Mbps, varying, roughly corresponding to the 15s interval
----------------------------
3) special case when vdsm was down, 185 VMs
top - 08:34:11 up 6 days, 4 min, 2 users, load average: 5.96, 5.20, 8.71
Tasks: 2314 total, 7 running, 2306 sleeping, 0 stopped, 1 zombie
Cpu(s): 8.6%us, 4.9%sy, 0.0%ni, 86.1%id, 0.0%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 396900992k total, 157512324k used, 239388668k free, 180100k buffers
Swap: 41148408k total, 0k used, 41148408k free, 12726620k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
47776 root 20 0 17212 3592 1020 R 16.5 0.0 0:16.32 top
35549 qemu 20 0 1806m 809m 6328 S 6.7 0.2 53:46.20 qemu-kvm
43240 qemu 20 0 1735m 694m 6328 S 6.5 0.2 44:58.63 qemu-kvm
51594 qemu 20 0 1733m 822m 6328 S 6.3 0.2 54:19.51 qemu-kvm
24881 qemu 20 0 1726m 704m 6328 S 6.1 0.2 48:50.63 qemu-kvm
58563 qemu 20 0 1728m 699m 6328 S 6.1 0.2 50:14.10 qemu-kvm
… etc
--------------------------
4) no disk space on /var/log (opened BZ 1115357) - lsof shows libvirtd still holding a deleted ~1.6 GB log file:
libvirtd 8726 root 4w REG 253,8 1638998016 34 /var/log/libvirtd.log (deleted)
--------------------------
5) startup of vdsm in the 185 VMs environment:
on vdsm service startup, the "vdsm: Running nwfilter" step took ~5 minutes to finish
then VM recovery took ~20-30 minutes!
overall we need to identify the specific threads and simulate specific issues in a debug-friendly environment to tell a bit more…
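A quick back-of-the-envelope comparison of the vdsm %CPU figures from samples 1 and 2 above (keeping in mind that sample 1 was not taken under stable conditions):

# vdsm %CPU per running VM, taken from the two top samples above
samples = {100: 162.7, 185: 883.4}
for vms, cpu in sorted(samples.items()):
    print('%d VMs: %.1f%% total, %.2f%% per VM' % (vms, cpu, cpu / vms))
# 100 VMs: 162.7% total, 1.63% per VM
# 185 VMs: 883.4% total, 4.78% per VM  -> clearly super-linear growth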
Thanks,
michal
github projects moved to gerrit
by Saggi Mizrahi
Hi, we moved all of our stuff from GitHub to
gerrit.ovirt.org.
If you had pending pull-requests for:
* ioprocess
* cpopen
* pthreading
Please resend them on Gerrit.
Don't forget to update your .git/config; a sketch of switching remotes follows the repo list below.
ssh://gerrit.ovirt.org:29418/pthreading
ssh://gerrit.ovirt.org:29418/ioprocess
ssh://gerrit.ovirt.org:29418/cpopen
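A minimal sketch for switching an existing checkout over (my own, equivalent to editing .git/config by hand; run it inside the checkout):

import subprocess

def switch_remote(project, remote='origin'):
    # repoint the given remote at the new Gerrit URL
    url = 'ssh://gerrit.ovirt.org:29418/%s' % project
    subprocess.check_call(['git', 'remote', 'set-url', remote, url])

# e.g., inside an ioprocess checkout:
# switch_remote('ioprocess')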
Bugzilla components will be opened shortly
oVirt 3.5 test day 1 results
by Nir Soffer
Hi all,
Today I tested "[RFE] Replace XML-RPC communication (engine-vdsm) with JSON-RPC based on bidirectional transport".
First I upgraded ovirt-3.4 stable engine to ovirt-3.5 - ok
Then I upgraded 4 hosts to latest vdsm - ok
I upgraded 2 data centers to cluster version 3.5:
- 2 Fedora 19 hosts with 30 ISCSI storage domains - ok
- 2 RHEL 6.5 hosts with 45 NFS storage domains - failed
I had to remove the hosts and the virtual machines to complete
the upgrade [1]
Then I removed the hosts and added them back (to configure jsonrpc), and
set up one host using jsonrpc and the other using xmlrpc - ok
After moving the hosts to maintenance mode and starting them up again, I found
that the host using jsonrpc was stuck in the "Unassigned" state [2],[3].
The errors in the vdsm log were not clear enough; after improving this [4],
I could fix it with a one-line patch [5].
Finally, when I had a working system, I ran some sanity tests:
- start/stop vm - ok
- create vm from template - ok
- migrate VMs between two hosts concurrently (one host using xmlrpc, one using jsonrpc) - ok
Then I tried to test creating a template from a VM, but I had low disk space
on that storage domain. So I tried to extend the domain, which would be
a useful test as well.
But it turns out that you cannot create or edit a block domain when using jsonrpc [6].
Looking at the logs, I also found that shutting down the protocol detector fails [7].
Summary:
- upgrade is broken in some cases - critical
- jsonrpc is not ready yet
- jsonrpc needs a lot of additional testing - for the next test day I suggest that one tester
from each team (virt, storage, networking, SLA?) test jsonrpc with the relevant
flows.
[1] https://bugzilla.redhat.com/1114994
Cannot edit cluster after upgrade from version 3.4 to 3.5 because cpu type (Intel Haswell) does not match
[2] https://bugzilla.redhat.com/1115033
StoragePool_disconnect: disconnect() takes exactly 4 arguments
[3] https://bugzilla.redhat.com/1115044
Host stuck in "Unassigned" state when using jsonrpc and disconnection from pool failed
[4] http://gerrit.ovirt.org/29457
bridge: Show more info when method call fail
[5] http://gerrit.ovirt.org/29465
api: Make remove optional
[6] https://bugzilla.redhat.com/show_bug.cgi?id=1115152
Cannot edit or create block storage domain when using jsonrpc
[7] https://bugzilla.redhat.com/1115104
Shutting down protocol detector fails
Nir
[test day] day 1 results
by Simone Tiraboschi
Hi,
during the test day I was supposed to test
https://bugzilla.redhat.com/show_bug.cgi?id=1108599 - OVIRT35 - [RFE] [spice-html5] spice-html5 js client is dumb: no error about network connection issue
However, I wasn't able to get a working test setup due to network troubles.
I'm on a fully virtualized setup with two Fedora 19 VMs on KVM with nested virtualization.
The engine got installed correctly on the first host, but I wasn't able to add the second VM as a hypervisor.
When trying to do that, network connectivity got lost on the second VM.
I tried more than once, always with the same result.
I opened a bug for that:
https://bugzilla.redhat.com/show_bug.cgi?id=1115420
Still no comment on the spice-html5 feature, because I'm still not able to test it.
test day upgrade failure
by Dan Kenigsberg
For the test day I tried to upgrade my ovirt-engine-3.4.2-1.el6.noarch to the 3.5 beta.
I encountered:
psql:/usr/share/ovirt-engine/dbscripts/upgrade/03_05_0050_event_notification_methods.sql:2: ERROR: constraint "fk_event_subscriber_event_notification_methods" of relation "event_subscriber" does not exist
FATAL: Cannot execute sql command: --file=/usr/share/ovirt-engine/dbscripts/upgrade/03_05_0050_event_notification_methods.sql
2014-07-02 09:16:13 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/otopi/context.py", line 142, in _executeMethod
method['method']()
File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/db/schema.py", line 291, in _misc
oenginecons.EngineDBEnv.PGPASS_FILE
File "/usr/lib/python2.6/site-packages/otopi/plugin.py", line 871, in execute
command=args[0],
RuntimeError: Command '/usr/share/ovirt-engine/dbscripts/schema.sh' failed to execute
2014-07-02 09:16:13 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Misc configuration': Command '/usr/share/ovirt-engine/dbscripts/schema.sh' failed to execute
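To confirm what the script trips over, here is a quick check whether that constraint is present in the engine DB (a sketch, assuming local psycopg2 access to the usual 'engine' database; credentials are placeholders):

import psycopg2

conn = psycopg2.connect(dbname='engine', user='engine',
                        host='localhost', password='...')  # placeholders
cur = conn.cursor()
cur.execute("SELECT conname FROM pg_constraint "
            "WHERE conname = 'fk_event_subscriber_event_notification_methods'")
print(cur.fetchone() or
      'constraint not present - 03_05_0050 has nothing to drop')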
Is this failure known? What is the remedy?
Dan.