[vdsm] logging levels and noise in the log
by Martin Sivak
Hi,
we discussed the right amount of logging with Nir and Francesco while reviewing my patches. Francesco was against one DEBUG log message that could potentially flood the logs, but the message in question was important for me and for SLA, because it logs the VDSM changes made in response to MoM commands.
Since DEBUG really is meant for debug messages, I have one proposal:
1) Change the default log level of vdsm.log to INFO
2) Log (only) DEBUG messages to a separate file vdsm-debug.log
3) Make vdsm-debug.log rotate faster (every hour, keeping only the last couple of hours?) so it does not grow too much
This way the customer would be able to monitor the INFO log (much smaller) without all the noise, and we would still be able to collect the DEBUG part in case something happens.
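For illustration, a minimal sketch of how 2) and 3) could look with the stock Python logging module (the paths, retention count, and filter are mine, not actual VDSM configuration):

    import logging
    import logging.handlers

    log = logging.getLogger('vds')
    log.setLevel(logging.DEBUG)

    # 1) vdsm.log receives INFO and above only
    info_handler = logging.FileHandler('/var/log/vdsm/vdsm.log')
    info_handler.setLevel(logging.INFO)
    log.addHandler(info_handler)

    # 2) only DEBUG records go to the separate vdsm-debug.log ...
    class DebugOnly(logging.Filter):
        def filter(self, record):
            return record.levelno == logging.DEBUG

    # 3) ... which rotates every hour and keeps only the last 4 hours
    debug_handler = logging.handlers.TimedRotatingFileHandler(
        '/var/log/vdsm/vdsm-debug.log', when='h', interval=1, backupCount=4)
    debug_handler.addFilter(DebugOnly())
    log.addHandler(debug_handler)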
What do you think?
--
Martin Sivák
msivak(a)redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ
Call for Papers Deadline In Two Days: Linux.conf.au
by Brian Proffitt
Conference: Linux.conf.au
Information: Each year, open source geeks from across the globe gather in Australia or New Zealand to meet their fellow technologists, share the latest ideas and innovations, and spend a week discussing and collaborating on open source projects. The conference is well known for the depth of talent among its speakers and delegates, and for its focus on technical Linux content.
Possible topics: Virtualization, oVirt, KVM, libvirt, RDO, OpenStack, Foreman
Date: January 12-15, 2015
Location: Auckland, New Zealand
Website: http://lca2015.linux.org.au/
Call for Papers Deadline: July 13, 2014
Call for Papers URL: http://lca2015.linux.org.au/cfp
Contact me for more information and assistance with presentations.
--
Brian Proffitt
oVirt Community Manager
Project Atomic Community Lead
Open Source and Standards, Red Hat - http://community.redhat.com
Phone: +1 574 383 9BKP
IRC: bkp @ OFTC
[Documentation] Assistance Needed with User-Facing Documentation
by Brian Proffitt
All:
The Red Hat ECS team has made a great effort to convert some of the more important downstream documentation to a MediaWiki format that we can post on oVirt.org as an official set of user- and admin-facing documentation. This is a bootstrapping effort to bring our upstream documentation up to date and a big step towards making it the canonical source in the near future.
Before that can happen, we need to get this RHEV-oriented information ported over to oVirt nomenclature and screenshots taken for oVirt and added to the documents as well.
I have placed the three guides
* Administration Guide[1]
* User's Guide[2]
* Installation Guide[3]
on the oVirt site as unlinked pages. General wording has been changed from RHEV to oVirt, but not consistently. Each of these documents must be reviewed and completely adapted to oVirt 3.4 before it can be posted as official documentation. Specifically:
* Review all text to ensure proper steps and descriptions for oVirt features and procedures
* Review all text to remove downstream-specific text
* Review all code for changes in package names and on-screen displays
* Replace all downstream RHEV screenshots with upstream oVirt 3.4 screenshots (Max width: 1024px)
I will be stepping through these documents to edit them in more detail under these guidelines, but help is most assuredly needed in order to get this done in a timely manner. oVirt.org wiki users can visit these pages, review them, and add their changes. MediaWiki has limited version control, so it is best to edit sections instead of entire documents to minimize stepping on others' changes. Editing by section will also help us track which sections have been edited, using the pages' histories.
Thank you in advance for all of your help on this project... when finished, this will represent a significant improvement to oVirt's documentation, and make oVirt that much easier to use.
Peace,
Brian
[1] http://www.ovirt.org/DraftAdministrationGuide
[2] http://www.ovirt.org/DraftUserGuide
[3] http://www.ovirt.org/DraftInstallationGuide
--
Brian Proffitt
oVirt Community Manager
Project Atomic Community Lead
Open Source and Standards, Red Hat - http://community.redhat.com
Phone: +1 574 383 9BKP
IRC: bkp @ OFTC
Exception during VM recovery causes VMs not being properly recovered
by Vinzenz Feenstra
Hi,
With the current master of VDSM, after restarting VDSM (e.g. after upgrading) I noticed that the VMs were not properly initialized and were left in PAUSED state. Checking the logs, I found the cause here:
Thread-13::INFO::2014-07-10 12:11:56,400::vm::2244::vm.Vm::(_startUnderlyingVm) vmId=`db614831-3b4b-4010-a989-f7a5ae6fa5d0`::Skipping errors on recovery
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2228, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 3312, in _run
    self._domDependentInit()
  File "/usr/share/vdsm/virt/vm.py", line 3204, in _domDependentInit
    self._syncVolumeChain(drive)
  File "/usr/share/vdsm/virt/vm.py", line 5686, in _syncVolumeChain
    volumes = self._driveGetActualVolumeChain(drive)
  File "/usr/share/vdsm/virt/vm.py", line 5665, in _driveGetActualVolumeChain
    sourceAttr = ('file', 'dev')[drive.blockDev]
TypeError: tuple indices must be integers, not NoneType
The reason here seems to be this:
Thread-13::DEBUG::2014-07-10 12:11:56,393::vm::1349::vm.Vm::(blockDev) vmId=`db614831-3b4b-4010-a989-f7a5ae6fa5d0`::Unable to determine if the path '/rhev/data-center/00000002-0002-0002-0002-000000000002/41b6de4e-23da-481d-904d-9af24fc5f3ab/images/17206f99-38ab-45bc-ae9b-d36a66b00e4c/7b05de43-9d85-435f-8ae9-6ccde21548e4' is a block device
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1346, in blockDev
    self._blockDev = utils.isBlockDevice(self.path)
  File "/usr/lib64/python2.6/site-packages/vdsm/utils.py", line 99, in isBlockDevice
    return stat.S_ISBLK(os.stat(path).st_mode)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/00000002-0002-0002-0002-000000000002/41b6de4e-23da-481d-904d-9af24fc5f3ab/images/17206f99-38ab-45bc-ae9b-d36a66b00e4c/7b05de43-9d85-435f-8ae9-6ccde21548e4'
I am running the host on RHEL 6.5.
Note: I just rebooted the host, started a few more VMs, and when I restarted VDSM I got the same errors again.
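To make the failure mode concrete, here is a minimal reproduction (the block_dev helper is my paraphrase of the vm.py property; only the tuple-indexing idiom comes straight from the traceback):

    import os
    import stat

    def block_dev(path):
        try:
            # what utils.isBlockDevice() does; raises OSError if the path is gone
            return stat.S_ISBLK(os.stat(path).st_mode)
        except OSError:
            return None  # "Unable to determine if the path ... is a block device"

    # _driveGetActualVolumeChain() then indexes a tuple with that value:
    ('file', 'dev')[False]  # -> 'file'
    ('file', 'dev')[True]   # -> 'dev'
    ('file', 'dev')[None]   # raises TypeError: tuple indices must be integers, not NoneType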
--
Regards,
Vinzenz Feenstra | Senior Software Engineer
RedHat Engineering Virtualization R & D
Phone: +420 532 294 625
IRC: vfeenstr or evilissimo
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
by Saggi Mizrahi
The more I think about it, the more it looks like a purely libvirt issue. As long as we can make calls that get stuck in D state, we can't scale under stress.
In any case, an interface like

    libvirtConnectionPool.request(args, callback)

would be a better solution than a thread pool. It would queue up the request and call the callback once it's done.
pseudo example:

    def collectStats():
        def callback(resp):
            doStuff(resp)
            # reschedule the next sample; Timer objects must be start()ed
            threading.Timer(4, collectStats).start()
        lcp.listAllDevices(callback)
We could have the timer queue the work into a thread pool, since it normally runs in the main thread, but that is orthogonal to the libvirt connection issue.
As for the thread pool itself: as long as it's more than one class, it's too complicated. Things like:
* Task Cancelling
* Re-queuing (periodic operations)
shouldn't be part of the thread pool (a minimal sketch of what remains is below).
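Something like this single-class pool would be enough (Python 2, matching VDSM at the time; the class and names are mine, purely illustrative):

    import logging
    import threading
    import Queue  # 'queue' on Python 3

    class ThreadPool(object):
        # Workers pull plain callables off a queue; nothing else.

        def __init__(self, workers=4):
            self._tasks = Queue.Queue()
            for _ in range(workers):
                worker = threading.Thread(target=self._loop)
                worker.daemon = True
                worker.start()

        def queue(self, task):
            self._tasks.put(task)

        def _loop(self):
            while True:
                task = self._tasks.get()
                try:
                    task()
                except Exception:
                    logging.exception('task %r failed', task)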
Tasks should just be functions. If we need something with state, we can use the __call__() method to make an object look like a function, as sketched below. I also don't mind doing

    if hasattr(task, "run") or callable(task)

to handle both styles through an ad-hoc "interface".
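For example, a stateful task that still looks like a plain function (a sketch; the names are mine):

    class SampleVmStats(object):
        # Stateful task: carries the VM id across invocations.

        def __init__(self, vm_id):
            self.vm_id = vm_id

        def __call__(self):
            print('sampling VM %s' % self.vm_id)  # stand-in for the real work

    def run(task):
        # the ad-hoc "interface": accept both styles
        if hasattr(task, 'run'):
            task.run()
        elif callable(task):
            task()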
Re-queuing could be done with the built-in threading.Timer, as in

    threading.Timer(4.5, threadpool.queue, args=(self,)).start()

That way each operation is responsible for handling the details of its own rescheduling: should we always wait X, or should we wait X - timeItTookToCalculate?
You could also do:

    threading.Timer(2, threadpool.queue, args=(self,)).start()
    threading.Timer(4, self.handleBeingLate).start()

which would handle not getting queued for a certain amount of time.
Task cancelling can't be done in a generic manner. The most I think we could do is have threadpool.stop() check hasattr(task, "stop") and call it.
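Continuing the pool sketch above, and assuming it tracks its running tasks in a self._running_tasks list (my assumption), that would look like:

    def stop(self):
        # best-effort, cooperative cancellation: only tasks that
        # expose a stop() method get asked to stop
        for task in self._running_tasks:
            if hasattr(task, 'stop'):
                task.stop()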
----- Original Message -----
> From: "Francesco Romani" <fromani(a)redhat.com>
> To: devel(a)ovirt.org
> Sent: Friday, July 4, 2014 5:48:59 PM
> Subject: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
>
> Hi,
>
> Nir has begun reviewing my draft patches for the thread pool and sampling
> refactoring (thanks!), and has already suggested quite a few improvements,
> which I'd like to summarize.
>
> Quick links to the ongoing discussion:
> http://gerrit.ovirt.org/#/c/29191/8/lib/threadpool/worker.py,cm
> http://gerrit.ovirt.org/#/c/29190/4/lib/threadpool/README.rst,cm
>
> Quick summary of the discussion on gerrit so far:
> 1. extract the scheduling logic from the thread pool: either add a separate
> scheduler class, or let the sampling tasks reschedule themselves after a
> successful completion. Either way, the concept of a 'periodic task', and the
> added complexity it brings, isn't needed.
>
> 2. drop all the *queue classes I've added, thus making the package simpler.
> They are no longer needed once we remove the concept of a periodic task.
>
> 3. have per-task timeouts; move the stuck-task detection elsewhere, like into
> the worker thread, or maybe better into the aforementioned scheduler.
> If the scheduler finds that a task started in the previous pass (or even
> before!) has not yet completed, there is no point in keeping that task alive,
> and it should be cancelled.
>
> 4. the sampling task (or maybe the scheduler) can be smarter and halt
> sampling in the presence of unresponsive calls for a given VM, provided the
> VM reports its 'health'/responsiveness.
>
> (Hopefully I haven't forgotten anything big)
>
> In the draft currently published, I reluctantly added the *queue classes,
> and I agree the periodic task implementation is messy, so I'll be very happy
> to drop them.
>
> However, a core question still holds: what do we do in the presence of a
> stuck task?
>
> I think this topic is worth discussing on a medium friendlier than gerrit,
> as it is the single most important decision to make in the sampling
> refactoring.
>
> It all boils down to: should we just keep stuck threads around somewhere and
> wait? Or should we cancel stuck tasks?
>
> A. Let's cancel the stuck tasks.
> If we move toward a libvirt connection pool, and we give each worker thread
> in the sampling pool a separate libvirt connection, hopefully read-only, then
> we should be able to cancel a stuck task by killing the worker's libvirt
> connection. We'll still need a (probably much simpler) watchman/supervisor,
> but no big deal here.
> Libvirt allows closing a connection from a different thread.
> I haven't actually tried to unstick a blocked thread this way, but I have no
> reason to believe it won't work.
>
> B. Let's keep the blocked threads around.
> The code as it is just leaves the blocked libvirt call, and the worker thread
> that carried it, frozen. The stuck worker thread can be replaced, up to a cap
> of frozen threads. In the worst-case scenario we end up with one (blocked!)
> thread per VM, as it is today, and with no sampling data.
>
> I believe that #A has some drawbacks which we risk overlooking, while at the
> same time #B has some merits.
>
> Let me explain:
> The hardest case is a call blocked in the kernel in D state. Libvirt has no
> more room than VDSM to unblock it; and libvirt itself *has* a pool of
> resources (threads, in this case) which can be depleted by stuck calls.
> Actually, retrying a failed task may deplete their pool even faster[1].
>
> I'm not happy to just push this problem down the stack, as it looks to me
> that we gain very little by doing so. VDSM itself surely stays cleaner, but
> the VDS/hypervisor host as a whole improves just a bit: libvirt scales
> better, and that gives us some more room.
>
> On the other hand, by not reissuing dangerous calls, I believe we make
> better use of the host's resources in general. Actually, keeping blocked
> threads around is a side effect of not reattempting blocked calls. Moreover,
> keeping a blocked thread around has a significant benefit: we can discover at
> the earliest possible moment when it is safe to make the blocked call again,
> because the blocked call itself returns and we can track this event! (and of
> course drop the now-stale result). Otherwise, if we drop the connection, we
> lose this event and have no option but to try again and hope for the best[2].
>
> I know the #B approach is not the cleanest, but I think it has slightly more
> appeal, especially
> on the libvirt depletion front.
>
> Thoughts and comments very welcome!
>
> +++
>
> [1] They have extensions to the management API to dynamically adjust their
> thread pool and/or to cancel tasks, but those are in the RHEL 7.2 timeframe.
> [2] A crazy idea would be to do something like
> http://en.wikipedia.org/wiki/Exponential_backoff
> which I'm not sure would be beneficial
>
> Bests and thanks,
>
> --
> Francesco Romani
> RedHat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani
> _______________________________________________
> Devel mailing list
> Devel(a)ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>
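For reference, the exponential backoff idea from footnote [2] in its simplest form (an illustrative sketch, not a concrete proposal for VDSM):

    import time

    def retry_with_backoff(call, max_tries=5, base=1.0, cap=60.0):
        # Retry call(), doubling the wait after each failure.
        delay = base
        for attempt in range(1, max_tries + 1):
            try:
                return call()
            except EnvironmentError:
                if attempt == max_tries:
                    raise
                time.sleep(min(delay, cap))
                delay *= 2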
3.6 release in bugzilla
by Michal Skrivanek
Can someone please add 3.6 to the oVirt releases in Bugzilla?
Thanks,
michal