Jenkins and gerrit emails
by David Caro
Hi everyone,
While messing with the gerrit flags I have found a way to effectively minimize
the emails from Jenkins reviews (something that annoyed a lot of you) while
still keeping the information about what is going on visible in gerrit.
Now you will not get any emails when jobs start, and only the owner should get
emails when jobs finish. All those events are still recorded in the gerrit
comments.
Let me know if that does not work for you, and why, and we'll try to adapt it
to find the best solution for everyone.
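(For reference, a minimal sketch of the mechanism involved: Gerrit's SSH
"review" command takes a --notify flag (NONE / OWNER / OWNER_REVIEWERS / ALL)
that controls who gets mailed for a given comment. This is not the actual
ovirt.org Jenkins/gerrit-trigger configuration; the host, change values and
function name below are placeholders.)

    import subprocess

    def post_ci_comment(change, patchset, message, notify="OWNER",
                        host="gerrit.ovirt.org", port=29418):
        """Post a CI comment on a change while limiting the emails it triggers.

        Sketch only: the real setup is done through the Jenkins gerrit-trigger
        plugin, but the underlying knob is the same --notify option of
        'gerrit review'.
        """
        subprocess.check_call([
            "ssh", "-p", str(port), host,
            "gerrit", "review",
            "--notify", notify,
            "--message", "'%s'" % message,
            "%s,%s" % (change, patchset),
        ])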
-- 
David Caro
Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D
Tel.: +420 532 294 605
Email: dcaro(a)redhat.com
Web: www.redhat.com
RHT Global #: 82-62605
Gluster locking the DC
by Christopher Pereira
Hi,
1) FYI: https://bugzilla.redhat.com/show_bug.cgi?id=1217576
I will need some help from a Gluster expert to debug this issue, since I
don't know Gluster internals.
On a replica-3 cluster set up following best practices, it's very frustrating
to see the whole DC go down because of a FS lock.
2) Should we at some point freeze gluster updates and stick to a stable
version, or are we following new features required by oVirt?
Gluster still feels very unstable to me, but maybe it's just me.
I guess that if a "stat" operation (mount, lsof, etc.) on a gluster volume
hangs, the problem is in the layer below sanlock, right? (See the timed-stat
sketch after point 3 below.)
I don't think sanlock should be able to lock up Gluster, should it?
3) My feature request:
Sanlock + Gluster unit tests, written by the sanlock authors, that stress and
exploit all of Gluster's weaknesses.
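To make the "stat hangs" check above concrete, here is a minimal probe sketch
(my addition, not existing oVirt/vdsm code; it assumes Python 3.5+ and the
coreutils 'stat' binary, and 'mount_responds' is a made-up name): a plain stat
of the mountpoint that is killed after a timeout, so a wedged FUSE mount shows
up as False instead of hanging the caller.

    import subprocess

    def mount_responds(mountpoint, timeout=10):
        """Return True if a plain stat of the gluster mountpoint finishes in time.

        Hypothetical diagnostic helper: if this times out, the FUSE mount
        itself is stuck, i.e. the layer below sanlock is what is locking up.
        """
        try:
            subprocess.run(["stat", mountpoint], check=True,
                           stdout=subprocess.DEVNULL, timeout=timeout)
            return True
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
            return False

    # e.g. mount_responds("/rhev/data-center/mnt/glusterSD/<host>:_<volume>")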
PS: The good news is that [1] and other related issues were finally fixed.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1201355
Best regards,
Christopher.
[vdsm] VmDevices rework
by Martin Polednik
Hello everyone,
I have started working on a line of patches that deal with the current state of VM devices
in VDSM. There is a wiki page at [1] describing the issues, the phases, and the final goals
of this work, along with a formal naming system for device-related code.
I would love to hear your opinions and comments about this effort!
(and please note that this is very long-term work)
[1] http://www.ovirt.org/Feature/VmDevices_rework
mpolednik
"Please activate the master Storage Domain first"
by Christopher Pereira
The DC's master storage domain is on an (unrecoverable) storage on a remote
dead host.
The engine automatically sets another storage domain as "Data (Master)",
but seconds later the unrecoverable storage is marked as "Data (Master)" again.
There is no way to start the Datacenter.
Both storage domains are gluster. The old (unrecoverable) one worked fine as a
master.
Any hint?
Logs:
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::592::Storage.TaskManager.Task::(_updateState) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::moving from state init -> state preparing
Thread-32620::INFO::2015-04-28 16:34:02,508::logUtils::48::dispatcher::(wrapper) Run and protect: getAllTasksStatuses(spUUID=None, options=None)
Thread-32620::ERROR::2015-04-28 16:34:02,508::task::863::Storage.TaskManager.Task::(_setError) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 870, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2202, in getAllTasksStatuses
    raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::882::Storage.TaskManager.Task::(_run) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Task._run: bf487090-8d62-4b42-bfde-93574a8e1486 () {} failed - stopping task
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::1214::Storage.TaskManager.Task::(stop) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::stopping in state preparing (force False)
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::990::Storage.TaskManager.Task::(_decref) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::ref 1 aborting True
Thread-32620::INFO::2015-04-28 16:34:02,508::task::1168::Storage.TaskManager.Task::(prepare) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::aborting: Task is aborted: 'Not SPM' - code 654
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::1173::Storage.TaskManager.Task::(prepare) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Prepare: aborted: Not SPM
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::990::Storage.TaskManager.Task::(_decref) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::ref 0 aborting True
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::925::Storage.TaskManager.Task::(_doAbort) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Task._doAbort: force False
Thread-32620::DEBUG::2015-04-28 16:34:02,508::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::592::Storage.TaskManager.Task::(_updateState) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::moving from state preparing -> state aborting
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::547::Storage.TaskManager.Task::(__state_aborting) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::_aborting: recover policy none
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::592::Storage.TaskManager.Task::(_updateState) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::moving from state aborting -> state failed
Thread-32620::DEBUG::2015-04-28 16:34:02,508::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-32620::DEBUG::2015-04-28 16:34:02,508::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-32620::ERROR::2015-04-28 16:34:02,509::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-32620::DEBUG::2015-04-28 16:34:02,509::stompReactor::158::yajsonrpc.StompServer::(send) Sending response
Jenkins upgrade today
by David Caro
Hi!
Today I will be upgrading Jenkins to the latest stable version. The upgrade
will start at 17:00 CEST (18:00 IST) and will take around 30 minutes.
I will set Jenkins to stop accepting new jobs 30 minutes before that to let any
pending jobs finish. Any job that does not finish in that time will be
aborted; if that happens, feel free to retrigger your patch at [1].
Cheers!
[1] http://jenkins.ovirt.org/gerrit_manual_trigger/
-- 
David Caro
Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D
Tel.: +420 532 294 605
Email: dcaro(a)redhat.com
Web: www.redhat.com
RHT Global #: 82-62605
Re: [ovirt-devel] [ovirt-users] oVirt HA.
by Sven Kieske
On 29/04/15 21:53, Dan Yasny wrote:
> There is always room for improvement, but think about it: ever since SolidICE, there has been a demand to minimize the amount of hardware used in a minimalistic setup, thus the hosted engine project. And now that we have it, all of a sudden, we need to provide a way to make multiple engines work in active/passive mode? If that capability is provided, I'm sure a new demand will arise, asking for active/active engines, infinitely scalable, and so on.
Of course you want active/active clusters for an enterprise product, rather
sooner than later.
>
> The question really is, where the line is drawn. The engine downtime can be a few minutes, it's not that critical in setups of hundreds of hosts. oVirt's raison d'etre is to make VMs run, everything else is just plumbing around that.
I disagree:
oVirt is a provider of critical infrastructure
(VMs and their management) for modern IT businesses.
Imagine a large organisation using just oVirt for its virtualization,
with lots of different departments which can spawn their own VMs at will,
maybe even from different countries in different time zones (just
like Red Hat ;) ).
Of course, if just the engine service is down for some reason and you
can simply restart it with an outage of a few seconds, or maybe a minute -
fine.
But anything above a minute could become critical for large orgs
relying on the ability to spawn VMs at any given time.
Or imagine critical HA VMs running on oVirt:
you can't migrate them when the engine is not running.
You might not even want a downtime of a single second for them; that's
why you implemented things like live migration in the first place.
The bottom line is:
if you manage critical infrastructure, the tools used to manage
that infrastructure have to be as reliable as the infrastructure itself.
--
Kind regards / Regards
Sven Kieske
System Administrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Managing director: Robert Meyer
Tax no.: 331/5721/1033, VAT ID: DE814773217, HRA 6640, Bad Oeynhausen local court
General partner: Robert Meyer Verwaltungs GmbH, HRB 13260, Bad Oeynhausen local court
Validation issue with IntegerEntityModelTextBoxEditor
by Ramesh
Hi,
We have a validation issue with IntegerEntityModelTextBoxEditor. It
converts non-numeric values to null in the model. As a result,
IntegerValidation.validate() always succeeds, even for non-integer
input.
Any suggestions on how to fix this issue?
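To illustrate the pattern (this is not the actual GWT editor code, just a
language-neutral sketch in Python with a made-up name): the usual fix is to
validate the raw text before it is converted, so a failed parse surfaces as an
error instead of becoming null and sailing past IntegerValidation.

    def validate_integer_field(raw_text):
        """Parse an integer field, raising instead of silently returning None.

        Hypothetical sketch: with the buggy pattern, a failed parse yields
        None and the later "is it a valid integer?" check happily accepts
        None; checking the raw text first closes that hole.
        """
        text = raw_text.strip()
        try:
            return int(text)
        except ValueError:
            raise ValueError("'%s' is not a valid integer" % raw_text)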
Regards,
Ramesh
"Could not find working O_DIRECT alignment. Try cache.direct=off"
by Christopher Pereira
virDomainCreateXML() is returning "Could not find working O_DIRECT
alignment. Try cache.direct=off".
This image was working a few days ago with a previous version.
Did the image get corrupted, or could this be related to a recent change
in oVirt?
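(Not an answer, but a check that may help narrow it down: qemu's cache=none
needs O_DIRECT to work on the underlying mount, so probing whether a direct
read of the image still succeeds shows whether the storage layer rejects
O_DIRECT at all. A minimal sketch of mine, assuming Python 3.5+ and coreutils
'dd'; 'o_direct_readable' is a made-up name, not vdsm code.)

    import subprocess

    def o_direct_readable(image_path, block_size=4096, timeout=30):
        """Try a single O_DIRECT read of the image via dd iflag=direct."""
        try:
            subprocess.run(
                ["dd", "if=" + image_path, "of=/dev/null",
                 "bs=%d" % block_size, "count=1", "iflag=direct"],
                check=True, stderr=subprocess.DEVNULL, timeout=timeout)
            return True
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            return False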
--- Logs ---
Thread-694::INFO::2015-04-30 03:32:49,512::vm::1765::vm.Vm::(_run) vmId=`bc99ded7-edc9-4bc5-bca5-a61a286757d2`::<?xml version="1.0" encoding="utf-8"?>
<domain type="kvm">
    <name>Test 31</name>
    <uuid>bc99ded7-edc9-4bc5-bca5-a61a286757d2</uuid>
    <memory>12582912</memory>
    <currentMemory>12582912</currentMemory>
    <vcpu current="4">16</vcpu>
    <devices>
        <channel type="unix">
            <target name="com.redhat.rhevm.vdsm" type="virtio"/>
            <source mode="bind" path="/var/lib/libvirt/qemu/channels/bc99ded7-edc9-4bc5-bca5-a61a286757d2.com.redhat.rhevm.vdsm"/>
        </channel>
        <channel type="unix">
            <target name="org.qemu.guest_agent.0" type="virtio"/>
            <source mode="bind" path="/var/lib/libvirt/qemu/channels/bc99ded7-edc9-4bc5-bca5-a61a286757d2.org.qemu.guest_agent.0"/>
        </channel>
        <input bus="ps2" type="mouse"/>
        <sound model="ich6"/>
        <memballoon model="virtio">
            <address bus="0x00" domain="0x0000" function="0x0" slot="0x06" type="pci"/>
        </memballoon>
        <controller index="0" ports="16" type="virtio-serial">
            <address bus="0x00" domain="0x0000" function="0x0" slot="0x04" type="pci"/>
        </controller>
        <video>
            <address bus="0x00" domain="0x0000" function="0x0" slot="0x02" type="pci"/>
            <model heads="1" type="cirrus" vram="32768"/>
        </video>
        <graphics autoport="yes" listen="0" passwd="*****" passwdValidTo="1970-01-01T00:00:01" port="-1" type="vnc"/>
        <interface type="bridge">
            <address bus="0x00" domain="0x0000" function="0x0" slot="0x03" type="pci"/>
            <mac address="00:1a:4a:16:01:51"/>
            <model type="virtio"/>
            <source bridge=";vdsmdummy;"/>
            <filterref filter="vdsm-no-mac-spoofing"/>
            <link state="up"/>
        </interface>
        <disk device="cdrom" snapshot="no" type="file">
            <address bus="1" controller="0" target="0" type="drive" unit="0"/>
            <source file="" startupPolicy="optional"/>
            <target bus="ide" dev="hdc"/>
            <readonly/>
            <serial/>
        </disk>
        <disk device="disk" snapshot="no" type="file">
            <address bus="0x00" domain="0x0000" function="0x0" slot="0x05" type="pci"/>
            <source file="/rhev/data-center/00000001-0001-0001-0001-00000000007e/3233144b-7be1-445f-9ea6-6aebbacbb93f/images/5a7db80d-f037-4dcb-8059-6706a2b62b19/db63659c-c9a9-4f4f-887f-3443a008a27e"/>
            <target bus="virtio" dev="vda"/>
            <serial>5a7db80d-f037-4dcb-8059-6706a2b62b19</serial>
            <boot order="1"/>
            <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
        </disk>
    </devices>
    <os>
        <type arch="x86_64" machine="rhel6.5.0">hvm</type>
        <smbios mode="sysinfo"/>
    </os>
    <sysinfo type="smbios">
        <system>
            <entry name="manufacturer">oVirt</entry>
            <entry name="product">oVirt Node</entry>
            <entry name="version">7-1.1503.el7.centos.2.8</entry>
            <entry name="serial">4C4C4544-0042-4B10-8058-B2C04F5A5A31</entry>
            <entry name="uuid">bc99ded7-edc9-4bc5-bca5-a61a286757d2</entry>
        </system>
    </sysinfo>
    <clock adjustment="3600" offset="variable">
        <timer name="rtc" tickpolicy="catchup"/>
        <timer name="pit" tickpolicy="delay"/>
        <timer name="hpet" present="no"/>
    </clock>
    <features>
        <acpi/>
    </features>
    <cpu match="exact">
        <model>SandyBridge</model>
        <topology cores="1" sockets="16" threads="1"/>
    </cpu>
</domain>
Thread-694::ERROR::2015-04-30 03:32:49,978::vm::741::vm.Vm::(_startUnderlyingVm) vmId=`bc99ded7-edc9-4bc5-bca5-a61a286757d2`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 689, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1800, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: internal error: process exited while connecting to monitor: 2015-04-30T06:32:49.783554Z qemu-kvm: -drive file=/rhev/data-center/00000001-0001-0001-0001-00000000007e/3233144b-7be1-445f-9ea6-6aebbacbb93f/images/5a7db80d-f037-4dcb-8059-6706a2b62b19/db63659c-c9a9-4f4f-887f-3443a008a27e,if=none,id=drive-virtio-disk0,format=raw,serial=5a7db80d-f037-4dcb-8059-6706a2b62b19,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /rhev/data-center/00000001-0001-0001-0001-00000000007e/3233144b-7be1-445f-9ea6-6aebbacbb93f/images/5a7db80d-f037-4dcb-8059-6706a2b62b19/db63659c-c9a9-4f4f-887f-3443a008a27e: Could not find working O_DIRECT alignment. Try cache.direct=off.
---
Related BZ : https://bugzilla.redhat.com/show_bug.cgi?id=1184363