oVirt October Updates

Hope to meet many next week in LinuxCon Europe/KVM Forum/oVirt conference in Edinburgh.

Highlights:
- oVirt 3.3 GA of course! Some links covering it [10]; more links in Russian, Japanese, German and Italian [15]
- oVirt 3.3.1 nearing beta - this will include a slew of bug fixes, and a few items which missed 3.3
- Malini Rao sent a proposal for some UI changes [4]
- Some pluggable scheduler samples are available [5]
- Summarized 3.4 community requests; hope the top ones will make it into 3.4 [6]
- 3.4 cycle - planned to be a shorter cycle, trying to make features available faster
- René Koch published version 0.1 of the Monitoring UI-Plugin [8]
- A new oVirt module merged into Ansible - covers the VM life cycle (create/start/stop/etc.) [11]
- A Puppet module to deploy ovirt engine and ovirt node [12], and a Chef one [14] (both by Jason Cannon)

Conferences:
- LinuxCon Europe/KVM Forum/oVirt - next week in Edinburgh, lots of sessions: http://www.ovirt.org/KVM_Forum_2013
- FOSDEM 2014 - we will be co-organizing a "Virtualization and IaaS" DevRoom, with +Lars Kurth, +Itamar Heim and +Thierry Carrez as DevRoom organizers
- Past: Open World Forum - "Building an open hybrid cloud with open source" [13]; K-LUG (Rochester, MN Area Linux Users Group) - oVirt intro [16]

Other:
- A three-part series by Humble on using the Python SDK:
  - How to shut down/stop or start virtual machines (VMs) in an oVirt DC automatically [1]
  - List datacenters, hypervisors/hosts and VMs in an oVirt DC with their name, status, id, etc. [2]
  - Find out the hosts and clusters where a VM is running, plus its status, ids and storage domain details, in an oVirt DC [3]
- Another one by Humble: convert physical/virtual systems to virtual using virt-p2v and virt-v2v, then use them in an oVirt DC [7]
- LINBIT published "High-Availability oVirt-Cluster with iSCSI-Storage" [9]

Have fun,
Itamar

[1] http://humblec.com/ovirt-shutdownstop-start-virtual-machines-vms-ovirt-dc-au...
[2] http://humblec.com/ovirt-list-datacenter-hypervisorshostvms-ovirt-dc-status-...
[3] http://humblec.com/ovirt-find-hosts-clusters-vm-running-status-ids-storage-d...
[4] http://lists.ovirt.org/pipermail/users/2013-October/017088.html
[5] http://gerrit.ovirt.org/gitweb?p=ovirt-scheduler-proxy.git;a=tree;f=plugins/...
[6] http://lists.ovirt.org/pipermail/users/2013-October/016898.html
[7] http://humblec.com/convert-physical-virtual-virtual-using-virt-v2v-virt-p2v-...
[8] http://lists.ovirt.org/pipermail/users/2013-October/016842.html
[9] Note: the link requires registration: http://www.linbit.com/en/company/news/333-high-available-virtualization-at-a...
    The PDF is: http://www.linbit.com/en/downloads/tech-guides?download=68:high-availability...
[10] oVirt 3.3 GA links:
    http://lists.ovirt.org/pipermail/announce/2013-September/000061.html
    http://www.ovirt.org/OVirt_3.3_release_announcement
    http://www.redhat.com/about/news/archive/2013/9/ovirt-3-3-release-brings-ope...
    http://community.redhat.com/ovirt-3-3-glusterized/
    http://community.redhat.com/ovirt-3-3-spices-up-the-software-defined-datacen...
    http://www.tuicool.com/articles/eIJnMr
    http://www.phoronix.com/scan.php?page=news_item&px=MTQ2MzQ
    http://captainkvm.com/2013/09/614/
    http://www.vmwareindex.com/category/hadoop-splunk/openstack/ovirt-3-3-is-now...
    http://thehyperadvisor.com/2013/09/17/ovirt-3-3-is-now-released/
[11] https://github.com/ansible/ansible/pull/3838
[12] http://forge.puppetlabs.com/jcannon/ovirt/0.0.1
[13] http://www.openworldforum.org/en/schedule/1/
     http://www.openworldforum.org/en/speakers/49/
[14] http://community.opscode.com/cookbooks/ovirt
[15] Russian:
     http://www.opennet.ru/opennews/art.shtml?num=38000
     http://ru.fedoracommunity.org/content/%D0%92%D1%8B%D1%88%D0%B5%D0%BB-ovirt-%...
     http://linuxforum.ru/viewtopic.php?id=30572
     http://citrix.pp.ru/news/2214-novaya-versiya-sistemy-upravleniya-infrastrukt...
     German:
     http://www.pro-linux.de/news/1/20258/ovirt-33-unterstuetzt-openstack.html
     Japanese:
     http://en.sourceforge.jp/magazine/13/09/25/163000
     Italian:
     http://www.freeonline.org/cs/com/la-release-ovirt-3-3-integra-openstack-e-gl...
     http://www.tomshw.it/cont/news/open-source-red-hat-gestisce-la-virtualizzazi...
[16] http://k-lug.org/OldNews

On Wed, Oct 16, 2013 at 1:01 PM, Itamar Heim wrote:
Hope to meet many next week in LinuxCon Europe/KVM Forum/oVirt conference in Edinburgh
Unfortunately not me ;-(
- LINBIT published "High-Availability oVirt-Cluster with iSCSI-Storage"[9]
[9] Note: the link requires registration: http://www.linbit.com/en/company/news/333-high-available-virtualization-at-a... The PDF is: http://www.linbit.com/en/downloads/tech-guides?download=68:high-availability...
About 3 years ago I successfully tested plain Qemu/KVM with two nodes and DRBD dual-primary with live migration on Fedora, so I know it works very well. I think the LINBIT guide gives many suggestions; one of them in particular is just another way to get a highly available oVirt Engine, which has been a hot topic lately. Careful attention has to be paid to the fencing mechanism though, as is always the case when talking about DRBD, Pacemaker and oVirt. I don't know it very well, but the DRBD Proxy feature (closed source and not free) could also be an option for DR in small environments.

I'm going to experiment with the solution from the paper (possibly using the clusterlabs.org repository for the cluster stack, and so corosync instead of heartbeat). By comparison, for small environments where two nodes are enough, in my opinion GlusterFS is far from robust at the moment: with 3.4.1 on Fedora 19 I cannot put a host into maintenance and get a workable node back after a normal reboot. I didn't find any "safe" way for a node in a two-node distributed-replicated configuration to leave the cluster and come back without going crazy manually resolving the brick differences generated in the meantime. And this was with only one VM running... Possibly I have to dig more into this, but I didn't find great resources for it. Any tips/links are welcome.

I would like oVirt to be more aware of this, and to have some capability to resolve by itself the misalignments generated on the Gluster backend during maintenance of a node. At the moment it seems to me it only shows that volumes are OK in the sense of being started, but the bricks could be very different... For example, another tab with details about heal info; something like the output of the commands

gluster volume heal $VOLUME info

and/or

gluster volume heal $VOLUME info split-brain

so that if one finds 0 entries, he/she can relax, and at the same time doesn't risk being erroneously calm in the other scenario... Just my opinion.

Gianluca
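A minimal sketch of the kind of check described above (not from the original message; the volume name is an argument, and the parsing of the "Number of entries:" lines follows the heal info output shown later in this thread):

#!/bin/bash
# Illustrative only: report whether a replicated gluster volume has
# pending heal entries or split-brain files.
VOLUME="${1:?usage: $0 <volume>}"

# Sum the "Number of entries:" counts over all bricks.
pending=$(gluster volume heal "$VOLUME" info 2>/dev/null \
          | awk '/Number of entries:/ {sum += $NF} END {print sum+0}')
splitbrain=$(gluster volume heal "$VOLUME" info split-brain 2>/dev/null \
          | awk '/Number of entries:/ {sum += $NF} END {print sum+0}')

if [ "$pending" -eq 0 ] && [ "$splitbrain" -eq 0 ]; then
    echo "$VOLUME: OK - no pending heals, no split-brain"
else
    echo "$VOLUME: WARNING - $pending entries to heal, $splitbrain in split-brain"
fi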

On 10/19/2013 10:44 AM, Gianluca Cecchi wrote:
On Wed, Oct 16, 2013 at 1:01 PM, Itamar Heim wrote:
Hope to meet many next week in LinuxCon Europe/KVM Forum/oVirt conference in Edinburgh
Unfortunately not me ;-(
- LINBIT published "High-Availability oVirt-Cluster with iSCSI-Storage"[9]
[9] Note: the link requires registration: http://www.linbit.com/en/company/news/333-high-available-virtualization-at-a... The PDF is: http://www.linbit.com/en/downloads/tech-guides?download=68:high-availability...
About 3 years ago I successfully tested plain Qemu/KVM with two nodes and DRBD dual-primary with live migration on Fedora, so I know it works very well. I think the LINBIT guide gives many suggestions; one of them in particular is just another way to get a highly available oVirt Engine, which has been a hot topic lately. Careful attention has to be paid to the fencing mechanism though, as is always the case when talking about DRBD, Pacemaker and oVirt. I don't know it very well, but the DRBD Proxy feature (closed source and not free) could also be an option for DR in small environments.
I'm going to experiment with the solution from the paper (possibly using the clusterlabs.org repository for the cluster stack, and so corosync instead of heartbeat). By comparison, for small environments where two nodes are enough, in my opinion GlusterFS is far from robust at the moment: with 3.4.1 on Fedora 19 I cannot put a host into maintenance and get a workable node back after a normal reboot.
Are you referring to the absence of self-healing after maintenance? Please see below as to why this should not happen. If there's a reproducible test case, it would be good to track this bug at: https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
I didn't find any "safe" way for a node in a two-node distributed-replicated configuration to leave the cluster and come back without going crazy manually resolving the brick differences generated in the meantime. And this was with only one VM running... Possibly I have to dig more into this, but I didn't find great resources for it. Any tips/links are welcome.
The following commands might help:

1. gluster volume heal <volname> - performs a fast, index-based self-heal.
2. gluster volume heal <volname> full - performs a "full" self-heal if 1. does not work.
3. gluster volume heal <volname> info - provides information about files/directories that were healed recently.

Neither 1. nor 2. should be required in the normal course of operation, as the proactive self-healing feature in glusterfs takes care of healing after a node comes back online.
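For example, applied to the volume that appears later in this thread (gvdata), the sequence would look like:

gluster volume heal gvdata                    # trigger an index-based self-heal
gluster volume heal gvdata full               # only if the index-based heal does not converge
gluster volume heal gvdata info               # entries recently healed / still pending
gluster volume heal gvdata info split-brain   # entries in split-brain, if any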
I would like oVirt to be more aware of this, and to have some capability to resolve by itself the misalignments generated on the Gluster backend during maintenance of a node. At the moment it seems to me it only shows that volumes are OK in the sense of being started, but the bricks could be very different... For example, another tab with details about heal info; something like the output of the commands
gluster volume heal $VOLUME info
and/or
gluster volume heal $VOLUME info split-brain
Yes, we are looking to build this for monitoring replicated gluster volumes. - Vijay
so that if one finds 0 entries, he/she can relax, and at the same time doesn't risk being erroneously calm in the other scenario...
just my opinion.
Gianluca

On Mon, Oct 21, 2013 at 11:33 AM, Vijay Bellur wrote:
The following commands might help:
Thanks for your commands.

In the meantime I noticed that, since I changed the gluster sources to allow live migration by offsetting the brick ports to 50152+ as in http://review.gluster.org/#/c/6075/ (referred to in https://bugzilla.redhat.com/show_bug.cgi?id=1018178), I had missed that my iptables rules only contained:

# Ports for gluster volume bricks (default 100 ports)
-A INPUT -p tcp -m tcp --dport 24009:24108 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49251 -j ACCEPT

So I had some gluster communication problems too. Until libvirt for Fedora 19 gets patched as upstream (a quick test rebuilding libvirt-1.0.5.6-3.fc19.src.rpm with the proposed patches gave many failures, and in the meantime I saw that some other parts need to be patched as well...), I have for the moment updated iptables to these lines:

# Ports for gluster volume bricks (default 100 ports)
-A INPUT -p tcp -m tcp --dport 24009:24108 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49251 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 50152:50251 -j ACCEPT

I'm going to retest completely and see how it behaves. See here for my test case and the beginning of the problem simulation: http://lists.ovirt.org/pipermail/users/2013-October/017228.html

It represents a realistic maintenance scenario that forced a manual alignment on gluster, which is in my opinion not feasible... Eventually I'm going to file a bugzilla if a new test with the correct iptables ports still gives problems.
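As a quick illustration (not part of the original message; the host name, volume name and port number are taken from examples elsewhere in this thread), one way to confirm which brick ports are actually in use after the offset change, and that the firewall rules above cover them:

# Show the TCP port each brick process of the volume is listening on
gluster volume status gvdata

# From the other node, check that a given brick port is reachable
# through the new iptables rules (50152 is just an example port)
timeout 3 bash -c '</dev/tcp/f18ovn03.mydomain/50152' && echo "port reachable" || echo "port blocked"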
I would like oVirt to be more aware of this, and to have some capability to resolve by itself the misalignments generated on the Gluster backend during maintenance of a node. At the moment it seems to me it only shows that volumes are OK in the sense of being started, but the bricks could be very different... For example, another tab with details about heal info; something like the output of the commands
gluster volume heal $VOLUME info
and/or
gluster volume heal $VOLUME info split-brain
Yes, we are looking to build this for monitoring replicated gluster volumes.
Good! Gianluca

On Tue, Oct 22, 2013 at 12:14 PM, Gianluca Cecchi wrote:
On Mon, Oct 21, 2013 at 11:33 AM, Vijay Bellur wrote:
The following commands might help:
Thanks for your commands.

In the meantime I noticed that, since I changed the gluster sources to allow live migration by offsetting the brick ports to 50152+ as in http://review.gluster.org/#/c/6075/ (referred to in https://bugzilla.redhat.com/show_bug.cgi?id=1018178), I had missed that my iptables rules only contained:

# Ports for gluster volume bricks (default 100 ports)
-A INPUT -p tcp -m tcp --dport 24009:24108 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49251 -j ACCEPT

So I had some gluster communication problems too. Until libvirt for Fedora 19 gets patched as upstream (a quick test rebuilding libvirt-1.0.5.6-3.fc19.src.rpm with the proposed patches gave many failures, and in the meantime I saw that some other parts need to be patched as well...), I have for the moment updated iptables to these lines:

# Ports for gluster volume bricks (default 100 ports)
-A INPUT -p tcp -m tcp --dport 24009:24108 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49251 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 50152:50251 -j ACCEPT
Strangely, after fixing the iptables rules, the second node still has problems with the VM image file on which I ran this command inside the guest:

[g.cecchi@c6s ~]$ sudo time dd if=/dev/zero bs=1024k count=3096 of=/testfile
3096+0 records in
3096+0 records out
3246391296 bytes (3.2 GB) copied, 42.3414 s, 76.7 MB/s
0.01user 7.99system 0:42.34elapsed 18%CPU (0avgtext+0avgdata 7360maxresident)k
0inputs+6352984outputs (0major+493minor)pagefaults 0swaps

If I delete the image file that has the delta:

[root@f18ovn03 /]# find /gluster/DATA_GLUSTER/brick1/ -samefile /gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/images/15f9ca1c-c435-4892-9eb7-0c84583b2a7d/a123801a-0a4d-4a47-a426-99d8480d2e49 -print -delete
/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/images/15f9ca1c-c435-4892-9eb7-0c84583b2a7d/a123801a-0a4d-4a47-a426-99d8480d2e49
/gluster/DATA_GLUSTER/brick1/.glusterfs/f4/c1/f4c1b1a4-7328-4d6d-8be8-6b7ff8271d51

then auto-heal happens:

[root@f18ovn03 15f9ca1c-c435-4892-9eb7-0c84583b2a7d]# gluster volume heal gvdata info
Gathering Heal info on volume gvdata has been successful

Brick f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1
Number of entries: 2
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/images/15f9ca1c-c435-4892-9eb7-0c84583b2a7d/a123801a-0a4d-4a47-a426-99d8480d2e49
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids

Brick f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1
Number of entries: 1
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/dom_md/ids

[root@f18ovn03 15f9ca1c-c435-4892-9eb7-0c84583b2a7d]# gluster volume heal gvdata info
Gathering Heal info on volume gvdata has been successful

Brick f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1
Number of entries: 1
/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/images/15f9ca1c-c435-4892-9eb7-0c84583b2a7d/a123801a-0a4d-4a47-a426-99d8480d2e49

Brick f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1
Number of entries: 0

and at the end:

[root@f18ovn03 15f9ca1c-c435-4892-9eb7-0c84583b2a7d]# gluster volume heal gvdata info
Gathering Heal info on volume gvdata has been successful

Brick f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1
Number of entries: 0

Brick f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1
Number of entries: 0

But:

[root@f18ovn03 15f9ca1c-c435-4892-9eb7-0c84583b2a7d]# qemu-img info a123801a-0a4d-4a47-a426-99d8480d2e49
image: a123801a-0a4d-4a47-a426-99d8480d2e49
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: 1.4G

[root@f18ovn01 15f9ca1c-c435-4892-9eb7-0c84583b2a7d]# qemu-img info a123801a-0a4d-4a47-a426-99d8480d2e49
image: a123801a-0a4d-4a47-a426-99d8480d2e49
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: 4.2G

Any problem here?

Gianluca
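One way to check whether the two brick copies really diverge, or only differ in allocation, would be something like the following on each brick host (a suggestion, not from the original message; the path is the brick path shown above). Note that qemu-img's "disk size" reports allocated blocks, so a sparser healed copy can legitimately show a smaller value than the original:

IMG=/gluster/DATA_GLUSTER/brick1/d0b96d4a-62aa-4e9f-b50e-f7a0cb5be291/images/15f9ca1c-c435-4892-9eb7-0c84583b2a7d/a123801a-0a4d-4a47-a426-99d8480d2e49

# Logical size: must match on both bricks
du --apparent-size --block-size=1 "$IMG"
# Allocated size: may differ if one copy is sparser than the other
du --block-size=1 "$IMG"
# Content checksum: must match on both bricks once the heal is complete
md5sum "$IMG"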