Re: Still having NFS issues. (Permissions)
by Nir Soffer
On Tue, Dec 10, 2019 at 4:35 PM Robert Webb <rwebb(a)ropeguru.com> wrote:
...
> >https://ovirt.org/develop/troubleshooting-nfs-storage-issues.html
> >
> >Generally speaking:
> >
> >Files there are created by vdsm (vdsmd), but are used (when running VMs)
> >by qemu. So both of them need access.
>
> So the link to the NFS storage troubleshooting page is where I found that the perms needed to be 755.
I think this is an error in the troubleshooting page. There is no
reason to allow access to users other than vdsm:kvm.
...
> Like this:
>
> drwxr-xr-x+ 2 vdsm kvm 4096 Dec 10 09:03 .
> drwxr-xr-x+ 3 vdsm kvm 4096 Dec 10 09:02 ..
> -rw-rw---- 1 vdsm kvm 53687091200 Dec 10 09:02 5a514067-82fb-42f9-b436-f8f93883fe27
> -rw-rw---- 1 vdsm kvm 1048576 Dec 10 09:03 5a514067-82fb-42f9-b436-f8f93883fe27.lease
> -rw-r--r-- 1 vdsm kvm 298 Dec 10 09:03 5a514067-82fb-42f9-b436-f8f93883fe27.meta
>
>
> So, with all that said, I cleaned everything up and my directory permissions look like what Tony posted for his. I have added in his export options to my setup and rebooted my host.
>
> I created a new VM from scratch and the files under images now look like this:
>
> drwxr-xr-x+ 2 vdsm kvm 4096 Dec 10 09:03 .
> drwxr-xr-x+ 3 vdsm kvm 4096 Dec 10 09:02 ..
> -rw-rw---- 1 vdsm kvm 53687091200 Dec 10 09:02 5a514067-82fb-42f9-b436-f8f93883fe27
> -rw-rw---- 1 vdsm kvm 1048576 Dec 10 09:03 5a514067-82fb-42f9-b436-f8f93883fe27.lease
> -rw-r--r-- 1 vdsm kvm 298 Dec 10 09:03 5a514067-82fb-42f9-b436-f8f93883fe27.meta
>
>
> Still not the 755 as expected,
It is not expected; the permissions look normal.
These are the permissions used for volumes on file-based storage:
lib/vdsm/storage/constants.py:FILE_VOLUME_PERMISSIONS = 0o660
> but I am guessing with the addition of the "anonuid=36,anongid=36" to
> the exports, everything is now working as expected. The VM will boot
> and run as expected. There was nothing in any of the documentation
> which alluded to possibly needing the additional options in the NFS
> export options.
I think this is a libvirt issue: it tries to access volumes as root, and
without anonuid=36,anongid=36 it will be squashed to nobody and fail.
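For illustration, an export entry along these lines avoids the squashing problem (the export path and client spec are placeholders, adjust them to your environment):
/exports/ovirt-data    *(rw,sync,no_subtree_check,anonuid=36,anongid=36)
Re-export afterwards with "exportfs -ra" on the NFS server.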
Nir
Hosted Engine Failover Timing
by Robert Webb
So in doing some testing, I pulled the plug on my node where the hosted engine was running. Rough timing was about 3.5 minutes before the portal was available again.
I searched around first, but could not find whether there is any way to speed up the detection time in order to reboot the hosted engine quicker.
Right now I am only testing this and will add in VMs later, which I understand should reboot a lot quicker.
Did a change in Ansible 2.9 in the ovirt_vm_facts module break the hosted-engine-setup?
by thomas@hoberg.net
I am having problems installing a 3-node HCI cluster on machines that used to work fine.... and on a fresh set of servers, too.
After a series of setbacks on a set of machines with failed installations and potentially failed clean-ups, I am using a fresh set of servers that had never run oVirt before.
Patched to just before today's bigger changes (new kernel..), installation failed during the setup of the local hosted engine first, and when I switched from the GUI to the script setup 'hosted-engine --deploy' *without* doing a cleanup this time, it progressed further, to the point where the local VM had actually been teleported onto the (gluster-based) cluster and is running there.
In what seems to be the very last action before adding the other two hosts, Ansible does a final inventory of the virtual machine and collects facts, or rather information (that's perhaps the breaking point), about that first VM before I would continue, only the data structure got renamed between Ansible 2.8 and 2.9 according to this:
https://fossies.org/diffs/ansible/2.8.5_vs_2.9.0rc1/lib/ansible/modules/c...
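To illustrate what that rename means in a playbook (a rough sketch with placeholder values, not the actual task from the ovirt-hosted-engine-setup role): up to Ansible 2.8 a task like

- ovirt_vm_facts:
    auth: "{{ ovirt_auth }}"
    pattern: name=HostedEngine*

would auto-populate the ovirt_vms fact, while on 2.9 the equivalent would be

- ovirt_vm_info:
    auth: "{{ ovirt_auth }}"
    pattern: name=HostedEngine*
  register: vm_info

with the data then read from vm_info.ovirt_vms instead of the fact.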
And the resulting error message from the /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-*.log file is:
2019-12-04 13:15:19,232+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:107 fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_vms": [{"affinity_labels": [], "applications": [], "bios": {"boot_menu": {"enabled": false}, "type": "i440fx_sea_bios"}, "cdroms": [], "cluster": {"href": "/ovirt-engine/api/clusters/6616551e-1695-11ea-a86b-00163e34e004", "id": "6616551e-1695-11ea-a86b-00163e34e004"}, "comment": "", "cpu": {"architecture": "x86_64", "topology": {"cores": 1, "sockets": 4, "threads": 1}}, "cpu_profile": {"href": "/ovirt-engine/api/cpuprofiles/58ca604e-01a7-003f-01de-000000000250", "id": "58ca604e-01a7-003f-01de-000000000250"}, "cpu_shares": 0, "creation_time": "2019-12-04 13:01:12.780000+00:00", "delete_protected": false, "description": "", "disk_attachments": [], "display": {"address": "127.0.0.1", "allow_override": false, "certificate": {"content": "-----BEGIN CERTIFICATE-----(redacted)-----END CERTIFICATE-----\n", "organization":
"***", "subject": "**"}, "copy_paste_enabled": true, "disconnect_action": "LOCK_SCREEN", "file_transfer_enabled": true, "monitors": 1, "port": 5900, "single_qxl_pci": false, "smartcard_enabled": false, "type": "vnc"}, "fqdn": "xdrd1001s.priv.atos.fr", "graphics_consoles": [], "guest_operating_system": {"architecture": "x86_64", "codename": "", "distribution": "CentOS Linux", "family": "Linux", "kernel": {"version": {"build": 0, "full_version": "3.10.0-1062.4.3.el7.x86_64", "major": 3, "minor": 10, "revision": 1062}}, "version": {"full_version": "7", "major": 7}}, "guest_time_zone": {"name": "GMT", "utc_offset": "+00:00"}, "high_availability": {"enabled": false, "priority": 0}, "host": {"href": "/ovirt-engine/api/hosts/75d096fd-4a2f-4ba4-b9fb-941f86daf624", "id": "75d096fd-4a2f-4ba4-b9fb-941f86daf624"}, "host_devices": [], "href": "/ovirt-engine/api/vms/dee6ec3b-5b4a-4063-ade9-12dece0f5fab", "id": "dee6ec3b-5b4a-4063-ade9-12dece0f5fab", "io": {"threads": 1}, "katello_errata": [], "la
rge_icon": {"href": "/ovirt-engine/api/icons/9588ebfc-865a-4969-9829-d170d3654900", "id": "9588ebfc-865a-4969-9829-d170d3654900"}, "memory": 17179869184, "memory_policy": {"guaranteed": 17179869184, "max": 17179869184}, "migration": {"auto_converge": "inherit", "compressed": "inherit"}, "migration_downtime": -1, "multi_queues_enabled": true, "name": "external-HostedEngineLocal", "next_run_configuration_exists": false, "nics": [], "numa_nodes": [], "numa_tune_mode": "interleave", "origin": "external", "original_template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "os": {"boot": {"devices": ["hd"]}, "type": "other"}, "permissions": [], "placement_policy": {"affinity": "migratable"}, "quota": {"id": "7af18f3a-1695-11ea-ab7e-00163e34e004"}, "reported_devices": [], "run_once": false, "sessions": [], "small_icon": {"href": "/ovirt-engine/api/icons/dec3572e-7465-4527-884b-f7c2eb2ed811", "id": "dec3572e-7465-4
527-884b-f7c2eb2ed811"}, "snapshots": [], "sso": {"methods": [{"id": "guest_agent"}]}, "start_paused": false, "stateless": false, "statistics": [], "status": "unknown", "storage_error_resume_behaviour": "auto_resume", "tags": [], "template": {"href": "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id": "00000000-0000-0000-0000-000000000000"}, "time_zone": {"name": "Etc/GMT"}, "type": "desktop", "usb": {"enabled": false}, "watchdogs": []}]}, "attempts": 24, "changed": false, "deprecations": [{"msg": "The 'ovirt_vm_facts' module has been renamed to 'ovirt_vm_info', and the renamed one no longer returns ansible_facts", "version": "2.13"}]}
If that is the case, wouldn't that imply that there is no test case that covers the most typical HCI deployment when Ansible is updated?
That would be truly frightening...
Re: ovirt 4.3.7 geo-replication not working
by Strahil
Hi Adrian,
Have you checked the following link:
https://access.redhat.com/documentation/en-us/red_hat_hyperconverged_infr...
Best Regards,
Strahil Nikolov
On Dec 12, 2019 12:35, adrianquintero(a)gmail.com wrote:
>
> Hi Sahina/Strahil,
> We followed the recommended setup from the Gluster documentation, however one of my colleagues noticed a Python entry in the logs; it turns out it was a missing symlink to a library
>
> We created the following symlink to all the master servers (cluster 1 oVirt 1) and slave servers (Cluster 2, oVirt2) and geo-sync started working:
> /lib64/libgfchangelog.so -> /lib64/libgfchangelog.so.0
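> (created on every node with something like: ln -s /lib64/libgfchangelog.so.0 /lib64/libgfchangelog.so)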
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> host1.mydomain1.com geo-master /gluster_bricks/geo-master/geo-master root slave1.mydomain2.com::geo-slave slave1.mydomain2.com Active Changelog Crawl 2019-12-12 05:22:56
> host2.mydomain1.com geo-master /gluster_bricks/geo-master/geo-master root slave1.mydomain2.com::geo-slave slave2.mydomain2.com Passive N/A N/A
> host3.mydomain1.com geo-master /gluster_bricks/geo-master/geo-master root slave1.mydomain2.com::geo-slave slave3.mydomain2.com Passive N/A N/A
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> we still require a bit more testing but at least it is syncing now.
>
> I am trying to find good documentation on how to achieve geo-replication for oVirt, is that something you can point me to? Basically I am looking for a way to do geo-replication from site A to site B, but the Geo-Replication pop-up window in oVirt does not seem to have the functionality to connect to a slave server from another oVirt setup....
>
>
>
> As a side note, from the oVirt WEB UI the "cancel button" for the "New Geo-Replication" does not seem to work: storage > volumes > "select your volume" > "click 'Geo-Replication'
>
> Any good documentation you can point me to is welcome.
>
> thank you for the swift assistance.
>
> Regards,
>
> Adrian
Re: HCL: 4.3.7: Hosted engine fails
by Ralf Schenk
Hello,
Hosted engine deployment simply fails on EPYC with 4.3.7.
See my earlier posts "HostedEngine Deployment fails on AMD EPYC 7402P 4.3.7".
I was able to get this up and running by modifying the HostedEngine VM XML
while the installation tries to start the engine. Great fun!
virsh -r dumpxml HostedEngine shows:
<cpu mode='custom' match='exact' check='partial'>
<model fallback='allow'>EPYC</model>
<topology sockets='16' cores='4' threads='1'/>
<feature policy='require' name='ibpb'/>
<feature policy='require' name='virt-ssbd'/>
which is not working. I removed the line with virt-ssbd via virsh edit
and then it was able to start the engine. To modify the defined HostedEngine
you need to create an account for libvirt via saslpasswd2 so that you can run
"virsh edit HostedEngine":
saslpasswd2 -c -f /etc/libvirt/passwd.db admin@ovirt
so your Passdb shows:
sasldblistusers2 /etc/libvirt/passwd.db
vdsm@ovirt: userPassword
admin@ovirt: userPassword
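For reference, after dropping the offending feature line, the <cpu> section looks roughly like this (a sketch only; the excerpt above was truncated and the topology values will differ on other hosts):
<cpu mode='custom' match='exact' check='partial'>
  <model fallback='allow'>EPYC</model>
  <topology sockets='16' cores='4' threads='1'/>
  <feature policy='require' name='ibpb'/>
</cpu>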
If the HostedEngine runs afterwards, let it run so it updates the
persistent configuration in hosted-storage. I also modified these by
untarring and manipulating the corresponding XML strings!
Really ugly and dysfunctional.
Bye
On 11.12.2019 at 14:33, Robert Webb wrote:
> I could not find if that CPU supported SSBD or not.
>
> Log into one of your nodes via the console, run "cat /proc/cpuinfo", and check the "flags" section to see if SSBD is listed. If not, then look at your cluster config under the "General" section and see what it has for "Cluster CPU Type". Make sure it hasn't chosen a CPU type which it thinks has SSBD available.
>
> I have a Xeon X5670 and it does support SSBD and there is a specific CPU type selected named, "Intel Westmere IBRS SSBD Family".
>
> Hope this helps.
>
> ________________________________________
> From: Christian Reiss <email(a)christian-reiss.de>
> Sent: Wednesday, December 11, 2019 7:55 AM
> To: users(a)ovirt.org
> Subject: [ovirt-users] HCL: 4.3.7: Hosted engine fails
>
> Hey all,
>
> Using a homogeneous ovirt-node-ng-4.3.7-0.20191121.0 freshly created
> cluster using node installer I am unable to deploy the hosted engine.
> Everything else worked.
>
> In vdsm.log is a line, just after attempting to start the engine:
>
> libvirtError: the CPU is incompatible with host CPU: Host CPU does not
> provide required features: virt-ssbd
>
> I am using AMD EPYC 7282 16-Core Processors.
>
> I have attached
>
> - vdsm.log (during and failing the start)
> - messages (for bootup / libvirt messages)
> - dmesg (grub / boot config)
> - deploy.log (browser output during deployment)
> - virt-capabilites (virsh -r capabilities)
>
> I can't think -or don't know- off any other log files of interest here,
> but I am more than happy to oblige.
>
> notectl check tells me
>
> Status: OK
> Bootloader ... OK
> Layer boot entries ... OK
> Valid boot entries ... OK
> Mount points ... OK
> Separate /var ... OK
> Discard is used ... OK
> Basic storage ... OK
> Initialized VG ... OK
> Initialized Thin Pool ... OK
> Initialized LVs ... OK
> Thin storage ... OK
> Checking available space in thinpool ... OK
> Checking thinpool auto-extend ... OK
> vdsmd ... OK
>
> layers:
> ovirt-node-ng-4.3.7-0.20191121.0:
> ovirt-node-ng-4.3.7-0.20191121.0+1
> bootloader:
> default: ovirt-node-ng-4.3.7-0.20191121.0 (3.10.0-1062.4.3.el7.x86_64)
> entries:
> ovirt-node-ng-4.3.7-0.20191121.0 (3.10.0-1062.4.3.el7.x86_64):
> index: 0
> title: ovirt-node-ng-4.3.7-0.20191121.0 (3.10.0-1062.4.3.el7.x86_64)
> kernel:
> /boot/ovirt-node-ng-4.3.7-0.20191121.0+1/vmlinuz-3.10.0-1062.4.3.el7.x86_64
> args: "ro crashkernel=auto rd.lvm.lv=onn_node01/swap
> rd.lvm.lv=onn_node01/ovirt-node-ng-4.3.7-0.20191121.0+1 rhgb quiet
> LANG=en_GB.UTF-8 img.bootid=ovirt-node-ng-4.3.7-0.20191121.0+1"
> initrd:
> /boot/ovirt-node-ng-4.3.7-0.20191121.0+1/initramfs-3.10.0-1062.4.3.el7.x86_64.img
> root: /dev/onn_node01/ovirt-node-ng-4.3.7-0.20191121.0+1
> current_layer: ovirt-node-ng-4.3.7-0.20191121.0+1
>
>
> The odd thing is the hosted engine VM does get started during initial
> configuration and works. Just when the Ansible stuff is done and it's
> moved over to HA storage, the CPU quirks start.
>
> So far I learned that ssbd is a mitigation protection, but the flag is
> not in my CPU. Well, ssbd is; virt-ssbd is not.
>
> I am *starting* with ovirt. I would really, really welcome it if
> recommendations would include clues on how to make it happen.
> I do rtfm, but I was unable to find anything (or any solution) anywhere.
> Not after 80 hours of working on this.
>
> Thank you all.
> -Chris.
>
> --
> Christian Reiss - email(a)christian-reiss.de /"\ ASCII Ribbon
> support(a)alpha-labs.net \ / Campaign
> X against HTML
> WEB alpha-labs.net / \ in eMails
>
> GPG Retrieval https://gpg.christian-reiss.de
> GPG ID ABCD43C5, 0x44E29126ABCD43C5
> GPG fingerprint = 9549 F537 2596 86BA 733C A4ED 44E2 9126 ABCD 43C5
>
> "It's better to reign in hell than to serve in heaven.",
> John Milton, Paradise lost.
--
*Ralf Schenk*
fon +49 (0) 24 05 / 40 83 70
fax +49 (0) 24 05 / 40 83 759
mail *rs(a)databay.de* <mailto:rs@databay.de>
*Databay AG*
Jens-Otto-Krag-Straße 11
D-52146 Würselen
*www.databay.de* <http://www.databay.de>
Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
Philipp Hermanns
Aufsichtsratsvorsitzender: Wilhelm Dohmen
------------------------------------------------------------------------
Re: ovirt 4.3.7 geo-replication not working
by Strahil
Hi Adrian,
Have you checked the passwordless rsync between master and slave volume nodes ?
Best Regards,
Strahil Nikolov
On Dec 11, 2019 22:36, adrianquintero(a)gmail.com wrote:
>
> Hi,
> I am trying to set up geo-replication between 2 sites, but I keep getting:
> [root@host1 ~]# gluster vol geo-rep geo-master slave1.mydomain2.com::geo-slave status
>
> MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> host1.mydomain1.com geo-master /gluster_bricks/geo-master/geo-master root slave1.mydomain2.com::geo-slave N/A Faulty N/A N/A
> host2.mydomain2.com geo-master /gluster_bricks/geo-master/geo-master root slave1.mydomain2.com::geo-slave N/A Faulty N/A N/A
> vmm11.virt.iad3p geo-master /gluster_bricks/geo-master/geo-master root slave1.mydomain2.com::geo-slave N/A Faulty N/A N/A
>
>
> oVirt GUI has an icon in the volume that says "volume data is being geo-replicated" but we know that is not the case
> From the logs i can see:
> [2019-12-11 19:57:48.441557] I [fuse-bridge.c:6810:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-5WaCmt'.
> [2019-12-11 19:57:48.441578] I [fuse-bridge.c:6815:fini] 0-fuse: Closing fuse connection to '/tmp/gsyncd-aux-mount-5WaCmt'
>
> and
> [2019-12-11 19:45:14.785758] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase brick=/gluster_bricks/geo-master/geo-master
>
> thoughts?
>
> thanks,
>
> Adrian
Mixing compute and storage without (initial) HCI
by thomas@hoberg.net
Some documentation, especially on older RHEV versions, seems to indicate that Gluster storage roles and compute server roles in an oVirt cluster are actually exclusive.
Yet HCI is all about doing both, which is slightly confusing when you try to overcome HCI issues simply by running the management engine in a "side car", in my case simply a KVM VM running on a host that's not part of the HCI (attempted) cluster.
So the HCI wizard failed to deploy the management VM, but left me with a working Gluster.
I then went ahead and used a fresh CentOS 7.7 VM to install the hosted engine using the script variant without an appliance.
That went ... not too bad initially, it even found the Gluster bricks eventually, installed the three hosts, looked almost ready to go, but something must be wrong with the "ovirtmgmt" network.
It looks alright in the network configuration, where it has all roles, but when I try to create a VM, I cannot select it as the standard network for the VM.
Any new network I might try to define, won't work either.
So apart from these detail problems: is the prohibition against mixing both roles, Gluster and (compute) cluster, still in place?
Because there are also documentation remnants which seem to indicate that a management engine for an existing cluster can actually be converted into a VM managed by the hosted engine e.g. as part of a recovery, repeating later what the HCI setup attempts to automate.
Gluster mount still fails on Engine deployment - any suggestions...
by rob.downer@orbitalsystems.co.uk
Hi,
Engine deployment fails here...
[ INFO ] TASK [ovirt.hosted_engine_setup : Add glusterfs storage domain]
[ ERROR ] Error: Fault reason is "Operation Failed". Fault detail is "[Unexpected exception]". HTTP response code is 400.
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Unexpected exception]\". HTTP response code is 400."}
However Gluster looks good...
I have reinstalled all nodes from scratch.
[root@ovirt3 ~]# gluster volume status
Status of volume: data
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gfs3.gluster.private:/gluster_bricks/
data/data 49152 0 Y 3756
Brick gfs2.gluster.private:/gluster_bricks/
data/data 49153 0 Y 3181
Brick gfs1.gluster.private:/gluster_bricks/
data/data 49152 0 Y 15548
Self-heal Daemon on localhost N/A N/A Y 17602
Self-heal Daemon on gfs1.gluster.private N/A N/A Y 15706
Self-heal Daemon on gfs2.gluster.private N/A N/A Y 3348
Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: engine
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gfs3.gluster.private:/gluster_bricks/
engine/engine 49153 0 Y 3769
Brick gfs2.gluster.private:/gluster_bricks/
engine/engine 49154 0 Y 3194
Brick gfs1.gluster.private:/gluster_bricks/
engine/engine 49153 0 Y 15559
Self-heal Daemon on localhost N/A N/A Y 17602
Self-heal Daemon on gfs1.gluster.private N/A N/A Y 15706
Self-heal Daemon on gfs2.gluster.private N/A N/A Y 3348
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: vmstore
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gfs3.gluster.private:/gluster_bricks/
vmstore/vmstore 49154 0 Y 3786
Brick gfs2.gluster.private:/gluster_bricks/
vmstore/vmstore 49152 0 Y 2901
Brick gfs1.gluster.private:/gluster_bricks/
vmstore/vmstore 49154 0 Y 15568
Self-heal Daemon on localhost N/A N/A Y 17602
Self-heal Daemon on gfs1.gluster.private N/A N/A Y 15706
Self-heal Daemon on gfs2.gluster.private N/A N/A Y 3348
Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks
Re: OVN communications between hosts
by Strahil
Hi Pavel,
Would you mind sharing the list of OVN devices you have?
Currently in the UI I don't have any network (except ovirtmgmt) and I see multiple devices.
My guess is that I should remove all but the br-int, but I would like not to kill my cluster communication :)
Best Regards,
Strahil Nikolov
On Dec 11, 2019 10:25, Pavel Nakonechnyi <pavel(a)gremwell.com> wrote:
>
> Hi Strahil,
>
> On Tuesday, 10 December 2019 20:50:27 CET Strahil wrote:
> > Hi Pavel,
> >
> > Can you explain how did you find the issue.
> >
> > I'm new in OVN and I experience the same symptoms .
> >
> > I'm not sure what will be the best approach to start cleanly with OVN.
> >
>
> I am new to OVN / OpenVSwitch too. :)
>
> What helped me to begin with is capturing traffic on the "ovirtmgmt" interface using tcpdump and looking into it for packets related to data exchange between VMs over the virtual network. I found communications over UDP:6081, which were fine by themselves. However, this helped to craft more targeted search requests, which led me to this discussion: https://lists.ovirt.org/archives/list/users@ovirt.org/thread/PVJJPK7MQOFL...
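> (e.g. something along the lines of: tcpdump -ni ovirtmgmt udp port 6081)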
>
> Some suggestions there were helpful to at least become slightly more familiar with how the things work.
>
> Anyway, lack of documentation on this matter (I know that there are high-level old docs) does not help to resolve such cases.
>
>
>
Ovirt OVN help needed
by Strahil Nikolov
Hi Community,
can someone hint me how to get rid of some ports? I just want to 'reset' my ovn setup.
Here is what I have so far:
[root@ovirt1 openvswitch]# ovs-vsctl list interface
_uuid : be89c214-10e4-4a97-a9eb-1b82bc433a24
admin_state : up
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : []
error : []
external_ids : {}
ifindex : 35
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : []
link_state : up
lldp : {}
mac : []
mac_in_use : "7a:7d:1d:a7:43:1d"
mtu : []
mtu_request : []
name : "ovn-25cc77-0"
ofport : 6
ofport_request : []
options : {csum="true", key=flow, remote_ip="192.168.1.64"}
other_config : {}
statistics : {rx_bytes=0, rx_packets=0, tx_bytes=0, tx_packets=0}
status : {tunnel_egress_iface=ovirtmgmt, tunnel_egress_iface_carrier=up}
type : geneve
_uuid : ec6a6688-e5d6-4346-ac47-ece1b8379440
admin_state : down
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : []
error : []
external_ids : {}
ifindex : 13
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : []
link_state : down
lldp : {}
mac : []
mac_in_use : "66:36:dd:63:dc:48"
mtu : 1500
mtu_request : []
name : br-int
ofport : 65534
ofport_request : []
options : {}
other_config : {}
statistics : {collisions=0, rx_bytes=0, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=0, tx_bytes=0, tx_dropped=0, tx_errors=0, tx_packets=0}
status : {driver_name=openvswitch}
type : internal
_uuid : 1e511b4d-f7c2-499f-bd8c-07236e7bb7af
admin_state : up
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : []
error : []
external_ids : {}
ifindex : 35
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : []
link_state : up
lldp : {}
mac : []
mac_in_use : "1a:85:d1:d9:e2:a5"
mtu : []
mtu_request : []
name : "ovn-566849-0"
ofport : 5
ofport_request : []
options : {csum="true", key=flow, remote_ip="192.168.1.41"}
other_config : {}
statistics : {rx_bytes=0, rx_packets=0, tx_bytes=0, tx_packets=0}
status : {tunnel_egress_iface=ovirtmgmt, tunnel_egress_iface_carrier=up}
type : geneve
When I try to remove a port - it never ends (just hanging):
[root@ovirt1 openvswitch]# ovs-vsctl --dry-run del-port br-int ovn-25cc77-0
In the journal I see only this:
Dec 12 04:13:57 ovirt1.localdomain ovs-vsctl[22030]: ovs|00001|vsctl|INFO|Called as ovs-vsctl --dry-run del-port br-int ovn-25cc77-0
What's even stranger to me is the log output:
[root@ovirt1 openvswitch]# grep ovn-25cc77-0 /var/log/openvswitch/*.log
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:26:28.642Z|00032|bridge|INFO|bridge br-int: added interface ovn-25cc77-0 on port 14
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:45:15.646Z|00113|bridge|INFO|bridge br-int: deleted interface ovn-25cc77-0 on port 14
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:45:15.861Z|00116|bridge|INFO|bridge br-int: added interface ovn-25cc77-0 on port 2
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:50:36.678Z|00118|bridge|INFO|bridge br-int: deleted interface ovn-25cc77-0 on port 2
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:52:31.180Z|00121|bridge|INFO|bridge br-int: added interface ovn-25cc77-0 on port 3
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:55:09.734Z|00125|bridge|INFO|bridge br-int: deleted interface ovn-25cc77-0 on port 3
/var/log/openvswitch/ovs-vswitchd.log:2019-12-12T01:58:15.138Z|00127|bridge|INFO|bridge br-int: added interface ovn-25cc77-0 on port 6
I'm also attaching the verbose output of the dry run.
Thanks in advance.
Best Regards,
Strahil Nikolov