Re: Parent checkpoint ID does not match the actual leaf checkpoint
by Nir Soffer
On Sun, Jul 19, 2020 at 5:38 PM Łukasz Kołaciński <l.kolacinski(a)storware.eu>
wrote:
> Hello,
> Thanks to previous answers, I was able to make backups. Unfortunately, we
> had some infrastructure issues and after the host reboots new problems
> appeared. I am not able to do any backup using the commands that worked
> yesterday. I looked through the logs and there is something like this:
>
> 2020-07-17 15:06:30,644+02 ERROR
> [org.ovirt.engine.core.bll.StartVmBackupCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-54)
> [944a1447-4ea5-4a1c-b971-0bc612b6e45e] Failed to execute VM backup
> operation 'StartVmBackup': {}:
> org.ovirt.engine.core.common.errors.EngineException: EngineException:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> VDSGenericException: VDSErrorException: Failed to StartVmBackupVDS, error =
> Checkpoint Error: {'parent_checkpoint_id': None, 'leaf_checkpoint_id':
> 'cd078706-84c0-4370-a6ec-654ccd6a21aa', 'vm_id':
> '116aa6eb-31a1-43db-9b1e-ad6e32fb9260', 'reason': '*Parent checkpoint ID
> does not match the actual leaf checkpoint*'}, code = 1610 (Failed with
> error unexpected and code 16)
>
>
It looks like engine sent:
parent_checkpoint_id: None
This issue was fix in engine few weeks ago.
Which engine and vdsm versions are you testing?
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:114)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2114)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.StartVmBackupCommand.performVmBackupOperation(StartVmBackupCommand.java:368)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.StartVmBackupCommand.runVmBackup(StartVmBackupCommand.java:225)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.StartVmBackupCommand.performNextOperation(StartVmBackupCommand.java:199)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
> at
> deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at
> java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at
> org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
> at
> org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> at
> org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
>
>
> And the last error is:
>
> 2020-07-17 15:13:45,835+02 ERROR
> [org.ovirt.engine.core.bll.StartVmBackupCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-14)
> [f553c1f2-1c99-4118-9365-ba6b862da936] Failed to execute VM backup
> operation 'GetVmBackupInfo': {}:
> org.ovirt.engine.core.common.errors.EngineException: EngineException:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException:
> VDSGenericException: VDSErrorException: Failed to GetVmBackupInfoVDS, error
> = No such backup Error: {'vm_id': '116aa6eb-31a1-43db-9b1e-ad6e32fb9260',
> 'backup_id': 'bf1c26f7-c3e5-437c-bb5a-255b8c1b3b73', 'reason': '*VM
> backup not exists: Domain backup job id not found: no domain backup job
> present'*}, code = 1601 (Failed with error unexpected and code 16)
>
>
This is likely a result of the first error. If starting backup failed the
backup entity
is deleted.
> (these errors are from full backup)
>
> Like I said this is very strange because everything was working correctly.
>
>
> Regards
>
> Łukasz Kołaciński
>
> Junior Java Developer
>
> e-mail: l.kolacinski(a)storware.eu
> <m.helbert(a)storware.eu>
>
>
>
>
> *[image: STORWARE]* <http://www.storware.eu/>
>
>
>
> *ul. Leszno 8/44 01-192 Warszawa www.storware.eu
> <https://www.storware.eu/>*
>
> *[image: facebook]* <https://www.facebook.com/storware>
>
> *[image: twitter]* <https://twitter.com/storware>
>
> *[image: linkedin]* <https://www.linkedin.com/company/storware>
>
> *[image: Storware_Stopka_09]*
> <https://www.youtube.com/channel/UCKvLitYPyAplBctXibFWrkw>
>
>
>
> *Storware Spółka z o.o. nr wpisu do ewidencji KRS dla M.St. Warszawa
> 000510131* *, NIP 5213672602.** Wiadomość ta jest przeznaczona jedynie
> dla osoby lub podmiotu, który jest jej adresatem i może zawierać poufne
> i/lub uprzywilejowane informacje. Zakazane jest jakiekolwiek przeglądanie,
> przesyłanie, rozpowszechnianie lub inne wykorzystanie tych informacji lub
> podjęcie jakichkolwiek działań odnośnie tych informacji przez osoby lub
> podmioty inne niż zamierzony adresat. Jeżeli Państwo otrzymali przez
> pomyłkę tę informację prosimy o poinformowanie o tym nadawcy i usunięcie
> tej wiadomości z wszelkich komputerów. **This message is intended only
> for the person or entity to which it is addressed and may contain
> confidential and/or privileged material. Any review, retransmission,
> dissemination or other use of, or taking of any action in reliance upon,
> this information by persons or entities other than the intended recipient
> is prohibited. If you have received this message in error, please contact
> the sender and remove the material from all of your computer systems.*
>
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/S3PLYPOZGT6...
>
3 years, 4 months
poweroff and reboot with ovirt_vm ansible module
by Nathanaël Blanchet
Hello, is there a way to poweroff or reboot (without stopped and running
state) a vm with the ovirt_vm ansible module?
--
Nathanaël Blanchet
Supervision réseau
Pôle Infrastrutures Informatiques
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet(a)abes.fr
3 years, 4 months
supervdsm failing during network_caps
by Alan G
Hi,
I have issues with one host where supervdsm is failing in network_caps.
I see the following trace in the log.
MainProcess|jsonrpc/1::ERROR::2020-01-06 03:01:05,558::supervdsm_server::100::SuperVdsm.ServerCallback::(wrapper) Error in network_caps
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 98, in wrapper
res = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/network/api.py", line 56, in network_caps
return netswitch.configurator.netcaps(compatibility=30600)
File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 317, in netcaps
net_caps = netinfo(compatibility=compatibility)
File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch/configurator.py", line 325, in netinfo
_netinfo = netinfo_get(vdsmnets, compatibility)
File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py", line 150, in get
return _stringify_mtus(_get(vdsmnets))
File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/cache.py", line 59, in _get
ipaddrs = getIpAddrs()
File "/usr/lib/python2.7/site-packages/vdsm/network/netinfo/addresses.py", line 72, in getIpAddrs
for addr in nl_addr.iter_addrs():
File "/usr/lib/python2.7/site-packages/vdsm/network/netlink/addr.py", line 33, in iter_addrs
with _nl_addr_cache(sock) as addr_cache:
File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/lib/python2.7/site-packages/vdsm/network/netlink/__init__.py", line 92, in _cache_manager
cache = cache_allocator(sock)
File "/usr/lib/python2.7/site-packages/vdsm/network/netlink/libnl.py", line 469, in rtnl_addr_alloc_cache
raise IOError(-err, nl_geterror(err))
IOError: [Errno 16] Message sequence number mismatch
A restart of supervdsm will resolve the issue for a period, maybe 24 hours, then it will occur again. So I'm thinking it's resource exhaustion or a leak of some kind?
Running 4.2.8.2 with VDSM at 4.20.46.
I've had a look through the bugzilla and can't find an exact match, closest was this one https://bugzilla.redhat.com/show_bug.cgi?id=1666123 which seems to be a RHV only fix.
Thanks,
Alan
3 years, 6 months
OVN and change of mgmt network
by Gianluca Cecchi
Hello,
I previously had OVN running on engine (as OVN provider with northd and
northbound and southbound DBs) and hosts (with OVN controller).
After changing mgmt ip of hosts (engine has retained instead the same ip),
I executed again on them the command:
vdsm-tool ovn-config <ip_of_engine> <nel_local_ip_of_host>
Now I think I have to clean up some things, eg:
1) On engine
where I get these lines below
systemctl status ovn-northd.service -l
. . .
Sep 29 14:41:42 ovmgr1 ovsdb-server[940]: ovs|00005|reconnect|ERR|tcp:
10.4.167.40:37272: no response to inactivity probe after 5 seconds,
disconnecting
Oct 03 11:52:00 ovmgr1 ovsdb-server[940]: ovs|00006|reconnect|ERR|tcp:
10.4.167.41:52078: no response to inactivity probe after 5 seconds,
disconnecting
The two IPs are the old ones of two hosts
It seems that a restart of the services has fixed...
Can anyone confirm if I have to do anything else?
2) On hosts (there are 3 hosts with OVN on ip 10.4.192.32/33/34)
where I currently have this output
[root@ov301 ~]# ovs-vsctl show
3a38c5bb-0abf-493d-a2e6-345af8aedfe3
Bridge br-int
fail_mode: secure
Port "ovn-1dce5b-0"
Interface "ovn-1dce5b-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.32"}
Port "ovn-ddecf0-0"
Interface "ovn-ddecf0-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.33"}
Port "ovn-fd413b-0"
Interface "ovn-fd413b-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.168.74"}
Port br-int
Interface br-int
type: internal
ovs_version: "2.7.2"
[root@ov301 ~]#
The IPs of kind 10.4.192.x are ok.
But there is a left-over of an old host I initially used for tests,
corresponding to 10.4.168.74, that now doesn't exist anymore
How can I clean records for 1) and 2)?
Thanks,
Gianluca
3 years, 8 months
CentOS Stream support
by Michal Skrivanek
Hi all,
we would like to ask about interest in community about oVirt moving to CentOS Stream.
There were some requests before but it’s hard to see how many people would really like to see that.
With CentOS releases lagging behind RHEL for months it’s interesting to consider moving to CentOS Stream as it is much more up to date and allows us to fix bugs faster, with less workarounds and overhead for maintaining old code. E.g. our current integration tests do not really pass on CentOS 8.1 and we can’t really do much about that other than wait for more up to date packages. It would also bring us closer to make oVirt run smoothly on RHEL as that is also much closer to Stream than it is to outdated CentOS.
So..would you like us to support CentOS Stream?
We don’t really have capacity to run 3 different platforms, would you still want oVirt to support CentOS Stream if it means “less support” for regular CentOS?
There are some concerns about Stream being a bit less stable, do you share those concerns?
Thank you for your comments,
michal
3 years, 8 months
encrypted GENEVE traffic
by Pavel Nakonechnyi
Dear oVirt Community,
From my understanding oVirt does not support Open vSwitch IPSEC tunneling for GENEVE traffic (which is described on pages http://docs.openvswitch.org/en/latest/howto/ipsec/ and http://docs.openvswitch.org/en/latest/tutorials/ipsec/).
Are there plans to introduce such support? (or explicitly not to..)
Is it possible to somehow manually configure such tunneling for existing virtual networks? (even in a limited way)
Alternatively, is it possible to deploy oVirt on top of the tunneled (i.e. via VXLAN, IPSec) interfaces? This will allow to encrypt all management traffic.
Such requirement arises when using oVirt deployment on third-party premises with untrusted network.
Thank in advance for any clarifications. :)
--
WBR, Pavel
+32478910884
3 years, 8 months
oVirt 4.4: Self-hosted engine deployment fails with backup restore from 4.3 engine
by Oliver Leinfelder
Hi there,
I'm a bit puzzled about an possible upgrade paths from a 4.3 cluster to
version 4.4 in a self-hosted engine environment.
My idea was:
Set up a new host with a clean ovirt node 4.4 installation, then deploy the
hosted engine on this with a restored backup from the production cluster
and go from there.
This however fails with the following error:
2020-05-27 00:17:08,886+0200 DEBUG
otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:103 {'msg': 'non-zero return code', 'cmd':
['engine-setup', '--accept-defaults',
'--config-append=/root/ovirt-engine-answers'], 'stdout': "[ INFO ] Stage:
Initializing\n[ INFO ] Stage: Environment setup\n C
onfiguration files: /etc/ovirt-engine-setup.conf.d/10-packaging-jboss.conf,
/etc/ovirt-engine-setup.conf.d/10-packaging.conf,
/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf,
/root/ovirt-engine-answers\n Log file:
/var/log/ovirt-engine/setup/ovirt-engine-setup-20200527001657-fyeueu.log\n
Version: otop
i-1.9.1 (otopi-1.9.1-1.el8)\n[ INFO ] DNF Downloading 1 files, 0.00KB\n[
INFO ] DNF Downloaded CentOS-8 - AppStream\n[ INFO ] DNF Downloading 1
files, 0.00KB\n[ INFO ] DNF Downloaded CentOS-8 - Base\n[ INFO ] DNF
Downloading 1 files, 0.00KB\n
[...]
... anwsers from backup config follow ....
[...]
2020-05-27 00:17:12,396+0200 DEBUG otopi.context context._executeMethod:145
method exception
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in
_executeMethod
method['method']()
File
"/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-ansiblesetup/core/misc.py",
line 403, in _closeup
r = ah.run()
File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/ansible_utils.py",
line 229, in run
raise RuntimeError(_('Failed executing ansible-playbook'))
Is this approach (restoring from 4.3) generally supposed to work? If not,
what is the appropriate upgrade path?
Thank you!
Regards
Oli
3 years, 10 months
"gluster-ansible-roles is not installed on Host" error on Cockpit
by Hesham Ahmed
On a new 4.3.1 oVirt Node installation, when trying to deploy HCI
(also when trying adding a new gluster volume to existing clusters)
using Cockpit, an error is displayed "gluster-ansible-roles is not
installed on Host. To continue deployment, please install
gluster-ansible-roles on Host and try again". There is no package
named gluster-ansible-roles in the repositories:
[root@localhost ~]# yum install gluster-ansible-roles
Loaded plugins: enabled_repos_upload, fastestmirror, imgbased-persist,
package_upload, product-id, search-disabled-repos,
subscription-manager, vdsmupgrade
This system is not registered with an entitlement server. You can use
subscription-manager to register.
Loading mirror speeds from cached hostfile
* ovirt-4.3-epel: mirror.horizon.vn
No package gluster-ansible-roles available.
Error: Nothing to do
Uploading Enabled Repositories Report
Cannot upload enabled repos report, is this client registered?
This is due to check introduced here:
https://gerrit.ovirt.org/#/c/98023/1/dashboard/src/helpers/AnsibleUtil.js
Changing the line from:
[ "rpm", "-qa", "gluster-ansible-roles" ], { "superuser":"require" }
to
[ "rpm", "-qa", "gluster-ansible" ], { "superuser":"require" }
resolves the issue. The above code snippet is installed at
/usr/share/cockpit/ovirt-dashboard/app.js on oVirt node and can be
patched by running "sed -i 's/gluster-ansible-roles/gluster-ansible/g'
/usr/share/cockpit/ovirt-dashboard/app.js && systemctl restart
cockpit"
3 years, 10 months
ovirt-imageio-proxy not working after updating SSL certificates with a wildcard cert issued by AlphaSSL (intermediate)
by Lynn Dixon
All,
I recently bought a wildcard certificate for my lab domain (shadowman.dev)
and I replaced all the certs on my RHV4.3 machine per our documentation.
The WebUI presents the certs successfully and without any issues, and
everything seemed to be fine, until I tried to upload a disk image (or an
ISO) to my storage domain. I get this error in the events tab:
https://share.getcloudapp.com/p9uPvegx
[image: image.png]
I also see that the disk is showing up in my storage domain, but its
showing "Paused by System" and I can't do anything with it. I cant even
delete it!
I have tried following this document to fix the issue, but it didn't work:
https://access.redhat.com/solutions/4148361
I am seeing this error pop into my engine.log:
https://pastebin.com/kDLSEq1A
And I see this error in my image-proxy.log:
WARNING 2020-07-24 15:26:34,802 web:137:web:(log_error) ERROR [172.17.0.30]
PUT /tickets/ [403] Error verifying signed ticket: Invalid ovirt ticket
(data='------my_ticket_data-----', reason=Untrusted certificate)
[request=0.002946/1]
Now, when I bought my wildcard, I was given a root certificate for the CA,
as well as a separate intermediate CA certificate from the provider.
Likewise, they gave me a certificate and a private key of course. The root
and intermediate CA's certificates have been added
to /etc/pki/ca-trust/source/anchors/ and I did an update-ca-trust.
I also started experiencing issues with the ovpn network provider at the
same time I replaced the SSL certs, but I disregarded it at the time, but
now I am thinking its related. Any advice on what to look for to fix the
ovirt-imageio-proxy?
Thanks!
*Lynn Dixon* | Red Hat Certified Architect #100-006-188
*Solutions Architect* | NA Commercial
Google Voice: 423-618-1414
Cell/Text: 423-774-3188
Click here to view my Certification Portfolio <http://red.ht/1XMX2Mi>
3 years, 10 months