vdsm error every hour
by eevans@digitaldatatechs.com
In the web ui I get an error about every hour: VDSM command SetVolumeDescriptionVDS failed: Volume does not exist: (u'e3f79840-8355-45b0-ad2b-440c877be637',)
I looked under Storage > Disks and this disk does not exist. It's more of an annoyance than a problem, but if there is a way to get rid of this error I would like to know.
My research says I can install vdsm-tools and vdsm-cli, but vdsm-cli is not available, and I really don't want to install anything until I know it's what I need.
Is there a vdsm command to purge a missing disk so this error won't show up?
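For reference, this is roughly the kind of check I was hoping a tool would do for me — only a sketch, assuming the standard engine database schema and that psql access to the 'engine' database is available as the postgres user (adjust for your setup):
# does anything in the engine DB still reference the stale volume UUID?
su - postgres -c "psql -d engine -c \"SELECT image_guid, image_group_id, active FROM images WHERE image_guid = 'e3f79840-8355-45b0-ad2b-440c877be637';\""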
Thanks in advance.
Eric
4 years, 4 months
Error with Imported VM Run
by Ian Easter
I imported an orphaned storage domain into a new oVirt instance (4.4) and
imported the VMs that were found in that storage domain. The SD was
previously attached to a 4.2/4.3 instance of oVirt, and I did not see any
errors during the imports.
However, when I attempt to run these VMs, I see this error in the
engine.log:
ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-51848)
[d56a3ce0-7139-4173-bcdb-d5bd4d9378fa] EVENT_ID: USER_FAILED_RUN_VM(54),
Failed to run VM mtl-portal-01 (User: admin@internal-authz).
As best as I can tell, this is a vdsm version issue... So what are
my options to resolve this? Should I go through the steps to build out an
older oVirt release and then attach the SD to that instance and work
through the migration?
Can I manually update the VM to work with the newer vdsm version?
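In case it helps clarify what I mean by "manually update": I was thinking of something along these lines — a sketch only, with a placeholder engine hostname and credentials — to inspect what the engine recorded for the imported VM over the REST API before touching anything:
# placeholder engine URL and credentials
curl -s -k -u admin@internal:PASSWORD \
  "https://engine.example.com/ovirt-engine/api/vms?search=name%3Dmtl-portal-01" \
  | grep -i compatibility
# (no output simply means no custom compatibility version is set on the VM)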
*Thank you,*
*Ian Easter*
4 years, 4 months
NVMe-oF
by Daniel Menzel
Hi all,
does anybody know whether NVMe-over-Fabrics support is somehow scheduled
in oVirt? As, from a logical point of view, it is like iSCSI on
steroids, I guess it shouldn't be too hard to implement. Yep, I know,
a bold statement from someone who isn't a programmer. ;-) Nonetheless: are
there plans to implement NVMe-oF as a storage backend for oVirt in the
near future? If so, is there a way to help (e.g. with hardware resources)?
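(For context on the "iSCSI on steroids" comparison: from the host side, attaching an NVMe-oF namespace today looks roughly like the sketch below — made-up transport, address and NQN, plain nvme-cli outside of oVirt — so conceptually the flow is indeed very close to iSCSI discovery/login.)
nvme discover -t rdma -a 192.168.100.10 -s 4420    # placeholder target address
nvme connect -t rdma -a 192.168.100.10 -s 4420 -n nqn.2020-06.net.example:nvme-target
nvme list    # the remote namespace appears as a local /dev/nvmeXnY device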
Kind regards,
Daniel
--
Daniel Menzel
Managing Director
Menzel IT GmbH
Charlottenburger Str. 33a
13086 Berlin
+49 (0) 30 / 5130 444 - 00
daniel.menzel(a)menzel-it.net
https://menzel-it.net
Managing Directors: Daniel Menzel, Josefin Menzel
Registered office: Berlin
Commercial register: Amtsgericht Charlottenburg
Commercial register number: HRB 149835 B
VAT ID: DE 309 226 751
4 years, 4 months
oVirt 2020 online conference
by Sandro Bonazzola
It is our pleasure to invite you to the oVirt 2020 online conference. The
conference, organized by the oVirt community, will take place online on Monday,
September 7th, 2020!
oVirt 2020 is a free conference for oVirt community project users and
contributors coming to a web browser near you!
There is no admission or ticket charge for this event. However, you will be
required to complete a free registration.
Watch https://blogs.ovirt.org/ovirt-2020-online-conference/ for updates
about registration. Talks, presentations and workshops will all be in
English.
We encourage students and new graduates as well as professionals to submit
proposals to oVirt conferences.
We will be looking for talks and discussions across virtualization, and how
oVirt 4.4 can effectively solve user issues around:
- Upgrade flows
- New features
- Integration with other projects
- User stories
The deadline to submit abstracts is July 26th 2020.
To submit your abstract, please click on the following link: submission form
<https://forms.gle/1RDUwPGKobJfKhaYA>
More information is available at
https://blogs.ovirt.org/ovirt-2020-online-conference/
Thanks,
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://www.redhat.com/>
*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.
<https://mojo.redhat.com/docs/DOC-1199578>*
4 years, 4 months
Re: VDSM can't see StoragePool
by Strahil Nikolov
Can you set one of the hypervisors into maintenance and use the "Reinstall" option from the UI?
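Before reinstalling, one quick check you could run on an affected host — a sketch, assuming the vdsm-client tool is installed there — to confirm what the log already suggests:
vdsm-client Host getConnectedStoragePools
# an empty pool list here matches the "waiting for storage pool to go up"
# message in the vdsm.log below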
Best Regards,
Strahil Nikolov
On 25 June 2020, 13:24:26 GMT+03:00, Erez Zarum <erezz(a)nanosek.com> wrote:
>I have a self-hosted engine running on iSCSI as well as a couple of
>storage domains using iSCSI; both the SE and those storage domains use
>the same target portals (two).
>I can see the iSCSI sessions and multipath working well from the Host
>point of view.
>Yesterday, after restarting the “ovirt-engine” service, all the hosts
>besides the one that runs the SE and the SPM went into “Unassigned”
>mode with an error stating the ovirt-engine can’t communicate with the
>hosts.
>Network-wise, everything is good: I can reach all the ports and the
>network is well configured, so I ruled that out.
>Looking at the VDSM logs on those “Unassigned” hosts it looks like the
>VDSM can’t find the Storage Pool.
>
>(vmrecovery) [vdsm.api] START
>getConnectedStoragePoolsList(options=None) from=internal,
>task_id=217ec32b-591c-4376-8dc0-8d62200557ee (api:48)
>(vmrecovery) [vdsm.api] FINISH getConnectedStoragePoolsList
>return={'poollist': []} from=internal,
>task_id=217ec32b-591c-4376-8dc0-8d62200557ee (api:54)
>(vmrecovery) [vds] recovery: waiting for storage pool to go up
>(clientIF:723)
>
>If I look at the VDSM logs on the host where the SE and SPM are running,
>there are no issues there, and the node appears up (green) in the ovirt-engine UI.
>
>I managed to set the hosted engine to maintenance, shut it down and
>then start it again on another host. When it starts on that host, the
>host goes “green”, and if the SPM stays on the previous host, I have two
>hosts working while the rest remain “Unassigned”.
>
>All the “ovirt-ha-agent”/“ovirt-ha-broker” services seem OK. I
>restarted them, and I also tried restarting VDSM on the hosts, with no
>luck.
>The VMs are still running. I did shut down one host (I even used
>“SSH Restart” from the web UI) to see if that would help; it came back and
>still went into “Unassigned”.
>
>It seems like the hosts can’t see the Storage pool.
>
>Where should I start to troubleshoot this?
>
>Thanks
4 years, 4 months
Re: Migration of self Hosted Engine from iSCSI to Gluster/NFS
by Strahil Nikolov
As you will migrate from block-based storage to file-based storage, I think that you should use the backup & restore procedure.
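A sketch of that flow (file names are placeholders; follow the documented procedure for your exact version):
# on the current engine VM: take a full engine backup
engine-backup --mode=backup --file=engine-backup.tar.gz --log=engine-backup.log
# redeploy the hosted engine onto the new Gluster/NFS storage domain,
# restoring the engine from that backup during deployment
hosted-engine --deploy --restore-from-file=engine-backup.tar.gz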
Best Regards,
Strahil Nikolov
On 25 June 2020, 7:31:55 GMT+03:00, Erez Zarum <erezz(a)nanosek.com> wrote:
>I was looking for a “complete” best practice to migrate a self-hosted
>engine currently running on an iSCSI LUN to a Gluster or NFS storage
>domain.
>oVirt version 4.3.10
>
>Thanks!
4 years, 4 months
update and engine-setup
by eevans@digitaldatatechs.com
I do not have a self-hosted engine. I ran yum update, which updated these packages:
Updated:
microcode_ctl.x86_64 2:2.1-61.10.el7_8
ovirt-engine-appliance.x86_64 0:4.3-20200625.1.el7
ovirt-engine-extensions-api-impl.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-extensions-api-impl-javadoc.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-health-check-bundler.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup-base.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup-plugin-cinderlib.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup-plugin-ovirt-engine.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup-plugin-ovirt-engine-common.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup-plugin-vmconsole-proxy-helper.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-setup-plugin-websocket-proxy.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-ui-extensions.noarch 0:1.0.13-1.20200303git3b594b8.el7
ovirt-engine-vmconsole-proxy-helper.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-engine-websocket-proxy.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
ovirt-node-ng-image-update.noarch 0:4.3.11-0.1.rc1.20200625025312.git243a031.el7
ovirt-release43.noarch 0:4.3.11-0.1.rc1.20200626025335.git243a031.el7
ovirt-release43-snapshot.noarch 0:4.3.11-0.1.rc1.20200626025335.git243a031.el7
ovirt-release43-tested.noarch 0:4.3.11-0.1.rc1.20200626025335.git243a031.el7
python2-ovirt-engine-lib.noarch 0:4.3.11.1-0.0.master.20200625131236.git33fd414.el7
python2-tracer.noarch 0:0.7.4-1.el7
tracer-common.noarch 0:0.7.4-1.el7
It stated I needed to run engine-setup, but it failed with these messages:
engine-setup
[ INFO ] Stage: Initializing
[ INFO ] Stage: Environment setup
Configuration files: ['/etc/ovirt-engine-setup.conf.d/10-packaging-jboss.conf', '/etc/ovirt-engine-setup.conf.d/10-packaging.conf', '/etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf']
Log file: /var/log/ovirt-engine/setup/ovirt-engine-setup-20200626111946-5y1tda.log
Version: otopi-1.8.5_master (otopi-1.8.5-0.0.master.20191017094324.git500b7f5.el7)
[ INFO ] Stage: Environment packages setup
[ INFO ] Stage: Programs detection
[ INFO ] Stage: Environment setup (late)
[ INFO ] Stage: Environment customization
--== PRODUCT OPTIONS ==--
[ INFO ] ovirt-provider-ovn already installed, skipping.
--== PACKAGES ==--
[ INFO ] Checking for product updates...
[ ERROR ] Yum ['ovirt-engine conflicts with ovirt-engine-ui-extensions-1.0.13-1.20200303git3b594b8.el7.noarch']
[ INFO ] Yum Performing yum transaction rollback
[ ERROR ] Failed to execute stage 'Environment customization': ['ovirt-engine conflicts with ovirt-engine-ui-extensions-1.0.13-1.20200303git3b594b8.el7.noarch']
[ INFO ] Stage: Clean up
Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20200626111946-5y1tda.log
[ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20200626112010-setup.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Execution of setup failed
I ran engine-setup --offline and it completed successfully. I'm not sure what the issue is with the UI extensions, but it has been a problem for a while, failing migrations, etc. Each time I update, I have to downgrade ovirt-engine-ui-extensions, as stated in a previous post (see the commands below).
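Roughly, the workaround each time is (a sketch, assuming the older build is still available in the enabled repos):
# downgrade the conflicting package, then retry the setup
yum downgrade ovirt-engine-ui-extensions
engine-setup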
I thought the developers should know about this so it can be addressed.
4 years, 4 months
engine failure
by eevans@digitaldatatechs.com
The last update for 4.3 has caused my engine server to fail. I cannot find the original repos for the 4.3 install that work without missing dependencies. I am using CentOS 7, which is up to date. Can someone forward the URL for the 4.3.10.1 repo, or information on how I can reinstall and hopefully restore my standalone engine? My VMs are still running, but I cannot manage anything. Please help; it is appreciated.
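For reference, the repo setup I originally used was along these lines (a sketch; please confirm this is still the correct URL for 4.3):
# installs the oVirt 4.3 repository definitions, after which the engine
# packages should resolve again
yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm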
4 years, 4 months
Does a slow Web proxy (can't run the update-check) ruin HA scores? (Solved, ...but really?)
by thomas@hoberg.net
Running a couple of oVirt clusters on left-over hardware in an R&D niche of the data center. Lots of switches/proxies are still at 100Mbit, and just checking for updates via 'yum update' can take a while, and even times out two times out of three.
The network between the nodes is 10Gbit though, faster than any other part of the hardware, including some SSDs and RAIDs: Cluster communication should be excellent, even if everything goes through a single port.
After moving some servers to a new IP range, where there are even more hops to the proxy, I am shocked to see the three HCI nodes in one cluster almost permanently report bad HA scores, which of course becomes a real issue when it hits all three. The entire cluster really starts to 'wobble'...
I have been trying to find the reason for that bad score, and there is nothing obvious: machines have been running just fine, with very light loads, no downtime, no reboots, etc.
But looking at the events recorded on the hosts, something like "Failed to check for available updates on host <name> with message 'Failed to run check-update of host '<host>'. Error: null'." does come up pretty often. Moreover, when I then have all three servers run the update check from the GUI, I can find myself locked out of the oVirt GUI, and once I get back in, all non-active HostedEngine hosts are suddenly back in the 'low HA score' state.
So I have the nagging impression that the ability (or inability) to run the update check counts toward the HA score, which IMHO would be quite mad. It would have production clusters go haywire just because an external internet connection is interrupted...
Any feedback on this?
P.S.
Only minutes later, after noticing the ha-scores reported by hosted-engine --vm-status were really in the low 2000s range overall, I did a quick Google and found this:
ovirt-ha-agent - host score penalties: https://github.com/oVirt/ovirt-hosted-engine-ha/blob/master/ovir...
NOTE: These values must be the same for all hosts in the HA cluster!
base-score=3400
gateway-score-penalty=1600
not-uptodate-config-penalty=1000 // not 'knowing whether there are updates' is not the same as 'knowing it is missing critical patches'
mgmt-bridge-score-penalty=600
free-memory-score-penalty=400
cpu-load-score-penalty=1000
engine-retry-score-penalty=50
cpu-load-penalty-min=0.4
cpu-load-penalty-max=0.9
So now I know how to fix it for myself, but I'd consider this pretty much a bug: when the update check fails, that really only implies that the update check could not go through. It doesn't mean the cluster is fundamentally unhealthy.
Now I understand how that negative feedback is next to impossible inside Red Hat's network, where the update servers are local.
But having a cluster's HA score depend on something 'just now happening' at the far edges of the Internet seems a very bad design decision.
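(For what it's worth, the numbers do add up — a quick sanity check on each host, using the penalties quoted above:)
# 3400 base score - 1000 not-uptodate penalty = 2400, which matches the
# "low 2000s" scores I was seeing
hosted-engine --vm-status | grep -i score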
Please comment and/or tell me how and where I should file this as a bug.
4 years, 4 months
NFS import storage domain permissions error
by Joop
Hi All,
I get an error when I try to import a storage domain from a Synology
NAS that used to be part of a 4.3.10 oVirt installation, but no matter what I try
I can't get it to import. I keep getting a permission denied error, while
with the same settings it worked with the previous oVirt version.
This is the log:
2020-06-27 19:39:38,113+0200 INFO (jsonrpc/0) [vdsm.api] START connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'password': '********', 'protocol_version': 'auto', 'port': '', 'iqn': '', 'connection': 'pakhuis:/volume1/nfs/data', 'ipv6_enabled': 'false', 'id': '00000000-0000-0000-0000-000000000000', 'user': '', 'tpgt': '1'}], options=None) from=::ffff:192.168.11.14,53948, flow_id=c753ebc5-20a3-4e1c-bcd8-794ddde3ec69, task_id=2a9ef4e7-e2c0-43d6-b03a-7f0d62f08e50 (api:48)
2020-06-27 19:39:38,113+0200 DEBUG (jsonrpc/0) [storage.Server.NFS] Using local locks for NFSv3 locks (storageServer:442)
2020-06-27 19:39:38,114+0200 INFO (jsonrpc/0) [storage.StorageServer.MountConnection] Creating directory '/rhev/data-center/mnt/pakhuis:_volume1_nfs_data' (storageServer:167)
2020-06-27 19:39:38,114+0200 INFO (jsonrpc/0) [storage.fileUtils] Creating directory: /rhev/data-center/mnt/pakhuis:_volume1_nfs_data mode: None (fileUtils:198)
2020-06-27 19:39:38,114+0200 INFO (jsonrpc/0) [storage.Mount] mounting pakhuis:/volume1/nfs/data at /rhev/data-center/mnt/pakhuis:_volume1_nfs_data (mount:207)
2020-06-27 19:39:38,360+0200 DEBUG (jsonrpc/0) [storage.Mount] /rhev/data-center/mnt/pakhuis:_volume1_nfs_data mounted: 0.25 seconds (utils:390)
2020-06-27 19:39:38,380+0200 DEBUG (jsonrpc/0) [storage.Mount] Waiting for udev mount events: 0.02 seconds (utils:390)
2020-06-27 19:39:38,381+0200 DEBUG (jsonrpc/0) [storage.oop] Creating ioprocess Global (outOfProcess:89)
2020-06-27 19:39:38,382+0200 INFO (jsonrpc/0) [IOProcessClient] (Global) Starting client (__init__:308)
2020-06-27 19:39:38,394+0200 INFO (ioprocess/10476) [IOProcess] (Global) Starting ioprocess (__init__:434)
2020-06-27 19:39:38,394+0200 DEBUG (ioprocess/10476) [IOProcess] (Global) Closing unrelated FDs... (__init__:432)
2020-06-27 19:39:38,394+0200 DEBUG (ioprocess/10476) [IOProcess] (Global) Opening communication channels... (__init__:432)
2020-06-27 19:39:38,394+0200 DEBUG (ioprocess/10476) [IOProcess] (Global) Queuing request (slotsLeft=20) (__init__:432)
2020-06-27 19:39:38,394+0200 DEBUG (ioprocess/10476) [IOProcess] (Global) (1) Start request for method 'access' (waitTime=49) (__init__:432)
2020-06-27 19:39:38,395+0200 DEBUG (ioprocess/10476) [IOProcess] (Global) (1) Finished request for method 'access' (runTime=421) (__init__:432)
2020-06-27 19:39:38,395+0200 DEBUG (ioprocess/10476) [IOProcess] (Global) (1) Dequeuing request (slotsLeft=21) (__init__:432)
2020-06-27 19:39:38,396+0200 WARN (jsonrpc/0) [storage.oop] Permission denied for directory: /rhev/data-center/mnt/pakhuis:_volume1_nfs_data with permissions:7 (outOfProcess:193)
2020-06-27 19:39:38,396+0200 INFO (jsonrpc/0) [storage.Mount] unmounting /rhev/data-center/mnt/pakhuis:_volume1_nfs_data (mount:215)
2020-06-27 19:39:38,435+0200 DEBUG (jsonrpc/0) [storage.Mount] /rhev/data-center/mnt/pakhuis:_volume1_nfs_data unmounted: 0.04 seconds (utils:390)
2020-06-27 19:39:38,453+0200 DEBUG (jsonrpc/0) [storage.Mount] Waiting for udev mount events: 0.02 seconds (utils:390)
2020-06-27 19:39:38,453+0200 ERROR (jsonrpc/0) [storage.HSM] Could not connect to storageServer (hsm:2421)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 80, in validateDirAccess
getProcPool().fileUtils.validateAccess(dirPath)
File "/usr/lib/python3.6/site-packages/vdsm/storage/outOfProcess.py", line 194, in validateAccess
raise OSError(errno.EACCES, os.strerror(errno.EACCES))
PermissionError: [Errno 13] Permission denied
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2418, in connectStorageServer
conObj.connect()
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 449, in connect
return self._mountCon.connect()
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 190, in connect
six.reraise(t, v, tb)
File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 183, in connect
self.getMountObj().getRecord().fs_file)
File "/usr/lib/python3.6/site-packages/vdsm/storage/fileSD.py", line 91, in validateDirAccess
raise se.StorageServerAccessPermissionError(dirPath)
vdsm.storage.exception.StorageServerAccessPermissionError: Permission settings on the specified path do not allow access to the storage. Verify permission settings on the specified storage path.: 'path = /rhev/data-center/mnt/pakhuis:_volume1_nfs_data'
2020-06-27 19:39:38,453+0200 DEBUG (jsonrpc/0) [storage.HSM] knownSDs: {a2691633-e3c9-454e-9ee4-a6f50f4e00fa: vdsm.storage.glusterSD.findDomain} (hsm:2470)
202
If I make a mistake in the path, I get the mount command in the logs; correcting
the mistake lets me mount it at /mnt, and then I'm able to
create files and folders using the vdsm and qemu accounts, so I don't
know where that error is coming from. I had a look at the source of
vdsm/storage/* but I'm not a Python programmer.
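The manual test I did was roughly this (a sketch; the assumption being that the export root must be owned by vdsm:kvm, i.e. uid/gid 36, for the access check to pass):
mount -t nfs pakhuis:/volume1/nfs/data /mnt
ls -ldn /mnt                       # shows numeric owner/group of the export root
sudo -u vdsm touch /mnt/write_test && sudo -u vdsm rm /mnt/write_test
sudo -u qemu touch /mnt/write_test && sudo -u qemu rm /mnt/write_test
umount /mnt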
A couple of other 4.4 oVirt installs don't have this problem, and I used
the same USB stick with CentOS 8.2 for those as for this one. I followed
the same install instructions and, as far as I can
see, the same versions of packages. Beats me.
I'm tempted to reinstall CentOS 7 and oVirt 4.3.10 just to see what
happens, but I'm hoping someone can point me to something that I must
have overlooked.
Regards,
Joop
4 years, 4 months