oVirt 4.4 pending VM changes on HostedEngine VM
by Radoslav Milanov
Hello
I'm not sure how, but I've got pending changes on the HostedEngine VM on
a 3-node 4.4 cluster, and those changes are never applied. Is there a way
to cancel pending VM changes in general?
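For what it's worth, this is how I'm checking whether the engine still reports
a pending (next-run) configuration, via the Python SDK (a rough sketch only;
the connection details are placeholders, and as far as I can tell there is no
API call that simply discards the pending configuration):

import ovirtsdk4 as sdk

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=HostedEngine')[0]
# next_run_configuration_exists is the flag behind the "pending changes" marker.
print('pending changes:', vm.next_run_configuration_exists)
# If I read the SDK right, the pending configuration itself can be inspected with:
#   vms_service.vm_service(vm.id).get(next_run=True)
connection.close()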
Thanks.
Adding extra parameters to qemu-kvm
by Sakari Poussa
Hi,
Before I dive too deeply into vdsm-hook-qemucmdline, I'd like to know if it
is the correct solution for my needs.
I want to add the following (type of) parameters to the VM startup:
-object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=$PMEM_SIZE,align=2M \
-device nvdimm,id=nvdimm1,memdev=mem1,label-size=128k
Is vdsm-hook-qemucmdline the correct tool? Can I also add the parameters
from an Ansible playbook (via the oVirt roles) somehow?
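For context, this is roughly what I was hoping to end up with via the Python
SDK (a sketch only; it assumes the hook reads a JSON-encoded list of extra qemu
arguments from a VM custom property named qemu_cmdline and that the property is
already registered on the engine -- I have not verified either. The connection
details, VM name and PMEM size are placeholders):

import json
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

PMEM_SIZE = '16G'  # placeholder

extra_args = [
    '-object',
    'memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=%s,align=2M' % PMEM_SIZE,
    '-device',
    'nvdimm,id=nvdimm1,memdev=mem1,label-size=128k',
]

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=myvm')[0]
# Set the custom property that the qemucmdline hook is supposed to pick up at VM start.
vms_service.vm_service(vm.id).update(
    types.Vm(
        custom_properties=[
            types.CustomProperty(name='qemu_cmdline', value=json.dumps(extra_args)),
        ],
    ),
)
connection.close()

Presumably the same custom_properties field could also be set from the ovirt_vm
Ansible module, but I have not tried that either.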
Thanks, Sakari
Current Architecture of oVirt and Quality Assurance measures
by Juergen Novak
I am currently evaluating oVirt for use in large companies as a base
system for edge computing applications.
I have already tried to find a current architecture overview of the
oVirt modules -- particularly ovirt-node-ng and imgbased -- but did not
manage to find any. There is some architecture documentation, but it is marked
as outdated and many of its links lead to 404s.
So I would like to ask whether somebody could provide a current architecture
overview (not user documentation), particularly of the node module(s).
Another question I have is which quality assurance measures are used in development:
* Tools used (for Unit-Tests, Component-Tests, Integration-Tests)
* Are there any statistics about test coverage?
Any assistance would be appreciated.
Best Regards
/juergen
oVirt 4.4 gluster single host vm paused due to unknown storage error
by Gianluca Cecchi
Hello,
in the environment in the subject, I downloaded from the Glance repository the
CentOS 8 image
CentOS 8 Generic Cloud Image v20200113.3 for x86_64 (5e35c84)
and imported it as a template.
I created a VM based on it (I got the message: "In order to create a VM
from a template with a different chipset, device configuration will be
changed. This may affect functionality of the guest software. Are you sure
you want to proceed?").
When running "dnf update", the VM went into a paused state during the I/O of
the package updates.
VM c8desktop started on Host novirt2.example.net 5/28/20 1:29:04 PM
VM c8desktop has been paused. 5/28/20 1:41:52 PM
VM c8desktop has been paused due to unknown storage error. 5/28/20 1:41:52 PM
VM c8desktop has recovered from paused back to up. 5/28/20 1:43:50 PM
In the messages log of the (nested) host I see:
May 28 13:28:06 novirt2 systemd-machined[1497]: New machine
qemu-7-c8desktop.
May 28 13:28:06 novirt2 systemd[1]: Started Virtual Machine
qemu-7-c8desktop.
May 28 13:28:06 novirt2 kvm[57798]: 2 guests now active
May 28 13:28:07 novirt2 journal[13368]: Guest agent is not responding: QEMU
guest agent is not connected
May 28 13:28:12 novirt2 journal[13368]: Guest agent is not responding: QEMU
guest agent is not connected
May 28 13:28:17 novirt2 journal[13368]: Guest agent is not responding: QEMU
guest agent is not connected
May 28 13:28:22 novirt2 journal[13368]: Guest agent is not responding: QEMU
guest agent is not connected
May 28 13:28:27 novirt2 journal[13368]: Guest agent is not responding: QEMU
guest agent is not connected
May 28 13:28:32 novirt2 journal[13368]: Guest agent is not responding: QEMU
guest agent is not connected
May 28 13:28:37 novirt2 journal[13368]: Domain id=7 name='c8desktop' uuid=63e27cb5-087d-435e-bf61-3fe25e3319d6 is tainted: custom-ga-command
May 28 13:28:37 novirt2 journal[26984]: Cannot open log file: '/var/log/libvirt/qemu/c8desktop.log': Device or resource busy
May 28 13:28:37 novirt2 journal[13368]: Cannot open log file: '/var/log/libvirt/qemu/c8desktop.log': Device or resource busy
May 28 13:28:37 novirt2 journal[13368]: Unable to open domainlog
May 28 13:30:00 novirt2 systemd[1]: Starting system activity accounting
tool...
May 28 13:30:00 novirt2 systemd[1]: Started system activity accounting tool.
May 28 13:37:21 novirt2 python3[62512]: detected unhandled Python exception
in '/usr/lib/python3.6/site-packages/vdsm/gluster/gfapi.py'
May 28 13:37:21 novirt2 abrt-server[62514]: Deleting problem directory
Python3-2020-05-28-13:37:21-62512 (dup of Python3-2020-05-28-10:32:57-29697)
May 28 13:37:21 novirt2 dbus-daemon[1502]: [system] Activating service
name='org.freedesktop.problems' requested by ':1.3111' (uid=0 pid=62522
comm="/usr/libexec/platform-python /usr/bin/abrt-action-"
label="system_u:system_r:abrt_t:s0-s0:c0.c1023") (using servicehelper)
May 28 13:37:21 novirt2 dbus-daemon[1502]: [system] Successfully activated
service 'org.freedesktop.problems'
May 28 13:37:21 novirt2 abrt-server[62514]: /bin/sh:
reporter-systemd-journal: command not found
May 28 13:37:21 novirt2 python3[62550]: detected unhandled Python exception
in '/usr/lib/python3.6/site-packages/vdsm/gluster/gfapi.py'
May 28 13:37:21 novirt2 abrt-server[62552]: Not saving repeating crash in
'/usr/lib/python3.6/site-packages/vdsm/gluster/gfapi.py'
May 28 13:37:22 novirt2 python3[62578]: detected unhandled Python exception
in '/usr/lib/python3.6/site-packages/vdsm/gluster/gfapi.py'
May 28 13:37:22 novirt2 abrt-server[62584]: Not saving repeating crash in
'/usr/lib/python3.6/site-packages/vdsm/gluster/gfapi.py'
May 28 13:40:00 novirt2 systemd[1]: Starting system activity accounting
tool...
May 28 13:40:00 novirt2 systemd[1]: Started system activity accounting tool.
In the log of the related gluster volume where the VM disk is, I have:
[2020-05-28 11:41:33.892074] W [MSGID: 114031]
[client-rpc-fops_v2.c:679:client4_0_writev_cbk] 0-vmstore-client-0: remote
operation failed [Invalid argument]
[2020-05-28 11:41:33.892140] W [fuse-bridge.c:2925:fuse_writev_cbk]
0-glusterfs-fuse: 348168: WRITE => -1
gfid=35ae86e8-0ccd-48b8-9ef2-6ca9a108ccf9 fd=0x7fd1d800cf38 (Invalid
argument)
[2020-05-28 11:41:33.902984] I [MSGID: 133022]
[shard.c:3693:shard_delete_shards] 0-vmstore-shard: Deleted shards of
gfid=35ae86e8-0ccd-48b8-9ef2-6ca9a108ccf9 from backend
[2020-05-28 11:41:52.434362] E [MSGID: 133010]
[shard.c:2339:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on
shard 6 failed. Base file gfid = 3e12e7fe-6a77-41b8-932a-d4f50c41ac00 [No
such file or directory]
[2020-05-28 11:41:52.434423] W [fuse-bridge.c:2925:fuse_writev_cbk]
0-glusterfs-fuse: 353565: WRITE => -1
gfid=3e12e7fe-6a77-41b8-932a-d4f50c41ac00 fd=0x7fd208093fb8 (No such file
or directory)
[2020-05-28 11:46:34.095697] W [MSGID: 114031]
[client-rpc-fops_v2.c:679:client4_0_writev_cbk] 0-vmstore-client-0: remote
operation failed [Invalid argument]
[2020-05-28 11:46:34.095758] W [fuse-bridge.c:2925:fuse_writev_cbk]
0-glusterfs-fuse: 384006: WRITE => -1
gfid=7b804a1a-1734-4bec-b8f4-9ba33ffefe8b fd=0x7fd1d0005fd8 (Invalid
argument)
[2020-05-28 11:46:34.104494] I [MSGID: 133022]
[shard.c:3693:shard_delete_shards] 0-vmstore-shard: Deleted shards of
gfid=7b804a1a-1734-4bec-b8f4-9ba33ffefe8b from backend
These are very similar to the sharding messages I got in 4.3.9 on a single host
with Gluster and "heavy"/sudden I/O operations when using thin-provisioned disks.
I see that in 4.4 Gluster is glusterfs-7.5-1.el8.x86_64.
Could this be a problem that only appears on a single host, since there is indeed
no data travelling over the network to sync nodes, and the sharding feature for
some reason is not able to keep pace when the local disk is very fast?
In 4.3.9 on a single host with Gluster 6.8-1 my only way to solve it was to
disable sharding, and I finally got stability; see here:
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/OIN4R63I6ITO...
I'm still waiting for comments from the Gluster devs on the logs provided at that time.
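For reference, the workaround amounted to switching sharding off on the volume;
a minimal sketch of what I ran, wrapped in Python (the volume name vmstore is an
assumption, and it has to run on one of the gluster nodes):

import subprocess

VOLUME = 'vmstore'  # assumption: the oVirt data volume holding the VM disks

# Show the current value of the shard option, then disable it on the volume.
subprocess.run(['gluster', 'volume', 'get', VOLUME, 'features.shard'], check=True)
subprocess.run(['gluster', 'volume', 'set', VOLUME, 'features.shard', 'off'], check=True)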
As I already wrote, in my opinion the single-host wizard should turn sharding
off automatically, because in that environment it can make thin-provisioned
disks unusable.
In case nodes are added in the future, the setup could run a check and tell
the user that he/she should re-enable sharding.
Just my 0.2 eurocent
Gianluca
oVirt 4.4 HE on Copy local VM disk to shared storage (NFS) failing
by Anton Gonzalez
Deployment via Cockpit fails at the "Check engine VM health" task, after "Copy
local VM disk to shared storage" and "Exit HE maintenance mode" have completed.
The HE VM does not start from shared storage, with hosted-engine --vm-status
returning down_unexpected.
I've also tried with iSCSI, with similar results. I'm using CentOS 8.1 now,
but I also tried with Node 4.4 with the same results. The VDSM log is attached.
If any other logs are needed, please let me know and I'll add them.
I've used the same NFS server to deploy 4.3 without issue, and I ran the
nfs-check.py script to verify that the NFS share was correctly provisioned.
[ INFO ] TASK [ovirt.hosted_engine_setup : Copy local VM disk to shared
storage]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Remove temporary entry in
/etc/hosts for the local VM]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Start ovirt-ha-broker service on
the host]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Initialize lockspace volume]
[ INFO ] TASK [ovirt.hosted_engine_setup : Workaround for ovirt-ha-broker
start failures]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Initialize lockspace volume]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Start ovirt-ha-agent service on
the host]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Exit HE maintenance mode]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 180, "changed": true,
"cmd": ["hosted-engine", "--vm-status", "--json"], "delta":
"0:00:00.186082", "end": "2020-05-26 20:05:40.063450", "rc": 0, "start":
"2020-05-26 20:05:39.877368", "stderr": "", "stderr_lines": [], "stdout":
"{\"1\": {\"host-id\": 1, \"host-ts\": 2500, \"score\": 0,
\"engine-status\": {\"vm\": \"down_unexpected\", \"health\": \"bad\",
\"detail\": \"Down\", \"reason\": \"bad vm status\"}, \"hostname\":
\"ovirthost02.po-lite.local\", \"maintenance\": false, \"stopped\": false,
\"crc32\": \"3b77b8b8\", \"conf_on_shared_storage\": true,
\"local_conf_timestamp\": 2500, \"extra\":
\"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=2500
(Tue May 26 20:05:36
2020)\\nhost-id=1\\nscore=0\\nvm_conf_refresh_time=2500 (Tue May 26
20:05:36
2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineUnexpectedlyDown\\nstopped=False\\ntimeout=Wed
Dec 31 19:47:59 1969\\n\", \"live-data\": true}, \"global_maintenance\":
false}", "stdout_lines": ["{\"1\": {\"host-id\": 1, \"host-ts\": 2500,
\"score\": 0, \"engine-status\": {\"vm\": \"down_unexpected\", \"health\":
\"bad\", \"detail\": \"Down\", \"reason\": \"bad vm status\"},
\"hostname\": \"ovirthost02.po-lite.local\", \"maintenance\": false,
\"stopped\": false, \"crc32\": \"3b77b8b8\", \"conf_on_shared_storage\":
true, \"local_conf_timestamp\": 2500, \"extra\":
\"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=2500
(Tue May 26 20:05:36
2020)\\nhost-id=1\\nscore=0\\nvm_conf_refresh_time=2500 (Tue May 26
20:05:36
2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineUnexpectedlyDown\\nstopped=False\\ntimeout=Wed
Dec 31 19:47:59 1969\\n\", \"live-data\": true}, \"global_maintenance\":
false}"]}
[ INFO ] TASK [ovirt.hosted_engine_setup : Check VM status at virt level]
[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if engine VM is not running]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine
VM is not running, please check vdsm logs"}
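In the meantime, this is what I have been running on the host to re-check the
health data the playbook was polling (a small sketch that just re-runs the same
command and pulls the engine-status fields out of its JSON output):

import json
import subprocess

# Re-run the command the playbook polls and parse its JSON output.
out = subprocess.run(
    ['hosted-engine', '--vm-status', '--json'],
    stdout=subprocess.PIPE, universal_newlines=True, check=True,
).stdout
status = json.loads(out)
for host_id, data in status.items():
    if not isinstance(data, dict):
        continue  # skip top-level flags such as global_maintenance
    print(host_id, data.get('hostname'), data.get('engine-status'))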
[ANN] oVirt 4.4.1 Second Release Candidate
by Sandro Bonazzola
oVirt 4.4.1 Second Release Candidate is now available for testing
The oVirt Project is pleased to announce the availability of oVirt 4.4.1
Second Release Candidate for testing, as of May 28th, 2020.
This update is the first in a series of stabilization updates to the 4.4
series.
Important notes before you try it
Please note this is a pre-release build.
The oVirt Project makes no guarantees as to its suitability or usefulness.
This pre-release must not be used in production.
Some of the features included in the oVirt 4.4.1 Release Candidate require
content that will be available in CentOS Linux 8.2, but they can’t be tested on
RHEL 8.2 yet due to an incompatibility in the openvswitch package shipped by the
CentOS Virt SIG, which requires rebuilding openvswitch on top of CentOS 8.2.
Installation instructions
For the engine: either use appliance or:
- Install CentOS Linux 8 minimal from
http://centos.mirror.garr.it/centos/8.1.1911/isos/x86_64/CentOS-8.1.1911-...
- dnf install
https://resources.ovirt.org/pub/yum-repo/ovirt-release44-pre.rpm
- dnf update (reboot if needed)
- dnf module enable -y javapackages-tools pki-deps postgresql:12
- dnf install ovirt-engine
- engine-setup
For the nodes:
Either use oVirt Node ISO or:
- Install CentOS Linux 8 from
http://centos.mirror.garr.it/centos/8.1.1911/isos/x86_64/CentOS-8.1.1911-...
; select minimal installation
- dnf install
https://resources.ovirt.org/pub/yum-repo/ovirt-release44-pre.rpm
- dnf update (reboot if needed)
- Attach the host to engine and let it be deployed.
This release is available now on x86_64 architecture for:
* Red Hat Enterprise Linux 8.1 or newer
* CentOS Linux (or similar) 8.1 or newer
This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
for:
* Red Hat Enterprise Linux 8.1 or newer
* CentOS Linux (or similar) 8.1 or newer
* oVirt Node 4.4 based on CentOS Linux 8.1 (available for x86_64 only)
See the release notes [1] for installation instructions and a list of new
features and bugs fixed.
If you manage more than one oVirt instance, OKD or RDO, we also recommend
trying ManageIQ <http://manageiq.org/>.
In that case, please be sure to take the qc2 image and not the ova image.
Notes:
- oVirt Appliance is already available for CentOS Linux 8
- oVirt Node NG is already available for CentOS Linux 8
Additional Resources:
* Read more about the oVirt 4.4.1 release highlights:
http://www.ovirt.org/release/4.4.1/
* Get more oVirt project updates on Twitter: https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/
[1] http://www.ovirt.org/release/4.4.1/
[2] http://resources.ovirt.org/pub/ovirt-4.4-pre/iso/
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
Our code is open: <https://www.redhat.com/en/our-code-is-open>
Red Hat respects your work-life balance. Therefore there is no need to
answer this email out of your office hours.
Re: AutoStart VMs (was Re: Re: oVirt 4.4.0 Release is now generally available)
by Derek Atkins
Eh, no point in creating a repo for that, so I just put them on the web:
https://www.ihtfp.org/ovirt/
-derek
On Wed, May 27, 2020 11:05 am, Staniforth, Paul wrote:
>
> Thanks Derek,
> GitHub or GitLab probably.
>
> Regards,
> Paul S.
> ________________________________
> From: Derek Atkins <derek(a)ihtfp.com>
> Sent: 27 May 2020 15:50
> To: Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
> Cc: thomas(a)hoberg.net <thomas(a)hoberg.net>; users <users(a)ovirt.org>
> Subject: [ovirt-users] AutoStart VMs (was Re: Re: oVirt 4.4.0 Release is
> now generally available)
>
> Hi,
>
> (Sorry if you get this twice -- looks like it didn't like the python
> script in there so I'm resending without the code)
>
> Gianluca Cecchi <gianluca.cecchi(a)gmail.com> writes:
>
>> Hi Derek,
>> today I played around with Ansible to accomplish, I think, what you
>> currently do in oVirt shell.
>> It was the occasion to learn, as always, something new: since "blocks" in
>> Ansible don't support looping, a workaround to get that.
>> Furthermore, I have a single-host environment where it can turn out useful
>> too...
> [snip]
>
> I found the time to work on this using the Python SDK. It took me longer
> than I wanted, but I think I've got something working now. I just
> haven't done a FULL test yet, but a runtime test on the online system
> works (I commented out the start call).
>
> I still have two files: vm_list.py, which is a config file that
> contains the list of VMs, in order, and then the main program itself
> (start_vms.py), which is based on several of the examples available on
> GitHub.
>
> Unfortunately I can't seem to send the script in email because it's
> getting blocked by the Red Hat server -- so I have no idea of the best
> way to share it.
>
> -derek
>
> --
> Derek Atkins 617-623-3745
> derek(a)ihtfp.com
> https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.ihtf...
> Computer and Internet Security Consultant
>
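For the archives, here is a minimal sketch of the approach (not my actual
start_vms.py; it is based on the public ovirtsdk4 examples, and the connection
details and the ordered VM list below are placeholders):

import time

import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Ordered list of VM names to start; stands in for what vm_list.py defines.
VM_LIST = ['dns01', 'storage01', 'app01']

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)
vms_service = connection.system_service().vms_service()

for name in VM_LIST:
    vm = vms_service.list(search='name=%s' % name)[0]
    vm_service = vms_service.vm_service(vm.id)
    if vm.status != types.VmStatus.UP:
        vm_service.start()
    # Wait for this VM to come up before starting the next one.
    while vm_service.get().status != types.VmStatus.UP:
        time.sleep(5)

connection.close()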
--
Derek Atkins 617-623-3745
derek(a)ihtfp.com www.ihtfp.com
Computer and Internet Security Consultant
ovirt 4.4.0 - Live merge failure with libvirt error "virDomainBlockCommit() failed"
by Marco Fais
Hi,
I have upgraded one of my nodes to oVirt-node 4.4.0 and I am testing the basic functionality in preparation for the full cluster migration.
Unfortunately snapshot deletions are not working at the moment; I get a live merge failure most of the time due to a libvirt error -- virDomainBlockCommit() failed.
I have opened a bug on this as well (https://bugzilla.redhat.com/show_bug.cgi?id=1840414)
See the most relevant portion of the vdsm.log here:
2020-05-26 22:43:39,067+0100 ERROR (jsonrpc/1) [virt.vm] (vmId='baaf6be8-dcf4-4f26-b0f1-435287eeed95') Live merge failed (job: 7b96452b-0e60-46c6-9236-6b8a906b0ed8) (vm:5381)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5379, in merge
bandwidth, flags)
File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python3.6/site-packages/libvirt.py", line 728, in blockCommit
if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
libvirt.libvirtError: internal error: qemu block name 'json:{"backing": {"driver": "raw", "file": {"driver": "file", "filename": "/rhev/data-center/mnt/glusterSD/192.168.30.2:_Temp__Storage/246845e5-5546-4f04-948e-66a9532b403f/images/64955343-da61-4b76-9158-8df8363dc5a3/b0558698-bc82-4941-ac52-9fa2385a9f00"}}, "driver": "qcow2", "file": {"driver": "file", "filename": "/rhev/data-center/mnt/glusterSD/192.168.30.2:_Temp__Storage/246845e5-5546-4f04-948e-66a9532b403f/images/64955343-da61-4b76-9158-8df8363dc5a3/be844ebe-10b6-4ac7-89c1-9f8526a52762"}}' doesn't match expected '/rhev/data-center/mnt/glusterSD/192.168.30.2:_Temp__Storage/246845e5-5546-4f04-948e-66a9532b403f/images/64955343-da61-4b76-9158-8df8363dc5a3/be844ebe-10b6-4ac7-89c1-9f8526a52762'
2020-05-26 22:43:39,075+0100 INFO (jsonrpc/1) [api.virt] FINISH merge return={'status': {'code': 52, 'message': 'Merge failed'}} from=::ffff:10.144.138.240,60914, flow_id=dbf9c831-e0cb-4891-a38c-d61136daf029, vmId=baaf6be8-dcf4-4f26-b0f1-435287eeed95 (api:54)
2020-05-26 22:43:39,075+0100 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call VM.merge failed (error 52) in 0.18 seconds (__init__:312)
This issue was also reported in https://bugzilla.redhat.com/show_bug.cgi?id=1785939; according to the comments, the fix seems to be in libvirt 6.x.
However, I am not quite sure if / how I can get libvirt 6 in my setup... is there a test image (ideally ovirt-node) I can use?
Any suggestions?
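In the meantime, this is how I am checking which libvirt the node is actually
running (a small sketch using the libvirt Python bindings that are already on a
vdsm host; getLibVersion() returns e.g. 6000000 for libvirt 6.0.0):

import libvirt

# A read-only connection to the local libvirt daemon is enough for a version check.
conn = libvirt.openReadOnly('qemu:///system')
ver = conn.getLibVersion()
major, rest = divmod(ver, 1000000)
minor, release = divmod(rest, 1000)
print('libvirt %d.%d.%d' % (major, minor, release))
conn.close()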
Unrelated: I also had a problem during the hosted-engine deployment because the machine is not connected directly to the internet and could not complete the dnf update steps (even with the proxy configured it was still failing at a later stage, during the engine setup, with the same problem). I might open another thread on it...
Thanks,
Marco
oVirt 4.3.9.4-1:--> VM Has been paused due to storage I/O error
by adrianquintero@gmail.com
Team,
I've been having issues: none of my VMs can be started, the error is "VM has been paused due to storage I/O error".
Any ideas are welcome, as all my VMs are down.
Gluster log (gluster version 6.8):
[2020-05-27 09:10:28.132619] E [MSGID: 113040] [posix-inode-fd-ops.c:1572:posix_readv] 0-vmstore-posix: read failed on gfid=8674ab5f-56b9-4136-9b30-a65ca86be204, fd=0x7f12b807e6d8, offset=0 size=1, buf=0x7f134fbd9000 [Invalid argument]
[2020-05-27 09:10:28.132694] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-vmstore-server: 3286: READV 3 (8674ab5f-56b9-4136-9b30-a65ca86be204), client: CTX_ID:03354525-5de3-4390-b775-3db7a85c0022-GRAPH_ID:0-PID:29372-HOST:jrz-061-ovirt3.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-05-27 09:10:28.211930] E [MSGID: 113040] [posix-inode-fd-ops.c:1572:posix_readv] 0-vmstore-posix: read failed on gfid=8674ab5f-56b9-4136-9b30-a65ca86be204, fd=0x7f12b825ed18, offset=0 size=1, buf=0x7f134fbd9000 [Invalid argument]
[2020-05-27 09:10:28.211995] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-vmstore-server: 3306: READV 2 (8674ab5f-56b9-4136-9b30-a65ca86be204), client: CTX_ID:03354525-5de3-4390-b775-3db7a85c0022-GRAPH_ID:0-PID:29372-HOST:jrz-061-ovirt3.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-05-27 09:10:28.226451] E [MSGID: 113040] [posix-inode-fd-ops.c:1572:posix_readv] 0-vmstore-posix: read failed on gfid=755664db-2d04-4ed0-9333-251c6cc3dcb1, fd=0x7f12b807e6d8, offset=0 size=1, buf=0x7f134fbd9000 [Invalid argument]
[2020-05-27 09:10:28.226511] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-vmstore-server: 3317: READV 2 (755664db-2d04-4ed0-9333-251c6cc3dcb1), client: CTX_ID:03354525-5de3-4390-b775-3db7a85c0022-GRAPH_ID:0-PID:29372-HOST:jrz-061-ovirt3.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-05-27 09:10:28.232122] E [MSGID: 113040] [posix-inode-fd-ops.c:1572:posix_readv] 0-vmstore-posix: read failed on gfid=755664db-2d04-4ed0-9333-251c6cc3dcb1, fd=0x7f12b807e6d8, offset=0 size=1, buf=0x7f134fbd9000 [Invalid argument]
[2020-05-27 09:10:28.232181] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-vmstore-server: 3319: READV 2 (755664db-2d04-4ed0-9333-251c6cc3dcb1), client: CTX_ID:03354525-5de3-4390-b775-3db7a85c0022-GRAPH_ID:0-PID:29372-HOST:jrz-061-ovirt3.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-05-27 09:10:28.237043] E [MSGID: 113040] [posix-inode-fd-ops.c:1572:posix_readv] 0-vmstore-posix: read failed on gfid=8674ab5f-56b9-4136-9b30-a65ca86be204, fd=0x7f12b82038b8, offset=0 size=1, buf=0x7f134fbd9000 [Invalid argument]
[2020-05-27 09:10:28.237100] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-vmstore-server: 3321: READV 3 (8674ab5f-56b9-4136-9b30-a65ca86be204), client: CTX_ID:03354525-5de3-4390-b775-3db7a85c0022-GRAPH_ID:0-PID:29372-HOST:jrz-061-ovirt3.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-05-27 09:10:28.242176] E [MSGID: 113040] [posix-inode-fd-ops.c:1572:posix_readv] 0-vmstore-posix: read failed on gfid=755664db-2d04-4ed0-9333-251c6cc3dcb1, fd=0x7f12b807e6d8, offset=0 size=1, buf=0x7f134fbd9000 [Invalid argument]
[2020-05-27 09:10:28.242235] E [MSGID: 115068] [server-rpc-fops_v2.c:1425:server4_readv_cbk] 0-vmstore-server: 3323: READV 2 (755664db-2d04-4ed0-9333-251c6cc3dcb1), client: CTX_ID:03354525-5de3-4390-b775-3db7a85c0022-GRAPH_ID:0-PID:29372-HOST:jrz-061-ovirt3.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-05-27 09:11:18.990877] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-vmstore-server: disconnecting connection from CTX_ID:87270d80-4310-4795-9f4d-2c1a61d16cee-GRAPH_ID:0-PID:28815-HOST:jrz-059-ovirt1.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0
[2020-05-27 09:11:18.991896] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:87270d80-4310-4795-9f4d-2c1a61d16cee-GRAPH_ID:0-PID:28815-HOST:jrz-059-ovirt1.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0
[2020-05-27 09:11:19.329120] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-vmstore-server: disconnecting connection from CTX_ID:5858e269-923e-4a38-9c6b-62f337a6abac-GRAPH_ID:0-PID:20791-HOST:jrz-060-ovirt2.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0
[2020-05-27 09:11:19.329419] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:5858e269-923e-4a38-9c6b-62f337a6abac-GRAPH_ID:0-PID:20791-HOST:jrz-060-ovirt2.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0
[2020-05-27 09:11:21.044186] I [addr.c:54:compare_addr_and_update] 0-/gluster_bricks/vmstore/vmstore: allowed = "*", received addr = "192.168.0.59"
[2020-05-27 09:11:21.044227] I [login.c:110:gf_auth] 0-auth/login: allowed user names: c2ef048a-b354-4f91-8e41-a2aab6e65dbc
[2020-05-27 09:11:21.044265] I [MSGID: 115029] [server-handshake.c:553:server_setvolume] 0-vmstore-server: accepted client from CTX_ID:c2426c68-ad6c-4d3a-b6a4-decce8d18b75-GRAPH_ID:0-PID:1558-HOST:jrz-059-ovirt1.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0 (version: 6.8) with subvol /gluster_bricks/vmstore/vmstore
[2020-05-27 09:11:21.383131] I [addr.c:54:compare_addr_and_update] 0-/gluster_bricks/vmstore/vmstore: allowed = "*", received addr = "192.168.0.60"
[2020-05-27 09:11:21.383168] I [login.c:110:gf_auth] 0-auth/login: allowed user names: c2ef048a-b354-4f91-8e41-a2aab6e65dbc
[2020-05-27 09:11:21.383190] I [MSGID: 115029] [server-handshake.c:553:server_setvolume] 0-vmstore-server: accepted client from CTX_ID:5a363377-0bc7-40eb-a7cb-f4f5757a102b-GRAPH_ID:0-PID:27173-HOST:jrz-060-ovirt2.example.com-PC_NAME:vmstore-client-2-RECON_NO:-0 (version: 6.8) with subvol /gluster_bricks/vmstore/vmstore
Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)
by david.sekne@gmail.com
Hello,
I'm running oVirt version 4.3.9.4-1.el7. After a failed live storage migration, a VM got stuck with a snapshot. Checking the engine logs, I can see that the snapshot removal task is waiting for the Merge to complete, and vice versa.
2020-05-26 18:34:04,826+02 INFO [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Command 'RemoveSnapshotSingleDiskLive' (id: '60ce36c1-bf74-40a9-9fb0-7fcf7eb95f40') waiting on child command id: 'f7d1de7b-9e87-47ba-9ba0-ee04301ba3b1' type:'Merge' to complete
2020-05-26 18:34:04,827+02 INFO [org.ovirt.engine.core.bll.MergeCommandCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Waiting on merge command to complete (jobId = f694590a-1577-4dce-bf0c-3a8d74adf341)
2020-05-26 18:34:04,845+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Command 'RemoveSnapshot' (id: '47c9a847-5b4b-4256-9264-a760acde8275') waiting on child command id: '60ce36c1-bf74-40a9-9fb0-7fcf7eb95f40' type:'RemoveSnapshotSingleDiskLive' to complete
2020-05-26 18:34:14,277+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmJobsMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [] VM Job [f694590a-1577-4dce-bf0c-3a8d74adf341]: In progress (no change)
I cannot see any running tasks on the SPM (vdsm-client Host getAllTasksInfo). I also cannot find the task ID in any of the other nodes' logs.
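For what it's worth, this is roughly how I have been listing what the engine
itself reports under /jobs with the Python SDK (a sketch; the connection details
are placeholders, and it only shows engine-side jobs, not SPM tasks):

import ovirtsdk4 as sdk

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',
    username='admin@internal',
    password='password',
    insecure=True,
)
# List the jobs the engine tracks (these back the Tasks pane in the UI).
jobs_service = connection.system_service().jobs_service()
for job in jobs_service.list():
    print(job.id, job.status, job.description)
connection.close()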
I already tried restarting the engine (it didn't help).
To start with, I'm puzzled about where the engine is getting the task info from.
Any ideas on how I could resolve this?
Thank you.
Regards,
David