June 2019 - Users - oVirt List Archives

Failed to add storage domain
by thunderlight1＠gmail.com 31 May '20

Hi! I installed oVirt from the ISO ovirt-node-ng-installer-4.3.2-2019031908.el7 and then ran the Hosted Engine deployment through Cockpit. I got an error when it tried to create the storage domain, although it had successfully mounted the NFS share on the host. Below is the error I got:

2019-04-14 10:40:38,967+0200 INFO ansible skipped {'status': 'SKIPPED', 'ansible_task': u'Check storage domain free space', 'ansible_host': u'localhost', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml', 'ansible_type': 'task'}
2019-04-14 10:40:38,967+0200 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fb6918ad9d0> kwargs
2019-04-14 10:40:39,516+0200 INFO ansible task start {'status': 'OK', 'ansible_task': u'ovirt.hosted_engine_setup : Activate storage domain', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml', 'ansible_type': 'task'}
2019-04-14 10:40:39,516+0200 DEBUG ansible on_any args TASK: ovirt.hosted_engine_setup : Activate storage domain kwargs is_conditional:False
2019-04-14 10:40:41,923+0200 DEBUG var changed: host "localhost" var "otopi_storage_domain_details" type "<type 'dict'>" value: "{ "changed": false, "exception": "Traceback (most recent call last):\n File \"/tmp/ansible_ovirt_storage_domain_payload_xSFxOp/__main__.py\", line 664, in main\n storage_domains_module.post_create_check(sd_id)\n File \"/tmp/ansible_ovirt_storage_domain_payload_xSFxOp/__main__.py\", line 526, in post_create_check\n id=storage_domain.id,\n File \"/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py\", line 3053, in add\n return self._internal_add(storage_domain, headers, query, wait)\n File \"/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py\", line 232, in _internal_add\n return future.wait() if wait else future\n File \"/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py\", line 55, in wait\n return self._code(response)\n File \"/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py\", line 229, in callback\n self._check_fault(response)\n File \"/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py\", line 132, in _check_fault\n self._raise_error(response , body)\n File \"/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py\", line 118, in _raise_error\n raise error\nError: Fault reason is \"Operation Failed\". Fault detail is \"[]\". HTTP response code is 400.\n", "failed": true, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[]\". HTTP response code is 400." }"
2019-04-14 10:40:41,924+0200 DEBUG var changed: host "localhost" var "ansible_play_hosts" type "<type 'list'>" value: "[]"
2019-04-14 10:40:41,924+0200 DEBUG var changed: host "localhost" var "play_hosts" type "<type 'list'>" value: "[]"
2019-04-14 10:40:41,924+0200 DEBUG var changed: host "localhost" var "ansible_play_batch" type "<type 'list'>" value: "[]"
2019-04-14 10:40:41,924+0200 ERROR ansible failed {'status': 'FAILED', 'ansible_type': 'task', 'ansible_task': u'Activate storage domain', 'ansible_result': u'type: <type \'dict\'>\nstr: {\'_ansible_parsed\': True, u\'exception\': u\'Traceback (most recent call last):\\n File "/tmp/ansible_ovirt_storage_domain_payload_xSFxOp/__main__.py", line 664, in main\\n storage_domains_module.post_create_check(sd_id)\\n File "/tmp/ansible_ovirt_storage_domain_payload_xSFxOp/__main__.py", line 526', 'task_duration': 2, 'ansible_host': u'localhost', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml'}
2019-04-14 10:40:41,924+0200 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fb691843190> kwargs ignore_errors:None
2019-04-14 10:40:41,928+0200 INFO ansible stats { "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml", "ansible_playbook_duration": "00:37 Minutes", "ansible_result": "type: <type 'dict'>\nstr: {u'localhost': {'unreachable': 0, 'skipped': 6, 'ok': 23, 'changed': 1, 'failures': 1}}", "ansible_type": "finish", "status": "FAILED" }
2019-04-14 10:40:41,928+0200 INFO SUMMARY:
Duration Task Name
-------- --------
[ < 1 sec ] Execute just a specific set of steps
[ 00:01 ] Force facts gathering
[ 00:01 ] Check local VM dir stat
[ 00:01 ] Obtain SSO token using username/password credentials
[ 00:01 ] Fetch host facts
[ < 1 sec ] Fetch cluster ID
[ 00:01 ] Fetch cluster facts
[ 00:01 ] Fetch Datacenter facts
[ < 1 sec ] Fetch Datacenter ID
[ < 1 sec ] Fetch Datacenter name
[ 00:02 ] Add NFS storage domain
[ 00:01 ] Get storage domain details
[ 00:01 ] Find the appliance OVF
[ 00:01 ] Parse OVF
[ < 1 sec ] Get required size
[ FAILED ] Activate storage domain
2019-04-14 10:40:41,928+0200 DEBUG ansible on_any args <ansible.executor.stats.AggregateStats object at 0x7fb69404eb90> kwargs

Any suggestions on how to fix this?
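
A log like the one above is easier to triage once the failed task records are pulled out of the DEBUG noise. The following is only a minimal sketch: the regular expression is inferred from the single excerpt in this post (the `ERROR ansible failed {...}` record) and may not match other versions of the setup log, and the function name `failed_tasks` is made up for illustration.

```python
import re

# Pattern inferred from the "ERROR ansible failed {...}" record in the post;
# the exact layout is an assumption and may differ between releases.
FAILED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"ERROR ansible failed .*?'ansible_task': u?'(?P<task>[^']*)'"
)


def failed_tasks(lines):
    """Return (timestamp, task name) for every failed Ansible task record."""
    hits = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            hits.append((match.group("ts"), match.group("task")))
    return hits


# Two shortened records taken from the log in the post.
sample = [
    "2019-04-14 10:40:39,516+0200 INFO ansible task start {'status': 'OK', "
    "'ansible_task': u'ovirt.hosted_engine_setup : Activate storage domain'}",
    "2019-04-14 10:40:41,924+0200 ERROR ansible failed {'status': 'FAILED', "
    "'ansible_type': 'task', 'ansible_task': u'Activate storage domain', "
    "'task_duration': 2}",
]

for timestamp, task in failed_tasks(sample):
    print(timestamp, task)
```

Because the fault detail in this log is empty (`Fault detail is "[]"`), the timestamp of the failed task is mainly useful for locating the matching entries in the engine's own log.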

How to connect to a guest with vGPU ?
by Josep Manel Andrés Moscardó 29 May '20

Hi,

I got vGPU working through mdev, but I am wondering how to connect to the guest and make use of the GPU. So far I have tried accessing the console through SPICE, but at some point in the boot process the display switches to the GPU and I cannot see anything else.

Thanks.

--
Josep Manel Andrés Moscardó
Systems Engineer, IT Operations
EMBL Heidelberg
T +49 6221 387-8394

Vm suddenly paused with error "vm has paused due to unknown storage error"
by Jasper Siero 18 Feb '20

Hi all,

Since we upgraded our oVirt nodes to CentOS 7, a VM (not a specific one, but never more than one at a time) sometimes pauses suddenly with the error "VM ... has paused due to unknown storage error". It has now happened twice in a month. The oVirt node uses SAN storage for the VMs running on it. When a VM pauses with this error, the other VMs keep running without problems. The paused VM runs without problems after it is unpaused.

Versions:
CentOS Linux release 7.1.1503
vdsm-4.14.17-0
libvirt-daemon-1.2.8-16

vdsm.log:
VM Channels Listener::DEBUG::2015-10-25 07:43:54,382::vmChannels::95::vds::(_handle_timeouts) Timeout on fileno 78.
libvirtEventLoop::INFO::2015-10-25 07:43:56,177::vm::4602::vm.Vm::(_onIOError) vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error eother
libvirtEventLoop::DEBUG::2015-10-25 07:43:56,178::vm::5204::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::event Suspended detail 2 opaque None
libvirtEventLoop::INFO::2015-10-25 07:43:56,178::vm::4602::vm.Vm::(_onIOError) vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error eother
...........
libvirtEventLoop::INFO::2015-10-25 07:43:56,180::vm::4602::vm.Vm::(_onIOError) vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error eother

Specific error part in the libvirt VM log:
block I/O error in device 'drive-virtio-disk0': Unknown error 32758 (32758)
...........
block I/O error in device 'drive-virtio-disk0': Unknown error 32758 (32758)

engine.log:
2015-10-25 07:44:48,945 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-40) [a43dcc8] VM diataal-prod-cas1 77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb moved from Up --> Paused
2015-10-25 07:44:49,003 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-40) [a43dcc8] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM diataal-prod-cas1 has paused due to unknown storage error.

Has anyone experienced the same problem, or does anyone know a way to solve it?

Kind regards,
Jasper
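
Since the pauses are intermittent and hit different VMs, it can help to tally the `_onIOError` records in vdsm.log per VM and device to see whether a pattern emerges. This is a rough sketch only: the regular expression is derived from the few lines quoted in this post, and `count_io_errors` is a hypothetical helper name, not part of vdsm.

```python
import re
from collections import Counter

# Pattern derived from the vdsm.log lines quoted in the post; other vdsm
# versions may format these records differently.
IO_ERROR_RE = re.compile(
    r"::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::.*?\(_onIOError\) "
    r"vmId=`(?P<vm>[0-9a-f-]+)`::abnormal vm stop device (?P<dev>\S+) "
    r"error (?P<err>\S+)"
)


def count_io_errors(lines):
    """Count abnormal-stop records per (vmId, device, error) tuple."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            counts[(match.group("vm"), match.group("dev"), match.group("err"))] += 1
    return counts


sample = [
    "libvirtEventLoop::INFO::2015-10-25 07:43:56,177::vm::4602::vm.Vm::(_onIOError) "
    "vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error eother",
    "libvirtEventLoop::DEBUG::2015-10-25 07:43:56,178::vm::5204::vm.Vm::(_onLibvirtLifecycleEvent) "
    "vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::event Suspended detail 2 opaque None",
    "libvirtEventLoop::INFO::2015-10-25 07:43:56,178::vm::4602::vm.Vm::(_onIOError) "
    "vmId=`77f07ae0-cc3e-4ae2-90ec-7fba7b11deeb`::abnormal vm stop device virtio-disk0 error eother",
]

for key, n in count_io_errors(sample).items():
    print(key, n)
```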

Ovirt-engine-ha cannot see live status of Hosted Engine
by asm＠pioner.kz 01 Feb '20

Good day to all.

I have some issues with oVirt 4.2.6, but here is the main one. I have two CentOS 7 nodes with the same configuration and the latest oVirt 4.2.6, with a HostedEngine whose disk is on NFS storage. Some virtual machines are also running fine.

When the HostedEngine runs on one node (srv02.local), everything is fine. After migrating it to the other node (srv00.local), I see that the agent cannot check the liveliness of the HostedEngine. After a few minutes the HostedEngine reboots, and after some time I see the same situation again. After migration to another node (srv00.local) all looks OK.

Output of the hosted-engine --vm-status command when the HostedEngine is on the srv00 node:

--== Host 1 status ==--

conf_on_shared_storage : True
Status up-to-date : True
Hostname : srv02.local
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"}
Score : 0
stopped : False
Local maintenance : False
crc32 : ecc7ad2d
local_conf_timestamp : 78328
Host timestamp : 78328
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=78328 (Tue Sep 18 12:44:18 2018)
    host-id=1
    score=0
    vm_conf_refresh_time=78328 (Tue Sep 18 12:44:18 2018)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineUnexpectedlyDown
    stopped=False
    timeout=Fri Jan 2 03:49:58 1970

--== Host 2 status ==--

conf_on_shared_storage : True
Status up-to-date : True
Hostname : srv00.local
Host ID : 2
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 1d62b106
local_conf_timestamp : 326288
Host timestamp : 326288
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=326288 (Tue Sep 18 12:44:21 2018)
    host-id=2
    score=3400
    vm_conf_refresh_time=326288 (Tue Sep 18 12:44:21 2018)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineStarting
    stopped=False

agent.log from srv00.local:

MainThread::INFO::2018-09-18 12:40:51,749::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:40:52,052::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:01,066::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:01,374::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host srv02.local.pioner.kz (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=78128 (Tue Sep 18 12:40:58 2018)\nhost-id=1\nscore=0\nvm_conf_refresh_time=78128 (Tue Sep 18 12:40:58 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUnexpectedlyDown\nstopped=False\ntimeout=Fri Jan 2 03:49:58 1970\n', 'hostname': 'srv02.local.pioner.kz', 'alive': True, 'host-id': 1, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down_unexpected', 'detail': 'unknown'}, 'score': 0, 'stopped': False, 'maintenance': False, 'crc32': 'e18e3f22', 'local_conf_timestamp': 78128, 'host-ts': 78128}
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'failed liveliness check', 'health': 'bad', 'vm': 'up', 'detail': 'Up'}, 'bridge': True, 'mem-free': 12763.0, 'maintenance': False, 'cpu-load': 0.0364, 'gateway': 1.0, 'storage-domain': True}
MainThread::INFO::2018-09-18 12:41:11,393::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:11,703::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:21,716::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:22,020::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:31,033::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:31,344::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)

As we can see, the agent thinks that the HostedEngine is still powering up. I cannot do anything about it. I have already reinstalled the srv00 node many times without success. One time I even had to uninstall the ovirt* and vdsm* packages.

One more interesting point: after only running "yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm" on this node, I tried to install the node from the engine web interface with the "Deploy" action. The installation was unsuccessful until I installed ovirt-hosted-engine-ha on the node. I do not see in the documentation that this is needed before installing new hosts. This is just for information and checking. After I installed ovirt-hosted-engine-ha, the node was installed with HostedEngine support. But the main issue has not changed.

Thanks in advance for help.

BR,
Alexandr
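
The symptom described above (VM reported as "up" while the health check stays "bad") can be picked out of `hosted-engine --vm-status` output mechanically. The sketch below works on the text as quoted in this post; the helper name `stuck_hosts` is invented for illustration, and treating `"health": "good"` as the healthy value is an assumption, since the post only shows `"bad"`.

```python
import json
import re

HOST_RE = re.compile(r"Hostname\s*:\s*(\S+)")
# The engine status is a flat JSON object in the quoted output, so a
# non-greedy match up to the first closing brace is sufficient here.
STATUS_RE = re.compile(r"Engine status\s*:\s*(\{.*?\})")


def stuck_hosts(status_text):
    """Return hostnames where the engine VM is up but not reported healthy."""
    hostnames = HOST_RE.findall(status_text)
    statuses = [json.loads(raw) for raw in STATUS_RE.findall(status_text)]
    return [
        host
        for host, status in zip(hostnames, statuses)
        # Assumption: "good" is the value reported for a healthy engine.
        if status.get("vm") == "up" and status.get("health") != "good"
    ]


sample = (
    'Hostname : srv02.local Host ID : 1 Engine status : {"reason": "vm not running '
    'on this host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"} '
    'Score : 0 '
    'Hostname : srv00.local Host ID : 2 Engine status : {"reason": "failed liveliness '
    'check", "health": "bad", "vm": "up", "detail": "Up"} Score : 3400'
)

print(stuck_hosts(sample))
```

For the output in this post, only srv00.local matches, which is consistent with the agent staying in the EngineStarting state there.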

Hyperconverged setup - storage architecture - scaling
by Leo David 10 Jan '20

Hello Everyone,

Reading through the document "Red Hat Hyperconverged Infrastructure for Virtualization 1.5: Automating RHHI for Virtualization deployment", I see the following statements regarding storage scaling:

*2.7. SCALING Red Hat Hyperconverged Infrastructure for Virtualization is supported for one node, and for clusters of 3, 6, 9, and 12 nodes. The initial deployment is either 1 or 3 nodes. There are two supported methods of horizontally scaling Red Hat Hyperconverged Infrastructure for Virtualization:*

*1 Add new hyperconverged nodes to the cluster, in sets of three, up to the maximum of 12 hyperconverged nodes.*

*2 Create new Gluster volumes using new disks on existing hyperconverged nodes. You cannot create a volume that spans more than 3 nodes, or expand an existing volume so that it spans across more than 3 nodes at a time*

*2.9.1. Prerequisites for geo-replication Be aware of the following requirements and limitations when configuring geo-replication: One geo-replicated volume only Red Hat Hyperconverged Infrastructure for Virtualization (RHHI for Virtualization) supports only one geo-replicated volume. Red Hat recommends backing up the volume that stores the data of your virtual machines, as this is usually contains the most valuable data.*

------

Also, in the oVirt Engine UI, when I add a brick to an existing volume I get the following warning:

*"Expanding gluster volume in a hyper-converged setup is not recommended as it could lead to degraded performance. To expand storage for cluster, it is advised to add additional gluster volumes."*

These things raise a couple of questions that may be easy to answer for some of you, but they cause me a bit of confusion. I am also referring to the Red Hat product documentation because I treat oVirt as being as production-ready as RHHI.

*1.* Is there any reason for not going to distributed-replicated volumes (i.e. spreading one volume across 6, 9, or 12 nodes)? That is, it is recommended that in a 9-node scenario I should have 3 separate volumes, but then how should I deal with the following question?

*2.* If only one geo-replicated volume can be configured, how should I deal with replication of the 2nd and 3rd volumes for disaster recovery?

*3.* If the limit of hosts per datacenter is 250, then (in theory) would the recommended way of reaching this threshold be to create 20 separate oVirt logical clusters with 12 nodes each (and the datacenter managed from one ha-engine)?

*4.* At present, I have the following: one 9-node cluster, with all hosts contributing 2 disks each to a single replica 3 distributed-replicated volume. They were added to the volume in the following order:

node1 - disk1
node2 - disk1
......
node9 - disk1
node1 - disk2
node2 - disk2
......
node9 - disk2

At the moment the volume is arbitrated, but I intend to go for full distributed replica 3. Is this a bad setup? Why? It obviously breaks the Red Hat recommended rules...

Would anyone be so kind as to discuss these things?

Thank you very much!

Leo

--
Best regards, Leo David
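
For question 4, it may help to see how the brick order maps onto replica sets. In a Gluster distributed-replicated volume, consecutive bricks in the order they were given form one replica set, and in an arbitrated replica 3 volume the third brick of each set is the arbiter. The sketch below only models that grouping for the order listed above; the `nodeN:diskM` names are placeholders and not real brick paths.

```python
def replica_sets(bricks, replica=3):
    """Group bricks into replica sets in the order they were added."""
    if len(bricks) % replica:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


# Order from the post: disk1 of node1..node9, then disk2 of node1..node9.
# The "nodeN:diskM" labels are illustrative placeholders only.
bricks = [f"node{n}:disk{d}" for d in (1, 2) for n in range(1, 10)]

sets = replica_sets(bricks)
for index, members in enumerate(sets, start=1):
    hosts = {brick.split(":")[0] for brick in members}
    # A replica set is only fault tolerant if its bricks sit on distinct hosts.
    print(index, members, "distinct hosts:", len(hosts) == len(members))
```

With this ordering every replica set lands on three different nodes, so the layout itself does not place two copies of the same data on one host. Whether a volume spanning nine nodes is advisable is a separate question that this sketch does not answer.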

disk locked after export as OVA
by adam_xu＠adagene.com.cn 15 Oct '19

Hello, everyone.

I tried to export a VM as OVA and got an error in the web UI:

VDSM ovirt1.ntbaobei.com command HSMGetAllTasksStatusesVDS failed: Volume Group not big enough: (u'Not enough free extents for extending LV d9f2378f-92f0-4bc1-96a8-20a0f2c575cb/c515a3f9-0590-4ebe-81c5-9d0993f1fec9 (free=848, needed=872)',)

It seems there is not enough space in the storage domain. I then deleted the VM that I had wanted to export as OVA, but I saw that a disk with ID 8140d2e7-6908-438e-a646-23e58a33913e was left in the storage, and its status is locked.

Here is some log from engine.log:

2018-09-17 14:27:03,467+08 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CopyImageVDSCommand] (EE-ManagedThreadFactory-engine-Thread-107849) [a407f6a9-445d-4ccc-b998-f9dfc6dc67ec] START, CopyImageVDSCommand( CopyImageVDSCommandParameters:{storagePoolId='79432500-ad45-11e8-98f3-00163e188641', ignoreFailoverLimit='false', storageDomainId='d9f2378f-92f0-4bc1-96a8-20a0f2c575cb', imageGroupId='ee55bbe7-8001-4c82-b1a6-0cac7710704b', imageId='701a7472-8c00-4b9f-aa87-4837eb128695', dstImageGroupId='8140d2e7-6908-438e-a646-23e58a33913e', vmId='b3aa623e-3072-430e-87d6-b81a3b07f466', dstImageId='c515a3f9-0590-4ebe-81c5-9d0993f1fec9', imageDescription='', dstStorageDomainId='d9f2378f-92f0-4bc1-96a8-20a0f2c575cb', copyVolumeType='LeafVol', volumeFormat='COW', preallocate='Sparse', postZero='false', discard='false', force='true'}), log id: 164e3046
2018-09-17 14:27:03,467+08 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CopyImageVDSCommand] (EE-ManagedThreadFactory-engine-Thread-107849) [a407f6a9-445d-4ccc-b998-f9dfc6dc67ec] ++ dstImageGUID=8140d2e7-6908-438e-a646-23e58a33913e

I think some command may not have been executed after I exported the VM as OVA. I found a similar case here: https://access.redhat.com/solutions/1298893

I have used the unlock_entity.sh tool, but it cannot find any locked disk. So what should I do to delete the locked disk?

Yours,
Adam
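
The error message already states how far short the volume group was. A small sketch of the arithmetic follows; the 128 MiB extent size is an assumption about how oVirt block storage domains are usually created and should be confirmed on the host (for example with `vgs -o vg_extent_size`), and `extent_shortfall` is a made-up helper name.

```python
import re

# Assumption: the volume group uses 128 MiB physical extents. Verify this
# on the host before relying on the MiB figure below.
EXTENT_MIB = 128


def extent_shortfall(message):
    """Return the number of missing extents, or None if the text does not match."""
    match = re.search(r"free=(\d+), needed=(\d+)", message)
    if match is None:
        return None
    free, needed = int(match.group(1)), int(match.group(2))
    return max(needed - free, 0)


msg = (
    "Not enough free extents for extending LV "
    "d9f2378f-92f0-4bc1-96a8-20a0f2c575cb/c515a3f9-0590-4ebe-81c5-9d0993f1fec9 "
    "(free=848, needed=872)"
)

missing = extent_shortfall(msg)
print("missing extents:", missing)
print("missing MiB (assuming 128 MiB extents):", missing * EXTENT_MIB)
```

Under that assumption the copy was short by 24 extents, roughly 3 GiB, at the moment the export failed.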

ovirt-imageio-proxy upload speed slow
by Dev Ops 18 Sep '19

I am working on integrating a backup solution for our oVirt environment and am having issues with the time it takes to back up the VMs. The backup solution simply takes a snapshot, makes a clone, and backs the clone up to a backup server. A VM that is 100 gig takes 52 minutes to back up. The same VM backed up at file level with the same product, bypassing their RHV plugin, takes 14 minutes. So the throughput is there, but the ovirt-imageio-proxy process seems to be what manages how images are uploaded, and it is my bottleneck. Load is not high on the engine or the KVM hosts. I bumped the upload image chunk size up from 100MB to 10gig weeks ago and that didn't seem to help.

[root@blah-lab-engine ~]# engine-config -a |grep Upload
UploadImageChunkSizeKB: 10240000 version: general
[root@bgl-vms-engine ~]# rpm -qa |grep ovirt-image
ovirt-imageio-proxy-1.4.6-1.el7.noarch
ovirt-imageio-common-1.4.6-1.el7.x86_64
ovirt-imageio-proxy-setup-1.4.6-1.el7.noarch

I have seen bugs reported to Red Hat about this, but I am running above the affected releases. The engine software is 4.2.8.2-1.el7. Any idea what we can tweak to open up this bottleneck?
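
To put the two timings side by side, the effective throughput can be worked out from the figures in the post. This is plain arithmetic and assumes "100 gig" means 100 GiB of data actually transferred, which the post does not confirm.

```python
def throughput_mib_s(size_gib, minutes):
    """Average throughput in MiB/s for a transfer of size_gib taking `minutes`."""
    return size_gib * 1024 / (minutes * 60)


# Figures from the post; interpreting "100 gig" as 100 GiB is an assumption.
via_plugin = throughput_mib_s(100, 52)
file_level = throughput_mib_s(100, 14)

print(f"via RHV plugin:    {via_plugin:.1f} MiB/s")
print(f"file-level backup: {file_level:.1f} MiB/s")
print(f"slowdown factor:   {file_level / via_plugin:.1f}x")
```

That works out to roughly 33 MiB/s through the plugin path against roughly 122 MiB/s for the file-level backup.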

Re: Cannot Increase Hosted Engine VM Memory
by Douglas Duckworth 04 Sep '19

Yes, I do. Gold crown indeed. It's the "HostedEngine" as seen attached!

Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit <https://scu.med.cornell.edu>
Weill Cornell Medicine
1300 York Avenue New York, NY 10065
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690

On Wed, Jan 23, 2019 at 12:02 PM Simone Tiraboschi <stirabos(a)redhat.com> wrote:

On Wed, Jan 23, 2019 at 5:51 PM Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:

Hi Simone, can I get help with this issue? I still cannot increase memory for the Hosted Engine.

From the logs it seems that the engine is trying to hotplug memory to the engine VM, which is something that should not happen. The engine should simply update the engine VM configuration in the OVF_STORE and require a reboot of the engine VM. Quick question: in the VM panel, do you see a gold crown symbol on the Engine VM?

On Thu, Jan 17, 2019 at 8:08 AM Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:

Sure, they're attached. In "first attempt" the error seems to be:

2019-01-17 07:49:24,795-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-29) [680f82b3-7612-4d91-afdc-43937aa298a2] EVENT_ID: FAILED_HOT_SET_MEMORY_NOT_DIVIDABLE(2,048), Failed to hot plug memory to VM HostedEngine. Amount of added memory (4000MiB) is not dividable by 256MiB.

Followed by:

2019-01-17 07:49:24,814-05 WARN [org.ovirt.engine.core.bll.UpdateRngDeviceCommand] (default task-29) [26f5f3ed] Validation of action 'UpdateRngDevice' failed for user admin@internal-authz.
Reasons: ACTION_TYPE_FAILED_VM_IS_RUNNING 2019-01-17 07:49:24,815-05 ERROR [org.ovirt.engine.core.bll.UpdateVmCommand] (default task-29) [26f5f3ed] Updating RNG device of VM HostedEngine (adf14389-1563-4b1a-9af6-4b40370a825b) failed. Old RNG device = VmRngDevice:{id='VmDeviceId:{deviceId='6435b2b5-163c-4f0c-934e-7994da60dc89', vmId='adf14389-1563-4b1a-9af6-4b40370a825b'}', device='virtio', type='RNG', specParams='[source=urandom]', address='', managed='true', plugged='true', readOnly='false', deviceAlias='', customProperties='null', snapshotId='null', logicalName='null', hostDevice='null'}. New RNG device = VmRngDevice:{id='VmDeviceId:{deviceId='6435b2b5-163c-4f0c-934e-7994da60dc89', vmId='adf14389-1563-4b1a-9af6-4b40370a825b'}', device='virtio', type='RNG', specParams='[source=urandom]', address='', managed='true', plugged='true', readOnly='false', deviceAlias='', customProperties='null', snapshotId='null', logicalName='null', hostDevice='null'}. In "second attempt" I used values that are dividable by 256 MiB so that's no longer present. 
Though same error: 2019-01-17 07:56:59,795-05 INFO [org.ovirt.engine.core.vdsbroker.SetAmountOfMemoryVDSCommand] (default task-22) [7059a48f] START, SetAmountOfMemoryVDSCommand(HostName = ovirt-hv1.med.cornell.edu<http://ovirt-hv1.med.cornell.edu>, Params:{hostId='cdd5ffda-95c7-4ffa-ae40-be66f1d15c30', vmId='adf14389-1563-4b1a-9af6-4b40370a825b', memoryDevice='VmDevice:{id='VmDeviceId:{deviceId='7f7d97cc-c273-4033-af53-bc9033ea3abe', vmId='adf14389-1563-4b1a-9af6-4b40370a825b'}', device='memory', type='MEMORY', specParams='[node=0, size=2048]', address='', managed='true', plugged='true', readOnly='false', deviceAlias='', customProperties='null', snapshotId='null', logicalName='null', hostDevice='null'}', minAllocatedMem='6144'}), log id: 50873daa 2019-01-17 07:56:59,855-05 INFO [org.ovirt.engine.core.vdsbroker.SetAmountOfMemoryVDSCommand] (default task-22) [7059a48f] FINISH, SetAmountOfMemoryVDSCommand, log id: 50873daa 2019-01-17 07:56:59,862-05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-22) [7059a48f] EVENT_ID: HOT_SET_MEMORY(2,039), Hotset memory: changed the amount of memory on VM HostedEngine from 4096 to 4096 2019-01-17 07:56:59,881-05 WARN [org.ovirt.engine.core.bll.UpdateRngDeviceCommand] (default task-22) [28fd4c82] Validation of action 'UpdateRngDevice' failed for user admin@internal-authz. Reasons: ACTION_TYPE_FAILED_VM_IS_RUNNING 2019-01-17 07:56:59,882-05 ERROR [org.ovirt.engine.core.bll.UpdateVmCommand] (default task-22) [28fd4c82] Updating RNG device of VM HostedEngine (adf14389-1563-4b1a-9af6-4b40370a825b) failed. Old RNG device = VmRngDevice:{id='VmDeviceId:{deviceId='6435b2b5-163c-4f0c-934e-7994da60dc89', vmId='adf14389-1563-4b1a-9af6-4b40370a825b'}', device='virtio', type='RNG', specParams='[source=urandom]', address='', managed='true', plugged='true', readOnly='false', deviceAlias='', customProperties='null', snapshotId='null', logicalName='null', hostDevice='null'}. 
New RNG device = VmRngDevice:{id='VmDeviceId:{deviceId='6435b2b5-163c-4f0c-934e-7994da60dc89', vmId='adf14389-1563-4b1a-9af6-4b40370a825b'}', device='virtio', type='RNG', specParams='[source=urandom]', address='', managed='true', plugged='true', readOnly='false', deviceAlias='', customProperties='null', snapshotId='null', logicalName='null', hostDevice='null'}.

This message repeats throughout engine.log:

2019-01-17 07:55:43,270-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-89) [] EVENT_ID: VM_MEMORY_UNDER_GUARANTEED_VALUE(148), VM HostedEngine on host ovirt-hv1.med.cornell.edu was guaranteed 8192 MB but currently has 4224 MB

As you can see attached the host has plenty of memory. Thank you Simone!

On Thu, Jan 17, 2019 at 5:09 AM Simone Tiraboschi <stirabos(a)redhat.com> wrote:

On Wed, Jan 16, 2019 at 8:22 PM Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:

Sorry for accidental send. Anyway I try to increase physical memory however it won't go above 4096MB. The hypervisor has 64GB. Do I need to modify this value with Hosted Engine offline?

No, it's not required. Can you please attach your engine.log for the relevant time frame?
On Wed, Jan 16, 2019 at 1:58 PM Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:

Hello, I am trying to increase Hosted Engine physical memory above 4GB.
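
The first error in this thread comes down to a divisibility rule stated in the engine.log message itself: the amount of hot-plugged memory must be a multiple of 256 MiB. The sketch below only illustrates that rule; the function names are invented here and nothing in it calls the engine.

```python
# Step size taken from the engine.log message quoted in the thread
# ("not dividable by 256MiB").
HOTPLUG_STEP_MIB = 256


def is_valid_increment(added_mib, step=HOTPLUG_STEP_MIB):
    """True if the added memory is a positive multiple of the step size."""
    return added_mib > 0 and added_mib % step == 0


def next_valid_increment(added_mib, step=HOTPLUG_STEP_MIB):
    """Round the requested increment up to the next multiple of the step."""
    return -(-added_mib // step) * step


requested = 4000  # MiB, the value rejected in the "first attempt"
print(is_valid_increment(requested))
print(next_valid_increment(requested))
```

Rounding 4000 MiB up gives 4096 MiB, which passes the check. As the rest of the thread shows, satisfying this rule was not enough on its own to change the Hosted Engine memory.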

Hosted-engine inaccessible
by Tau Makgaile 29 Aug '19

Hi,

I am currently experiencing a problem with my hosted engine. The hosted engine disconnected after I increased the / partition. The increase went well, but after some time the hosted-engine VM disconnected and has since been giving alerts such as *re-initializingFSM*, though the VMs underneath are still running.

hosted-engine --vm-status reports: *"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail"*

There is no backup to restore at the moment. I am looking for a way to bring it up without redeploying the hosted engine. Thanks in advance for your help.

Kind regards,
Tau

Agentless backup solutions
by femi adegoke 07 Aug '19

What backup solutions are people using? I'm only interested in host-level backups (no agents in the VMs). I am aware of the following products:

https://storware.eu/en/storware-vprotect/
https://github.com/zipurman/oVIRT_Simple_Backup
https://www.sepusa.com/

Any others?
