Re: Ovirt 4.3.1 problem with HA agent
by Strahil
Hi Simone,
I have noticed that my Engine's root disk is 'vda', just as in standalone KVM.
I have the feeling that was not the case before.
Can someone check a default engine and post the output of lsblk?
Thanks in advance.
Best Regards,
Strahil Nikolov
On Mar 15, 2019 12:46, Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
>
>
> On Fri, Mar 15, 2019 at 8:12 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
>>
>> Ok,
>>
>> I have managed to recover again and no issues are detected this time.
>> I guess this case is quite rare and nobody has experienced that.
>
>
> >Hi,
> >can you please explain how you fixed it?
>
> I set global maintenance again, defined the HostedEngine VM from the old XML (taken from an old vdsm log), defined the network, and powered the VM off.
> I set the OVF update period to 5 min, but it took several hours until the OVF_STORE volumes were updated. Once this happened, I restarted ovirt-ha-agent and ovirt-ha-broker on both nodes. Then I powered off the HostedEngine and undefined it from ovirt1.
>
> Then I set the maintenance mode to 'none' and the VM started on ovirt1.
> To test a failure, I removed the global maintenance and powered off the HostedEngine from inside the VM (via ssh). It was brought back up on the other node.
>
> To test a failure of ovirt2, I put ovirt1 into local maintenance, then removed it (mode 'none'), and again shut down the VM via ssh; it started again on ovirt1.
>
> It seems to be working, as I have later shut down the Engine several times and it managed to start without issues.
>
> I'm not sure this is related, but I had noticed that ovirt2 was out of sync with the vdsm-ovirtmgmt network; it was easily fixed via the UI.
>
>
>
> Best Regards,
> Strahil Nikolov
5 years, 10 months
Are people still experiencing issues with GlusterFS on 4.3x?
by Jayme
I, along with others, had GlusterFS issues after 4.3 upgrades: the "failed to
dispatch handler" issue, with bricks going down intermittently. After some
time it seemed to have corrected itself (at least in my environment) and I
hadn't had any brick problems in a while. I upgraded my three-node HCI
cluster to 4.3.1 yesterday and again I'm running into brick issues. They
will all be up and running fine, then all of a sudden a brick will randomly drop
and I have to force start the volume to get it back up.
Have any of these Gluster issues been addressed in 4.3.2 or any other
releases/patches that may be available to help the problem at this time?
Thanks!
5 years, 10 months
Re: Broken dependencies in CentOS7 hosts upgrading to 4.2.8 from 4.2.7
by Strahil
Here is my repolist:
Last login: Sun Mar 17 12:42:53 2019 from 192.168.1.43
[root@ovirt1 ~]# yum repolist
Loaded plugins: enabled_repos_upload, fastestmirror, package_upload, product-id, search-disabled-repos, subscription-manager, vdsmupgrade
This system is not registered with an entitlement server. You can use subscription-manager to register.
Loading mirror speeds from cached hostfile
* base: mirror.wwfx.net
* extras: mirror.wwfx.net
* ovirt-4.3: ftp.nluug.nl
* ovirt-4.3-epel: mirror.t-home.mk
* updates: mirror.wwfx.net
repo id                              repo name                                status
base/7/x86_64                        CentOS-7 - Base                          10,019
centos-sclo-rh-release/x86_64        CentOS-7 - SCLo rh                        8,113
extras/7/x86_64                      CentOS-7 - Extras                           371
ovirt-4.3/7                          Latest oVirt 4.3 Release                    265
ovirt-4.3-centos-gluster5/x86_64     CentOS-7 - Gluster 5                        112
ovirt-4.3-centos-opstools/x86_64     CentOS-7 - OpsTools - release               853
ovirt-4.3-centos-ovirt43/x86_64      CentOS-7 - oVirt 4.3                        287
ovirt-4.3-centos-qemu-ev/x86_64      CentOS-7 - QEMU EV                           71
ovirt-4.3-epel/x86_64                Extra Packages for Enterprise Linux 7    12,981
ovirt-4.3-virtio-win-latest          virtio-win builds roughly matching wh        40
sac-gluster-ansible/x86_64           Copr repo for gluster-ansible owned b        18
updates/7/x86_64                     CentOS-7 - Updates                        1,163
repolist: 34,293
Uploading Enabled Repositories Report
Cannot upload enabled repos report, is this client registered?
[root@ovirt1 ~]#
As you can see, my "CentOS-7 - Updates" repo is enabled.
So far I have managed to upgrade 4.2.7 -> 4.2.8 -> 4.3.0 -> 4.3.1.
There were some issues, but nothing related to the repos.
Best Regards,
Strahil Nikolov
On Mar 15, 2019 18:57, Roberto Nunin <robnunin(a)gmail.com> wrote:
>
> Hi
> I have some oVirt clusters, in various config.
>
> One cluster based on CentOS7 hosts, another based on ovirt-node-ng.
> While the second was successfully updated from 4.2.7 to 4.2.8, attempts to update hosts of the first one end with:
>
> Error: Package: vdsm-4.20.46-1.el7.x86_64 (ovirt-4.2)
> Requires: libvirt-daemon-kvm >= 4.5.0-10.el7_6.3
> Installed: libvirt-daemon-kvm-4.5.0-10.el7.x86_64 (@base)
> libvirt-daemon-kvm = 4.5.0-10.el7
>
> Being a CentOS7 installation and not an ovirt-node-ng, I cannot follow the notice in :
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/AJUBXAIGXVD...
>
> The only way to have libvirt-daemon-kvm release 4-5-0.10_6.4 (and not 6.3) is to enable the CentOS Updates repo. The host-deploy log looks fine, but is it safe to enable that repo? Is there another, safer method to update these hosts to the latest version of 4.2?
>
> Thanks in advance
>
> --
> Roberto Nunin
>
>
>
5 years, 10 months
GPU passthrough
by Darin Schmidt
If I'm reading the documentation correctly, you can't assign consumer-grade
GPUs such as GTX 1080s to a VM?
I'm trying to do something similar to how Linus used unRAID to build
several gaming machines from one.
Any help would be appreciated.
Thanks.
5 years, 10 months
single server, self hosted engine on host, one vm wont start
by abzstrak@gmail.com
I'm not an expert with oVirt by any means; I've been running it for about 1.5 years, coming from Xen.
I have a VM with 16TiB of storage that will not boot, and it has some snapshots that won't remove. The web GUI says it finished removing the last snapshot, but I don't think it actually did (it took about 50 hours). I have no idea what is wrong with the VM that won't boot; it just says it's starting and never actually starts. Eventually I get a message that it failed to start.
The boot drive of that VM is 16GB. I removed all snapshots from it and was able to clone it to a new disk, make a new VM, and attach that disk to the new VM, which booted fine. The storage disk is what I really need working, but I do not have enough space to clone it, and I cannot detach it from the flaky VM since it has snapshots, and it seemingly won't remove the snapshots.
Right now, I think it may actually still be attempting to remove the snapshot. In top, I see qemu-img using one core at about 100%. Unfortunately it's very slow because it's only using one core (of 16). AFAIK, removing/merging snapshots only runs single-threaded. If it takes a while I don't care; I just want to know what's going on.
I can't find any logs that tell me that a snapshot removal or merge is ongoing or ever happened. I cannot find any logs that state the progress of any such merge either. Any pointers there?
Also, my backup machine died about 2 weeks ago, taking its RAID 5 with it, of course. I've been busy at work and figured I'd fix or rebuild it in a couple of weeks, so I don't have a good recent backup of my data on this VM, which is why I'm trying to get it to boot up again.
5 years, 10 months
self-hosted ovirt-engine down
by siovelrm@gmail.com
Hi, I have a big problem with oVirt. I use version 4.2.7 with a self-hosted engine. The problem is that when I try to start the ovirt-engine VM with the command hosted-engine --vm-start, the output shows:
"VM exists and is down, cleaning up and restarting"
Running hosted-engine --vm-status shows:
--== Host 1 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : node1.softel.cu
Host ID : 1
Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
Score : 0
stopped : False
Local maintenance : False
crc32 : 02c3b5a4
local_conf_timestamp : 49529
Host timestamp : 49529
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=49529 (Sat Mar 16 02:39:10 2019)
host-id=1
score=0
vm_conf_refresh_time=49529 (Sat Mar 16 02:39:11 2019)
conf_on_shared_storage=True
maintenance=False
state=EngineUnexpectedlyDown
stopped=False
timeout=Thu Jan 1 08:49:39 1970
In /var/log/messages:
Mar 16 02:35:34 node1 vdsm[26151]: WARN Attempting to remove a non existing network: ovirtmgmt/0c3e1c08-3928-47f1-96a8-c6a8d0dc3241
Mar 16 02:35:34 node1 vdsm[26151]: WARN Attempting to remove a non existing net user: ovirtmgmt/0c3e1c08-3928-47f1-96a8-c6a8d0dc3241
Mar 16 02:35:34 node1 vdsm[26151]: WARN Attempting to remove a non existing network: ovirtmgmt/0c3e1c08-3928-47f1-96a8-c6a8d0dc3241
Mar 16 02:35:34 node1 vdsm[26151]: WARN Attempting to remove a non existing net user: ovirtmgmt/0c3e1c08-3928-47f1-96a8-c6a8d0dc3241
Mar 16 02:35:34 node1 vdsm[26151]: WARN File: /var/lib/libvirt/qemu/channels/0c3e1c08-3928-47f1-96a8-c6a8d0dc3241.org.qemu.guest_agent.0 already removed
Please help!!
5 years, 10 months
Self Hosted Engine failed during setup using oVirt Node 4.3
by Jagi Sarcilla
Hardware Specs
Hypervisor:
Supermicro A2SDi-16C-HLN4F
Intel(R) Atom(TM) CPU C3955 @ 2.10GHz 16cores
Samsung NVME 1TB
Sandisk SSHD 1TB
256GB RAM
4 x 1G nics
Storage:
FreeNAS-11.2-RELEASE-U1
- NFS
- iSCSI
* Issue #1
Using iSCSI, I am unable to discover the target using the Cockpit dashboard.
* Issue #2
Using NFS, I can connect successfully, but the host won't come up after the shutdown performed by the installation.
* Error message:
[ INFO ] TASK [oVirt.hosted-engine-setup : Check engine VM health]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.358141", "end": "2019-03-14 06:33:35.499429", "rc": 0, "start": "2019-03-14 06:33:35.141288", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=43479 (Thu Mar 14 06:33:34 2019)\\nhost-id=1\\nscore=0\\nvm_conf_refresh_time=43480 (Thu Mar 14 06:33:34 2019)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineUnexpectedlyDown\\nstopped=False\\ntimeout=Thu Jan 1 07:04:46 1970\\n\", \"hostname\": \"dreamlevel-1.logistics.corp\", \"host-id\": 1, \"engine-status\": {\"reason\": \"bad vm status\", \"health\": \"bad\", \"vm\": \"down_unexpected\", \"detail\": \"Down\"}, \"score\": 0, \"stopped\": false, \"maintenance\": false, \"crc32\": \"de300a81\", \"local_conf_timestamp\": 43480, \"host-ts\": 43479},
\"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=43479 (Thu Mar 14 06:33:34 2019)\\nhost-id=1\\nscore=0\\nvm_conf_refresh_time=43480 (Thu Mar 14 06:33:34 2019)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineUnexpectedlyDown\\nstopped=False\\ntimeout=Thu Jan 1 07:04:46 1970\\n\", \"hostname\": \"dreamlevel-1.logistics.corp\", \"host-id\": 1, \"engine-status\": {\"reason\": \"bad vm status\", \"health\": \"bad\", \"vm\": \"down_unexpected\", \"detail\": \"Down\"}, \"score\": 0, \"stopped\": false, \"maintenance\": false, \"crc32\": \"de300a81\", \"local_conf_timestamp\": 43480, \"host-ts\": 43479}, \"global_maintenance\": false}"]}
[ INFO ] TASK [oVirt.hosted-engine-setup : Check VM status at virt level]
[ INFO ] TASK [oVirt.hosted-engine-setup : debug]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail if engine VM is not running]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM is not running, please check vdsm logs"}
2019-03-14 06:33:41,168-0400 ERROR ansible failed {'status': 'FAILED', 'ansible_type': 'task', 'ansible_task': u'Check VM status at virt level', 'ansible_result': u"type: <type 'dict'>\nstr: {'_ansible_parsed': True, 'stderr_lines': [], u'changed': True, u'end': u'2019-03-14 06:33:39.429283', '_ansible_no_log': False, u'stdout': u'', u'cmd': u'virsh -r list | grep HostedEngine | grep running', u'rc': 1, u'stderr': u'', u'delta': u'0:00:00.118398', u'invocation': {u'module_args': {u'crea", 'ansible_host': u'localhost', 'ansible_playbook': u'/usr/share/ovirt-hosted-engine-setup/ansible/create_target_vm.yml'}
2019-03-14 06:33:41,168-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8948c2fd0> kwargs ignore_errors:True
2019-03-14 06:30:45,000-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:30:50,678-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894782490> kwargs
2019-03-14 06:30:56,369-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894675490> kwargs
2019-03-14 06:31:02,059-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8950ff750> kwargs
2019-03-14 06:31:07,738-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:31:13,429-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8947824d0> kwargs
2019-03-14 06:31:19,118-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8947c2390> kwargs
2019-03-14 06:31:24,797-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894750b10> kwargs
2019-03-14 06:31:30,481-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:31:36,174-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894782490> kwargs
2019-03-14 06:31:41,852-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894675490> kwargs
2019-03-14 06:31:47,534-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8947c2390> kwargs
2019-03-14 06:31:53,226-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:31:58,908-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8947824d0> kwargs
2019-03-14 06:32:04,592-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8945f9f10> kwargs
2019-03-14 06:32:10,266-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944fb390> kwargs
2019-03-14 06:32:15,948-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:32:21,630-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894782490> kwargs
2019-03-14 06:32:27,316-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894675490> kwargs
2019-03-14 06:32:33,016-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8945f9f10> kwargs
2019-03-14 06:32:38,702-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:32:44,382-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8947824d0> kwargs
2019-03-14 06:32:50,063-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8950ff750> kwargs
2019-03-14 06:32:55,745-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894750b10> kwargs
2019-03-14 06:33:01,422-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8944dd810> kwargs
2019-03-14 06:33:07,107-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894782490> kwargs
2019-03-14 06:33:12,794-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe894675490> kwargs
2019-03-14 06:33:18,470-0400 DEBUG ansible on_any args <ansible.executor.task_result.TaskResult object at 0x7fe8950ff750> kwargs
Engine VM status: EngineUnexpectedlyDown-EngineDown.
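The JSON that `hosted-engine --vm-status --json` prints in the failed task above is hard to read in its escaped form. As a rough sketch (not part of the oVirt tooling), it can be reduced to one row per host in Python. The `summarize` helper and the shortened `SAMPLE` string are illustrative assumptions; the field names are taken from the output above.

```python
import json

# Shortened copy of the stdout shown in the failed task above (assumption:
# the real output has the same shape, with one entry per host id).
SAMPLE = (
    '{"1": {"hostname": "dreamlevel-1.logistics.corp", "host-id": 1, '
    '"engine-status": {"reason": "bad vm status", "health": "bad", '
    '"vm": "down_unexpected", "detail": "Down"}, "score": 0, '
    '"stopped": false, "maintenance": false}, "global_maintenance": false}'
)


def summarize(raw):
    """Return (hostname, score, vm state, health) for every host entry."""
    status = json.loads(raw)
    rows = []
    for value in status.values():
        if not isinstance(value, dict):
            continue  # skip top-level flags such as "global_maintenance"
        engine = value.get("engine-status", {})
        rows.append(
            (value.get("hostname"), value.get("score"),
             engine.get("vm"), engine.get("health"))
        )
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row)
```

On a host, the same function could be fed the captured stdout of the command instead of `SAMPLE`.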
* Manually start the Engine VM via virsh:
error: Failed to start domain HostedEngine
error: the CPU is incompatible with host CPU: Host CPU does not provide required features: pcid
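The virsh error says the host CPU lacks the `pcid` feature. One way to cross-check this is to look at the `flags` lines of `/proc/cpuinfo`. The following is a minimal sketch under that assumption; the helper names and the sample text are hypothetical, and libvirt's own view of the host CPU may differ from what the kernel reports.

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_features(cpuinfo_text, required):
    """Return the required features that the host does not advertise."""
    return sorted(set(required) - cpu_flags(cpuinfo_text))


# Made-up sample text for illustration only; not taken from the host above.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme sse sse2 aes\n"

if __name__ == "__main__":
    print(missing_features(SAMPLE, ["pcid", "aes"]))
```

On a real host, `missing_features(open("/proc/cpuinfo").read(), ["pcid"])` would return an empty list if the kernel reports the flag.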
5 years, 10 months
Broken dependencies in CentOS7 hosts upgrading to 4.2.8 from 4.2.7
by Roberto Nunin
Hi
I have some oVirt clusters, in various config.
One cluster based on CentOS7 hosts, another based on ovirt-node-ng.
While the second was successfully updated from 4.2.7 to 4.2.8, attempts to
update hosts of the first one end with:
Error: Package: vdsm-4.20.46-1.el7.x86_64 (ovirt-4.2)
Requires: libvirt-daemon-kvm >= 4.5.0-10.el7_6.3
Installed: libvirt-daemon-kvm-4.5.0-10.el7.x86_64 (@base)
libvirt-daemon-kvm = 4.5.0-10.el7
Being a CentOS7 installation and not an ovirt-node-ng, I cannot follow the
notice in:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/AJUBXAIGXVD...
The only way to have libvirt-daemon-kvm release 4-5-0.10_6.4 (and not 6.3)
is to enable the CentOS Updates repo. The host-deploy log looks fine,
but is it safe to enable that repo? Is there another, safer method to
update these hosts to the latest version of 4.2?
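The dependency error comes down to release ordering: the installed 4.5.0-10.el7 sorts below the required 4.5.0-10.el7_6.3. The sketch below is a simplified approximation of RPM's version comparison, written only to illustrate that ordering. It ignores epochs, tildes, and carets, so it is not a replacement for the real rpm tooling, and the function names are made up for this example.

```python
import re


def split_segments(part):
    """Split into runs of digits or letters; separators are dropped."""
    return re.findall(r"\d+|[A-Za-z]+", part)


def compare_part(a, b):
    """Compare two version (or release) strings; return -1, 0, or 1."""
    sa, sb = split_segments(a), split_segments(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return 1 if xi > yi else -1
        elif x.isdigit():
            return 1   # a numeric segment sorts above an alphabetic one
        elif y.isdigit():
            return -1
        elif x != y:
            return 1 if x > y else -1
    if len(sa) == len(sb):
        return 0
    # All shared segments are equal: the string with extra segments is newer
    return 1 if len(sa) > len(sb) else -1


def compare_evr(a, b):
    """Compare 'version-release' strings (no epoch handling)."""
    va, ra = a.split("-", 1)
    vb, rb = b.split("-", 1)
    return compare_part(va, vb) or compare_part(ra, rb)


if __name__ == "__main__":
    print(compare_evr("4.5.0-10.el7", "4.5.0-10.el7_6.3"))
```

The release "10.el7_6.3" has the extra segments 6 and 3 after "10.el7", which is why the base package does not satisfy the ">=" requirement.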
Thanks in advance
--
Roberto Nunin
5 years, 10 months
Discard a snapshot
by Mitja Mihelič
Hi!
We have run into a problem migrating a VM's disk. While doing a live
disk migration it all went well until "Removing Snapshot Auto-generated
for Live Storage Migration". The operation started and got stuck in the
"Preparing to merge" stage. The task was visible as an async task for a
couple of days, then disappeared. The snapshot named "Auto-generated for
Live Storage Migration" still exists.
If we try to start the VM it produces the following error:
VM VM_NAME_HERE is down with error. Exit message: Bad volume
specification {u'index': 0, u'domainID':
u'47ee9bde-4f18-43c2-9383-0d30a27d1ae7', 'reqsize': '0', u'format':
u'cow', u'bootOrder': u'1', u'address': {u'function': u'0x0', u'bus':
u'0x00', u'domain': u'0x0000', u'type': u'pci', u'slot': u'0x05'},
u'volumeID': u'3b0184b7-5c5d-4669-8b3a-04880883118a', 'apparentsize':
'19193135104', u'imageID': u'35599677-250a-4586-9db1-4df8f3291164',
u'discard': False, u'specParams': {}, u'readonly': u'false', u'iface':
u'virtio', u'optional': u'false', u'deviceId':
u'35599677-250a-4586-9db1-4df8f3291164', 'truesize': '19193135104',
u'poolID': u'35628205-9fa3-41de-b914-84f9adb633e4', u'device': u'disk',
u'shared': u'false', u'propagateErrors': u'off', u'type': u'disk'}.
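The dict in the error is a Python 2 style repr, which Python 3's `ast.literal_eval` can still read. As a small sketch, the storage identifiers can be pulled out like this when searching logs for the affected volume. The `volume_ids` helper is hypothetical, and `SPEC_TEXT` is an abbreviated copy of the message above.

```python
import ast

# Abbreviated copy of the dict from the error message above
SPEC_TEXT = (
    "{u'index': 0, u'domainID': u'47ee9bde-4f18-43c2-9383-0d30a27d1ae7', "
    "u'format': u'cow', "
    "u'volumeID': u'3b0184b7-5c5d-4669-8b3a-04880883118a', "
    "u'imageID': u'35599677-250a-4586-9db1-4df8f3291164', "
    "u'discard': False, "
    "u'poolID': u'35628205-9fa3-41de-b914-84f9adb633e4'}"
)


def volume_ids(spec_text):
    """Parse the repr safely and keep only the storage identifiers."""
    spec = ast.literal_eval(spec_text)  # evaluates literals only, no code
    keys = ("poolID", "domainID", "imageID", "volumeID")
    return {key: spec[key] for key in keys}


if __name__ == "__main__":
    for name, value in volume_ids(SPEC_TEXT).items():
        print(name, value)
```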
From what I understand the data access path is snapshot->"disk before
snapshot". And since the snapshot is corrupted it cannot be accessed and
the whole VM start process fails. I believe that the disk before the
snapshot was made is still intact.
How can we discard the whole snapshot so that we are left with only the
state before the snapshot was made?
We can afford to lose the data in the snapshot.
Best regards,
Mitja
5 years, 10 months