Host reboots when network switch goes down
by cen
Hi,
we are experiencing a weird issue with our oVirt setup. We have two
physical hosts (DC1 and DC2) and a mounted Lenovo NAS that holds all VM data.
They are connected via a managed network switch.
What happens is that if the switch goes down for whatever reason (firmware
update etc.), the physical host reboots. I am not sure whether this is an
action performed by oVirt, but I suspect it is, because the connection to
the mounted storage is lost and some kind of emergency action is performed.
I would need some pointers to find out
a) who triggers the reboot and why
b) a way to prevent the reboots by increasing (storage?) timeouts
A switch reboot takes 2-3 minutes.
These are the host's /var/log/messages just before the reboot occurs:
Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
[10993]: s11 check_our_lease warning 72 last_success 7690912
Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
[10993]: s3 check_our_lease warning 76 last_success 7690908
Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
[10993]: s1 check_our_lease warning 68 last_success 7690916
Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
[27983]: s11 delta_renew read timeout 10 sec offset 0
/var/run/vdsm/storage/15514c65-5d45-4ba7-bcd4-cc772351c940/fce598a8-11c3-44f9-8aaf-8712c96e00ce/65413499-6970-4a4c-af04-609ef78891a2
Sep 28 16:20:00 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:00 7690984
[27983]: s11 renewal error -202 delta_length 20 last_success 7690912
Sep 28 16:20:00 ovirtnode02 wdmd[11102]: test warning now 7690984 ping
7690970 close 7690980 renewal 7690912 expire 7690992 client 10993
sanlock_hosted-engine:2
Sep 28 16:20:00 ovirtnode02 wdmd[11102]: test warning now 7690984 ping
7690970 close 7690980 renewal 7690908 expire 7690988 client 10993
sanlock_3cb12f04-5d68-4d79-8663-f33c0655baa6:2
Sep 28 16:20:01 ovirtnode02 systemd: Created slice User Slice of root.
Sep 28 16:20:01 ovirtnode02 systemd: Started Session 15148 of user root.
Sep 28 16:20:01 ovirtnode02 systemd: Removed slice User Slice of root.
Sep 28 16:20:01 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:01 7690985
[10993]: s11 check_our_lease warning 73 last_success 7690912
Sep 28 16:20:01 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:01 7690985
[10993]: s3 check_our_lease warning 77 last_success 7690908
Sep 28 16:20:01 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:01 7690985
[10993]: s1 check_our_lease warning 69 last_success 7690916
Sep 28 16:20:01 ovirtnode02 wdmd[11102]: test warning now 7690985 ping
7690970 close 7690980 renewal 7690912 expire 7690992 client 10993
sanlock_hosted-engine:2
Sep 28 16:20:01 ovirtnode02 wdmd[11102]: test warning now 7690985 ping
7690970 close 7690980 renewal 7690908 expire 7690988 client 10993
sanlock_3cb12f04-5d68-4d79-8663-f33c0655baa6:2
Sep 28 16:20:02 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:02 7690986
[10993]: s11 check_our_lease warning 74 last_success 7690912
Sep 28 16:20:02 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:02 7690986
[10993]: s3 check_our_lease warning 78 last_success 7690908
Sep 28 16:20:02 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:02 7690986
[10993]: s1 check_our_lease warning 70 last_success 7690916
Sep 28 16:20:02 ovirtnode02 wdmd[11102]: test warning now 7690986 ping
7690970 close 7690980 renewal 7690916 expire 7690996 client 10993
sanlock_15514c65-5d45-4ba7-bcd4-cc772351c940:2
Sep 28 16:20:02 ovirtnode02 wdmd[11102]: test warning now 7690986 ping
7690970 close 7690980 renewal 7690912 expire 7690992 client 10993
sanlock_hosted-engine:2
Sep 28 16:20:02 ovirtnode02 wdmd[11102]: test warning now 7690986 ping
7690970 close 7690980 renewal 7690908 expire 7690988 client 10993
sanlock_3cb12f04-5d68-4d79-8663-f33c0655baa6:2
Sep 28 16:20:03 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:03 7690987
[10993]: s11 check_our_lease warning 75 last_success 7690912
Sep 28 16:20:03 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:03 7690987
[10993]: s3 check_our_lease warning 79 last_success 7690908
Sep 28 16:20:03 ovirtnode02 sanlock[10993]: 2021-09-28 16:20:03 7690987
[10993]: s1 check_our_lease warning 71 last_success 7690916
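For context on a) and b): the log shows sanlock repeatedly failing to renew its storage leases (the check_our_lease warnings count up towards the expiry threshold), and once a lease expires, wdmd stops feeding the hardware watchdog, which resets the host. So the reboot comes from the watchdog, not from an oVirt command. On oVirt 4.4, VDSM lets you raise the sanlock I/O timeout, which stretches the whole expiry window (roughly 8 x io_timeout). A minimal sketch, assuming your VDSM version supports the option; the file name is illustrative and the value must match on every host in the data center:

  # /etc/vdsm/vdsm.conf.d/99-local.conf  (illustrative file name)
  [sanlock]
  # default is 10 seconds; 20 roughly doubles the lease expiry window
  io_timeout = 20

Even so, a 2-3 minute switch outage is longer than any practical lease timeout, so redundant network paths between the hosts and the NAS are the more robust fix.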
Ovirt 4.3 Upload of Image fails
by Mark Morgan
Hi, I am trying to upload an image to an oVirt 4.3 instance, but it keeps
failing.
After a few seconds it says "Paused by system".
The connection test in the Upload Image window is successful, so the
certificate is installed properly.
Prompted by an older thread
(https://www.mail-archive.com/users@ovirt.org/msg50954.html), I also
checked whether it has something to do with Wi-Fi, but I am not even
using a Wi-Fi connection.
Here is a small part of the log, where you can see the transfer failing.
2021-09-29 11:44:43,011+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferImageStatusCommand]
(default task-96804) [d370a18b-bb12-4992-9fc8-7ce6607358f8] Running
command: TransferImageStatusCommand internal: false. Entities affected
: ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group
CREATE_DISK with role type USER
2021-09-29 11:44:43,055+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferImageStatusCommand]
(default task-96804) [1cbc3b4f-b1d4-428a-965a-b9745fd0e108] Running
command: TransferImageStatusCommand internal: false. Entities affected
: ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group
CREATE_DISK with role type USER
2021-09-29 11:44:43,056+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater]
(default task-96804) [1cbc3b4f-b1d4-428a-965a-b9745fd0e108] Updating
image transfer 0681f799-f44f-4b1e-8369-4d1033bd81e6 (image
ce221b1f-46aa-4eb4-b159-0e0adb762102) phase to Resuming (message: 'Sent
0MB')
2021-09-29 11:44:47,096+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferImageStatusCommand]
(default task-96801) [50849f1b-ef18-41ab-9380-e2c7980a1f73] Running
command: TransferImageStatusCommand internal: false. Entities affected
: ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group
CREATE_DISK with role type USER
2021-09-29 11:44:48,878+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] Resuming transfer for Upload disk
'CentOS-8.4.2105-x86_64-boot.iso' (disk id:
'ce221b1f-46aa-4eb4-b159-0e0adb762102', image id:
'45896ce1-a602-49f5-9774-4dc17d960589')
2021-09-29 11:44:48,896+02 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] EVENT_ID:
TRANSFER_IMAGE_RESUMED_BY_USER(1,074), Image transfer was resumed by
user (admin@internal-authz).
2021-09-29 11:44:48,902+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] Renewing transfer ticket for
Upload disk 'CentOS-8.4.2105-x86_64-boot.iso' (disk id:
'ce221b1f-46aa-4eb4-b159-0e0adb762102', image id:
'45896ce1-a602-49f5-9774-4dc17d960589')
2021-09-29 11:44:48,903+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.ExtendImageTicketVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] START,
ExtendImageTicketVDSCommand(HostName = virthost01,
ExtendImageTicketVDSCommandParameters:{hostId='15d10fdf-4dc1-4a4c-a12f-cab50c492974',
ticketId='8d09cf8c-baf9-4497-8b52-ea53a97b4a19', timeout='300'}), log
id: 197aba7
2021-09-29 11:44:48,908+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.ExtendImageTicketVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] FINISH,
ExtendImageTicketVDSCommand, return: StatusOnlyReturn [status=Status
[code=0, message=Done]], log id: 197aba7
2021-09-29 11:44:48,908+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] Transfer session with ticket id
8d09cf8c-baf9-4497-8b52-ea53a97b4a19 extended, timeout 300 seconds
2021-09-29 11:44:48,920+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater]
(EE-ManagedThreadFactory-engineScheduled-Thread-80)
[6c5f2ed0-976c-4722-a6fb-86f3d9eb1c3b] Updating image transfer
0681f799-f44f-4b1e-8369-4d1033bd81e6 (image
ce221b1f-46aa-4eb4-b159-0e0adb762102) phase to Transferring (message:
'Sent 0MB')
2021-09-29 11:44:51,379+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferImageStatusCommand]
(default task-96801) [e2247750-524d-40e4-bffb-1176ff13f1f5] Running
command: TransferImageStatusCommand internal: false. Entities affected
: ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group
CREATE_DISK with role type USER
2021-09-29 11:44:55,376+02 INFO
[org.ovirt.engine.core.bll.storage.disk.image.TransferImageStatusCommand]
(default task-96801) [f9b3dec1-9aac-4695-ba39-43e5e66bdccd] Running
command: TransferImageStatusCommand internal: false. Entities affected
: ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group
CREATE_DISK with role type USER
Am I doing something wrong?
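"Paused by system" after a few seconds is usually the browser losing its connection to the imageio backend mid-transfer rather than a certificate problem, so it is worth checking plain TLS reachability from the machine running the browser. A hedged check, assuming the default ports (54323 for ovirt-imageio-proxy on the engine, 54322 for ovirt-imageio-daemon on the host) and substituting your own engine FQDN:

  # both should complete a TLS handshake; a timeout or reset points
  # at a firewall or routing problem on that leg
  openssl s_client -connect engine.example.com:54323 < /dev/null
  openssl s_client -connect virthost01:54322 < /dev/null

If both handshakes succeed, the imageio proxy and daemon logs around 11:44 should show why the transfer stalled.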
Failed to update OVF disks / Failed to update VMs/Templates OVF data for Storage Domain
by nicolas@devels.es
Hi,
We upgraded from oVirt 4.3.8 to 4.4.8 and sometimes we're finding events
like these in the event log (3-4 times/day):
Failed to update OVF disks 77818843-f72e-4d40-9354-4e1231da341f, OVF
data isn't updated on those OVF stores (Data Center KVMRojo, Storage
Domain pv04-003).
Failed to update VMs/Templates OVF data for Storage Domain pv02-002
in Data Center KVMRojo.
I found [1]; however, it does not seem to solve the issue. I restarted all
the hosts and we're still getting the messages.
We haven't been able to upgrade the hosts to 4.4 yet, FWIW. Could that be
the cause?
If someone could shed some light on this, I'd be grateful.
Thanks.
[1]: https://access.redhat.com/solutions/3353011
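For completeness, the OVF stores can also be refreshed manually: the Admin Portal offers an "Update OVFs" action on the storage domain, and the REST API exposes the same thing as a storage-domain action. A sketch, assuming the updateovfstore action is available in your API version (UUID, FQDN and credentials are placeholders):

  curl -k -u 'admin@internal:PASSWORD' \
       -H 'Content-Type: application/xml' -d '<action/>' \
       https://engine.example.com/ovirt-engine/api/storagedomains/SD-UUID/updateovfstore

If the manual update fails as well, engine.log at that timestamp usually names the disk or lock that blocked it.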
Managed Block Storage and Templates
by Shantur Rathore
Hi all,
Has anyone tried using Templates with Managed Block Storage?
I created a VM on MBS and then took a snapshot.
That worked, but as soon as I created a Template from the snapshot, the
template was created with no disk attached to it.
Is anyone seeing something similar?
Thanks
Managed Block Storage issues
by Shantur Rathore
Hi all,
I am trying to set up Managed Block Storage and have the following issues.
My setup:
Latest oVirt Node NG : 4.4.8
Latest oVirt Engine : 4.4.8
1. Unable to copy to iSCSI-based block storage
I created an MBS domain with a Synology UC3200 as the backend (supported by
cinderlib). It was created fine, but when I try to copy disks to it, the
copy fails.
Looking at the logs on the SPM, I found that "qemu-img" failed with an
error that it cannot open "/dev/mapper/xxxxxxxxxx": permission error.
Digging through the code and further into the logs, I saw that:
a. sometimes the /dev/mapper/xxxx symlink isn't created (log attached)
b. the ownership of /dev/mapper/xxxxxx and /dev/dm-xx for the new
device always stays root:root
I added a udev rule:
ACTION=="add|change", ENV{DM_UUID}=="mpath-*", GROUP="qemu", OWNER="vdsm", MODE="0660"
and the disk copied correctly when /dev/mapper/xxxxx got created.
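For anyone wanting to reproduce the workaround, a sketch of how such a rule can be installed (the file name is arbitrary; the rule itself is the one quoted above):

  # /etc/udev/rules.d/99-mbs-mpath-ownership.rules  (arbitrary name)
  ACTION=="add|change", ENV{DM_UUID}=="mpath-*", GROUP="qemu", OWNER="vdsm", MODE="0660"

  # reload the rules and re-trigger block devices so existing maps pick it up
  udevadm control --reload-rules
  udevadm trigger --subsystem-match=block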
2. Copy progress finishes in the UI much earlier than the actual qemu-img process.
The UI shows the copy as completed successfully while the image is
actually still being copied.
This happens for both Ceph- and iSCSI-based MBS.
Is there any known workaround to get iSCSI MBS working?
Kind regards,
Shantur
oVirt / Hyperconverged
by topoigerm@gmail.com
I have 4 servers with identical hardware. The documentation says "you need 3", not "you need 3 or more"; is it possible to run hyperconverged with 4 servers? Currently all 4 nodes have crashed after the 4th node tried to join the hyperconverged 3-node cluster. Kindly advise.
FYI, I am currently reinstalling the OS on all of them because of the incident mentioned above.
/BR
Faizal
about the Live Storage Migration
by Tommy Sway
From the documentation:
Overview of Live Storage Migration
Virtual disks can be migrated from one storage domain to another while the
virtual machine to which they are attached is running. This is referred to
as live storage migration. When a disk attached to a running virtual machine
is migrated, a snapshot of that disk's image chain is created in the source
storage domain, and the entire image chain is replicated in the destination
storage domain. As such, ensure that you have sufficient storage space in
both the source storage domain and the destination storage domain to host
both the disk image chain and the snapshot. A new snapshot is created on
each live storage migration attempt, even when the migration fails.
Consider the following when using live storage migration:
You can live migrate multiple disks at one time.
Multiple disks for the same virtual machine can reside across more than one
storage domain, but the image chain for each disk must reside on a single
storage domain.
You can live migrate disks between any two storage domains in the same data
center.
You cannot live migrate direct LUN hard disk images or disks marked as
shareable.
But where does the user actually perform a live storage migration?
There seems to be no interface for it.
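For what it is worth, live storage migration is driven from the disk rather than from a dedicated migration screen: in the Admin Portal it is Storage > Disks (or the VM's Disks tab) > Move, which is allowed while the VM is running, and the REST API exposes the same operation as a move action on the disk. A hedged sketch, with UUID, domain name, FQDN and credentials as placeholders:

  curl -k -u 'admin@internal:PASSWORD' \
       -H 'Content-Type: application/xml' \
       -d '<action><storage_domain><name>target-domain</name></storage_domain></action>' \
       https://engine.example.com/ovirt-engine/api/disks/DISK-UUID/move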
Re: About the vm memory limit
by Tommy Sway
In fact, I am very interested in the part you mentioned, because my environment runs relational databases, which usually require a large amount of memory, and some systems explicitly need huge page memory configured (Oracle, for example).
Could you elaborate on the technical details of huge page management, and on the difference between 4.3 and 4.4 in this respect?
Thank you very much!
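In case it helps while waiting for a reply: the usual pattern (treat the specifics as assumptions to verify for your version) is to reserve huge pages on the host and then request them per VM through the predefined "hugepages" custom property, whose value is the page size in KiB:

  # host side: reserve 2 MiB huge pages at runtime (16384 pages = 32 GiB)
  sysctl vm.nr_hugepages=16384
  # or persistently on the kernel command line (e.g. GRUB_CMDLINE_LINUX):
  #   hugepagesz=2M hugepages=16384
  # VM side (Admin Portal): Edit VM > Custom Properties > hugepages = 2048
  # (the value is the page size in KiB; 1048576 selects 1 GiB pages)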
From: Strahil Nikolov <hunter86_bg(a)yahoo.com>
Sent: Saturday, September 25, 2021 5:32 PM
To: Tommy Sway <sz_cuitao(a)163.com>
Subject: Re: [ovirt-users] About the vm memory limit
It depends on the NUMA configuration of the host.
If you have 256 GB per CPU, it's best to stay within that range.
Also, consider disabling transparent huge pages on the host and in the VM.
Since 4.4, regular huge pages (not to be confused with THP) can be used on the hypervisors, while on 4.3 there were some issues, but I can't provide any details.
Best Regards,
Strahil Nikolov
On Fri, Sep 24, 2021 at 6:40, Tommy Sway <sz_cuitao(a)163.com> wrote:
I would like to ask whether there is any limit on the memory size of virtual machines, or a performance curve or something like that.
As long as there is memory left on the physical machine, can we just keep adding virtual machines?
In our usage scenario there are many virtual machines running databases, and their memory needs vary greatly.
For some virtual machines 4 GB of memory is enough, while others need 64 GB.
I want to know what the optimal memory configuration for a virtual machine is, since a virtual machine is just a QEMU process on the physical machine, and I worry that its memory does not behave quite like memory on a physical machine. Understanding this would let us develop guidelines for optimal memory sizing of virtual machines.
Thank you!
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/Y6XDOIMKCP4...