VM hanging at sustained high throughput
by David Johnson
Hi ovirt gurus,
This is an interesting issue, one I never expected to have.
When I push high volumes of writes to my NAS, I can cause VMs to go into
a paused state. I'm looking at this from a number of angles, including
upgrades to the NAS appliance.
I can reproduce this problem at will running a CentOS 7.9 VM on oVirt 4.5.
*Questions:*
1. Is my analysis of the failure (below) reasonable/correct?
2. What am I looking for to validate this?
3. Is there a configuration that I can set to make it a little more robust
while I acquire the hardware to improve the NAS?
*Reproduction:*
Standard test of file write speed:
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=4096 oflag=direct
4096+0 records in
4096+0 records out
2147483648 bytes (2.1 GB) copied, 1.68431 s, 1.3 GB/s
Give it more data
[root@cen-79-pgsql-01 ~]# dd if=/dev/zero of=./test bs=512k count=12228 oflag=direct
12228+0 records in
12228+0 records out
6410993664 bytes (6.4 GB) copied, 7.22078 s, 888 MB/s
The odds are about 50/50 that 6 GB will kill the VM, but 100% when I hit 8
GB.
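To find roughly where the threshold sits, a rate-limited variant of the same test may help (just a sketch; it assumes pv is installed, and the 300m cap is an arbitrary starting point):

# Pipe zeros through pv with a throughput cap, then write with O_DIRECT as above.
# iflag=fullblock keeps the final dd from issuing short writes when reading from a pipe.
dd if=/dev/zero bs=512k count=16384 | pv -L 300m | dd of=./test bs=512k iflag=fullblock oflag=direct

Stepping the -L value up or down (200m, 500m, ...) should show the sustained rate at which the NAS stops keeping up.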
*Analysis:*
What I think is happening is that the intent cache on the NAS is on an SSD,
and my VMs are pushing data about three times as fast as the SSD can handle.
When the SSD's queue grows beyond a certain point, the NAS (which places
reliability over speed) says "Whoa, Nellie!", and the VM chokes.
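To validate this on the oVirt side, the host that ran the VM should log why it was paused. A rough sketch of where to look (standard vdsm/libvirt log locations; exact message wording varies by version):

# Reason for the pause, as seen by vdsm on the host:
grep -iE 'pause|abnormal vm stop' /var/log/vdsm/vdsm.log | tail -n 50
# Any qemu-level I/O errors for the guest (replace <vm-name> with the actual VM name):
grep -i error /var/log/libvirt/qemu/<vm-name>.log | tail -n 20

The engine's Events tab should show a matching message along the lines of "VM ... has been paused due to a storage I/O error"; an I/O-error (EIO) reason would support the theory that the storage path is stalling under load, while a no-space (ENOSPC) reason would point at thin provisioning instead.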
*David Johnson*
Re: Best CPU topology for VMs (Socket / Core / Threads)
by Strahil Nikolov
As long as you stay inside the NUMA limits you should be OK. For example: 1 core with 2 threads is equal to 2 cores with 1 thread each. After all, all VMs in KVM are just processes.
Yet, if your server has 2 CPUs, each with 6 cores (2 threads per core), you should avoid setting up VMs with 13 vCPUs (13 real threads), as you will have to use some of the threads on the second CPU.
I think there is a guide for High Performance VMs where the NUMA cases are described quite well.
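As a quick check before sizing a VM, the host's physical layout can be inspected directly (a minimal sketch; numactl may need to be installed):

# Sockets, cores per socket, threads per core and NUMA node count:
lscpu | grep -E 'Socket|Core|Thread|NUMA'
# Which CPUs and how much memory belong to each NUMA node:
numactl --hardware

Keeping a VM's vCPUs (and, for pinned VMs, its memory) within a single node reported by numactl --hardware is what staying "inside the NUMA limits" means in practice.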
Best Regards,
Strahil Nikolov
On Mon, Mar 7, 2022 at 16:37, Laurent Duparchy <duparchy(a)esrf.fr> wrote:
Thanks for your reply.
So, no performance issue if the virtual topology does not match the physical one?
Laurent Duparchy
ESRF - The European Synchrotron
MIS Group
04 76 88 22 56
Strahil Nikolov wrote on 07/03/2022 15:10:
I think it's most useful for licensing purposes -> like the Win10 example
Best Regards, Strahil Nikolov
Hi,
Given that there is the option to match the CPU's physical topology (Socket / Core / Threads), I guess it can make a difference.
When ?
Linux vs Windows ?
(One example I know is that Windows 10 won't access more than 4 sockets.)
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/5BN4OWEHKAY...
Give direct internet access to Redhat RHVM/Ovirt Vms from an AWS bare metal host
by Eugène Ngontang
Hi,
I’ve set up a *RHVM/Ovirt* host on AWS using a bare metal instance.
Everything is working, but now I would like to give direct internet access
to the VMs created inside this host. Currently those VMs reach the internet
through an SSH-forwarded Squid proxy.
I can't find a way to set up that direct internet access for the underlying VMs.
Can anyone here advise or point me to a good doc link?
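For illustration only, one shape a "direct" path can take is host-side NAT from the VM network out through the instance's primary interface; the bridge name, subnet and uplink below are placeholders, not taken from this setup:

# Enable forwarding and masquerade VM traffic out of the host's AWS-facing NIC.
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -s 192.168.122.0/24 -o eth0 -j MASQUERADE
iptables -A FORWARD -i ovirtmgmt -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o ovirtmgmt -m state --state RELATED,ESTABLISHED -j ACCEPT

Plain bridging alone usually won't work in a VPC, because AWS only delivers traffic for addresses it knows about, so either NAT on the host or AWS-assigned secondary IPs is needed.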
Best regards,
Eugène NG
--
LesCDN <http://lescdn.com>
engontang(a)lescdn.com
------------------------------------------------------------
*Men need a leader, and the leader needs men! The habit does not make the
monk, but when people see you, they judge you!*
GlusterFS poor performance
by francesco@shellrent.com
Hi all,
I'm running a GlusterFS setup, v8.6, with two nodes and one arbiter. Both nodes and the arbiter are CentOS 8 Stream with oVirt 4.4. Under Gluster I have an LVM thin partition.
VMs running in this cluster have really poor write performance, while a test performed directly on the disk scores about 300 MB/s.
dd test on host1:
[root@ovirt-host1 tmp]# dd if=/dev/zero of=./foo.dat bs=256M count=1 oflag=dsync
1+0 records in
1+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 0.839861 s, 320 MB/s
dd test on host1 on gluster:
[root@ovirt-host1 tmp]# dd if=/dev/zero of=/rhev/data-center/mnt/glusterSD/ovirt-host1:_data/foo.dat bs=256M count=1 oflag=dsync
1+0 records in
1+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 50.6889 s, 5.3 MB/s
Nonetheless, the write result inside a VM in the cluster is a little bit faster (dd results vary from 15 MB/s to 60 MB/s), and this is very strange to me:
root@vm1-ha:/tmp# dd if=/dev/zero of=./foo.dat bs=256M count=1 oflag=dsync; rm -f ./foo.dat
1+0 records in
1+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 5.58727 s, 48.0 MB/s
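As a side note, a single 256 MB dsync write is a fairly punishing benchmark; something like the following fio run (assuming fio is installed; the parameters are only an example) may give a more representative picture of VM-style sequential I/O:

# Sequential write with O_DIRECT, 1M blocks, same total size as the dd test above.
fio --name=seqwrite --filename=./fio-test.dat --rw=write --bs=1M --size=256M \
    --direct=1 --ioengine=libaio --iodepth=16 --group_reporting
rm -f ./fio-test.dat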
Here's the actual Gluster configuration; I also applied some parameters from /var/lib/glusterd/groups/virt, as mentioned in other related oVirt threads I found.
gluster volume info data
Volume Name: data
Type: Replicate
Volume ID: 09b532eb-57de-4c29-862d-93993c990e32
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt-host1:/gluster_bricks/data/data
Brick2: ovirt-host2:/gluster_bricks/data/data
Brick3: ovirt-arbiter:/gluster_bricks/data/data (arbiter)
Options Reconfigured:
server.event-threads: 4
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.server-quorum-type: server
cluster.lookup-optimize: off
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.choose-local: off
client.event-threads: 4
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
performance.low-prio-threads: 32
performance.strict-o-direct: on
network.remote-dio: off
network.ping-timeout: 30
user.cifs: off
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
cluster.eager-lock: enable
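For reference, the whole virt profile can also be applied in one step instead of option by option (this assumes the groups/virt file shipped with the installed Gluster packages is the one you want):

gluster volume set data group virt

Running gluster volume info data afterwards should show the group's options set together.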
The speed between the two hosts is about 1 Gb/s:
[root@ovirt-host1 ~]# iperf3 -c ovirt-host2 -p 5002
Connecting to host ovirt-host2 port 5002
[ 5] local x.x.x.x port 58072 connected to y.y.y.y port 5002
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 112 MBytes 938 Mbits/sec 117 375 KBytes
[ 5] 1.00-2.00 sec 112 MBytes 937 Mbits/sec 0 397 KBytes
[ 5] 2.00-3.00 sec 110 MBytes 924 Mbits/sec 18 344 KBytes
[ 5] 3.00-4.00 sec 112 MBytes 936 Mbits/sec 0 369 KBytes
[ 5] 4.00-5.00 sec 111 MBytes 927 Mbits/sec 12 386 KBytes
[ 5] 5.00-6.00 sec 112 MBytes 938 Mbits/sec 0 471 KBytes
[ 5] 6.00-7.00 sec 108 MBytes 909 Mbits/sec 34 382 KBytes
[ 5] 7.00-8.00 sec 112 MBytes 942 Mbits/sec 0 438 KBytes
[ 5] 8.00-9.00 sec 111 MBytes 928 Mbits/sec 38 372 KBytes
[ 5] 9.00-10.00 sec 111 MBytes 934 Mbits/sec 0 481 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.08 GBytes 931 Mbits/sec 219 sender
[ 5] 0.00-10.04 sec 1.08 GBytes 926 Mbits/sec receiver
iperf Done.
Between the nodes and the arbiter it is about 230 Mb/s:
[ 5] local ovirt-arbiter port 45220 connected to ovirt-host1 port 5002
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 30.6 MBytes 257 Mbits/sec 1177 281 KBytes
[ 5] 1.00-2.00 sec 26.2 MBytes 220 Mbits/sec 0 344 KBytes
[ 5] 2.00-3.00 sec 28.8 MBytes 241 Mbits/sec 15 288 KBytes
[ 5] 3.00-4.00 sec 26.2 MBytes 220 Mbits/sec 0 352 KBytes
[ 5] 4.00-5.00 sec 30.0 MBytes 252 Mbits/sec 32 293 KBytes
[ 5] 5.00-6.00 sec 26.2 MBytes 220 Mbits/sec 0 354 KBytes
[ 5] 6.00-7.00 sec 30.0 MBytes 252 Mbits/sec 32 293 KBytes
[ 5] 7.00-8.00 sec 27.5 MBytes 231 Mbits/sec 0 355 KBytes
[ 5] 8.00-9.00 sec 28.8 MBytes 241 Mbits/sec 30 294 KBytes
[ 5] 9.00-10.00 sec 26.2 MBytes 220 Mbits/sec 3 250 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 281 MBytes 235 Mbits/sec 1289 sender
[ 5] 0.00-10.03 sec 277 MBytes 232 Mbits/sec receiver
iperf Done.
I'm definitely missing something obvious, and I'm not a Gluster/oVirt black belt... Can anyone point me in the right direction?
Thank you for your time.
Regards,
Francesco
Best CPU topology for VMs (Socket / Core / Threads)
by duparchy@esrf.fr
Hi,
Given that there is the option to match the CPU's physical topology (Socket / Core / Threads), I guess it can make a difference.
When ?
Linux vs Windows ?
(One example I know is that Windows 10 won't access more than 4 sockets.)
Hosted Engine deployment looks stuck in startup during the deployment
by Eugène Ngontang
Hi,
I'm using an AWS EC2 bare metal instance to deploy RHV-M in order to
create and test NVIDIA GPU VMs.
I'm trying to deploy a self hosted engine version 4.4.
I've set up everything up to the hosted-engine deployment, and the Hosted
Engine deployment looks stuck at engine host startup, then times out many
hours later.
I suspect a networking startup issue but can't really identify it clearly.
During all this time the deployment process is waiting for the hosted
engine to come up before it finishes; the hosted engine itself is up and
running, and is still running now, but is not reachable.
Here attached you will find :
- A screenshot before the timeout
- A screenshot after the timeout (fail)
- The answer file I appended to the hosted-engine command
> hosted-engine --deploy --4 --config-append=hosted-engine.conf
- The deployment log output
- The resulting answer file after the deployment.
I think the problem is at the network startup step, but as I don't have
any explicit error/failure message, I can't tell.
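While the deployment is waiting, a few checks on the host may show whether the local engine VM and its network actually came up (a rough sketch, assuming default oVirt log locations):

# Is the bootstrap engine VM actually running under libvirt?
virsh -r list --all
# Does the management bridge exist and carry an address?
ip addr show ovirtmgmt
# The setup/ansible logs usually carry the real error long before the timeout:
ls -lt /var/log/ovirt-hosted-engine-setup/
tail -n 100 /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-*.log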
Can someone here please advise?
Please let me know if you need any more information from me.
Best regards,
Eugène NG
--
LesCDN <http://lescdn.com>
engontang(a)lescdn.com
------------------------------------------------------------
*Men need a leader, and the leader needs men! The habit does not make the
monk, but when people see you, they judge you!*
Upgrade Hosted Ovirt 4.3 to 4.4 fails in Initialize lockspace volume
by Farag, Pavly
Hello all,
I'm trying to update my oVirt engine with the following specs:
* hosted-engine 4.3 (updated to latest version of 4.3)
* Hosted on Centos 7 based Hosts
* Engine is hosted on iSCSI storage.
The target oVirt engine host is one of the already existing hosts, with a fresh RHEL 8.5 installation.
The update failed during the execution of "hosted-engine --deploy --restore-from-file=backup.bck".
Below is the console log; I'm also attaching some logs, hoping they help:
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Initialize lockspace volume]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["hosted-engine", "--reinitialize-lockspace", "--force"], "delta": "0:00:00.246116", "end": "2022-03-04 14:47:45.602836", "msg": "non-zero return code", "rc": 1, "start": "2022-03-04 14:47:45.356720", "stderr": "Traceback (most recent call last):\n File \"/usr/lib64/python3.6/runpy.py\", line 193, in _run_module_as_main\n \"__main__\", mod_spec)\n File \"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code\n exec(code, run_globals)\n File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py\", line 30, in <module>\n ha_cli.reset_lockspace(force)\n File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py\", line 286, in reset_lockspace\n stats = broker.get_stats_from_storage()\n File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py\", line 148, in get_stats_from_storage\n result = self._proxy.get_stats()\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1112, in __call__\n return self.__send(self.__name, args)\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1452, in __request\n verbose=self.__verbose\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1154, in request\n return self.single_request(host, handler, request_body, verbose)\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1166, in single_request\n http_conn = self.send_request(host, handler, request_body, verbose)\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1279, in send_request\n self.send_content(connection, request_body)\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1309, in send_content\n connection.endheaders(request_body)\n File \"/usr/lib64/python3.6/http/client.py\", line 1264, in endheaders\n self._send_output(message_body, encode_chunked=encode_chunked)\n File \"/usr/lib64/python3.6/http/client.py\", line 1040, in _send_output\n self.send(msg)\n File \"/usr/lib64/python3.6/http/client.py\", line 978, in send\n self.connect()\n File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py\", line 76, in connect\n self.sock.connect(base64.b16decode(self.host))\nFileNotFoundError: [Errno 2] No such file or directory", "stderr_lines": ["Traceback (most recent call last):", " File \"/usr/lib64/python3.6/runpy.py\", line 193, in _run_module_as_main", " \"__main__\", mod_spec)", " File \"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code", " exec(code, run_globals)", " File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py\", line 30, in <module>", " ha_cli.reset_lockspace(force)", " File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py\", line 286, in reset_lockspace", " stats = broker.get_stats_from_storage()", " File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py\", line 148, in get_stats_from_storage", " result = self._proxy.get_stats()", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1112, in __call__", " return self.__send(self.__name, args)", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1452, in __request", " verbose=self.__verbose", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1154, in request", " return self.single_request(host, handler, request_body, verbose)", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1166, in single_request", " http_conn = self.send_request(host, handler, request_body, verbose)", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1279, in 
send_request", " self.send_content(connection, request_body)", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1309, in send_content", " connection.endheaders(request_body)", " File \"/usr/lib64/python3.6/http/client.py\", line 1264, in endheaders", " self._send_output(message_body, encode_chunked=encode_chunked)", " File \"/usr/lib64/python3.6/http/client.py\", line 1040, in _send_output", " self.send(msg)", " File \"/usr/lib64/python3.6/http/client.py\", line 978, in send", " self.connect()", " File \"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py\", line 76, in connect", " self.sock.connect(base64.b16decode(self.host))", "FileNotFoundError: [Errno 2] No such file or directory"], "stdout": "", "stdout_lines": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
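The traceback ends in FileNotFoundError while connecting to a Unix socket, which usually means the reinitialize-lockspace call could not reach the ovirt-ha-broker socket on the host. A rough first check (standard HA service names assumed):

# Are the hosted-engine HA services running on the new host?
systemctl status ovirt-ha-broker ovirt-ha-agent
journalctl -u ovirt-ha-broker --since "1 hour ago"
# Once the broker is up, the failing step can be retried by hand:
hosted-engine --reinitialize-lockspace --force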
Thanks a lot for your help,
Kind regards,
Pavly
New virtual machine default properties
by ravi k
Hello,
We are currently using oVirt version 4.3.10.4-1.0.22.el7. When we click on "create new VM", the OS is selected as "Other OS" and "Optimized for" is Desktop. I'm creating the VMs from Ansible. The problem is that the video type defaults to QXL, due to which the VM fails to start with the error "Exit message: unsupported configuration: this QEMU does not support 'qxl' video device."
It used to default to VGA, because this playbook was working earlier, so I'm guessing some config has been changed that is causing this. But I was not able to find where this default is set. Can you please help? I checked to see if I can set it in the Ansible playbook, but Ansible's ovirt_vm module only provides headless_mode and protocol options. It would be nice if it provided a video type parameter as well.
Is there any config file in the engine that I can tweak to set the default values for a new VM creation?
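In case it helps, the per-OS defaults on the engine normally come from the osinfo properties and can be overridden from a drop-in file; the property key and value below are illustrative only and should be checked against /usr/share/ovirt-engine/conf/osinfo-defaults.properties on your engine before use:

# On the engine host: override the display/video defaults for the "Other OS" type.
cat > /etc/ovirt-engine/osinfo.conf.d/10-video-defaults.properties <<'EOF'
os.other.devices.display.protocols.value = vnc/vga,spice/qxl
EOF
# Restart the engine so the override is picked up:
systemctl restart ovirt-engine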
Regards,
Ravi