cpu QoS doesn't work
by pub.virtualization@gmail.com
Hello,
I set CPU QoS to 10 and applied it to a VM on oVirt 4.2, but it doesn't seem to work.
Compared to a VM without QoS, there was no difference in CPU usage.
Also, there was no <period>...</period> or <quota>...</quota> element related to QoS in the libvirt file.
Is this the expected result?
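For reference, one way to check a dumped domain XML (for example the output of `virsh dumpxml`) for these elements is sketched below. In libvirt domain XML, `<period>` and `<quota>` are children of `<cputune>`. The sample XML, the VM name and the helper name `cpu_limits` are made up for illustration and are not taken from the setup described above.

```python
import xml.etree.ElementTree as ET


def cpu_limits(domain_xml: str) -> dict:
    """Return the <cputune> period and quota found in a libvirt domain XML string.

    A value of None means the element is absent from the XML.
    """
    root = ET.fromstring(domain_xml)
    limits = {}
    for name in ("period", "quota"):
        text = root.findtext(f"cputune/{name}")
        limits[name] = int(text) if text is not None else None
    return limits


# Hypothetical sample, only to show the shape of the elements being looked for.
SAMPLE = """
<domain type='kvm'>
  <name>demo-vm</name>
  <cputune>
    <period>100000</period>
    <quota>10000</quota>
  </cputune>
</domain>
"""

if __name__ == "__main__":
    print(cpu_limits(SAMPLE))
```

If both values come back as `None`, the domain XML carries no CFS period or quota, which matches the observation above.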
Thanks,
2 years, 8 months
Unable to create a node in oVirt 4.0
by Rodrigo G. López
Hi there,
We are trying to set up a node on the same machine where we are running
the engine, and noticed that the vdsmd service fails because, as far as I
can tell, the supervdsmd daemon can't authenticate against libvirtd.
The error on supervdsmd is the following:
daemonAdapter[17803]: libvirt: XML-RPC error : authentication
failed: authentication failed
...
and in libvirtd:
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.410+0000: 17775: error : virNetSocketReadWire:1806 : End of
file while reading data: Input/output error
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.612+0000: 17776: error : virNetSASLSessionListMechanisms:393 :
internal error: cannot list SASL mechanisms -4 (SASL(-4): no mechanism
available: Internal Error -4 in server.c near line 1757)
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.612+0000: 17776: error : remoteDispatchAuthSaslInit:3440 :
authentication failed: authentication failed
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.612+0000: 17775: error : virNetSocketReadWire:1806 : End of
file while reading data: Input/output error
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.814+0000: 17778: error : virNetSASLSessionListMechanisms:393 :
internal error: cannot list SASL mechanisms -4 (SASL(-4): no mechanism
available: Internal Error -4 in server.c near line 1757)
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.814+0000: 17778: error : remoteDispatchAuthSaslInit:3440 :
authentication failed: authentication failed
Sep 15 03:34:18 ovirt-test libvirtd[17775]: 2020-09-15
07:34:18.815+0000: 17775: error : virNetSocketReadWire:1806 : End of
file while reading data: Input/output error
Sep 15 03:34:19 ovirt-test libvirtd[17775]: 2020-09-15
07:34:19.017+0000: 17780: error : virNetSASLSessionListMechanisms:393 :
internal error: cannot list SASL mechanisms -4 (SASL(-4): no mechanism
available: Internal Error -4 in server.c near line 1757)
Sep 15 03:34:19 ovirt-test libvirtd[17775]: 2020-09-15
07:34:19.017+0000: 17780: error : remoteDispatchAuthSaslInit:3440 :
authentication failed: authentication failed
Sep 15 03:34:19 ovirt-test libvirtd[17775]: 2020-09-15
07:34:19.020+0000: 17775: error : virNetSocketReadWire:1806 : End of
file while reading data: Input/output error
Is there any way to work around that?
We have a working infrastructure on top of 4.0 on CentOS 7 systems, and we
would like to replicate the exact same environment for availability purposes,
in case anything bad happens.
Best regards,
-rodri
2 years, 8 months
Removal of deprecated init-scripts (network-scripts)
by Ales Musil
Hello,
The network-scripts backend for host networking has been deprecated since
oVirt 4.4 and will be removed completely in the 4.4.3 release. No action is
required for setups that did not change the configuration to use the
network-scripts backend (net_nmstate_enabled = false).
Users who did disable nmstate should redeploy all affected hosts before
4.4.3.
Also, if you did switch to network-scripts, could you please tell us what
the reason was?
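To see whether a host is affected, one could check its configuration for the setting mentioned above. The following is a minimal sketch that parses INI-style configuration text. The section name `[vars]` and the helper name `nmstate_disabled` are assumptions made for illustration; only the option name `net_nmstate_enabled` comes from this announcement.

```python
import configparser


def nmstate_disabled(conf_text: str, section: str = "vars") -> bool:
    """Return True if the config text explicitly sets net_nmstate_enabled to false.

    The section name "vars" is an assumption. A missing section or option is
    treated as "nmstate enabled", matching the statement that unchanged
    setups need no action.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    enabled = parser.getboolean(section, "net_nmstate_enabled", fallback=True)
    return not enabled


if __name__ == "__main__":
    sample = "[vars]\nnet_nmstate_enabled = false\n"
    print(nmstate_disabled(sample))
```

A host for which this returns `True` is one that would need to be redeployed before 4.4.3.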
Thank you.
Best regards,
Ales Musil
--
Ales Musil
Software Engineer - RHV Network
Red Hat EMEA <https://www.redhat.com>
amusil(a)redhat.com IM: amusil
<https://red.ht/sig>
2 years, 8 months
Random hosts disconnects
by Anton Louw
Hi All,
I have a strange issue in my oVirt environment. I currently have a standalone manager which is running in VMware. In my oVirt environment, I have two Data Centers. The manager is currently sitting on the same subnet as DC1. Randomly, hosts in DC2 will say "Not Responding" and then 2 seconds later, the hosts will activate again.
The strange thing is that when the manager was sitting on the same subnet as DC2, hosts in DC1 would randomly show "Not Responding".
I have tried going through the logs, but I cannot see anything out of the ordinary regarding why the hosts would drop connection. I have attached the engine.log for anybody that would like to do a spot check.
Thanks
Anton Louw
Cloud Engineer: Storage and Virtualization
______________________________________
D: 087 805 1572 | M: N/A
A: Rutherford Estate, 1 Scott Street, Waverley, Johannesburg
anton.louw(a)voxtelecom.co.za
www.vox.co.za
2 years, 8 months
Enable a cluster node to run the hosted engine
by rap@isogmbh.de
Hi there,
Currently my team is evaluating oVirt, and we're also testing several failure scenarios, backups and so on.
One scenario was:
- hyperconverged oVirt cluster with 3 nodes
- self-hosted engine
- simulate the breakdown of one of the nodes by powering it off
- to replace it, do a clean install of a new node and reintegrate it into the cluster
Everything actually worked out fine. The newly installed node and its related bricks (vmstore, data, engine) were added to the existing Gluster storage, and the node was added to the oVirt cluster (as a host).
But there's one remaining problem: the new host doesn't have the grey crown, which means it's unable to run the hosted engine. How can I achieve that?
I also found out that ovirt-ha-agent and ovirt-ha-broker aren't started/enabled on that node. The reason is that /etc/ovirt-hosted-engine/hosted-engine.conf doesn't exist. I guess this is a problem not only for the hosted engine, but also for HA VMs.
Thank you for any advice and greetings,
Marcus
2 years, 8 months
Disconnected Server has closed the connection.
by info@worldhostess.com
It seems that the installation is all done, but I have a problem. It takes very long to open the web pages, and it disconnects all the time. It is impossible to do anything.
I can ping the hostname, as I set up a sub-domain for it. To be honest, I am new to this and it took me days to get to this point. I think there are some issues with my network settings.
If there are any oVirt experts who can check my installation and give me advice about how to improve it, it will be greatly appreciated.
I followed "Installing oVirt as a self-hosted engine using the Cockpit web interface".
2 years, 8 months
Gluster quorum issue on 3-node HCI with extra 5-nodes as compute and storage nodes
by thomas@hoberg.net
Yes, I've also posted this on the Gluster Slack. But I am using Gluster mostly because it's part of oVirt HCI, so don't just send me away, please!
Problem: GlusterD refusing to start due to quorum issues for volumes where it isn’t contributing any brick
(I've had this before on a different farm, but there it was transitory. Now I have it in a more observable manner, that's why I open a new topic)
In a test farm with recycled servers, I started running Gluster via oVirt 3node-HCI, because I got 3 machines originally.
They were set up as group A in a 2:1 (replica:arbiter) oVirt HCI setup with 'engine', 'vmstore' and 'data' volumes, one brick on each node.
I then got another five machines with hardware specs that were rather different to group A, so I set those up as group B to mostly act as compute nodes, but also to provide extra storage, mostly to be used externally as GlusterFS shares. It took a bit of fiddling with Ansible but I got these 5 nodes to serve two more Gluster volumes 'tape' and 'scratch' using dispersed bricks (4 disperse:1 redundancy), RAID5 in my mind.
The two groups are in one Gluster, not because they serve bricks to the same volumes, but because oVirt doesn't like nodes to be in different Glusters (or actually, to already be in a Gluster when you add them as host node). But the two groups provide bricks to distinct volumes, there is no overlap.
After setup, things ran fine for weeks, but now I needed to restart a machine from group B, which has ‘tape’ and ‘scratch’ bricks, but none from the original oVirt ‘engine’, ‘vmstore’ and ‘data’ volumes in group A. Yet the gluster daemon refuses to start, citing a loss of quorum for these three volumes, even though it has no bricks in them… which makes no sense to me.
I am afraid the source of the issue is conceptual: I clearly don't really understand some design assumptions of Gluster.
And I'm afraid the design assumptions of Gluster and of oVirt (even with HCI), are not as related as one might assume from the marketing materials on the oVirt home-page.
But most of all I'd like to know: How do I fix this now?
I can't heal 'tape' and 'scratch', which are growing ever more apart while the glusterd on this machine in group B refuses to come online for lack of a quorum on volumes where it is not contributing bricks.
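To make the quorum arithmetic concrete, here is a small sketch under a loudly stated assumption: that the quorum in question is a server-side quorum counted over all peers in the trusted pool (3 + 5 = 8 here), with a simple majority required when no ratio is configured. Whether that is what glusterd actually evaluates in this setup is not confirmed by anything above, and the function name and parameters are hypothetical.

```python
import math
from typing import Optional


def server_quorum_met(active_peers: int, total_peers: int,
                      ratio_percent: Optional[float] = None) -> bool:
    """Return True if enough peers are active for a pool-wide quorum.

    Assumption: without a configured ratio, a simple majority
    (more than half of all peers) is required; with a ratio, at least
    that percentage of peers must be active.
    """
    if ratio_percent is None:
        required = total_peers // 2 + 1
    else:
        required = math.ceil(total_peers * ratio_percent / 100)
    return active_peers >= required


if __name__ == "__main__":
    # 8 peers in the pool, one node from group B restarting.
    print(server_quorum_met(active_peers=7, total_peers=8))
```

Under that assumption, 7 of 8 active peers would satisfy a majority quorum, so the arithmetic alone does not explain the refusal to start; it only shows what a pool-wide count would look like.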
2 years, 8 months
What is the purpose of memory deflation in oVirt memory ballooning?
by pub.virtualization@gmail.com
Hi, guys
Why does momd (the ballooning manager in oVirt) explicitly deflate the balloon when the host has plenty of memory?
As far as I know, momd supports memory ballooning through the setMemory API, which inflates/deflates the balloon in the guest,
and I've just checked how memory in the guest changes after inflating the balloon.
As expected, memory (total, free, available) in the guest was reduced right after inflating the balloon, but it was "automatically" restored to its initial value after a few seconds.
So I'm wondering why an explicit deflation is additionally required, even though the memory is restored automatically within seconds.
Thanks.
2 years, 8 months