I am encountering the following issue on a single-instance hyper-converged setup.
The following fio test was done:
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
--filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randwrite
The results are very poor when running the test inside a VM with a preallocated
disk on the SSD store: ~2k IOPS
Same test done on the oVirt node directly on the mounted ssd_lvm: ~30k IOPS
Same test done, this time on the gluster mount path: ~20k IOPS
What could cause the VMs to have such slow disk performance (~2k IOPS, on
SSD!)?
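For comparing the three runs programmatically, the IOPS figure can be pulled out of fio's human-readable output with a little awk; a minimal sketch (the sample result line below is made up for illustration, not taken from the actual runs):

```shell
# Extract the write IOPS figure from a fio result line of the form:
#   write: IOPS=2048, BW=8.00MiB/s ...
# The sample line is illustrative only.
sample='  write: IOPS=2048, BW=8.00MiB/s (8389kB/s)(4096MiB/524288msec)'
iops=$(printf '%s\n' "$sample" | awk -F'IOPS=' '/IOPS=/ {split($2, a, ","); print a[1]}')
echo "measured IOPS: $iops"
# prints: measured IOPS: 2048
```

With `--output-format=json`, fio can also emit machine-readable results directly, which avoids parsing text at all.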
Thank you very much !
Best regards, Leo David
Upcoming in 3.6 are enhancements for managing the hosted engine VM.
In short, we want to:
* Allow editing the Hosted Engine VM, its storage domain, disks, networks, etc.
* Have a shared configuration for the hosted engine VM
* Have a backup for the hosted engine VM configuration
Please review and comment on the wiki below:
OK, thanks! I saw that after upgrading from 4.0.4 to 4.0.5 the DB
immediately dropped by around 500MB and is now about 2GB smaller.
Does this sound familiar to you with other settings in 4.0.5?
2017-01-08 10:45 GMT+01:00 Shirly Radco <sradco(a)redhat.com>:
> No. That will corrupt your database.
> Are you using the full dwh or the smaller version for the dashboards?
> Please set the delete thresholds to keep less data; data older than
> the time you set will be deleted.
> Add a file to /ovirt-engine-dwhd.conf.d/
> Add these lines with the new configurations. The numbers represent the hours
> to keep the data.
> These are the configurations for a full dwh.
> The smaller version configurations are:
> The delete process runs by default at 3am every day (DWH_DELETE_JOB_HOUR=3).
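The thresholds described above can be sketched as a drop-in file; the variable names are the usual ovirt-engine-dwh ones, but the filename and the numbers below are illustrative assumptions, not recommendations for this setup:

```
# 99-delete-thresholds.conf -- example drop-in for the dwhd conf.d
# directory (filename is an assumption). Values are hours of data to
# keep; the numbers below are illustrative only.
DWH_TABLES_KEEP_SAMPLES=24    # per-sample data: keep 1 day
DWH_TABLES_KEEP_HOURLY=720    # hourly aggregates: keep 30 days
DWH_TABLES_KEEP_DAILY=0       # daily aggregates: keep none
DWH_DELETE_JOB_HOUR=3         # run the delete job at 3am
```

Restart the ovirt-engine-dwhd service after changing these for them to take effect.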
> Best regards,
> Shirly Radco
> BI Software Engineer
> Red Hat Israel Ltd.
> 34 Jerusalem Road
> Building A, 4th floor
> Ra'anana, Israel 4350109
> On Fri, Jan 6, 2017 at 6:35 PM, Matt . <yamakasi.014(a)gmail.com> wrote:
>> I seem to have some large database for the DWH logging and I wonder
>> how I can empty it safely.
>> Can I just simply empty the database ?
>> Have a good weekend!
Dear oVirt Gurus,
Using the oVirt user VM portal does not seem to work through the squid proxy setup (configured as per the guide). The page loads and login works fine through the proxy, but the asynchronous requests just hang. I've attached a screenshot, but you can see the "api" endpoint just hanging in a web inspector:
This works fine when not going through the proxy.
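If the hanging /api calls turn out to be long-poll requests being cut off or held by squid's defaults, the relevant squid.conf timeouts can be raised; a hedged sketch (the directive names are standard squid directives, but the values are illustrative and whether timeouts are the cause here is an assumption):

```
# squid.conf fragment -- values are illustrative
read_timeout 15 minutes      # server-side read inactivity timeout
request_timeout 5 minutes    # max wait for a complete client request
connect_timeout 1 minute     # TCP connect to the engine host
```

Checking squid's access.log for the status of the stuck /api requests would confirm or rule this out.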
Is there a way to force noVNC HTML as the console mode through the web-ui, or at least have it as an option if not default?
The console seems not to work when logged in with a base 'user role'.
Research Computing Core
Wellcome Trust Centre for Human Genetics
University of Oxford
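Regarding forcing noVNC as the default: the engine has a configuration key for the default VNC client implementation; a hedged sketch (the key name ClientModeVncDefault is my assumption from memory, so verify it against `engine-config --list` first):

```
# On the engine host: set the default VNC console client to noVNC.
# Key name is an assumption -- check 'engine-config --list' first.
engine-config -s ClientModeVncDefault=NoVnc
systemctl restart ovirt-engine   # restart for the change to take effect
```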
I'm running Version 184.108.40.206-1.el7, and after a reboot of the engine machine I could no longer log in to the administration portal, with this error:
sun.security.validator.ValidatorException: PKIX path validation failed
java.security.cert.CertPathValidatorException: validity check failed
I'm using a self signed cert.
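A "validity check failed" usually means the certificate has expired; a quick way to confirm is to inspect the cert's validity window with openssl. The sketch below generates a throwaway self-signed cert so the example is self-contained; on the engine you would point at its own certificate instead (the CA path in the comment is the usual oVirt default, but worth verifying):

```shell
# Show a certificate's validity window (notBefore/notAfter).
# A throwaway self-signed cert is generated here for illustration;
# on the engine, inspect e.g. /etc/pki/ovirt-engine/ca.pem instead
# (path is the usual default, verify on your system).
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=example" \
    -days 1 -keyout "$tmp/key.pem" -out "$tmp/cert.pem" 2>/dev/null
openssl x509 -in "$tmp/cert.pem" -noout -dates
```

If notAfter is in the past, the engine certificates need to be renewed (engine-setup offers to renew the PKI).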
Once again my production ovirt cluster is collapsing in on itself. My
servers are intermittently unavailable or degraded, customers are noticing
and calling in. This seems to be yet another gluster failure that I
haven't been able to pin down.
I posted about this a while ago, but didn't get anywhere (no replies that I
found). The problem started out as a glusterfsd process consuming large
amounts of ram (up to the point where ram and swap were exhausted and the
kernel OOM killer killed off the glusterfsd process). For reasons not
clear to me at this time, that resulted in any VMs running on that host and
that gluster volume being paused with an I/O error (the glusterfs process is
usually unharmed; why it didn't continue I/O with the other servers is
confusing to me).
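One way to catch a runaway glusterfsd before the OOM killer does is a small resident-memory watchdog; a minimal sketch (the 8 GiB threshold is an arbitrary example, and this only observes the symptom, it is not a fix):

```shell
# check_rss: warn when a process's resident memory (in kB, as reported
# by ps) exceeds a limit. Threshold is an arbitrary example.
check_rss() {
    pid=$1
    limit_kb=$2
    rss_kb=$(ps -o rss= -p "$pid" | tr -d ' ')
    if [ "${rss_kb:-0}" -gt "$limit_kb" ]; then
        echo "WARN: pid $pid rss ${rss_kb}kB exceeds ${limit_kb}kB"
    else
        echo "OK: pid $pid rss ${rss_kb}kB"
    fi
}
# Example: check this shell itself against an 8 GiB limit.
check_rss $$ 8388608
```

In practice you would run this from cron against `pidof glusterfsd` and alert (or migrate VMs) instead of just echoing.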
I have 3 servers and a total of 4 gluster volumes (engine, iso, data, and
data-hdd). The first 3 are replica 2+arb; the 4th (data-hdd) is replica
3. The first 3 are backed by an LVM partition (some thin provisioned) on
an SSD; the 4th is on a seagate hybrid disk (hdd + some internal flash for
acceleration). data-hdd is the only thing on the disk. Servers are Dell
R610 with the PERC/6i raid card, with the disks individually passed through
to the OS (no raid enabled).
The above RAM usage issue came from the data-hdd volume. Yesterday, I
caught one of the glusterfsd high-RAM-usage episodes before the OOM killer
had to run. I was able to migrate the VMs off the machine and, for good measure,
reboot the entire machine (after taking this opportunity to run the
software updates that ovirt said were pending). Upon booting back up, the
necessary volume healing began. However, this time, the healing caused all
three servers to go to very, very high load averages (I saw just under 200
on one server; typically they've been 40-70) with top reporting IO Wait at
7-20%. Network for this volume is a dedicated gig network. According to
bwm-ng, initially the network bandwidth would hit 50MB/s (yes, bytes), but
tailed off to mostly in the kB/s for a while. All machines' load averages
were still 40+ and gluster volume heal data-hdd info reported 5 items
needing healing. Servers were intermittently experiencing IO issues, even
on the 3 gluster volumes that appeared largely unaffected. Even the OS
activities on the hosts itself (logging in, running commands) would often
be very delayed. The ovirt engine was seemingly randomly throwing engine
down / engine up / engine failed notifications. Responsiveness on ANY VM
was horrific most of the time, with random VMs being inaccessible.
I let the gluster heal run overnight. By morning, there were still 5 items
needing healing, all three servers were still experiencing high load, and
servers were still largely unstable.
I've noticed that all of my ovirt outages (and I've had a lot, way more
than is acceptable for a production cluster) have come from gluster. I
still have 3 VMs whose hard disk images have become corrupted by my last
gluster crash that I haven't had time to repair / rebuild yet (I believe
this crash was caused by the OOM issue previously mentioned, but I didn't
know it at the time).
Is gluster really ready for production yet? It seems so unstable to
me.... I'm looking at replacing gluster with a dedicated NFS server, likely
FreeNAS. Any suggestions? What is the "right" way to do production
storage on a cluster like this (3 nodes)? Can I get this gluster volume stable
enough to get my VMs to run reliably again until I can deploy another
Property: default route, host - true, DC - false
I have 4 nics.
bond0 = 2 x 10g
eno1 = ovirtmgmt
eno2 = for vm traffic.
eno2 says it's out of sync; the host's network config differs from the DC:
default route: host - true, DC - false.
I have tried "Sync All Networks" but the message still remains.
Where can I look to fix the issue?
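Since the property flagged out of sync is the default route, a place to start is checking which interface actually carries the host's default route and comparing it against the network the DC marks as the default-route network; a minimal sketch (the example output line is illustrative):

```shell
# Show the host's current default route and its outgoing interface;
# compare this against the network the DC expects to carry the
# default route.
ip route show default
# e.g. default via 192.0.2.1 dev ovirtmgmt proto static metric 100
```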
I am currently building a new virtualization cluster with oVirt, using
AMD EPYC processors (AMD EPYC 7351P). At the moment I'm running oVirt
Node Version 4.2.3 @ CentOS 7.4.1708.
We have the situation that the processor type is recognized as "AMD
Opteron G3". With this instruction set the VMs are not able to
do AES in hardware, which results in poor performance in our case.
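The missing hardware AES can be confirmed from inside a guest by checking the CPU flags the hypervisor exposes; a minimal sketch:

```shell
# Inside the VM: report whether the virtual CPU advertises the AES
# instruction-set extension (it is absent when the cluster CPU type
# is an old baseline such as Opteron G3).
if grep -qw aes /proc/cpuinfo; then
    echo "aes: available"
else
    echo "aes: missing"
fi
```

The same check on the host itself should show the flag, since the physical EPYC supports it; the gap between host and guest flags is the cluster CPU type masking them.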
I found some information suggesting that this problem should be
solved with CentOS 7.5.
My actual questions:
- Is there any further information about AMD EPYC support?
- Any information about an update of the oVirt node to CentOS 7.5?