Re: huge page in ovirt 4.2.7
by Sharon Gratch
Hi Fabrice,
The "hugepages" custom property value in oVirt should be set to *size of
the pages in KiB* (i.e. 1GiB = 1048576, 2MiB = 2048).
In addition, it is recommended to set the huge page size of the VM to the
largest size supported by the host.
In the configuration you sent, the VM's huge page size is set to 64 KiB.
Since the VM's guaranteed memory is 32,768 MiB, it requires at least
32768 * 1024 / 64 = 524,288 pages. Since only 120 huge pages are declared on
the host, starting the VM failed with the error "...there are not
enough free huge pages to run the VM".
To solve the problem, please change the VM's huge page size to match the
largest page size supported by the host, which is 1 GiB, so the hugepages
value should be 1048576 (i.e. 1 GiB expressed in KiB) instead of 64:
<custom_property>
<name>hugepages</name>
<value>*1048576*</value>
</custom_property>
Please note that since the VM's total memory size is no more than 64 GiB,
the VM will need only 64 pages, which is fewer than the 120 pages reserved
on the host, so it fits.
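As a quick sanity check on the host, you can compare the number of pages the
VM needs with the number of free 1 GiB pages. A minimal sketch (assuming 1 GiB
pages and the 64 GiB VM memory from your configuration; adjust for your setup):

$ vm_mem_kib=$((64 * 1024 * 1024))    # VM memory in KiB (64 GiB)
$ page_kib=1048576                    # huge page size in KiB (1 GiB)
$ echo "pages needed: $((vm_mem_kib / page_kib))"
pages needed: 64
$ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages
120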
Hope this helps.
Regards,
Sharon
On Wed, Nov 14, 2018 at 2:11 PM, Fabrice Bacchella <
fabrice.bacchella(a)orange.fr> wrote:
> I'm trying to understand huge pages in oVirt, but I'm not quite sure I
> understand them correctly.
>
> I have a host with 128 GiB of RAM. I have configured reserved huge pages:
>
> cat /proc/cmdline
>
> ... hugepagesz=1GB hugepages=120
>
> $ grep -r . /sys/kernel/mm/hugepages
> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_overcommit_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages:120
> /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages_mempolicy:120
> /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages:120
> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_overcommit_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages_mempolicy:0
> /sys/kernel/mm/hugepages/hugepages-2048kB/surplus_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-2048kB/resv_hugepages:0
> /sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages:0
>
> I have a big VM running on it:
> <custom_properties>
> <custom_property>
> <name>hugepages</name>
> <value>64</value>
> </custom_property>
> </custom_properties>
> <memory>68719476736</memory>, aka 65536 MiB
> <memory_policy>
> <guaranteed>34359738368</guaranteed>, aka 32768 MiB
> <max>68719476736</max>
> </memory_policy>
>
> And it keeps failing when I want to start it:
> /var/log/ovirt-engine/engine.log:2018-11-14 12:56:06,937+01 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (default task-66) [13c13a2c-f973-4ba2-b8bd-260e5b35a047] EVENT_ID:
> USER_FAILED_RUN_VM(54), Failed to run VM XXX due to a failed validation:
> [Cannot run VM. There is no host that satisfies current scheduling
> constraints. See below for details:, The host XXX did not satisfy internal
> filter HugePages because there are not enough free huge pages to run the
> VM.]
>
> The huge page fs is mounted:
>
> $ findmnt
> | |-/dev/hugepages1G  hugetlbfs  hugetlbfs  rw,relatime,pagesize=1G
> | `-/dev/hugepages    hugetlbfs  hugetlbfs  rw,relatime
>
> What am I missing?
>
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users(a)ovirt.org/message/VTYKTSSAXQQLS5HO5KOQSBDIHPTAHTOR/
>
>
Re: ERROR running your engine inside of the hosted-engine VM and are not in "Global Maintenance" mode
by Simone Tiraboschi
On Tue, Feb 5, 2019 at 1:46 PM Martin Humaj <mhumaj(a)gmail.com> wrote:
> Hi, the problem is that ovirt-engine is running on a different VM, i.e. a
> virtual machine running under the hosts
>
> !! Cluster is in GLOBAL MAINTENANCE mode !!
> I can set it on the host machine but not on the ovirt-engine vm
>
>
Sorry,
one thing more to mention: the check is performed against the latest
information recorded in the DB.
So if the engine is down, it's not going to update the DB and so it will
never update the global maintenance status field.
If you are sure that you are in global maintenance mode and you want to
skip the check entirely, you can execute:
engine-setup
--otopi-environment=OVESETUP_CONFIG/continueSetupOnHEVM=bool:True
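Before using that override, it's worth double-checking from one of the hosts
that global maintenance is really set. A quick check (the exact banner text
may vary between versions):

$ hosted-engine --vm-status | grep -i maintenance
!! Cluster is in GLOBAL MAINTENANCE mode !!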
>
>
>
>
> On Tue, Feb 5, 2019 at 1:28 PM Simone Tiraboschi <stirabos(a)redhat.com>
> wrote:
>
>>
>>
>> On Tue, Feb 5, 2019 at 12:31 PM <mhumaj(a)gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We ran the oVirt upgrade to 4.3. After the upgrade we wanted to run engine-setup,
>>> but we do not know how to put into global maintenance this host, which is simply
>>> another virtual machine running ovirt-engine. The hosted engine is running on the hosts.
>>>
>>> During execution engine service will be stopped (OK, Cancel)
>>> [OK]:
>>> [ ERROR ] It seems that you are running your engine inside of the
>>> hosted-engine VM and are not in "Global Maintenance" mode.
>>> In that case you should put the system into the "Global
>>> Maintenance" mode before running engine-setup, or the hosted-engine HA
>>> agent might kill the machine, which might corrupt your data.
>>>
>>> [ ERROR ] Failed to execute stage 'Setup validation': Hosted Engine
>>> setup detected, but Global Maintenance is not set.
>>> [ INFO ] Stage: Clean up
>>> Log file is located at
>>> /var/log/ovirt-engine/setup/ovirt-engine-setup-20190205121802-l7llrw.log
>>> [ INFO ] Generating answer file
>>> '/var/lib/ovirt-engine/setup/answers/20190205121855-setup.conf'
>>> [ INFO ] Stage: Pre-termination
>>> [ INFO ] Stage: Termination
>>> [ ERROR ] Execution of setup failed
>>>
>>> from the hosted nodes
>>>
>>> --== Host 2 status ==--
>>>
>>> Host ID : 2
>>> Engine status : {"reason": "vm not running on this
>>> host", "health": "bad", "vm": "down", "detail": "unknown"}
>>> Can anyone please tell me how to enable global maintenance on the virtual
>>> machine where the ovirt-engine is, not on the hosts? Even if I put the hosts into
>>> global maintenance, I am unable to run engine-setup on the VM with ovirt-engine.
>>>
>>
>> Run this on one of your hosts:
>> hosted-engine --set-maintenance --mode=global
>> or from the webadmin UI, as you prefer.
>>
>>
>>>
>>> thanks
>>> _______________________________________________
>>> Users mailing list -- users(a)ovirt.org
>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/3YJQVUDKVC2...
>>>
>>
4.3 rc2: info about shutdown of last host and networking page
by Gianluca Cecchi
Hello,
Environment: single-host HCI deployment from the oVirt Node NG ISO.
After powering off the VMs, putting the cluster into global maintenance, and
shutting down the hosted engine, I then ran shutdown on the host. I think this
would also simulate a total shutdown scenario, where the final step is shutting
down the last host.
This is what I see on the console about failures in several unmounts, both /var
and /var/log and also the Gluster-related filesystems.
Could there be a dependency problem?
See the screenshot here:
https://drive.google.com/file/d/1wqqXlmT66hHLJ8DMcwfPvDXJHkWm_ijE/view?us...
Also, I don't understand the host's networking page in Cockpit:
https://drive.google.com/file/d/1isX4F8qPcFmTyhmCVnZadhp6NY0R9Sy7/view?us...
In my scenario I was downloading a 1 GB CentOS Atomic Host image from the
public Glance repo.
Questions:
- How should I interpret the "Receiving" box graph? What are the different
colored lines for?
- In the "Receiving" column inside the "Interfaces" section I see all "0" for
the Gluster interface (eth1) and no value at all for ovirtmgmt (perhaps because
it is unmanaged from the host's point of view?)
Also, it seems to me that the temporary bridge used for the hosted-engine
deployment on 192.168.124.0 (virbr0) is still there even after the deployment
has finished, albeit with "no carrier"... is this expected?
Would it perhaps be better to undefine the network after deployment? (See the sketch below.)
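If removing it turns out to be the right thing to do, something like the
following should work on the host (just a sketch: I'm assuming the leftover
bridge belongs to the libvirt network named "default", so please verify the
name first; also note that virsh may ask for credentials on an oVirt host):

# virsh net-list --all         # confirm which network owns virbr0
# virsh net-destroy default    # stop the running network
# virsh net-undefine default   # remove its definition so it doesn't come back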
Gianluca
Agentless Backup update: Trilio Vault...what a joke, pay $7500
by femi adegoke
Back in June of 2018, I posted regarding Agentless backup solutions.
Greg Sheremeta replied & mentioned Trilio.
The rest of the story...well just read on...
These guys at Trilio have turned out to be a complete waste of time. (I could use some more colorful language)
After my post & reply from Greg Sheremeta, I followed up with Trilio.
A lot of promises/dates were made (by Trilio) as to when a beta would be made available for testing.
Needless to say, they never came through.
Finally in Dec 2018, they said we are ready for the demo, we will show you a working product.
The day came, and Trilio said: sorry, it's too close to Xmas, we will postpone the demos till 2019.
In 2019, I continued to follow up & finally they set a date for another demo - Jan 29.
On Jan. 29, we get on the Webex, they said, sorry the demo just broke 10 mins prior to our call, so no demo.
They show me some screenshots & their OpenStack version & again promise to get me beta software in a few days.
I continue to follow up (via email)
Yesterday (Feb 6), I get an email from Thomas Lahive GM; Sales and Alliance Partners....(copied & pasted below):
"We started RHV betas and decided to prioritize current Trilio customers (those that purchased Triliovault for Openstack).
If you would like to be part of the beta now then We can sign you up as a certified Trilio reseller which has a $7,500 Starter Fee. The $7,500 will be credited against your first customer order that is at least $7,500 so it will eventually cost you nothing. Many of our partners can apply the fee against revenue so it's a great tax incentive, but you can confirm with your finance department.
Please Lmk how you would like to proceed."
Please remember, I have never seen a working demo of this product, never.
Is this typical behavior of RH partners?
[Users] Moving iSCSI Master Data
by rni@chef.net
Hi,
it's me again....
I started my oVirt 'project' as a proof of concept, but as always happens, it became production.
Now I have to move the iSCSI master data domain to the real iSCSI target.
Is there any way to do this, and to get rid of the old master data domain?
Thank you for your help
Hans-Joachim
remote-viewer can not display console
by pxb@zj.sgcc.com.cn
I installed and configured oVirt 4.2.8 on CentOS 7.6; hosts and VMs run normally.
However, the virtual machine console often does not display properly. When I open the console, only "connected to graphic server" is displayed.
Is this an oVirt bug or a KVM bug?
oVirt 4.3.2 Disk extended broken (UI)
by Strahil Nikolov
Hi,
I have just extended the disk of one of my openSUSE VMs and I have noticed that although the disk is only 140 GiB (in the UI), the VM sees it as 180 GiB.
I think that this should not happen at all.
[root@ovirt1 ee8b1dce-c498-47ef-907f-8f1a6fe4e9be]# qemu-img info c525f67d-92ac-4f36-a0ef-f8db501102fa
image: c525f67d-92ac-4f36-a0ef-f8db501102fa
file format: raw
virtual size: 180G (193273528320 bytes)
disk size: 71G
Attaching some UI screenshots.
Note: I extended the disk via the UI by selecting an extension of 40 GB (the old value in the UI was 100 GB).
Best Regards,
Strahil Nikolov
errors upgrading ovirt 4.3.1 to 4.3.2
by Jason Keltz
Hi.
I have a few issues after a recent upgrade from 4.3.1 to 4.3.2:
1) Power management is no longer working. I'm using a Dell DRAC7. This
has always worked previously. When I click the "Test" button, I get
"Testing in progress. It will take a few seconds. Please wait", but then
it just sits there and never returns.
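For what it's worth, a manual test of the fence device from another host
should also be possible with something like the following sketch; I'm assuming
the DRAC7 answers standard IPMI-over-LAN, and the address and credentials are
placeholders:

# fence_ipmilan -P -a <drac-ip> -l <drac-user> -p <drac-password> -o status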
2) After re-kickstarting one of my hosts, when I click on it and choose
"Host Console", I get "Authentication failed: invalid-hostkey". If I
click "Try again", I'm taken to a page with "404 - Page not found. Click
here to continue". The page-not-found is likely a bug. Now, if I visit
Cockpit directly on the host via its own URL, it works just fine. Given
that I deleted the host and re-added it to the engine, it's really not clear
to me how to tell the engine to refresh. I figured that after re-kickstarting
the host the problem would surely go away, but it did not.
3) From time to time, I am seeing the following error appear in engine:
"Uncaught exception occurred. Please try reloading the page. Details:
(TypeError): oab (...) is null Please have your administrator check the
UI logs". Another bug ...
Engine is standalone engine, not hosted.
Jason.
I can't create disk with ovirt 4.3.2
by siovelrm@gmail.com
Hello, I just installed oVirt 4.3.2 in self-hosted engine mode, all the same as in previous versions. When I want to create a disk with a user that is not the admin, I get the following error:
"Error while executing action Add Disk to VM: Internal Engine Error"
This happens with all other users except the admin, even when those users also have the SuperUser role. Please, I need your help.
The engine logs say the following:
2019-04-05 15:34:09,977-04 INFO [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (default task-13) [37026e0a-92e6-4bfa-9f0f-f052d9eced2d] Running command: AddDiskCommand internal: false. Entities affected: ID: c76a5059-f891-496a-b45f-7ba7ea878ceb Type: StorageAction group CREATE_DISK with role type USER
2019-04-05 15:34:10,002-04 WARN [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (default task-13) [37026e0a-92e6-4bfa-9f0f-f052d9eced2d] Validation of action 'AddImageFromScratch' failed for user jdoe@internal-authz. Reasons: VAR__TYPE__STORAGE__DOMAIN, NON_ADMIN_USER_NOT_AUTHORIZED_TO_PERFORM_ACTION_ON_HE
2019-04-05 15:34:10,070-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-13) [37026e0a-92e6-4bfa-9f0f-f052d9eced2d] EVENT_ID: USER_FAILED_ADD_DISK(2,023), Add-Disk operation failed (User: jdoe@internal-authz).
2019-04-05 15:34:10,432-04 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-8) [37026e0a-92e6-4bfa-9f0f-f052d9eced2d] Command 'AddDisk' id: '8b10a2b8-3a38-45a4-9c08-7e742eca001b' child commands '[bcfaa199-dee6-4ae4-9404-9b75cd8e9339]' executions were completed, status 'FAILED'
2019-04-05 15:34:11,461-04 ERROR [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-59) [37026e0a-92e6-4bfa-9f0f-f052d9eced2d] Ending command 'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2019-04-05 15:34:11,471-04 ERROR [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-59) [37026e0a-92e6-4bfa-9f0f-f052d9eced2d] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand' with failure.
2019-04-05 15:34:11,493-04 WARN [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-59) [] VM is null - not unlocking
2019-04-05 15:34:11,523-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-59) [] EVENT_ID: USER_ADD_DISK_FINISHED_FAILURE(2,022), Add-Disk operation failed to complete.
All hosts non-operational after upgrading from 4.2 to 4.3
by John Florian
I am in a severe pinch here. A while back I upgraded from 4.2.8 to 4.3.3
and had only one step remaining: setting the cluster compatibility level
to 4.3 (from 4.2). When I tried this, it gave the usual warning that
each VM would have to be rebooted to complete the change, but then came the
first unusual part: it told me that this could not be completed
until each host was in maintenance mode. Quirky, I thought, but I stopped
all VMs and put both hosts into maintenance mode. I then set the cluster
to 4.3. Things didn't want to become active again, and I eventually noticed
that I was being told the DC needed to be 4.3 as well. I don't remember that
from before, but oh well, that was easy.
However, the DC and SD remain down. The hosts are non-operational. I've powered
everything off and started fresh but still wind up in the same state.
The hosts look like they're active for a bit (green triangle) but then go
non-operational after about a minute. It appears that my iSCSI sessions are
active/logged in. The one glaring thing I see in the logs is this in
vdsm.log:
2019-04-05 12:03:30,225-0400 ERROR (monitor/07bb1bf) [storage.Monitor]
Setting up monitor for 07bb1bf8-3b3e-4dc0-bc43-375b09e06683 failed
(monitor:329)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 326, in _setupLoop
    self._setupMonitor()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 348, in _setupMonitor
    self._produceDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 158, in wrapper
    value = meth(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 366, in _produceDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
    domain.getRealDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
    return findMethod(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'07bb1bf8-3b3e-4dc0-bc43-375b09e06683',)
How do I proceed to get back operational?
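For reference, these are the kinds of checks I can run on a host to see
whether the storage domain's LUNs and VG are actually visible (just a sketch;
the UUID is the one from the vdsm.log error above):

# iscsiadm -m session            # confirm the iSCSI sessions are logged in
# pvs                            # the storage domain LUNs should appear as PVs
# vgs | grep 07bb1bf8            # the domain UUID should show up as a VG name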