ESXi 6.7 as a nested VM on top of oVirt cluster.
by branimirp@gmail.com
Hello list
I am wondering if anyone has tried this before? I am trying to consolidate my lab onto an oVirt cluster, which consists of oVirt (4.3.8) and 2 KVM hypervisors (CentOS 7.7). Among other things, one of my efforts is a small ESXi + vCenter lab. In addition, I have a standalone KVM hypervisor. I can run nested ESXi 6.7 on top of the standalone KVM hypervisor (with nested KVM enabled) without any problem. However, on top of the oVirt-controlled KVMs I have some issues. The hypervisors have nested KVM support enabled via vdsm hooks, and qemu emulates an e1000 NIC for the nested ESXi VM. The ESXi installation process goes smoothly, but as soon as I enable the management network and restart it, the nested ESXi cannot communicate with the outside world (it cannot even ping its DNS server). An https connection to the ESXi web GUI and ping to the ESXi also fail. I also noticed on my client machine that ARP requests for the ESXi remain incomplete. Within oVirt, I see no packet drops in the "Network Interfaces" tab for the nested ESXi.
In addition, I have a few ordinary, non-nested VMs running on the same network as the nested VM, and I can establish connections to those machines normally. As a further test, I created a nested KVM VM on top of the oVirt cluster, on the same network as the nested ESXi, and it works as expected - I can spin up VMs on it and connect to it. The network assigned to the nested ESXi has a "No network filter" vNIC profile applied.
I tried to google for a solution but found only this: https://github.com/mechinn/kvmhidden - I am not sure this is the solution at all (and I wonder whether it is still up to date after 3+ years). Could I ask whether anyone has tried something similar and experienced this problem? Is there any additional configuration I should apply to the oVirt cluster?
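For what it's worth, one way to narrow this down (interface and bridge names below are placeholders for your setup): ESXi sources its management traffic from the vmk0 MAC, which differs from the vNIC MAC oVirt assigned, so frames can be dropped by MAC-spoof filtering even when the vNIC profile looks permissive. Watching where the ARP exchange dies usually points at the culprit:

```shell
# Run on the oVirt host carrying the nested ESXi VM.
# 'ovirtmgmt' is a placeholder -- substitute your bridge name.

# Confirm nested virt is actually exposed by the host kernel:
cat /sys/module/kvm_intel/parameters/nested

# Watch the ARP exchange on the bridge; if requests arrive but replies
# from the ESXi never appear, the drop is in the nested stack or on the
# VM's tap device:
tcpdump -eni ovirtmgmt arp

# Check whether the bridge has learned the ESXi vmk0 MAC:
bridge fdb show br ovirtmgmt
```

If the vmk0 MAC never shows up in the forwarding database, that points at MAC filtering on the vNIC path rather than at ESXi itself.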
Thank you very much in advance!
Regards,
Branimir
4 years, 2 months
Re: Power Management - drac5
by eevans@digitaldatatechs.com
I enabled IPMI with administrator privileges and it worked. Thank you so much!!!
Eric Evans
Digital Data Services LLC.
304.660.9080
-----Original Message-----
From: Robert Webb <rwebb(a)ropeguru.com>
Sent: Monday, February 03, 2020 1:30 PM
To: Jayme <jaymef(a)gmail.com>
Cc: users <users(a)ovirt.org>
Subject: [ovirt-users] Re: Power Management - drac5
IPMI over LAN is allowed, and the user is the root user with admin access.
Also the channel privilege level under ipmi is Administrator.
Will check via cli to see what I get.
________________________________________
From: Jayme <jaymef(a)gmail.com>
Sent: Monday, February 3, 2020 1:23 PM
To: Robert Webb
Cc: users
Subject: Re: [ovirt-users] Power Management - drac5
Also make sure you have "Enable IPMI Over LAN" enabled under idrac settings.
On Mon, Feb 3, 2020 at 2:15 PM Jayme <jaymef(a)gmail.com<mailto:jaymef@gmail.com>> wrote:
I recall having a similar problem before, and it was related to the user roles/permissions in iDRAC. Check what access rights the user has. If that leads nowhere, you might have some luck testing manually using the fence agent CLI tools (fence_drac5 or fence_ipmilan) directly on one of the oVirt hosts.
On Mon, Feb 3, 2020 at 2:09 PM Robert Webb <rwebb(a)ropeguru.com<mailto:rwebb@ropeguru.com>> wrote:
I have 3 Dell R410s with iDRAC6 Enterprise capability. I am trying to get power management set up, but the test will not pass and I am not finding the docs very helpful.
I have put in the IP, user name, password, and drac5 as the type. I have tested both with and without "secure" checked and always get "Test failed: Internal JSON-RPC error".
idrac log shows:
2020 Feb 3 17:41:22 os[19772] root closing session from 192.168.1.12
2020 Feb 3 17:41:17 os[19746] root login from 192.168.1.12
Can someone please guide me in the right direction?
_______________________________________________
Users mailing list -- users(a)ovirt.org<mailto:users@ovirt.org>
To unsubscribe send an email to users-leave(a)ovirt.org<mailto:users-leave@ovirt.org>
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/RKFEK2ORWOO...
Re: Power Management - drac5
by Jayme
I recall having a similar problem before, and it was related to the
user roles/permissions in iDRAC. Check what access rights the user has.
If that leads nowhere, you might have some luck testing manually using
the fence agent CLI tools (fence_drac5 or fence_ipmilan) directly on one of the oVirt hosts.
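A manual test sketch from an oVirt host (IP and credentials below are placeholders; the standard fence-agents binaries are fence_ipmilan and fence_drac5):

```shell
# Placeholders: 192.168.1.50 / root / secret.
# -P selects IPMI lanplus, which iDRAC6 requires for IPMI over LAN:
fence_ipmilan -a 192.168.1.50 -l root -p 'secret' -P -o status

# The same check at the raw IPMI level:
ipmitool -I lanplus -H 192.168.1.50 -U root -P 'secret' chassis power status

# Or via the drac5 agent (-x = use ssh), matching the "drac5" type in the UI:
fence_drac5 -a 192.168.1.50 -l root -p 'secret' -x -o status
```

If the CLI agent works but the engine test still fails, the problem is on the oVirt side rather than in the iDRAC configuration.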
On Mon, Feb 3, 2020 at 2:09 PM Robert Webb <rwebb(a)ropeguru.com> wrote:
> I have 3 Dell R410s with iDRAC6 Enterprise capability. I am trying to get
> power management set up, but the test will not pass and I am not finding the
> docs very helpful.
>
> I have put in the IP, user name, password, and drac5 as the type. I have
> tested both with and without secure checked and always get, "Test failed:
> Internal JSON-RPC error".
>
> idrac log shows:
>
> 2020 Feb 3 17:41:22 os[19772] root closing session from
> 192.168.1.12
> 2020 Feb 3 17:41:17 os[19746] root login from 192.168.1.12
>
> Can someone please guide me in the right direction?
>
Power Management - drac5
by Robert Webb
I have 3 Dell R410s with iDRAC6 Enterprise capability. I am trying to get power management set up, but the test will not pass and I am not finding the docs very helpful.
I have put in the IP, user name, password, and drac5 as the type. I have tested both with and without "secure" checked and always get "Test failed: Internal JSON-RPC error".
idrac log shows:
2020 Feb 3 17:41:22 os[19772] root closing session from 192.168.1.12
2020 Feb 3 17:41:17 os[19746] root login from 192.168.1.12
Can someone please guide me in the right direction?
Understanding ovirt memory management which appears incorrect
by divan@santanas.co.za
Hi All,
A question regarding memory management with oVirt. I know memory management can
be complicated, hence I'm asking the experts. :)
Two examples where it looks - to me - like memory management from
the oVirt perspective is incorrect. This results in us not getting as
much out of a host as we'd expect.
## Example 1:
host: dev-cluster-04
I understand the mem on the host to be:
128G total (physical)
68G used
53G available
56G buff/cache
I understand therefore 53G should still be available to allocate
(approximately, minus a few things).
```
DEV [root@dev-cluster-04:~] # free -m
              total        used        free      shared  buff/cache   available
Mem:         128741       68295        4429        4078       56016       53422
Swap:         12111        1578       10533
DEV [root@dev-cluster-04:~] # cat /proc/meminfo
MemTotal: 131831292 kB
MemFree: 4540852 kB
MemAvailable: 54709832 kB
Buffers: 3104 kB
Cached: 5174136 kB
SwapCached: 835012 kB
Active: 66943552 kB
Inactive: 5980340 kB
Active(anon): 66236968 kB
Inactive(anon): 5713972 kB
Active(file): 706584 kB
Inactive(file): 266368 kB
Unevictable: 50036 kB
Mlocked: 54132 kB
SwapTotal: 12402684 kB
SwapFree: 10786688 kB
Dirty: 812 kB
Writeback: 0 kB
AnonPages: 67068548 kB
Mapped: 143880 kB
Shmem: 4176328 kB
Slab: 52183680 kB
SReclaimable: 49822156 kB
SUnreclaim: 2361524 kB
KernelStack: 20000 kB
PageTables: 213628 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 78318328 kB
Committed_AS: 110589076 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 859104 kB
VmallocChunk: 34291324976 kB
HardwareCorrupted: 0 kB
AnonHugePages: 583680 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 621088 kB
DirectMap2M: 44439552 kB
DirectMap1G: 91226112 kB
```
The oVirt engine Compute -> Hosts view shows s4-dev-cluster-01 at 93%
memory utilisation.
Clicking on the node says:
Physical Memory: 128741 MB total, 119729 MB used, 9012 MB free
So the oVirt engine says 9G free. The OS reports 4G free but 53G
available. Surely oVirt should be looking at available memory?
This is a problem when, for instance, trying to run a VM called
dev-cassandra-01 (memory size 24576 MB, max memory 24576 MB, memory
guarantee 10240 MB) on this host; it fails with:
```
Cannot run VM. There is no host that satisfies current scheduling
constraints. See below for details:
The host dev-cluster-04.fnb.co.za did not satisfy internal filter
Memory because its available memory is too low (19884 MB) to run the
VM.
```
To me this looks blatantly wrong. The host has 53G available according
to free -m.
Guessing I'm missing something, unless this is some sort of bug?
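As a sanity check, the kernel's MemAvailable can be roughly reconstructed from its main components; on this host most of the "missing" memory is reclaimable slab (SReclaimable), which the figure oVirt shows appears to count as used:

```shell
# Print the main components behind MemAvailable. Roughly,
# MemAvailable ~= MemFree + reclaimable page cache + SReclaimable,
# minus low watermarks (see the kernel's meminfo documentation):
awk '/^(MemFree|Cached|SReclaimable|MemAvailable):/ {printf "%-14s %10d kB\n", $1, $2}' /proc/meminfo

# With the numbers from this host:
#   MemFree:        4540852 kB  (~4.3 G)
#   Cached:         5174136 kB  (~4.9 G)
#   SReclaimable:  49822156 kB  (~47.5 G)  <- reclaimable slab dominates
# 4.3 + 4.9 + 47.5 ~= 57 G, in the same ballpark as the reported
# MemAvailable of ~52 G (the kernel discounts watermarks and part of
# the cache).
```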
versions:
```
engine: 4.3.7.2-1.el7
host:
OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 957.12.1.el7.x86_64
KVM Version: 2.12.0 - 18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.7
VDSM Version: vdsm-4.30.13-1.el7
SPICE Version: 0.14.0 - 6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1, SSBD: 3
VNC Encryption: Disabled
```
## Example 2:
An oVirt host with two VMs:
According to the host, it has 128G of physical memory of which 56G is
used, 69G is buff/cache and 65G is available.
As is shown here:
```
LIVE [root@prod-cluster-01:~] # cat /proc/meminfo
MemTotal: 131326836 kB
MemFree: 2630812 kB
MemAvailable: 66573596 kB
Buffers: 2376 kB
Cached: 5670628 kB
SwapCached: 151072 kB
Active: 59106140 kB
Inactive: 2744176 kB
Active(anon): 58099732 kB
Inactive(anon): 2327428 kB
Active(file): 1006408 kB
Inactive(file): 416748 kB
Unevictable: 40004 kB
Mlocked: 42052 kB
SwapTotal: 4194300 kB
SwapFree: 3579492 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 56085040 kB
Mapped: 121816 kB
Shmem: 4231808 kB
Slab: 65143868 kB
SReclaimable: 63145684 kB
SUnreclaim: 1998184 kB
KernelStack: 25296 kB
PageTables: 148336 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 69857716 kB
Committed_AS: 76533164 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 842296 kB
VmallocChunk: 34291404724 kB
HardwareCorrupted: 0 kB
AnonHugePages: 55296 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 722208 kB
DirectMap2M: 48031744 kB
DirectMap1G: 87031808 kB
LIVE [root@prod-cluster-01:~] # free -m
              total        used        free      shared  buff/cache   available
Mem:         128248       56522        2569        4132       69157       65013
Swap:          4095         600        3495
```
However, the Compute -> Hosts oVirt screen shows this node at 94%
memory.
Clicking Compute -> Hosts -> prod-cluster-01 -> General says:
Physical Memory: 128248 MB total, 120553 MB used, 7695 MB free
Swap Size: 4095 MB total, 600 MB used, 3495 MB free
The physical memory figure above makes no sense to me, unless it
includes caches, which I would think it shouldn't.
This host has just two VMs:
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf list
 Id    Name                 State
----------------------------------------------------
 35    prod-box-18          running
 36    prod-box-11          running
```
Moreover, each VM has 32G of memory set in every possible place, from
what I can see.
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf dumpxml prod-box-11|grep -i mem
<ovirt-vm:memGuaranteedSize type="int">32768</ovirt-vm:memGuaranteedSize>
<ovirt-vm:minGuaranteedMemoryMb type="int">32768</ovirt-vm:minGuaranteedMemoryMb>
<memory unit='KiB'>33554432</memory>
<currentMemory unit='KiB'>33554432</currentMemory>
<cell id='0' cpus='0-27' memory='33554432' unit='KiB'/>
<suspend-to-mem enabled='no'/>
<model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
<memballoon model='virtio'>
</memballoon>
```
prod-box-11 is, however, set as a high-performance VM. That could cause
a problem.
Same for the other VM:
```
LIVE [root@prod-cluster-01:~] # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf dumpxml prod-box-18|grep -i mem
<ovirt-vm:memGuaranteedSize type="int">32768</ovirt-vm:memGuaranteedSize>
<ovirt-vm:minGuaranteedMemoryMb type="int">32768</ovirt-vm:minGuaranteedMemoryMb>
<memory unit='KiB'>33554432</memory>
<currentMemory unit='KiB'>33554432</currentMemory>
<cell id='0' cpus='0-27' memory='33554432' unit='KiB'/>
<suspend-to-mem enabled='no'/>
<model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
<memballoon model='virtio'>
</memballoon>
```
So I understand that two VMs, each with 32G of RAM allocated, should
consume approximately 64G of RAM on the host. The host has 128G of RAM,
so usage should be at approximately 50%. However, oVirt is reporting 94% usage.
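One way to see which number the engine is actually being fed is to compare vdsm's host stats against free (a sketch; the key names are from vdsm 4.30 and may vary between versions):

```shell
# vdsm-client ships with vdsm; Host getStats returns the memory figures
# the engine displays. Comparing these with 'free -m' shows whether vdsm
# is reporting MemFree (which ignores reclaimable slab and cache) rather
# than MemAvailable:
vdsm-client Host getStats | grep -E '"mem(Available|Committed|Free|Used)"'
free -m
```

If memUsed here tracks `total - free` from free -m instead of `total - available`, that would explain the 94% figure.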
Versions:
```
engine: 4.3.5.5-1.el7
host:
OS Version: RHEL - 7 - 6.1810.2.el7.centos
OS Description: CentOS Linux 7 (Core)
Kernel Version: 3.10.0 - 957.10.1.el7.x86_64
KVM Version: 2.12.0 - 18.el7_6.3.1
LIBVIRT Version: libvirt-4.5.0-10.el7_6.6
VDSM Version: vdsm-4.30.11-1.el7
SPICE Version: 0.14.0 - 6.el7_6.1
GlusterFS Version: [N/A]
CEPH Version: librbd1-10.2.5-4.el7
Open vSwitch Version: openvswitch-2.10.1-3.el7
Kernel Features: PTI: 1, IBRS: 0, RETP: 1
VNC Encryption: Disabled
```
Thanks for any insights!
--
Divan Santana
https://divansantana.com
Spacewalk integration
by eevans@digitaldatatechs.com
I know that in the past Spacewalk integration was not possible. Has anyone successfully integrated Spacewalk? Is it possible, or is it something that hasn't been put together? I use Spacewalk 2.9 and would like to integrate it if it's possible and feasible.
Thanks
Eric Evans
Digital Data Services
Two host cluster without hyperconverged
by Göker Dalar
Hello everyone,
I want to get some ideas on this topic.
I have two servers with the same capabilities and 8 identical physical disks per
node. I want to set up a cluster using redundant disks. I don't have another
server for Gluster hyperconverged. How should I build this
structure?
Thanks in advance,
Göker
Can't connect vdsm storage: Command StorageDomain.getInfo with args failed: (code=350, message=Error in storage domain action
by asm@pioner.kz
Hi! I'm trying to upgrade my hosts and have a problem with it. After upgrading one host, I see that it is NonOperational. All was fine with vdsm-4.30.24-1.el7, but after upgrading to the new version vdsm-4.30.40-1.el7.x86_64 (and some other packages) I have errors.
First of all, I see in the oVirt Events: "Host srv02 cannot access the Storage Domain(s) <UNKNOWN> attached to the Data Center Default. Setting Host state to Non-Operational." My Default storage domain, with the HE VM data, is on NFS storage.
In the messages log of the host:
srv02 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 432, in start_monitoring
    self._initialize_broker()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 556, in _initialize_broker
    m.get('options', {}))
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 89, in start_monitor
    ).format(t=type, o=options, e=e)
RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'tcp_t_address': None, 'network_test': None, 'tcp_t_port': None, 'addr': '192.168.2.248'}]
Feb 1 15:41:42 srv02 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
In the broker log:
MainThread::WARNING::2020-02-01 15:43:35,167::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command StorageDomain.getInfo with args {'storagedomainID': 'bbdddea7-9cd6-41e7-ace5-fb9a6795caa8'} failed:
(code=350, message=Error in storage domain action: (u'sdUUID=bbdddea7-9cd6-41e7-ace5-fb9a6795caa8',))
In vdsm.log:
2020-02-01 15:44:19,930+0600 INFO (jsonrpc/0) [vdsm.api] FINISH getStorageDomainInfo error=[Errno 1] Operation not permitted from=::1,57528, task_id=40683f67-d7b0-4105-aab8-6338deb54b00 (api:52)
2020-02-01 15:44:19,930+0600 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='40683f67-d7b0-4105-aab8-6338deb54b00') Unexpected error (task:875)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
return fn(*args, **kargs)
File "<string>", line 2, in getStorageDomainInfo
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2753, in getStorageDomainInfo
dom = self.validateSdUUID(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 305, in validateSdUUID
sdDom = sdCache.produce(sdUUID=sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
domain.getRealDomain()
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
return findMethod(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 145, in findDomain
return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 378, in __init__
manifest.sdUUID, manifest.mountpoint)
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 853, in _detect_block_size
block_size = iop.probe_block_size(mountpoint)
File "/usr/lib/python2.7/site-packages/vdsm/storage/outOfProcess.py", line 384, in probe_block_size
return self._ioproc.probe_block_size(dir_path)
File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 602, in probe_block_size
"probe_block_size", {"dir": dir_path}, self.timeout)
File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 448, in _sendCommand
raise OSError(errcode, errstr)
OSError: [Errno 1] Operation not permitted
2020-02-01 15:44:19,930+0600 INFO (jsonrpc/0) [storage.TaskManager.Task] (Task='40683f67-d7b0-4105-aab8-6338deb54b00') aborting: Task is aborted: u'[Errno 1] Operation not permitted' - code 100 (task:1181)
2020-02-01 15:44:19,930+0600 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH getStorageDomainInfo error=[Errno 1] Operation not permitted (dispatcher:87)
But I see that this domain is mounted (per the mount command):
storage:/volume3/ovirt-hosted on /rhev/data-center/mnt/storage:_volume3_ovirt-hosted type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.2.251,local_lock=none,addr=192.168.2.248)
I don't see a storage directory in /var/run/vdsm, and I see many differences compared to the other hosts. Here is a listing of /var/run/vdsm:
bonding-defaults.json
dhclientmon
nets_restored
payload
svdsm.sock
v2v
vhostuser
bonding-name2numeric.json
mom-vdsm.sock
ovirt-imageio-daemon.sock
supervdsmd.lock
trackedInterfaces
vdsmd.lock
What is the problem? Please help.
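One way to reproduce the probe_block_size EPERM outside of vdsm (the domain path is taken from the mount output earlier in this mail; `__probe__` is just a scratch file name): newer vdsm probes the storage block size via ioprocess running as the vdsm user with direct I/O, so the same operation can be attempted by hand:

```shell
# Run on the NonOperational host:
DOM=/rhev/data-center/mnt/storage:_volume3_ovirt-hosted

# vdsm (uid 36) must be able to do O_DIRECT I/O on the export; an
# 'Operation not permitted' here reproduces the probe_block_size
# failure without involving vdsm at all:
sudo -u vdsm dd if=/dev/zero of="$DOM/__probe__" bs=4096 count=1 oflag=direct
sudo -u vdsm rm -f "$DOM/__probe__"

# The export root should be owned vdsm:kvm (uid/gid 36:36):
ls -ldn "$DOM"
```

If the dd fails, the usual suspects are NFS export squash options or ownership that changed during the upgrade.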