Debugging "non_operational" host during self-hosted deploy
by Mikael Öhman
The "bootstrap_local_vm.yml" playbook fails at the end, during the task "Wait for the host to be up"
Looking through the ovirt-hosted-engine-setup-bootstrap_local_vm log, I found the reason is supposed to be:
"status": "non_operational",
"status_detail": "network_unreachable",
But that's it. I can't find anything wrong with any networks, either on the host or in the partly prepared HE VM.
Is there some verbose information I can dump to find out why it thinks the network is unreachable?
I can't find any logs indicating any issues until this step.
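In case someone has the same question later, the places I'd look next (a sketch; the local VM's address comes from the temporary entry the installer adds to /etc/hosts on the host):
# On the deploying host: the raw host-side view of networking, as vdsm reports it
vdsm-client Host getCapabilities | less
# The engine inside the bootstrap VM logs why it flags the host non-operational
ssh root@<local-vm-address> 'grep -iE "non.?operational|unreachable" /var/log/ovirt-engine/engine.log | tail -n 50'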
6 years
Cannot import VM. Invalid time zone for given OS type. Attribute: vm.vmStatic
by daniel@djj-consultants.com
Need some magic here ...
- 2 hosts, CentOS 7.5.1804 with all available patches, in the cluster
- hosted engine from the rpm ovirt-engine-appliance-4.2-20181113.1.el7.noarch, with all patches as well (manager version 4.2.7.5-1.el7)
1) I re-installed the hosts and the hosted engine from scratch
2) I imported the ISO, export and DATA domains from the SAN
3) I went to the DATA domain, into the VM Import tab, where I saw all the VMs
4) I imported the VMs one after another, BUT ...
One of the VMs (Windows 10 x64) can't be imported because of the error in the subject of this thread. In the engine.log file I get:
2018-12-01 19:17:50,387-05 WARN [org.ovirt.engine.core.bll.exportimport.ImportVmFromConfigurationCommand] (default task-4) [92c96b3c-38c5-42ab-b81c-11413ea32ccd] Validation of action 'ImportVmFromConfiguration' failed for user admin@internal-authz. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_INVALID_TIMEZONE,$groups [Ljava.lang.Class;@5cac1830,$message ACTION_TYPE_FAILED_INVALID_TIMEZONE,$payload [Ljava.lang.Class;@6770dba6,ACTION_TYPE_FAILED_ATTRIBUTE_PATH,$path vm.vmStatic,$validatedValue org.ovirt.engine.core.common.businessentities.VmStatic@804b3abe
So it seems that the timezone of the VM is not a valid timezone, so I used engine-config like this:
engine-config -s DefaultWindowsTimeZone="Eastern Standard Time"
restarted the engine service, and tried the import again, without success. Then:
engine-config -s DefaultWindowsTimeZone="GMT Standard Time"
restarted again and imported again; no success.
I tried all the timezones available for Windows, always with the same result. I also tried to set DefaultWindowsTimeZone to empty, but engine-config doesn't allow that, nor a *.
The VM is on an iSCSI device, so I can't go look at the VM files to fix it directly on the filesystem.
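One avenue worth trying, since the OVF of a not-yet-imported VM lives in the engine database rather than in files I can reach: dump it and see which TimeZone string the VM actually carries. A sketch; the table name is from the 4.2-era schema and 'MyWin10VM' is a placeholder, so verify both first:
# On the engine VM, list the unregistered entities known to the engine
su - postgres -c "psql engine -c \"select entity_name, entity_guid from unregistered_ovf_of_entities;\""
# Pull the OVF for the problem VM and inspect its TimeZone element
su - postgres -c "psql engine -t -c \"select ovf_data from unregistered_ovf_of_entities where entity_name='MyWin10VM';\"" | grep -o '<TimeZone>[^<]*</TimeZone>'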
10 hours since I started fixing this import thing ;-(
If someone has some magic for me, it would be a nice gift before Christmas.
6 years
[OT] Xen/Oracle VM under Qemu-kvm/oVirt
by Gianluca Cecchi
Hello,
I have to do a test on Oracle VM 3.4.4 that is based on Xen 4.4.4.
The test aims at simulating storage migration in OVM, where I also have to
migrate the so-called server pool file system, which requires a special
procedure.
I see that some people have written here asking about OVM-to-oVirt migrations, so
there could be some expertise in this field.
I'm just trying to proceed with the hypervisor steps (Oracle VM Server),
which is the more difficult part, I suppose.
I'm testing at the moment on a Fedora 29 system, using qemu-kvm, setting
host-passthrough for the VM dedicated to the Oracle VM hypervisor.
I already tested nested virtualization (with oVirt for some VMs) inside
this system, so the host part should be OK.
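For completeness, the checklist I use on the Fedora host before blaming the guest (a sketch; assumes an Intel CPU and a libvirt-managed VM):
# Nested virtualization must be enabled in the L0 kernel module
cat /sys/module/kvm_intel/parameters/nested    # expect Y or 1
# The guest needs the full host CPU: in "virsh edit <vm>" set
#   <cpu mode='host-passthrough'/>
# then, inside the Oracle VM Server guest, VMX should be visible:
grep -c vmx /proc/cpuinfo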
I have set the disks as scsi and the vNIC as e1000.
Installation went OK, but at reboot I see CPU-stuck messages and no prompt yet;
see here:
https://drive.google.com/file/d/1sAEO-W0-OLIGSugxIwSLRBx3WXXoK1yx/view?us...
I have not yet understood the exact reason for the complaint...
Has anyone already been able to create a Xen (or Oracle VM) nested environment
inside qemu-kvm?
The OS of the Xen dom0 should be based on Oracle Linux 6.x
I can also test installing on oVirt, if anyone on the list has already
run through this path.
Thanks in advance for any help or pointers.
I'm also going to post on the Oracle VM forum...
Gianluca
6 years
Best Openstack version to integrate with oVirt 4.2.7
by Gianluca Cecchi
Hello,
do you think it is OK to use the Rocky version of OpenStack to integrate its
services with oVirt 4.2.7 on CentOS 7?
I see on https://repos.fedorapeople.org/repos/openstack/ that, if Rocky is
too new, the older releases available there are, from newest to
oldest:
Queens
Pike
Ocata
Newton
At the moment I have two separate lab environments:
oVirt with 4.2.7
Openstack with Rocky (single host with packstack allinone)
Just trying the first integration steps with these versions, it seems I'm not
able to communicate with Glance, because I get this in engine.log:
2018-11-10 17:32:58,386+01 ERROR
[org.ovirt.engine.core.bll.provider.storage.AbstractOpenStackStorageProviderProxy]
(default task-51) [e2fccee7-1bb2-400f-b8d3-b87b679117d1] Not Found
(OpenStack response error code: 404)
Nothing in glance logs on openstack, apparently.
In my test I'm using:
- http://xxx.xxx.xxx.xxx:9292 as the provider URL
- the authentication check box checked
- the glance user with its password
- 35357 as the port and services as the tenant
A telnet from the engine to port 9292 on the OpenStack server works fine.
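To narrow down the 404, two checks I plan to run (a sketch; the first URL is just the Glance root, which should answer with a version document rather than a 404):
# From the engine host
curl -s http://xxx.xxx.xxx.xxx:9292/ | python -m json.tool
# On the packstack host: what does the service catalog advertise for the image service?
. keystonerc_admin && openstack endpoint list --service image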
Similarly, with Cinder I get:
2018-11-10 17:45:42,226+01 ERROR
[org.ovirt.engine.core.bll.provider.storage.AbstractOpenStackStorageProviderProxy]
(default task-50) [32a31aa7-fe3f-460c-a8b9-cc9b277deab7] Not Found
(OpenStack response error code: 404)
So before digging more, I would like to be certain which combination is currently
the best, keeping the oVirt version fixed at 4.2.7 if possible.
Thanks,
Gianluca
6 years
Hosted Engine goes down while putting gluster node into maintenance mode.
by Abhishek Sahni
Hello Team,
We are running a 3-way replica HC Gluster setup, configured during
the initial deployment from the Cockpit console using Ansible.
NODE1
- /dev/sda (OS)
- /dev/sdb ( Gluster Bricks )
* /gluster_bricks/engine/engine/
* /gluster_bricks/data/data/
* /gluster_bricks/vmstore/vmstore/
NODE2 and NODE3 with a similar setup.
Hosted engine was running on node2.
- While moving NODE1 to maintenance mode (also stopping the Gluster
service, as it prompts beforehand), the hosted engine instantly went down.
- I started the Gluster service back on node1 and started the hosted engine
again; the hosted engine started properly, but it crashed again and again
within seconds of each successful start, because the HE itself seems to
stop glusterd on node1 (not sure, but cross-verified by checking the
glusterd status).
*Is it possible to clear pending tasks, or to keep the HE from stopping
glusterd on node1?*
*Or can we start the HE using another Gluster node?*
https://paste.fedoraproject.org/paste/Qu2tSHuF-~G4GjGmstV6mg
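For anyone hitting the same thing, the checks I'm running (a sketch; the --set-shared-config syntax is from the 4.2 docs, so please double-check it on your version):
# On a surviving node: HA state and scores
hosted-engine --vm-status
# Does the engine volume mount know about the other two bricks?
grep mnt_options /etc/ovirt-hosted-engine/hosted-engine.conf
# If not, add backup servers so that losing one node doesn't kill the mount
hosted-engine --set-shared-config mnt_options backup-volfile-servers=node2:node3 --type=he_shared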
--
ABHISHEK SAHNI
IISER Bhopal
6 years
From Self-hosted engine to standalone engine
by Punaatua PK
Hello,
we currently have a self-hosted engine on Gluster with 3 hosts. We want to have the engine as a single machine on a standalone KVM host.
We did the following steps on our test platform.
- Create a VM on a standalone KVM host
- Put the self-hosted engine into global maintenance
- Shut down the self-hosted engine
- Copy the self-hosted engine image disk (by browsing into the gluster engine volume) to the standalone KVM host, using the Linux dd command
- Reuse the self-hosted engine's MAC address on the new standalone VM
- Start the standalone VM, which uses the self-hosted image disk previously copied to the standalone KVM host
- Log in to the engine, then undeploy the hosted engine by re-installing all the hosts and choosing UNDEPLOY in the hosted-engine section
- Stop ovirt-ha-agent and ovirt-ha-broker
Everything seems to be ok for now.
What do you think about our process (going from self-hosted to standalone)?
Do you have any idea of what should be checked?
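For reference, the post-cutover checks we still plan to add (a sketch; engine.example.com stands in for our engine FQDN):
# On each of the three hosts: make sure the HA services stay off across reboots
systemctl disable --now ovirt-ha-agent ovirt-ha-broker
# From any host: the engine health servlet should answer on the new VM
curl -k https://engine.example.com/ovirt-engine/services/health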
Thank you
(We wanted to move away from the self-hosted engine because we don't really master this deployment.)
6 years
FCoE won't initialize on reboot, oVirt 4.2
by Jacob Green
I have HP BL460c Gen9 blades with BCM57840 NetXtreme II
10/20-Gigabit Ethernet via HP Virtual Connect. In my oVirt 4.1
environment I have Fibre Channel working great.
However, in the new environment, to which I ultimately want to bring the
data domain over, I am having issues with oVirt interfering with the host's
ability to see the Fibre Channel storage. If I build a clean CentOS 7
installation, get my FCoE module installed, and set up Fibre Channel on
the appropriate interfaces, it works: it sees the Fibre Channel interfaces
every time I type fcoeadm -i, and I can reboot a million times and it
does not go away. However, once I turn it into an oVirt 4.2 node, add
it to my environment, and reboot the blade, it is hit or miss whether
fcoeadm -i is going to return interface information.
If I then type "systemctl restart network", my Fibre Channel comes
online, but I should not need to do this. I can see in my dmesg logs
that the Fibre Channel is initializing on boot:
[ 39.465578] cnic: QLogic cnicDriver v2.5.22 (July 20, 2015)
[ 39.465594] bnx2x 0000:06:00.2 eno51: Added CNIC device
[ 39.475618] bnx2x 0000:06:00.3 eno52: Added CNIC device
[ 39.495575] bnx2fc: QLogic FCoE Driver bnx2fc v2.11.8 (October 15, 2015)
[ 39.505971] bnx2fc: bnx2fc: FCoE initialized for eno52.
[ 39.506299] bnx2fc: [06]: FCOE_INIT passed
[ 39.516308] bnx2fc: bnx2fc: FCoE initialized for eno51.
[ 39.516654] bnx2fc: [06]: FCOE_INIT passed
Reminder: I have not added an FC storage domain yet, because I need to
turn off and detach the domain from the old 4.1 environment first.
However, that should not keep the Fibre Channel interfaces from coming up,
and I need to know it's working before I do that.
Below is what fcoeadm -i returns when it is working.
____________________________________________________________________________________________________
fcoeadm -i
Description: BCM57840 NetXtreme II 10/20-Gigabit Ethernet
Revision: 11
Manufacturer: Broadcom Limited
Serial Number: 5CB901C7EE00
Driver: bnx2x 1.712.30-0
Number of Ports: 1
Symbolic Name: bnx2fc (QLogic BCM57840) v2.11.8 over eno51
OS Device Name: host10
Node Name: 0x50060b0000c27a05
Port Name: 0x50060b0000c27a04
Fabric Name: 0x1000c4f57c218ff4
Speed: unknown
Supported Speed: 1 Gbit, 10 Gbit
MaxFrameSize: 2048 bytes
FC-ID (Port ID): 0x0a025c
State: Online
Description: BCM57840 NetXtreme II 10/20-Gigabit Ethernet
Revision: 11
Manufacturer: Broadcom Limited
Serial Number: 5CB901C7EE00
Driver: bnx2x 1.712.30-0
Number of Ports: 1
Symbolic Name: bnx2fc (QLogic BCM57840) v2.11.8 over eno52
OS Device Name: host11
Node Name: 0x50060b0000c27a07
Port Name: 0x50060b0000c27a06
Fabric Name: 0x1000c4f57c21979d
Speed: unknown
Supported Speed: 1 Gbit, 10 Gbit
MaxFrameSize: 2048 bytes
FC-ID (Port ID): 0x14037b
State: Online
____________________________________________________________________________________________________
However, if I reboot the node from the oVirt console, wait a few
minutes after it has rebooted, and then type fcoeadm -i, I get the following:
fcoeadm -i
fcoeadm: No action was taken
Try 'fcoeadm --help' for more information.
It is not until I perform a systemctl restart network that I get the
correct output shown above.
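For reference, what I'd compare between the clean CentOS install and the oVirt node (a sketch; eno51/eno52 are the interfaces from above, and VDSM is known to rewrite the ifcfg files when a host is added to the engine):
# Are the FCoE services enabled on their own, or only kicked by the network restart?
systemctl is-enabled lldpad fcoe
systemctl enable --now lldpad fcoe
# Did adding the host to oVirt change the config for the storage NICs?
cat /etc/sysconfig/network-scripts/ifcfg-eno51
cat /etc/fcoe/cfg-eno51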
Any help or insight into Fibre Channel with oVirt 4.2 would be greatly
appreciated.
--
Jacob Green
Systems Admin
American Alloy Steel
713-300-5690
6 years
hosted-engine --deploy fails on Ovirt-Node-NG 4.2.7
by Ralf Schenk
Hello,
I am trying to deploy the hosted engine to an NFS share accessible by (currently)
two hosts. The host is running the latest ovirt-node-ng 4.2.7.
hosted-engine --deploy fails consistently at a late stage, when trying to run
the engine from NFS. It already ran as "HostedEngineLocal", and I think it is
then migrated to the NFS storage.
Engine seems to be deployed to NFS already:
[root@epycdphv02 ~]# ls -al
/rhev/data-center/mnt/storage01.office.databay.de:_ovirt_engine
total 23
drwxrwxrwx 3 vdsm kvm 4 Dec 3 13:01 .
drwxr-xr-x 3 vdsm kvm 4096 Dec 1 17:11 ..
drwxr-xr-x 6 vdsm kvm 6 Dec 3 13:09 1dacf1ea-0934-4840-bed4-e9d023572f59
-rwxr-xr-x 1 vdsm kvm 0 Dec 3 13:42 __DIRECT_IO_TEST__
NFS Mount:
storage01.office.databay.de:/ovirt/engine on
/rhev/data-center/mnt/storage01.office.databay.de:_ovirt_engine type
nfs4
(rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=192.168.1.121,local_lock=none,addr=192.168.1.3)
Libvirt/qemu reports an error:
Could not open
'/var/run/vdsm/storage/1dacf1ea-0934-4840-bed4-e9d023572f59/2b1332f6-3bb6-495b-87fe-c5b85e0ac495/39d45b33-5f29-430b-8b58-14a8ea20fb08':
Permission denied
Even the permissions of the mentioned file seem to be OK. SELinux is disabled,
since I had lots of problems with earlier versions when trying to deploy
the hosted engine.
[root@epycdphv02 ~]# ls -al
'/var/run/vdsm/storage/1dacf1ea-0934-4840-bed4-e9d023572f59/2b1332f6-3bb6-495b-87fe-c5b85e0ac495/39d45b33-5f29-430b-8b58-14a8ea20fb08'
-rw-rw---- 1 vdsm kvm 53687091200 Dec 3 13:09
/var/run/vdsm/storage/1dacf1ea-0934-4840-bed4-e9d023572f59/2b1332f6-3bb6-495b-87fe-c5b85e0ac495/39d45b33-5f29-430b-8b58-14a8ea20fb08
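Could the NFS export options be the issue? As far as I know, oVirt expects everything on the export to map to vdsm:kvm (36:36). What I would try next (a sketch, with a hypothetical /etc/exports line):
# On the NFS server (storage01): squash all access to uid/gid 36
/ovirt/engine 192.168.1.0/24(rw,sync,no_subtree_check,all_squash,anonuid=36,anongid=36)
exportfs -ra
# On the host: can vdsm itself read the image qemu complained about?
sudo -u vdsm dd if=/var/run/vdsm/storage/1dacf1ea-0934-4840-bed4-e9d023572f59/2b1332f6-3bb6-495b-87fe-c5b85e0ac495/39d45b33-5f29-430b-8b58-14a8ea20fb08 of=/dev/null bs=1M count=1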
hosted-engine --deploy ends with an error. The logfile is attached.
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed":
true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta":
"0:00:00.218320", "end": "2018-12-03 13:20:19.139919", "rc": 0, "start":
"2018-12-03 13:20:18.921599", "stderr": "", "stderr_lines": [],
"stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\":
true, \"extra\":
\"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=156443
(Mon Dec 3 13:20:16
2018)\\nhost-id=1\\nscore=0\\nvm_conf_refresh_time=156443 (Mon Dec 3
13:20:16
2018)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineUnexpectedlyDown\\nstopped=False\\ntimeout=Fri
Jan 2 20:29:01 1970\\n\", \"hostname\":
\"epycdphv02.office.databay.de\", \"host-id\": 1, \"engine-status\":
{\"reason\": \"bad vm status\", \"health\": \"bad\", \"vm\":
\"down_unexpected\", \"detail\": \"Down\"}, \"score\": 0, \"stopped\":
false, \"maintenance\": false, \"crc32\": \"d3355c40\",
\"local_conf_timestamp\": 156443, \"host-ts\": 156443},
\"global_maintenance\": false}", "stdout_lines": ["{\"1\":
{\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\":
\"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=156443
(Mon Dec 3 13:20:16
2018)\\nhost-id=1\\nscore=0\\nvm_conf_refresh_time=156443 (Mon Dec 3
13:20:16
2018)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineUnexpectedlyDown\\nstopped=False\\ntimeout=Fri
Jan 2 20:29:01 1970\\n\", \"hostname\":
\"epycdphv02.office.databay.de\", \"host-id\": 1, \"engine-status\":
{\"reason\": \"bad vm status\", \"health\": \"bad\", \"vm\":
\"down_unexpected\", \"detail\": \"Down\"}, \"score\": 0, \"stopped\":
false, \"maintenance\": false, \"crc32\": \"d3355c40\",
\"local_conf_timestamp\": 156443, \"host-ts\": 156443},
\"global_maintenance\": false}"]}
[ INFO ] TASK [Check VM status at virt level]
[ INFO ] TASK [Fail if engine VM is not running]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
"Engine VM is not running, please check vdsm logs"}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing
ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Fetch logs from the engine VM]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Set destination directory path]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Create destination directory]
[ INFO ] changed: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Find the local appliance image]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Set local_vm_disk_path]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Give the vm time to flush dirty buffers]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Copy engine logs]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Remove local vm dir]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Remove temporary entry in /etc/hosts for the local VM]
[ INFO ] ok: [localhost]
[ INFO ] Generating answer file
'/var/lib/ovirt-hosted-engine-setup/answers/answers-20181203132110.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the
issue, fix accordingly or re-deploy from scratch.
Log file is located at
/var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20181203124703-45t4ja.log
--
*Ralf Schenk*
6 years