HE deployment on FC (Fibre Channel) disk fails at 99% complete, at the final "hosted-engine --reinitialize-lockspace --force"
by Jeffrey Slapp
This is identical to what many others have encountered, yet nothing definitive has been suggested.
The HE deployment nearly finishes, but shortly after the HE VM is copied to the shared disk, I reach the "initialize lockspace" step and the following error occurs:
20083 2024-06-19 13:24:01,777-0400 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 {'changed': True, 'stdout': '', 'stderr': <traceback below>, 'rc': 1, 'cmd': ['hosted-engine', '--reinitialize-lockspace', '--force'], 'start': '2024-06-19 13:24:01.438386', 'end': '2024-06-19 13:24:01.618227', 'delta': '0:00:00.179841', 'msg': 'non-zero return code', 'invocation': {'module_args': {'_raw_params': 'hosted-engine --reinitialize-lockspace --force', '_uses_shell': False, 'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None, 'chdir': None, 'executable': None, 'creates': None, 'removes': None, 'stdin': None}}, 'stdout_lines': [], 'stderr_lines': <same traceback>, '_ansible_no_log': False, 'attempts': 5}
Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.9/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py", line 30, in <module>
    ha_cli.reset_lockspace(force)
  File "/usr/lib/python3.9/site-packages/ovirt_hosted_engine_ha/client/client.py", line 286, in reset_lockspace
    stats = broker.get_stats_from_storage()
  File "/usr/lib/python3.9/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 148, in get_stats_from_storage
    result = self._proxy.get_stats()
  File "/usr/lib64/python3.9/xmlrpc/client.py", line 1122, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python3.9/xmlrpc/client.py", line 1464, in __request
    response = self.__transport.request(
  File "/usr/lib64/python3.9/xmlrpc/client.py", line 1166, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.9/xmlrpc/client.py", line 1178, in single_request
    http_conn = self.send_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.9/xmlrpc/client.py", line 1291, in send_request
    self.send_content(connection, request_body)
  File "/usr/lib64/python3.9/xmlrpc/client.py", line 1321, in send_content
    connection.endheaders(request_body)
  File "/usr/lib64/python3.9/http/client.py", line 1280, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.9/http/client.py", line 1040, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.9/http/client.py", line 980, in send
    self.connect()
  File "/usr/lib/python3.9/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 76, in connect
    self.sock.connect(base64.b16decode(self.host))
FileNotFoundError: [Errno 2] No such file or directory
No errors in /var/log/messages or /var/log/sanlock.log.
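The FileNotFoundError at the bottom of the traceback is raised while connecting to a unix socket, so it looks like the client cannot reach the ovirt-ha-broker socket at that moment. A quick sanity check I can run on the host (the socket path is my assumption of the default location; adjust if yours differs):
# is the broker service up, and does its socket exist?
systemctl status ovirt-ha-broker
ls -l /var/run/ovirt-hosted-engine-ha/broker.socket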
I have this working with iSCSI on another storage system, but can't seem to get it to work on FC. I have read that the sector sizes could possibly cause this. On my iSCSI system I have [PHY-SEC:LOG-SEC] of [512:512], but on my FC system I have [4096:512].
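For reference, the sector sizes can be compared directly with lsblk (the device path below is a placeholder for the HE LUN's multipath device, not my actual path):
# show physical and logical sector sizes for the HE LUN
lsblk -o NAME,PHY-SEC,LOG-SEC /dev/mapper/<he_lun_wwid>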
I am hoping someone can confirm whether this is the issue or not. Interestingly, all previous "initialize lockspace" phases of the install are fine; it only appears to be this final one.
6 months, 3 weeks
New host cannot connect to master domain
by ovirt@kirschke.de
Hi,
I just installed a new host, which cannot connect to the master domain. The domain is an iSCSI device on a NAS, and oVirt is otherwise fine with it; other hosts have no problem. I'm sure I'm just overlooking something.
What I see in the log is:
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.storagepoolmemorybackend] new storage pool master version 302 and domains map {'6f018fbd-de93-4c56-880d-8ede2aad2674': 'Active', '2c870e06-6c70-45ec-b665-ce29408c8a8e': 'Active', 'a15496dc-c241-4658-af9d-0dfe11783916': 'Active', '41012bfb-b802-4092-b699-7f5284d95c8e': 'Active'} (spbackends:417)
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.storagepool] updating pool 5836aaac-0030-0064-024d-0000000002e4 backend from type NoneType instance 0x7f10d667fb70 to type StoragePoolMemoryBackend instance 0x7f10702d7408 (sp:149)
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.storagepool] Connect host #2 to the storage pool 5836aaac-0030-0064-024d-0000000002e4 with master domain: a15496dc-c241-4658-af9d-0dfe11783916 (ver = 302) (sp:699)
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.storagedomaincache] Invalidating storage domain cache (sdc:57)
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.storagedomaincache] Clearing storage domain cache (sdc:182)
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.storagedomaincache] Refreshing storage domain cache (resize=True) (sdc:63)
2024-04-16 11:10:06,320-0400 INFO (jsonrpc/3) [storage.iscsi] Scanning iSCSI devices (iscsi:445)
2024-04-16 11:10:06,456-0400 INFO (jsonrpc/3) [storage.iscsi] Scanning iSCSI devices: 0.14 seconds (utils:373)
2024-04-16 11:10:06,456-0400 INFO (jsonrpc/3) [storage.hba] Scanning FC devices (hba:42)
2024-04-16 11:10:06,481-0400 INFO (jsonrpc/3) [storage.hba] Scanning FC devices: 0.03 seconds (utils:373)
2024-04-16 11:10:06,481-0400 INFO (jsonrpc/3) [storage.multipath] Waiting until multipathd is ready (multipath:95)
2024-04-16 11:10:08,498-0400 INFO (jsonrpc/3) [storage.multipath] Waited 2.02 seconds for multipathd (tries=2, ready=2) (multipath:122)
2024-04-16 11:10:08,498-0400 INFO (jsonrpc/3) [storage.multipath] Resizing multipath devices (multipath:223)
2024-04-16 11:10:08,499-0400 INFO (jsonrpc/3) [storage.multipath] Resizing multipath devices: 0.00 seconds (utils:373)
2024-04-16 11:10:08,499-0400 INFO (jsonrpc/3) [storage.storagedomaincache] Refreshing storage domain cache: 2.18 seconds (utils:373)
2024-04-16 11:10:08,499-0400 INFO (jsonrpc/3) [storage.storagedomaincache] Looking up domain a15496dc-c241-4658-af9d-0dfe11783916 (sdc:154)
2024-04-16 11:10:08,536-0400 INFO (jsonrpc/3) [storage.storagedomaincache] Looking up domain a15496dc-c241-4658-af9d-0dfe11783916: 0.04 seconds (utils:373)
2024-04-16 11:10:08,537-0400 INFO (jsonrpc/3) [vdsm.api] FINISH connectStoragePool error=Cannot find master domain: 'spUUID=5836aaac-0030-0064-024d-0000000002e4, msdUUID=a15496dc-c241-4658-af9d-0dfe11783916' from=::ffff:10.2.0.4,44914, flow_id=44d9e674, task_id=32f0ea2a-044b-4a81-a12e-a269e59a802b (api:35)
2024-04-16 11:10:08,537-0400 ERROR (jsonrpc/3) [storage.taskmanager.task] (Task='32f0ea2a-044b-4a81-a12e-a269e59a802b') Unexpected error (task:860)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/sp.py", line 1550, in setMasterDomain
domain = sdCache.produce(msdUUID)
File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 98, in produce
domain.getRealDomain()
File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 34, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 122, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 139, in _findDomain
return findMethod(sdUUID)
File "/usr/lib/python3.6/site-packages/vdsm/storage/sdc.py", line 169, in _findUnfetchedDomain
raise se.StorageDomainDoesNotExist(sdUUID)
vdsm.storage.exception.StorageDomainDoesNotExist: Storage domain does not exist: ('a15496dc-c241-4658-af9d-0dfe11783916',)
During handling of the above exception, another exception occurred:
....
The domain a15496dc-c241-4658-af9d-0dfe11783916 definitely exists, and works on other systems.
The new host can access the iSCSI targets, and I am able to log into them.
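In case it helps, here is roughly what I have checked on the new host so far (a sketch; as far as I understand, on block storage domains the LVM volume group is named after the storage domain UUID):
# is the domain's LVM volume group visible on this host?
vgs | grep a15496dc
# are the multipath devices present?
multipath -ll
# rescan the iSCSI sessions in case the LUN is not mapped yet
iscsiadm -m session --rescan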
Does anyone know which blinder I have to remove so that I can see the actual problem?
Thanks and best regards
Steffen
6 months, 3 weeks
Admin interface broken
by ovirt@kirschke.de
Hi
I tried to migrate my oVirt engine from a CentOS 8 to a CentOS 9 machine. So I made a backup, created and configured a new machine, stopped the old one, started the new one with an identical network configuration, restored the backup, and ran engine-setup to set up the new site, expecting it to work. It did not, but I thought: no problem, I still have my old one.
Now, going back to the old machine, when logging into the Admin Portal I get these error messages:
- Error while executing action: A Request to the Server failed: Type 'org.ovirt.engine.core.common.queries.QueryType' was not assignable to 'com.google.gwt.user.client.rpc.IsSerializable' and did not have a custom field serializer. For security purposes, this type will not be deserialized.
- Error while executing query: null
Each appears in a box, which I click away, with the messages toggling.
After a while, the background of the Admin Portal's dashboard becomes visible, but it remains empty, with the 'loading' spinner.
In the server log I only see the following exception:
Caused by: com.google.gwt.user.client.rpc.SerializationException: Type 'org.ovirt.engine.core.common.queries.QueryType' was not assignable to 'com.google.gwt.user.client.rpc.IsSerializable' and did not have a custom field serializer. For security purposes, this type will not be deserialized.
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.LegacySerializationPolicy.validateDeserialize(LegacySerializationPolicy.java:128)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader.deserialize(ServerSerializationStreamReader.java:676)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader.readObject(ServerSerializationStreamReader.java:592)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.core.java.util.Collection_ServerCustomFieldSerializerBase.deserialize(Collection_ServerCustomFieldSerializerBase.java:38)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.core.java.util.ArrayList_ServerCustomFieldSerializer.deserialize(ArrayList_ServerCustomFieldSerializer.java:40)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.core.java.util.ArrayList_ServerCustomFieldSerializer.deserializeInstance(ArrayList_ServerCustomFieldSerializer.java:54)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.core.java.util.ArrayList_ServerCustomFieldSerializer.deserializeInstance(ArrayList_ServerCustomFieldSerializer.java:33)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader.deserializeImpl(ServerSerializationStreamReader.java:887)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader.deserialize(ServerSerializationStreamReader.java:687)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader.readObject(ServerSerializationStreamReader.java:592)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader$ValueReader$8.readValue(ServerSerializationStreamReader.java:149)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.impl.ServerSerializationStreamReader.deserializeValue(ServerSerializationStreamReader.java:434)
at deployment.engine.ear.webadmin.war//com.google.gwt.user.server.rpc.RPC.decodeRequest(RPC.java:312)
... 64 more
Which, of course, I do not understand; I cannot tell what it is trying to say.
The VM Portal is still reachable and seems to work.
Has anybody seen an effect like this, or, even better, does anyone have an idea what I am doing wrong, how I can fix it, or where I can look to find the root cause of this behaviour?
Forgot to mention: desperate as I am, I also tried upgrading to 4.5.7-master, which worked without any problem or warning, but the effect is still the same.
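One thing I still plan to rule out (just my assumption, not a confirmed cause): a stale GWT client cached by the browser that no longer matches the serialization policy of the engine actually running. Roughly:
# confirm which engine build is serving the portal
rpm -q ovirt-engine
# then reload the Admin Portal with a hard refresh / cleared browser cache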
Thanks and best regards
Steffen
6 months, 3 weeks
oVirt 4.5 installation failure on Rocky 8
by 盛家杰
Hello oVirt Devs,
I am new to oVirt and am trying to install oVirt 4.5 on a Rocky 8 machine, and I am stuck at the first step: enabling the oVirt engine repo.
1. I followed the manual to do the yum repo configuration, using the bash code in the "Rocky" part of https://www.ovirt.org/download/install_on_rhel.html
# On Rocky there's an issue with centos-release-nfv package from extras
dnf install -y dnf-plugins-core
dnf config-manager --set-disabled extras
2. I tried to install oVirt 4.5 with "dnf install -y centos-release-ovirt45", but it fails with "unable to find a match". I found that centos-release-ovirt45 is listed in the extras repo, and the extras repo is now disabled. If the extras repo needs to be disabled, how can I install centos-release-ovirt45?
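One workaround I am considering (untested, so please correct me if it defeats the purpose of disabling extras): re-enable the extras repo for just this one transaction:
dnf --enablerepo=extras install -y centos-release-ovirt45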
Any suggestions would be highly appreciated!
Jiajie
6 months, 4 weeks
How to access the ovirt node individually?
by De Lee
Hi,
I've installed the oVirt Manager (engine) on top of an oVirt node. For some reason the node went offline, and I am unable to start the oVirt Manager.
Is there an option to enable auto-start of the oVirt Manager? And can I connect to the console of the oVirt node and try to power on the VM?
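For what it's worth, if this is a hosted-engine deployment, this is what I have been trying from the node's shell (a sketch based on my understanding of the hosted-engine CLI; the maintenance step is my guess as to why the engine may not auto-start):
# check the HA state of the hosted engine
hosted-engine --vm-status
# leave global maintenance if it is set, so the HA agent can auto-start the engine
hosted-engine --set-maintenance --mode=none
# or start the engine VM manually
hosted-engine --vm-start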
6 months, 4 weeks
Please tell me how to deal with the error
by d_hashima@sagaidc.co.jp
I'm currently installing the oVirt 4.5 self-hosted engine on RHEL 9.
Installation fails due to the following error.
Could you please give me some advice?
error contents:
"Failed to download metadata for repo 'centos-ceph-pacific': Cannot prepare internal mirrorlist: No URLs in mirrorlist"
7 months
Add new host fails and reinstall fails
by max.tseng@extremedata.com.tw
Hi Sir,
I want to create a new host server, but I always get this error message:
2024-06-19 14:52:59,440+08 WARN [org.apache.sshd.client.session.ClientConnectionService] (sshd-SshClient[79ddc832]-nio2-thread-8) globalRequest(ClientConnectionService[ClientSessionImpl[root@/172.16.1.40:22]])[hostkeys-00@openssh.com, want-reply=false] failed (SshException) to process: EdDSA provider not supported
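From what I can tell, "EdDSA provider not supported" is logged when the engine's Java SSH client has no provider for ssh-ed25519 host keys; whether that is the actual cause of the failure is only my guess. To see which host key types the new host offers:
# list the host key types presented by the target host
ssh-keyscan 172.16.1.40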
7 months
Out-of-sync - Host Network's configurations differ from DC
by munnadawood@gmail.com
Hello Team,
We had a power outage last week, and since then we have been observing errors on our oVirt cluster: all the hosts went out-of-sync.
When I select my ovirtmgmt interface it shows the out-of-sync symbol, with the IPv4 gateway shown as (Host - Null, DC - 10.xx..xx.xx).
Same with my VM traffic network: it shows the out-of-sync symbol, with the IPv4 gateway shown as (Host - Null, DC - 10.xx..xx.xx).
Please help me with my troubleshooting, as we lost access to our VMs after the power outage.
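If it helps others, the next thing I intend to try is re-syncing the host networks from the engine ("Sync All Networks" on the host's Network Interfaces tab), or the equivalent REST call (host id and credentials below are placeholders; I believe the action is called syncallnetworks, but please double-check):
curl -k -u 'admin@internal:PASSWORD' -X POST -H 'Content-Type: application/xml' \
  -d '<action/>' https://engine.example.com/ovirt-engine/api/hosts/<host-id>/syncallnetworks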
Thanks,
Dawood.
7 months
Deployment Failure with oVirt Hosted Engine on CentOS Stream 9
by khaltarabien@gmail.com
Hello,
I've encountered a failure while trying to deploy the oVirt Hosted Engine using the oVirt Node NG Installer on a system running CentOS Stream 9. Here are the details:
Environment:
OS: CentOS Stream 9 (oVirt Node 4.5.5)
oVirt Version: 4.5.5
Installer ISO: oVirt Node NG Installer 4.5.5-2023113015 for el9
Issue:
During the deployment of the hosted engine, the process failed with several error messages, most notably a failure to download metadata for the 'centos-ceph-pacific' repository (no URLs in the mirrorlist) and a failure deploying the engine on the local VM.
Error Messages:
Failed to download metadata for repo 'centos-ceph-pacific': Cannot prepare internal mirrorlist: No URLs in mirrorlist
There was a failure deploying the engine on the local engine VM. The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Deprecation warning for the Python crypt module.
Steps Taken:
Attempted deployment using the hosted-engine --deploy command.
Used the specified installer ISO.
Seeking Help:
Has anyone faced similar issues, or does anyone have suggestions on how to resolve the mirrorlist problem and the deployment issue? Any help or guidance would be greatly appreciated!
2024-06-13 00:07:19,621+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost -> 192.168.222.192]: FAILED! => {"changed": false, "msg": "Failed to download metadata for repo 'centos-ceph-pacific': Cannot prepare internal mirrorlist: No URLs in mirrorlist", "rc": 1, "results": []}
2024-06-13 00:07:48,484+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "msg": "There was a failure deploying the engine on the local engine VM. The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
2024-06-13 00:07:49,088+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:238 b'[DEPRECATION WARNING]: Encryption using the Python crypt module is deprecated. \n'
2024-06-13 00:07:49,088+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:238 b'The Python crypt module is deprecated and will be removed from Python 3.13. \n'
2024-06-13 00:07:49,089+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:238 b'Install the passlib library for continued encryption functionality. This \n'
2024-06-13 00:07:49,089+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:238 b'feature will be removed in version 2.17. Deprecation warnings can be disabled \n'
2024-06-13 00:07:49,090+0000 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:238 b'by setting deprecation_warnings=False in ansible.cfg.\n'
2024-06-13 00:07:49,094+0000 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Closing up': Failed executing ansible-playbook
2024-06-13 00:08:11,089+0000 ERROR otopi.plugins.gr_he_common.core.misc misc._terminate:164 Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
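As an aside, the crypt-module deprecation lines at the end are only warnings, not the failure itself; per the message in the log, they can be silenced in ansible.cfg:
[defaults]
deprecation_warnings = False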
7 months