
On 2020/9/17 16:38, Yedidyah Bar David wrote:
On Thu, Sep 17, 2020 at 11:29 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
On 2020/9/17 15:07, Yedidyah Bar David wrote:
On Thu, Sep 17, 2020 at 8:16 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
On Wed, Sep 16, 2020 at 10:46 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
On 2020/9/16 15:12, Yedidyah Bar David wrote:
> On Wed, Sep 16, 2020 at 6:10 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
>> Hi ovirt
>>
>> I just tried to upgrade a self-hosted engine from 4.3.10 to 4.4.1.4. I followed the steps in the document:
>>
>> https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
>>
>> The old 4.3 env has an FC storage as the engine storage domain, and I have created a new FC storage vv for the new storage domain to be used in the next steps.
>>
>> I backed up the old 4.3 env and prepared a completely new host to restore the env on.
>>
>> In chapter 4.4, step 8, it says:
>>
>> "During the deployment you need to provide a new storage domain. The deployment script renames the 4.3 storage domain and retains its data."
>>
>> It does rename the old storage domain, but it didn't let me choose a new storage domain during the deployment. So the new engine was just deployed in the new host's local storage and cannot be moved to the FC storage domain.
>>
>> Can anyone tell me what the problem is?
> What do you mean by "deployed in the new host's local storage"?
>
> Did deploy finish successfully?
I think it was not finished yet.
You did 'hosted-engine --deploy --restore-from-file=something', right?
On 2020/9/16 15:53, Yedidyah Bar David wrote:
Did this finish?
Not finished yet.
What are the last few lines of the output?
[ INFO ] You can now connect to https://ovirt6.ntbaobei.com:6900/ovirt-engine/ and check the status of this host and eventually remediate it, please continue only when the host is listed as 'up'
[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Create temporary lock file]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Pause execution until /tmp/ansible.g2opa_y6_he_setup_lock is removed, delete it once ready to proceed]
Great. This means that you replied 'Yes' to 'Pause the execution after adding this host to the engine?', and it's now waiting.
But the status of the new host that runs the self-hosted engine is "NonOperational", and it will never be "up".
You seem to imply that you expected it to become "up" by itself, and you claim that this will never happen, in which you are correct.
But that's not the intention. The message you got is:
You will be able to iteratively connect to the restored engine in order to manually review and remediate its configuration before proceeding with the deployment: please ensure that all the datacenter hosts and storage domain are listed as up or in maintenance mode before proceeding. This is normally not required when restoring an up to date and coherent backup.
This means that it's up to you to handle this nonoperational host, and that you are requested to continue (by removing that file) only then.
So now, let's try to understand why the host is nonoperational, and try to fix that. Ok?
You should be able to find the current (private/local) IP address of the engine vm by searching the hosted-engine setup logs for 'local_vm_ip'. You can ssh (and scp etc.) there from the host, using user 'root' and the password you supplied.
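The log search described above could be sketched roughly as follows. This is only an illustration, not part of hosted-engine itself: the helper name `find_local_vm_ip` is hypothetical, and it assumes the setup logs are `*.log` files under the directory you pass in (on the host, /var/log/ovirt-hosted-engine-setup and its subdirectories).

```python
from pathlib import Path


def find_local_vm_ip(log_dir):
    """Return (file name, line) pairs for every log line mentioning 'local_vm_ip'.

    Hypothetical helper: it only does a plain substring search and makes no
    assumption about the exact format of the matching lines.
    """
    hits = []
    # rglob also descends into subdirectories of the setup log directory.
    for path in sorted(Path(log_dir).rglob("*.log")):
        with open(path, errors="replace") as fh:
            for line in fh:
                if "local_vm_ip" in line:
                    hits.append((path.name, line.rstrip("\n")))
    return hits
```

On the host this would be called as `find_local_vm_ip("/var/log/ovirt-hosted-engine-setup")`; the matching lines should contain the address to ssh to.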
Please check/share all of /var/log/ovirt-engine on the engine vm. In particular, please check host-deploy/* logs there. The last lines show a summary, like:
HOSTNAME : ok=97 changed=34 unreachable=0 failed=0 skipped=46 rescued=0 ignored=1
My log here is:
2020-09-17 12:19:40 CST - TASK [Executing post tasks defined by user] ************************************
2020-09-17 12:19:40 CST - PLAY RECAP *********************************************************************
ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 skipped=45 rescued=0 ignored=1
Good.
Is 'failed' higher than 0? If so, please find the failed task and check/share the relevant error (or just the entire file).
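If you prefer to check the summary line programmatically, a minimal sketch might look like this. It assumes the recap line has the `HOST : key=value ...` shape shown in this thread; `parse_recap_line` and `has_failed_tasks` are hypothetical helper names, not part of Ansible or oVirt.

```python
import re

# Matches lines such as:
#   ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 ...
RECAP_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap_line(line):
    """Return (host, {counter: value}) for a recap line, or None if it does not match."""
    match = RECAP_RE.match(line.strip())
    if match is None:
        return None
    counters = {}
    for pair in match.group("counters").split():
        key, value = pair.split("=")
        counters[key] = int(value)
    return match.group("host"), counters


def has_failed_tasks(counters):
    """True when the 'failed' counter is higher than 0."""
    return counters.get("failed", 0) > 0
```

For the recap line quoted in this thread, `failed` is 0, so the host-deploy run itself did not report a failed task.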
Also, please check engine.log there for any ' ERROR '.
I collected some error logs from engine.log.
Only those below?
2020-09-17 12:14:35,084+08 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83) [4a6cf221] Command 'UploadStreamVDSCommand(HostName = ovirt6.ntbaobei.com, UploadStreamVDSCommandParameters:{hostId='784eada4-49e3-4d6c-95cd-f7c81337c2f7'})' execution failed: java.net.SocketException: Connection reset
This, and similar ones, are expected - the engine is still on the private network, so it can't access the other hosts.
...
2020-09-17 12:14:35,085+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83) [4a6cf221] Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketException: Connection reset (Failed with error VDS_NETWORK_ERROR and code 5022)
...
2020-09-17 12:14:40,322+08 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-53) [8b0987a ] Can not run fence action on host 'ovirt2.ntbaobei.com', no suitable proxy host was found.
Not sure why it would want to fence ovirt2, but I think it can be ignored for now as well.
...
2020-09-17 12:14:48,861+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [4a6cf221] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand' with failure.
Same - it can't access the storage, so updating ovfstore fails. OK.
2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] Failed to update OVF_STORE content
2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] Command 'ProcessOvfUpdateForStorageDomain' id: '8e6e1fa1-1fdf-4928-9153-4fe2ae9b77b0' with children [1c4d99f8-2d05-4b0a-938b-8733157778e1, 62caf674-5567-461c-8e86-4ed7b03306af] failed when attempting to perform the next operation, marking as 'ACTIVE'
2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] null: java.lang.RuntimeException
Same.
Are these the only errors?
In particular, try to search for 'ovirt2' (your host's name), try to find when it became nonoperational, and check errors around this.
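One way to do this kind of search, sketched in Python under the assumption that engine.log is a plain text file with one entry per line (as in the excerpts above). The function names `error_lines` and `grep_with_context` are hypothetical; this is just a rough equivalent of grepping with a few lines of context.

```python
def error_lines(lines):
    """Return the lines that contain the ' ERROR ' marker."""
    return [line.rstrip("\n") for line in lines if " ERROR " in line]


def grep_with_context(lines, needle, context=3):
    """Return (line number, text) for lines containing needle, plus nearby lines.

    'context' is how many lines before and after each match to keep, so that
    errors logged around the moment a host changed state are visible too.
    """
    lines = [line.rstrip("\n") for line in lines]
    keep = set()
    for i, line in enumerate(lines):
        if needle in line:
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    # Line numbers are 1-based, as in most editors and pagers.
    return [(i + 1, lines[i]) for i in sorted(keep)]
```

For example, with the lines of /var/log/ovirt-engine/engine.log read into a list, `grep_with_context(lines, "ovirt2")` would show every mention of the host together with what was logged just before and after it.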
The host has permission to access the storage. I don't know why it cannot access the storage. Should I use one of the hosts from the original cluster to install the new self-hosted engine and restore the backup file?
Thanks,
Good luck and best regards,
Please also check/share logs from /var/log/ovirt-hosted-engine-setup/* (including subdirs).
No more errors there, just a lot of DEBUG messages.
It didn't ask me to choose a new storage domain; it just gave me the new host's FQDN as the engine's URL, like host6.example.com:6900.
Yes, that's temporary, to let you access the engine VM (on the local network).
I can log in using host6.example.com:6900, and I saw that the engine VM was running in host6's /tmp dir.
> HE deploy (since 4.3) first creates a VM for the engine on local
> storage, then prompts you to provide the storage you want to use, and
> then moves the VM disk image there.
>
> Best regards,
>
>> Thanks
>>
>> --
>> Adam Xu
>>
>> _______________________________________________
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-leave@ovirt.org
>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XHDGJB2ZAFS7AJ...
--
Adam Xu
Phone: 86-512-8777-3585
Adagene (Suzhou) Limited
C14, No. 218, Xinghu Street, Suzhou Industrial Park