
On 2020/9/21 14:09, Yedidyah Bar David wrote:
On Fri, Sep 18, 2020 at 3:50 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
On 2020/9/17 17:42, Yedidyah Bar David wrote:
On Thu, Sep 17, 2020 at 11:57 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
On Thu, Sep 17, 2020 at 11:29 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
On 2020/9/17 15:07, Yedidyah Bar David wrote:
> On Thu, Sep 17, 2020 at 8:16 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
>> On 2020/9/16 15:53, Yedidyah Bar David wrote:
>>> On Wed, Sep 16, 2020 at 10:46 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
>>>> On 2020/9/16 15:12, Yedidyah Bar David wrote:
>>>>> On Wed, Sep 16, 2020 at 6:10 AM Adam Xu <adam_xu@adagene.com.cn> wrote:
>>>>>> Hi ovirt
>>>>>>
>>>>>> I just tried to upgrade a self-hosted engine from 4.3.10 to 4.4.1.4. I followed the steps in the document:
>>>>>>
>>>>>> https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
>>>>>>
>>>>>> The old 4.3 env has an FC storage as the engine storage domain, and I have created a new FC storage vv for the new storage domain to be used in the next steps.
>>>>>>
>>>>>> I backed up the old 4.3 env and prepared a totally new host to restore the env.
>>>>>>
>>>>>> In chapter 4.4 step 8, it said:
>>>>>>
>>>>>> "During the deployment you need to provide a new storage domain. The deployment script renames the 4.3 storage domain and retains its data."
>>>>>>
>>>>>> It does rename the old storage domain, but it didn't let me choose a new storage domain during the deployment. So the new engine was just deployed in the new host's local storage and cannot move to the FC storage domain.
>>>>>>
>>>>>> Can anyone tell me what the problem is?
>>>>> What do you mean in "deployed in the new host's local storage"?
>>>>>
>>>>> Did deploy finish successfully?
>>>> I think it was not finished yet.
>>> You did 'hosted-engine --deploy --restore-from-file=something', right?
>>>
>>> Did this finish?
>> not finished yet.
>>> What are the last few lines of the output?
>> [ INFO ] You can now connect to
>> https://ovirt6.ntbaobei.com:6900/ovirt-engine/ and check the status of
>> this host and eventually remediate it, please continue only when the
>> host is listed as 'up'
>>
>> [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
>>
>> [ INFO ] ok: [localhost]
>> [ INFO ] TASK [ovirt.hosted_engine_setup : Create temporary lock file]
>> [ INFO ] changed: [localhost]
>>
>> [ INFO ] TASK [ovirt.hosted_engine_setup : Pause execution until
>> /tmp/ansible.g2opa_y6_he_setup_lock is removed, delete it once ready to
>> proceed]
> Great. This means that you replied 'Yes' to 'Pause the execution
> after adding this host to the engine?', and it's now waiting.
>
>> but the new host which run the self-hosted engine's status is
>> "NonOperational" and never will be "up"
> You seem to imply that you expected it to become "up" by itself,
> and that you claim that this will never happen, in which you are
> correct.
>
> But that's not the intention. The message you got is:
>
> You will be able to iteratively connect to the restored engine in
> order to manually review and remediate its configuration before
> proceeding with the deployment:
> please ensure that all the datacenter hosts and storage domain are
> listed as up or in maintenance mode before proceeding.
> This is normally not required when restoring an up to date and
> coherent backup.
>
> This means that it's up to you to handle this nonoperational host,
> and that you are requested to continue (by removing that file) only
> then.
>
> So now, let's try to understand why the host is nonoperational, and
> try to fix that. Ok?
>
> You should be able to find the current (private/local) IP address of
> the engine vm by searching the hosted-engine setup logs for 'local_vm_ip'.
> You can ssh (and scp etc.) there from the host, using user 'root' and
> the password you supplied.
>
> Please check/share all of /var/log/ovirt-engine on the engine vm.
> In particular, please check host-deploy/* logs there. The last lines
> show a summary, like:
>
> HOSTNAME : ok=97 changed=34 unreachable=0 failed=0
> skipped=46 rescued=0 ignored=1

my log here is:
2020-09-17 12:19:40 CST - TASK [Executing post tasks defined by user] ************************************
2020-09-17 12:19:40 CST - PLAY RECAP *********************************************************************
ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 skipped=45 rescued=0 ignored=1

Good.
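The recap check described above can also be scripted when there are several host-deploy logs to go through. This is only a minimal sketch, assuming each log ends with an Ansible PLAY RECAP line in the format quoted in this thread; `parse_recap` and `hosts_with_problems` are hypothetical helper names, not part of oVirt or Ansible.

```python
import re

# Matches one host entry of an Ansible PLAY RECAP, for example:
#   ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 ...
# Assumption: the counters are the last thing on the line.
RECAP_LINE = re.compile(r"(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Return {host: {counter_name: int}} for every recap line in text."""
    results = {}
    for line in text.splitlines():
        match = RECAP_LINE.search(line)
        if match:
            pairs = (item.split("=") for item in match.group("counters").split())
            results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def hosts_with_problems(recap):
    """Return the hosts whose recap reports failed or unreachable tasks."""
    return [
        host
        for host, counters in recap.items()
        if counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0
    ]


# The recap quoted in this thread.
sample = (
    "2020-09-17 12:19:40 CST - PLAY RECAP ****\n"
    "ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 "
    "skipped=45 rescued=0 ignored=1\n"
)
recap = parse_recap(sample)
print(recap["ovirt2.ntbaobei.com"]["failed"])  # prints 0
print(hosts_with_problems(recap))              # prints []
```

An empty list here only says that host deploy itself did not fail; it does not explain why the host is nonoperational afterwards.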
> Is 'failed' higher than 0? If so, please find the failed task and
> check/share the relevant error (or just the entire file).
>
> Also, please check engine.log there for any ' ERROR '.

I collected some error logs from engine.log:

Only those below?
2020-09-17 12:14:35,084+08 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83) [4a6cf221] Command 'UploadStreamVDSCommand(HostName = ovirt6.ntbaobei.com, UploadStreamVDSCommandParameters:{hostId='784eada4-49e3-4d6c-95cd-f7c81337c2f7'})' execution failed: java.net.SocketException: Connection reset

This, and similar ones, are expected - the engine is still on the private network, so it can't access the other hosts.
...
2020-09-17 12:14:35,085+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83) [4a6cf221] Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketException: Connection reset (Failed with error VDS_NETWORK_ERROR and code 5022)
...
2020-09-17 12:14:40,322+08 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-53) [8b0987a ] Can not run fence action on host 'ovirt2.ntbaobei.com', no suitable proxy host was found.

Not sure why it would want to fence ovirt2, but I think it can be ignored for now as well.
...
2020-09-17 12:14:48,861+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [4a6cf221] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand' with failure.

Same - it can't access the storage, so updating ovfstore fails. OK.
2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] Failed to update OVF_STORE content
2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] Command 'ProcessOvfUpdateForStorageDomain' id: '8e6e1fa1-1fdf-4928-9153-4fe2ae9b77b0' with children [1c4d99f8-2d05-4b0a-938b-8733157778e1, 62caf674-5567-461c-8e86-4ed7b03306af] failed when attempting to perform the next operation, marking as 'ACTIVE'
2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] null: java.lang.RuntimeException

Same.
Are these the only errors?
In particular, try to search for 'ovirt2' (your host's name), try to find when it became nonoperational, and check errors around this.
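One way to narrow engine.log down as suggested is to list only the ERROR lines that sit close to a mention of the host name. The following is a rough sketch, not an oVirt tool: `error_lines` and `errors_near_host` are hypothetical helpers, the window of 20 lines is an arbitrary assumption, and the sample lines are abbreviated from the ones quoted in this thread.

```python
def error_lines(lines):
    """Yield (index, line) for every line logged at ERROR level."""
    for index, line in enumerate(lines):
        if " ERROR " in line:
            yield index, line.rstrip("\n")


def errors_near_host(lines, host, window=20):
    """Return ERROR lines within `window` lines of any mention of `host`."""
    mentions = [i for i, line in enumerate(lines) if host in line]
    return [
        (index, line)
        for index, line in error_lines(lines)
        if any(abs(index - m) <= window for m in mentions)
    ]


# Abbreviated from the engine.log excerpts quoted in this thread.
sample = [
    "2020-09-17 12:14:35,084+08 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] Connection reset",
    "2020-09-17 12:14:40,322+08 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] Can not run fence action on host 'ovirt2.ntbaobei.com'",
]
for index, line in errors_near_host(sample, "ovirt2.ntbaobei.com", window=0):
    print(index, line)

# On the engine VM the same helpers could be fed the real file:
# with open("/var/log/ovirt-engine/engine.log", errors="replace") as handle:
#     hits = errors_near_host(handle.readlines(), "ovirt2.ntbaobei.com")
```

The timestamps of the matching lines can then be compared with the time the host was reported as nonoperational.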
On 2020/9/17 16:38, Yedidyah Bar David wrote:

The host has permission to access the storage. I don't know why it can't access the storage.

Me neither, but that's still irrelevant. First the node has to be Up, then you should check the storage.
Should I use one host of the original cluster to install the new self-hosted engine and restore the backup file?

I thought this is what you did, no?
Please explain what you did.

For example, I have an oVirt cluster with 3 hosts, named ovirt1.example.com, ovirt2.example.com and ovirt3.example
I backed up the engine and prepared a new host named ovirt4.example.com to restore the backup file. Is that why ovirt4 cannot manage the storage domain?
No.
OK. I have given up, since we have to endure a certain amount of downtime. In the end, I created a new oVirt setup and used an export domain to migrate all the VMs to it. I know it's ugly, but it works.
Best regards,
Thanks,
Thanks,
> Good luck and best regards,
>
>>> Please also check/share logs from /var/log/ovirt-hosted-engine-setup/*
>>> (including subdirs).
>>> no more errors there, just a lot of DEBUG messages.
>>>> It didn't tell me to choose a new
>>>> storage domain and just gave me the new host's FQDN as the engine's URL,
>>>> like host6.example.com:6900 .
>>> Yes, that's temporarily, to let you access the engine VM (on the local network).
>>>
>>>> I can log in using host6.example.com:6900 and I saw the engine vm running
>>>> in host6's /tmp dir.
>>>>
>>>>> HE deploy (since 4.3) first creates a VM for the engine on local
>>>>> storage, then prompts you to provide the storage you want to use, and
>>>>> then moves the VM disk image there.
>>>>>
>>>>> Best regards,
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>> Adam Xu
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list -- users@ovirt.org
>>>>>> To unsubscribe send an email to users-leave@ovirt.org
>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XHDGJB2ZAFS7AJ...
>>>> --
>>>> Adam Xu
>>>> Phone: 86-512-8777-3585
>>>> Adagene (Suzhou) Limited
>>>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
--
Adam Xu
Phone: 86-512-8777-3585
Adagene (Suzhou) Limited
C14, No. 218, Xinghu Street, Suzhou Industrial Park