On Thu, Sep 17, 2020 at 11:29 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>
> On 2020/9/17 15:07, Yedidyah Bar David wrote:
>> On Thu, Sep 17, 2020 at 8:16 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>> On 2020/9/16 15:53, Yedidyah Bar David wrote:
>>>> On Wed, Sep 16, 2020 at 10:46 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>> On 2020/9/16 15:12, Yedidyah Bar David wrote:
>>>>>> On Wed, Sep 16, 2020 at 6:10 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>>> Hi oVirt,
>>>>>>>
>>>>>>> I am trying to upgrade a self-hosted engine from 4.3.10 to
>>>>>>> 4.4.1.4. I followed the steps in the document:
>>>>>>>
>>>>>>> https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
>>>>>>>
>>>>>>> The old 4.3 environment uses FC storage for the engine storage
>>>>>>> domain, and I have created a new FC storage vv for the new storage
>>>>>>> domain to be used in the next steps.
>>>>>>>
>>>>>>> I backed up the old 4.3 environment and prepared a completely new
>>>>>>> host to restore it on.
>>>>>>>
>>>>>>> In chapter 4.4, step 8, it says:
>>>>>>>
>>>>>>> "During the deployment you need to provide a new storage
>>>>>>> domain. The deployment script renames the 4.3 storage domain and
>>>>>>> retains its data."
>>>>>>>
>>>>>>> It does rename the old storage domain, but it didn't let me
>>>>>>> choose a new storage domain during the deployment. So the new engine
>>>>>>> was just deployed on the new host's local storage and cannot be
>>>>>>> moved to the FC storage domain.
>>>>>>>
>>>>>>> Can anyone tell me what the problem is?
>>>>>> What do you mean by "deployed in the new host's local storage"?
>>>>>>
>>>>>> Did deploy finish successfully?
>>>>> I think it has not finished yet.
>>>> You did 'hosted-engine --deploy --restore-from-file=something', right?
>>>>
>>>> Did this finish?
>>> not finished yet.
>>>> What are the last few lines of the output?
>>> [ INFO ] You can now connect to
>>> https://ovirt6.ntbaobei.com:6900/ovirt-engine/ and check the status of
>>> this host and eventually remediate it, please continue only when the
>>> host is listed as 'up'
>>>
>>> [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
>>>
>>> [ INFO ] ok: [localhost]
>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Create temporary lock file]
>>> [ INFO ] changed: [localhost]
>>>
>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Pause execution until
>>> /tmp/ansible.g2opa_y6_he_setup_lock is removed, delete it once ready to
>>> proceed]
>> Great. This means that you replied 'Yes' to 'Pause the execution
>> after adding this host to the engine?', and it's now waiting.
>>
>>> But the status of the new host that runs the self-hosted engine is
>>> "NonOperational", and it will never become "up".
>> You seem to imply that you expected it to become "up" by itself,
>> and you claim that this will never happen, in which you are
>> correct.
>>
>> But that's not the intention. The message you got is:
>>
>> You will be able to iteratively connect to the restored engine in
>> order to manually review and remediate its configuration before
>> proceeding with the deployment:
>> please ensure that all the datacenter hosts and storage domain are
>> listed as up or in maintenance mode before proceeding.
>> This is normally not required when restoring an up to date and
>> coherent backup.
>>
>> This means that it's up to you to handle this nonoperational host,
>> and that you are requested to continue (by removing that file) only
>> then.
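To make the pause mechanism concrete, here is a minimal Python sketch of the "wait until a lock file is removed" pattern. It is only an illustration and not the actual hosted-engine setup code; the function name wait_until_removed, the short timeouts, and the temporary stand-in file are my own assumptions.

```python
import os
import tempfile
import time


def wait_until_removed(path, poll_seconds=0.05, timeout=5.0):
    """Block until `path` no longer exists; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


# Stand-in for the lock file (hypothetical); the real path is printed by the
# setup task, e.g. /tmp/ansible.g2opa_y6_he_setup_lock in the output above.
fd, lock_path = tempfile.mkstemp(suffix="_he_setup_lock")
os.close(fd)

# While the file exists, the waiter keeps waiting (here it simply times out).
still_waiting = not wait_until_removed(lock_path, timeout=0.2)

# Removing the file is the manual step that lets the deployment proceed.
os.remove(lock_path)
resumed = wait_until_removed(lock_path, timeout=0.2)

print(still_waiting, resumed)
```

In the real deployment there is no timeout to tune on your side; the only action expected from you is deleting the lock file once the host is in a good state.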
>>
>> So now, let's try to understand why the host is nonoperational, and
>> try to fix that. Ok?
>>
>> You should be able to find the current (private/local) IP address of
>> the engine vm by searching the hosted-engine setup logs for 'local_vm_ip'.
>> You can ssh (and scp etc.) there from the host, using user 'root' and
>> the password you supplied.
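If grepping by hand is inconvenient, a small Python sketch along these lines could pull the address out of the setup logs. The exact format of the 'local_vm_ip' log lines is an assumption on my part, so the code just collects any IPv4 address found on a line that mentions it; the sample lines and the function name find_local_vm_ips are invented for illustration.

```python
import re

# Loose IPv4 pattern; good enough for picking addresses out of log text.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_local_vm_ips(lines):
    """Collect distinct IPv4 addresses from lines that mention 'local_vm_ip'."""
    found = []
    for line in lines:
        if "local_vm_ip" not in line:
            continue
        for ip in IPV4.findall(line):
            if ip not in found:
                found.append(ip)
    return found


# Hypothetical sample lines, not copied from a real setup log.
sample = [
    "DEBUG hypothetical entry: local_vm_ip = 192.168.222.10",
    "DEBUG unrelated entry mentioning 10.0.0.1",
    "DEBUG hypothetical entry again: local_vm_ip 192.168.222.10",
]

ips = find_local_vm_ips(sample)
print(ips)
```

To use it on a real host, you would feed it the lines of the files under /var/log/ovirt-hosted-engine-setup/ (including subdirectories) instead of the sample list.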
>>
>> Please check/share all of /var/log/ovirt-engine on the engine vm.
>> In particular, please check host-deploy/* logs there. The last lines
>> show a summary, like:
>>
>> HOSTNAME : ok=97 changed=34 unreachable=0 failed=0
>> skipped=46 rescued=0 ignored=1
> my log here is:
>
> 2020-09-17 12:19:40 CST - TASK [Executing post tasks defined by user]
> ************************************
> 2020-09-17 12:19:40 CST - PLAY RECAP
> *********************************************************************
>
> ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 skipped=45 rescued=0 ignored=1
Good.
>> Is 'failed' higher than 0? If so, please find the failed task and
>> check/share the relevant error (or just the entire file).
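The check on the recap line can also be scripted. Below is a minimal Python sketch that parses a summary line of the form shown in this thread and reports whether 'failed' is higher than 0. The function name parse_recap is hypothetical, and the sketch assumes the whole recap entry sits on a single line in the log file.

```python
import re


def parse_recap(line):
    """Split a PLAY RECAP line into (host, {counter_name: value})."""
    host, _, rest = line.partition(" : ")
    counters = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", rest)}
    return host.strip(), counters


# Recap line as reported in this thread, joined onto one line.
recap = (
    "ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 "
    "failed=0 skipped=45 rescued=0 ignored=1"
)

host, counters = parse_recap(recap)
has_failures = counters["failed"] > 0

print(host, counters["failed"], has_failures)
```

For this host the result is no failed tasks, which matches the "Good." above, so the cause of the nonoperational state has to be looked for elsewhere.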
>>
>> Also, please check engine.log there for any ' ERROR '.
> I collected some error entries from engine.log:
Only those below?
> 2020-09-17 12:14:35,084+08 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83)
> [4a6cf221] Command 'UploadStreamVDSCommand(HostName =
> ovirt6.ntbaobei.com,
> UploadStreamVDSCommandParameters:{hostId='784eada4-49e3-4d6c-95cd-f7c81337c2f7'})'
> execution failed: java.net.SocketException: Connection reset
This, and similar ones, are expected - the engine is still on the
private network, so it can't access the other hosts.
> ...
>
> 2020-09-17 12:14:35,085+08 ERROR
> [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83)
> [4a6cf221] Command
> 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed:
> EngineException:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> java.net.SocketException: Connection reset (Failed with error
> VDS_NETWORK_ERROR and code 5022)
>
> ...
>
> 2020-09-17 12:14:40,322+08 ERROR
> [org.ovirt.engine.core.bll.pm.FenceProxyLocator]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-53)
> [8b0987a] Can not run fence action on host 'ovirt2.ntbaobei.com', no suitable
> proxy host was found.
Not sure why it would want to fence ovirt2, but I think it can be ignored
for now as well.
> ...
>
> 2020-09-17 12:14:48,861+08 ERROR
> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2)
> [4a6cf221] Ending command
> 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand'
> with failure.
Same - it can't access the storage, so updating ovfstore fails. OK.
>
> 2020-09-17 12:14:52,630+08 ERROR
> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
> [56d6bb10] Failed to update OVF_STORE content
> 2020-09-17 12:14:52,630+08 ERROR
> [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
> [56d6bb10] Command 'ProcessOvfUpdateForStorageDomain' id:
> '8e6e1fa1-1fdf-4928-9153-4fe2ae9b77b0' with children
> [1c4d99f8-2d05-4b0a-938b-8733157778e1,
> 62caf674-5567-461c-8e86-4ed7b03306af] failed when attempting to perform
> the next operation, marking as 'ACTIVE'
> 2020-09-17 12:14:52,630+08 ERROR
> [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
> [56d6bb10] null: java.lang.RuntimeException
Same.
Are these the only errors?
In particular, try to search for 'ovirt2' (your host's name), try to
find when it became nonoperational, and check errors around this.
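As a rough aid for that search, here is a Python sketch that lists the ' ERROR ' entries of engine.log and can narrow them to those naming a given host. It assumes each entry is a single line in the actual file (they are only wrapped in this mail); the function name find_errors is hypothetical, the first two sample lines are abbreviated from the entries quoted above, and the INFO line is made up.

```python
def find_errors(lines, host=None):
    """Yield (line_number, text) for ERROR entries, optionally only those naming `host`."""
    for number, line in enumerate(lines, start=1):
        if " ERROR " not in line:
            continue
        if host is not None and host not in line:
            continue
        yield number, line.rstrip("\n")


sample = [
    # Abbreviated from the entries quoted in this thread.
    "2020-09-17 12:14:35,084+08 ERROR "
    "[org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] "
    "execution failed: java.net.SocketException: Connection reset",
    "2020-09-17 12:14:40,322+08 ERROR "
    "[org.ovirt.engine.core.bll.pm.FenceProxyLocator] "
    "Can not run fence action on host 'ovirt2.ntbaobei.com', "
    "no suitable proxy host was found.",
    # Hypothetical non-error line, present only to show that it is skipped.
    "2020-09-17 12:14:41,000+08 INFO [example.Placeholder] nothing to see",
]

all_errors = list(find_errors(sample))
host_errors = list(find_errors(sample, host="ovirt2"))

print([number for number, _ in all_errors])
print([number for number, _ in host_errors])
```

The line numbers it returns give you a starting point for reading the surrounding, non-ERROR lines around the time the host became nonoperational.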
> The host has permission to access the storage. I don't know why it
> can't access the storage.
>
> Should I use one host of the original cluster to install the new
> self-hosted engine and restore the backup file?
>
> Thanks,
>> Good luck and best regards,
>>
>>>> Please also check/share logs from /var/log/ovirt-hosted-engine-setup/*
>>>> (including subdirs).
>>> No more errors there, just a lot of DEBUG messages.
>>>>> It didn't ask me to choose a new storage domain; it just gave me
>>>>> the new host's FQDN as the engine's URL, like host6.example.com:6900.
>>>> Yes, that's temporary, to let you access the engine VM (on the
>>>> local network).
>>>>
>>>>> I can log in at host6.example.com:6900, and I saw that the engine VM
>>>>> was running from host6's /tmp dir.
>>>>>
>>>>>> HE deploy (since 4.3) first creates a VM for the engine on local
>>>>>> storage, then prompts you to provide the storage you want to use, and
>>>>>> then moves the VM disk image there.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> --
>>>>>>> Adam Xu
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list -- users(a)ovirt.org
>>>>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XHDGJB2ZAFS...
>>>>> --
>>>>> Adam Xu
>>>>> Phone: 86-512-8777-3585
>>>>> Adagene (Suzhou) Limited
>>>>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
>>> --
>>> Adam Xu
>>
> --
> Adam Xu
> Phone: 86-512-8777-3585
> Adagene (Suzhou) Limited
> C14, No. 218, Xinghu Street, Suzhou Industrial Park