On 2020/9/17 16:38, Yedidyah Bar David wrote:
> On Thu, Sep 17, 2020 at 11:29 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>
>> On 2020/9/17 15:07, Yedidyah Bar David wrote:
>>> On Thu, Sep 17, 2020 at 8:16 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>> On 2020/9/16 15:53, Yedidyah Bar David wrote:
>>>>> On Wed, Sep 16, 2020 at 10:46 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>> On 2020/9/16 15:12, Yedidyah Bar David wrote:
>>>>>>> On Wed, Sep 16, 2020 at 6:10 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>>>> Hi oVirt,
>>>>>>>>
>>>>>>>> I just tried to upgrade a self-hosted engine from 4.3.10 to 4.4.1.4. I followed the steps in the document:
>>>>>>>>
>>>>>>>> https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
>>>>>>>>
>>>>>>>> The old 4.3 env has an FC storage as the engine storage domain, and I have created a new FC storage vv for the new storage domain to be used in the next steps.
>>>>>>>>
>>>>>>>> I backed up the old 4.3 env and prepared a totally new host to restore the env.
>>>>>>>>
>>>>>>>> In chapter 4.4, step 8, it says:
>>>>>>>>
>>>>>>>> "During the deployment you need to provide a new storage domain. The deployment script renames the 4.3 storage domain and retains its data."
>>>>>>>>
>>>>>>>> It did rename the old storage domain, but it didn't let me choose a new storage domain during the deployment. So the new engine was just deployed in the new host's local storage and cannot move to the FC storage domain.
>>>>>>>>
>>>>>>>> Can anyone tell me what the problem is?
>>>>>>> What do you mean by "deployed in the new host's local storage"?
>>>>>>>
>>>>>>> Did deploy finish successfully?
>>>>>> I think it was not finished yet.
>>>>> You did 'hosted-engine --deploy --restore-from-file=something', right?
>>>>>
>>>>> Did this finish?
>>>> not finished yet.
>>>>> What are the last few lines of the output?
>>>> [ INFO ] You can now connect to
>>>> https://ovirt6.ntbaobei.com:6900/ovirt-engine/ and check the status of
>>>> this host and eventually remediate it, please continue only when the
>>>> host is listed as 'up'
>>>>
>>>> [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
>>>>
>>>> [ INFO ] ok: [localhost]
>>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Create temporary lock file]
>>>> [ INFO ] changed: [localhost]
>>>>
>>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Pause execution until
>>>> /tmp/ansible.g2opa_y6_he_setup_lock is removed, delete it once ready to
>>>> proceed]
>>> Great. This means that you replied 'Yes' to 'Pause the execution
>>> after adding this host to the engine?', and it's now waiting.
>>>
>>>> but the status of the new host that runs the self-hosted engine is
>>>> "NonOperational" and it will never be "up"
>>> You seem to imply that you expected it to become "up" by itself,
>>> and you claim that this will never happen, in which you are
>>> correct.
>>>
>>> But that's not the intention. The message you got is:
>>>
>>> You will be able to iteratively connect to the restored engine in
>>> order to manually review and remediate its configuration before
>>> proceeding with the deployment:
>>> please ensure that all the datacenter hosts and storage domain are
>>> listed as up or in maintenance mode before proceeding.
>>> This is normally not required when restoring an up to date and
>>> coherent backup.
>>>
>>> This means that it's up to you to handle this nonoperational host,
>>> and that you are requested to continue (by removing that file) only
>>> then.
>>>
>>> So now, let's try to understand why the host is nonoperational, and
>>> try to fix that. Ok?
>>>
>>> You should be able to find the current (private/local) IP address of
>>> the engine vm by searching the hosted-engine setup logs for 'local_vm_ip'.
>>> You can ssh (and scp etc.) there from the host, using user 'root' and
>>> the password you supplied.
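
For anyone following along, the 'local_vm_ip' search suggested above could be scripted roughly as below. This is only a sketch: the directory /var/log/ovirt-hosted-engine-setup is the one mentioned in this thread, but the helper names, the `*.log` file pattern, and the assumption that the address appears as a plain IPv4 string on the same line as 'local_vm_ip' are mine and may not match the real log layout.

```python
import re
from pathlib import Path

# Assumption: an IPv4 address follows the 'local_vm_ip' key on the same line.
# The real hosted-engine setup log format may differ.
IP_AFTER_KEY = re.compile(r"local_vm_ip.*?(\d{1,3}(?:\.\d{1,3}){3})")


def find_local_vm_ips(text):
    """Return the distinct IPv4 addresses that follow 'local_vm_ip', in order."""
    seen = []
    for line in text.splitlines():
        match = IP_AFTER_KEY.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def scan_setup_logs(log_dir="/var/log/ovirt-hosted-engine-setup"):
    """Scan every *.log file under log_dir, including subdirectories."""
    found = []
    for path in sorted(Path(log_dir).rglob("*.log")):
        for ip in find_local_vm_ips(path.read_text(errors="replace")):
            if ip not in found:
                found.append(ip)
    return found


if __name__ == "__main__":
    # Run on the host where 'hosted-engine --deploy' is waiting.
    print(scan_setup_logs())
```

If more than one address is printed, the last one is probably the most recent deployment attempt, but that is a guess and worth checking against the log timestamps.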
>>>
>>> Please check/share all of /var/log/ovirt-engine on the engine vm.
>>> In particular, please check host-deploy/* logs there. The last lines
>>> show a summary, like:
>>>
>>> HOSTNAME : ok=97 changed=34 unreachable=0 failed=0 skipped=46 rescued=0 ignored=1
>> my log here is:
>>
>> 2020-09-17 12:19:40 CST - TASK [Executing post tasks defined by user] ************************************
>> 2020-09-17 12:19:40 CST - PLAY RECAP *********************************************************************
>> ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 skipped=45 rescued=0 ignored=1
> Good.
>
>>> Is 'failed' higher than 0? If so, please find the failed task and
>>> check/share the relevant error (or just the entire file).
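
The 'failed higher than 0' check above can also be done mechanically. The following is a rough sketch, not part of oVirt: it assumes the recap line looks like the 'HOSTNAME : ok=97 changed=34 ...' example quoted in this thread, and the function names are made up for illustration.

```python
import re

# Matches recap lines such as:
#   ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 ...
RECAP_LINE = re.compile(r"(\S+)\s*:\s*((?:\w+=\d+\s*)+)")


def parse_recap(text):
    """Map each host name to a dict of its recap counters."""
    results = {}
    for match in RECAP_LINE.finditer(text):
        host, counters = match.group(1), match.group(2)
        results[host] = {
            key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", counters)
        }
    return results


def hosts_with_problems(text):
    """Return hosts whose 'failed' or 'unreachable' counter is above zero."""
    return [
        host
        for host, counters in parse_recap(text).items()
        if counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0
    ]


if __name__ == "__main__":
    recap = (
        "ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 "
        "failed=0 skipped=45 rescued=0 ignored=1"
    )
    print(hosts_with_problems(recap))
```

An empty list only says that host-deploy itself reported no failed task; it does not explain why the host is nonoperational.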
>>>
>>> Also, please check engine.log there for any ' ERROR '.
>> I collected some error log in engine.log
> Only those below?
>
>> 2020-09-17 12:14:35,084+08 ERROR
>> [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83)
>> [4a6cf221] Command 'UploadStreamVDSCommand(HostName = ovirt6.ntbaobei.com,
>> UploadStreamVDSCommandParameters:{hostId='784eada4-49e3-4d6c-95cd-f7c81337c2f7'})'
>> execution failed: java.net.SocketException: Connection reset
> This, and similar ones, are expected - the engine is still on the
> private network, so it can't access the other hosts.
>
>> ...
>>
>> 2020-09-17 12:14:35,085+08 ERROR
>> [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83)
>> [4a6cf221] Command
>> 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed:
>> EngineException:
>> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
>> java.net.SocketException: Connection reset (Failed with error
>> VDS_NETWORK_ERROR and code 5022)
>>
>> ...
>>
>> 2020-09-17 12:14:40,322+08 ERROR
>> [org.ovirt.engine.core.bll.pm.FenceProxyLocator]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-53)
>> [8b0987a] Can not run fence action on host 'ovirt2.ntbaobei.com', no suitable
>> proxy host was found.
> Not sure why it would want to fence ovirt2, but I think it can be ignored
> for now as well.
>
>> ...
>>
>> 2020-09-17 12:14:48,861+08 ERROR
>> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2)
>> [4a6cf221] Ending command
>> 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand'
>> with failure.
> Same - it can't access the storage, so updating ovfstore fails. OK.
>
>>
>> 2020-09-17 12:14:52,630+08 ERROR
>> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
>> [56d6bb10] Failed to update OVF_STORE content
>> 2020-09-17 12:14:52,630+08 ERROR
>> [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
>> [56d6bb10] Command 'ProcessOvfUpdateForStorageDomain' id:
>> '8e6e1fa1-1fdf-4928-9153-4fe2ae9b77b0' with children
>> [1c4d99f8-2d05-4b0a-938b-8733157778e1,
>> 62caf674-5567-461c-8e86-4ed7b03306af] failed when attempting to perform
>> the next operation, marking as 'ACTIVE'
>> 2020-09-17 12:14:52,630+08 ERROR
>> [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
>> [56d6bb10] null: java.lang.RuntimeException
> Same.
>
> Are these the only errors?
>
> In particular, try to search for 'ovirt2' (your host's name), try to
> find when it became nonoperational, and check errors around this.
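
One way to do the search suggested above, sketched under my own assumptions: read engine.log (under /var/log/ovirt-engine on the engine vm, as mentioned earlier in the thread), find every line that mentions a keyword such as the host name, and collect the ' ERROR ' lines within a few lines of each hit. The function name and the window size are arbitrary choices, not anything the engine provides.

```python
def errors_near(log_text, keyword, window=20):
    """Return ' ERROR ' lines within `window` lines of any line containing keyword."""
    lines = log_text.splitlines()
    wanted = set()
    for index, line in enumerate(lines):
        if keyword in line:
            start = max(0, index - window)
            stop = min(len(lines), index + window + 1)
            wanted.update(
                pos for pos in range(start, stop) if " ERROR " in lines[pos]
            )
    # Keep the original log order.
    return [lines[pos] for pos in sorted(wanted)]


if __name__ == "__main__":
    # Run on the engine vm; 'ovirt2' is the host name used in this thread.
    with open("/var/log/ovirt-engine/engine.log", errors="replace") as handle:
        for hit in errors_near(handle.read(), "ovirt2"):
            print(hit)
```

Searching for the status string shown in the UI ("NonOperational") instead of the host name may narrow things down further, assuming the engine logs the status change with that wording.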
The host has permission to access the storage. I don't know why it
can't access the storage.
Me neither, but that's still irrelevant. First the node has to be Up, then
you should check the storage.
Should I use one of the hosts from the original cluster to install the new
self-hosted engine and restore the backup file?
>
> Thanks,
>
>>> Good luck and best regards,
>>>
>>>>> Please also check/share logs from /var/log/ovirt-hosted-engine-setup/*
>>>>> (including subdirs).
>>>>> no more errors there, just a lot of DEBUG messages.
>>>>>> It didn't tell me to choose a new
>>>>>> storage domain and just gave me the new host's fqdn as the engine's URL.
>>>>>> like host6.example.com:6900 .
>>>>> Yes, that's temporary, to let you access the engine VM (on the local network).
>>>>>
>>>>>> I can log in using host6.example.com:6900 and I saw the engine vm running
>>>>>> in host6's /tmp dir.
>>>>>>
>>>>>>> HE deploy (since 4.3) first creates a VM for the engine on local
>>>>>>> storage, then prompts you to provide the storage you want to use, and
>>>>>>> then moves the VM disk image there.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> --
>>>>>>>> Adam Xu
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Users mailing list -- users(a)ovirt.org
>>>>>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XHDGJB2ZAFS...
>>>>>> --
>>>>>> Adam Xu
>>>>>> Phone: 86-512-8777-3585
>>>>>> Adagene (Suzhou) Limited
>>>>>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
>>>
>
>
--
Adam Xu
Phone: 86-512-8777-3585
Adagene (Suzhou) Limited
C14, No. 218, Xinghu Street, Suzhou Industrial Park