On Thu, Sep 17, 2020 at 11:29 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
On 2020/9/17 15:07, Yedidyah Bar David wrote:
> On Thu, Sep 17, 2020 at 8:16 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>
>> On 2020/9/16 15:53, Yedidyah Bar David wrote:
>>> On Wed, Sep 16, 2020 at 10:46 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>> On 2020/9/16 15:12, Yedidyah Bar David wrote:
>>>>> On Wed, Sep 16, 2020 at 6:10 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>> Hi ovirt
>>>>>>
>>>>>> I am trying to upgrade a self-hosted engine from 4.3.10 to 4.4.1.4. I followed the steps in the document:
>>>>>>
>>>>>> https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
>>>>>>
>>>>>> The old 4.3 env has FC storage as the engine storage domain, and I have created a new FC storage vv for the new storage domain to be used in the next steps.
>>>>>>
>>>>>> I backed up the old 4.3 env and prepared a completely new host to restore the env on.
>>>>>>
>>>>>> In chapter 4.4, step 8, it says:
>>>>>>
>>>>>> "During the deployment you need to provide a new storage domain. The deployment script renames the 4.3 storage domain and retains its data."
>>>>>>
>>>>>> It does rename the old storage domain, but it didn't let me choose a new storage domain during the deployment. So the new engine was just deployed on the new host's local storage and cannot be moved to the FC storage domain.
>>>>>>
>>>>>> Can anyone tell me what the problem is?
>>>>> What do you mean by "deployed in the new host's local storage"?
>>>>>
>>>>> Did deploy finish successfully?
>>>> I think it was not finished yet.
>>> You did 'hosted-engine --deploy --restore-from-file=something', right?
>>>
>>> Did this finish?
>> not finished yet.
>>> What are the last few lines of the output?
>> [ INFO ] You can now connect to https://ovirt6.ntbaobei.com:6900/ovirt-engine/ and check the status of
>> this host and eventually remediate it, please continue only when the
>> host is listed as 'up'
>>
>> [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
>>
>> [ INFO ] ok: [localhost]
>> [ INFO ] TASK [ovirt.hosted_engine_setup : Create temporary lock file]
>> [ INFO ] changed: [localhost]
>>
>> [ INFO ] TASK [ovirt.hosted_engine_setup : Pause execution until
>> /tmp/ansible.g2opa_y6_he_setup_lock is removed, delete it once ready to
>> proceed]
> Great. This means that you replied 'Yes' to 'Pause the execution
> after adding this host to the engine?', and it's now waiting.
>
>> But the status of the new host that runs the self-hosted engine is
>> "NonOperational", and it will never become "up".
> You seem to imply that you expected it to become "up" by itself,
> and you claim that this will never happen, in which you are
> correct.
>
> But that's not the intention. The message you got is:
>
> You will be able to iteratively connect to the restored engine in
> order to manually review and remediate its configuration before
> proceeding with the deployment:
> please ensure that all the datacenter hosts and storage domain are
> listed as up or in maintenance mode before proceeding.
> This is normally not required when restoring an up to date and
> coherent backup.
>
> This means that it's up to you to handle this nonoperational host,
> and that you are requested to continue (by removing that file) only
> then.
>
> So now, let's try to understand why the host is nonoperational, and
> try to fix that. Ok?
>
> You should be able to find the current (private/local) IP address of
> the engine vm by searching the hosted-engine setup logs for 'local_vm_ip'.
> You can ssh (and scp etc.) there from the host, using user 'root' and
> the password you supplied.
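As a rough illustration of the log search described above, here is a minimal Python sketch that prints every line containing 'local_vm_ip'. It is not part of hosted-engine; the function name find_lines and the '*.log' file suffix are assumptions, and the directory is the one mentioned elsewhere in this thread. The exact format of the matching lines is not assumed, so the sketch only prints them for you to read.

```python
from pathlib import Path


def find_lines(log_dir, needle="local_vm_ip"):
    """Return (path, line_number, text) for every log line containing needle.

    Assumption: the setup logs end in '.log' and may sit in subdirectories.
    """
    hits = []
    for path in sorted(Path(log_dir).rglob("*.log")):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((str(path), number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    setup_logs = Path("/var/log/ovirt-hosted-engine-setup")
    if setup_logs.is_dir():
        for path, number, text in find_lines(setup_logs):
            print(f"{path}:{number}: {text}")
```

Run it on the host (not on the engine VM); the IP address should appear somewhere in the printed lines.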
>
> Please check/share all of /var/log/ovirt-engine on the engine vm.
> In particular, please check host-deploy/* logs there. The last lines
> show a summary, like:
>
> HOSTNAME : ok=97 changed=34 unreachable=0 failed=0 skipped=46 rescued=0 ignored=1
my log here is:
2020-09-17 12:19:40 CST - TASK [Executing post tasks defined by user] ************************************
2020-09-17 12:19:40 CST - PLAY RECAP *********************************************************************
ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 skipped=45 rescued=0 ignored=1
Good.
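If you want to check such a recap line mechanically, a minimal Python sketch follows. It assumes the summary sits on a single line in the host-deploy log (it is wrapped only in this mail), and the names parse_recap and has_failures are made up for illustration; they are not part of oVirt or Ansible.

```python
import re

# "<host> : key=number key=number ..." as in the recap line quoted above.
RECAP = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(line):
    """Return (host, counters) for a recap line, or None if it is not one."""
    match = RECAP.match(line.strip())
    if match is None:
        return None
    counters = {}
    for pair in match.group("counters").split():
        key, value = pair.split("=")
        counters[key] = int(value)
    return match.group("host"), counters


def has_failures(counters):
    """True when the recap reports failed tasks or an unreachable host."""
    return counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0


if __name__ == "__main__":
    sample = ("ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 "
              "failed=0 skipped=45 rescued=0 ignored=1")
    host, counters = parse_recap(sample)
    print(host, "has failures:", has_failures(counters))
```

Treating "unreachable" as a failure alongside "failed" is a choice of this sketch; the question in this thread is only about "failed".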
>
> Is 'failed' higher than 0? If so, please find the failed task and
> check/share the relevant error (or just the entire file).
>
> Also, please check engine.log there for any ' ERROR '.
I collected some errors from engine.log:
Only those below?
2020-09-17 12:14:35,084+08 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83)
[4a6cf221] Command 'UploadStreamVDSCommand(HostName =
ovirt6.ntbaobei.com,
UploadStreamVDSCommandParameters:{hostId='784eada4-49e3-4d6c-95cd-f7c81337c2f7'})'
execution failed: java.net.SocketException: Connection reset
This, and similar ones, are expected - the engine is still on the
private network, so it can't access the other hosts.
...
2020-09-17 12:14:35,085+08 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83)
[4a6cf221] Command
'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed:
EngineException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
java.net.SocketException: Connection reset (Failed with error
VDS_NETWORK_ERROR and code 5022)
...
2020-09-17 12:14:40,322+08 ERROR
[org.ovirt.engine.core.bll.pm.FenceProxyLocator]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-53)
[8b0987a] Can not run fence action on host 'ovirt2.ntbaobei.com', no suitable
proxy host was found.
Not sure why it would want to fence ovirt2, but I think it can be ignored
for now as well.
...
2020-09-17 12:14:48,861+08 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2)
[4a6cf221] Ending command
'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand'
with failure.
Same - it can't access the storage, so updating ovfstore fails. OK.
2020-09-17 12:14:52,630+08 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
[56d6bb10] Failed to update OVF_STORE content
2020-09-17 12:14:52,630+08 ERROR
[org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
[56d6bb10] Command 'ProcessOvfUpdateForStorageDomain' id:
'8e6e1fa1-1fdf-4928-9153-4fe2ae9b77b0' with children
[1c4d99f8-2d05-4b0a-938b-8733157778e1,
62caf674-5567-461c-8e86-4ed7b03306af] failed when attempting to perform
the next operation, marking as 'ACTIVE'
2020-09-17 12:14:52,630+08 ERROR
[org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41)
[56d6bb10] null: java.lang.RuntimeException
Same.
Are these the only errors?
In particular, try to search for 'ovirt2' (your host's name), try to
find when it became nonoperational, and check errors around this.
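
For that search, something like the following Python sketch might help. It assumes each engine.log entry starts on a single line (the entries above are wrapped only by the mail client), and error_lines is a hypothetical helper name, not an oVirt tool.

```python
def error_lines(lines, host=None):
    """Yield lines containing ' ERROR ', optionally only those naming host."""
    for line in lines:
        if " ERROR " in line and (host is None or host in line):
            yield line.rstrip("\n")


if __name__ == "__main__":
    import os

    # Path taken from this thread; run this on the engine VM.
    log_path = "/var/log/ovirt-engine/engine.log"
    if os.path.exists(log_path):
        with open(log_path, errors="replace") as handle:
            for hit in error_lines(handle, host="ovirt2"):
                print(hit)
```

Note that an error caused by the host will not always mention its name on the same line, so also read the lines around each hit.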
Thanks,
>
> Good luck and best regards,
>
>>> Please also check/share logs from /var/log/ovirt-hosted-engine-setup/*
>>> (including subdirs).
>>> no more errors there, just a lot of DEBUG messages.
>>>> It didn't ask me to choose a new storage domain and just gave me
>>>> the new host's FQDN as the engine's URL, like host6.example.com:6900.
>>> Yes, that's temporary, to let you access the engine VM (on the local network).
>>>
>>>> I can log in using host6.example.com:6900, and I saw that the engine VM
>>>> was running in host6's /tmp dir.
>>>>
>>>>> HE deploy (since 4.3) first creates a VM for the engine on local
>>>>> storage, then prompts you to provide the storage you want to use, and
>>>>> then moves the VM disk image there.
>>>>>
>>>>> Best regards,
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>> Adam Xu
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list -- users(a)ovirt.org
>>>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XHDGJB2ZAFS...
>>>> --
>>>> Adam Xu
>>>> Phone: 86-512-8777-3585
>>>> Adagene (Suzhou) Limited
>>>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
>>>>
>>>
>> --
>> Adam Xu
>>
>
>
--
Adam Xu
Phone: 86-512-8777-3585
Adagene (Suzhou) Limited
C14, No. 218, Xinghu Street, Suzhou Industrial Park
--
Didi