On 2020/9/17 17:42, Yedidyah Bar David wrote:
> On Thu, Sep 17, 2020 at 11:57 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>
>> On 2020/9/17 16:38, Yedidyah Bar David wrote:
>>> On Thu, Sep 17, 2020 at 11:29 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>> On 2020/9/17 15:07, Yedidyah Bar David wrote:
>>>>> On Thu, Sep 17, 2020 at 8:16 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>> On 2020/9/16 15:53, Yedidyah Bar David wrote:
>>>>>>> On Wed, Sep 16, 2020 at 10:46 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>>>> On 2020/9/16 15:12, Yedidyah Bar David wrote:
>>>>>>>>> On Wed, Sep 16, 2020 at 6:10 AM Adam Xu <adam_xu(a)adagene.com.cn> wrote:
>>>>>>>>>> Hi ovirt
>>>>>>>>>>
>>>>>>>>>> I just tried to upgrade a self-hosted engine from 4.3.10 to 4.4.1.4. I followed the steps in the document:
>>>>>>>>>>
>>>>>>>>>> https://www.ovirt.org/documentation/upgrade_guide/#SHE_Upgrading_from_4-3
>>>>>>>>>>
>>>>>>>>>> The old 4.3 env has an FC storage as the engine storage domain, and I have created a new FC storage vv for the new storage domain to be used in the next steps.
>>>>>>>>>>
>>>>>>>>>> I backed up the old 4.3 env and prepared a totally new host to restore the env.
>>>>>>>>>>
>>>>>>>>>> In chapter 4.4, step 8, it says:
>>>>>>>>>>
>>>>>>>>>> "During the deployment you need to provide a new storage domain. The deployment script renames the 4.3 storage domain and retains its data."
>>>>>>>>>>
>>>>>>>>>> It does rename the old storage domain, but it didn't let me choose a new storage domain during the deployment. So the new engine was just deployed in the new host's local storage and cannot be moved to the FC storage domain.
>>>>>>>>>>
>>>>>>>>>> Can anyone tell me what the problem is?
>>>>>>>>> What do you mean by "deployed in the new host's local storage"?
>>>>>>>>>
>>>>>>>>> Did deploy finish successfully?
>>>>>>>> I think it was not finished yet.
>>>>>>> You did 'hosted-engine --deploy --restore-from-file=something', right?
>>>>>>>
>>>>>>> Did this finish?
>>>>>> not finished yet.
>>>>>>> What are the last few lines of the output?
>>>>>> [ INFO ] You can now connect to https://ovirt6.ntbaobei.com:6900/ovirt-engine/ and check the status of this host and eventually remediate it, please continue only when the host is listed as 'up'
>>>>>>
>>>>>> [ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
>>>>>>
>>>>>> [ INFO ] ok: [localhost]
>>>>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Create temporary lock file]
>>>>>> [ INFO ] changed: [localhost]
>>>>>>
>>>>>> [ INFO ] TASK [ovirt.hosted_engine_setup : Pause execution until /tmp/ansible.g2opa_y6_he_setup_lock is removed, delete it once ready to proceed]
>>>>> Great. This means that you replied 'Yes' to 'Pause the execution after adding this host to the engine?', and it's now waiting.
>>>>>
>>>>>> But the status of the new host that runs the self-hosted engine is "NonOperational" and it will never be "up".
>>>>> You seem to imply that you expected it to become "up" by itself, and you claim that this will never happen, in which you are correct.
>>>>>
>>>>> But that's not the intention. The message you got is:
>>>>>
>>>>> You will be able to iteratively connect to the restored engine in order to manually review and remediate its configuration before proceeding with the deployment:
>>>>> please ensure that all the datacenter hosts and storage domain are listed as up or in maintenance mode before proceeding.
>>>>> This is normally not required when restoring an up to date and coherent backup.
>>>>>
>>>>> This means that it's up to you to handle this nonoperational host, and that you are requested to continue (by removing that file) only then.
>>>>>
>>>>> So now, let's try to understand why the host is nonoperational, and try to fix that. Ok?
>>>>>
>>>>> You should be able to find the current (private/local) IP address of the engine vm by searching the hosted-engine setup logs for 'local_vm_ip'.
>>>>> You can ssh (and scp etc.) there from the host, using user 'root' and the password you supplied.
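
The log search described above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, the regular expression is a loose IPv4 match that does not validate octet ranges, and it assumes the address appears on the same line as the string 'local_vm_ip', which may not hold for every log layout.

```python
import re
from collections import Counter
from pathlib import Path

# Loose IPv4 pattern (assumption: the private address is IPv4).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_local_vm_ips(lines):
    """Count IPv4-looking tokens on lines that mention 'local_vm_ip'."""
    hits = Counter()
    for line in lines:
        if "local_vm_ip" in line:
            hits.update(IPV4.findall(line))
    return hits


def scan_logs(log_dir):
    """Apply find_local_vm_ips to every regular file below log_dir."""
    hits = Counter()
    for path in Path(log_dir).rglob("*"):
        if path.is_file():
            # errors="replace" keeps the scan going on odd bytes in a log.
            with path.open(errors="replace") as fh:
                hits.update(find_local_vm_ips(fh))
    return hits
```

For example, `scan_logs("/var/log/ovirt-hosted-engine-setup").most_common(3)` would list the most frequently seen candidates; the address still has to be confirmed by hand before using ssh.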
>>>>>
>>>>> Please check/share all of /var/log/ovirt-engine on the engine vm.
>>>>> In particular, please check host-deploy/* logs there. The last lines show a summary, like:
>>>>>
>>>>> HOSTNAME : ok=97 changed=34 unreachable=0 failed=0 skipped=46 rescued=0 ignored=1
>>>> My log here is:
>>>>
>>>> 2020-09-17 12:19:40 CST - TASK [Executing post tasks defined by user] ************************************
>>>> 2020-09-17 12:19:40 CST - PLAY RECAP *********************************************************************
>>>> ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 failed=0 skipped=45 rescued=0 ignored=1
>>> Good.
>>>
>>>>> Is 'failed' higher than 0? If so, please find the failed task and check/share the relevant error (or just the entire file).
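
The 'failed' check on the summary line can also be done mechanically. Below is a minimal sketch, assuming the summary line has the "HOST : key=value ..." shape shown in this thread; `parse_recap` and `has_failures` are illustrative names, not part of Ansible or oVirt.

```python
import re


def parse_recap(line):
    """Split a PLAY RECAP summary line into (host, {counter: value})."""
    host, sep, rest = line.partition(" : ")
    if not sep:
        raise ValueError("not a recap summary line: %r" % line)
    counters = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", rest)}
    return host.strip(), counters


def has_failures(counters):
    """True if any task failed or the host was unreachable."""
    return counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0


host, counters = parse_recap(
    "ovirt2.ntbaobei.com : ok=99 changed=45 unreachable=0 "
    "failed=0 skipped=45 rescued=0 ignored=1"
)
print(host, "needs a closer look" if has_failures(counters) else "recap is clean")
```

A clean recap only says that host-deploy itself did not fail; it does not explain why the host is nonoperational.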
>>>>>
>>>>> Also, please check engine.log there for any ' ERROR '.
>>>> I collected some errors from engine.log:
>>> Only those below?
>>>
>>>> 2020-09-17 12:14:35,084+08 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83) [4a6cf221] Command 'UploadStreamVDSCommand(HostName = ovirt6.ntbaobei.com, UploadStreamVDSCommandParameters:{hostId='784eada4-49e3-4d6c-95cd-f7c81337c2f7'})' execution failed: java.net.SocketException: Connection reset
>>> This, and similar ones, are expected - the engine is still on the
>>> private network, so it can't access the other hosts.
>>>
>>>> ...
>>>>
>>>> 2020-09-17 12:14:35,085+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-83) [4a6cf221] Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketException: Connection reset (Failed with error VDS_NETWORK_ERROR and code 5022)
>>>>
>>>> ...
>>>>
>>>> 2020-09-17 12:14:40,322+08 ERROR [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-53) [8b0987a] Can not run fence action on host 'ovirt2.ntbaobei.com', no suitable proxy host was found.
>>> Not sure why it would want to fence ovirt2, but I think it can be ignored
>>> for now as well.
>>>
>>>> ...
>>>>
>>>> 2020-09-17 12:14:48,861+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [4a6cf221] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand' with failure.
>>> Same - it can't access the storage, so updating ovfstore fails. OK.
>>>
>>>> 2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] Failed to update OVF_STORE content
>>>> 2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] Command 'ProcessOvfUpdateForStorageDomain' id: '8e6e1fa1-1fdf-4928-9153-4fe2ae9b77b0' with children [1c4d99f8-2d05-4b0a-938b-8733157778e1, 62caf674-5567-461c-8e86-4ed7b03306af] failed when attempting to perform the next operation, marking as 'ACTIVE'
>>>> 2020-09-17 12:14:52,630+08 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [56d6bb10] null: java.lang.RuntimeException
>>> Same.
>>>
>>> Are these the only errors?
>>>
>>> In particular, try to search for 'ovirt2' (your host's name), try to find when it became nonoperational, and check errors around this.
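
Searching engine.log for the host name and reading the lines around each hit might look like the sketch below. It assumes one log entry per line; `context_around` is a hypothetical helper, and the search string is a parameter because the exact wording the engine logs for the status change is not shown in this thread.

```python
def context_around(lines, needle, radius=2):
    """Return (line_number, text) pairs within `radius` lines of each match.

    Line numbers are 1-based; overlapping windows are merged.
    """
    lines = list(lines)
    keep = set()
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - radius)
            stop = min(len(lines), index + radius + 1)
            keep.update(range(start, stop))
    return [(index + 1, lines[index].rstrip("\n")) for index in sorted(keep)]


def only_errors(numbered_lines):
    """Keep the pairs whose text contains ' ERROR '."""
    return [(number, text) for number, text in numbered_lines if " ERROR " in text]
```

Typical use would be `context_around(open("/var/log/ovirt-engine/engine.log"), "ovirt2", radius=5)`, followed by `only_errors(...)` on the result to narrow it down.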
>> The host has permission to access the storage. I don't know why it
>> can't access the storage.
> Me neither, but that's still irrelevant. First the node has to be Up, then
> you should check the storage.
>
>> Should I use one host of the original cluster to install the new
>> self-hosted engine and restore the backup file?
> I thought this is what you did, no?
>
> Please explain what you did.
For example, I have an oVirt cluster with 3 hosts, named ovirt1.example.com, ovirt2.example.com and ovirt3.example.
I backed up the engine and prepared a new host named ovirt4.example.com to restore the backup file. Is that why ovirt4 cannot manage the storage domain?
>
> Thanks,
>
>>> Thanks,
>>>
>>>>> Good luck and best regards,
>>>>>
>>>>>>> Please also check/share logs from /var/log/ovirt-hosted-engine-setup/* (including subdirs).
>>>>>>> no more errors there, just a lot of DEBUG messages.
>>>>>>>> It didn't tell me to choose a new storage domain and just gave me the new host's FQDN as the engine's URL, like host6.example.com:6900.
>>>>>>> Yes, that's temporary, to let you access the engine VM (on the local network).
>>>>>>>
>>>>>>>> I can log in using host6.example.com:6900 and I saw the engine VM running in host6's /tmp dir.
>>>>>>>>
>>>>>>>>> HE deploy (since 4.3) first creates a VM for the engine on local storage, then prompts you to provide the storage you want to use, and then moves the VM disk image there.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Adam Xu
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list -- users(a)ovirt.org
>>>>>>>>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>>>>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>>>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/XHDGJB2ZAFS...
>>>>>>>> --
>>>>>>>> Adam Xu
>>>>>>>> Phone: 86-512-8777-3585
>>>>>>>> Adagene (Suzhou) Limited
>>>>>>>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
>>>>>> --
>>>>>> Adam Xu
>>>> --
>>>> Adam Xu
>>>> Phone: 86-512-8777-3585
>>>> Adagene (Suzhou) Limited
>>>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
>>>
>> --
>> Adam Xu
>> Phone: 86-512-8777-3585
>> Adagene (Suzhou) Limited
>> C14, No. 218, Xinghu Street, Suzhou Industrial Park
>
>
--
Adam Xu
Phone: 86-512-8777-3585
Adagene (Suzhou) Limited
C14, No. 218, Xinghu Street, Suzhou Industrial Park