On 23 Jan 2019, at 14:33, Simone Tiraboschi <stirabos@redhat.com> wrote:



On Wed, Jan 23, 2019 at 5:27 PM Vinícius Ferrão <ferrao@versatushpc.com.br> wrote:
Simone, may I bump this thread?

I will request the RFE on Bugzilla; I just need some time to do it.

But I have another question on this issue: in the case of an already deployed oVirt installation with this bug, is there a way to fix it? Production VMs are running, and I would like to know whether this can be fixed without interrupting them.

I was thinking of taking a backup of the affected SHE VM with the hosted-engine tooling and then trying to restore it with ovirt-hosted-engine-setup using the Ansible backend. But I'm not sure whether this will work.

Yes, it will.
You can also use that tool to migrate from bare metal to hosted-engine and so on.
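
For reference, a minimal sketch of that flow (file names are illustrative, and --restore-from-file requires a reasonably recent ovirt-hosted-engine-setup):

    # On the engine VM: take a full engine backup
    engine-backup --mode=backup --file=engine-backup.tar.gz --log=engine-backup.log

    # On a clean host: redeploy with the Ansible flow, restoring that backup
    hosted-engine --deploy --restore-from-file=engine-backup.tar.gz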

Thank you Simone!!!

So I will take the backup from the broken HE and then redeploy it from that backup with the Ansible backend. Restating just to be sure.

If not, is there a way to at least keep the VMs running and redeploy the engine from scratch without restoring the backup?

Thanks!

Sent from my iPhone

On 8 Jan 2019, at 14:49, Simone Tiraboschi <stirabos@redhat.com> wrote:



On Tue, Jan 8, 2019 at 5:31 PM Vinícius Ferrão <ferrao@versatushpc.com.br> wrote:
Hello,

On 8 Jan 2019, at 11:20, Simone Tiraboschi <stirabos@redhat.com> wrote:



On Mon, Jan 7, 2019 at 10:43 PM Vinícius Ferrão <ferrao@versatushpc.com.br> wrote:
Simone,

I have additional findings: Ansible was failing because I had set the without-password option for SSH root access, so it failed with an authentication error during the deployment.

After allowing root access over SSH, the hosted engine deployment with Ansible worked.

Now I will check if everything else is working fine.

Should I open a bug on Bugzilla about this issue?

OK, from the logs I see that you set without-password and correctly entered a public SSH key when requested.
But then Ansible failed to authenticate to the engine VM as root with that key.
So, if you are sure that the corresponding private key was available in the right place and with the right permissions, please open a bug.
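
A quick way to check that on the first host (assuming the default key location; paths are illustrative):

    # The private key must be owned by root with mode 0600
    ls -l /root/.ssh/id_rsa

    # Print the public key derived from the private key; it must match
    # the one you entered during the deployment
    ssh-keygen -y -f /root/.ssh/id_rsa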

Hello Simone, just to be sure: the private key was always on my personal computer; it was never on the oVirt Node.

For years I’ve deployed oVirt this way and it worked as expected.

So if the new behaviour requires the private key to be present on the hypervisor, that changes the deployment.

The purpose of the key and of enabling root SSH without-password is to harden the hosted engine itself, right? Not to secure the channel between the hypervisor and the hosted engine during the deployment phase. So without-password should only be set at the end of the hosted engine deployment.
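
For reference, without-password corresponds to this sshd_config directive on the engine VM (newer OpenSSH releases spell the same setting prohibit-password):

    PermitRootLogin without-password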

If these assumptions are correct, I will proceed to file the ticket on Bugzilla.

Now the whole flow, including engine-setup on the engine VM to create the DB and so on, is executed with Ansible, and this requires Ansible, running on the first host, to be able to authenticate to the engine VM over SSH.
Currently the setup configures the root password and/or the root SSH public key on first boot with cloud-init, so it implicitly requires the user either to enable password authentication or to configure the host so that it can access the engine VM with an SSH key.

What you are proposing requires the setup to inject a temporary key generated on the fly and remove it at the end, or to configure without-password only after the deployment.
It makes sense to me, but in my opinion it's more an RFE than a real bug.
Feel free to file it.
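
A rough sketch of that proposal, purely illustrative (this is not what the setup currently does; the key name and engine FQDN are placeholders):

    # Generate a throwaway deployment key on the first host
    ssh-keygen -t rsa -N '' -C he-deploy-key -f /tmp/he-deploy-key

    # ...inject /tmp/he-deploy-key.pub into the engine VM via cloud-init and
    # run the whole Ansible flow with it; then, at the end of the deployment:
    ssh -i /tmp/he-deploy-key root@engine.example.com \
        "sed -i '/he-deploy-key/d' /root/.ssh/authorized_keys; \
         sed -i 's/^PermitRootLogin.*/PermitRootLogin without-password/' /etc/ssh/sshd_config; \
         systemctl reload sshd"
    rm -f /tmp/he-deploy-key /tmp/he-deploy-key.pub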


Thanks,

On 7 Jan 2019, at 15:22, Vinícius Ferrão <ferrao@versatushpc.com.br> wrote:

Hello,

On 7 Jan 2019, at 12:52, Simone Tiraboschi <stirabos@redhat.com> wrote:



On Mon, Jan 7, 2019 at 2:03 PM Vinícius Ferrão <ferrao@versatushpc.com.br> wrote:
Hello Simone,

Sent from my iPhone

On 7 Jan 2019, at 07:11, Simone Tiraboschi <stirabos@redhat.com> wrote:



On Sun, Jan 6, 2019 at 5:31 PM <ferrao@versatushpc.com.br> wrote:
Hello,

I have a new oVirt installation using oVirt Node 4.2.7.1, and after deploying the hosted engine it does not show up in the interface, even after adding the first storage domain.

The data center is up, but the engine VM and the engine storage do not appear.

I have the following message repeated constantly in /var/log/messages:

Jan  4 20:17:30 ovirt1 journal: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs

What’s wrong? Am I doing something different?

The import of external VMs is broken in 4.2.7, as per https://bugzilla.redhat.com/show_bug.cgi?id=1649615
It will be fixed in 4.2.8.

In the meantime I strongly suggest using the regular flow for hosted-engine deployment (simply skip the --noansible option), since only the deprecated vintage flow is affected by this issue.
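
Concretely, the two flows are invoked like this (the second one is the deprecated flow affected by the bug):

    hosted-engine --deploy               # regular, Ansible-based flow (recommended)
    hosted-engine --deploy --noansible   # vintage flow, hit by this issue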

Thanks for pointing out the issue. I was unable to find it on Bugzilla by myself; the title doesn't help either.

But on the other hand, I only used the legacy mode because the Ansible mode fails.

Can you please attach a log of the issue?

Sure, the logs are at the link (see the PS at the end):

What happens is that Ansible just bypasses the storage configuration questions:

[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Stage: Environment customization
         
          --== STORAGE CONFIGURATION ==--
         
         
          --== HOST NETWORK CONFIGURATION ==--
         
          Please indicate a pingable gateway IP address [10.20.0.1]: 
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Detecting interface on existing management bridge]
[ INFO  ] skipping: [localhost]
[ INFO  ] TASK [Get all active network interfaces]
[ INFO  ] TASK [Filter bonds with bad naming]
[ INFO  ] TASK [Generate output list]



I'm not sure why it fails. I can try it again, but let me ask in advance: the management network is bonded; is this an issue? I think I've read something about this on this list, but I'm not sure.

No, but you should set bond mode 1, 2, 3, or 4.
Teaming is not supported.
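
You can confirm the active mode on the host like this (the bond name is illustrative):

    grep "Bonding Mode" /proc/net/bonding/bond0
    # mode 4 reports: Bonding Mode: IEEE 802.3ad Dynamic link aggregation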

Thanks. Since I'm using 802.3ad (LACP), i.e. mode 4, I think I'm good.


Thanks,



Additional infos:

[root@ovirt1 ~]# vdsm-tool list-nets
ovirtmgmt (default route)
storage

[root@ovirt1 ~]# ip a | grep "inet "
   inet 127.0.0.1/8 scope host lo
   inet 10.20.0.101/24 brd 10.20.0.255 scope global dynamic ovirtmgmt
   inet 192.168.10.1/29 brd 192.168.10.7 scope global storage

[root@ovirt1 ~]# mount | grep -i nfs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
10.20.0.200:/mnt/pool0/ovirt/he on /rhev/data-center/mnt/10.20.0.200:_mnt_pool0_ovirt_he type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=10.20.0.101,local_lock=none,addr=10.20.0.200)

[root@ovirt1 ~]# hosted-engine --check-deployed
Returns nothing!

[root@ovirt1 ~]# hosted-engine --check-liveliness
Hosted Engine is up!

[root@ovirt1 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt1.local.versatushpc.com.br
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 1736a87d
local_conf_timestamp               : 7836
Host timestamp                     : 7836
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7836 (Fri Jan  4 20:18:10 2019)
        host-id=1
        score=3400
        vm_conf_refresh_time=7836 (Fri Jan  4 20:18:10 2019)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


Thanks in advance,

PS: Log files are available here: http://www.if.ufrj.br/~ferrao/ovirt/issues/he-not-showing/
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/IQHM6YQ7HVBHLFQYBCRV2ODTELTWLLWC/