
Hello,

I'm interested in installing OKD on oVirt with the official OpenShift installer (https://github.com/openshift/installer), but oVirt is not yet supported. Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1578255 and https://lists.ovirt.org/archives/list/users@ovirt.org/thread/EF7OQUVTY53GV3A... , how will oVirt 4.3.7 integrate the OpenShift installer with Terraform?

--
Nathanaël Blanchet
Supervision réseau
Pôle Infrastructures Informatiques
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet@abes.fr

On Tue, 5 Nov 2019 at 13:22, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Hello,
I'm interested in installing OKD on oVirt with the official OpenShift installer (https://github.com/openshift/installer), but oVirt is not yet supported.
If you want to give it a try and supply feedback, I'll be glad.
Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1578255 and
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/EF7OQUVTY53GV3A... , how will oVirt 4.3.7 integrate the OpenShift installer with Terraform?
Terraform is part of it, yes. It is what we use to spin up the first three masters, plus a bootstrapping machine.

On Tue, 5 Nov 2019 at 12:24, Roy Golan <rgolan@redhat.com> wrote:
If you want to give it a try and supply feedback, I'll be glad.
+Gal Zaidman <gzaidman@redhat.com> +Evgeny Slutsky <eslutsky@redhat.com> +Roy Golan <rgolan@redhat.com> +Douglas Landgraf <dlandgra@redhat.com>, maybe you can prepare a "quick setup guide" for this case on ovirt.org?
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo@redhat.com
Red Hat respects your work life balance. Therefore there is no need to answer this email out of your office hours.

On Tue, 5 Nov 2019 at 14:50, Sandro Bonazzola <sbonazzo@redhat.com> wrote:
+Gal Zaidman <gzaidman@redhat.com> +Evgeny Slutsky <eslutsky@redhat.com> +Roy Golan <rgolan@redhat.com> +Douglas Landgraf <dlandgra@redhat.com>, maybe you can prepare a "quick setup guide" for this case on ovirt.org?
There is an install doc as part of the PR; let's keep working on a single one. If it needs additions, let me know by direct feedback on the PR: https://github.com/openshift/installer/pull/1948/commits/1f1c4c2b4c9e1fff64c...

I tried openshift-install after compiling, but no oVirt provider is available... So what do you mean when you say "give a try"? Maybe only provisioning oVirt with the Terraform module?

[root@vm5 installer]# bin/openshift-install create cluster
? Platform [Use arrows to move, space to select, type to filter, ? for more help]
> aws
  azure
  gcp
  openstack

On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr> wrote:
I tried openshift-install after compiling, but no oVirt provider is available... So what do you mean when you say "give a try"? Maybe only provisioning oVirt with the Terraform module?
It's not merged yet. Please pull this image and work with it as a container: quay.io/rgolangh/openshift-installer
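For example, something along these lines (a rough sketch only: the /output mount point for the install assets and the local install-dir name are assumptions, not confirmed in this thread):

docker pull quay.io/rgolangh/openshift-installer
mkdir -p install-dir
docker run -it -v $PWD/install-dir:/output quay.io/rgolangh/openshift-installer create cluster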

On 05/11/2019 at 13:54, Roy Golan wrote:
It's not merged yet. Please pull this image and work with it as a container: quay.io/rgolangh/openshift-installer
A little feedback as you asked:

[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster
? Platform ovirt
? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api
? Enter ovirt-engine username admin@internal
? Enter password **********
? Pick the oVirt cluster Default
? Pick a VM template centos7.x
? Enter the internal API Virtual IP 10.34.212.200
? Enter the internal DNS Virtual IP 10.34.212.100
? Enter the ingress IP 10.34.212.50
? Base Domain oc4.localdomain
? Cluster Name test
? Pull Secret [? for help] *************************************
INFO Creating infrastructure resources...
INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443...
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no such host
INFO Pulling debug logs from the bootstrap machine
ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory
FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded

* 6 VMs are successfully created as thin (dependent) clones of the template
* each VM is provisioned by cloud-init
* the step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time.
* From this point on there is no more visibility into what is being done or what goes wrong... what's happening there? I suppose some kind of playbook is downloading some kind of images...
* The "pull secret" step is not clear: you must have a Red Hat account at https://cloud.redhat.com/openshift/install/ to get a key like:
* {"auths":{"cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2...********","email":"exploit@abes.fr"},"quay.io":{"auth":"...********","email":"exploit@abes.fr"},"registry.connect.redhat.com":{"auth":"...********","email":"exploit@abes.fr"},"registry.redhat.io":{"auth":"...********","email":"exploit@abes.fr"}}}

Can you tell me if I'm doing something wrong?

On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Can you tell me if I'm doing something wrong?
What is the template you are using? I don't think it's an RHCOS (Red Hat CoreOS) template; it looks like CentOS?
Use this gist to import the template: https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
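Also, about the "failed to read directory /output/.ssh" error at the end of your run: when the installer runs in the container, it looks for SSH keys under /output/.ssh. A rough sketch of a workaround (the mount path is an assumption based on that error message, not something verified here) is to make a key available there, e.g.:

docker run -it -v $HOME/.ssh:/output/.ssh:ro 56e5b667100f create cluster

That only affects the debug-log gathering at the end; it is not the cause of the bootstrap failure itself.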

On 05/11/2019 at 18:22, Roy Golan wrote:
What is the template you are using? I don't think it's an RHCOS (Red Hat CoreOS) template; it looks like CentOS?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...

On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Unfortunately, the result is the same with the RHCOS template...
Make sure that:
- the IPs supplied are taken, and belong to the VM network of those master VMs
- localdomain or a local domain suffix shouldn't be used
- your ovirt-engine is version 4.3.7 or master
(a quick way to sanity-check the DNS/VIP side is sketched below)
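As a quick sanity check before re-running, from a machine on the same VM network (just a sketch, using the values from your run as examples):

# the internal DNS Virtual IP should answer for the cluster API record once the bootstrap DNS is up
dig @10.34.212.100 api.test.oc4.localdomain +short
# the API Virtual IP should eventually answer on 6443 (this is the same check the installer loops on)
curl -k https://10.34.212.200:6443/version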

On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com> wrote:
Make sure that:
- the IPs supplied are taken, and belong to the VM network of those master VMs
- localdomain or a local domain suffix shouldn't be used
- your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even a non-existing one.
When the bootstrap phase is done, the installation will tear down the bootstrap machine. At that stage, if you are using a non-existing domain, you need to add the DNS Virtual IP you provided to your resolv.conf so the installation can resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN. Also, you have a log under your $INSTALL_DIR/.openshift_install.log.
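For example, on the machine running the installer, something like this (values from your run, adjust to yours):

echo "nameserver 10.34.212.100" >> /etc/resolv.conf    # the internal DNS Virtual IP you entered
dig api.test.oc4.localdomain +short                    # should now resolve
tail -f <install-dir>/.openshift_install.log           # follow the installer log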
Le 05/11/2019 à 12:24, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 13:22, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Hello,
I'm interested by installing okd on ovirt with the official openshift installer (https://github.com/openshift/installer), but ovirt is not yet supported.
If you want to give a try and supply feedback I'll be glad.
Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1578255 and
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/EF7OQUVTY53GV3A... , how ovirt 4.3.7 should integrate openshift installer integration with terraform?
Terraform is part of it, yes, It is what we use to spin the first 3 masters, plus a bootstraping machine.
--
Nathanaël Blanchet
Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr
-- Nathanaël Blanchet
Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14blanchet@abes.fr
--
Nathanaël Blanchet
Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14blanchet@abes.fr
-- Nathanaël Blanchet
Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14blanchet@abes.fr

On 05/11/2019 at 21:50, Roy Golan wrote:
I didn't mention that you can provide any domain name, even a non-existing one. When the bootstrap phase is done, the installation will tear down the bootstrap machine. At that stage, if you are using a non-existing domain, you need to add the DNS Virtual IP you provided to your resolv.conf so the installation can resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
Also, you have a log under your $INSTALL_DIR/.openshift_install.log.
I tried several things following your advice, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test, with logs:

time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"

So DNS resolution and the network are now good, and Ignition provisioning is OK, but something goes wrong with the bootstrap VM. If I log into the bootstrap VM, I can see an SELinux message, but it may not be relevant:

SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).

Some other clues with journalctl:

journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster
Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl)
Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl)
Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...

It seems to be again a DNS resolution issue:

[user1@localhost ~]$ dig api.test.oc4.localdomain +short
10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short
(nothing)

So what do you think about that?

On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr> wrote:
It seems to be again a dns resolution issue.
Key here is the masters - they need to boot, get Ignition from the bootstrap machine, and start publishing their IPs and hostnames. Connect to a master, check its hostname, and check its running or failing containers with `crictl ps -a` as the root user.
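Roughly (a sketch only: the core user is the RHCOS default, the IPs/names below are just the ones from your run, and the etcd SRV record check is an assumption based on how OpenShift 4.x discovers etcd, not something verified in this thread):

ssh core@<master-ip>
hostnamectl                  # the hostname the master publishes
sudo crictl ps -a            # look for failing etcd / kubelet-related containers
# and from any machine, check the internal DNS Virtual IP answers for the etcd records:
dig @10.34.212.101 etcd-0.test.oc4.localdomain +short
dig @10.34.212.101 _etcd-server-ssl._tcp.test.oc4.localdomain SRV +short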

Le 07/11/2019 à 07:18, Roy Golan a écrit :
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 21:50, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 18:22, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 13:54, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
I tried openshift-install after compiling, but no ovirt provider is available... So what do you mean when you say "give a try"? Maybe only provisioning ovirt with the terraform module?
[root@vm5 installer]# bin/openshift-install create cluster
? Platform [Use arrows to move, space to select, type to filter, ? for more help]
> aws
  azure
  gcp
  openstack
It's not merged yet. Please pull this image and work with it as a container: quay.io/rgolangh/openshift-installer
A little feedback as you asked:
[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster
? Platform ovirt
? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api
? Enter ovirt-engine username admin@internal
? Enter password **********
? Pick the oVirt cluster Default
? Pick a VM template centos7.x
? Enter the internal API Virtual IP 10.34.212.200
? Enter the internal DNS Virtual IP 10.34.212.100
? Enter the ingress IP 10.34.212.50
? Base Domain oc4.localdomain
? Cluster Name test
? Pull Secret [? for help] *************************************
INFO Creating infrastructure resources...
INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443...
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no such host
INFO Pulling debug logs from the bootstrap machine
ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory
FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
* 6 VMs are successfully created as thin clones dependent on the template
* each VM is provisioned by cloud-init
* the step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time.
* right at this moment, there is no more visibility on what is done or what goes wrong... What's happening there? Presumably some kind of playbook downloading some kind of images...
* the "pull secret" step is not clear: we must have a Red Hat account on https://cloud.redhat.com/openshift/install/ to get a key like
{"auths":{"cloud.openshift.com <http://cloud.openshift.com>":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"quay.io <http://quay.io>":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"registry.connect.redhat.com <http://registry.connect.redhat.com>":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"registry.redhat.io <http://registry.redhat.io>":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr" <mailto:exploit@abes.fr>}}}
Can you tell me if I'm doing something wrong?
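One note on the ERROR about /output/.ssh in the output above: the containerized installer apparently keeps its assets under /output, so a possible (untested) way to run it is with a host directory mounted there and an SSH key placed in its .ssh subdirectory, or passed with --key as the error message suggests; a sketch:

mkdir -p ~/okd-assets/.ssh
cp ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/okd-assets/.ssh/
docker run -it -v ~/okd-assets:/output 56e5b667100f create cluster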
What is the template you are using? I don't think it's an RHCOS (Red Hat CoreOS) template, it looks like CentOS?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that:
- the IPs supplied are taken, and belong to the VM network of those master VMs
- localdomain or local domain suffix shouldn't be used
- your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even a non-existing one. When the bootstrap phase is done, the installation will tear down the bootstrap machine. At this stage, if you are using a non-existing domain, you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation can resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
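Concretely, with the values used in this thread (DNS Virtual IP 10.34.212.100 and cluster test.oc4.localdomain), that would look something like:

# as root, on the machine running the installer
echo "nameserver 10.34.212.100" >> /etc/resolv.conf
dig api.test.oc4.localdomain +short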
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
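For example, to follow it during a run:

tail -f $INSTALL_DIR/.openshift_install.log
grep -i error $INSTALL_DIR/.openshift_install.log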
I tried several things following your advice, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and network are now good and ignition provisioning is OK, but something goes wrong with the bootstrap VM.
Now if I log into the bootstrap VM, I can see an SELinux message, but it may not be relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other clues, with journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae>, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae>, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right:

# crictl ps -a
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID
744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478
912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478

# crictl logs 744cb8e654705
E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host

# hostname
localhost

Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during the provisioning phase.

I changed the master-0 hostname and reinitiated ignition to verify:

# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot

After the reboot is completed, no more exited discovery container:

CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID
e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8
2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35
b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787
07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29
fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7
476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787
26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29
30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8
ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35
650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e 13 hours ago Exited coredns 0 2ae0512b3b6ac
481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e 13 hours ago Exited mdns-publisher 0 d49754042b792
3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d 13 hours ago Exited haproxy-monitor 0 3476219058ba8
88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f
1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 13 hours ago

because DNS registration is OK:

[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short
10.34.212.227

CONCLUSION:
* none of the RHCOS VMs is correctly provisioned with its targeted hostname, so they all stay with localhost.
* Cloud-init syntax for the hostname is ok, but it is not provisioned by ignition.
Why not provision these hostnames with a JSON snippet or something similar?
|{"ignition":{"version":"2.2.0"},"storage":{"files":[{"filesystem":"root","path":"/etc/hostname","mode":420,"contents":{"source":"data:,master-0.test.oc4.localdomain"}}]}}|

On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 07/11/2019 à 07:18, Roy Golan a écrit :
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 05/11/2019 à 21:50, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 05/11/2019 à 18:22, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 05/11/2019 à 13:54, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr> wrote:
I tried openshift-install after compiling but no ovirt provider is available... So waht do you mean when you say "give a try"? Maybe only provisionning ovirt with the terraform module?
[root@vm5 installer]# bin/openshift-install create cluster ? Platform [Use arrows to move, space to select, type to filter, ? for more help] > aws azure gcp openstack
Its not merged yet. Please pull this image and work with it as a container quay.io/rgolangh/openshift-installer
A little feedback as you asked:
[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster ? Platform ovirt ? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api ? Enter ovirt-engine username admin@internal ? Enter password ********** ? Pick the oVirt cluster Default ? Pick a VM template centos7.x ? Enter the internal API Virtual IP 10.34.212.200 ? Enter the internal DNS Virtual IP 10.34.212.100 ? Enter the ingress IP 10.34.212.50 ? Base Domain oc4.localdomain ? Cluster Name test ? Pull Secret [? for help] ************************************* INFO Creating infrastructure resources... INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443... ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no such host INFO Pulling debug logs from the bootstrap machine ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
- 6 vms are successfully created thin dependent from the template
- each vm is provisionned by cloud-init - the step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time. - Right this moment, there is no more visibility on what is done, what goes wrong... what's happening there? supposing a kind of playbook downloading a kind of images... - The" pull secret step" is not clear: we must have a redhat account to https://cloud.redhat.com/openshift/install/ to get a key like - {"auths":{"cloud.openshift.com ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": "exploit@abes.fr" <exploit@abes.fr>},"quay.io ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": "exploit@abes.fr" <exploit@abes.fr>},"registry.connect.redhat.com ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": "exploit@abes.fr" <exploit@abes.fr>},"registry.redhat.io ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": "exploit@abes.fr" <exploit@abes.fr>}}}
Can you tell me if I'm doing wrong?
What is the template you are using? I don't think its RHCOS(Red Hat CoreOs) template, it looks like Centos?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that: - the IPs supplied are taken, and belong to the VM network of those master VMs - localdomain or local domain suffix shouldn't be used - your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even non-existing.
When the bootstrap phase will be done, the instllation will teardown the bootsrap mahchine. At this stage if you are using a non-existing domain you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation could resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
I tried several things with your advices, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and network are now good and ignition provisionning is is OK but something goes wrong with the bootstrap vm.
Now if I log into the bootstrap vm, I can see a selinux message, but it may be not relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other cluewWith journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the
bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right: # crictl ps -a CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID 744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478 912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478
# crictl logs 744cb8e654705 E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host
# hostname localhost
Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during provisionning phase.
I changed the master-0 hostname and reinitiates ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
After reboot is completed, no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8 2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35 b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787 07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29 fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7 476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787 26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29 30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8 ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35 650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e 13 hours ago Exited coredns 0 2ae0512b3b6ac 481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e 13 hours ago Exited mdns-publisher 0 d49754042b792 3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d 13 hours ago Exited haproxy-monitor 0 3476219058ba8 88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f 1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short 10.34.212.227
CONCLUSION:
- none of rhcos vm is correctly provisionned to their targeted hostname, so they all stay with localhost.
What is your engine version? The hostname support for ignition is merged into 4.3.7 and master.
- Cloud-init syntax for the hostname is ok, but it is not provisioned by ignition:
Why not provisionning these hostnames with a json snippet or else?
{ "ignition": { "version": "2.2.0" }, "storage": { "files": [{ "filesystem": "root", "path": "/etc/hostname", "mode": 420, "contents": { "source": "data:,master-0.test.oc4.localdomain" } }] }}

Le 07/11/2019 à 11:16, Roy Golan a écrit :
On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 07/11/2019 à 07:18, Roy Golan a écrit :
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 21:50, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 18:22, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 13:54, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
I tried openshift-install after compiling but no ovirt provider is available... So waht do you mean when you say "give a try"? Maybe only provisionning ovirt with the terraform module?
[root@vm5 installer]# bin/openshift-install create cluster ? Platform [Use arrows to move, space to select, type to filter, ? for more help] > aws azure gcp openstack
Its not merged yet. Please pull this image and work with it as a container quay.io/rgolangh/openshift-installer <http://quay.io/rgolangh/openshift-installer>
A little feedback as you asked:
[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster ? Platform ovirt ? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api ? Enter ovirt-engine username admin@internal ? Enter password ********** ? Pick the oVirt cluster Default ? Pick a VM template centos7.x ? Enter the internal API Virtual IP 10.34.212.200 ? Enter the internal DNS Virtual IP 10.34.212.100 ? Enter the ingress IP 10.34.212.50 ? Base Domain oc4.localdomain ? Cluster Name test ? Pull Secret [? for help] ************************************* INFO Creating infrastructure resources... INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443... ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53 <http://10.34.212.100:53>: no such host INFO Pulling debug logs from the bootstrap machine ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
* 6 vms are successfully created thin dependent from the template
* each vm is provisionned by cloud-init * the step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time. * Right this moment, there is no more visibility on what is done, what goes wrong... what's happening there? supposing a kind of playbook downloading a kind of images... * The" pull secret step" is not clear: we must have a redhat account to https://cloud.redhat.com/openshift/install/ to get a key like *
{"auths":{"cloud.openshift.com <http://cloud.openshift.com>":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"quay.io <http://quay.io>":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"registry.connect.redhat.com <http://registry.connect.redhat.com>":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"registry.redhat.io <http://registry.redhat.io>":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr" <mailto:exploit@abes.fr>}}}
Can you tell me if I'm doing wrong?
What is the template you are using? I don't think its RHCOS(Red Hat CoreOs) template, it looks like Centos?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that: - the IPs supplied are taken, and belong to the VM network of those master VMs - localdomain or local domain suffix shouldn't be used - your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even non-existing. When the bootstrap phase will be done, the instllation will teardown the bootsrap mahchine. At this stage if you are using a non-existing domain you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation could resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
I tried several things with your advices, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and network are now good and ignition provisionning is is OK but something goes wrong with the bootstrap vm.
Now if I log into the bootstrap vm, I can see a selinux message, but it may be not relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other cluewWith journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae>, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae>, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right:
# crictl ps -a CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID 744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478 912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478
# crictl logs 744cb8e654705 E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227 <http://10.34.212.227>: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53 <http://10.34.212.51:53>: no such host
# hostname localhost
Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during provisionning phase.
I changed the master-0 hostname and reinitiates ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
After reboot is completed, no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8 2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35 b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787 07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29 fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7 476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787 26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29 30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8 ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35 650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e> 13 hours ago Exited coredns 0 2ae0512b3b6ac 481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e> 13 hours ago Exited mdns-publisher 0 d49754042b792 3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d> 13 hours ago Exited haproxy-monitor 0 3476219058ba8 88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f 1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60> 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short 10.34.212.227
CONCLUSION:
* none of rhcos vm is correctly provisionned to their targeted hostname, so they all stay with localhost.
What is your engine version? the hostname support for ignition is merged into 4.3.7 and master
4.3.7.1-1.el7. I only upgraded the engine and not vdsm on the hosts, but I suppose hosts are not important for ignition.
* Cloud-init syntax for the hostname is ok, but it is not provisioned by ignition:
Why not provisionning these hostnames with a json snippet or else?
|{"ignition":{"version":"2.2.0"},"storage":{"files":[{"filesystem":"root","path":"/etc/hostname","mode":420,"contents":{"source":"data:,master-0.test.oc4.localdomain"}}]}}|

On Thu, 7 Nov 2019 at 12:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 07/11/2019 à 11:16, Roy Golan a écrit :
On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 07/11/2019 à 07:18, Roy Golan a écrit :
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 05/11/2019 à 21:50, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 05/11/2019 à 18:22, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Le 05/11/2019 à 13:54, Roy Golan a écrit :
On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr> wrote:
> I tried openshift-install after compiling but no ovirt provider is > available... So waht do you mean when you say "give a try"? Maybe only > provisionning ovirt with the terraform module? > > [root@vm5 installer]# bin/openshift-install create cluster > ? Platform [Use arrows to move, space to select, type to filter, ? > for more help] > > aws > azure > gcp > openstack > > > Its not merged yet. Please pull this image and work with it as a container quay.io/rgolangh/openshift-installer
A little feedback as you asked:
[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster ? Platform ovirt ? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api ? Enter ovirt-engine username admin@internal ? Enter password ********** ? Pick the oVirt cluster Default ? Pick a VM template centos7.x ? Enter the internal API Virtual IP 10.34.212.200 ? Enter the internal DNS Virtual IP 10.34.212.100 ? Enter the ingress IP 10.34.212.50 ? Base Domain oc4.localdomain ? Cluster Name test ? Pull Secret [? for help] ************************************* INFO Creating infrastructure resources... INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443... ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no such host INFO Pulling debug logs from the bootstrap machine ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
- 6 vms are successfully created thin dependent from the template
- each vm is provisionned by cloud-init - the step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time. - Right this moment, there is no more visibility on what is done, what goes wrong... what's happening there? supposing a kind of playbook downloading a kind of images... - The" pull secret step" is not clear: we must have a redhat account to https://cloud.redhat.com/openshift/install/ to get a key like - {"auths":{"cloud.openshift.com ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": "exploit@abes.fr" <exploit@abes.fr>},"quay.io ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": "exploit@abes.fr" <exploit@abes.fr>},"registry.connect.redhat.com ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": "exploit@abes.fr" <exploit@abes.fr>},"registry.redhat.io ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": "exploit@abes.fr" <exploit@abes.fr>}}}
Can you tell me if I'm doing wrong?
What is the template you are using? I don't think its RHCOS(Red Hat CoreOs) template, it looks like Centos?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that: - the IPs supplied are taken, and belong to the VM network of those master VMs - localdomain or local domain suffix shouldn't be used - your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even
non-existing. When the bootstrap phase will be done, the instllation will teardown the bootsrap mahchine. At this stage if you are using a non-existing domain you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation could resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
I tried several things with your advices, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and network are now good and ignition provisionning is is OK but something goes wrong with the bootstrap vm.
Now if I log into the bootstrap vm, I can see a selinux message, but it may be not relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other cluewWith journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the
bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right: # crictl ps -a CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID 744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478 912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478
# crictl logs 744cb8e654705 E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host
# hostname localhost
Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during provisionning phase.
I changed the master-0 hostname and reinitiates ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
After reboot is completed, no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8 2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35 b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787 07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29 fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7 476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787 26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29 30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8 ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35 650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e 13 hours ago Exited coredns 0 2ae0512b3b6ac 481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e 13 hours ago Exited mdns-publisher 0 d49754042b792 3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d 13 hours ago Exited haproxy-monitor 0 3476219058ba8 88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f 1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short
10.34.212.227
CONCLUSION:
- none of the RHCOS VMs is correctly provisioned with its targeted hostname, so they all stay with localhost.
What is your engine version? The hostname support for ignition is merged into 4.3.7 and master.
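For reference, a quick way to check that on the engine host is a plain package query:

rpm -q ovirt-engine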
4.3.7.1-1.el7
https://gerrit.ovirt.org/c/100397/ merged 2 days ago, so it will appear in 4.3.7.2. Sandro, when is 4.3.7.2 due?
I only upgraded the engine and not vdsm on the hosts, but I suppose the hosts are not important for ignition.
Correct.
- The cloud-init syntax for the hostname is OK, but it is not provisioned by ignition:
Why not provision these hostnames with a JSON snippet or something similar?
{ "ignition": { "version": "2.2.0" }, "storage": { "files": [{ "filesystem": "root", "path": "/etc/hostname", "mode": 420, "contents": { "source": "data:,master-0.test.oc4.localdomain" } }] }}
On Thu, Nov 7, 2019 at 12:57 PM Roy Golan <rgolan@redhat.com> wrote:
On Thu, 7 Nov 2019 at 12:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 07/11/2019 at 11:16, Roy Golan wrote:
On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 07/11/2019 at 07:18, Roy Golan wrote:
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 05/11/2019 at 21:50, Roy Golan wrote:
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 05/11/2019 at 18:22, Roy Golan wrote:
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr> wrote:
> > Le 05/11/2019 à 13:54, Roy Golan a écrit : > > > > On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr> > wrote: > >> I tried openshift-install after compiling but no ovirt provider is >> available... So waht do you mean when you say "give a try"? Maybe only >> provisionning ovirt with the terraform module? >> >> [root@vm5 installer]# bin/openshift-install create cluster >> ? Platform [Use arrows to move, space to select, type to filter, ? >> for more help] >> > aws >> azure >> gcp >> openstack >> >> >> > Its not merged yet. Please pull this image and work with it as a > container > quay.io/rgolangh/openshift-installer > > A little feedback as you asked: > > [root@openshift-installer ~]# docker run -it 56e5b667100f create > cluster > ? Platform ovirt > ? Enter oVirt's api endpoint URL > https://air-dev.v100.abes.fr/ovirt-engine/api > ? Enter ovirt-engine username admin@internal > ? Enter password ********** > ? Pick the oVirt cluster Default > ? Pick a VM template centos7.x > ? Enter the internal API Virtual IP 10.34.212.200 > ? Enter the internal DNS Virtual IP 10.34.212.100 > ? Enter the ingress IP 10.34.212.50 > ? Base Domain oc4.localdomain > ? Cluster Name test > ? Pull Secret [? for help] ************************************* > INFO Creating infrastructure resources... > INFO Waiting up to 30m0s for the Kubernetes API at > https://api.test.oc4.localdomain:6443... > ERROR Attempted to gather ClusterOperator status after installation > failure: listing ClusterOperator objects: Get > https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: > dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no > such host > INFO Pulling debug logs from the bootstrap machine > ERROR Attempted to gather debug logs after installation failure: > failed to create SSH client, ensure the proper ssh key is in your keyring > or specify with --key: failed to initialize the SSH agent: failed to read > directory "/output/.ssh": open /output/.ssh: no such file or directory > FATAL Bootstrap failed to complete: waiting for Kubernetes API: > context deadline exceeded > > - 6 vms are successfully created thin dependent from the template > > > - each vm is provisionned by cloud-init > - the step "INFO Waiting up to 30m0s for the Kubernetes API at > https://api.test.oc4.localdomain:6443..." fails. It seems that > the DNS pod is not up at this time. > - Right this moment, there is no more visibility on what is > done, what goes wrong... what's happening there? supposing a kind of > playbook downloading a kind of images... 
> - The" pull secret step" is not clear: we must have a redhat > account to https://cloud.redhat.com/openshift/install/ to get a > key like > - > {"auths":{"cloud.openshift.com > ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": > "exploit@abes.fr" <exploit@abes.fr>},"quay.io > ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": > "exploit@abes.fr" <exploit@abes.fr>},"registry.connect.redhat.com > ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": > "exploit@abes.fr" <exploit@abes.fr>},"registry.redhat.io > ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": > "exploit@abes.fr" <exploit@abes.fr>}}} > > > Can you tell me if I'm doing wrong? >
What is the template you are using? I don't think it's the RHCOS (Red Hat CoreOS) template; it looks like CentOS?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that:
- the IPs supplied are taken, and belong to the VM network of those master VMs
- localdomain or a local domain suffix shouldn't be used
- your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even a non-existing one. When the bootstrap phase is done, the installation will tear down the bootstrap machine. At this stage, if you are using a non-existing domain, you will need to add the DNS Virtual IP you provided to your resolv.conf so the installation can resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
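A minimal sketch of that on the machine running the installer, assuming the values from the transcript earlier in this thread (DNS Virtual IP 10.34.212.100, cluster test, base domain oc4.localdomain):

echo "nameserver 10.34.212.100" >> /etc/resolv.conf
dig +short api.test.oc4.localdomain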
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
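For example (a sketch; $INSTALL_DIR is wherever create cluster was run):

tail -f $INSTALL_DIR/.openshift_install.log

and, assuming the installer build in the image exposes it, the bootstrap wait loop can be re-entered with more verbose output:

openshift-install wait-for bootstrap-complete --dir $INSTALL_DIR --log-level debug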
I tried several things following your advice, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and networking are now good and ignition provisioning is OK, but something goes wrong with the bootstrap VM.
Now if I log into the bootstrap VM, I can see an SELinux message, but it may not be relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other clues with journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the
bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right: # crictl ps -a CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID 744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478 912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478
# crictl logs 744cb8e654705 E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host
# hostname localhost
Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during provisionning phase.
I changed the master-0 hostname and reinitiates ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
After reboot is completed, no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8 2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35 b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787 07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29 fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7 476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787 26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29 30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8 ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35 650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e 13 hours ago Exited coredns 0 2ae0512b3b6ac 481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e 13 hours ago Exited mdns-publisher 0 d49754042b792 3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d 13 hours ago Exited haproxy-monitor 0 3476219058ba8 88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f 1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short 10.34.212.227
CONCLUSION:
- none of rhcos vm is correctly provisionned to their targeted hostname, so they all stay with localhost.
What is your engine version? the hostname support for ignition is merged into 4.3.7 and master
4.3.7.1-1.el7
https://gerrit.ovirt.org/c/100397/ merged 2 days ago, so it will apear in 4.3.7.2.
Sandro when is 4.7.3.2 is due?
You can also use the nightly 4.3 snapshot - it's not really nightly anymore, it's updated on every run of the CI Change-Queue, IIUC: https://www.ovirt.org/develop/dev-process/install-nightly-snapshot.html
I only upgraded engine and not vdsm on hosts, but I suppose hosts are not
important for ignition
Correct.
- Cloud-init syntax for the hostname is ok, but it is not provisioned by ignition:
Why not provisionning these hostnames with a json snippet or else?
{ "ignition": { "version": "2.2.0" }, "storage": { "files": [{ "filesystem": "root", "path": "/etc/hostname", "mode": 420, "contents": { "source": "data:,master-0.test.oc4.localdomain" } }] }}
_______________________________________________
Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/GZ64UU7KYDYJMZ...
-- Didi
On 07/11/2019 at 12:02, Yedidyah Bar David wrote:
On Thu, Nov 7, 2019 at 12:57 PM Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Thu, 7 Nov 2019 at 12:28, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 07/11/2019 at 11:16, Roy Golan wrote:
On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 07/11/2019 at 07:18, Roy Golan wrote:
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 05/11/2019 at 21:50, Roy Golan wrote:
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 05/11/2019 at 18:22, Roy Golan wrote:
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 05/11/2019 at 13:54, Roy Golan wrote:
On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
I tried openshift-install after compiling but no ovirt provider is available... So waht do you mean when you say "give a try"? Maybe only provisionning ovirt with the terraform module?
[root@vm5 installer]# bin/openshift-install create cluster ? Platform [Use arrows to move, space to select, type to filter, ? for more help] > aws azure gcp openstack
Its not merged yet. Please pull this image and work with it as a container quay.io/rgolangh/openshift-installer <http://quay.io/rgolangh/openshift-installer>
A little feedback as you asked:
[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster ? Platform ovirt ? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api ? Enter ovirt-engine username admin@internal ? Enter password ********** ? Pick the oVirt cluster Default ? Pick a VM template centos7.x ? Enter the internal API Virtual IP 10.34.212.200 ? Enter the internal DNS Virtual IP 10.34.212.100 ? Enter the ingress IP 10.34.212.50 ? Base Domain oc4.localdomain ? Cluster Name test ? Pull Secret [? for help] ************************************* INFO Creating infrastructure resources... INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443... ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53 <http://10.34.212.100:53>: no such host INFO Pulling debug logs from the bootstrap machine ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
* 6 vms are successfully created thin dependent from the template
* each vm is provisionned by cloud-init * the step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time. * Right this moment, there is no more visibility on what is done, what goes wrong... what's happening there? supposing a kind of playbook downloading a kind of images... * The" pull secret step" is not clear: we must have a redhat account to https://cloud.redhat.com/openshift/install/ to get a key like *
{"auths":{"cloud.openshift.com <http://cloud.openshift.com>":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"quay.io <http://quay.io>":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"registry.connect.redhat.com <http://registry.connect.redhat.com>":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr" <mailto:exploit@abes.fr>},"registry.redhat.io <http://registry.redhat.io>":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr" <mailto:exploit@abes.fr>}}}
Can you tell me if I'm doing wrong?
What is the template you are using? I don't think its RHCOS(Red Hat CoreOs) template, it looks like Centos?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that: - the IPs supplied are taken, and belong to the VM network of those master VMs - localdomain or local domain suffix shouldn't be used - your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even non-existing. When the bootstrap phase will be done, the instllation will teardown the bootsrap mahchine. At this stage if you are using a non-existing domain you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation could resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
I tried several things with your advices, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and network are now good and ignition provisionning is is OK but something goes wrong with the bootstrap vm.
Now if I log into the bootstrap vm, I can see a selinux message, but it may be not relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other cluewWith journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53 <http://10.34.212.101:53>: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae>, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae>, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right:
# crictl ps -a CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID 744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478 912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478
# crictl logs 744cb8e654705 E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227 <http://10.34.212.227>: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53 <http://10.34.212.51:53>: no such host
# hostname localhost
Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during provisionning phase.
I changed the master-0 hostname and reinitiates ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
After reboot is completed, no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8 2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35 b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787 07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29 fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7 476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787 26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29 30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8 ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35 650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e> 13 hours ago Exited coredns 0 2ae0512b3b6ac 481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e> 13 hours ago Exited mdns-publisher 0 d49754042b792 3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d> 13 hours ago Exited haproxy-monitor 0 3476219058ba8 88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f 1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 <http://registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60> 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short 10.34.212.227
CONCLUSION:
* none of rhcos vm is correctly provisionned to their targeted hostname, so they all stay with localhost.
What is your engine version? the hostname support for ignition is merged into 4.3.7 and master
4.3.7.1-1.el7
https://gerrit.ovirt.org/c/100397/ merged 2 days ago, so it will apear in 4.3.7.2.
Sandro when is 4.7.3.2 is due?
You can also use the nightly 4.3 snapshot - it's not really nightly anymore - it's updated per every run of CI Change-Queue, IIUC:
https://www.ovirt.org/develop/dev-process/install-nightly-snapshot.html
I confirm the 4.3 snapshot supports hostname change with ignition; it now works out of the box... until this issue:
INFO Cluster operator image-registry Available is False with StorageNotConfigured: storage backend not configured
ERROR Cluster operator image-registry Degraded is True with StorageNotConfigured: storage backend not configured
INFO Cluster operator insights Disabled is False with :
FATAL failed to initialize the cluster: Cluster operator image-registry is still updating
I only upgraded engine and not vdsm on hosts, but I suppose hosts are not important for ignition
Correct.
* Cloud-init syntax for the hostname is ok, but it is not provisioned by ignition:
Why not provisionning these hostnames with a json snippet or else?
|{"ignition":{"version":"2.2.0"},"storage":{"files":[{"filesystem":"root","path":"/etc/hostname","mode":420,"contents":{"source":"data:,master-0.test.oc4.localdomain"}}]}}|
On Thu, 7 Nov 2019 at 15:07, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 07/11/2019 at 12:02, Yedidyah Bar David wrote:
On Thu, Nov 7, 2019 at 12:57 PM Roy Golan <rgolan@redhat.com> wrote:
On Thu, 7 Nov 2019 at 12:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 07/11/2019 at 11:16, Roy Golan wrote:
On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 07/11/2019 at 07:18, Roy Golan wrote:
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 05/11/2019 at 21:50, Roy Golan wrote:
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr> wrote:
> > Le 05/11/2019 à 18:22, Roy Golan a écrit : > > > > On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr> > wrote: > >> >> Le 05/11/2019 à 13:54, Roy Golan a écrit : >> >> >> >> On Tue, 5 Nov 2019 at 14:52, Nathanaël Blanchet <blanchet@abes.fr> >> wrote: >> >>> I tried openshift-install after compiling but no ovirt provider is >>> available... So waht do you mean when you say "give a try"? Maybe only >>> provisionning ovirt with the terraform module? >>> >>> [root@vm5 installer]# bin/openshift-install create cluster >>> ? Platform [Use arrows to move, space to select, type to filter, >>> ? for more help] >>> > aws >>> azure >>> gcp >>> openstack >>> >>> >>> >> Its not merged yet. Please pull this image and work with it as a >> container >> quay.io/rgolangh/openshift-installer >> >> A little feedback as you asked: >> >> [root@openshift-installer ~]# docker run -it 56e5b667100f create >> cluster >> ? Platform ovirt >> ? Enter oVirt's api endpoint URL >> https://air-dev.v100.abes.fr/ovirt-engine/api >> ? Enter ovirt-engine username admin@internal >> ? Enter password ********** >> ? Pick the oVirt cluster Default >> ? Pick a VM template centos7.x >> ? Enter the internal API Virtual IP 10.34.212.200 >> ? Enter the internal DNS Virtual IP 10.34.212.100 >> ? Enter the ingress IP 10.34.212.50 >> ? Base Domain oc4.localdomain >> ? Cluster Name test >> ? Pull Secret [? for help] ************************************* >> INFO Creating infrastructure resources... >> INFO Waiting up to 30m0s for the Kubernetes API at >> https://api.test.oc4.localdomain:6443... >> ERROR Attempted to gather ClusterOperator status after installation >> failure: listing ClusterOperator objects: Get >> https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: >> dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no >> such host >> INFO Pulling debug logs from the bootstrap machine >> ERROR Attempted to gather debug logs after installation failure: >> failed to create SSH client, ensure the proper ssh key is in your keyring >> or specify with --key: failed to initialize the SSH agent: failed to read >> directory "/output/.ssh": open /output/.ssh: no such file or directory >> FATAL Bootstrap failed to complete: waiting for Kubernetes API: >> context deadline exceeded >> >> - 6 vms are successfully created thin dependent from the >> template >> >> >> - each vm is provisionned by cloud-init >> - the step "INFO Waiting up to 30m0s for the Kubernetes API at >> https://api.test.oc4.localdomain:6443..." fails. It seems that >> the DNS pod is not up at this time. >> - Right this moment, there is no more visibility on what is >> done, what goes wrong... what's happening there? supposing a kind of >> playbook downloading a kind of images... 
>> - The" pull secret step" is not clear: we must have a redhat >> account to https://cloud.redhat.com/openshift/install/ to get a >> key like >> - >> {"auths":{"cloud.openshift.com >> ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": >> "exploit@abes.fr" <exploit@abes.fr>},"quay.io >> ":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email": >> "exploit@abes.fr" <exploit@abes.fr>}," >> registry.connect.redhat.com >> ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": >> "exploit@abes.fr" <exploit@abes.fr>},"registry.redhat.io >> ":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email": >> "exploit@abes.fr" <exploit@abes.fr>}}} >> >> >> Can you tell me if I'm doing wrong? >> > > What is the template you are using? I don't think its RHCOS(Red Hat > CoreOs) template, it looks like Centos? > > Use this gist to import the template > https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b > > Unfortunately, the result is the same with the RHCOS template... >
Make sure that: - the IPs supplied are taken, and belong to the VM network of those master VMs - localdomain or local domain suffix shouldn't be used - your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even
non-existing. When the bootstrap phase will be done, the instllation will teardown the bootsrap mahchine. At this stage if you are using a non-existing domain you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation could resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
I tried several things with your advices, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and network are now good and ignition provisionning is is OK but something goes wrong with the bootstrap vm.
Now if I log into the bootstrap vm, I can see a selinux message, but it may be not relevant...
SELinux: mount invalid. Same Superblock, different security settings for (dev nqueue, type nqueue).
Some other cluewWith journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""} Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image= registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl) Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be again a dns resolution issue.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short 10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short nothing
So what do you think about that?
Key here is the masters - they need to boot, get ignition from the
bootstrap machine and start publishing their IPs and hostnames.
Connect to a master, check its hostname, check its running or failing containers `crictl ps -a` by root user.
You were right: # crictl ps -a CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID 744cb8e654705 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 4 minutes ago Running discovery 75 9462e9a8ca478 912ba9db736c3 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 14 minutes ago Exited discovery 74 9462e9a8ca478
# crictl logs 744cb8e654705 E1107 08:10:04.262330 1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host
# hostname localhost
Conclusion: discovery didn't publish IPs and hostname to coreDNS because the master didn't get its name master-0.test.oc4.localdomain during provisionning phase.
I changed the master-0 hostname and reinitiates ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
After reboot is completed, no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8 2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35 b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787 07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29 fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7 476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787 26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29 30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8 ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35 650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e 13 hours ago Exited coredns 0 2ae0512b3b6ac 481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e 13 hours ago Exited mdns-publisher 0 d49754042b792 3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d 13 hours ago Exited haproxy-monitor 0 3476219058ba8 88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f 1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short 10.34.212.227
CONCLUSION:
- none of rhcos vm is correctly provisionned to their targeted hostname, so they all stay with localhost.
What is your engine version? the hostname support for ignition is merged into 4.3.7 and master
4.3.7.1-1.el7
https://gerrit.ovirt.org/c/100397/ merged 2 days ago, so it will apear in 4.3.7.2.
Sandro when is 4.7.3.2 is due?
You can also use the nightly 4.3 snapshot - it's not really nightly anymore - it's updated per every run of CI Change-Queue, IIUC:
https://www.ovirt.org/develop/dev-process/install-nightly-snapshot.html
I confirm 4.3 snapshot supports hostname change with ignition, now it works out of the box... until this issue :
INFO Cluster operator image-registry Available is False with StorageNotConfigured: storage backend not configured ERROR Cluster operator image-registry Degraded is True with StorageNotConfigured: storage backend not configured INFO Cluster operator insights Disabled is False with : FATAL failed to initialize the cluster: Cluster operator image-registry is still updating
That should have been solved by https://github.com/openshift/cluster-image-registry-operator/pull/406. Can you share `oc describe co/image-registry`?
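For what it's worth, on platforms without a default object storage backend, the commonly documented workaround for StorageNotConfigured is to back the internal registry with ephemeral emptyDir storage (a sketch of that generic workaround, not necessarily what the PR above changes):

oc describe co/image-registry
# workaround: use ephemeral storage for the internal registry (contents are lost on pod restart)
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'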
I only upgraded engine and not vdsm on hosts, but I suppose hosts are not
important for ignition
Correct.
- Cloud-init syntax for the hostname is ok, but it is not provisioned by ignition:
Why not provisionning these hostnames with a json snippet or else?
{ "ignition": { "version": "2.2.0" }, "storage": { "files": [{ "filesystem": "root", "path": "/etc/hostname", "mode": 420, "contents": { "source": "data:,master-0.test.oc4.localdomain" } }] }}
On 07/11/2019 at 14:52, Roy Golan wrote:
On Thu, 7 Nov 2019 at 15:07, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 07/11/2019 at 12:02, Yedidyah Bar David wrote:
On Thu, Nov 7, 2019 at 12:57 PM Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Thu, 7 Nov 2019 at 12:28, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 07/11/2019 at 11:16, Roy Golan wrote:
On Thu, 7 Nov 2019 at 11:23, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 07/11/2019 at 07:18, Roy Golan wrote:
On Thu, 7 Nov 2019 at 00:10, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 05/11/2019 at 21:50, Roy Golan wrote:
On Tue, 5 Nov 2019 at 22:46, Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Tue, 5 Nov 2019 at 20:28, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
On 05/11/2019 at 18:22, Roy Golan wrote:
On Tue, 5 Nov 2019 at 19:12, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 05/11/2019 à 13:54, Roy Golan a écrit : > > > On Tue, 5 Nov 2019 at 14:52, > Nathanaël Blanchet > <blanchet@abes.fr > <mailto:blanchet@abes.fr>> wrote: > > I tried openshift-install > after compiling but no ovirt > provider is available... So > waht do you mean when you > say "give a try"? Maybe only > provisionning ovirt with the > terraform module? > > [root@vm5 installer]# > bin/openshift-install create > cluster > ? Platform [Use arrows to > move, space to select, type > to filter, ? for more help] > > aws > azure > gcp > openstack > > > Its not merged yet. Please pull > this image and work with it as a > container > quay.io/rgolangh/openshift-installer > <http://quay.io/rgolangh/openshift-installer>
A little feedback as you asked:
[root@openshift-installer ~]# docker run -it 56e5b667100f create cluster ? Platform ovirt ? Enter oVirt's api endpoint URL https://air-dev.v100.abes.fr/ovirt-engine/api ? Enter ovirt-engine username admin@internal ? Enter password ********** ? Pick the oVirt cluster Default ? Pick a VM template centos7.x ? Enter the internal API Virtual IP 10.34.212.200 ? Enter the internal DNS Virtual IP 10.34.212.100 ? Enter the ingress IP 10.34.212.50 ? Base Domain oc4.localdomain ? Cluster Name test ? Pull Secret [? for help] ************************************* INFO Creating infrastructure resources... INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443...
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://api.test.oc4.localdomain:6443/apis/config.openshift.io/v1/clusterope...: dial tcp: lookup api.test.oc4.localdomain on 10.34.212.100:53: no such host
INFO Pulling debug logs from the bootstrap machine
ERROR Attempted to gather debug logs after installation failure: failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: failed to initialize the SSH agent: failed to read directory "/output/.ssh": open /output/.ssh: no such file or directory
FATAL Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded
* 6 VMs are successfully created, as thin copies dependent on the template.
* Each VM is provisioned by cloud-init.
* The step "INFO Waiting up to 30m0s for the Kubernetes API at https://api.test.oc4.localdomain:6443..." fails. It seems that the DNS pod is not up at this time.
* At this point there is no more visibility on what is done or what goes wrong... what is happening there? Presumably some kind of playbook downloading some kind of images...
* The "pull secret" step is not clear: we must have a Red Hat account on https://cloud.redhat.com/openshift/install/ to get a key like:

{"auths":{"cloud.openshift.com":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr"},"quay.io":{"auth":"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2V4cGxvaXRhYmVzZnIxdGN0ZnR0dmFnMHpuazMxd2IwMnIwenV1MDg6TE9XVzFQODM1NzNJWlI4MlZDSUEyTFdEVlJJS0U5VTVWM0NTSUdOWjJH********************==","email":"exploit@abes.fr"},"registry.connect.redhat.com":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr"},"registry.redhat.io":{"auth":"NTI0MjkwMnx1aGMtMVRDVEZUVFZBRzBaTkszMXdCMDJSMFp1VTA4OmV5SmhiR2NpT2lKU1V6VXhNaUo5LmV5SnpkV0lpT2lJMk4ySTJNREV3WXpObE1HSTBNbVE0T1RGbVpUZGxa**********************","email":"exploit@abes.fr"}}}

Can you tell me if I'm doing something wrong?
What is the template you are using? I don't think it's an RHCOS (Red Hat CoreOS) template, it looks like CentOS?
Use this gist to import the template https://gist.github.com/rgolangh/adccf6d6b5eaecaebe0b0aeba9d3331b
Unfortunately, the result is the same with the RHCOS template...
Make sure that:
- the IPs supplied are taken, and belong to the VM network of those master VMs
- a localdomain or local domain suffix shouldn't be used
- your ovirt-engine is version 4.3.7 or master
I didn't mention that you can provide any domain name, even a non-existing one. When the bootstrap phase is done, the installation will tear down the bootstrap machine. At this stage, if you are using a non-existing domain, you would need to add the DNS Virtual IP you provided to your resolv.conf so the installation can resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN.
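For example, with the internal DNS Virtual IP entered above, the machine running the installer would get a resolv.conf entry like this (a sketch, adjust to your own VIP):
```
# /etc/resolv.conf on the machine running openshift-install
nameserver 10.34.212.100
```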
Also, you have a log under your $INSTALL_DIR/.openshift_install.log
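You can follow it while the installer waits for the API, for example:
```
tail -f $INSTALL_DIR/.openshift_install.log
```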
I tried several things following your advice, but I'm still stuck at the https://api.test.oc4.localdomain:6443/version?timeout=32s test
with logs:
time="2019-11-06T20:21:15Z" level=debug msg="Still waiting for the Kubernetes API: the server could not find the requested resource"
So it means DNS resolution and networking are now good and ignition provisioning is OK, but something goes wrong with the bootstrap VM.
Now, if I log into the bootstrap VM, I can see an SELinux message, but it may not be relevant...
SELinux: mount invalid. Same superblock, different security settings for (dev mqueue, type mqueue).
Some other clues with journalctl:
journalctl -b -f -u bootkube
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.661Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-7beef51d-daad-4b46-9497-8e135e528f7c/etcd-1.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-1.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-03992fc6-5a87-4160-9b87-44ec6e82f7cd/etcd-2.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-2.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
Nov 06 21:55:40 localhost bootkube.sh[2101]: {"level":"warn","ts":"2019-11-06T21:55:40.662Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-00db28a7-5188-4666-896b-e37c88ad3ae9/etcd-0.test.oc4.localdomain:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup etcd-0.test.oc4.localdomain on 10.34.212.101:53: no such host\""}
Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-1.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-2.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Nov 06 21:55:40 localhost bootkube.sh[2101]: https://etcd-0.test.oc4.localdomain:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Nov 06 21:55:40 localhost bootkube.sh[2101]: Error: unhealthy cluster
Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.720514151 +0000 UTC m=+5.813853296 container died 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl)
Nov 06 21:55:40 localhost podman[61210]: 2019-11-06 21:55:40.817475095 +0000 UTC m=+5.910814273 container remove 7db3014e3f19c61775bac2a7a155eeb8521a6b78fea0d512384dd965cb0b8b01 (image=registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:696a0ba7d344e625ec71d915d74d387fc2a951b879d4d54bdc69d460724c01ae, name=etcdctl)
Nov 06 21:55:40 localhost bootkube.sh[2101]: etcdctl failed. Retrying in 5 seconds...
It seems to be a DNS resolution issue again.
[user1@localhost ~]$ dig api.test.oc4.localdomain +short
10.34.212.201
[user1@localhost ~]$ dig etcd-2.test.oc4.localdomain +short
(no output)
So what do you think about that?
The key here is the masters: they need to boot, get ignition from the bootstrap machine, and start publishing their IPs and hostnames.
Connect to a master, check its hostname, and check its running or failing containers with `crictl ps -a` as the root user.
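A quick way to run those checks on each master (a sketch of the same commands used below):
```
# on a master node, as root
hostname                    # should be the expected master-N.<cluster>.<domain>, not localhost
crictl ps -a                # list running and exited containers
crictl logs <container-id>  # inspect any container stuck in Exited state
```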
You were right:
# crictl ps -a
CONTAINER ID    IMAGE                                                              CREATED          STATE     NAME        ATTEMPT   POD ID
744cb8e654705   e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8   4 minutes ago    Running   discovery   75        9462e9a8ca478
912ba9db736c3   e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8   14 minutes ago   Exited    discovery   74        9462e9a8ca478
# crictl logs 744cb8e654705
E1107 08:10:04.262330       1 run.go:67] error looking up self for candidate IP 10.34.212.227: lookup _etcd-server-ssl._tcp.test.oc4.localdomain on 10.34.212.51:53: no such host
# hostname
localhost
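(For reference, the SRV record that discovery is failing on can also be queried by hand against the internal DNS Virtual IP given to the installer, for example:)
```
dig +short SRV _etcd-server-ssl._tcp.test.oc4.localdomain @<internal DNS Virtual IP>
```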
Conclusion: discovery didn't publish IPs and hostnames to CoreDNS because the master didn't get its name master-0.test.oc4.localdomain during the provisioning phase.
I changed the master-0 hostname and re-initiated ignition to verify:
# hostnamectl set-hostname master-0.test.oc4.localdomain
# touch /boot/ignition.firstboot && rm -rf /etc/machine-id && reboot
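(If I understand the mechanism correctly, the /boot/ignition.firstboot flag makes RHCOS run Ignition again on the next boot, and removing /etc/machine-id makes that boot look like a first boot.)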
After the reboot completed, there is no more exited discovery container:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT POD ID
e701efa8bc583 77ec5e26cc676ef2bf5c42dd40e55394a11fb45a3e2d7e95cbaf233a1eef472f 20 seconds ago Running coredns 1 cbabc53322ac8
2c7bc6abb5b65 d73eca122bd567a3a1f70fa5021683bc17dd87003d05d88b1cdd0215c55049f6 20 seconds ago Running mdns-publisher 1 6f8914ff9db35
b3f619d5afa2c 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running haproxy-monitor 1 0e5c209496787
07769ce79b032 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 21 seconds ago Running keepalived-monitor 1 02cf141d01a29
fb20d66b81254 e77034cf36baff5e625acbba15331db68e1d84571f977d254fd833341158daa8 21 seconds ago Running discovery 77 562f32067e0a7
476b07599260e 86a34bc5edd3e70073313f97bfd51ed8937658b341dc52334fb98ea6896ebdc2 22 seconds ago Running haproxy 1 0e5c209496787
26b53050a412b 9f94e500f85a735ec212ffb7305e0b63f7151a5346e41c2d5d293c8456f6fa42 22 seconds ago Running keepalived 1 02cf141d01a29
30ce48453854b 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 cbabc53322ac8
ad3ab0ae52077 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 22 seconds ago Exited render-config 1 6f8914ff9db35
650d62765e9e1 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:9a7e8297f7b70ee3a11fcbe4a78c59a5861e1afda5657a7437de6934bdc2458e 13 hours ago Exited coredns 0 2ae0512b3b6ac
481969ce49bb9 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:768194132b4dbde077e32de8801c952265643da00ae161f1ee560fabf6ed1f8e 13 hours ago Exited mdns-publisher 0 d49754042b792
3594d9d261ca7 registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:8c3b0223699801a9cb93246276da4746fa4d6fa66649929b2d9b702c17dac75d 13 hours ago Exited haproxy-monitor 0 3476219058ba8
88b13ec02a5c1 7aa184de043265814f9a775968234ac3280a285056da773f1aba0917e9615370 13 hours ago Exited keepalived-monitor 0 a3e13cf07c04f
1ab721b5599ed registry.svc.ci.openshift.org/origin/4.3-2019-10-29-180250@sha256:629d73fdd28beafda788d2248608f90c7ed048357e250f3e855b9462b92cfe60 13 hours ago
because DNS registration is OK:
[user1@master-0 ~]$ dig etcd-0.test.oc4.localdomain +short
10.34.212.227
CONCLUSION:
* None of the RHCOS VMs is correctly provisioned with its targeted hostname, so they all stay on "localhost".
What is your engine version? The hostname support for ignition is merged into 4.3.7 and master.
4.3.7.1-1.el7
https://gerrit.ovirt.org/c/100397/ merged 2 days ago, so it will appear in 4.3.7.2.
Sandro, when is 4.3.7.2 due?
You can also use the nightly 4.3 snapshot - it's not really nightly anymore - it's updated on every run of the CI Change-Queue, IIUC:
https://www.ovirt.org/develop/dev-process/install-nightly-snapshot.html
I confirm the 4.3 snapshot supports hostname change with ignition; now it works out of the box... until this issue:
INFO Cluster operator image-registry Available is False with StorageNotConfigured: storage backend not configured
ERROR Cluster operator image-registry Degraded is True with StorageNotConfigured: storage backend not configured
INFO Cluster operator insights Disabled is False with :
FATAL failed to initialize the cluster: Cluster operator image-registry is still updating
That should have been solved by https://github.com/openshift/cluster-image-registry-operator/pull/406
Can you share `oc describe co/image-registry` ?
Where should I type this command, on a master or a worker? I wasn't able to succeed with "oc login":
error: Missing or incomplete configuration info. Please login or point to an existing, complete config file...
I only upgraded the engine and not vdsm on the hosts, but I suppose the hosts are not important for ignition
Correct.
* Cloud-init syntax for the hostname is OK, but it is not provisioned by ignition:
Why not provision these hostnames with a JSON snippet or something similar?

{
  "ignition": { "version": "2.2.0" },
  "storage": {
    "files": [{
      "filesystem": "root",
      "path": "/etc/hostname",
      "mode": 420,
      "contents": { "source": "data:,master-0.test.oc4.localdomain" }
    }]
  }
}

On Thu, 7 Nov 2019 at 16:41, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Where should I type this command, on a master or a worker? I wasn't able to succeed with "oc login":
error: Missing or incomplete configuration info. Please login or point to an existing, complete config file...
Thanks for coming back with questions. I really supplied little info for that one.
'oc' needs a configuration file with the credentials and a certificate to interact with the cluster. When you invoke 'openshift-installer' it creates an 'auth' directory, and there you have the desired 'auth/kubeconfig' file. Do note that the certificate will look to resolve api.$CLUSTER_NAME.$CLUSTER_DOMAIN. On your end, that would be *api.test.oc4.localdomain*. Since the cluster is serving its internal DNS, you can add 10.34.212.100 to your resolv.conf for now. Now you interact with the cluster:
```
$ KUBECONFIG=auth/kubeconfig
$ # have some bash completion for a smooth experience
$ source <(oc completion bash)
$ # let's see the cluster nodes
$ oc get nodes
```
Next you want to consult this troubleshooting doc: https://github.com/openshift/installer/blob/master/docs/user/troubleshooting...
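Once oc can reach the API, the overall operator status can be checked too, for instance:
```
$ oc get clusteroperators        # summary of all cluster operators
$ oc describe co/image-registry  # details for a single operator
```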
I only upgraded engine and not vdsm on hosts, but I suppose hosts are
not important for ignition
Correct.
- Cloud-init syntax for the hostname is OK, but it is not provisioned by ignition:
Why not provision these hostnames with a JSON snippet or something similar?
{ "ignition": { "version": "2.2.0" }, "storage": { "files": [{ "filesystem": "root", "path": "/etc/hostname", "mode": 420, "contents": { "source": "data:,master-0.test.oc4.localdomain" } }] }}

On 07/11/2019 at 21:35, Roy Golan wrote:
That should have been solved by https://github.com/openshift/cluster-image-registry-operator/pull/406
Is it merged? What should I do now, wait for a new release? I can't test the installation further until this is fixed. It seems to be a storage backend issue for the registry container. I noticed the openshift version is 4.3; is it possible to install a previous stable one? Does it mean all other openshift-installer providers are affected by this bug?
Can you share `oc describe co/image-registry` ?
I finally managed to connect to the cluster:

./oc describe co/image-registry
Name:         image-registry
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2019-11-07T12:33:15Z
  Generation:          1
  Resource Version:    10637
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/image-registry
  UID:                 28c491ad-7e89-4269-a6e3-8751085e7653
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-11-07T12:31:52Z
    Message:               storage backend not configured
    Reason:                StorageNotConfigured
    Status:                False
    Type:                  Available
    Last Transition Time:  2019-11-07T12:31:52Z
    Message:               Unable to apply resources: storage backend not configured
    Reason:                Error
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2019-11-07T12:31:52Z
    Message:               storage backend not configured
    Reason:                StorageNotConfigured
    Status:                True
    Type:                  Degraded
  Extension:               <nil>
  Related Objects:
    Group:     imageregistry.operator.openshift.io
    Name:      cluster
    Resource:  configs
    Group:
    Name:      openshift-image-registry
    Resource:  namespaces
Events:  <none>
Note: I had to use `export KUBECONFIG=auth/kubeconfig`; if the export is omitted, oc doesn't connect.

On 08/11/2019 at 11:20, Nathanaël Blanchet wrote:
That should have been solved by https://github.com/openshift/cluster-image-registry-operator/pull/406
Nothing helped me; what am I supposed to do now to get a working cluster?
thank you
Is it merged? What should I do now, wait for a new release? I can't test the installation further until this is fixed.
It seems to be a storage backend issue for the registry container.
I noticed the OpenShift version is 4.3; is it possible to install a previous stable one?
Does it mean all other openshift-installer providers are affected by this bug?
Can you share `oc describe co/image-registry` ?
I finally managed to connect to the cluster:
./oc describe co/image-registry
Name:         image-registry
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2019-11-07T12:33:15Z
  Generation:          1
  Resource Version:    10637
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/image-registry
  UID:                 28c491ad-7e89-4269-a6e3-8751085e7653
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-11-07T12:31:52Z
    Message:               storage backend not configured
    Reason:                StorageNotConfigured
    Status:                False
    Type:                  Available
    Last Transition Time:  2019-11-07T12:31:52Z
    Message:               Unable to apply resources: storage backend not configured
    Reason:                Error
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2019-11-07T12:31:52Z
    Message:               storage backend not configured
    Reason:                StorageNotConfigured
    Status:                True
    Type:                  Degraded
  Extension:  <nil>
  Related Objects:
    Group:     imageregistry.operator.openshift.io
    Name:      cluster
    Resource:  configs
    Group:
    Name:      openshift-image-registry
    Resource:  namespaces
Events:  <none>
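The Degraded condition above is the registry operator waiting for storage. On platforms without a default object store, a commonly used workaround for throwaway test clusters is to hand the registry ephemeral emptyDir storage; whether that is appropriate for this oVirt setup is an assumption on my part, not something confirmed in this thread.

```
# Hedged workaround sketch: give the image registry ephemeral storage.
# Only suitable for test clusters, the registry content is lost on restart.
oc patch configs.imageregistry.operator.openshift.io cluster \
  --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
```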
-- Nathanaël Blanchet Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr

On Tue, 12 Nov 2019 at 10:22, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 08/11/2019 at 11:20, Nathanaël Blanchet wrote:
That should have been solved by https://github.com/openshift/cluster-image-registry-operator/pull/406
Nothing helped me; what am I supposed to do now to get a working cluster?
thank you
This exact image is passing CI. Anyhow, some more changes are coming that will have a good impact on resources (essentially spinning up the workers later in the process rather than immediately). Can you try to run the install again? What is the output of:

oc get -o json clusterversion

On 19/11/2019 at 08:55, Roy Golan wrote:
oc get -o json clusterversion
This is the output of the previous failed deployment; I'll give a newer one a try when I have a minute to test (do I need to use the terraform-workers tag instead of latest?).

docker pull quay.io/rgolangh/openshift-installer:terraform-workers

[root@openshift-installer openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit]# ./oc get -o json clusterversion
{
  "apiVersion": "v1",
  "items": [
    {
      "apiVersion": "config.openshift.io/v1",
      "kind": "ClusterVersion",
      "metadata": {
        "creationTimestamp": "2019-11-07T12:23:06Z",
        "generation": 1,
        "name": "version",
        "namespace": "",
        "resourceVersion": "3770202",
        "selfLink": "/apis/config.openshift.io/v1/clusterversions/version",
        "uid": "77600bba-6e71-4b35-a60b-d8ee6e0f545c"
      },
      "spec": {
        "channel": "stable-4.3",
        "clusterID": "6f87b719-e563-4c0b-ab5a-1144172bc983",
        "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
      },
      "status": {
        "availableUpdates": null,
        "conditions": [
          {
            "lastTransitionTime": "2019-11-07T12:23:12Z",
            "status": "False",
            "type": "Available"
          },
          {
            "lastTransitionTime": "2019-11-07T12:56:15Z",
            "message": "Cluster operator image-registry is still updating",
            "reason": "ClusterOperatorNotAvailable",
            "status": "True",
            "type": "Failing"
          },
          {
            "lastTransitionTime": "2019-11-07T12:23:12Z",
            "message": "Unable to apply 4.3.0-0.okd-2019-10-29-180250: the cluster operator image-registry has not yet successfully rolled out",
            "reason": "ClusterOperatorNotAvailable",
            "status": "True",
            "type": "Progressing"
          },
          {
            "lastTransitionTime": "2019-11-07T12:23:12Z",
            "message": "Unable to retrieve available updates: currently installed version 4.3.0-0.okd-2019-10-29-180250 not found in the \"stable-4.3\" channel",
            "reason": "RemoteFailed",
            "status": "False",
            "type": "RetrievedUpdates"
          }
        ],
        "desired": {
          "force": false,
          "image": "registry.svc.ci.openshift.org/origin/release@sha256:68286e07f7d68ebc8a067389aabf38dee9f9b810c5520d6ee4593c38eb48ddc9",
          "version": "4.3.0-0.okd-2019-10-29-180250"
        },
        "history": [
          {
            "completionTime": null,
            "image": "registry.svc.ci.openshift.org/origin/release@sha256:68286e07f7d68ebc8a067389aabf38dee9f9b810c5520d6ee4593c38eb48ddc9",
            "startedTime": "2019-11-07T12:23:12Z",
            "state": "Partial",
            "verified": false,
            "version": "4.3.0-0.okd-2019-10-29-180250"
          }
        ],
        "observedGeneration": 1,
        "versionHash": "-3onP9QpPTg="
      }
    }
  ],
  "kind": "List",
  "metadata": {
    "resourceVersion": "",
    "selfLink": ""
  }
}

Can you answer these few questions please?

* The latest stable OKD version is 4.2.4. Is it possible to choose the version of OKD when deploying (it seems to use 4.3), or does the installer always download the latest OKD?
* Can we use FCOS instead of RHCOS?
* About the pull secret: do we absolutely need a Red Hat login to get this file to deploy an upstream OKD cluster rather than downstream OpenShift?

-- Nathanaël Blanchet Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr
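A shorter way to surface just the operator conditions from that output, assuming jq is installed (plain oc jsonpath would do as well):

```
# Print only the condition type/status/message triples from the ClusterVersion.
oc get clusterversion version -o json \
  | jq '.status.conditions[] | {type, status, message}'
```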

On Tue, 19 Nov 2019 at 14:34, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 19/11/2019 at 08:55, Roy Golan wrote:
oc get -o json clusterversion
This is the output of the previous failed deployment; I'll give a newer one a try when I have a minute to test.
(do I need to use the terraform-workers tag instead of latest?)
docker pull quay.io/rgolangh/openshift-installer:terraform-workers
    "version": "4.3.0-0.okd-2019-10-29-180250"
Indeed this version is not the latest and is missing the aforementioned fix for the registry.
"history": [ { "completionTime": null, "image": " registry.svc.ci.openshift.org/origin/release@sha256:68286e07f7d68ebc8a067389aabf38dee9f9b810c5520d6ee4593c38eb48ddc9 ", "startedTime": "2019-11-07T12:23:12Z", "state": "Partial", "verified": false, "version": "4.3.0-0.okd-2019-10-29-180250" } ], "observedGeneration": 1, "versionHash": "-3onP9QpPTg=" } } ], "kind": "List", "metadata": { "resourceVersion": "", "selfLink": "" }
}
Can you answer to these few questions please?
- The latest stable OKD version is 4.2.4. Is it possible to chose the version of okd when deploying (seems to use 4.3) or does the installer always download the latest OKD?
- Can we use FCOS instead of RHCOS?
- About the pull secret, do we absolutely need a redhat login to get this file to deploy an upstream OKD cluster and not downstream openshift?
To answer the three questions: this specific build is not really OKD; it will use 4.3 and Red Hat artifacts and must use RHCOS, hence the pull-secret requirement. I frankly don't know when OKD 4.3 is going to be released; I guess it will be on top of FCOS. I'll update the list once we have the oVirt installer for OKD ready for testing (on FCOS).

On 19/11/2019 at 13:43, Roy Golan wrote:
On Tue, 19 Nov 2019 at 14:34, Nathanaël Blanchet <blanchet@abes.fr> wrote:
On 19/11/2019 at 08:55, Roy Golan wrote:
oc get -o json clusterversion
This is the output of the previous failed deployment; I'll give a newer one a try when I have a minute to test.
Without changing anything in the template, I gave it a new try and... nothing works anymore now: none of the provided IPs can be pinged (dial tcp 10.34.212.51:6443: connect: no route to host), so none of the masters can be provisioned by the bootstrap. I tried with the latest RHCOS and the latest oVirt 4.3.7; it is the same. Obviously something changed since my first attempt 12 days ago... Is your docker image for openshift-installer up to date? Are you still able, on your side, to deploy a valid cluster?
-- Nathanaël Blanchet Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr

On 19/11/2019 at 19:23, Nathanaël Blanchet wrote:
I investigated looking at the bootstrap logs (attached) and it seems that every container dies immediately after being started.

Nov 20 07:02:33 localhost podman[2024]: 2019-11-20 07:02:33.60107571 +0000 UTC m=+0.794838407 container init 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:33 localhost podman[2024]: 2019-11-20 07:02:33.623197173 +0000 UTC m=+0.816959853 container start 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:33 localhost podman[2024]: 2019-11-20 07:02:33.623814258 +0000 UTC m=+0.817576965 container attach 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:34 localhost systemd[1]: libpod-446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603.scope: Consumed 814ms CPU time
Nov 20 07:02:34 localhost podman[2024]: 2019-11-20 07:02:34.100569998 +0000 UTC m=+1.294332779 container died 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:35 localhost podman[2024]: 2019-11-20 07:02:35.138523102 +0000 UTC m=+2.332285844 container remove 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)

and this:

Nov 20 07:04:16 localhost hyperkube[1909]: E1120 07:04:16.489527 1909 remote_runtime.go:200] CreateContainer in sandbox "58f2062aa7b6a5b2bdd6b9cf7b41a9f94ca2b30ad5a20e4fa4dec8a9b82f05e5" from runtime service failed: rpc error: code = Unknown desc = container create failed: container_linux.go:345: starting container process caused "exec: \"runtimecfg\": executable file not found in $PATH"
Nov 20 07:04:16 localhost hyperkube[1909]: E1120 07:04:16.489714 1909 kuberuntime_manager.go:783] init container start failed: CreateContainerError: container create failed: container_linux.go:345: starting container process caused "exec: \"runtimecfg\": executable file not found in $PATH"

What do you think about this?
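A minimal sketch of how such bootstrap logs can be collected; the bootstrap IP and SSH key path below are placeholders, not values from this cluster:

```
# Pull the bootstrap journal directly over SSH (IP and key are placeholders).
ssh -i ~/.ssh/id_rsa core@10.34.212.50 \
  'journalctl -b -u bootkube.service -u kubelet.service --no-pager' > bootstrap.log

# Or let the installer collect a log bundle itself.
openshift-install gather bootstrap --bootstrap 10.34.212.50 --master 10.34.212.51
```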
Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr

On Wed, 20 Nov 2019 at 09:49, Nathanaël Blanchet <blanchet@abes.fr> wrote:
What do you think about this?
I'm seeing the same now, checking...

On Thu, 21 Nov 2019 at 08:48, Roy Golan <rgolan@redhat.com> wrote:
I'm seeing the same now, checking...
Because of the move upstream to release OKD, the release images that come with the installer builds I gave you are no longer valid. I need to prepare an installer version with the OKD preview; you can find the details here: https://mobile.twitter.com/smarterclayton/status/1196477646885965824
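Until that build exists, the installer does allow overriding the release payload it deploys. A hedged sketch, where the image reference is purely a placeholder and not a published OKD payload:

```
# Sketch: point openshift-install at an explicit release image instead of the
# payload baked into the binary. The image reference below is a placeholder.
export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=registry.example.com/okd/release:4.3.0-0.okd-preview
./openshift-install create cluster --dir=ovirt-cluster --log-level=debug
```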

On 21/11/2019 at 13:57, Roy Golan wrote:
Because of the move upstream to release OKD, the release images that come with the installer builds I gave you are no longer valid.
I need to prepare an installer version with the OKD preview; you can find the details here: https://mobile.twitter.com/smarterclayton/status/1196477646885965824
It seems that the okd4 release at https://github.com/openshift/okd/releases/download/4.3.0-0.okd-2019-11-15-18... is a compiled binary without the oVirt installer. Can you tell me where the sources are, so that the oVirt extensions can be included in the installer build?
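A rough sketch of what a source build might look like; the ovirt build tag is an assumption on my part (the installer's hack/build.sh gates experimental platforms behind build tags, as it does for libvirt), so check the PR or branch for the real gate:

```
# Sketch only: build openshift-install from source with the oVirt provider.
git clone https://github.com/openshift/installer.git
cd installer
TAGS=ovirt hack/build.sh      # build tag is an assumption, see the PR
./bin/openshift-install version
```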
-- Nathanaël Blanchet Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr

On Thu, 21 Nov 2019 at 16:42, Nathanaël Blanchet <blanchet@abes.fr> wrote:
It seems that the okd4 release at https://github.com/openshift/okd/releases/download/4.3.0-0.okd-2019-11-15-18... is a compiled binary without the oVirt installer. Can you tell me where the sources are, so that the oVirt extensions can be included in the installer build?
Soon the CI will help us build OKD images from pull requests. I'll update you when this becomes available.

Hello Roy,

On 21/11/2019 at 13:57, Roy Golan wrote:
On Thu, 21 Nov 2019 at 08:48, Roy Golan <rgolan@redhat.com <mailto:rgolan@redhat.com>> wrote:
On Wed, 20 Nov 2019 at 09:49, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 19/11/2019 à 19:23, Nathanaël Blanchet a écrit :
Le 19/11/2019 à 13:43, Roy Golan a écrit :
On Tue, 19 Nov 2019 at 14:34, Nathanaël Blanchet <blanchet@abes.fr <mailto:blanchet@abes.fr>> wrote:
Le 19/11/2019 à 08:55, Roy Golan a écrit :
oc get -o json clusterversion
This is the output of the previous failed deployment; I'll try a newer one when I have a minute to test.
Without changing anything in the template, I gave it a new try and... nothing works anymore: none of the provided IPs can be pinged ("dial tcp 10.34.212.51:6443: connect: no route to host"), so none of the masters can be provisioned by the bootstrap node.
I tried with the latest RHCOS and the latest oVirt 4.3.7; it is the same. Obviously something changed since my first attempt 12 days ago... Is your docker image for openshift-installer up to date?
Are you still able, on your side, to deploy a valid cluster?
I investigated the bootstrap logs (attached) and it seems that every container dies immediately after being started.
Nov 20 07:02:33 localhost podman[2024]: 2019-11-20 07:02:33.60107571 +0000 UTC m=+0.794838407 container init 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:33 localhost podman[2024]: 2019-11-20 07:02:33.623197173 +0000 UTC m=+0.816959853 container start 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:33 localhost podman[2024]: 2019-11-20 07:02:33.623814258 +0000 UTC m=+0.817576965 container attach 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:34 localhost systemd[1]: libpod-446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603.scope: Consumed 814ms CPU time
Nov 20 07:02:34 localhost podman[2024]: 2019-11-20 07:02:34.100569998 +0000 UTC m=+1.294332779 container died 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
Nov 20 07:02:35 localhost podman[2024]: 2019-11-20 07:02:35.138523102 +0000 UTC m=+2.332285844 container remove 446dc9b7a04ff3ff4bbcfa6750e3946c084741b39707eb088c9d7ae648e35603 (image=registry.svc.ci.openshift.org/origin/release:4.3, name=eager_cannon)
and this:
Nov 20 07:04:16 localhost hyperkube[1909]: E1120 07:04:16.489527 1909 remote_runtime.go:200] CreateContainer in sandbox "58f2062aa7b6a5b2bdd6b9cf7b41a9f94ca2b30ad5a20e4fa4dec8a9b82f05e5" from runtime service failed: rpc error: code = Unknown desc = container create failed: container_linux.go:345: starting container process caused "exec: \"runtimecfg\": executable file not found in $PATH"
Nov 20 07:04:16 localhost hyperkube[1909]: E1120 07:04:16.489714 1909 kuberuntime_manager.go:783] init container start failed: CreateContainerError: container create failed: container_linux.go:345: starting container process caused "exec: \"runtimecfg\": executable file not found in $PATH"
What do you think about this?
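For the record, this is roughly how I collected the attached bootstrap logs (the gather subcommand may not exist in every installer build, and the IPs below are placeholders for my bootstrap and master addresses):

openshift-install gather bootstrap --bootstrap <bootstrap-ip> --master <master-ip>   # bundles bootstrap and control-plane logs into a tarball
ssh core@<bootstrap-ip> journalctl -b -u bootkube.service                            # follow the bootstrap control-plane rendering
ssh core@<bootstrap-ip> sudo podman ps -a                                            # see the short-lived containers listed above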
I'm seeing the same now, checking...
Because of the upstream move to release OKD, the release image that comes with the installer I gave you is no longer valid.
I need to prepare an installer version with the OKD preview; you can find the details here: https://mobile.twitter.com/smarterclayton/status/1196477646885965824
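If you want to keep experimenting in the meantime, a rough sketch (unsupported, and the payload tags will keep moving): you can check which release payload an installer binary is pinned to, inspect that payload, and override it with an environment variable:

openshift-install version                                                    # prints the release image the binary will deploy
oc adm release info registry.svc.ci.openshift.org/origin/release:4.3        # lists the component images in a payload (needs a valid pull secret)
export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=registry.svc.ci.openshift.org/origin/release:4.3
openshift-install create cluster --dir=ovirt-cluster --log-level=debug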
I tested your latest openshift-installer container on quay.io, but the ovirt provider is not available anymore. Will oVirt be supported as an OKD 4.2 IaaS provider?
(do I need to use the terraform-workers tag instead of latest?)
docker pull quay.io/rgolangh/openshift-installer:terraform-workers
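This is how I am trying to run it, in case I am doing something wrong (an assumption on my side, since I am not sure what the entrypoint of your image is, so the last command may need adjusting):

docker run -it --rm -v "$PWD/assets:/assets" \
    quay.io/rgolangh/openshift-installer:terraform-workers \
    create cluster --dir=/assets --log-level=debug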
-- Nathanaël Blanchet Supervision réseau SIRE 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr

The merge window is now open for the master branches of the various origin components. Post-merge there should be an OKD release - this is not under my control, but I'll let you know when it is available.

On Mon, 6 Jan 2020 at 20:54, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Hello Roy

Hello,

It seems that the work for including oVirt as a provider in the master branch of the openshift installer has been done. I compiled the master code and ovirt does appear in the survey. I don't have much time to test it right now, but is it operational? If yes, I will make it a priority to have a look at it.

Thanks.

On 06/01/2020 at 21:30, Roy Golan wrote:
The merge window is now open for the master branches of the various origin components. Post-merge there should be an OKD release - this is not under my control, but I'll let you know when it is available.
-- Nathanaël Blanchet Supervision réseau SIRE 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr

On Fri, 21 Feb 2020 at 14:52, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Hello,
It seems that the work for including oVirt as a provider in the master branch of the openshift installer has been done. I compiled the master code and ovirt does appear in the survey. I don't have much time to test it right now, but is it operational? If yes, I will make it a priority to have a look at it.
It is operational, yes - which OS are you going to use?
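For reference, this is roughly how master can be tried today (a sketch; the exact survey prompts depend on the commit you build):

git clone https://github.com/openshift/installer
cd installer
./hack/build.sh                                              # produces bin/openshift-install
./bin/openshift-install create cluster --dir=ovirt-test --log-level=debug
# the survey offers ovirt as a platform and asks for the engine URL, credentials,
# cluster, storage domain and the API/Ingress VIPs (exact questions depend on the commit)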
participants (4)
- Nathanaël Blanchet
- Roy Golan
- Sandro Bonazzola
- Yedidyah Bar David