On Mon, Oct 3, 2016 at 11:56 PM, Jason Jeffrey <jason@sudo.co.uk> wrote:

Hi,

 

Another problem has appeared: after rebooting the primary, the VM will not start.

 

It appears that the symlink between the gluster mount reference and vdsm is broken.


The first host was correctly deployed, but it seems that you are facing some issue connecting to the storage.
Can you please attach the vdsm logs and /var/log/messages from the first host?
 

 

From broker.log

 

Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata

 

[root@dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/

total 9

drwxrwx---. 2 vdsm kvm 4096 Oct  3 17:27 .

drwxr-xr-x. 5 vdsm kvm 4096 Oct  3 17:17 ..

lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4

lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93   

 

[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/

ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory  
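
A quick way to confirm which links are affected is to list every symlink in ha_agent whose target is missing. The following is only a minimal sketch; the function name is made up for illustration and nothing here is part of vdsm or ovirt-hosted-engine-ha.

```python
import os


def find_dangling_symlinks(directory):
    """Return (link_path, target) pairs for symlinks whose target is missing."""
    dangling = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # os.path.exists() follows symlinks, so it is False for a broken link.
        if os.path.islink(path) and not os.path.exists(path):
            dangling.append((path, os.readlink(path)))
    return dangling
```

Calling find_dangling_symlinks() on the ha_agent directory shown above would be expected to report both hosted-engine.lockspace and hosted-engine.metadata while /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/ is absent.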

 

Though the file appears to be there:

 

Gluster is set up as xpool/engine

 

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd

/xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al

total 2060

drwxr-xr-x. 2 vdsm kvm    4096 Oct  3 17:17 .

drwxr-xr-x. 6 vdsm kvm    4096 Oct  3 17:17 ..

-rw-rw----. 2 vdsm kvm 1028096 Oct  3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93

-rw-rw----. 2 vdsm kvm 1048576 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease

-rw-r--r--. 2 vdsm kvm     283 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta  
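
To compare the link target with what is actually on storage, the path can be rebuilt from the three UUIDs. This sketch assumes the layout seen in this thread, where the link target is <storage domain>/<image>/<volume> and the data sits under <root>/<storage domain>/images/<image>/<volume>; the function name is hypothetical.

```python
import posixpath


def expected_volume_path(link_target, storage_root):
    """Map .../<sd>/<img>/<vol> to <storage_root>/<sd>/images/<img>/<vol>.

    The layout is inferred from the paths in this thread, not from vdsm docs.
    """
    parts = link_target.rstrip("/").split("/")
    sd_uuid, img_uuid, vol_uuid = parts[-3:]
    return posixpath.join(storage_root, sd_uuid, "images", img_uuid, vol_uuid)
```

With storage_root set to /xpool/engine/brick it yields the brick file listed above; with /rhev/data-center/mnt/glusterSD/dcastor01:engine it yields the same file as seen through the gluster mount.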

 

 

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info

 

Volume Name: data

Type: Replicate

Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor01:/xpool/data/brick

Brick2: dcastor03:/xpool/data/brick

Brick3: dcastor02:/xpool/data/bricky (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

performance.quick-read: off

performance.read-ahead: off

performance.io-cache: off

performance.stat-prefetch: off

cluster.eager-lock: enable

network.remote-dio: enable

cluster.quorum-type: auto

cluster.server-quorum-type: server

storage.owner-uid: 36

storage.owner-gid: 36

 

Volume Name: engine

Type: Replicate

Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor01:/xpool/engine/brick

Brick2: dcastor02:/xpool/engine/brick

Brick3: dcastor03:/xpool/engine/brick (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

performance.quick-read: off

performance.read-ahead: off

performance.io-cache: off

performance.stat-prefetch: off

cluster.eager-lock: enable

network.remote-dio: enable

cluster.quorum-type: auto

cluster.server-quorum-type: server

storage.owner-uid: 36

storage.owner-gid: 36

 

Volume Name: export

Type: Replicate

Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor02:/xpool/export/brick

Brick2: dcastor03:/xpool/export/brick

Brick3: dcastor01:/xpool/export/brick (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

storage.owner-uid: 36

storage.owner-gid: 36

 

Volume Name: iso

Type: Replicate

Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor01:/xpool/iso/brick

Brick2: dcastor02:/xpool/iso/brick

Brick3: dcastor03:/xpool/iso/brick (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

storage.owner-uid: 36

storage.owner-gid: 36                                  

 

 

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status

Status of volume: data

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor01:/xpool/data/brick           49153     0          Y       3076

Brick dcastor03:/xpool/data/brick           49153     0          Y       3019

Brick dcastor02:/xpool/data/bricky          49153     0          Y       3857

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume data

------------------------------------------------------------------------------

There are no active volume tasks

 

Status of volume: engine

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor01:/xpool/engine/brick         49152     0          Y       3131

Brick dcastor02:/xpool/engine/brick         49152     0          Y       3852

Brick dcastor03:/xpool/engine/brick         49152     0          Y       2992

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume engine

------------------------------------------------------------------------------

There are no active volume tasks

 

Status of volume: export

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor02:/xpool/export/brick         49155     0          Y       3872

Brick dcastor03:/xpool/export/brick         49155     0          Y       3147

Brick dcastor01:/xpool/export/brick         49155     0          Y       3150

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume export

------------------------------------------------------------------------------

There are no active volume tasks

 

Status of volume: iso

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor01:/xpool/iso/brick            49154     0          Y       3152

Brick dcastor02:/xpool/iso/brick            49154     0          Y       3881

Brick dcastor03:/xpool/iso/brick            49154     0          Y       3146

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume iso

------------------------------------------------------------------------------

There are no active volume tasks
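
All bricks and daemons show Online = Y above. For repeated checks, the same output can be scanned automatically. This is a rough sketch that assumes the column layout shown in this email (the last four fields being TCP port, RDMA port, Online flag and PID); other gluster versions may print the table differently, and the function name is made up.

```python
def offline_processes(status_text):
    """Return (volume, process) pairs whose Online column is not 'Y'."""
    offline = []
    volume = None
    for line in status_text.splitlines():
        line = line.strip()
        if line.startswith("Status of volume:"):
            volume = line.split(":", 1)[1].strip()
            continue
        fields = line.split()
        # Process rows end with: TCP port, RDMA port, Online flag, PID.
        if len(fields) >= 5 and fields[-2] in ("Y", "N"):
            if fields[-2] != "Y":
                offline.append((volume, " ".join(fields[:-4])))
    return offline
```

Feeding it the captured output of gluster volume status should return an empty list when everything is online, as in the listing above.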

                                                                                 

Thanks

 

Jason

 

 

 

From: users-bounces@ovirt.org [mailto:users-bounces@ovirt.org] On Behalf Of Jason Jeffrey
Sent: 03 October 2016 18:40
To: users@ovirt.org
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

 

Hi,

 

Setup log attached for primary

 

Regards

 

Jason

 

From: Simone Tiraboschi [mailto:stirabos@redhat.com]
Sent: 03 October 2016 09:27
To: Jason Jeffrey <jason@sudo.co.uk>
Cc: users <users@ovirt.org>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

 

 

 

On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason@sudo.co.uk> wrote:

Hi,

 

I am trying to build a 3-node HC cluster with a self-hosted engine using gluster.

 

I have successfully built the 1st node; however, when I attempt to run hosted-engine --deploy on node 2, I get the following error:

 

[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.

[ ERROR ] 'version' is not stored in the HE configuration image

[ ERROR ] Unable to get the answer file from the shared storage

[ ERROR ] Failed to execute stage 'Environment customization': Unable to get the answer file from the shared storage

[ INFO  ] Stage: Clean up

[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'

[ INFO  ] Stage: Pre-termination

[ INFO  ] Stage: Termination

[ ERROR ] Hosted Engine deployment failed   

 

Looking at the failure in the log file:

 

Can you please attach hosted-engine-setup logs from the first host?

 

 

2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:

2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image

2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage
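
The log shows the setup piping the image through tar -tvf and getting empty stdout, which suggests that no archive members could be read from it. The same check can be reproduced by hand. The sketch below assumes the configuration image is a plain, uncompressed tar archive (as the dd | tar pipeline implies) and that the expected member is literally named version; the function names are made up and are not the heconflib API.

```python
import tarfile


def conf_image_members(path):
    """Return the member names of the tar archive at path, or [] if unreadable."""
    try:
        with tarfile.open(path, mode="r:") as archive:
            # Drop a leading "./" so "./version" and "version" compare equal.
            return [n[2:] if n.startswith("./") else n for n in archive.getnames()]
    except tarfile.ReadError:
        return []


def has_version_member(path):
    """True if the archive contains a member named 'version'."""
    return "version" in conf_image_members(path)
```

It needs read access to the volume file (the setup itself reads it as the vdsm user). An empty list from conf_image_members() would be consistent with the empty tar listing in the log.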

 

Looking at the detected gluster path - /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/

 

[root@dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/

total 1049609

drwxr-xr-x. 2 vdsm kvm       4096 Oct  2 04:46 .

drwxr-xr-x. 6 vdsm kvm       4096 Oct  2 04:46 ..

-rw-rw----. 1 vdsm kvm 1073741824 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37

-rw-rw----. 1 vdsm kvm    1048576 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease

-rw-r--r--. 1 vdsm kvm        294 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta 

 

78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file; is this the engine VM?

 

Copying the answers file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error :(

(hosted-engine --deploy  --config-append=/root/answers.conf )

 

I also tried on node 3, with the same issue.

 

Happy to provide logs and other debug output.

 

Thanks

 

Jason

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

 

