[ovirt-users] 4.0 - 2nd node fails on deploy

Jason Jeffrey jason at sudo.co.uk
Mon Oct 3 17:56:03 EDT 2016


Hi,

 

Another problem has appeared: after rebooting the primary host, the hosted-engine VM will not start.

 

It appears the symlink is broken between the gluster mount reference and the vdsm runtime storage directory.

 

From broker.log:

 

Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata

 

[root@dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/

total 9

drwxrwx---. 2 vdsm kvm 4096 Oct  3 17:27 .

drwxr-xr-x. 5 vdsm kvm 4096 Oct  3 17:17 ..

lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4

lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93    

 

[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/

ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory   
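The ha_agent entries above are symlinks into /var/run/vdsm/storage, so they can still appear in ls output while their targets are gone after a reboot (the links are dangling). A minimal, self-contained sketch of that distinction, using a temp directory in place of the real ha_agent paths:

```shell
# Simulate a dangling symlink: it lists fine with ls -l but fails any access,
# matching the ENOENT seen above. Temp paths stand in for the ha_agent dir.
dir=$(mktemp -d)
ln -s /nonexistent/target "$dir/hosted-engine.metadata"
ls -l "$dir"                              # the link itself is still visible
for f in "$dir"/*; do
  # -L: it is a symlink; ! -e: its target does not resolve
  [ -L "$f" ] && [ ! -e "$f" ] && echo "dangling: $f"
done
rm -rf "$dir"
```

Running the same loop over the real ha_agent directory would show which of the two links points nowhere.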

 

Though the file appears to be there on the brick:

 

Gluster is set up as xpool/engine.

 

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd

/xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al

total 2060

drwxr-xr-x. 2 vdsm kvm    4096 Oct  3 17:17 .

drwxr-xr-x. 6 vdsm kvm    4096 Oct  3 17:17 ..

-rw-rw----. 2 vdsm kvm 1028096 Oct  3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93

-rw-rw----. 2 vdsm kvm 1048576 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease

-rw-r--r--. 2 vdsm kvm     283 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta   
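Note that the dangling link target and the brick path differ only by the images/ component: /var/run/vdsm/storage/&lt;sd_uuid&gt;/&lt;img_uuid&gt;/&lt;vol_uuid&gt; maps onto &lt;brick&gt;/&lt;sd_uuid&gt;/images/&lt;img_uuid&gt;/&lt;vol_uuid&gt;. A pure string-manipulation sketch of that mapping, using the UUIDs from the listings above (the /xpool/engine/brick root is taken from this particular layout, not a general rule):

```shell
# Map a /var/run/vdsm/storage link target to the equivalent brick-side path.
# Pure string manipulation; no cluster access needed.
target=/var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93
rel=${target#/var/run/vdsm/storage/}   # sd_uuid/img_uuid/vol_uuid
sd=${rel%%/*}                          # storage-domain UUID
rest=${rel#*/}                         # img_uuid/vol_uuid
echo "/xpool/engine/brick/$sd/images/$rest"
# -> /xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93
```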

 

 

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info

 

Volume Name: data

Type: Replicate

Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor01:/xpool/data/brick

Brick2: dcastor03:/xpool/data/brick

Brick3: dcastor02:/xpool/data/bricky (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

performance.quick-read: off

performance.read-ahead: off

performance.io-cache: off

performance.stat-prefetch: off

cluster.eager-lock: enable

network.remote-dio: enable

cluster.quorum-type: auto

cluster.server-quorum-type: server

storage.owner-uid: 36

storage.owner-gid: 36

 

Volume Name: engine

Type: Replicate

Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor01:/xpool/engine/brick

Brick2: dcastor02:/xpool/engine/brick

Brick3: dcastor03:/xpool/engine/brick (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

performance.quick-read: off

performance.read-ahead: off

performance.io-cache: off

performance.stat-prefetch: off

cluster.eager-lock: enable

network.remote-dio: enable

cluster.quorum-type: auto

cluster.server-quorum-type: server

storage.owner-uid: 36

storage.owner-gid: 36

 

Volume Name: export

Type: Replicate

Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor02:/xpool/export/brick

Brick2: dcastor03:/xpool/export/brick

Brick3: dcastor01:/xpool/export/brick (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

storage.owner-uid: 36

storage.owner-gid: 36

 

Volume Name: iso

Type: Replicate

Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a

Status: Started

Number of Bricks: 1 x (2 + 1) = 3

Transport-type: tcp

Bricks:

Brick1: dcastor01:/xpool/iso/brick

Brick2: dcastor02:/xpool/iso/brick

Brick3: dcastor03:/xpool/iso/brick (arbiter)

Options Reconfigured:

performance.readdir-ahead: on

storage.owner-uid: 36

storage.owner-gid: 36                                   

 

 

[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status

Status of volume: data

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor01:/xpool/data/brick           49153     0          Y       3076

Brick dcastor03:/xpool/data/brick           49153     0          Y       3019

Brick dcastor02:/xpool/data/bricky          49153     0          Y       3857

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume data

------------------------------------------------------------------------------

There are no active volume tasks

 

Status of volume: engine

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor01:/xpool/engine/brick         49152     0          Y       3131

Brick dcastor02:/xpool/engine/brick         49152     0          Y       3852

Brick dcastor03:/xpool/engine/brick         49152     0          Y       2992

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume engine

------------------------------------------------------------------------------

There are no active volume tasks

 

Status of volume: export

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor02:/xpool/export/brick         49155     0          Y       3872

Brick dcastor03:/xpool/export/brick         49155     0          Y       3147

Brick dcastor01:/xpool/export/brick         49155     0          Y       3150

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume export

------------------------------------------------------------------------------

There are no active volume tasks

 

Status of volume: iso

Gluster process                             TCP Port  RDMA Port  Online  Pid

------------------------------------------------------------------------------

Brick dcastor01:/xpool/iso/brick            49154     0          Y       3152

Brick dcastor02:/xpool/iso/brick            49154     0          Y       3881

Brick dcastor03:/xpool/iso/brick            49154     0          Y       3146

NFS Server on localhost                     2049      0          Y       3097

Self-heal Daemon on localhost               N/A       N/A        Y       3088

NFS Server on dcastor03                     2049      0          Y       3039

Self-heal Daemon on dcastor03               N/A       N/A        Y       3114

NFS Server on dcasrv02                      2049      0          Y       3871

Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864

 

Task Status of Volume iso

------------------------------------------------------------------------------

There are no active volume tasks

                                                                                  

Thanks

 

Jason

 

 

 

From: users-bounces at ovirt.org On Behalf Of Jason Jeffrey
Sent: 03 October 2016 18:40
To: users at ovirt.org
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

 

Hi,

 

Setup log attached for primary

 

Regards

 

Jason 

 

From: Simone Tiraboschi <stirabos at redhat.com>
Sent: 03 October 2016 09:27
To: Jason Jeffrey <jason at sudo.co.uk>
Cc: users <users at ovirt.org>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy

 

 

 

On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason at sudo.co.uk> wrote:

Hi,

 

I am trying to build a 3-node hyperconverged cluster, with a self-hosted engine using gluster.

 

I have successfully built the 1st node; however, when I attempt to run hosted-engine --deploy on node 2, I get the following error:

 

[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.

[ ERROR ] 'version' is not stored in the HE configuration image

[ ERROR ] Unable to get the answer file from the shared storage

[ ERROR ] Failed to execute stage 'Environment customization': Unable to get the answer file from the shared storage

[ INFO  ] Stage: Clean up

[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'

[ INFO  ] Stage: Pre-termination

[ INFO  ] Stage: Termination

[ ERROR ] Hosted Engine deployment failed    

 

Looking at the failure in the log file:

 

Can you please attach hosted-engine-setup logs from the first host?

 

 

2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:

2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:

2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image

2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage

 

Looking at the detected gluster path: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/

 

[root@dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/

total 1049609

drwxr-xr-x. 2 vdsm kvm       4096 Oct  2 04:46 .

drwxr-xr-x. 6 vdsm kvm       4096 Oct  2 04:46 ..

-rw-rw----. 1 vdsm kvm 1073741824 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37

-rw-rw----. 1 vdsm kvm    1048576 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease

-rw-r--r--. 1 vdsm kvm        294 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta  

 

78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file; is this the engine VM?
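Going by the DEBUG lines, that 1 GB volume is the hosted-engine configuration image rather than the VM disk: setup dd's it out and pipes it through tar, expecting entries such as 'version' and the answer file. A self-contained sketch of that format, simulated with a local temp file (the 'version' contents are illustrative, not the real value):

```shell
# The HE configuration volume is read as a raw file containing a tar archive.
# Build a tiny stand-in locally and list it the same way setup does.
work=$(mktemp -d)
echo "1.3" > "$work/version"               # an intact image carries a 'version' entry
tar -cf "$work/heconf.img" -C "$work" version
dd if="$work/heconf.img" bs=4k 2>/dev/null | tar -tf -
# -> version
rm -rf "$work"
```

On node 2 the same dd | tar pipeline returned empty stdout and stderr, which is why validateConfImage reports that 'version' is not stored in the HE configuration image: the image on shared storage was apparently never populated.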

 

Copying the answer file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error:

(hosted-engine --deploy --config-append=/root/answers.conf)

 

I also tried on node 3; same issue.

 

Happy to provide logs and other debug output.

 

Thanks 

 

Jason

_______________________________________________
Users mailing list
Users at ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

 
