[ovirt-users] 4.0 - 2nd node fails on deploy
Jason Jeffrey
jason at sudo.co.uk
Mon Oct 3 17:56:03 EDT 2016
Hi,
Another problem has appeared: after rebooting the primary, the VM will not start.
It appears the symlink between the gluster mount reference and vdsm is broken.
From broker.log:
Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata
[root@dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/
total 9
drwxrwx---. 2 vdsm kvm 4096 Oct 3 17:27 .
drwxr-xr-x. 5 vdsm kvm 4096 Oct 3 17:17 ..
lrwxrwxrwx. 1 vdsm kvm 132 Oct 3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4
lrwxrwxrwx. 1 vdsm kvm 132 Oct 3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93
[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/
ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory
The file appears to be there on the brick, though.
Gluster is set up as xpool/engine:
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd
/xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al
total 2060
drwxr-xr-x. 2 vdsm kvm 4096 Oct 3 17:17 .
drwxr-xr-x. 6 vdsm kvm 4096 Oct 3 17:17 ..
-rw-rw----. 2 vdsm kvm 1028096 Oct 3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93
-rw-rw----. 2 vdsm kvm 1048576 Oct 3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease
-rw-r--r--. 2 vdsm kvm 283 Oct 3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta
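For anyone who wants to script the check above, here is a minimal Python sketch. It lists the symlinks in a directory (such as the ha_agent directory) and reports those whose targets do not exist. The function name find_dangling_symlinks is hypothetical and is not part of vdsm or ovirt-hosted-engine-ha. The sketch only detects broken links; it does not recreate anything under /var/run/vdsm/storage.

```python
import os


def find_dangling_symlinks(directory):
    """Return (link_path, target) pairs for symlinks whose target is missing."""
    dangling = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.islink(path):
            continue
        target = os.readlink(path)
        # os.path.exists() follows the link, so it is False for a broken link
        if not os.path.exists(path):
            dangling.append((path, target))
    return dangling
```

Calling find_dangling_symlinks() on the ha_agent directory listed earlier would be expected to return both hosted-engine.lockspace and hosted-engine.metadata while /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/ is absent.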
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info
Volume Name: data
Type: Replicate
Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/data/brick
Brick2: dcastor03:/xpool/data/brick
Brick3: dcastor02:/xpool/data/bricky (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: engine
Type: Replicate
Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/engine/brick
Brick2: dcastor02:/xpool/engine/brick
Brick3: dcastor03:/xpool/engine/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: export
Type: Replicate
Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor02:/xpool/export/brick
Brick2: dcastor03:/xpool/export/brick
Brick3: dcastor01:/xpool/export/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: iso
Type: Replicate
Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/iso/brick
Brick2: dcastor02:/xpool/iso/brick
Brick3: dcastor03:/xpool/iso/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status
Status of volume: data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/data/brick           49153     0          Y       3076
Brick dcastor03:/xpool/data/brick           49153     0          Y       3019
Brick dcastor02:/xpool/data/bricky          49153     0          Y       3857
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/engine/brick         49152     0          Y       3131
Brick dcastor02:/xpool/engine/brick         49152     0          Y       3852
Brick dcastor03:/xpool/engine/brick         49152     0          Y       2992
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: export
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor02:/xpool/export/brick         49155     0          Y       3872
Brick dcastor03:/xpool/export/brick         49155     0          Y       3147
Brick dcastor01:/xpool/export/brick         49155     0          Y       3150
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: iso
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/iso/brick            49154     0          Y       3152
Brick dcastor02:/xpool/iso/brick            49154     0          Y       3881
Brick dcastor03:/xpool/iso/brick            49154     0          Y       3146
NFS Server on localhost                     2049      0          Y       3097
Self-heal Daemon on localhost               N/A       N/A        Y       3088
NFS Server on dcastor03                     2049      0          Y       3039
Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
NFS Server on dcasrv02                      2049      0          Y       3871
Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
Task Status of Volume iso
------------------------------------------------------------------------------
There are no active volume tasks
Thanks
Jason
From: users-bounces at ovirt.org [mailto:users-bounces at ovirt.org] On Behalf Of Jason Jeffrey
Sent: 03 October 2016 18:40
To: users at ovirt.org
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
Hi,
Setup log attached for primary
Regards
Jason
From: Simone Tiraboschi [mailto:stirabos at redhat.com]
Sent: 03 October 2016 09:27
To: Jason Jeffrey <jason at sudo.co.uk>
Cc: users <users at ovirt.org>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason at sudo.co.uk> wrote:
Hi,
I am trying to build a 3-node HC cluster with a self-hosted engine using gluster.
I have successfully built the 1st node; however, when I attempt to run hosted-engine --deploy on node 2, I get the following error:
[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
[ ERROR ] 'version' is not stored in the HE configuration image
[ ERROR ] Unable to get the answer file from the shared storage
[ ERROR ] Failed to execute stage 'Environment customization': Unable to get the answer file from the shared storage
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Looking at the failure in the log file:
Can you please attach hosted-engine-setup logs from the first host?
2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage
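The log shows the image being piped through 'tar -tvf -' with an empty stdout, followed by the complaint that 'version' is not stored. A rough Python sketch of the same check follows, for inspecting an image by hand. It assumes the configuration image is a plain uncompressed tar archive with a member named version, which is inferred from the log lines only and not taken from the hosted-engine-setup source. The function names are hypothetical.

```python
import os
import tarfile


def config_image_members(image_path):
    """List normalised member names of the tar archive held in the image."""
    try:
        # "r:" means: read an uncompressed tar archive only
        with tarfile.open(image_path, mode="r:") as archive:
            return [os.path.normpath(name) for name in archive.getnames()]
    except tarfile.ReadError:
        # Not readable as a tar archive at all; treat it as having no members
        return []


def has_version(image_path):
    """True if a member called 'version' is present (assumed layout)."""
    return "version" in config_image_members(image_path)
```

An image with no tar members would make has_version() return False, which would be consistent with the empty tar listing in the log above. The image needs to be readable by the calling user (the setup runs dd as vdsm).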
Looking at the detected gluster path - /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
[root@dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
total 1049609
drwxr-xr-x. 2 vdsm kvm 4096 Oct 2 04:46 .
drwxr-xr-x. 6 vdsm kvm 4096 Oct 2 04:46 ..
-rw-rw----. 1 vdsm kvm 1073741824 Oct 2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37
-rw-rw----. 1 vdsm kvm 1048576 Oct 2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease
-rw-r--r--. 1 vdsm kvm 294 Oct 2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta
78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file; is this the engine VM?
Copying the answers file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error :(
(hosted-engine --deploy --config-append=/root/answers.conf)
I also tried on node 3, with the same result.
Happy to provide logs and other debug output.
Thanks
Jason