Hi,
Another problem has appeared: after rebooting the primary host, the engine VM will not start.
It looks like the symlinks between the gluster mount and the vdsm run directory are broken.
From broker.log:
Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata
[root@dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/
total 9
drwxrwx---. 2 vdsm kvm 4096 Oct  3 17:27 .
drwxr-xr-x. 5 vdsm kvm 4096 Oct  3 17:17 ..
lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4
lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93
[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/
ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory
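To double-check the diagnosis, I assume the two links can be confirmed as dangling with something like this (paths taken from the listing above, not run here yet):

cd /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent
for l in hosted-engine.lockspace hosted-engine.metadata; do
    readlink "$l"                         # prints the raw link target under /var/run/vdsm/storage
    [ -e "$l" ] || echo "$l is dangling"  # -e follows the link, so it fails when the target is missing
done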
The file itself does appear to be there on the brick, though.
Gluster is set up under /xpool/engine:
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd
/xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al
total 2060
drwxr-xr-x. 2 vdsm kvm    4096 Oct  3 17:17 .
drwxr-xr-x. 6 vdsm kvm    4096 Oct  3 17:17 ..
-rw-rw----. 2 vdsm kvm 1028096 Oct  3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93
-rw-rw----. 2 vdsm kvm 1048576 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease
-rw-r--r--. 2 vdsm kvm     283 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta
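Would something along these lines be the right way to get the links under /var/run/vdsm/storage recreated? I am guessing at the exact commands here and have not run them yet:

# untested guess: reconnect the hosted-engine storage so vdsm rebuilds the run-dir links,
# then restart the HA services so they pick the links up again
hosted-engine --connect-storage
systemctl restart ovirt-ha-broker ovirt-ha-agent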
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info
Volume Name: data
Type: Replicate
Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/data/brick
Brick2: dcastor03:/xpool/data/brick
Brick3: dcastor02:/xpool/data/bricky (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: engine
Type: Replicate
Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/engine/brick
Brick2: dcastor02:/xpool/engine/brick
Brick3: dcastor03:/xpool/engine/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: export
Type: Replicate
Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor02:/xpool/export/brick
Brick2: dcastor03:/xpool/export/brick
Brick3: dcastor01:/xpool/export/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: iso
Type: Replicate
Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/iso/brick
Brick2: dcastor02:/xpool/iso/brick
Brick3: dcastor03:/xpool/iso/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
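For reference, the "1 x (2 + 1) = 3" layout above is replica 3 with one arbiter brick per volume; a volume like engine would normally be created along these lines (illustration only, not the exact command used here), with the ownership options shown above set afterwards:

gluster volume create engine replica 3 arbiter 1 \
    dcastor01:/xpool/engine/brick \
    dcastor02:/xpool/engine/brick \
    dcastor03:/xpool/engine/brick    # last brick listed becomes the arbiter
gluster volume set engine storage.owner-uid 36   # vdsm
gluster volume set engine storage.owner-gid 36   # kvm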
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status
Status of volume: data
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/data/brick            49153     0          Y       3076
Brick dcastor03:/xpool/data/brick            49153     0          Y       3019
Brick dcastor02:/xpool/data/bricky           49153     0          Y       3857
NFS Server on localhost                      2049      0          Y       3097
Self-heal Daemon on localhost                N/A       N/A        Y       3088
NFS Server on dcastor03                      2049      0          Y       3039
Self-heal Daemon on dcastor03                N/A       N/A        Y       3114
NFS Server on dcasrv02                       2049      0          Y       3871
Self-heal Daemon on dcasrv02                 N/A       N/A        Y       3864

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: engine
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/engine/brick          49152     0          Y       3131
Brick dcastor02:/xpool/engine/brick          49152     0          Y       3852
Brick dcastor03:/xpool/engine/brick          49152     0          Y       2992
NFS Server on localhost                      2049      0          Y       3097
Self-heal Daemon on localhost                N/A       N/A        Y       3088
NFS Server on dcastor03                      2049      0          Y       3039
Self-heal Daemon on dcastor03                N/A       N/A        Y       3114
NFS Server on dcasrv02                       2049      0          Y       3871
Self-heal Daemon on dcasrv02                 N/A       N/A        Y       3864

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: export
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor02:/xpool/export/brick          49155     0          Y       3872
Brick dcastor03:/xpool/export/brick          49155     0          Y       3147
Brick dcastor01:/xpool/export/brick          49155     0          Y       3150
NFS Server on localhost                      2049      0          Y       3097
Self-heal Daemon on localhost                N/A       N/A        Y       3088
NFS Server on dcastor03                      2049      0          Y       3039
Self-heal Daemon on dcastor03                N/A       N/A        Y       3114
NFS Server on dcasrv02                       2049      0          Y       3871
Self-heal Daemon on dcasrv02                 N/A       N/A        Y       3864

Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: iso
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/iso/brick             49154     0          Y       3152
Brick dcastor02:/xpool/iso/brick             49154     0          Y       3881
Brick dcastor03:/xpool/iso/brick             49154     0          Y       3146
NFS Server on localhost                      2049      0          Y       3097
Self-heal Daemon on localhost                N/A       N/A        Y       3088
NFS Server on dcastor03                      2049      0          Y       3039
Self-heal Daemon on dcastor03                N/A       N/A        Y       3114
NFS Server on dcasrv02                       2049      0          Y       3871
Self-heal Daemon on dcasrv02                 N/A       N/A        Y       3864

Task Status of Volume iso
------------------------------------------------------------------------------
There are no active volume tasks
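One more check I can run, in case the reboot left pending heals or a split-brain on the engine volume (standard gluster commands, output not captured yet):

gluster volume heal engine info              # any entries still needing heal
gluster volume heal engine info split-brain  # files in split-brain, if any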
Thanks
Jason
From: users-bounces@ovirt.org [mailto:users-bounces@ovirt.org] On Behalf Of Jason Jeffrey
Sent: 03 October 2016 18:40
To: users@ovirt.org
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
Hi,
Setup log attached for primary
Regards
Jason
From: Simone Tiraboschi [mailto:stirabos@redhat.com]
Sent: 03 October 2016 09:27
To: Jason Jeffrey <jason@sudo.co.uk>
Cc: users <users@ovirt.org>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason@sudo.co.uk> wrote:
Hi,
I am trying to build an x3 HC cluster with a self-hosted engine using gluster.
I have successfully built the 1st node; however, when I attempt to run hosted-engine --deploy on node 2, I get the following error:
[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
[ ERROR ] 'version' is not stored in the HE configuration image
[ ERROR ] Unable to get the answer file from the shared storage
[ ERROR ] Failed to execute stage 'Environment customization': Unable to get the answer file from the shared storage
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Looking at the failure in the log file..
Can you please attach hosted-engine-setup logs from the first host?
2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage
Looking at the detected gluster path: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
[root@dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
total 1049609
drwxr-xr-x. 2 vdsm kvm       4096 Oct  2 04:46 .
drwxr-xr-x. 6 vdsm kvm       4096 Oct  2 04:46 ..
-rw-rw----. 1 vdsm kvm 1073741824 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37
-rw-rw----. 1 vdsm kvm    1048576 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease
-rw-r--r--. 1 vdsm kvm        294 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta
78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file; is this the engine VM?
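(As a side note, the check the setup performs on that image can be repeated by hand; based on the dd and tar commands in the setup log above, something like the following, untested here, should list the archive contents, and an empty listing would be consistent with the empty stdout in the log and the 'version' error:)

sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k | tar -tvf -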
Copying the answers file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error :( (hosted-engine --deploy --config-append=/root/answers.conf)
Also tried on node 3; same issue.
Happy to provide logs and other debug output.
Thanks
Jason
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users