[ovirt-users] 4.0 - 2nd node fails on deploy
Jason Jeffrey
jason at sudo.co.uk
Tue Oct 4 11:09:55 UTC 2016
Hi,
Thanks for the review. Further logs are attached.
Regards
Jason
From: Simone Tiraboschi [mailto:stirabos at redhat.com]
Sent: 04 October 2016 09:52
To: Jason Jeffrey <jason at sudo.co.uk>
Cc: users <users at ovirt.org>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
On Mon, Oct 3, 2016 at 11:56 PM, Jason Jeffrey <jason at sudo.co.uk> wrote:
Hi,
Another problem has appeared: after rebooting the primary, the VM will not start.
It appears the symlink between the gluster mount reference and vdsm is broken.
The first host was correctly deployed, but it seems that you are facing some issue connecting to the storage.
Can you please attach vdsm logs and /var/log/messages from the first host?
From broker.log:
Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata
[root@dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/
total 9
drwxrwx---. 2 vdsm kvm 4096 Oct 3 17:27 .
drwxr-xr-x. 5 vdsm kvm 4096 Oct 3 17:17 ..
lrwxrwxrwx. 1 vdsm kvm 132 Oct 3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4
lrwxrwxrwx. 1 vdsm kvm 132 Oct 3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93
[root@dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/
ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory
Though the file appears to be there.
Gluster is set up as xpool/engine:
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd
/xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al
total 2060
drwxr-xr-x. 2 vdsm kvm 4096 Oct 3 17:17 .
drwxr-xr-x. 6 vdsm kvm 4096 Oct 3 17:17 ..
-rw-rw----. 2 vdsm kvm 1028096 Oct 3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93
-rw-rw----. 2 vdsm kvm 1048576 Oct 3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease
-rw-r--r--. 2 vdsm kvm 283 Oct 3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta
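The broken links described above can be confirmed with a short check. This is only a minimal sketch: the function name find_dangling_links is made up for illustration and is not part of vdsm or ovirt-hosted-engine-ha. It prints every symlink directly inside a directory whose target cannot be reached.

```shell
# Hypothetical helper (not an oVirt tool): list symlinks in DIR
# whose targets do not exist.
find_dangling_links() {
    for link in "$1"/*; do
        # -L is true for a symlink; -e follows the link and fails
        # when the target is missing, so both together mean "dangling".
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            printf '%s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
}
```

Run against the ha_agent directory shown earlier, it would be expected to report both hosted-engine.lockspace and hosted-engine.metadata while /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/ is missing:

find_dangling_links "/rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent"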
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info
Volume Name: data
Type: Replicate
Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/data/brick
Brick2: dcastor03:/xpool/data/brick
Brick3: dcastor02:/xpool/data/bricky (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: engine
Type: Replicate
Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/engine/brick
Brick2: dcastor02:/xpool/engine/brick
Brick3: dcastor03:/xpool/engine/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: export
Type: Replicate
Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor02:/xpool/export/brick
Brick2: dcastor03:/xpool/export/brick
Brick3: dcastor01:/xpool/export/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
Volume Name: iso
Type: Replicate
Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: dcastor01:/xpool/iso/brick
Brick2: dcastor02:/xpool/iso/brick
Brick3: dcastor03:/xpool/iso/brick (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
storage.owner-uid: 36
storage.owner-gid: 36
[root@dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status
Status of volume: data
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/data/brick 49153 0 Y 3076
Brick dcastor03:/xpool/data/brick 49153 0 Y 3019
Brick dcastor02:/xpool/data/bricky 49153 0 Y 3857
NFS Server on localhost 2049 0 Y 3097
Self-heal Daemon on localhost N/A N/A Y 3088
NFS Server on dcastor03 2049 0 Y 3039
Self-heal Daemon on dcastor03 N/A N/A Y 3114
NFS Server on dcasrv02 2049 0 Y 3871
Self-heal Daemon on dcasrv02 N/A N/A Y 3864
Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: engine
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/engine/brick 49152 0 Y 3131
Brick dcastor02:/xpool/engine/brick 49152 0 Y 3852
Brick dcastor03:/xpool/engine/brick 49152 0 Y 2992
NFS Server on localhost 2049 0 Y 3097
Self-heal Daemon on localhost N/A N/A Y 3088
NFS Server on dcastor03 2049 0 Y 3039
Self-heal Daemon on dcastor03 N/A N/A Y 3114
NFS Server on dcasrv02 2049 0 Y 3871
Self-heal Daemon on dcasrv02 N/A N/A Y 3864
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: export
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick dcastor02:/xpool/export/brick 49155 0 Y 3872
Brick dcastor03:/xpool/export/brick 49155 0 Y 3147
Brick dcastor01:/xpool/export/brick 49155 0 Y 3150
NFS Server on localhost 2049 0 Y 3097
Self-heal Daemon on localhost N/A N/A Y 3088
NFS Server on dcastor03 2049 0 Y 3039
Self-heal Daemon on dcastor03 N/A N/A Y 3114
NFS Server on dcasrv02 2049 0 Y 3871
Self-heal Daemon on dcasrv02 N/A N/A Y 3864
Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: iso
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick dcastor01:/xpool/iso/brick 49154 0 Y 3152
Brick dcastor02:/xpool/iso/brick 49154 0 Y 3881
Brick dcastor03:/xpool/iso/brick 49154 0 Y 3146
NFS Server on localhost 2049 0 Y 3097
Self-heal Daemon on localhost N/A N/A Y 3088
NFS Server on dcastor03 2049 0 Y 3039
Self-heal Daemon on dcastor03 N/A N/A Y 3114
NFS Server on dcasrv02 2049 0 Y 3871
Self-heal Daemon on dcasrv02 N/A N/A Y 3864
Task Status of Volume iso
------------------------------------------------------------------------------
There are no active volume tasks
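With this many bricks, the Online column in the status output above is easy to misread. The following is a rough sketch, assuming each brick is printed on a single line in the layout shown (Brick, path, TCP port, RDMA port, Online, Pid); longer brick names that gluster wraps onto two lines would not be handled. The function name offline_bricks is invented for illustration.

```shell
# Hypothetical helper: read `gluster volume status` text on stdin and
# print the path of every brick whose Online column is not "Y".
offline_bricks() {
    # The Online flag is the second-to-last field on a Brick line.
    awk '$1 == "Brick" && $(NF-1) != "Y" { print $2 }'
}
```

Usage would be along the lines of: gluster volume status | offline_bricks. For the output pasted above it should print nothing, since every brick reports Y.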
Thanks
Jason
From: users-bounces at ovirt.org [mailto:users-bounces at ovirt.org] On Behalf Of Jason Jeffrey
Sent: 03 October 2016 18:40
To: users at ovirt.org
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
Hi,
Setup log attached for primary
Regards
Jason
From: Simone Tiraboschi [mailto:stirabos at redhat.com]
Sent: 03 October 2016 09:27
To: Jason Jeffrey <jason at sudo.co.uk>
Cc: users <users at ovirt.org>
Subject: Re: [ovirt-users] 4.0 - 2nd node fails on deploy
On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason at sudo.co.uk> wrote:
Hi,
I am trying to build a 3-node HC cluster with a self-hosted engine using gluster.
I have successfully built the 1st node; however, when I attempt to run hosted-engine --deploy on node 2, I get the following error:
[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
[ ERROR ] 'version' is not stored in the HE configuration image
[ ERROR ] Unable to get the answer file from the shared storage
[ ERROR ] Failed to execute stage 'Environment customization': Unable to get the answer file from the shared storage
[ INFO ] Stage: Clean up
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed
Looking at the failure in the log file:
Can you please attach hosted-engine-setup logs from the first host?
2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:
2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image
2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage
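The dd and tar steps in the log above can be rerun by hand to see what the setup sees in the configuration volume. This is a sketch under assumptions: the two function names are invented here, and the check for a member named "version" is inferred from the error text rather than taken from the heconflib source. The log shows the setup running dd as the vdsm user through sudo; the functions below leave that out, so they would need to be run from a shell with read access to the volume.

```shell
# Hypothetical helper: list the members of a tar archive stored in a
# volume file, using the same dd | tar pipeline as the setup log.
list_conf_image() {
    dd if="$1" bs=4k 2>/dev/null | tar -tf - 2>/dev/null
}

# Hypothetical helper: succeed only if the listing contains a member
# named exactly "version" (assumed to be what the setup looks for).
conf_image_has_version() {
    list_conf_image "$1" | grep -x -e 'version' -e './version' >/dev/null
}
```

For example, conf_image_has_version pointed at the 78cb2527-a2e2-489a-9fad-465a72221b37 volume path from the log would be expected to fail in the situation described here, because the logged tar listing came back empty.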
Looking at the detected gluster path - /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
[root@dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
total 1049609
drwxr-xr-x. 2 vdsm kvm 4096 Oct 2 04:46 .
drwxr-xr-x. 6 vdsm kvm 4096 Oct 2 04:46 ..
-rw-rw----. 1 vdsm kvm 1073741824 Oct 2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37
-rw-rw----. 1 vdsm kvm 1048576 Oct 2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease
-rw-r--r--. 1 vdsm kvm 294 Oct 2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta
78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file. Is this the engine VM?
Copying the answers file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error :(
(hosted-engine --deploy --config-append=/root/answers.conf)
I also tried on node 3, with the same issue.
Happy to provide logs and other debug output.
Thanks
Jason
_______________________________________________
Users mailing list
Users at ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
-------------- next part --------------
A non-text attachment was scrubbed...
Name: messages.gz
Type: application/octet-stream
Size: 1350214 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20161004/4f6b6d85/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vdsm.log.gz
Type: application/octet-stream
Size: 3160609 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20161004/4f6b6d85/attachment-0003.obj>