[ovirt-users] 4.0 - 2nd node fails on deploy

Simone Tiraboschi stirabos at redhat.com
Tue Oct 4 04:51:48 EDT 2016


On Mon, Oct 3, 2016 at 11:56 PM, Jason Jeffrey <jason at sudo.co.uk> wrote:

> Hi,
>
>
>
> Another problem has appeared: after rebooting the primary, the VM will not start.
>
>
>
> It appears the symlink between the gluster mount and the vdsm run directory is broken.
>

The first host was correctly deployed, but it seems that you are facing some
issue connecting the storage.
Can you please attach vdsm logs and /var/log/messages from the first host?
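
In the meantime, a quick check you can run on the first host (the paths below are
taken from your ls output, so adjust them if they differ): the broker error below
means hosted-engine.metadata points at /var/run/vdsm/storage/<sd_uuid>/..., which
vdsm only creates once the storage domain images have been prepared. Something
along these lines should show whether the link target exists and, if not, whether
reconnecting the storage recreates it:

  # inspect the dangling link; readlink -e prints nothing if the target is missing
  readlink -e /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata
  ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/

  # ask vdsm to prepare the hosted-engine storage again, then restart the HA services
  hosted-engine --connect-storage
  systemctl restart ovirt-ha-broker ovirt-ha-agent

This is only a sketch of the usual checks, not a guaranteed fix; the vdsm logs
should tell us why the images were not prepared in the first place.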


>
>
> From broker.log
>
>
>
> Thread-169::ERROR::2016-10-04 22:44:16,189::storage_broker::138::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(get_raw_stats_for_service_type) Failed to read metadata from /rhev/data-center/mnt/glusterSD/dcastor01:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/hosted-engine.metadata
>
>
>
> [root at dcasrv01 ovirt-hosted-engine-ha]# ls -al /rhev/data-center/mnt/glusterSD/dcastor01\:engine/bbb70623-194a-46d2-a164-76a4876ecaaf/ha_agent/
>
> total 9
>
> drwxrwx---. 2 vdsm kvm 4096 Oct  3 17:27 .
>
> drwxr-xr-x. 5 vdsm kvm 4096 Oct  3 17:17 ..
>
> lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.lockspace -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/23d81b73-bcb7-4742-abde-128522f43d78/11d6a3e1-1817-429d-b2e0-9051a3cf41a4
>
> lrwxrwxrwx. 1 vdsm kvm  132 Oct  3 17:27 hosted-engine.metadata -> /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/fd44dbf9-473a-496a-9996-c8abe3278390/cee9440c-4eb8-453b-bc04-c47e6f9cbc93
>
>
>
> [root at dcasrv01 /]# ls -al /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/
>
> ls: cannot access /var/run/vdsm/storage/bbb70623-194a-46d2-a164-76a4876ecaaf/: No such file or directory
>
>
>
> Though the file appears to be there on the brick:
>
>
>
> Gluster is set up as xpool/engine:
>
>
>
> [root at dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# pwd
>
> /xpool/engine/brick/bbb70623-194a-46d2-a164-76a4876ecaaf/images/fd44dbf9-473a-496a-9996-c8abe3278390
>
> [root at dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# ls -al
>
> total 2060
>
> drwxr-xr-x. 2 vdsm kvm    4096 Oct  3 17:17 .
>
> drwxr-xr-x. 6 vdsm kvm    4096 Oct  3 17:17 ..
>
> -rw-rw----. 2 vdsm kvm 1028096 Oct  3 20:48 cee9440c-4eb8-453b-bc04-c47e6f9cbc93
>
> -rw-rw----. 2 vdsm kvm 1048576 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.lease
>
> -rw-r--r--. 2 vdsm kvm     283 Oct  3 17:17 cee9440c-4eb8-453b-bc04-c47e6f9cbc93.meta
>
>
>
>
>
>
> [root at dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume info
>
>
>
> Volume Name: data
>
> Type: Replicate
>
> Volume ID: 54fbcafc-fed9-4bce-92ec-fa36cdcacbd4
>
> Status: Started
>
> Number of Bricks: 1 x (2 + 1) = 3
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: dcastor01:/xpool/data/brick
>
> Brick2: dcastor03:/xpool/data/brick
>
> Brick3: dcastor02:/xpool/data/bricky (arbiter)
>
> Options Reconfigured:
>
> performance.readdir-ahead: on
>
> performance.quick-read: off
>
> performance.read-ahead: off
>
> performance.io-cache: off
>
> performance.stat-prefetch: off
>
> cluster.eager-lock: enable
>
> network.remote-dio: enable
>
> cluster.quorum-type: auto
>
> cluster.server-quorum-type: server
>
> storage.owner-uid: 36
>
> storage.owner-gid: 36
>
>
>
> Volume Name: engine
>
> Type: Replicate
>
> Volume ID: dd4c692d-03aa-4fc6-9011-a8dad48dad96
>
> Status: Started
>
> Number of Bricks: 1 x (2 + 1) = 3
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: dcastor01:/xpool/engine/brick
>
> Brick2: dcastor02:/xpool/engine/brick
>
> Brick3: dcastor03:/xpool/engine/brick (arbiter)
>
> Options Reconfigured:
>
> performance.readdir-ahead: on
>
> performance.quick-read: off
>
> performance.read-ahead: off
>
> performance.io-cache: off
>
> performance.stat-prefetch: off
>
> cluster.eager-lock: enable
>
> network.remote-dio: enable
>
> cluster.quorum-type: auto
>
> cluster.server-quorum-type: server
>
> storage.owner-uid: 36
>
> storage.owner-gid: 36
>
>
>
> Volume Name: export
>
> Type: Replicate
>
> Volume ID: 23f14730-d264-4cc2-af60-196b943ecaf3
>
> Status: Started
>
> Number of Bricks: 1 x (2 + 1) = 3
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: dcastor02:/xpool/export/brick
>
> Brick2: dcastor03:/xpool/export/brick
>
> Brick3: dcastor01:/xpool/export/brick (arbiter)
>
> Options Reconfigured:
>
> performance.readdir-ahead: on
>
> storage.owner-uid: 36
>
> storage.owner-gid: 36
>
>
>
> Volume Name: iso
>
> Type: Replicate
>
> Volume ID: b2d3d7e2-9919-400b-8368-a0443d48e82a
>
> Status: Started
>
> Number of Bricks: 1 x (2 + 1) = 3
>
> Transport-type: tcp
>
> Bricks:
>
> Brick1: dcastor01:/xpool/iso/brick
>
> Brick2: dcastor02:/xpool/iso/brick
>
> Brick3: dcastor03:/xpool/iso/brick (arbiter)
>
> Options Reconfigured:
>
> performance.readdir-ahead: on
>
> storage.owner-uid: 36
>
> storage.owner-gid: 36
>
>
>
>
>
> [root at dcasrv01 fd44dbf9-473a-496a-9996-c8abe3278390]# gluster volume status
>
> Status of volume: data
>
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick dcastor01:/xpool/data/brick           49153     0          Y       3076
> Brick dcastor03:/xpool/data/brick           49153     0          Y       3019
> Brick dcastor02:/xpool/data/bricky          49153     0          Y       3857
> NFS Server on localhost                     2049      0          Y       3097
> Self-heal Daemon on localhost               N/A       N/A        Y       3088
> NFS Server on dcastor03                     2049      0          Y       3039
> Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
> NFS Server on dcasrv02                      2049      0          Y       3871
> Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
>
> Task Status of Volume data
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
>
> Status of volume: engine
>
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick dcastor01:/xpool/engine/brick         49152     0          Y       3131
> Brick dcastor02:/xpool/engine/brick         49152     0          Y       3852
> Brick dcastor03:/xpool/engine/brick         49152     0          Y       2992
> NFS Server on localhost                     2049      0          Y       3097
> Self-heal Daemon on localhost               N/A       N/A        Y       3088
> NFS Server on dcastor03                     2049      0          Y       3039
> Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
> NFS Server on dcasrv02                      2049      0          Y       3871
> Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
>
> Task Status of Volume engine
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
>
> Status of volume: export
>
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick dcastor02:/xpool/export/brick         49155     0          Y       3872
> Brick dcastor03:/xpool/export/brick         49155     0          Y       3147
> Brick dcastor01:/xpool/export/brick         49155     0          Y       3150
> NFS Server on localhost                     2049      0          Y       3097
> Self-heal Daemon on localhost               N/A       N/A        Y       3088
> NFS Server on dcastor03                     2049      0          Y       3039
> Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
> NFS Server on dcasrv02                      2049      0          Y       3871
> Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
>
> Task Status of Volume export
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
>
> Status of volume: iso
>
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick dcastor01:/xpool/iso/brick            49154     0          Y       3152
> Brick dcastor02:/xpool/iso/brick            49154     0          Y       3881
> Brick dcastor03:/xpool/iso/brick            49154     0          Y       3146
> NFS Server on localhost                     2049      0          Y       3097
> Self-heal Daemon on localhost               N/A       N/A        Y       3088
> NFS Server on dcastor03                     2049      0          Y       3039
> Self-heal Daemon on dcastor03               N/A       N/A        Y       3114
> NFS Server on dcasrv02                      2049      0          Y       3871
> Self-heal Daemon on dcasrv02                N/A       N/A        Y       3864
>
> Task Status of Volume iso
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
>
>
> Thanks
>
>
>
> Jason
>
>
>
>
>
>
>
> *From:* users-bounces at ovirt.org [mailto:users-bounces at ovirt.org] *On
> Behalf Of *Jason Jeffrey
> *Sent:* 03 October 2016 18:40
> *To:* users at ovirt.org
>
> *Subject:* Re: [ovirt-users] 4.0 - 2nd node fails on deploy
>
>
>
> Hi,
>
>
>
> Setup log attached for primary
>
>
>
> Regards
>
>
>
> Jason
>
>
>
> *From:* Simone Tiraboschi [mailto:stirabos at redhat.com
> <stirabos at redhat.com>]
> *Sent:* 03 October 2016 09:27
> *To:* Jason Jeffrey <jason at sudo.co.uk>
> *Cc:* users <users at ovirt.org>
> *Subject:* Re: [ovirt-users] 4.0 - 2nd node fails on deploy
>
>
>
>
>
>
>
> On Mon, Oct 3, 2016 at 12:45 AM, Jason Jeffrey <jason at sudo.co.uk> wrote:
>
> Hi,
>
>
>
> I am trying to build a 3-node hyperconverged (HC) cluster with a self-hosted
> engine using Gluster.
>
>
>
> I have successfully built the first node; however, when I attempt to run
> hosted-engine --deploy on node 2, I get the following error:
>
>
>
> [WARNING] A configuration file must be supplied to deploy Hosted Engine on
> an additional host.
>
> [ ERROR ] 'version' is not stored in the HE configuration image
>
> [ ERROR ] Unable to get the answer file from the shared storage
>
> [ ERROR ] Failed to execute stage 'Environment customization': Unable to
> get the answer file from the shared storage
>
> [ INFO  ] Stage: Clean up
>
> [ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20161002232505.conf'
>
> [ INFO  ] Stage: Pre-termination
>
> [ INFO  ] Stage: Termination
>
> [ ERROR ] Hosted Engine deployment failed
>
>
>
> Looking at the failure in the log file..
>
>
>
> Can you please attach hosted-engine-setup logs from the first host?
>
>
>
>
>
> 2016-10-02 23:25:05 WARNING otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._customization:151 A configuration file must be supplied to deploy Hosted Engine on an additional host.
>
> 2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:61 _fetch_answer_file
>
> 2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:69 fetching from: /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37
>
> 2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:69 executing: 'sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k'
>
> 2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:70 executing: 'tar -tvf -'
>
> 2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:88 stdout:
>
> 2016-10-02 23:25:05 DEBUG otopi.plugins.gr_he_common.core.remote_answerfile heconflib._dd_pipe_tar:89 stderr:
>
> 2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile heconflib.validateConfImage:111 'version' is not stored in the HE configuration image
>
> 2016-10-02 23:25:05 ERROR otopi.plugins.gr_he_common.core.remote_answerfile remote_answerfile._fetch_answer_file:73 Unable to get the answer file from the shared storage
>
>
>
> Looking at the detected gluster path - /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
>
>
>
> [root at dcasrv02 ~]# ls -al /rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/
>
> total 1049609
>
> drwxr-xr-x. 2 vdsm kvm       4096 Oct  2 04:46 .
>
> drwxr-xr-x. 6 vdsm kvm       4096 Oct  2 04:46 ..
>
> -rw-rw----. 1 vdsm kvm 1073741824 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37
>
> -rw-rw----. 1 vdsm kvm    1048576 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.lease
>
> -rw-r--r--. 1 vdsm kvm        294 Oct  2 04:46 78cb2527-a2e2-489a-9fad-465a72221b37.meta
>
>
>
>
> 78cb2527-a2e2-489a-9fad-465a72221b37 is a 1 GB file - is this the engine VM?
>
>
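
Judging from the setup log above, that 1 GB volume looks like the hosted-engine
configuration image (the one _fetch_answer_file reads), not the engine VM disk.
If it helps, the same dd-and-tar pipe that hosted-engine-setup runs can be
repeated by hand to see whether the archive inside it is readable at all (path
taken from your log, run on a host that has the volume mounted):

  sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/dcastor02:engine/0a021563-91b5-4f49-9c6b-fff45e85a025/images/f055216c-02f9-4cd1-a22c-d6b56a0a8e9b/78cb2527-a2e2-489a-9fad-465a72221b37 bs=4k | tar -tvf -

If tar lists nothing, as in the log where stdout is empty, the configuration
archive on the shared storage was never populated, which is why the
additional-host deploy cannot find the answer file there.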
>
> Copying the answer file from the primary (/etc/ovirt-hosted-engine/answers.conf) to node 2 and rerunning produces the same error :(
>
> (hosted-engine --deploy  --config-append=/root/answers.conf )
>
>
>
> Also tried on node 3, same issues
>
>
>
> Happy to provide logs and other debugs
>
>
>
> Thanks
>
>
>
> Jason
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>