Re: [ovirt-users] Unable to reactivate host after reboot due to failed Gluster probe

29 Jan 2015

Hello,

Looking into engine.log, I can see that the Gluster peer probe returned
errno 107, but I can't figure out why:

2015-01-29 10:40:03,546 ERROR 
[org.ovirt.engine.core.bll.InitVdsOnUpCommand] 
(DefaultQuartzScheduler_Worker-59) [5977aac5] Could not peer probe the 
gluster server node-03. Error: VdcBLLException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException:
VDSErrorException: Failed to AddGlusterServerVDS, error = Add host failed
error: Probe returned with unknown errno 107
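
For reference, the number itself can be decoded with a few lines of Python.
This is only a sketch: describe_errno is a helper name I made up, and the
result depends on the platform it runs on. On Linux, 107 should be ENOTCONN
("Transport endpoint is not connected"), which would point to a connection
problem rather than a Gluster-specific error code.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numbers to names such as "ENOTCONN";
    # unknown numbers fall back to a placeholder.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = describe_errno(107)
print(f"errno 107 = {name}: {message}")
```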

Just for the record: we use the /etc/hosts method because there is no
way to choose the network interface for Gluster. The three Gluster
peer hosts have modified /etc/hosts files with addresses bound to a
different interface than the ovirtmgmt addresses.

Example:

root@node-03:~ $ cat /etc/hosts
192.168.200.195  node-01
192.168.200.196  node-02
192.168.200.198  node-03

The /etc/hosts file on the engine host isn't modified.
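
Because the nodes and the engine may therefore resolve the same name to
different addresses, it can help to compare the mappings side by side. A
minimal sketch, assuming plain hosts-file syntax; parse_hosts is a
hypothetical helper and the sample text is just the node-03 file from above:

```python
def parse_hosts(text):
    """Parse hosts-file text into a {hostname: address} dictionary."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace, skip empty lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # The first entry for a name wins, as with normal hosts lookups.
            mapping.setdefault(name, address)
    return mapping


NODE_HOSTS = """\
192.168.200.195  node-01
192.168.200.196  node-02
192.168.200.198  node-03
"""

node_view = parse_hosts(NODE_HOSTS)
print(node_view["node-03"])
```

Running the same function over the engine's view of the names would show
whether both sides agree on the address for node-03.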

On 01/29/2015 10:39 AM, Jan Siml wrote:
...
Hello,

we are seeing strange behavior in an oVirt cluster. The version is 3.5.1,
the engine runs on an EL6 machine, and the hosts use EL7 as their operating
system. The cluster uses a GlusterFS-backed storage domain, among
others. Three of the four hosts are peers in the Gluster cluster (3 bricks,
replica 3).

When all hosts are restarted (e.g. due to a power outage), the engine can't
activate them again because the Gluster probe fails. The message shown in
the UI is:

"Gluster command [gluster peer node-03] failed on server node-03."

Checking the Gluster peer and volume status on each host confirms that the
Gluster peers are known to each other and the volume is up.

node-03:~ $ gluster peer status
Number of Peers: 2
Hostname: node-02
Uuid: 3fc36f55-d3a2-4efc-b2f0-31f83ed709d9
State: Peer in Cluster (Connected)
Hostname: node-01
Uuid: 18027b35-971b-4b21-bb3d-df252b4dd525
State: Peer in Cluster (Connected)
node-03:~ $ gluster volume status
Status of volume: glusterfs-1
Gluster process                          Port    Online  Pid
------------------------------------------------------------------------------
Brick node-01:/export/glusterfs/brick    49152   Y       12409
Brick node-02:/export/glusterfs/brick    49153   Y       9978
Brick node-03:/export/glusterfs/brick    49152   Y       10001
Self-heal Daemon on localhost            N/A     Y       10003
Self-heal Daemon on node-01              N/A     Y       11590
Self-heal Daemon on node-02              N/A     Y       9988
Task Status of Volume glusterfs-1
------------------------------------------------------------------------------
There are no active volume tasks

The storage domain in the oVirt UI is fine (active and green) and usable, but
neither the Gluster volume nor any brick is visible in the UI.
If I run the command shown in the UI, it returns:

root@node-03:~ $ gluster peer probe node-03
peer probe: success. Probe on localhost not needed
root@node-03:~ $ gluster --mode=script peer probe node-03 --xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
   <opRet>0</opRet>
   <opErrno>1</opErrno>
   <opErrstr>(null)</opErrstr>
   <output>Probe on localhost not needed</output>
</cliOutput>

Could this be just an engine-side parsing error?
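
To make the question concrete, here is a minimal sketch of how the XML reply
above could be parsed. It does not reproduce what VDSM or the engine
actually do; parse_cli_output and the dictionary keys are names I chose for
illustration. The point is that opRet is 0 (success) while opErrno is 1, so
a caller that checked only opErrno might misread this reply as a failure.

```python
import xml.etree.ElementTree as ET

# The reply from "gluster --mode=script peer probe node-03 --xml" shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
   <opRet>0</opRet>
   <opErrno>1</opErrno>
   <opErrstr>(null)</opErrstr>
   <output>Probe on localhost not needed</output>
</cliOutput>"""


def parse_cli_output(xml_text):
    """Extract the status fields from a gluster CLI XML reply."""
    # Encode to bytes so the parser honours the declared encoding.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        "op_ret": int(root.findtext("opRet")),
        "op_errno": int(root.findtext("opErrno")),
        "op_errstr": root.findtext("opErrstr"),
        "output": root.findtext("output"),
    }


result = parse_cli_output(SAMPLE)
# Treat the probe as successful only based on opRet (an assumption).
probe_ok = result["op_ret"] == 0
print(result, probe_ok)
```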
-- 
Kind regards

Jan Siml