I am trying to set up Gluster over InfiniBand and mount it in RDMA mode. I have created a network hook that configures the interfaces with MTU=65520 and CONNECTED_MODE=yes. I have a second server doing NFS with RDMA over InfiniBand, with some VMs on it.

When I try to transfer files to the Gluster storage, the transfer takes a long time and I see the message “VDSM {hostname} command GetGlusterVolumeHealInfoVDS failed: Message timeout which can be caused by communication issues”. This is usually followed by “Host {hostname} is not responding. It will stay in Connecting state for a grace period of 60 seconds and after that an attempt to fence the host will be issued.” Which of the 3 nodes the error message comes from varies.

I just installed the InfiniBand hardware. The switch is a QLogic 12200-18, with a QLE7340 single-port InfiniBand card in each of the 3 servers. Each of the 3 servers is running the opensm service.
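In case it helps with diagnosing this, here is a minimal sketch of how the hook's settings could be verified on each node. It assumes the IPoIB interface is named `ib0` (a placeholder, adjust per host) and reads the `mode` and `mtu` files that the Linux IPoIB driver exposes under `/sys/class/net/<iface>/`. The function names are made up for illustration and are not part of VDSM or Gluster.

```python
from pathlib import Path


def read_ipoib_settings(iface="ib0", sysfs_root="/sys/class/net"):
    """Return (mode, mtu) for an IPoIB interface as reported by sysfs.

    `iface` is an assumed interface name; `sysfs_root` is a parameter
    only so the check can be pointed at a test directory.
    """
    base = Path(sysfs_root) / iface
    mode = (base / "mode").read_text().strip()   # "connected" or "datagram"
    mtu = int((base / "mtu").read_text().strip())
    return mode, mtu


def check_ipoib(iface="ib0", expected_mtu=65520, sysfs_root="/sys/class/net"):
    """Return a list of human-readable mismatches (empty list means OK)."""
    mode, mtu = read_ipoib_settings(iface, sysfs_root)
    problems = []
    if mode != "connected":
        problems.append(f"{iface}: mode is {mode!r}, expected 'connected'")
    if mtu != expected_mtu:
        problems.append(f"{iface}: MTU is {mtu}, expected {expected_mtu}")
    return problems


if __name__ == "__main__":
    import sys

    name = sys.argv[1] if len(sys.argv) > 1 else "ib0"
    try:
        issues = check_ipoib(name)
    except OSError as exc:
        # The interface is missing or is not an IPoIB device.
        print(f"could not read settings for {name}: {exc}")
        sys.exit(2)
    for line in issues:
        print(line)
    sys.exit(1 if issues else 0)
```

Running this on each of the 3 servers would show whether any node fell back to datagram mode or a smaller MTU after the hook ran.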

 

Thanks,

Matt