I am trying to set up Gluster over InfiniBand and mount it in RDMA mode. I have created
a network hook that configures the interfaces with MTU=65520 and CONNECTED_MODE=yes. I
have a second server doing NFS with RDMA over InfiniBand with some VMs on it. When I try
to transfer files to the Gluster storage, the transfer is slow and I see the message
“VDSM {hostname} command GetGlusterVolumeHealInfoVDS failed: Message timeout which can be
caused by communication issues”. This is usually followed by “Host {hostname} is not
responding. It will stay in Connecting state for a grace period of 60 seconds and after
that an attempt to fence the host will be issued.” I just installed the InfiniBand
hardware. The switch is a QLogic 12200-18, with a QLE7340 single-port InfiniBand card in
each of the 3 servers. The error does not always come from the same node; it varies among
the 3. Each of the 3 servers is running the opensm service.
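
To confirm that the network hook's settings actually took effect on each node, something like the following could be run on every server. This is only a rough sketch: it assumes the IPoIB interface is named ib0 (a guess, not stated above) and reads the standard Linux sysfs entries for the interface's IPoIB mode and MTU. The function name check_ipoib is made up for illustration.

```python
from pathlib import Path

# Values the network hook is supposed to apply (taken from the post).
EXPECTED_MODE = "connected"
EXPECTED_MTU = 65520


def check_ipoib(iface="ib0", sysfs_root="/sys/class/net"):
    """Return a list of mismatches between the live settings and the hook's.

    `iface` defaults to ib0, which is an assumption; pass the real name.
    """
    base = Path(sysfs_root) / iface
    # IPoIB interfaces expose "mode" (datagram or connected) and "mtu".
    mode = (base / "mode").read_text().strip()
    mtu = int((base / "mtu").read_text().strip())

    problems = []
    if mode != EXPECTED_MODE:
        problems.append(f"{iface}: mode is {mode!r}, expected {EXPECTED_MODE!r}")
    if mtu != EXPECTED_MTU:
        problems.append(f"{iface}: MTU is {mtu}, expected {EXPECTED_MTU}")
    return problems


if __name__ == "__main__":
    try:
        issues = check_ipoib()
    except OSError as exc:
        # Interface missing or not an IPoIB device (no "mode" file).
        print(f"could not read interface settings: {exc}")
    else:
        for line in issues or ["ib0: mode and MTU match the hook settings"]:
            print(line)
```

If any node reports datagram mode or a smaller MTU, the hook is probably not being applied there, which would be worth ruling out before looking at Gluster itself.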