On Mar 28, 2018, at 10:59 PM, Artem Tambovskiy
<artem.tambovskiy(a)gmail.com> wrote:
> Hi,
> How many hosts do you have? Check hosted-engine.conf on all hosts, including the one you
> have the problem with, and see whether all host_id values are unique. It might be that
> you have several hosts with host_id=1
Hi Artem,
Thanks. 3 compute hosts, 3 gluster hosts. Checked them all and the host_id values are all
unique (1, 2, 6, 101, 102 and 103), so that isn't the problem.
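For anyone repeating this check on a larger cluster, the comparison can be scripted. This is only a minimal sketch: it assumes the contents of each host's hosted-engine.conf have already been collected into a dict keyed by hostname, and that the file uses plain key=value lines. The function names and sample hostnames are made up for illustration.

```python
from collections import defaultdict


def parse_host_id(conf_text):
    """Return the integer host_id found in key=value style text, or None."""
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "host_id":
            return int(value.strip())
    return None


def find_duplicate_host_ids(confs):
    """Map each host_id used by more than one host to the sorted host names."""
    by_id = defaultdict(list)
    for host, text in confs.items():
        host_id = parse_host_id(text)
        if host_id is not None:
            by_id[host_id].append(host)
    return {hid: sorted(hosts) for hid, hosts in by_id.items() if len(hosts) > 1}


if __name__ == "__main__":
    # Hypothetical sample data; real input would be each host's config file.
    sample = {
        "host1": "host_id=1\n",
        "host2": "host_id=2\n",
        "host3": "host_id=1\n",
    }
    print(find_duplicate_host_ids(sample))
```

An empty result means no two hosts share a host_id, which is what we see here.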
Found another data point that I'm not entirely sure what to do with. Tried reinstalling
the afflicted host - call it host1. Moved the hosted engine (HE) off of it, removed it from
the GUI, and reinstalled it. At that point, the SPM was on host3. Once host1 was back up, we
moved the SPM to host1. The problem stopped on host1 for several hours and then returned.
But most notably, the problem then started happening on host3!
So it seems to be somehow related to, or influenced by, the SPM. And I'm deeply confused.
-j