On Thu, Feb 2, 2017 at 6:05 PM, Gianluca Cecchi
<gianluca.cecchi(a)gmail.com> wrote:
On Thu, Feb 2, 2017 at 3:51 PM, Nir Soffer <nsoffer(a)redhat.com>
wrote:
>
>
> > Can you confirm that the host can be active when I restart vdsmd
> > service?
>
> Sure. This may abort a storage operation if one is running when you
> restart vdsm, but vdsm is designed so you can restart or kill it safely.
>
> For example, if you abort a disk copy in the middle, the operation will
> fail and the destination disk will be deleted.
>
> If you want to avoid such issues, you can put the host into maintenance,
> but this requires migrating the vms to other hosts.
>
> Nir
OK. Created 50_thin_block_extension_rules.conf under /etc/vdsm/vdsm.conf.d
and restarted vdsmd
One last (latest probably... ;-) question
Is it expected that if I restart vdsmd on the host that is the SPM, then SPM
is shifted to another node?
Yes, the engine will move the SPM to another host when the SPM fails, unless
you disabled the SPM role for all other hosts (see host > spm tab).
Because when restarting vdsmd on the host that is not the SPM I didn't get any
message in the web admin GUI, and the restart of vdsmd itself was very fast.
On the host with the SPM, instead, the command took several seconds and I got
these events:
It is expected that restarting the SPM is slower, but we need to see the vdsm
logs to understand why.
Feb 2, 2017 4:01:23 PM Host ovmsrv05 power management was verified
successfully.
Feb 2, 2017 4:01:23 PM Status of host ovmsrv05 was set to Up.
Feb 2, 2017 4:01:19 PM Executing power management status on Host ovmsrv05
using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Feb 2, 2017 4:01:18 PM Storage Pool Manager runs on Host ovmsrv06 (Address:
ovmsrv06.datacenter.polimi.it).
Feb 2, 2017 4:01:13 PM VDSM ovmsrv05 command failed: Recovering from crash
or Initializing
Feb 2, 2017 4:01:11 PM Host ovmsrv05 is initializing. Message: Recovering
from crash or Initializing
Feb 2, 2017 4:01:11 PM VDSM ovmsrv05 command failed: Recovering from crash
or Initializing
Feb 2, 2017 4:01:11 PM Invalid status on Data Center Default. Setting Data
Center status to Non Responsive (On host ovmsrv05, Error: Recovering from
crash or Initializing).
Feb 2, 2017 4:01:11 PM VDSM ovmsrv05 command failed: Recovering from crash
or Initializing
Feb 2, 2017 4:01:05 PM Host ovmsrv05 is not responding. It will stay in
Connecting state for a grace period of 80 seconds and after that an attempt
to fence the host will be issued.
Feb 2, 2017 4:01:05 PM Host ovmsrv05 is not responding. It will stay in
Connecting state for a grace period of 80 seconds and after that an attempt
to fence the host will be issued.
Feb 2, 2017 4:01:05 PM VDSM ovmsrv05 command failed: Connection reset by
peer
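To put numbers on the events above, here is a minimal Python sketch that parses a few of them and measures the time from the first "Connection reset by peer" to the SPM moving and to the host being Up again. The event text is copied from the list above with wrapped lines rejoined; the helper names (parse_event, seconds_between) are made up for illustration, and the timestamp format is assumed from what the web admin shows here.

```python
from datetime import datetime

# A subset of the events listed above, with wrapped lines rejoined.
EVENTS = [
    "Feb 2, 2017 4:01:23 PM Status of host ovmsrv05 was set to Up.",
    "Feb 2, 2017 4:01:18 PM Storage Pool Manager runs on Host ovmsrv06 "
    "(Address: ovmsrv06.datacenter.polimi.it).",
    "Feb 2, 2017 4:01:11 PM Host ovmsrv05 is initializing. "
    "Message: Recovering from crash or Initializing",
    "Feb 2, 2017 4:01:05 PM VDSM ovmsrv05 command failed: "
    "Connection reset by peer",
]

# Assumed timestamp layout, e.g. "Feb 2, 2017 4:01:23 PM" (English locale).
TIME_FORMAT = "%b %d, %Y %I:%M:%S %p"


def parse_event(line):
    """Split one event line into (timestamp, message)."""
    # The timestamp is the first five space-separated tokens.
    tokens = line.split(" ", 5)
    stamp = datetime.strptime(" ".join(tokens[:5]), TIME_FORMAT)
    return stamp, tokens[5]


def seconds_between(events, first_text, last_text):
    """Seconds from the earliest event containing first_text
    to the latest event containing last_text."""
    first = min(t for t, msg in events if first_text in msg)
    last = max(t for t, msg in events if last_text in msg)
    return (last - first).total_seconds()


parsed = [parse_event(line) for line in EVENTS]
spm_move = seconds_between(parsed, "Connection reset by peer",
                           "Storage Pool Manager runs on")
outage = seconds_between(parsed, "Connection reset by peer",
                         "was set to Up")
print("SPM moved after %.0f s, host Up after %.0f s" % (spm_move, outage))
```

For these events the SPM moved to ovmsrv06 about 13 seconds after the connection reset and ovmsrv05 was Up again after about 18 seconds, well inside the 80 second grace period mentioned in the events.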
It looks like the engine discovered that the SPM was down, and reconnected.
It is expected that changes in the SPM status are detected early and that the
engine tries to recover the SPM, since the SPM role is critical in oVirt.
Are you sure you did not get any message when restarting the other host?
I would expect the engine to detect and report a restart of any host.
If you can reproduce this (restarting vdsm is not detected by the engine and
not reported in the engine event log), please file a bug.
Nir