Nir,

Messages: https://t-x.dignus.nl/messages.txt
Sanlock: https://t-x.dignus.nl/sanlock.log.txt

Any input is more than welcome!


On Wed, Feb 19, 2014 at 10:38 AM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
> From: "Johan Kooijman" <mail@johankooijman.com>
> To: "users" <users@ovirt.org>
> Sent: Tuesday, February 18, 2014 1:32:56 PM
> Subject: [Users] Nodes lose storage at random
>
> Hi All,
>
> We're seeing some weird issues in our oVirt setup. We have 4 nodes connected
> to an NFS (v3) filestore (FreeBSD/ZFS).
>
> Once in a while, seemingly at random, a node loses its connection to
> storage and recovers it a minute later. The other nodes usually don't lose
> their storage at that moment; it is just one or two at a time.
>
> We've set up extra tooling to verify the storage performance at those moments
> and its availability to other systems. The storage is always online; the
> nodes just don't think so.

In the logs, we see that vdsm was restarted:
MainThread::DEBUG::2014-02-18 10:48:35,809::vdsm::45::vds::(sigtermHandler) Received signal 15

But we don't know why it happened.

Please also attach /var/log/messages and /var/log/sanlock.log from around the
time that vdsm was restarted.

Thanks,
Nir



--
Met vriendelijke groeten / With kind regards,
Johan Kooijman

T +31(0) 6 43 44 45 27
F +31(0) 162 82 00 01
E mail@johankooijman.com