sanlock is at the latest version (updating it solved another problem we had a few days ago):

$ rpm -q sanlock
sanlock-2.6-7.fc18.x86_64

The storage is on the same machine as the engine and VDSM.
iptables is up, but there is a rule that allows all localhost traffic.
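In case it helps, this is roughly how the loopback rule can be confirmed (a sketch; an allow-all loopback rule typically appears as "-A INPUT -i lo -j ACCEPT", though the exact text can differ between setups):

$ sudo iptables -S INPUT | grep -- '-i lo'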


On Sun, Mar 24, 2013 at 11:34 AM, Maor Lipchuk <mlipchuk@redhat.com> wrote:
From the VDSM log, it seems that the master storage domain was not
responding.

Thread-23::DEBUG::2013-03-22 18:50:20,263::domainMonitor::216::Storage.DomainMonitorThread::(_monitorDomain) Domain 1083422e-a5db-41b6-b667-b9ef1ef244f0 changed its status to Invalid
....
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 186, in _monitorDomain
    self.domain.selftest()
  File "/usr/share/vdsm/storage/nfsSD.py", line 108, in selftest
    fileSD.FileStorageDomain.selftest(self)
  File "/usr/share/vdsm/storage/fileSD.py", line 480, in selftest
    self.oop.os.statvfs(self.domaindir)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 280, in callCrabRPCFunction
    *args, **kwargs)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 180, in callCrabRPCFunction
    rawLength = self._recvAll(LENGTH_STRUCT_LENGTH, timeout)
  File "/usr/share/vdsm/storage/remoteFileHandler.py", line 146, in _recvAll
    raise Timeout()
Timeout
.....
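
The failing call is simply a statvfs() on the domain directory, so you can try it by hand outside VDSM. A minimal sketch, assuming the usual /rhev/data-center/mnt mount layout (substitute your real export directory for the placeholder); if the storage is hung, this call will hang as well:

$ python -c "import os; print(os.statvfs('/rhev/data-center/mnt/<server:_export>/1083422e-a5db-41b6-b667-b9ef1ef244f0'))"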

I also see a sanlock issue, but I think that is because the storage could not be reached:
ReleaseHostIdFailure: Cannot release host id: ('1083422e-a5db-41b6-b667-b9ef1ef244f0', SanlockException(16, 'Sanlock lockspace remove failure', 'Device or resource busy'))
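
(Error 16 is EBUSY: sanlock refuses to remove a lockspace that still has active resources.) You can inspect sanlock's own view of the lockspaces with its client interface, for example:

$ sudo sanlock client status
$ sudo sanlock client log_dump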

Can you check whether iptables is running on your host, and if so, whether it is blocking access to the storage server by any chance?
Can you try to mount this NFS export manually and see if it works?
Is it possible the storage server has connectivity issues?
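
For example, something like this (the server and export path are placeholders; substitute your own):

$ sudo iptables -L -n -v
$ sudo mkdir -p /mnt/nfstest
$ sudo mount -t nfs <server>:/<export> /mnt/nfstest
$ df -h /mnt/nfstest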


Regards,
Maor

On 03/22/2013 08:24 PM, Limor Gavish wrote:
> Hello,
>
> I am using oVirt 3.2 on Fedora 18:
> [wil@bufferoverflow ~]$ rpm -q vdsm
> vdsm-4.10.3-7.fc18.x86_64
>
> (the engine is built from sources).
>
> I seem to have hit this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=922515
>
> in the following configuration:
> Single host (no migrations).
> Created a VM and installed an OS in it (Fedora 18).
> Stopped the VM.
> Created a template from it.
> Created an additional VM from the template using thin provisioning.
> Started the second VM.
>
> In addition to the errors in the logs, the storage domains (both data
> and ISO) crashed, i.e. they went to the "unknown" and "inactive" states
> respectively (see the attached engine.log).
>
> I attached the VDSM and engine logs.
>
> Is there a way to work around this problem?
> It happens repeatedly.
>
> Yuval Meir
>