[Users] oVirt storage is down and doesn't come up

Limor Gavish lgavish at gmail.com
Fri Apr 12 16:29:06 UTC 2013


Hi,

For some reason, without my changing anything, all the storage domains went
down, and restarting VDSM or the entire machine does not bring them back up.
I am not using LVM.
The following errors appear several times in vdsm.log (full logs are
attached):

Thread-22::WARNING::2013-04-12
19:00:08,597::lvm::378::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['
 Volume group "1083422e-a5db-41b6-b667-b9ef1ef244f0" not found']
Thread-22::DEBUG::2013-04-12
19:00:08,598::lvm::402::OperationMutex::(_reloadvgs) Operation 'lvm reload
operation' released the operation mutex
Thread-22::DEBUG::2013-04-12
19:00:08,681::resourceManager::615::ResourceManager::(releaseResource)
Trying to release resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3'
Thread-22::DEBUG::2013-04-12
19:00:08,681::resourceManager::634::ResourceManager::(releaseResource)
Released resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3' (0 active
users)
Thread-22::DEBUG::2013-04-12
19:00:08,681::resourceManager::640::ResourceManager::(releaseResource)
Resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3' is free, finding
out if anyone is waiting for it.
Thread-22::DEBUG::2013-04-12
19:00:08,682::resourceManager::648::ResourceManager::(releaseResource) No
one is waiting for resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3',
Clearing records.
Thread-22::ERROR::2013-04-12
19:00:08,682::task::850::TaskManager.Task::(_setError)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 939, in connectStoragePool
    masterVersion, options)
  File "/usr/share/vdsm/storage/hsm.py", line 986, in _connectStoragePool
    res = pool.connect(hostID, scsiKey, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 695, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1232, in __rebuild
    masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1576, in getMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain:
'spUUID=5849b030-626e-47cb-ad90-3ce782d831b3,
msdUUID=1083422e-a5db-41b6-b667-b9ef1ef244f0'
Thread-22::DEBUG::2013-04-12
19:00:08,685::task::869::TaskManager.Task::(_run)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::Task._run:
e35a22ac-771a-4916-851f-2fe9d60a0ae6
('5849b030-626e-47cb-ad90-3ce782d831b3', 1,
'5849b030-626e-47cb-ad90-3ce782d831b3',
'1083422e-a5db-41b6-b667-b9ef1ef244f0', 3942) {} failed - stopping task
Thread-22::DEBUG::2013-04-12
19:00:08,685::task::1194::TaskManager.Task::(stop)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::stopping in state preparing
(force False)
Thread-22::DEBUG::2013-04-12
19:00:08,685::task::974::TaskManager.Task::(_decref)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::ref 1 aborting True
Thread-22::INFO::2013-04-12
19:00:08,686::task::1151::TaskManager.Task::(prepare)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::aborting: Task is aborted:
'Cannot find master domain' - code 304
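
To see which pool and master domain the error refers to without reading the whole traceback, the two UUIDs can be pulled out of the StoragePoolMasterNotFound message. The following is a minimal sketch, not part of VDSM: the helper name master_not_found_ids is hypothetical, and the pattern assumes the message keeps the "spUUID=..., msdUUID=..." layout shown above.

```python
import re

# Hypothetical helper (not part of VDSM): extract the storage pool UUID and
# master storage domain UUID from StoragePoolMasterNotFound messages.
UUID = r"[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}"
PATTERN = re.compile(r"spUUID=(%s),\s*msdUUID=(%s)" % (UUID, UUID))


def master_not_found_ids(log_text):
    """Return the sorted, de-duplicated (spUUID, msdUUID) pairs in log_text."""
    return sorted(set(PATTERN.findall(log_text)))


# Sample copied from the log excerpt above (wrapped as in this mail).
sample = (
    "StoragePoolMasterNotFound: Cannot find master domain:\n"
    "'spUUID=5849b030-626e-47cb-ad90-3ce782d831b3,\n"
    "msdUUID=1083422e-a5db-41b6-b667-b9ef1ef244f0'\n"
)

for sp_uuid, msd_uuid in master_not_found_ids(sample):
    print("pool:", sp_uuid, "master domain:", msd_uuid)
```

Because the error repeats several times in vdsm.log, the pairs are de-duplicated so that each pool/domain combination is reported once.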

[wil@bufferoverflow ~]$ sudo vgs --noheadings --units b --nosuffix --separator \| -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
  No volume groups found

[wil@bufferoverflow ~]$ mount
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs
(rw,nosuid,size=8131256k,nr_inodes=2032814,mode=755)
securityfs on /sys/kernel/security type securityfs
(rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts
(rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup
(rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpuset type cgroup
(rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup
(rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/memory type cgroup
(rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup
(rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup
(rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup
(rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup
(rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup
(rw,nosuid,nodev,noexec,relatime,perf_event)
/dev/sda3 on / type ext4 (rw,relatime,data=ordered)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
sunrpc on /proc/fs/nfsd type nfsd (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs
(rw,relatime,fd=34,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
tmpfs on /tmp type tmpfs (rw)
configfs on /sys/kernel/config type configfs (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
/dev/sda5 on /home type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /boot type ext4 (rw,relatime,data=ordered)
kernelpanic.home:/home/KP_Data_Domain on
/rhev/data-center/mnt/kernelpanic.home:_home_KP__Data__Domain type nfs
(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=10.100.101.100,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.100.101.100)
bufferoverflow.home:/home/BO_ISO_Domain on
/rhev/data-center/mnt/bufferoverflow.home:_home_BO__ISO__Domain type nfs
(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=10.100.101.108,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.100.101.108)

[wil@bufferoverflow ~]$ sudo find / -name 5849b030-626e-47cb-ad90-3ce782d831b3
/run/vdsm/pools/5849b030-626e-47cb-ad90-3ce782d831b3

[wil@bufferoverflow ~]$ sudo find / -name 1083422e-a5db-41b6-b667-b9ef1ef244f0
/home/BO_Ovirt_Storage/1083422e-a5db-41b6-b667-b9ef1ef244f0
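
One way to cross-check the mount and find outputs above is to test whether the directory that find reports lies under any of the NFS mount points that mount lists. The following is a rough sketch under stated assumptions: the helper names nfs_mount_points and is_under are hypothetical, the input is expected to be unwrapped mount output with one "<source> on <target> type <fstype> (<options>)" entry per line, and the mount options in the sample are shortened for readability.

```python
import os


def nfs_mount_points(mount_output):
    """Return the targets of all nfs/nfs4 entries in `mount`-style output."""
    points = []
    for line in mount_output.splitlines():
        head, sep, rest = line.partition(" type ")
        if not sep:
            continue  # not a "<source> on <target> type <fstype>" line
        fields = rest.split()
        fstype = fields[0] if fields else ""
        _source, on, target = head.partition(" on ")
        # Exact match so that pseudo file systems such as nfsd are skipped.
        if on and fstype in ("nfs", "nfs4"):
            points.append(target)
    return points


def is_under(path, mount_point):
    """Return True if path is mount_point itself or located below it."""
    path = os.path.normpath(path)
    mount_point = os.path.normpath(mount_point)
    return path == mount_point or path.startswith(mount_point.rstrip("/") + "/")


# Entries taken from the mount output above, options shortened.
sample_mounts = (
    "/dev/sda5 on /home type ext4 (rw,relatime,data=ordered)\n"
    "sunrpc on /proc/fs/nfsd type nfsd (rw,relatime)\n"
    "kernelpanic.home:/home/KP_Data_Domain on "
    "/rhev/data-center/mnt/kernelpanic.home:_home_KP__Data__Domain "
    "type nfs (rw,relatime)\n"
    "bufferoverflow.home:/home/BO_ISO_Domain on "
    "/rhev/data-center/mnt/bufferoverflow.home:_home_BO__ISO__Domain "
    "type nfs (rw,relatime)\n"
)
domain_path = "/home/BO_Ovirt_Storage/1083422e-a5db-41b6-b667-b9ef1ef244f0"

matches = [p for p in nfs_mount_points(sample_mounts) if is_under(domain_path, p)]
print(matches)
```

For the sample this prints an empty list. That only says the directory found on the local disk is not reached through either of the two NFS mounts shown; it does not by itself explain why the master domain cannot be found.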

I would greatly appreciate any help,
Limor Gavish
-------------- next part --------------
A non-text attachment was scrubbed...
Name: storage_logs.zip
Type: application/zip
Size: 27406 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20130412/83233d61/attachment-0001.zip>

