Hi Limor,
First, we should probably check which mount is the master storage
domain that is reported as not found. This should be checked against
the oVirt engine database; please run:
select sds.id, ssc.connection from storage_domain_static sds join
storage_server_connections ssc on sds.storage=ssc.id
where sds.id='1083422e-a5db-41b6-b667-b9ef1ef244f0';
You can run this via psql or a Postgres UI if you have one.
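For example, from a shell on the engine machine (a sketch; the database
is assumed here to be named "engine", adjust the user and database names
to match your installation):

psql -U postgres -d engine -c "
    select sds.id, ssc.connection
    from storage_domain_static sds
    join storage_server_connections ssc on sds.storage = ssc.id
    where sds.id = '1083422e-a5db-41b6-b667-b9ef1ef244f0';"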
In the results you will see the storage connection in the format
%hostname%:/%mountName%. Then, on the VDSM host, check the mount list and
verify that it is mounted; the mount itself should contain a directory
named after the UUID of the master domain. Let me know the result.
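Roughly, on the VDSM host (a sketch; substitute the %hostname% and
%mountName% values returned by the query above -- note that VDSM mounts
NFS exports under /rhev/data-center/mnt/ with '/' in the remote path
replaced by '_', as in the mount list below):

mount | grep '%mountName%'
ls /rhev/data-center/mnt/%hostname%:_%mountName%/1083422e-a5db-41b6-b667-b9ef1ef244f0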
Tal.
On 04/12/2013 07:29 PM, Limor Gavish wrote:
Hi,
For some reason, without my doing anything, all the storage domains
went down, and restarting VDSM or the entire machine does not bring them up.
I am not using LVM.
The following errors appear several times in vdsm.log (full logs are
attached):
Thread-22::WARNING::2013-04-12
19:00:08,597::lvm::378::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 []
[' Volume group "1083422e-a5db-41b6-b667-b9ef1ef244f0" not found']
Thread-22::DEBUG::2013-04-12
19:00:08,598::lvm::402::OperationMutex::(_reloadvgs) Operation 'lvm
reload operation' released the operation mutex
Thread-22::DEBUG::2013-04-12
19:00:08,681::resourceManager::615::ResourceManager::(releaseResource)
Trying to release resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3'
Thread-22::DEBUG::2013-04-12
19:00:08,681::resourceManager::634::ResourceManager::(releaseResource)
Released resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3' (0
active users)
Thread-22::DEBUG::2013-04-12
19:00:08,681::resourceManager::640::ResourceManager::(releaseResource)
Resource 'Storage.5849b030-626e-47cb-ad90-3ce782d831b3' is free,
finding out if anyone is waiting for it.
Thread-22::DEBUG::2013-04-12
19:00:08,682::resourceManager::648::ResourceManager::(releaseResource)
No one is waiting for resource
'Storage.5849b030-626e-47cb-ad90-3ce782d831b3', Clearing records.
Thread-22::ERROR::2013-04-12
19:00:08,682::task::850::TaskManager.Task::(_setError)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::Unexpected error
Traceback (most recent call last):
File "/usr/share/vdsm/storage/task.py", line 857, in _run
return fn(*args, **kargs)
File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
res = f(*args, **kwargs)
File "/usr/share/vdsm/storage/hsm.py", line 939, in connectStoragePool
masterVersion, options)
File "/usr/share/vdsm/storage/hsm.py", line 986, in _connectStoragePool
res = pool.connect(hostID, scsiKey, msdUUID, masterVersion)
File "/usr/share/vdsm/storage/sp.py", line 695, in connect
self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
File "/usr/share/vdsm/storage/sp.py", line 1232, in __rebuild
masterVersion=masterVersion)
File "/usr/share/vdsm/storage/sp.py", line 1576, in getMasterDomain
raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain:
'spUUID=5849b030-626e-47cb-ad90-3ce782d831b3,
msdUUID=1083422e-a5db-41b6-b667-b9ef1ef244f0'
Thread-22::DEBUG::2013-04-12
19:00:08,685::task::869::TaskManager.Task::(_run)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::Task._run:
e35a22ac-771a-4916-851f-2fe9d60a0ae6
('5849b030-626e-47cb-ad90-3ce782d831b3', 1,
'5849b030-626e-47cb-ad90-3ce782d831b3',
'1083422e-a5db-41b6-b667-b9ef1ef244f0', 3942) {} failed - stopping task
Thread-22::DEBUG::2013-04-12
19:00:08,685::task::1194::TaskManager.Task::(stop)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::stopping in state
preparing (force False)
Thread-22::DEBUG::2013-04-12
19:00:08,685::task::974::TaskManager.Task::(_decref)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::ref 1 aborting True
Thread-22::INFO::2013-04-12
19:00:08,686::task::1151::TaskManager.Task::(prepare)
Task=`e35a22ac-771a-4916-851f-2fe9d60a0ae6`::aborting: Task is
aborted: 'Cannot find master domain' - code 304
[wil@bufferoverflow ~]$ sudo vgs --noheadings --units b --nosuffix
--separator \| -o
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free
  No volume groups found
[wil@bufferoverflow ~]$ mount
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs
(rw,nosuid,size=8131256k,nr_inodes=2032814,mode=755)
securityfs on /sys/kernel/security type securityfs
(rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts
(rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup
(rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpuset type cgroup
(rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup
(rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/memory type cgroup
(rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup
(rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup
(rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup
(rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup
(rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup
(rw,nosuid,nodev,noexec,relatime,perf_event)
/dev/sda3 on / type ext4 (rw,relatime,data=ordered)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
sunrpc on /proc/fs/nfsd type nfsd (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs
(rw,relatime,fd=34,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime)
tmpfs on /tmp type tmpfs (rw)
configfs on /sys/kernel/config type configfs (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
/dev/sda5 on /home type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /boot type ext4 (rw,relatime,data=ordered)
kernelpanic.home:/home/KP_Data_Domain on
/rhev/data-center/mnt/kernelpanic.home:_home_KP__Data__Domain type nfs
(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=10.100.101.100,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.100.101.100)
bufferoverflow.home:/home/BO_ISO_Domain on
/rhev/data-center/mnt/bufferoverflow.home:_home_BO__ISO__Domain type
nfs
(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=10.100.101.108,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.100.101.108)
[wil@bufferoverflow ~]$ sudo find / -name 5849b030-626e-47cb-ad90-3ce782d831b3
/run/vdsm/pools/5849b030-626e-47cb-ad90-3ce782d831b3
[wil@bufferoverflow ~]$ sudo find / -name 1083422e-a5db-41b6-b667-b9ef1ef244f0
/home/BO_Ovirt_Storage/1083422e-a5db-41b6-b667-b9ef1ef244f0
I would greatly appreciate any help,
Limor Gavish
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users