[Users] hosted engine issues

Mon Mar 3 14:25:15 UTC 2014

----- Original Message -----
> From: "René Koch" <rkoch at linuxland.at>
> To: "Yedidyah Bar David" <didi at redhat.com>, "Martin Sivak" <msivak at redhat.com>
> Cc: users at ovirt.org
> Sent: Monday, March 3, 2014 4:10:51 PM
> Subject: Re: [Users] hosted engine issues
> 
> On 03/03/2014 02:13 PM, Yedidyah Bar David wrote:
> >> Me neither. Is everything Read-Write? Read-Only FS might report no space
> >> left
> >> as well in some cases. Other than that, I do not know.
> >
> > Perhaps some ipc resource? semaphores?
> >
> > Please check:
> >
> > ipcs
> >
> > cat /proc/sys/kernel/sem
> >
> > I know nothing about libvirt, that's just a wild guess.
> 
> # ipcs
> 
> ------ Shared Memory Segments --------
> key        shmid      owner      perms      bytes      nattch     status
> 
> 0x00000000 0          root       644        80         2
> 
> 0x00000000 32769      root       644        16384      2
> 
> 0x00000000 65538      root       644        280        2
> 
> 
> ------ Semaphore Arrays --------
> key        semid      owner      perms      nsems
> 0x00000000 0          root       600        1
> 0x00000000 65537      root       600        1
> 0x000000a7 163842     root       600        1

This means you have 3 semaphore sets, of one semaphore each.

> 
> ------ Message Queues --------
> key        msqid      owner      perms      used-bytes   messages
> 

The rest (shared memory segments, message queues) shows only light usage too.

> # cat /proc/sys/kernel/sem
> 250	32000	32	128

So you are far from the maxima (250 semaphores per set, 32000 system-wide,
128 sets; the third value, 32, is the maximum number of operations per
semop call).
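
If you want to repeat that comparison on the other host, here is a rough
sketch. The function name sem_summary is made up for this mail; it only
assumes the `ipcs -s` column layout shown above (nsems in the fifth column)
and the four fields of /proc/sys/kernel/sem in the order SEMMSL, SEMMNS,
SEMOPM, SEMMNI.

```shell
#!/bin/sh
# Hypothetical helper: compare current semaphore usage with the kernel limits.
# $1    : contents of /proc/sys/kernel/sem (SEMMSL SEMMNS SEMOPM SEMMNI)
# stdin : output of `ipcs -s`
sem_summary() {
    awk -v limits="$1" '
        BEGIN { split(limits, lim) }          # lim[1]=SEMMSL ... lim[4]=SEMMNI
        $1 ~ /^0x/ {                          # data rows start with the hex key
            sets++
            total += $5                       # fifth column is nsems
            if ($5 > biggest) biggest = $5
        }
        END {
            printf "sets: %d of %d\n", sets, lim[4]
            printf "semaphores: %d of %d\n", total, lim[2]
            printf "largest set: %d of %d\n", biggest, lim[1]
        }'
}

# On a live host:
#   ipcs -s | sem_summary "$(cat /proc/sys/kernel/sem)"
```

If any of the three lines gets close to its limit, semaphores become a
plausible suspect; otherwise look elsewhere.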

> 
> 
> Do you see anything in this output?
> I have no clue how to interpret this...

See e.g. http://man7.org/linux/man-pages/man5/proc.5.html 

Is the above from a node or from the engine? Do both nodes look similar?
If so, semaphores are not the reason for the "no space left on device".

If this error is reproducible, you can try to find the process it happens
to (perhaps libvirtd, vdsmd, or the hosted-engine HA daemon) and run:
strace -f -o /tmp/trace1 -tt -s 512 -p PID
where PID is the pid of that process. Then search /tmp/trace1 for ENOSPC
(strace prints it as "ENOSPC (No space left on device)", with a capital N,
so a case-sensitive search for 'no space' will miss it) to see the exact
call that failed.
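
A rough sketch for digging through the trace file. strace reports the errno
as "ENOSPC (No space left on device)", so matching on ENOSPC is the safest.
The function name enospc_calls is made up here, and the field positions
(pid, timestamp, call) are an assumption based on the -f -o -tt options
used above.

```shell
#!/bin/sh
# Hypothetical helper: list "pid syscall" for every call that failed with ENOSPC.
# $1 : trace file written by `strace -f -o FILE -tt ...`
enospc_calls() {
    grep 'ENOSPC' "$1" | awk '{
        # Assumed layout: PID TIMESTAMP call(args...) = -1 ENOSPC (...)
        # Resumed calls look like: PID TIMESTAMP <... call resumed> ...
        name = ($3 == "<...") ? $4 : $3
        sub(/\(.*/, "", name)                 # strip the argument list
        print $1, name
    }'
}

# Example:
#   enospc_calls /tmp/trace1
#   grep -n ENOSPC /tmp/trace1     # full lines, with arguments
```

The syscall name usually narrows things down quickly: a semget or shmget
points at IPC limits, a write or mkdir at a filesystem or quota.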
-- 
Didi