On Tue, Mar 20, 2018 at 10:57 AM, Barak Korren <bkorren(a)redhat.com> wrote:
On 20 March 2018 at 10:53, Yedidyah Bar David <didi(a)redhat.com> wrote:
> On Tue, Mar 20, 2018 at 10:11 AM, Barak Korren <bkorren(a)redhat.com> wrote:
>> On 20 March 2018 at 09:17, Yedidyah Bar David <didi(a)redhat.com> wrote:
>>> On Mon, Mar 19, 2018 at 6:56 PM, Dominik Holler <dholler(a)redhat.com> wrote:
>>>> Thanks Gal, I expect the problem is fixed until something eats
>>>> all space in /dev/shm.
>>>> But the usage of /dev/shm is logged in the output, so we would be able
>>>> to detect the problem next time instantly.
>>>>
>>>> From my point of view it would be good to know why /dev/shm was full,
>>>> to prevent this situation in future.
>>>
>>> Gal already wrote below - it was because some build failed to clean up
>>> after itself.
>>>
>>> I don't know about this specific case, but I was told that I am
>>> personally causing such issues by using the 'cancel' button, so I
>>> sadly stopped. Sadly, because our CI system is quite loaded and when I
>>> know that some build is useless, I wish to kill it and save some
>>> load...
>>>
>>> Back to your point, perhaps we should make jobs check /dev/shm when
>>> they _start_, and either alert/fail/whatever if it's not almost empty,
>>> or, if we know what we are doing, just remove stuff there? That might
>>> be much easier than fixing things to clean up at the end, and/or debugging
>>> why this cleaning failed.
>>
>> Sure thing, patches to:
>>
>> [jenkins repo]/jobs/confs/shell-scripts/cleanup_slave.sh
>>
>> Are welcome, we often find interesting stuff to add there...
>>
>> If constrained for time, please turn this comment into an orderly RFE in Jira...
>
> Searched for '/dev/shm' and found way too many places to analyze them
> all and add something to cleanup_slave to cover all.
Where did you search?
ovirt-system-tests, lago, lago-ost-plugin.
ovirt-system-tests has 83 occurrences. I realize almost all are in
lago guests, but looking still takes time...
In theory I can patch cleanup_slave.sh as you suggested, removing
_everything_ there.
Not sure this is safe.
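
A less drastic option might be to only check and report when a job
starts, without deleting anything. A minimal sketch, assuming bash and
a POSIX-style df; the function names and the threshold are hypothetical
and not taken from cleanup_slave.sh:

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of cleanup_slave.sh.

# Print the used-space percentage (integer, without '%') of the
# filesystem that holds the given directory.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if usage of directory $1 is at most $2 percent.
# Return 1 (and list the directory on stderr) if it is above the limit.
# Return 2 if the usage could not be determined.
check_free_space() {
    local dir=$1 max_percent=$2 used
    used=$(usage_percent "$dir")
    case $used in
        ''|*[!0-9]*)
            echo "Cannot determine usage of $dir" >&2
            return 2
            ;;
    esac
    if [ "$used" -gt "$max_percent" ]; then
        echo "WARNING: $dir is ${used}% full (limit ${max_percent}%)" >&2
        ls -la "$dir" >&2
        return 1
    fi
    return 0
}
```

A job could then start with something like
`check_free_space /dev/shm 10 || exit 1`. The 10% limit is an arbitrary
example value, and whether to fail or only warn is a separate decision.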
>
> Pushed this for now:
>
> https://gerrit.ovirt.org/89215
>
>>
>> --
>> Barak Korren
>> RHV DevOps team , RHCE, RHCi
>> Red Hat EMEA
>>
>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>
>
>
> --
> Didi
--
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
--
Didi