On Mon, Feb 8, 2021 at 9:05 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
Hi all,
I ran [1] (from [2]) in a loop. The loop succeeded for ~380
iterations, then failed with 'Too many open files'. The first failure was:
2021-02-08 02:21:15,702+0100 ERROR (jsonrpc/4) [storage.HSM] Could not connect to storageServer (hsm:2446)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 2443, in connectStorageServer
    conObj.connect()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 449, in connect
    return self._mountCon.connect()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/storageServer.py", line 171, in connect
    self._mount.mount(self.options, self._vfsType, cgroup=self.CGROUP)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/mount.py", line 210, in mount
    cgroup=cgroup)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in mount
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
OSError: [Errno 24] Too many open files
Obviously, once this happened, many later operations kept failing
for the same reason.
Is this considered a bug? Do we actively try to prevent such cases? If
so, should I open a bug and attach logs? Or can it be considered a
"corner case"?
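One way to narrow this down would be to watch the file descriptor count of the suspected process between loop iterations. The traceback suggests the OSError was raised on the supervdsm side of the proxy, but that is an inference, not something the log confirms. The script below is a minimal sketch and not part of vdsm: the helper names count_open_fds and soft_nofile_limit are made up for illustration, it assumes Linux with /proc mounted, and reading another user's /proc/<pid>/fd usually requires root.

```python
import os
import sys


def count_open_fds(pid="self"):
    """Count open file descriptors of a process by listing /proc/<pid>/fd (Linux only)."""
    return len(os.listdir("/proc/{}/fd".format(pid)))


def soft_nofile_limit(pid="self"):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits.

    Returns None if the limit is unlimited or the line is not found.
    """
    with open("/proc/{}/limits".format(pid)) as f:
        for line in f:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", <soft>, <hard>, <units>
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


if __name__ == "__main__":
    # Hypothetical usage: pass the PID of the process to inspect,
    # or no argument to inspect this script's own process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    print("pid={} open_fds={} soft_limit={}".format(
        pid, count_open_fds(pid), soft_nofile_limit(pid)))
```

Run against the same PID every few iterations, a steadily growing open_fds value that approaches soft_limit would point to a descriptor leak in that process rather than a one-off spike.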
Using vdsm-4.40.50.3-37.git7883b3b43.el8.x86_64 from
ost-images-el8-he-installed-1-202102021144.x86_64 .
I can also give access to the machine(s) if needed, for now.
Sorry, I have now cleaned up this env. I can try to reproduce if there is interest.
--
Didi