Kallithea issues archive

Issue #32: IOError: [Errno 24] Too many open files

Reported by: Sven R. Kunze
State: resolved
Created on: 2014-09-09 10:37
Updated on: 2019-06-13 19:25

Description

We are using kallithea via Apache WSGI.

What can we do about it?

mod_wsgi (pid=8239): Exception occurred processing WSGI script '/localhome/kallithea/apache/wsgi.py'.
Traceback (most recent call last):
  File "/localhome/kallithea/lib/python2.7/site-packages/paste/gzipper.py", line 38, in __call__
    response.gzip_start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/paste/cascade.py", line 130, in __call__
    return self.apps[-1](environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/paste/registry.py", line 379, in __call__
    app_iter = self.application(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/pylons/middleware.py", line 163, in __call__
    self.app, new_environ, catch_exc_info=True)
  File "/localhome/kallithea/lib/python2.7/site-packages/pylons/util.py", line 48, in call_wsgi_application
    app_iter = application(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/kallithea/lib/middleware/wrapper.py", line 43, in __call__
    return self.application(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/kallithea/lib/base.py", line 277, in __call__
    return self._handle_request(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/kallithea/lib/middleware/simplegit.py", line 68, in _handle_request
    return self.application(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/kallithea/lib/base.py", line 277, in __call__
    return self._handle_request(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/kallithea/lib/middleware/simplehg.py", line 73, in _handle_request
    return self.application(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/weberror/errormiddleware.py", line 156, in __call__
    return self.application(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/middleware.py", line 155, in __call__
    return self.wrap_app(environ, session_start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/routes/middleware.py", line 131, in __call__
    response = self.app(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/pylons/wsgiapp.py", line 107, in __call__
    response = self.dispatch(controller, environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/pylons/wsgiapp.py", line 312, in dispatch
    return controller(environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/kallithea/lib/base.py", line 383, in __call__
    return WSGIController.__call__(self, environ, start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/pylons/controllers/core.py", line 266, in __call__
    return response(environ, self.start_response)
  File "/localhome/kallithea/lib/python2.7/site-packages/webob/response.py", line 917, in __call__
    start_response(self.status, headerlist)
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/middleware.py", line 149, in session_start_response
    session.persist()
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/session.py", line 717, in persist
    self._session().save()
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/session.py", line 423, in save
    self.namespace.release_write_lock()
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/container.py", line 236, in release_write_lock
    self.close(checkcount=True)
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/container.py", line 259, in close
    self.do_close()
  File "/localhome/kallithea/lib/python2.7/site-packages/beaker/container.py", line 672, in do_close
    fh = open(self.file, 'wb')
IOError: [Errno 24] Too many open files: '/home/kallithea/data/sessions/container_file/c/cc/cc773b66d9ed479cb92c0920d4e83331.cache'

Attachments

Comments

Comment by Mads Kiilerich, on 2014-09-09 11:33

This looks very much like a configuration issue, not a bug in Kallithea.

Please reach out to the community on the mailing list - and make sure you include all relevant information about your setup.

Comment by domruf, on 2014-09-09 19:46

I had this once. I'm not sure if it was with rhodecode or already with kallithea.

But the problem was fixed after changing the beaker configuration from memory to db.

Comment by Sven R. Kunze, on 2014-09-15 11:03

@domruf Thanks for that idea. I will look into it and come back soon.

@kiilerix It would be great if new users could recognize this as a configuration error as easily as you can.

Btw. fixing it by setting beaker configuration from memory to db does not look like a configuration issue. From my point of view, that would imply that memory should be a disallowed option.

Comment by domruf, on 2014-09-15 11:32

I agree at least the default *.ini files should use a db configuration.

Comment by domruf, on 2014-09-15 11:36

Comment by Sven R. Kunze, on 2014-09-15 12:17

@domruf It seems to work. Thank you.

Indeed. I am not sure why db is not the default. Shall I keep this issue open until it is the default?

Comment by Sven R. Kunze, on 2014-09-15 13:51

For all who use host-based authentication:

## db session ##
beaker.session.type = ext:database
## host-based authentication
beaker.session.sa.url = postgresql://@/kallithea
beaker.session.table_name = db_session

Comment by Mads Kiilerich, on 2014-09-15 19:29

We use the default beaker settings in production and do not see any problem.

It does however sound like there is a leak in some situations. I think we should try very hard not to work around bugs by recommending a special configuration. We should try to find the root cause.

I guess this also is with git repositories? We only use Mercurial. I don't know if that could explain the difference.

Could you try to find out more about what the problem is? Is it leaking a fixed number of FDs per request? Do you also see it if you try with a Mercurial repo? Or if you run a test instance with 'paster serve'?

Comment by Mads Kiilerich, on 2014-09-15 23:05

Testing confirms that it does leak an fd when invoking git as part of a clone. Probably some issue in pygrack.py or subprocessio.py.

@jelmer: does dulwich have a clean way to do what https://kallithea-scm.org/repos/kallithea/files/155f281be5f8013c68be2dbdfe2f8a0a424da383/kallithea/lib/middleware/pygrack.py is doing by invoking git?

Comment by Sven R. Kunze, on 2014-09-16 08:09

@kiilerix I agree finding the root cause is the better way.

To answer your questions: Yes, git only; never tried with Mercurial. We used paster only to test if the server starts up properly but not for long-lasting usage.

And we indeed do many git clones.

Comment by Mads Kiilerich, on 2014-09-16 20:09

Please try

--- a/kallithea/lib/vcs/subprocessio.py
+++ b/kallithea/lib/vcs/subprocessio.py
@@ -386,6 +386,7 @@ class SubprocessIOChunker(object):
         self.process = _p
         self.output = bg_out
         self.error = bg_err
+        self.inputstream = inputstream

     def __iter__(self):
         return self
@@ -413,6 +414,10 @@ class SubprocessIOChunker(object):
             self.error.close()
         except:
             pass
+        try:
+            os.close(self.inputstream)
+        except:
+            pass

     def __del__(self):
         self.close()

(Assuming you are on Linux, the fd leak can be seen in the /proc/$PID/fd/ directory of the relevant process.)
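
A minimal sketch of that check, assuming Linux and Python 3 (the traceback above is from Python 2.7, so this is illustrative rather than something to paste into that installation). The helper name count_open_fds and the demo function are hypothetical and not part of Kallithea. Passing the PID of a mod_wsgi worker instead of "self" would inspect that process, given sufficient permissions.

```python
import os


def count_open_fds(pid="self"):
    """Return how many file descriptors a process has open (Linux only)."""
    # Every entry in /proc/<pid>/fd is one open descriptor. When a process
    # inspects itself, the count may include the descriptor used to read
    # the directory, so compare differences rather than absolute values.
    return len(os.listdir("/proc/%s/fd" % pid))


def demo():
    """Open and close a pipe, sampling the descriptor count at each step."""
    before = count_open_fds()
    read_fd, write_fd = os.pipe()  # opens two descriptors
    during = count_open_fds()
    os.close(read_fd)
    os.close(write_fd)
    after = count_open_fds()
    return before, during, after


if __name__ == "__main__":
    print("before=%d during=%d after=%d" % demo())
```

A count that keeps growing across otherwise identical requests, such as repeated git clones, would point to a leak.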

Comment by Sven R. Kunze, on 2014-09-17 08:11

Is there a testcase that we can use to make sure the fd is closed? We would not want to jeopardize our production system.

Comment by Mads Kiilerich, on 2014-09-17 08:21

You could keep an eye on the number of open fds, both in the initial setup and with my patch.

I do not, however, see (and cannot reproduce) why beaker should make a difference.

Comment by Sven R. Kunze, on 2014-09-17 09:50

Well, I meant something like: py.test -k test_all_fds_are_closed. I also think that would be a good addition in order to indicate such problems in the future.

I think I can check this for a short time without causing too much trouble in our systems. I will come back to you soon.
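
A test along those lines could look roughly like the sketch below, assuming Linux and Python 3. Only the test name comes from the comment above. The workloads are hypothetical stand-ins for a real request (they do not call into Kallithea), and fds_leaked_by is an illustrative helper.

```python
import os


def count_open_fds():
    """Number of descriptors open in the current process (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


def clean_workload():
    """Stand-in for a request that closes everything it opens."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    os.close(write_fd)


def leaky_workload():
    """Stand-in for a request that forgets to close one end of a pipe."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    # read_fd is never closed: one descriptor leaks per call.


def fds_leaked_by(workload, repeat=10):
    """Run the workload several times and return the growth in open fds."""
    before = count_open_fds()
    for _ in range(repeat):
        workload()
    return count_open_fds() - before


def test_all_fds_are_closed():
    # In a real suite the workload would be a git clone against a test server.
    assert fds_leaked_by(clean_workload) == 0
```

Repeating the workload makes a per-request leak stand out from unrelated descriptors that the process may open once and keep.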

Comment by Sven R. Kunze, on 2014-09-22 09:42

Alright, it works fine now. It can be merged. Thank you for your time and consideration.

Comment by Sven R. Kunze, on 2014-09-22 09:42

Comment by Mads Kiilerich, on 2014-09-22 10:22

You are welcome. Looking forward to your contributions ;-)

Comment by Sven R. Kunze, on 2014-09-22 12:12

What do you mean? Do you need a pull request for this?

Comment by Mads Kiilerich, on 2014-09-22 13:48

No - I will push the fix soon. But I'm sure there will be other opportunities for contributing ;-)

Comment by Martin Dorey, on 2019-06-13 18:45

I posted your fix upstream as https://github.com/dvdotsenko/git_http_backend.py/issues/19, where Uma Parvathappa and I found and fixed another, similar bug. I had been hoping to get some feedback from upstream before letting you know. The feedback still hasn’t happened but I learned today that your change and ours were integrated in
https://code.rhodecode.com/rhodecode-vcsserver/changeset/560860c27980b05475c668832c3befbd8aeeb78c.

Comment by Thomas De Schampheleire, on 2019-06-13 19:25

So the remaining part to be applied to Kallithea would be:

diff --git a/kallithea/lib/vcs/subprocessio.py b/kallithea/lib/vcs/subprocessio.py
--- a/kallithea/lib/vcs/subprocessio.py
+++ b/kallithea/lib/vcs/subprocessio.py
@@ -57,15 +57,17 @@ class StreamFeeder(threading.Thread):

     def run(self):
         t = self.writeiface
-        if self.bytes:
-            os.write(t, self.bytes)
-        else:
-            s = self.source
-            b = s.read(4096)
-            while b:
-                os.write(t, b)
+        try:
+            if self.bytes:
+                os.write(t, self.bytes)
+            else:
+                s = self.source
                 b = s.read(4096)
-        os.close(t)
+                while b:
+                    os.write(t, b)
+                    b = s.read(4096)
+        finally:
+            os.close(t)

     @property
     def output(self):

Is that correct?

What is the correct attribution for this change? Did you write it or Uma?

It seems the upstream where you opened the issue is no longer active; the last change was years ago.
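
The effect of the try/finally in the diff above can be illustrated with a simplified sketch, assuming Python 3. The function feed is a hypothetical stand-in for StreamFeeder.run without the thread and without the self.bytes branch, and FailingSource and the demo helpers exist only for this example.

```python
import io
import os


class FailingSource:
    """Source whose read() always fails, to exercise the error path."""

    def read(self, size):
        raise IOError("simulated read failure")


def feed(source, write_fd):
    """Copy source into write_fd in 4096-byte chunks, then close write_fd."""
    try:
        chunk = source.read(4096)
        while chunk:
            os.write(write_fd, chunk)
            chunk = source.read(4096)
    finally:
        # Runs on success and on error, so the descriptor cannot leak.
        os.close(write_fd)


def is_open(fd):
    """Return True if fd refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def demo_failure():
    """Return True if the write end was closed although read() raised."""
    read_fd, write_fd = os.pipe()
    try:
        feed(FailingSource(), write_fd)
    except IOError:
        pass
    closed = not is_open(write_fd)
    os.close(read_fd)
    return closed


def demo_success():
    """Return the data read from the pipe, followed by the EOF marker."""
    read_fd, write_fd = os.pipe()
    feed(io.BytesIO(b"hello"), write_fd)
    data = os.read(read_fd, 4096)
    eof = os.read(read_fd, 4096)  # empty because the write end is closed
    os.close(read_fd)
    return data, eof
```

Without the finally clause, an exception raised by read() or os.write() would skip os.close() and leave the write end of the pipe open.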