Kallithea issues archive

Issue #292: File Lock Error with LDAP and Repo-Locking

Reported by: Marcus Marcus
State: resolved
Created on: 2017-08-11 10:02
Updated on: 2017-08-31 19:53

Description

Hi,

if you are using the Repo-Lock feature together with the LDAP login feature, you get an exception when another user has locked the repo and your user is not authenticated. The only way to get Kallithea working again is to restart the paster processes.

Kallithea 0.3.3
python-ldap 2.4.10
waitress 0.8.8
paster 0.8
PasteScript 2.0.2

Regards Marcus

WARNING:kallithea.lib.auth:user <AuthUser('id:None[None] auth:False')> NOT authenticated with regular auth @ HomeController:index
ERROR:kallithea.lib.auth_modules.auth_internal:user TESTci10 had a bad password
WARNING:kallithea.lib.auth_modules:User `TESTci10` failed to authenticate against kallithea.lib.auth_modules.auth_internal
Repository `test/sst/repo1` locked by user `TEST123`
Repository `test/sst/repo1` locked by user `User1`
Error - <type 'exceptions.OSError'>: [Errno 13] Permission denied: '/data/kallithea_Produktion/data/sessions/container_file_lock/8/81/814cd5d61e7208e7b6deca486566c46701abc88f.lock'
ERROR:waitress:Exception when serving /MY.Test
Traceback (most recent call last):
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/waitress-0.8.8-py2.7.egg/waitress/channel.py", line 337, in service
    task.service()
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/waitress-0.8.8-py2.7.egg/waitress/task.py", line 173, in service
    self.execute()
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/waitress-0.8.8-py2.7.egg/waitress/task.py", line 392, in execute
    app_iter = self.channel.server.application(env, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/gzipper.py", line 34, in __call__
    response.gzip_start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/cascade.py", line 130, in __call__
    return self.apps[-1](environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/registry.py", line 379, in __call__
    app_iter = self.application(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/middleware/wrapper.py", line 43, in __call__
    return self.application(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/base.py", line 312, in __call__
    return self._handle_request(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/middleware/simplegit.py", line 68, in _handle_request
    return self.application(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/base.py", line 312, in __call__
    return self._handle_request(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/middleware/simplehg.py", line 73, in _handle_request
    return self.application(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Pylons-1.0.2-py2.7.egg/pylons/middleware.py", line 168, in __call__
    self.app, new_environ, catch_exc_info=True)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Pylons-1.0.2-py2.7.egg/pylons/util.py", line 50, in call_wsgi_application
    app_iter = application(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/weberror/errormiddleware.py", line 165, in __call__
    return self.application(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/middleware/sessionmiddleware.py", line 62, in __call__
    return self.wrap_app(environ, session_start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Routes-1.13-py2.7.egg/routes/middleware.py", line 131, in __call__
    response = self.app(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Pylons-1.0.2-py2.7.egg/pylons/wsgiapp.py", line 103, in __call__
    response = self.dispatch(controller, environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Pylons-1.0.2-py2.7.egg/pylons/wsgiapp.py", line 313, in dispatch
    return controller(environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/base.py", line 446, in __call__
    return WSGIController.__call__(self, environ, start_response)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Pylons-1.0.2-py2.7.egg/pylons/controllers/core.py", line 271, in __call__
    return response(environ, self.start_response)
  File "build/bdist.solaris-2.11-sun4v.64bit/egg/webob/response.py", line 939, in __call__
    start_response(self.status, headerlist)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/kallithea/lib/middleware/sessionmiddleware.py", line 56, in session_start_response
    session.persist()
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Beaker-1.6.4-py2.7.egg/beaker/session.py", line 717, in persist
    self._session().save()
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Beaker-1.6.4-py2.7.egg/beaker/session.py", line 407, in save
    self.namespace.acquire_write_lock(replace=True)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Beaker-1.6.4-py2.7.egg/beaker/container.py", line 225, in acquire_write_lock
    r = self.access_lock.acquire_write_lock(wait)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Beaker-1.6.4-py2.7.egg/beaker/synchronization.py", line 186, in acquire_write_lock
    x = self.do_acquire_write_lock(wait)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Beaker-1.6.4-py2.7.egg/beaker/synchronization.py", line 255, in do_acquire_write_lock
    filedescriptor = self._open(os.O_CREAT | os.O_WRONLY)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/Beaker-1.6.4-py2.7.egg/beaker/synchronization.py", line 236, in _open
    filedescriptor = os.open(self.filename, mode)
OSError: [Errno 13] Permission denied: '/data/kallithea_Produktion/data/sessions/container_file_lock/8/81/814cd5d61e7208e7b6deca486566c46701abc88f.lock'

Attachments

Comments

Comment by Thomas De Schampheleire, on 2017-08-11 19:45

Would it be possible for you to test this on the default branch of Kallithea, i.e. not on a released version like 0.3.3 but by using the instructions at http://kallithea.readthedocs.io/en/latest/contributing.html ?

There have been many changes on the default branch that are not yet released... we are currently working towards a release with these changes.

Comment by Marcus Marcus, on 2017-08-11 22:03

I can test it with this release on Monday.

Comment by Mads Kiilerich, on 2017-08-13 15:26

That error is surprising. Everything should be running under the system account used for Kallithea by your WSGI server (which seems to be waitress, run through paster serve?). Who owns that lock file and what are the permissions?

It sounds like IIS when it runs server side code as the system user that is logged in.

It doesn't seem to be related to Repo-Lock - this lock is just used to avoid multiple writers to the same session file in the file system.

Comment by Marcus Marcus, on 2017-08-14 14:50

@Thomas: I installed the latest version (default branch). After installing the libffi package it compiled with no errors. But I cannot generate the new ini:

(kallithea-venv) bash-4.4$ paster make-config Kallithea /tmp/NewIni.ini
Distribution already installed:
  Kallithea 0.3.99 from /data/kallithea_Produktion/kalLatest/kallithea
Traceback (most recent call last):
  File "/data/kallithea_Produktion/kallithea-venv/bin/paster", line 11, in <module>
    sys.exit(run())
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/script/command.py", line 102, in run
    invoke(command, command_name, options, args[1:])
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/script/command.py", line 141, in invoke
    exit_code = runner.run(args)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/script/appinstall.py", line 66, in run
    return super(AbstractInstallCommand, self).run(new_args)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/script/command.py", line 236, in run
    result = self.command()
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/script/appinstall.py", line 293, in command
    self.distro, self.options.ep_group, self.options.ep_name)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/paste/script/appinstall.py", line 232, in get_installer
    'paste.app_install', ep_name)
  File "/data/kallithea_Produktion/kallithea-venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2630, in load_entry_point
    raise ImportError("Entry point %r not found" % ((group, name),))
ImportError: Entry point ('paste.app_install', 'main') not found

The old ini doesn't work anymore.

I used the test ini from inside the repo. It works, but fails on the database connection (schema version not up to date).

I tried:

paster upgrade-db production_MySQL1.ini

but the upgrade-db command is not available. How do I upgrade the DB? Should I use gearbox instead of paster?

@Mads: I'm running Kallithea with different configurations on Solaris 11.3 with Python 2.7.13 (64-bit). At the moment we run it with waitress/paster because of the Celery issue.

I also tried Gunicorn/Tornado, but had the "MySQL server has gone away" problem there :-(

We have an Apache running in front for load-balancing the paster processes.

The lock file has the Kallithea system user as owner and -rwxr-xr-x permissions. It seems that the file is opened by another paster process and hangs, so that no other paster process can access it?

Comment by Thomas De Schampheleire, on 2017-08-14 18:40

My apologies: the instructions I linked are for the latest released version (0.3.3) but on the default branch a few things changed. Here are the updated instructions:

        hg clone https://kallithea-scm.org/repos/kallithea
        cd kallithea
        virtualenv ../kallithea-venv
        source ../kallithea-venv/bin/activate
        pip install --upgrade pip setuptools
        pip install -e .
        gearbox make-config my.ini
        gearbox setup-db -c my.ini --user=user --email=user@example.com --password=password --repos=/tmp
        gearbox serve -c my.ini --reload &

Essentially, paster got replaced by gearbox, which also requires the additional '-c' option before the ini file.

Regarding the possibility of another paster process holding the lock, have you checked with 'ps' that all paster processes are those you expect? You could also use fuser, or the solaris equivalent, to check who has a hold of that file, if any.

Comment by Marcus Marcus, on 2017-08-15 14:21

I have 20 paster processes running in parallel at the moment. After restarting all of them, the lock problem was gone. I will try to reproduce it.

Comment by Marcus Marcus, on 2017-08-16 17:34

I now have the latest version up and running. Is there a script for updating the 0.3.3 DB to the 0.3.99 version?

I see a lot of DB requests at the login page. Are these requests needed if there is no active session? It takes up to 5 seconds for all the repo/group scanning...

The ini produced with make-config doesn't have any gearbox entries. The test ini from inside the kallithea repo has some. Are these not needed, or not implemented yet?

Comment by Thomas De Schampheleire, on 2017-08-16 19:17

There are upgrade instructions in docs/upgrade.rst. Database upgrade happens via alembic now.

The gearbox entries you refer to are these, right:

## Gearbox default web server ##
#use = egg:gearbox#wsgiref
## nr of worker threads to spawn
#threadpool_workers = 1
## max request before thread respawn
#threadpool_max_requests = 100
## option to use threads of process
#use_threadpool = true

## Gearbox gevent web server ##
#use = egg:gearbox#gevent

Kallithea can use different web servers, like waitress or the gearbox one. The make-config script sets waitress as default and removes references to the others. The base file used by make-config is kallithea/lib/paster_commands/template.ini.mako. You can change the ini file to match what you used before, but on the stable branch with paster, waitress was also the default.

Regarding the db request, can you clarify exactly what you do? You mean that after you login, before you get the index page with all repositories, you see a lot of db queries? Or do you mean something else? And what exactly do you mean with 'no active session' ?

Comment by Marcus Marcus, on 2017-08-18 14:31

I request the login page (with LDAP auth enabled). This HTTP request takes up to 10 seconds.

With "no active session" I mean I am not logged into Kallithea.

I upgraded the database and am now doing some performance tests with the databases. In the case of MySQL, is it possible to run it as InnoDB as well as MyISAM, or are there any limitations from Kallithea?

Comment by Mads Kiilerich, on 2017-08-22 19:31

I am not aware of any special requirements for MySQL. Kallithea tries to just rely on SQLAlchemy as the database abstraction layer. Unfortunately, that is not entirely what it does: the limitations of MySQL show in some places and we had to add some workarounds. Some use MySQL and like it. I prefer PostgreSQL.

Anyway: 10 s sounds like way too much. Try to figure out where the time is spent.

Comment by Marcus Marcus, on 2017-08-22 20:02

At the moment we are analysing the SQL requests. I hope to have some more info in the next days. I changed to gearbox and gevent; it is faster (about 30%) than the paster/waitress combination.

Comment by Mads Kiilerich, on 2017-08-22 20:38

I'm surprised you see that much performance difference between web servers. The web server just forwards data; most of the work is done by the application, which has to do the same amount of work in both cases. Watch out for other differences that could explain the numbers you see.

Comment by Marcus Marcus, on 2017-08-31 13:00

We investigated the problem with the login page.

When the page is requested, it makes a request for user "none" to the database. The result is that the DB returns all 4000 repositories we have, because the default user has the permission "none" in all of these repositories.

The request looks like:

SELECT
         repo_to_perm.repo_to_perm_id AS repo_to_perm_repo_to_perm_id,
         repo_to_perm.user_id AS repo_to_perm_user_id,
         repo_to_perm.permission_id AS repo_to_perm_permission_id,
         repo_to_perm.repository_id AS repo_to_perm_repository_id,
         repositories.user_id AS repositories_user_id,
         repositories.statistics AS repositories_statistics,
         repositories.downloads AS repositories_downloads,
         repositories.landing_revision AS repositories_landing_revision,
         repositories.locked AS repositories_locked,
         repositories.changeset_cache AS repositories_changeset_cache,
         repositories.repo_id AS repositories_repo_id,
         repositories.repo_name AS repositories_repo_name,
         repositories.repo_state AS repositories_repo_state,
         repositories.clone_uri AS repositories_clone_uri,
         repositories.repo_type AS repositories_repo_type,
         repositories.private AS repositories_private,
         repositories.description AS repositories_description,
         repositories.created_on AS repositories_created_on,
         repositories.updated_on AS repositories_updated_on,
         repositories.enable_locking AS repositories_enable_locking,
         repositories.fork_id AS repositories_fork_id,
         repositories.group_id AS repositories_group_id,
         permissions.permission_id AS permissions_permission_id,
         permissions.permission_name AS permissions_permission_name
        FROM repo_to_perm
        INNER JOIN repositories ON repo_to_perm.repository_id = repositories.repo_id
        INNER JOIN permissions ON repo_to_perm.permission_id = permissions.permission_id
        WHERE repo_to_perm.user_id = 1;

This SELECT is also executed on every hg clone, push, pull, etc.

To fix this, we changed in kallithea/kallithea/lib/auth.py

    @LazyProperty
    def permissions(self):
        return self.__get_perms(user=self, cache=False)

to

return self.__get_perms(user=self, cache=True)

Is this a bug in Kallithea, or is there a reason why this cache was deactivated? (I know that if you change the permissions, it takes until the cache refresh time for the permissions to take effect.)

I also tried to change the DB request in kallithea/kallithea/model/db.py from

    @classmethod
    def get_default_perms(cls, default_user_id):
        q = Session().query(UserRepoToPerm, Repository, cls) \
            .join((Repository, UserRepoToPerm.repository_id == Repository.repo_id)) \
            .join((cls, UserRepoToPerm.permission_id == cls.permission_id)) \
            .filter(UserRepoToPerm.user_id == default_user_id)

to

    @classmethod
    def get_default_perms(cls, default_user_id):
        q = Session().query(UserRepoToPerm, Repository, cls) \
            .join((Repository, UserRepoToPerm.repository_id == Repository.repo_id)) \
            .join((cls, UserRepoToPerm.permission_id == cls.permission_id)) \
            .filter(UserRepoToPerm.user_id == default_user_id) \
            .filter(cls.permission_id <> 1)

The web GUI then works fine, but an hg clone fails with "Permission denied".

I guess this is a problem in combination with the LDAP permissions we use. Do you have an idea how to fix this?

We are thinking about a dedicated cache region only for the repository permissions (short-perm) and a flush routine for when the permissions are updated in Kallithea. What do you think?

Regards Marcus

Comment by Mads Kiilerich, on 2017-08-31 19:27

That seems to turn into an entirely different issue. Perhaps file a new issue, fix the formatting, and clarify which version that is ... and that it is unrelated to LDAP.

I don't think anybody has optimized Kallithea for that number of repositories yet. We would have to do that.

Also make sure you have all the right indices (for example with some DBA hacks like dumping your data from the existing database and importing them into a fresh but completely empty database).

Comment by Marcus Marcus, on 2017-08-31 19:53

Since I cannot reproduce the lock problem anymore, I will close this issue.
