Kallithea issues archive

Issue #177: Duplicate entry error in cache_invalidation table

Reported by: domruf
State: resolved
Created on: 2015-12-08 14:33
Updated on: 2018-05-19 15:19

Description

From time to time I get the following error.

WebApp Error: <class 'sqlalchemy.exc.IntegrityError'>: (IntegrityError) (1062, "Duplicate entry 'myrepo' for key 'cache_key'") 'INSERT INTO cache_invalidation (cache_key, cache_args, cache_active) VALUES (%s, %s, %s)' ('myrepo', 'myrepo', 1)

...
Module kallithea.model.db:1449 in scm_instance_cached
>>  valid = CacheInvalidation.test_and_set_valid(rn, None, valid_cache_keys=valid_cache_keys)
Module kallithea.model.db:2141 in test_and_set_valid
>>  Session().commit()
Module sqlalchemy.orm.session:710 in commit
...

When I look at the code it seems that in case an inactive inv_obj is found it is set to active and re-add. But re-adding an object fails.

https://kallithea-scm.org/repos/kallithea/files/8bc8366a6874640c73c20ce8ecfb596c0b4134db/kallithea/model/db.py#L2134

Attachments

Comments

Comment by Mads Kiilerich, on 2015-12-08 17:33

Re-add? Session().add(inv_obj) just add the record to the session so changes will be committed.

But I guess the problem can happen in a multi threaded web server if two requests start caching at once and both try to add an inv_obj.

I guess the solution would be to catch the commit error and loop back to the query until it succeeds (up to 3 times?). Can you try that?

Comment by domruf, on 2015-12-08 21:37

I'm still not clear what is happening here.

I thought Session().add(inv_obj) tries to add a new record? And then if there is already a record with that key it fails.

Kallithea is running with paster.

Comment by domruf, on 2016-01-08 14:00

Correct me if I'm wrong but I isn't this how it should work?

diff -r cb3b5dabcce4 kallithea/model/db.py
--- a/kallithea/model/db.py Tue Dec 08 15:04:42 2015 +0100
+++ b/kallithea/model/db.py Fri Jan 08 14:58:50 2016 +0100
@@ -2134,12 +2134,17 @@
         inv_obj = cls.query().filter(cls.cache_key == cache_key).scalar()
         if not inv_obj:
             inv_obj = CacheInvalidation(cache_key, repo_name)
-        if inv_obj.cache_active:
-            return True
-        inv_obj.cache_active = True
-        Session().add(inv_obj)
-        Session().commit()
-        return False
+            inv_obj.cache_active = True
+            Session().add(inv_obj)
+            Session().commit()
+            return False
+        else:
+            if inv_obj.cache_active:
+                return True
+            else:
+                inv_obj.cache_active = True
+                Session().commit()
+                return False

     @classmethod
     def get_valid_cache_keys(cls):

Comment by domruf, on 2016-01-08 14:19

Sorry just realized you already made some changes regarding this issue in efce61a.

But shouldn't you do a Session.rollback() in the except block?

Comment by Mads Kiilerich, on 2016-01-08 16:05

In what way would your proposed change be different (and better) from how it was before?

It seems like you add the new CacheInvalidation object to the session ... but AFAIK it is that by default (@kwi?)

More important: after setting cache_active=True on existing objects, you have to add them to the session before committing.

Comment by Søren Løvborg, on 2016-01-08 16:27

It's the other way. Newly created objects must be added to the SQLAlchemy session. Existing objects retrieved using a SQLAlchemy session query are already part of the session, and don't need to be re-added, even if modified.

Comment by domruf, on 2016-01-08 16:28

According to http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#adding-new-or-existing-items "For instances which are persistent (i.e. were loaded by this session), they are already present and do not need to be added. "

My interpretation so fare was that I therefore don't have to add it to the session. In fact I believed that doing so will result in a new INSERT which consequently will case the "Duplicate entry error".

Comment by domruf, on 2016-01-08 16:31

as for why I did it with multiple ifs:

I think it is more explicit and therefore is in the tradition of the second rule of The Zen of Python. :-)

Comment by Søren Løvborg, on 2016-01-08 16:31

Session.add does not correspond to a SQL INSERT, and adding an object that is already in the session does no harm... it's just redundant.

(All actual SQL statements are executed during session flush, which is e.g. triggered by explicitly committing the session and implicitly by performing queries.)

However, a newly created object must be added explicitly, otherwise SQLAlchemy won't know it exists.

Comment by domruf, on 2016-01-08 16:34

@kwi thank that is good to know. I'm not that familiar with SQLAlchemy.

Comment by domruf, on 2016-01-08 16:44

I see now that you already have addressed the rollback in d5707598fd64.

I will try to update to the new stable branch and report if I still have problems.

Comment by Thomas De Schampheleire, on 2018-05-19 15:19

Fixed with efce61aac33d