Issue #306: Internal server error when passing-in large context value for changeset with Git repository
|Reported by:||Branko Majic|
|Created on:||2018-02-06 06:31|
|Updated on:||2018-05-18 19:39|
When attempting to view a changeset in a Git repository while passing-in a large context value to the controller, an internal server error (500) will be thrown at user.
This affects both the
stable branch. Below reproduction steps etc are centered around
stable branch will have similar result with different tracebacks.
Set-up Kallithea (minimal development environment described in
contributing.rstshould suffice. Below instructions assume the user will be called
adminfor simplicity sake.
Log-in into Kallithea and create a new Git repository called
Clone the empty repository:
cd /tmp/ git clone http://admin@localhost:5000/test-changeset-context-500
Add a file to repository:
cd /tmp/test-changeset-context-500 echo "This is the initial file revision." > README.rst git add README.rst git commit -m "Added README.rst."
Make a change to committed file:
cd /tmp/test-changeset-context-500 echo "This is the second file revision." > README.rst git add README.rst git commit -m "Added README.rst."
Push the changes (at the moment of this writing some errors might end-up being reported by Kallithea, but those are not relevant for this particular issue):
cd /tmp/test-changeset-context-500 git push
Log-in into Kallithea and open the changelog page for test-changeset-context-500 repository.
Click on the top changeset.
Modify the resulting URL to changeset by appending
?context=2147483648to it, and open it in a browser. For example:
- In step (9), changeset is shown to user with all the relevant details.
In step (9), a
500 Internal Server Errorerror is shown to the user, with the following traceback:
Traceback (most recent call last): File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/appwrappers/session.py", line 71, in __call__ response = self.next_handler(controller, environ, context) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/appwrappers/i18n.py", line 71, in __call__ return self.next_handler(controller, environ, context) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/wsgiapp.py", line 285, in _dispatch return controller(environ, context) File "/home/user/projects/kallithea/kallithea/lib/base.py", line 553, in __call__ return super(BaseController, self).__call__(environ, context) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/controllers/dispatcher.py", line 119, in __call__ response = self._perform_call(context) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/controllers/dispatcher.py", line 108, in _perform_call r = self._call(action, params, remainder=remainder, context=context) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/controllers/decoratedcontroller.py", line 119, in _call output = controller_caller(context_config, bound_controller_callable, remainder, params) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/decorators.py", line 44, in _decorated_controller_caller return application_controller_caller(tg_config, controller, remainder, params) File "/home/user/.virtualenvs/kallithea/lib/python2.7/site-packages/tg/configuration/app_config.py", line 127, in call_controller return controller(*remainder, **params) File "<decorator-gen-50>", line 2, in index File "/home/user/projects/kallithea/kallithea/lib/auth.py", line 810, in __wrapper return func(*fargs, **fkwargs) File "<decorator-gen-49>", line 2, in index File "/home/user/projects/kallithea/kallithea/lib/auth.py", line 860, in __wrapper return func(*fargs, **fkwargs) File "/home/user/projects/kallithea/kallithea/controllers/changeset.py", line 332, in index return self._index(revision, method=method) File "/home/user/projects/kallithea/kallithea/controllers/changeset.py", line 278, in _index diff_limit=diff_limit) File "/home/user/projects/kallithea/kallithea/lib/diffs.py", line 302, in __init__ self.parsed = self._parse_gitdiff(inline_diff=inline_diff) File "/home/user/projects/kallithea/kallithea/lib/diffs.py", line 378, in _parse_gitdiff chunks, added, deleted = _parse_lines(diff_lines) File "/home/user/projects/kallithea/kallithea/lib/diffs.py", line 570, in _parse_lines raise Exception('error parsing diff @@ line %r' % line) Exception: error parsing diff @@ line u'@@ -2147483649,4294967295- +2147483649,4294967295- @@\n'
The issue stems from Git itself, most likely due to use of
long int for storing request context passed-in as part of a
git diff call. Looking at
kallithea.lib.vcs.backends.git.repository.GitRepository.get_diff one can see the diff will be called with:
git diff -U2147483648 --full-index --binary -p -M --abbrev=40 CHANGESET1 CHANGESET2
If we run the same command direclty with git on cloned repository, the output will look roughly as:
diff --git a/README.rst b/README.rst index 6fdfae0f1c6e146d76d31504b1cfabb5b63ab8c8..a262de6fb2424cf167b409671354ce706add7153 100644 --- a/README.rst +++ b/README.rst @@ -2147483649,4294967295- +2147483649,4294967295- @@ -This is the initial file revision. +This is the second file revision.
Note that the
@@ line essentially looks like it essentially suffers from integer overflow. Should we try to reduce the context by one (to
2147483647), we'll get output:
diff --git a/README.rst b/README.rst index 6fdfae0f1c6e146d76d31504b1cfabb5b63ab8c8..a262de6fb2424cf167b409671354ce706add7153 100644 --- a/README.rst +++ b/README.rst @@ -1 +1 @@ -This is the initial file revision. +This is the second file revision.
Within Kallithea itself, the problem arises when the regex
kallithea.lib.diffs._chunk_re fails to match
old_line, old_end, new_line, new_end at
kallithea/lib/diffs.py:567, due to extra minus sign after
Mercurial does not seem to suffer from this limitation, even when passing huge values to
Ideally, this should be fixed in Git, but it might be much easier to fix it in Kallithea. Not sure what the best course of action should be, with some possibilities being:
Throw a specific error at user stating that the context value has been exceeded.
Silently "truncate" the context value to the next valid one.
"Truncate" the context value to the next valid one, but show a warning message to user. This might give the best consistency accross Mercurial/Git, while providing at least some feedback on passed-in value being incorrect.
Another thing to keep in mind is that this could be done either at controller or
kallithea.lib.vcs.backends.git.repository.GitRepository.get_diff level. Doing it at
get_diff level might be more future-proof, but then comes the issue of how to pass on a warning to controller. Maybe doing it on both sides could prove useful - so controller will sanitize and produce warning message for user, while
get_diff would sanitize and log a warning instead (so any possible future controller implementation would be aware of it in some way).
Comment by Mads Kiilerich, on 2018-02-06 09:09
This seems like a very edge case, triggered by failure elsewhere. Failing badly on it seems ok-ish.
To handle it better, can we just accept and ignore the extra trailing
- in the regexp? And add test coverage in test_diff_parsers.py .
I guess the git issue could have bigger impact elsewhere - perhaps even security implications. That should probably be reported to Git.
Comment by Branko Majic, on 2018-02-06 15:38
Actually, I get frequent mails from my web server, where some crawler bot (or something more malicious) is happily hammering away with this specific value. I'll try to get in contact with Git people, although from what I've seen this thing in particular does not trigger any issues. I'm suspecting it could be a bug in a simple
printf of sorts (but, as you said - you never know, would be interesting to see what happens when attempting to apply a patch).
The regexp could be expanded for this, although it'd look a bit ugly - mainly because Mercurial repos would not produce similar output (e.g. consistency). Want me to go down that route (updating regex) then?
Comment by Branko Majic, on 2018-02-09 17:21
Somewhat similar issue happens when passing-in negative value as well (in case of Git, subprocess exits with error code, in case of Mercurial an "invalid list index" error is thrown).
Comment by Thomas De Schampheleire, on 2018-05-18 19:39
Fixed with 55d2b08d9c44