Kallithea issues archive

Issue #18: Search needs to be improved

Reported by: 557058%3Adea91e4c-e257-42be-bc28-2cf352c368c8
State: new
Created on: 2014-08-13 16:07
Updated on: 2015-01-22 13:38


Currently, the search in Kallithea is very limited. It's not possible to find a commit by author or date, for example. Something like Bitbucket's changeset filter box is needed.



Comment by Mads Kiilerich, on 2014-08-13 16:43

I plan to make some kind of revset reports/playground where revsets and custom changeset annotations can be shown in table or graph.

FWIW: the current whoosh based indexing doesn't work at our scale.

Comment by domruf, on 2014-10-16 09:05

I think pull requests should also be searchable

@kiilerix do you plan to replace whoosh with something like solr or elasticsearch?

Comment by Mads Kiilerich, on 2014-10-16 11:46

Indexing of source does not work at our scale and with our number of branches, so that is not something I will invest in. But it is great if others can improve it for other use cases.

Except for git support, I would prefer if there was a Mercurial extension that could provide the search. I don't know if it would be feasible to make the search engines pluggable. AFAIK the advantage of whoosh is that it is simple and pure Python. The others might be more work to set up.

But I agree that pull request search would be nice, also for us.

I guess the problems with whoosh could/should be fixed independently of switching backend and adding PR search.

Comment by domruf, on 2014-10-17 08:52

I'm not sure what kind of mercurial plugin you have in mind.

AFAIK a Mercurial plugin would require a pure Python implementation, right? I'm not really that familiar with search applications, but I think whoosh is the only pure Python indexing library. The "scalable" solutions I know (elasticsearch and solr) require a separate Java process.

Do you think whoosh would be feasible if we disabled the full text search and only put the metadata like commit messages, author, date, etc. into the index?

Comment by Mads Kiilerich, on 2014-10-17 12:27

Yes, indexing of metadata only would probably be useful for us.

I am also considering putting more metadata into the database so we can more efficiently "join" repo data with db data.

But I think the first step should be to get the existing whoosh stuff under control and figure out what are just bugs and what are inherent problems with the architecture.
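To make the "join repo data with db data" idea concrete, here is a minimal sketch using the standard library's sqlite3 module. The table names, column names, and sample rows are hypothetical and are not Kallithea's actual schema; it only illustrates how changeset metadata stored in the database could be filtered by author, date, and message while being joined with pull request data.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not taken from Kallithea's database model.
SCHEMA = """
CREATE TABLE changeset_meta (
    repo TEXT NOT NULL,
    rev TEXT NOT NULL,
    author TEXT NOT NULL,
    date TEXT NOT NULL,      -- ISO 8601, so string comparison sorts by time
    message TEXT NOT NULL,
    PRIMARY KEY (repo, rev)
);
CREATE TABLE pull_request (
    id INTEGER PRIMARY KEY,
    repo TEXT NOT NULL,
    head_rev TEXT NOT NULL,
    title TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO changeset_meta VALUES (?, ?, ?, ?, ?)",
        [
            ("demo", "a1", "alice@example.com", "2014-10-01T09:00:00", "Fix search form"),
            ("demo", "b2", "bob@example.com", "2014-10-02T10:30:00", "Add index rebuild"),
            ("demo", "c3", "alice@example.com", "2014-10-03T08:15:00", "Tune index schema"),
        ],
    )
    conn.executemany(
        "INSERT INTO pull_request VALUES (?, ?, ?, ?)",
        [(1, "demo", "c3", "Index tuning")],
    )
    return conn


def find_changesets(conn, repo, author=None, since=None, text=None):
    """Return (rev, author, date, message, pr_title) rows, newest first.

    pr_title is None when no pull request has the changeset as its head.
    """
    query = (
        "SELECT c.rev, c.author, c.date, c.message, p.title "
        "FROM changeset_meta c "
        "LEFT JOIN pull_request p ON p.repo = c.repo AND p.head_rev = c.rev "
        "WHERE c.repo = ?"
    )
    params = [repo]
    if author is not None:
        query += " AND c.author = ?"
        params.append(author)
    if since is not None:
        query += " AND c.date >= ?"
        params.append(since)
    if text is not None:
        # Case-insensitive substring match on the commit message.
        query += " AND instr(lower(c.message), lower(?)) > 0"
        params.append(text)
    query += " ORDER BY c.date DESC"
    return conn.execute(query, params).fetchall()
```

For example, `find_changesets(build_demo_db(), "demo", author="alice@example.com")` returns Alice's two changesets, with the pull request title attached to the one that is a pull request head.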

Comment by Thomas De Schampheleire, on 2015-01-22 08:52

FWIW, I also think we should be able to find commits easily, in particular in the following use case:

- Kallithea used as code review tool
- developers force pushing into a review fork
- developer creating a pull request to start review

If 5 developers push their code at the same time to the review fork, and afterwards one of the developers wants to create the pull request off of his head, he cannot easily find it again. This could be mitigated by:

a. implementing a way to filter on commit author, date, commit message, etc. (as described earlier in this issue), and/or
b. adding a way to see all heads of a repository (ordered by date).
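Option b can be sketched in a few lines of plain Python. The `Commit` record and the `find_heads` helper below are hypothetical names used for illustration only; in practice the data would come from the repository backend. A head is taken here to be a commit that is not the parent of any other commit.

```python
from datetime import datetime
from typing import NamedTuple, Tuple


class Commit(NamedTuple):
    """Hypothetical commit record; not a Kallithea or Mercurial class."""
    rev: str
    author: str
    date: datetime
    parents: Tuple[str, ...]


def find_heads(commits, author=None):
    """Return commits without children, newest first.

    If author is given, only heads by that author are returned.
    """
    # Every revision that appears as somebody's parent has a child.
    parent_revs = {p for c in commits for p in c.parents}
    heads = [c for c in commits if c.rev not in parent_revs]
    if author is not None:
        heads = [c for c in heads if c.author == author]
    return sorted(heads, key=lambda c: c.date, reverse=True)
```

With several developers pushing to the same review fork, calling `find_heads(commits, author="alice")` would narrow the list down to the heads that developer is likely looking for.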

Comment by Mads Kiilerich, on 2015-01-22 13:21

We should also expose Mercurial revsets, just like hgweb does. I think that would be a better solution to some of these use cases ... but it would not cover git.

Comment by Andrej Shadura, on 2015-01-22 13:38

As far as I remember, git has something of its own (see gitrevisions(7)), but it seems to be less powerful... we could "compile" hg revsets into git revisions and apply some more magic to cover unsupported features...
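As a rough illustration of what "compiling" could look like, the sketch below translates a very small subset of revset syntax into `git log` arguments. It is an assumption-laden toy, not a real revset parser: it handles only `author(...)` and `desc(...)` terms joined by `and`, at most one of each, and the function name `revset_to_git_log_args` is hypothetical. It relies on the `git log` options `--author`, `--grep`, `-i` (ignore case), and `-F` (fixed strings) to approximate the case-insensitive substring matching of the two revset predicates.

```python
import re

# Only these two predicates are handled; everything else is rejected.
_TERM = re.compile(r"^(author|desc)\(\s*(.+?)\s*\)$")
_GIT_OPTION = {"author": "--author", "desc": "--grep"}


def revset_to_git_log_args(revset):
    """Translate e.g. "author(alice) and desc(fix)" into git log arguments.

    Raises ValueError for anything outside the tiny supported subset.
    Values that themselves contain " and " are not supported.
    """
    # -i: ignore case, -F: treat patterns as fixed strings, not regexes.
    args = ["git", "log", "-i", "-F"]
    seen = set()
    for term in re.split(r"\s+and\s+", revset.strip()):
        match = _TERM.match(term)
        if match is None:
            raise ValueError("unsupported revset term: %r" % term)
        name, value = match.groups()
        if name in seen:
            # Repeated --author/--grep options are OR-ed by git, which
            # would not match the meaning of "and", so refuse them.
            raise ValueError("repeated predicate: %r" % name)
        seen.add(name)
        # Strip one pair of matching quotes around the value, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        args.append("%s=%s" % (_GIT_OPTION[name], value))
    return args
```

The returned list could be passed to `subprocess.run` inside a git checkout. Anything beyond this subset (dates, ancestry operators, `or`, nesting) would need a real parser and, as noted above, "some more magic".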