Kallithea issues archive

Issue #257: Missing DB-Indexes

Reported by: Thomas Güttler
State: new
Created on: 2016-12-06 09:51
Updated on: 2016-12-06 12:18

Description

I used this SQL query to check if there are missing indexes:

kallithea=> SELECT relname, seq_scan-idx_scan AS too_much_seq, case when seq_scan-idx_scan>0 THEN 'Missing Index?' ELSE 'OK' END, pg_relation_size(relname::regclass) AS rel_size, seq_scan, idx_scan
kallithea->  FROM pg_stat_all_tables
kallithea->  WHERE schemaname='public' AND pg_relation_size(relname::regclass)>80000 ORDER BY too_much_seq DESC;

Result

   relname    | too_much_seq |      case      | rel_size | seq_scan | idx_scan 
--------------+--------------+----------------+----------+----------+----------
 user_logs    |          638 | Missing Index? | 98467840 |      638 |        0
 repositories |      -864284 | OK             |    90112 |    84041 |   948325

There are a lot of rows in this table:

kallithea=> select count(*) from user_logs;
  count  
---------
 1051155

I guess this is not the root of our performance issue, but maybe this could be optimized nevertheless.

This is the table schema:

kallithea=> \d user_logs
                                             Tabelle »public.user_logs«
     Spalte      |             Typ             |                              Attribute                              
-----------------+-----------------------------+---------------------------------------------------------------------
 user_log_id     | integer                     | not null Vorgabewert nextval('user_logs_user_log_id_seq'::regclass)
 user_id         | integer                     | 
 username        | character varying(255)      | 
 repository_id   | integer                     | 
 repository_name | character varying(255)      | 
 user_ip         | character varying(255)      | 
 action          | text                        | 
 action_date     | timestamp without time zone | 
Indexe:
    "user_logs_pkey" PRIMARY KEY, btree (user_log_id)
Fremdschlüssel-Constraints:
    "user_logs_repository_id_fkey" FOREIGN KEY (repository_id) REFERENCES repositories(repo_id)
    "user_logs_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(user_id)

I see two solutions:

  1. cut this table from time to time

  2. use an index for this table.

Unfortunately I don't know which column (or combined columns) would need an index.

The source of this PostgreSQL Index Usage Analysis is here: http://stackoverflow.com/questions/3318727/postgresql-index-usage-analysis

Attachments

Comments

Comment by Mads Kiilerich, on 2016-12-06 12:06

That is on Kallithea 0.3.2?

Comment by Mads Kiilerich, on 2016-12-06 12:18

I don't know if these numbers point at an actual problem.

Instead, I suggest logging queries that actually are slow, like for example https://www.drupal.org/docs/7/guidelines-for-sql/logging-slow-sql-queries-server-side-in-postgresql (the part until postgreql has been restarted). Then check the queries that actually are slow.

I made a number of index tweaks in the Kallithea development branch based on such actual numbers. But I don't think I have seen any actual problems with user_logs.

You mention "our performance issue" - exactly what is that?