I enabled auto_explain
with auto_explain.log_min_duration = '100ms'
in my testing instance (~34000 tickets), and maybe I’ve already found the first candidate
LOG: duration: 152.469 ms plan:
Query Text: SELECT "activity_streams".* FROM "activity_streams" WHERE ((permission_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52) AND group_id is NULL) OR (permission_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52) AND group_id IN (4,1,2,3,6,7,8,9)) OR (permission_id is NULL AND group_id IN (4,1,2,3,6,7,8,9))) ORDER BY created_at DESC, id DESC LIMIT $1
Limit (cost=20407.36..20407.42 rows=25 width=44)
-> Sort (cost=20407.36..20624.18 rows=86730 width=44)
Sort Key: created_at DESC, id DESC
-> Seq Scan on activity_streams (cost=0.00..17959.90 rows=86730 width=44)
Filter: (((permission_id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52}'::integer[])) AND (group_id IS NULL)) OR ((permission_id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52}'::integer[])) AND (group_id = ANY ('{4,1,2,3,6,7,8,9}'::integer[]))) OR ((permission_id IS NULL) AND (group_id = ANY ('{4,1,2,3,6,7,8,9}'::integer[]))))
Not sure though how this would look on a production instance; as I’ve recently dumped a lot of mail into Zammad, my activity_streams
might be unrealistically big. Entries older than 3 months are automatically pruned from the table, so maybe this is actually irrelevant in the real world. But then again, my activity_streams
is 8576 kB big, which is actually smaller than the 11 MB of MrGeneration’s instance, so maybe not?
The problem here probably doesn’t lie with the table - it has the necessary indices. PostgreSQL is probably using a Seq Scan here because the query returns almost all rows, so an index lookup would only make it slower. For some reason Zammad seems to load the activity stream without setting a LIMIT. I’m not sure how to determine the value of $1
, but the fact that the query is returning 86730 (of 112383) rows leads me to believe that is NULL
. As Zammad only displays the first 25 events on the dashboard anyway, using a LIMIT seems advisable?
Hm, but even with a LIMIT it doesn’t get much better:
zammad_production=# explain analyze SELECT "activity_streams".* FROM "activity_streams" WHERE ((permission_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52) AND group_id is NULL) OR (permission_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52) AND group_id IN (4,1,2,3,6,7,8,9)) OR (permission_id is NULL AND group_id IN (4,1,2,3,6,7,8,9))) ORDER BY created_at DESC, id DESC limit 100;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=21274.66..21274.91 rows=100 width=44) (actual time=79.401..79.413 rows=100 loops=1)
-> Sort (cost=21274.66..21491.48 rows=86730 width=44) (actual time=79.399..79.403 rows=100 loops=1)
Sort Key: created_at DESC, id DESC
Sort Method: top-N heapsort Memory: 32kB
-> Seq Scan on activity_streams (cost=0.00..17959.90 rows=86730 width=44) (actual time=0.020..59.270 rows=112382 loops=1)
Filter: (((permission_id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52}'::integer[])) AND (group_id IS NULL)) OR ((permission_id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52}'::integer[])) AND (group_id = ANY ('{4,1,2,3,6,7,8,9}'::integer[]))) OR ((permission_id IS NULL) AND (group_id = ANY ('{4,1,2,3,6,7,8,9}'::integer[]))))
Rows Removed by Filter: 1
Planning time: 0.444 ms
Execution time: 79.456 ms
(9 Zeilen)
The problem seems to be caused by the GROUP BY
- without GROUP BY
and with LIMIT 100
, the query runs in 0.091 ms.
This fixes it:
CREATE INDEX ON activity_streams (created_at, id);
zammad_production=# explain analyze SELECT "activity_streams".* FROM "activity_streams" WHERE ((permission_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52) AND group_id is NULL) OR (permission_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52) AND group_id IN (4,1,2,3,6,7,8,9)) OR (permission_id is NULL AND group_id IN (4,1,2,3,6,7,8,9))) ORDER BY
created_at DESC, id DESC LIMIT 100;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..25.47 rows=100 width=44) (actual time=0.030..0.114 rows=100 loops=1)
-> Index Scan Backward using activity_streams_created_at_id_idx on activity_streams (cost=0.42..21646.49 rows=86407 width=44) (actual time=0.029..0.104 rows=100 loops=1)
Filter: (((permission_id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52}'::integer[])) AND (group_id IS NULL)) OR ((permission_id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,47,50,52}'::integer[])) AND (group_id = ANY ('{4,1,2,3,6,7,8,9}'::integer[]))) OR ((permission_id IS NULL) AND (group_id = ANY ('{4,1,2,3,6,7,8,9}'::integer[]))))
Planning time: 0.362 ms
Execution time: 0.135 ms
(5 Zeilen)
Apparently PostgreSQL is unable to combine the indices index_activity_streams_on_created_at
activity_streams_pkey
and requires a multi column index for that purpose