Opened 8 years ago
Last modified 7 weeks ago
#7500 assigned enhancement
Harmful bp_activity indexes
Reported by: | brandonliles | Owned by: | espellcaste |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | |
Component: | Performance | Keywords: | has-patch |
Cc: |
Description
The indexes on the bp_activity table hide_sitewide
and is_spam
are harmful for query performance since the columns being indexed are tinyint(1). If MySQL uses these indexes it will likely result in table scans.
A far more selective index would be:
KEY component_type_user_id_date_recorded
(component
, type
, user_id
, date_recorded
)
Attachments (3)
Change History (19)
This ticket was mentioned in Slack in #buddypress by hnla. View the logs.
7 years ago
#4
@
7 years ago
Hi @johnjamesjacoby ,
On our site we have a little over 1.6 million rows in the bp_activity table. Most of the time MySQL was using a decent query plan, but several times an hour we were seeing MySQL use the hide_sitewide and is_spam indexes for the activity loop which was about the same performance as a table scan. On our system that was resulting in a query that averaged 6 seconds. In the graph I've posted here you see the result of our index adjustment. With those indexes removed, we're no longer seeing that bad query plan.
#5
@
7 years ago
Thanks for the quick reply!
Is that just from removing the hide_sitewide
and is_spam
alone? Or is that also with your custom compound key?
component_type_user_id_date_recorded (component, type, user_id, date_recorded)
#6
@
7 years ago
The graphs/data include that compound key that we added after analyzing the output from EXPLAIN.
#7
@
7 years ago
- Component changed from Core to Performance
- Milestone changed from Awaiting Review to Under Consideration
@brandonliles - So is 01.patch
what you are recommending? Or did you keep the existing indexes on hide_sitewide
and is_spam
intact?
My SQL-fu is not that great, so I'll leave it to you guys to discuss.
#8
@
7 years ago
@r-a-y 01.patch is exactly what I'm recommending. I didn't keep the existing indexes, due to their low selectivity they are more likely to cause the optimizer to use a poor query plan.
#9
@
7 years ago
- Keywords has-patch added; 2nd-opinion removed
- Type changed from defect (bug) to enhancement
02.patch
adds an upgrade routine to drop the hide_sitewide
and is_spam
indexes and upgrades the activity DB table to add the new, proposed index.
This is what shows up for SHOW INDEX FROM wp_bp_activity
afterwards:
$ wp db query "SHOW INDEX FROM wp_bp_activity" +----------------+------------+--------------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +----------------+------------+--------------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | wp_bp_activity | 0 | PRIMARY | 1 | id | A | 713 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | date_recorded | 1 | date_recorded | A | 713 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | user_id | 1 | user_id | A | 35 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | item_id | 1 | item_id | A | 89 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | secondary_item_id | 1 | secondary_item_id | A | 356 | NULL | NULL | YES | BTREE | | | | wp_bp_activity | 1 | component | 1 | component | A | 13 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | type | 1 | type | A | 32 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | mptt_left | 1 | mptt_left | A | 33 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | mptt_right | 1 | mptt_right | A | 39 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | component_type_user_id_date_recorded | 1 | component | A | 13 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | component_type_user_id_date_recorded | 2 | type | A | 39 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | component_type_user_id_date_recorded | 3 | user_id | A | 118 | NULL | NULL | | BTREE | | | | wp_bp_activity | 1 | component_type_user_id_date_recorded | 4 | date_recorded | A | 713 | NULL | NULL | | BTREE | | | +----------------+------------+--------------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
#10
@
7 years ago
- Milestone changed from Under Consideration to 3.0
Would like to tackle this in v3.0.
Howdy @brandonliles!
Thanks for opening this ticket and detailing your findings.
It's generally true that indexes on columns with low cardinality is ineffective, however, benchmarks to confirm that full table scans are currently a problem would be useful. That way we can also confirm removing those indexes will improve performance as anticipated.
Once we've done that, an upgrade routine will need to be introduced that comfortably locks those tables while the indexes are adjusted. Since the Activity table is highly likely to have many, many rows, even index removal is not a simple or painless process.
TL;DR - We need: