Skip to:
Content

Opened 13 days ago

Last modified 12 days ago

#7500 new defect (bug)

Harmful bp_activity indexes

Reported by: brandonliles Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Core Keywords: 2nd-opinion
Cc:

Description

The indexes on the bp_activity table hide_sitewide and is_spam are harmful for query performance since the columns being indexed are tinyint(1). If MySQL uses these indexes it will likely result in table scans.

A far more selective index would be:
KEY component_type_user_id_date_recorded (component, type, user_id, date_recorded)

Attachments (1)

24hour-window-very-clear-effect.png (166.3 KB) - added by brandonliles 12 days ago.
Index Adjustment Query Impact Graph 1

Download all attachments as: .zip

Change History (7)

#1 @hnla
13 days ago

  • Keywords 2nd-opinion added

This ticket was mentioned in Slack in #buddypress by hnla. View the logs.


12 days ago

#3 @johnjamesjacoby
12 days ago

Howdy @brandonliles!

Thanks for opening this ticket and detailing your findings.

It's generally true that indexes on columns with low cardinality is ineffective, however, benchmarks to confirm that full table scans are currently a problem would be useful. That way we can also confirm removing those indexes will improve performance as anticipated.

Once we've done that, an upgrade routine will need to be introduced that comfortably locks those tables while the indexes are adjusted. Since the Activity table is highly likely to have many, many rows, even index removal is not a simple or painless process.

TL;DR - We need:

  • Benchmarks
  • Confirmation
  • Safe & friendly upgrade routine
Last edited 12 days ago by johnjamesjacoby (previous) (diff)

@brandonliles
12 days ago

Index Adjustment Query Impact Graph 1

#4 @brandonliles
12 days ago

Hi @johnjamesjacoby ,

On our site we have a little over 1.6 million rows in the bp_activity table. Most of the time MySQL was using a decent query plan, but several times an hour we were seeing MySQL use the hide_sitewide and is_spam indexes for the activity loop which was about the same performance as a table scan. On our system that was resulting in a query that averaged 6 seconds. In the graph I've posted here you see the result of our index adjustment. With those indexes removed, we're no longer seeing that bad query plan.

#5 @johnjamesjacoby
12 days ago

Thanks for the quick reply!

Is that just from removing the hide_sitewide and is_spam alone? Or is that also with your custom compound key?

component_type_user_id_date_recorded (component, type, user_id, date_recorded)

#6 @brandonliles
12 days ago

The graphs/data include that compound key that we added after analyzing the output from EXPLAIN.

Note: See TracTickets for help on using tickets.