Skip to:
Content

BuddyPress.org

Opened 4 years ago

Last modified 2 months ago

#8413 assigned enhancement

Better optimize queries searching for @mentions

Reported by: yesbutmaybeno's profile yesbutmaybeno Owned by: espellcaste's profile espellcaste
Milestone: Under Consideration Priority: normal
Severity: normal Version:
Component: Performance Keywords: needs-patch
Cc:

Description

Right now if a user goes onto their "mentions" page this query is run:

# Query_time: 3.036380  Lock_time: 0.000062  Rows_sent: 0  Rows_examined: 772130
SET timestamp=1607682328;
SELECT DISTINCT a.id  FROM wp_bp_activity a  WHERE
(
        (
                a.content LIKE '%@bobtho<%'
                AND
                a.hide_sitewide = 0
        )
)
AND a.is_spam = 0 AND a.type NOT IN ('last_activity') ORDER BY a.date_recorded DESC, a.id DESC LIMIT 0, 21;

Via https://github.com/buddypress/BuddyPress/blob/d2223de16c4b525906aaa89597a6ccc30515f58d/src/bp-activity/classes/class-bp-activity-activity.php#L380

Unfortunately, this can take a very long time, depending on the amount of rows. In my case, it's 750,000+ activities, so if a user doesn't have "21" (the limit in the query) mentions, it has to scan all 750,000+ activities doing a LIKE '%%' search and can take upwards of 3,4,5,6+ seconds on my particular server.

My 1+ second "slow queries" log was about 80% filled with the above query.

Potential solution suggested here: https://buddypress.org/support/topic/slow-queries/

Change History (14)

#1 @yesbutmaybeno
4 years ago

Unsure if feasible or even helpful, would applying a FullText index on the content column and then changing the search query to utilize FTS help in this situation?

#2 @imath
4 years ago

  • Component changed from Core to Activity
  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to 8.0.0

Thanks for your report @yesbutmaybeno Let's try to improve this during the 8.0.0 development cycle.

@r-a-y do you have an idea about it?

#3 @yesbutmaybeno
4 years ago

Can confirm a fulltext index on content and using MATCH...AGAINST is a massive improvement (query takes 0.001s)

SELECT DISTINCT a.id  FROM wp_bp_activity a  WHERE
 (
         (
                 MATCH (a.content) AGAINST ('bobtho' IN BOOLEAN MODE)
		 AND
		 a.content LIKE '%@bobtho<%'
                 AND
                 a.hide_sitewide = 0
         )
 )
 AND a.is_spam = 0 AND a.type NOT IN ('last_activity') ORDER BY
 a.date_recorded DESC, a.id DESC LIMIT 0, 21

From my limited understanding, applying a FullText index on an InnoDB table requires MySQL 5.6.4+ *and* it also doesn't play nice with '@' in the query, which is why the LIKE is still included above, but it's a secondary filter on the rows produced by the fast MATCH...AGAINST above it.

This ticket was mentioned in Slack in #buddypress by imath. View the logs.


4 years ago

#5 @imath
4 years ago

  • Milestone changed from 8.0.0 to Up Next

I believe we won't have time to improve this during 8.0.0. Sorry about it.

#6 @imath
2 years ago

  • Milestone changed from Up Next to 12.0.0

#7 @imath
18 months ago

  • Milestone changed from 12.0.0 to Awaiting Contributions

We need a MySQL Ninja, who jumps in?

This ticket was mentioned in Slack in #buddypress by imath. View the logs.


10 months ago

#9 @espellcaste
10 months ago

  • Component changed from Activity to Performance
  • Milestone changed from Awaiting Contributions to Under Consideration
  • Owner set to espellcaste
  • Status changed from new to assigned

#10 @espellcaste
7 months ago

  • Milestone changed from Under Consideration to Up Next

#11 @imath
6 months ago

  • Milestone changed from Up Next to 15.0.0

#12 @espellcaste
3 months ago

  • Milestone changed from 15.0.0 to Up Next

Bumping to the next version.

#13 @espellcaste
2 months ago

  • Milestone changed from Up Next to Under Consideration

@yesbutmaybeno I think this is a great idea, however, we need to consider a few things here:

  1. a fulltext index can take a lot of time to index in a big database (a site with thousands or millions of activities). So we need to consider a good migration routine to account for large communities.
  1. The way the example was provided is not 1:1 with how this field, search_terms, works. This field can accept any string. And a fulltext index might help here, but since it doesn't support @, we might need to add something custom to support a handle in some specific situations. The LIKE operator is a good example, but it can't be the default behavior.

https://i.postimg.cc/gcS9hRwp/Clean-Shot-2024-11-04-at-11-44-20-2x.jpg

I'm inclined to think we could add this feature for new setups or maybe hide/show behind a feature flag.

Another idea would be to not bundle this into core and add documentation so that those with large number of activities can create the index themselves and update the SQL query to use the new index (the MATCH AGAINST) syntax.

cc: @imath

This ticket was mentioned in Slack in #buddypress by espellcaste. View the logs.


2 months ago

Note: See TracTickets for help on using tickets.