Skip to:
Content

BuddyPress.org

Opened 8 years ago

Closed 7 years ago

#7442 closed enhancement (fixed)

Use JOIN rather than Subquery on user search

Reported by: brandonliles's profile brandonliles Owned by: djpaul's profile DJPaul
Milestone: 3.0 Priority: normal
Severity: normal Version:
Component: Core Keywords: has-patch dev-feedback early
Cc:

Description

The user search relies on two subqueries. This is a known weakness of MySQL performance. Using joins will improve performance.

Attachments (3)

bp-search-use-joins.patch (2.2 KB) - added by brandonliles 8 years ago.
bp-search-use-joins2.patch (2.0 KB) - added by brandonliles 8 years ago.
bp-search-optimize.patch (2.3 KB) - added by brandonliles 8 years ago.
Subquery Materialization Approach

Download all attachments as: .zip

Change History (17)

#1 @r-a-y
8 years ago

Thanks for creating this ticket and for the patch, @brandonliles!

However, the patch is breaking multiple unit tests.

If you are able to, can you set up WordPress for PHP unit testing?
https://make.wordpress.org/core/handbook/testing/automated-testing/phpunit/

Once you have done so, navigate to the buddypress directory and run phpunit in a command prompt. Then, you will be able to see what testcases are breaking.

If you need further help in setting up PHPUnit, let us know.

#2 @brandonliles
8 years ago

Thanks for the feedback @r-a-y just uploaded a second version of the patch that doesn't break stuff :)

#3 @hnla
8 years ago

  • Keywords has-patch dev-feedback added
  • Milestone changed from Awaiting Review to 2.9

@brandonliles
8 years ago

Subquery Materialization Approach

#4 @brandonliles
8 years ago

After seeing the query still perform poorly on a site I took another look, although joins can improve performance, MySQL still tends to look at too many rows in the bp_activity table. Determining the matching user_ids first and plugging those values into the query is a major performance improvement.

New patch bp-search-optimize.patch added.

Last edited 8 years ago by brandonliles (previous) (diff)

This ticket was mentioned in Slack in #buddypress by hnla. View the logs.


8 years ago

#6 @r-a-y
8 years ago

Thanks for the new patch, @brandonliles.

bp-search-optimize.patch passes existing unit tests. The rationale behind the "query for IDs first" approach makes sense to me.

Looks good, but needs sign off from a lead dev. We might need some new unit tests to cover this change though.

Last edited 8 years ago by r-a-y (previous) (diff)

This ticket was mentioned in Slack in #buddypress by hnla. View the logs.


7 years ago

This ticket was mentioned in Slack in #buddypress by r-a-y. View the logs.


7 years ago

#9 @hnla
7 years ago

  • Keywords early added
  • Milestone changed from 2.9 to 3.0

#10 @boonebgorges
7 years ago

@brandonliles Thanks for the ticket and for your patches!

For some background on your original suggestion: We intentionally converted these (and other) user-related queries to subqueries a number of years ago. BuddyPress must remain compatible with configurations where the wp_users table does not live in the same database - or even the same server - as wp_bp_xprofile_data. See eg https://buddypress.trac.wordpress.org/ticket/4060#comment:25

Separating the subquery into a separate $wpdb->get_col() seems like an OK move to me. It'd be even better if we could use this opportunity to wrap a layer of caching around the search. Do you happen to have any benchmarks related to the performance improvements you've mentioned?

I think we have sufficient test coverage for this kind of change. I can't think of a situation where the suggested change would break something that wouldn't be detected by the tests we've already got.

#11 @DJPaul
7 years ago

Before I move this out the 3.0 milestone for inactivity, this path actually looks pretty close, and there's previous agreement from two of the team. It is very unfortunate there are no benchmarks, but if this is meant to be an improvement, we can add it to 3.0 but will be unable to any kind of performance number for the specific improvement.

As far as the patch looks, we need to wrap the new query parts in $wpdb->prepare(), I think.

@boonebgorges Is the new query not being cached a blocker? If so, how long should it be cached for, and when should the cache be cleared? It seems to me at a glance that we'd have to clear said cache every time the users table is written to (which we can't do, when that table is running as a WP global table).

Version 0, edited 7 years ago by DJPaul (next)

#12 @boonebgorges
7 years ago

No, caching is not a blocker.

It seems to me at a glance that we'd have to clear said cache every time the users table is written to -- which we can't do, when that table is running as a WP global table.

The general strategy here in WP is to hash the cache key with an incrementor, and then to expire caches by bumping the incrementor, which would happen on events like wp_insert_user() and wp_update_user(). But, while WP already does this for term and comment queries, and for some post queries, it doesn't have a last_changed incrementor for 'users'. See https://core.trac.wordpress.org/ticket/40613. I don't see a strong argument that BP should be handling this, so I agree that caching is not feasible at the moment.

#13 @DJPaul
7 years ago

  • Owner set to DJPaul
  • Status changed from new to assigned

OK, next week I'll drop the patch in (probably without the caching addition, due to time).

#14 @djpaul
7 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 11886:

Members, search: split user matching into seperate query.

Removing the subquery decreases the amount of rows MySQL needs to look at in the bp_activity table.
In the future, if we or WordPress add a user cache incrementor, we could cache this query.

Fixes #7442

Props brandonliles, r-a-y, boonebgorges

Note: See TracTickets for help on using tickets.