Skip to:
Content

BuddyPress.org

Opened 10 years ago

Closed 8 years ago

Last modified 8 years ago

#5451 closed enhancement (fixed)

Groups query overhaul

Reported by: boonebgorges's profile boonebgorges Owned by: boonebgorges's profile boonebgorges
Milestone: 2.7 Priority: high
Severity: normal Version:
Component: Groups Keywords: has-patch commit
Cc:

Description

In the past few versions of BP, we've overhauled the main members and activity queries. This has given us huge boosts in flexibility, consistency, cacheability, and performance.

Next I would like to tackle the Groups component. At a minimum:

  • Refactor the main query so that it's split into two parts: one to get matching IDs, and another to fill in the data for the matched IDs.
  • Once the query is split, implement some object caching for the main get() method.

It may also be worth thinking about reorganizing into a BP_Group_Query class, akin to BP_User_Query.

Related tickets: #5099, #5407

Attachments (4)

5451.diff (14.9 KB) - added by boonebgorges 8 years ago.
5451.2.diff (6.5 KB) - added by boonebgorges 8 years ago.
5451.isset.patch (1.3 KB) - added by r-a-y 8 years ago.
5451.forum_id.patch (894 bytes) - added by r-a-y 8 years ago.

Download all attachments as: .zip

Change History (49)

This ticket was mentioned in IRC in #buddypress-dev by boonebgorges. View the logs.


10 years ago

#2 @johnjamesjacoby
10 years ago

Let's consider moving last_activity out of meta and into the bp_groups table proper with an additional column. I don't love that it's inconsistent with Member activity, but there's no sense in trying to query meta if we don't have to.

We could do similar with invite_status and total_member_count since queries are ran against both also.

#3 @r-a-y
10 years ago

total_member_count is a good candidate to be moved into the main bp_groups table since group template queries rely on it.

I'm not so sure about invite_status.

#4 @DJPaul
10 years ago

  • Milestone changed from 2.1 to 2.2

#5 @boonebgorges
10 years ago

In 9082:

Fix group meta query SQL manipulation.

The wildcard operators were acting too greedily, preventing some JOIN clauses
from being recognized and resulting in invalid SQL syntax. This problem had
previously been masked by ideosyncracies of WP's WP_Meta_Query SQL syntax,
which changed in [WP29887].

This change should be considered a band-aid fix for what is already a band-aid
fix. The proper resolution for this issue is to migrate away from the comma
syntax for JOINs used in BP_Groups_Group. See #5874, #5451.

#6 @DJPaul
9 years ago

  • Milestone changed from 2.2 to 2.3

This ticket was mentioned in Slack in #buddypress by mamaduka. View the logs.


9 years ago

#8 @DJPaul
9 years ago

  • Milestone changed from 2.3 to Future Release

#9 @boonebgorges
8 years ago

  • Milestone changed from Future Release to 2.7

Let's split the query and add caching for 2.7. See #6643.

#10 @boonebgorges
8 years ago

See also #6358. Making BP_Groups_Group::get() return an array of BP_Groups_Group objects will help to eliminate some unneeded bits of this query.

@boonebgorges
8 years ago

#11 @boonebgorges
8 years ago

  • Keywords has-patch 2nd-opinion added

5451.diff contains the first steps toward overhauling the query. It's the "first steps" because so far I've only refactored what we've currently got. No additional caching or explicit performance improvements are included here (with one exception - see below). I figured it'd be easier to get a review and/or a second opinion on the approach if I broke things up. Here's what the patch does:

  • Instead of doing a single query to fetch all groups matching the get() params, there are now two queries: an ID query and a cache-priming query. The old query looked like this:
SELECT DISTINCT g.id, g.*, gm1.meta_value AS total_member_count, gm2.meta_value AS last_activity
FROM wp_bp_groups_groupmeta gm1, wp_bp_groups_groupmeta gm2, wp_bp_groups g
WHERE g.id = gm1.group_id
AND g.id = gm2.group_id
AND gm2.meta_key = 'last_activity'
AND gm1.meta_key = 'total_member_count'
ORDER BY last_activity DESC
LIMIT 0, 20

while the new ones look like this:

SELECT DISTINCT g.id
FROM wp_bp_groups g JOIN wp_bp_groups_groupmeta gm_last_activity on ( g.id = gm_last_activity.group_id )
WHERE gm_last_activity.meta_key = 'last_activity'
ORDER BY gm_last_activity.meta_value DESC
LIMIT 0, 20

SELECT g.*
FROM wp_bp_groups g
WHERE g.id IN (63,68,71,70,69,18,66,46,20,61,44,32,45,62,67,24,58,65,27,2)
  • As you can see, a costly table join has been removed. This table join was previously used to get the total_member_count. We only need this join when type=popular - ie, when we're sorting by total_member_count. See the section with the comment // Ordering by 'total_member_count' requires another table join.
  • BP_Groups_Group::get() now returns an array of BP_Groups_Group objects. See #6358. This has lots of benefits. The important one for the purposes of this ticket is that it centralizes cache management.
  • last_activity and total_member_count are no longer loaded into the group object by default. Instead, they've been made protected and are accessible by magic method. This reduces overhead when the values are not needed, and points the way to a future where we can do away with populate_extras altogether.
  • The rebuilding of the query syntax means that we can do away with all the funky string manipulation we were doing with tax_query and meta_query. So 5451.diff fixes #5099 as well.

Next steps, once this patch is in:

  • Incrementor-based caching for the ID query. See #6643.
  • Better pre-loop caching for group admins. With object caching disabled, bp_get_group_join_button() is causing a query for every group in the loop. #5711 added cache support for get_group_administrator_ids(), so all we'd need is single query at the beginning of a group loop that pre-fetches these admin lists and puts them into the cache.
  • Move some/all of the populate_extras stuff into magic methods, where they can be lazy-loaded. This will make our cache treatment more standardized, since we won't be maintaining separate versions of the object.

This ticket was mentioned in Slack in #buddypress by boone. View the logs.


8 years ago

#13 @dcavins
8 years ago

This first patch is great. It makes the SQL query building much easier to read. Far and away the most exciting change from my perspective is that groups_get_groups() returns honest-to-goodness BP_Groups_Group objects without incurring any meaningful overhead.

The extra query that is introduced is very fast and well worth the cache-friendliness for the future.

Thanks for the work on this! Let me know if there's any part of this update I can help with.

Last edited 8 years ago by dcavins (previous) (diff)

This ticket was mentioned in Slack in #buddypress by boone. View the logs.


8 years ago

This ticket was mentioned in Slack in #buddypress by boone. View the logs.


8 years ago

#16 @boonebgorges
8 years ago

I've found a couple uses of the filters. Only a few in public plugins:

On GitHub (I've excluded anything that I wrote or maintain myself). Many of these appear to be copy-paste duplicates of each other:

This covers most of the uses I could spot in the wild.

So, some things will break. It *might* be possible to salvage some of these by continuing to assemble the $sql array that gets passed to the filter, even if it's never used in BP itself; but this will require lots of weird code.

Given the above, I think we're OK to make the change, with associated documentation and posts on bpdevel. I'd like to hear what others think about that - especially @DJPaul and @r-a-y (@dcavins has already voiced his support :) )

#17 @r-a-y
8 years ago

Thumbs up to the patch here as well.

One thing to add is maybe we can query for groups without the last_activity groupmeta JOIN altogether if orderby is empty or something like that.

#18 @boonebgorges
8 years ago

In 11071:

Groups: Overhaul BP_Groups_Group::get() SQL query.

The new query follows the WP and BP convention of "split" queries: one
for the IDs matching the query parameters, and a second one for the
objects corresponding to the matched IDs. This configuration allows for
improved caching and better performance, especially on installations
with large numbers of groups.

The rewrite serves as a convenient excuse to address a number of
longtime annoyances with the way group queries work:

  • Previously, comma syntax was used for table joins, rather than the JOIN keyword. This required some string manipulation when using external tools for generating SQL fragments, such as WP_Tax_Query and WP_Meta_Query. See #5099, #5874. We now use the more standard JOIN syntax.
  • The logic for assembling the "total" query is overwhelmingly simpler.
  • Previously, group queries required three joined tables: the groups table, one groupmeta table (for last_activity), and a second groupmeta table (for total_member_count). After this changeset, these tables will only be joined when needed for sorting (orderby or type). The last_activity and total_member_count properties, when needed for display in a group loop, are lazyloaded from groupmeta - where they're precached by default (see update_meta_cache) - rather than pulled as part of the primary query, and are made available via __get() for backward compatibility.

See #5451, #5874. Fixes #5099.

#19 @boonebgorges
8 years ago

Thanks for the look, @r-a-y! Good catch about the last_activity join - I took your suggestion in [11071]. I think that'll have a pretty huge positive effect in certain use cases.

#20 @boonebgorges
8 years ago

In 11072:

Cache the results of group ID and count queries.

Query results are cached using an incrementor, and are invalidated in
bulk whenever a group is created, updated, or deleted, or when
groupmeta is modified.

See #5451.

#21 @boonebgorges
8 years ago

In 11073:

Add actions to BP taxonomy wrappers that modify the database.

These actions mirror those in WP's wp_set_object_terms() and
wp_remove_object_terms(), but pass a full complement of parameters.
See https://core.trac.wordpress.org/ticket/38006.

The new actions can be used, in part, for cache invalidation.

See #5451.

#22 @boonebgorges
8 years ago

In 11074:

Groups: Bust incrementor cache when modifying group taxonomy terms.

Group taxonomy terms, such as those belonging to bp_group_type, can
affect the results of BP_Groups_Group::get() queries. As such,
changing a group taxonomy term must invalidate the query cache.

WP's taxonomy API doesn't provide a direct method to determine whether
the object whose terms are being modified is a BuddyPress group. So
we infer the object type by looking at the object types that are
registered for the specified taxonomy; if bp_group is one of them, we
invalidate. This could result in cases where the cache is being purged
unnecessarily, but better safe than sorry.

See #5451.

#23 @boonebgorges
8 years ago

In 11075:

Use strict mode for in_array() when possible.

Introduced in [11074].

See #5451.

@boonebgorges
8 years ago

#24 @boonebgorges
8 years ago

Better pre-loop caching for group admins. With object caching disabled, bp_get_group_join_button() is causing a query for every group in the loop. #5711 added cache support for get_group_administrator_ids(), so all we'd need is single query at the beginning of a group loop that pre-fetches these admin lists and puts them into the cache.

5451.2.diff implements this. It saves 19 queries when viewing the group directory without persistent caching, which is nice :-D @r-a-y - you're the author of the original caching in #5711. Would you mind giving my approach a sanity check? The weirdness with stdClass is to ensure compatibility with the existing format of the cache values.

#25 @r-a-y
8 years ago

  • Keywords 2nd-opinion removed

I like the general approach, @boone. Patch works as advertised!

A couple of things:

  • This problem only occurs for people who are logged-in that are members of the group in the loop: https://buddypress.trac.wordpress.org/browser/trunk/src/bp-groups/bp-groups-template.php?marks=3245#L3219. So the query problem isn't that bad unless all the groups in the Groups Directory are all yours or if you are a member of all of them :)
  • Can we be conservative with the default value of the update_admin_cache parameter? We should default to true if on the Groups Directory or on a user's groups page ( bp_is_groups_directory() || bp_is_user_groups() )
  • Likewise, we could also add moderator ID caching, so we can save a query on a single group page.

#26 @boonebgorges
8 years ago

This problem only occurs for people who are logged-in that are members of the group in the loop

Quite right! Can you tell what kind of installation I've been testing on? :)

Can we be conservative with the default value of the update_admin_cache parameter?

Good idea. Let's do it.

Likewise, we could also add moderator ID caching, so we can save a query on a single group page.

OK. I'll do it as part of the next part, where I try to eliminate the scourge that is populate_extras().

#27 @boonebgorges
8 years ago

In 11086:

Groups: Prime group administrator cache in group loops.

This saves a database query for each group of which the logged-in user
is a member.

Props r-a-y.
See #5451.

#28 @boonebgorges
8 years ago

In 11087:

Groups: Improve query efficiency for 'admins' and 'mods' properties of group objects.

Previously, the 'admins' and 'mods' property of BP_Groups_Group
objects were only populated when setting the 'populate_extras' flag.
Even then, the query used to populate these properties was uncached,
and required a join against a global table.

This changeset reworks the way that the 'admins' and 'mods' properties
are accessed and set. The properties are now marked protected, and
are accessible by magic __get(). When requested, the cache for the
both properties is set by a single pair of queries: one to fetch
membership data from the BP table, and one to get user objects from
WordPress. The BP table query is cached, and neither query takes place
if the property is never accessed.

This moves us a step closer to eliminating the populate_extras flag
on BP_Groups_Group objects.

See #5451.

#29 @boonebgorges
8 years ago

In 11089:

Groups: Remove 'populate_extras' flag from BP_Groups_Group.

populate_extras was originally designed to reduce the number of
additional queries performed when setting up a group, by making those
queries optional in cases where the additional data was not needed.
Now, all of these properties are adequately cached, and lazy-loaded
so that they're only fetched when requested.

See #5451.

#30 @boonebgorges
8 years ago

In 11090:

Accept a group ID instead of an array in groups_get_group().

The fact that groups_get_group() required an array was an annoying
side-effect of the fact that BP_Groups_Group had a 'populate_extras'
argument. Now that the latter is gone, the function signature can be
simplified.

See #5451.

#31 @boonebgorges
8 years ago

In 11091:

Use new groups_get_group() syntax throughout BP.

See #5451.

This ticket was mentioned in Slack in #buddypress by boone. View the logs.


8 years ago

#33 @boonebgorges
8 years ago

In 11100:

Groups: Eliminate the populate_extras option from the bp_has_groups() stack.

Fun fact: Now that all these values are cached and lazy-loaded, the
parameter no longer does anything. Begone!

See #5451.

#34 @boonebgorges
8 years ago

  • Resolution set to fixed
  • Status changed from new to closed

I think I've done all I wanted on this ticket. Let's handle future improvements in another release.

#35 @boonebgorges
8 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Oops, I broke some group membership checks because of our ! empty() zealotry. Fix coming up.

This ticket was mentioned in Slack in #buddypress by boone. View the logs.


8 years ago

@r-a-y
8 years ago

#37 @r-a-y
8 years ago

Regarding comment:35, I've fixed this by calling the magic __get() method and casting the result as a boolean. Might make sense to do the same for all the other keys.

#38 @boonebgorges
8 years ago

Thanks for the patch, @r-a-y!

If I'm reading 5451.isset.patch correctly, it'll cause problems with falsey values.

// Say the user is *not* a member of the group
$group = new BP_Groups_Group();

// bool( false )
var_dump( $group->is_member );

// bool( true )
var_dump( empty( $group->is_member ) )

// bool( false )
var_dump( isset( $group->is_member ) )

I think this last part is incorrect - the value is set, but it happens to be empty. Does that seem right to you? If so, I think the right solution is to return true for the keys that are accessible by magic method.

#39 @r-a-y
8 years ago

If so, I think the right solution is to return true for the keys that are accessible by magic method.

Yeah, you're right. Carry on!

#40 @boonebgorges
8 years ago

  • Owner set to boonebgorges
  • Resolution set to fixed
  • Status changed from reopened to closed

In 11103:

Groups: Ensure that isset() returns true for newly-protected properties in BP_Groups_Group.

Props r-a-y.
Fixes #5451.

#41 @boonebgorges
8 years ago

In 11104:

Tests: Use assertTrue() instead of assertNotFalse().

I guess this is what you call Progressive Enhancement.

Introduced in [11103].
See #5451.

#42 @r-a-y
8 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

I've been playing with the legacy forums today.

And the magic methods broke legacy group forum functionality (topics did not show up under a group's "Forum" page) due to:
https://buddypress.trac.wordpress.org/browser/tags/2.6.2/src/bp-forums/bp-forums-template.php?marks=538,542#L536

Attached patch fixes this up.

Last edited 8 years ago by r-a-y (previous) (diff)

@r-a-y
8 years ago

#43 @boonebgorges
8 years ago

  • Keywords commit added

Seems OK to me. Thanks, @r-a-y.

#44 @r-a-y
8 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 11164:

Groups: Add forum_id to magic properties whitelist for the BP_Groups_Group class.

Fixes an issue where the legacy forums failed to render any topics when
on a group's "Forum" page.

See related changesets r11071, r11103.

Fixes #5451.

#45 @boonebgorges
8 years ago

In 11199:

Groups: It should be possible to set arbitrary props on BP_Groups_Group objects.

In a number of places (user_is_member, front_template), BP sets
properties on the BP_Groups_Group object. The change to magic methods
(see #5451) broke this process for properties not included in the
class definition of BP_Groups_Group. To maintain compatibility with
previous use, arbitrary properties can now be fetched and set on
these objects.

Props sbrajesh, dcavins.
Fixes #7296.

Note: See TracTickets for help on using tickets.