Skip to:
Content

BuddyPress.org

Opened 8 years ago

Closed 7 years ago

#7436 closed defect (bug) (fixed)

Very slow check_is_friend function

Reported by: januzi_pl's profile januzi_pl Owned by: dcavins's profile dcavins
Milestone: 3.0 Priority: high
Severity: major Version: 2.7
Component: Friends Keywords: dev-feedback has-patch commit
Cc: dcavins

Description

According to xdebug's cachegrind log this function (and connected to it: get_friendships, populate and construct ) is generating the most load in the site. The log says that construct and populate are being called over 200k times (and the site hasn't even got so many users). This leads to 200k calls to wp_cache_get, and in the result over 400k calls to WP_Object_Cache->get.

Attachments (4)

7436.1.diff (7.0 KB) - added by dcavins 7 years ago.
Add SQL-based alternative to using user-based bulk friendship cache.
7346.diff (2.0 KB) - added by boonebgorges 7 years ago.
7436.2.diff (8.5 KB) - added by dcavins 7 years ago.
Add individual friendship caching and a "find uncached friendships" cache warmer function.
7436.purge-incremented-cache.patch (2.8 KB) - added by r-a-y 7 years ago.

Download all attachments as: .zip

Change History (36)

#1 follow-up: @boonebgorges
8 years ago

Hi @januzi_pl - Is it possible to share some of the cachegrind results? When you say "200k calls to wp_cache_get", I assume that this is happening over the course of multiple page loads. Seeing the trace for a single pageload would be helpful. Thanks for your help!

#2 in reply to: ↑ 1 @januzi_pl
8 years ago

Hi @boonebgorges. I can share this file, but you'll have to download it from the ftp/www.
Edit: On the second thought, gzipped file has got only 17MB. So I can attach it to this ticket (probably, I'm not sure if it isn't too big). Edit2: Too big. So, the only way: http://www.jednoslad.pl/cachegrind.zip

Kcachegrind / qcachegrind. Those are two apps that I know that show graphs from cachegrind logs.

When you say "200k calls to wp_cache_get", I assume that this is happening over the course of multiple page loads.

Nope. This is from single page load (I've set xdebug to react only to trigger and I'm the only one person that is creating the logs).

Last edited 8 years ago by januzi_pl (previous) (diff)

#3 @boonebgorges
8 years ago

@januzi_pl - Thanks so much for the quick reply. A note for anyone downloading it: change the file extension to .gz and use gunzip - it's not a regular zip file.

The log shows that BP_Friends_Friendship::check_is_friend() -> BP_Friends_Friendship::get_friendships() is called 19 times, which sounds correct for a member directory (20 members, one of whom is the logged-in user, and doesn't need a friend check). The log then shows that get_friendships() calls BP_Friends_Friendship::__construct() 218,196 times, which clearly does *not* sound correct. See https://buddypress.trac.wordpress.org/browser/tags/2.7.4/src/bp-friends/classes/class-bp-friends-friendship.php?marks=261,268,289#L242 for the relevant function calls that should result in a cache hit.

It looks like bp_get_non_cached_ids() calls wp_cache_get() a huge number of times (218,204). I'm guessing this means that the following call is returning an incorrect value of some sort:

$friendship_ids = wp_cache_get( $user_id, 'bp_friends_friendships_for_user' );

If $friendship_ids mistakenly ends up returning an enormous array (the numbers suggest it's got about 11,400 members), it would cause the kind of cache I/O that you've cited.

@januzi_pl You noted that this is a fairly small site, which I'm guessing means that no user legitimately has 11,400 friends. Can you check to see what's being returned from line 261 here https://buddypress.trac.wordpress.org/browser/tags/2.7.4/src/bp-friends/classes/class-bp-friends-friendship.php?marks=261,268,289#L242? That might help us figure out what's going on. Also, if you are running a cache dropin (wp-content/object-cache.php), please share the details about which dropin it is, and what your cache backend is (Memcached, etc).

As a side note, cache multi-get support in WP would help us to mitigate this problem in cases where a user really *did* have many thousands of friends, at least in cases where the cache backend supports true multi-get. See https://core.trac.wordpress.org/ticket/20875

cc @dcavins

#4 @dcavins
8 years ago

  • Cc dcavins added

#5 @januzi_pl
8 years ago

@boonebgorges I looked up into the friendship table, didn't bother with confirmed/unconfirmed status. So, one user has go 20k friends, another ones: 5,7k, 3,9k, 800, 700, and so on (I think that I'm logged into that first account).

Line 261: array is empty, no cache.
Line 264: very big array, the last index = 11483

As for cache I've enabled w3tc with page/database/object cache, disk only. I'll switch that to the memcached, when I'll get enough free time.

Edit: Widget "friends" is affected with this issue. If there are other parts of the site using this class, then they are affected as well (friends list in the users profile probably?).

Last edited 8 years ago by januzi_pl (previous) (diff)

#6 follow-up: @r-a-y
8 years ago

  • Keywords dev-feedback added
  • Version set to 2.7

If a site has more than 10,000 users, I'd say it is not a small site :)

I think BP_Friends_Friendship::get_friendships() needs to be refactored.

Is there a need to fetch all of the user's friendship IDs in one fell swoop in BP_Friends_Friendship::get_friendships() when we are just doing a one-to-one check as is the case in BP_Friends_Friendship::check_is_friend()?

Couldn't we cache with the following key - wp_cache_set( "{$initiator_id}:{$friend_id}", 'bp_friends_friendship_pair' ) - when there is an initiator and friend ID?

The hard part is deleting the cache when a user is deleted. Or we would just let that cache get stale since in theory, that cache wouldn't be accessed again with the cachekey.

#7 in reply to: ↑ 6 @dcavins
8 years ago

Replying to r-a-y:

Is there a need to fetch all of the user's friendship IDs in one fell swoop in BP_Friends_Friendship::get_friendships() when we are just doing a one-to-one check as is the case in BP_Friends_Friendship::check_is_friend()?

The idea was to reduce database lookups on member directories, for instance, where you're going to check against 20 other members. We could do 20 one-to-one checks, but the caching strategy was designed to get the friends list one time then compare against it 20 times. I'll admit that I didn't imagine a use-case where a user would have 20,000 friends. :) In this user's case, he'd be better off doing 20 narrowly targeted database lookups. (I'm not sure at what volume the performance tradeoff happens, truthfully.)

We use a similar approach for group membership, which works great for most use cases, but if a member belongs to 20,000 groups, we'd have the same problem. I imagine this is true for most of our ID-based caching/filtering approaches: huge numbers of relationships are going to cause problems in some setups. I think @boonebgorges is having some thoughts about disk- vs memory-based caching performance.

Maybe we could add some sort of escape hatch when the results list is massive. Or a filter so that site users could choose which behavior works better for them.

#8 @boonebgorges
8 years ago

Gah, I just left a comment, but my cookie died and I lost it :(

@dcavins is correct that this improvement to group and friendship caching was meant to help in directories, and I think it's correct to say that it does help rather than hurt the majority of BP sites. But it's definitely not efficient in cases like this.

@r-a-y - The goal of caching here is to avoid 20 extra db hits in member directories. Our current caching strategy is to first check the cache, and then to fall back to the database if nothing's found in the cache. This means that it's not possible to tell the difference between a cold cache and a non-friendship without hitting the database. (It's not practical to cache the non-friendship status of every set of non-friends.) We could reduce the number of db hits by doing some cache-warming at the beginning of the loop - sorta like update_meta_cache. But this only works well for successful friendships. We'd maybe have to cache non-friendships separately, perhaps non-persistently, which means it'd be impossible to avoid hitting the db altogether for all users.

To clarify: Consider a site with three members: A, B, and C. The only friendship is between A and B. A is logged in and views /members/. Assume the cache is empty. After fetching the member IDs, we do something like this:

// Check for uncached friendships.
$member_ids = wp_list_pluck( $members_template->members, 'ID' );
$uncached = array();
foreach ( $member_ids as $member_id ) {
    // Don't check logged-in user.
    if ( $member_id === $loggedin_user_id ) {
        continue;
    }

    $cached = wp_cache_get( "$member_id:$loggedin_user_id", 'bp_friends_friendship_pair' );
    if ( false === $cached ) {
        $uncached[] = $member_id;
    }
}

// Because the cache was empty, $uncached is [ B, C ]
if ( $member_ids ) {
    $friendships = some_function_to_get_friendships_from_db( $member_ids );
    foreach ( $friendships as $friendship ) {
        wp_cache_set( ... );
    }
}

Either (a) we cache every non-friendship, or (b) on the next pageload, $member_ids will be [ C ], which will require a database hit.

Maybe this is the best we can do to account for instances where users have many thousands of friends?

#9 @dcavins
8 years ago

Hm. Since bp_friends_filter_user_query_populate_extras() is passed the list of user IDs to check against, it can (and used to do) a single query to get the friendship status of all members visible on a directory, for instance. That setup won't prep the cache for the various other ways friendship is used during a page load, but it could help @januzi_pl, in the short term.

Remove the current filter:
remove_filter( 'bp_user_query_populate_extras', 'bp_friends_filter_user_query_populate_extras', 4, 2 );
then add a new filter to a custom function using the old logic:
https://buddypress.trac.wordpress.org/browser/tags/2.5.0/src/bp-friends/bp-friends-filters.php

Totally unrelated, why doesn't bp_core_get_users() return BP_Core_User objects? I can never remember what type of user object is returned in the various contexts. Maybe that is the next standardization/caching frontier.

#10 @r-a-y
8 years ago

  • Milestone changed from 2.7.5 to Under Consideration

#11 @DJPaul
7 years ago

  • Milestone changed from Under Consideration to 3.0
  • Priority changed from low to high

This ticket was mentioned in Slack in #buddypress by dcavins. View the logs.


7 years ago

#13 @dcavins
7 years ago

I'm attaching a patch that adds a filter to prevent use of the user bulk friendship cache. It adds a lot of complexity to the function, though, so I'm of two minds whether we want to support both approaches in the long term.

The patch passes all unit tests (other than the test that verifies that the cache is used, ha ha) both in the default setup (use the bulk friendship cache), and in the alternate setup (don't use the bulk cache). I tested the second option by adding add_filter( 'bp_bulk_cache_user_friendships', '__return_false' ); to my test installation's bp-custom.php. Is there a way to automatically test all friendship tests in both situations?

Thanks for your feedback.

@dcavins
7 years ago

Add SQL-based alternative to using user-based bulk friendship cache.

#14 @DJPaul
7 years ago

Testing the patch per your comments would require a differently-configured test environment in the matrix, if we wanted to do that. I think we should look at using something like wp_is_large_network() for this, rather than a new filter. As if that's enabled, presumably, other bits of WordPress would work more optimally on such large sites.

Generally for SQL changes, for easier review, it's helpful to provide a SQL EXPLAIN for the query.

Regardless of that, running an uncached SQL query on the front-end is never going to scale, unfortunately -- especially for something where it's dynamic based on the current user (and we can't just stick a static page cache in front of it).

#15 @dcavins
7 years ago

Hi @DJPaul,

Thanks for your comments. wp_is_large_network() is unsuited for this check, because the size of the network is not the key thing here. This patch is covering what I'm guessing is an edge case that arises from using friendships in an unexpected way. If a user has hundreds of friendships, the bulk caching approach still works well. When the user has many thousands of friendships, then the bulk cache becomes a problem.

I've created a spreadsheet comparing the performance of using the bulk cache vs bypassing the bulk cache on a sample site:
https://docs.google.com/spreadsheets/d/1YeOHYhUBKfBq6bGbS4nnNKot61ntP2UCgF-eN9pSZfU/edit?usp=sharing

So it depends on your site's particular needs. On my main BP site, there are around 60,000 users. Only 12 have more than 100 friendships, so the bulk cache works well with a Redis object cache.

If other users can share some idea of how many friendships their users maintain, that would be helpful.

Thanks!

-David

@boonebgorges
7 years ago

#16 @boonebgorges
7 years ago

I've thought some more about this in the last few weeks.

We moved to a centralized system for caching this stuff in a desire to be less redundant. But I wonder whether that's harmed the underlying purpose of caching this kind of thing: namely, to improve performance. As such, I wonder if we might leave the existing caching mechanism in place for friend *lists* (and group member lists), but replacing the single friend-check caching mechanism with a direct SQL query that lives in its own cache. 7346.diff is a very simple example of how this would work. Invalidation (not in the patch) would be across *all* friendships for the time being, and could maybe be more fine-grained in the future if someone felt like figuring that out :)

This way, we retain the benefit of the caching for places where we actually are querying lots of friends at once, but reduce overhead when we don't need that info.

I do also think we should think about splitting up the way that the friendship cache is primed, as I started to suggest in comment 8 above. Here's a start: friendship object caching doesn't need to happen for all of a user's friendships in the case where results are supposed to be paginated https://buddypress.trac.wordpress.org/browser/tags/2.9.2/src/bp-friends/classes/class-bp-friends-friendship.php?marks=267-275#L248. Let's only prime these caches for the relevant 'per_page' number of items - which also means that we should be diligent about using 'per_page=1' etc when calling get_friendships() from elsewhere in BP. Actually, it may pay to try this fix *first*, before attempting the restructure I described above, since it might provide enough benefit to solve the problem described in this ticket.

What do others think?

#17 @DJPaul
7 years ago

  • Milestone changed from 3.0 to Awaiting Contributions

@dcavins
7 years ago

Add individual friendship caching and a "find uncached friendships" cache warmer function.

#18 @dcavins
7 years ago

Hi all-

I've added a new patch that changes BP_Friends_Friendship::check_is_friend() so that it does not loop through the big friends list each time. I've added a customized bulk cache warmer that acts like a _get_non_cached_ids() for a list of maybe-friend users, and caches individual friendships as Boone and Ray has discussed using our incremented cache helpers and the two users as the cache key, like 'initiator_user_id:friend_user_id'.

I've also added a new general cache function, bp_core_delete_incremented_cache() to remove these cached items by key when a friendship is updated.

Finally, this change doesn't replace the user-specific friendship cache as introduced in 2.5. I think there's still value to having that cached, since it is used in friendship calls that refer to all friendships for a user. The problem was that hoisting that load for 20 check_is_friend() calls was too unwieldy. There seem to me to be two different things going on: friendships from the perspective of a particular user, and the simple check, "Are these two users friends?"

Thanks for your comments,

-David

Version 0, edited 7 years ago by dcavins (next)

#19 @boonebgorges
7 years ago

  • Milestone changed from Awaiting Contributions to 3.0

@dcavins Thank you for your work on this. The strategy here looks like the right one.

Finally, this change doesn't replace the user-specific friendship cache as introduced in 2.5. I think there's still value to having that cached, since it is used in friendship calls that refer to all friendships for a user.

Yes, I think that's a sensible approach. I do think there's room to improve that system in ways I mentioned above, but for the purposes of check_is_friend(), your proposed changes in this ticket allow us to kick that question down the road.

If anyone on this ticket runs a site with users with lots of friends and is able to run some tests, it'd be helpful. Otherwise, I'm happy with the approach - though another look from @r-a-y would be welcome too.

#20 @r-a-y
7 years ago

Looks good, @dcavins.

My only minor thing is with the naming of the warm_friendship_cache() method. WordPress uses the term lazyload. See the WP_Query class for an example. I'm just nitpicking! Feel free to ignore.

We should also change the PHPDoc for the $ids parameter in bp_core_set_incremented_cache():
https://buddypress.trac.wordpress.org/browser/trunk/src/bp-core/bp-core-cache.php?marks=301,306#L286

In this patch, the use-case here for $ids is a string. We should also probably change the $ids name to $value.

#21 @dcavins
7 years ago

Thanks, @boonebgorges and @r-a-y for you comments.

Thanks for asking about the method name. I had no idea what to call it, and don't think the name is all it could be. :)

I'll check into the WP Query class.

Thanks again for the feedback.

-David

#22 @dcavins
7 years ago

I looked into the lazyload part of the WP_Query class, and it's interesting. It's more sophisticated than what I'm doing here, in that it gathers up args, but doesn't make (the more efficient) database query until the get_meta() call is made. So it really is lazy.

Is it overselling this cache priming function to call it lazyload_friendship_cache() since it's actually going to make a query?

Thanks again for your comments.

#23 @boonebgorges
7 years ago

Oh right, I do think it's probably inaccurate to call it lazyloading for that reason.

The terminology used in WP is update_*_cache(). Not the most beautiful wording, but maybe worth aping here. https://core.trac.wordpress.org/browser/tags/4.9.3/src/wp-includes/class-wp-term-query.php?marks=,690696#L687

#24 @dcavins
7 years ago

In 11864:

Introduce bp_core_delete_incremented_cache().

Add a convenience function for deleting cache items that use bp_core_get_incremented_cache_key() to generate their keys.

See #7436.

#25 @dcavins
7 years ago

  • Owner set to dcavins
  • Resolution set to fixed
  • Status changed from new to closed

In 11865:

Change check in BP_Friends_Friendship::check_is_friend().

In BP2.5, we introduced a new friendship caching mechanism that created a new cache item containing all of a user's friendships. This approach worked well on sites with fewer friendships per user, but caused cache access issues on sites with lots of friendship connections.

This update replaces the cache-intensive change to BP_Friends_Friendship::check_is_friend() with a return to a single lookup to determine friendship status between two users. To minimize SQL lookups, individual friendships are now cached, and a new cache-warming function has been added to bulk update these new cache items: BP_Friends_Friendship::update_bp_friends_cache().

Props januzi_pl, boonebgorges, r-a-y, djpaul, dcavins.

Fixes #7436.

#26 @r-a-y
7 years ago

  • Keywords has-patch added
  • Resolution fixed deleted
  • Status changed from closed to reopened

While debugging #7726, I encountered a bug with our friendship incremented cache not being purged when using either friends_remove_friend(), friends_accept_friendship(), friends_reject_friendship() or friends_withdraw_friendship().

purge-incremented-cache.patch adds a unit test showcasing the bug with friends_remove_friend() and fixes the problem by purging all friend-incremented caches.

#27 @DJPaul
7 years ago

  • Keywords commit added

I like it @r-a-y, lovely find!

#28 @dcavins
7 years ago

Hi @r-a-y-

Thanks for catching these inconsistencies. I thought more of the status change functions passed through save() than actually do.

Every time I work with the class BP_Friends_Friendship my brain goes fuzzy because of the sheer number of status change functions.

Would anyone be horribly upset if we (I?) rewrite the database interaction pieces of that class to be more centralized?

#29 @r-a-y
7 years ago

Every time I work with the class BP_Friends_Friendship my brain goes fuzzy because of the sheer number of status change functions.

For sure, I probably would have overlooked the other methods as well to be honest.

Would anyone be horribly upset if we (I?) rewrite the database interaction pieces of that class to be more centralized?

I wouldn't be upset, but do you have the time to do so? :)

#30 @DJPaul
7 years ago

This test throws a PHP Notice with Nouveau. I'm working on fixing the test.

#31 @djpaul
7 years ago

In 11948:

Tests, Friends: add test for friendship cache incrementor.

See #7436

Props r-a-y

#32 @djpaul
7 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 11949:

Friends: fix friendship incremented cache purge consistency.

This was not being cleared when using either friends_remove_friend(),
friends_accept_friendship(), friends_reject_friendship(), or
friends_withdraw_friendship().

Fixes #7436

Props r-a-y

Note: See TracTickets for help on using tickets.