Skip to:
Content

BuddyPress.org

Opened 13 years ago

Closed 8 years ago

Last modified 8 years ago

#3659 closed defect (bug) (fixed)

Recently Active widget doesn't scale

Reported by: mpvanwinkle77's profile mpvanwinkle77 Owned by:
Milestone: Priority: normal
Severity: normal Version: 1.2.10
Component: Core Keywords:
Cc: joesell89

Description

Hello, I don't have a solution for this yet, but I wanted to add it to the mix. The Recently Active (and Who's Online too I think) widgets don't scale well. I have a site with 98,000+ registered users and this query used by the widget was just killing the site:

SELECT DISTINCT u.ID as id, u.user_registered, u.user_nicename, u.user_login, u.display_name, u.user_email , um.meta_value as last_activity FROM wp_users u 
LEFT JOIN wp_usermeta um ON um.user_id = u.ID 
WHERE u.spam = 0 
AND u.deleted = 0 AND u.user_status = 0 AND um.meta_key = 'last_activity' 
ORDER BY um.meta_value 
DESC LIMIT 0, 18;

Clearly the scale of our community is a big factor. But we have an entire server dedicated solely to MySQL and three for httpd ... and even with these resources we were getting crushed.

I'm trying to think of ways to scale this better. I'm considering creating a field in the options table to store IDs of logged in users and then running those IDs through get_userdata() which should at least be getting cached by W3 Total Cache. This approach would come with a little more PHP overhead but hopefully spare my database server the burden of the direct user query.

In general I think some modifications to bp_core_get_users() ( i.e. BP_Core_User ) could solve the problem. Perhaps adding caching logic? Or perhaps querying just the wp_usermeta table for user_id's and then populating each one with get_userdata() ?

So anyway, just wanted to start the discussion and get feedback.

Change History (13)

#1 @boonebgorges
13 years ago

  • Milestone changed from Awaiting Review to Future Release

Thanks for the report.

perhaps querying just the wp_usermeta table for user_id's and then populating each one with get_userdata()

That solution would be pretty efficient if you were using the right kind of persistent object caching, but I think it would be pretty inefficient if you weren't, since it creates many queries where previously there was just one. However, that would need some testing, since the queries would then be of a much simpler sort.

This query is simple enough where we could fairly easily rewrite it to take advantage of the relatively new WP_User class, which has full wp_cache support already built in. See #3127 for a broader discussion. Though in this case, unlike in #3127, we wouldn't need to join against any BP tables, so it's much more feasible in this case than elsewhere in BP_Core_User.

#2 @joesell89
11 years ago

  • Cc joesell89 added

I wish I could be a solution to this problem but I don't have the know how.

I do want to say that this is an issue that I am currently trying to deal with. I think the widget is important to some sites because it shows users that there are others active on the site even if they aren't saying anything.

However, my host (WP Engine) have helped me to identify this as the single biggest performance drain on my site. I don't even have a very big site (3,700 members).

I can't come up with a solution but I do want to say that it is important on my site and I would be more than happy to test any solutions that anyone can come up with.

#3 @boonebgorges
11 years ago

This is really a subset of the larger issue with last_activity being stored in usermeta. See #5128.

#4 @r-a-y
10 years ago

  • Keywords dev-feedback added

I think this would be an ideal scenario to use a transient.

We could cache the recently active members widget as a transient with an expiration time of five minutes.

If another dev agrees, let's do this for v2.1.

#5 @boonebgorges
10 years ago

I think that the problems from the original post have probably largely been taken care of by (a) the move to BP_User_Query (which removes some problematic JOIN behavior) and (b) the refactoring of last_activity in BP 2.0.

That said, I think it'd be very cool to explore some sort of caching for widget markup in general. Transients sound fine to me - if we do this, it probably makes sense to cache the entire HTML output. We'll have to figure out what to do (if anything) about AJAX filtering. We should also include some sort of invalidation logic - I would prefer this to simply allowing the value to go stale for five minutes. Bonus is that we can increase the transient life if we have proper invalidation.

#6 @r-a-y
10 years ago

I've done a first pass of widget caching as a plugin on Github.

  • A select number of BP widgets are cached. By default, this includes all the major BP widgets like the Members, Groups, Blog Posts, Who's Online, Recently Active Members and Friends widgets.
  • Each BP widget is cached as a site transient.
  • Since a BP widget can be used more than once with different settings, each instance of a BP widget is saved as a key-value pair in this site transient.
  • Support for invalidation at various junctures including widget deletion and updating of widget settings.
  • No unit tests yet (I know, I know).
  • No cache support for widgets with AJAX tabs, but I don't think this is a big issue.

Could use some real-world testing before deciding if this is worthwhile to be included in core.

Last edited 10 years ago by r-a-y (previous) (diff)

#7 @boonebgorges
10 years ago

  • Milestone changed from Future Release to 2.2

r-a-y - This is very very cool. Great work. I'm putting it in the 2.2 milestone.

Couple thoughts/questions:

  • It looks like you've decided to store all cached instances of a given widget together to make invalidation easier. But I think this is less than ideal from a performance perspective. With this setup, persistent caches will purge all of a widget's cached instances at once (according to whatever cleanup routine the admin has set up) instead of one at a time. I wonder if we can instead do something like: store each instance in its own transient, and then have a single transient that serves as an index for the transient keys. Ie
$key = bp_widget_get_transient_key( $class_name, $instance, $args ); // unique for each instance
set_site_transient( $key, $widget_markup );

$cache_keys = get_site_transient( 'bp_widget_cache_keys' );
$cache_keys[ $class_name ][] = $key;
set_site_transient( 'bp_widget_cache_keys', $cache_keys );

Then reference that list of keys to invalidate. Not hugely different from what you've suggested, but I think it's better practice to keep the cached objects as small as possible.

  • // as per WP docs, transient keys can only be 40 characters in length - Your technique here will not guarantee unique keys if widget classes share the first 35 characters of their names. (Maybe not super likely, but think about a hypothetical DPA_Achievements_Widget_Display_Achievements_Of_User and _Group.) Maybe run through md5() or something.
  • When applied to BuddyPress, each component should be responsible for registering its own cache classes. (I know that doing it centrally doesn't hurt anything because those widgets simply won't exist if the component is not activated, but still :) )
  • bp_widget_cache_add_wp_head_hook() is pretty hacky. Let's just add a do_action() hook in bp_update_user_last_activity().
  • Reaching into the $wp_registered_widgets global is truly terrible. This is not a criticism of you ;)

Let's circle back after we've run this through its paces on a real site. But preliminarily, it looks like this is going to provide huge benefits for lots of BP sites!

#8 @r-a-y
10 years ago

It looks like you've decided to store all cached instances of a given widget together to make invalidation easier.

Yeah, that and if a site doesn't use a persistent object cache, the DB table could become bloated since a site could use multiple instances of the same widget. Multiply that number if on multisite.

The problem with my approach is like you said - the transient could potentially hold a large array of instances.

Your approach is pretty good as well. I could be just thinking way too conservatively though, so we'll most likely adopt your method. :)

With this setup, persistent caches will purge all of a widget's cached instances at once (according to whatever cleanup routine the admin has set up) instead of one at a time.

This isn't necessarily true. I invalidate a widget's instance if the widget is updated or deleted. This is okay for the majority of widgets except for cases where the widget output is variable such as the Friends widget, which relies being on a user profile page.

as per WP docs, transient keys can only be 40 characters in length - Your technique here will not guarantee unique keys if widget classes share the first 35 characters of their names.

I was thinking about that myself! As a first pass, I wanted to easily see what cached widgets were being added. But when this gets finalized, we could use the md5() approach for the transient keys.

When applied to BuddyPress, each component should be responsible for registering its own cache classes. (I know that doing it centrally doesn't hurt anything because those widgets simply won't exist if the component is not activated, but still :) )

Separating by component definitely makes sense. Will definitely do that when creating a patch for core later on.

bp_widget_cache_add_wp_head_hook() is pretty hacky. Let's just add a do_action() hook in bp_update_user_last_activity().

For sure, just needed to get something working in the meantime without hacking core.

Reaching into the $wp_registered_widgets global is truly terrible.

Yeah I know!

But preliminarily, it looks like this is going to provide huge benefits for lots of BP sites!

Yeah, I think it works quite well. Could also be used in theory for any WP widget!

#9 @boonebgorges
10 years ago

This all sounds great, r-a-y. Just wanted to respond quickly to this:

I invalidate a widget's instance if the widget is updated or deleted. This is okay for the majority of widgets except for cases where the widget output is variable such as the Friends widget, which relies being on a user profile page.

Yes, at the BP level. But when using a persistent cache, transients are stored in memory. And different caching engines have different techniques for purging old content when the allotted cache memory is full. I think APC just dumps the whole cache by default, and in that case it wouldn't matter how we stored the widget cache. But Memcache and Redis are more discriminate about purging. If we keep the instances in separate keys, we potentially reduce the number of cache misses due to automated selected purging of this kind. That's what I meant :)

#10 @DJPaul
10 years ago

  • Milestone changed from 2.2 to Future Release

I'm not convinced that we need to add fragment caching for HTML output to BuddyPress at the moment. If we were thinking of seeing if that works well in practice (I haven't tried Ray's plugin), my guess is that caching the theme compat. template parts would be more valuable.

#11 @r-a-y
10 years ago

  • Keywords needs-patch dev-feedback removed

Although it does work quite well, I'm kind of in agreement with DJPaul that widget caching shouldn't be in core. WP doesn't ship with widget caching so why should we?

my guess is that caching the theme compat. template parts would be more valuable.

Yes, if we stick to the object cache API. No, if we use site transients (like what is proposed for widget caching above).

However caching template parts is hard because of the multitude of template parts (cache bloat) and the variance of parameters used in each template loop (invalidation).


Anyway, I think I hijacked the original ticket and turned it into a widget caching ticket :)

I would be interested to hear from joesell89 and mpvanwinkle77 to see if they are still experiencing problems with the Recently Active Widget on the latest version of BP.

#12 @lakrisgubben
8 years ago

  • Resolution set to fixed
  • Status changed from new to closed

As said above, the changes from #5128 seems to have made this problem obsolete (was just benchmarking these widgets with sites with 25000 users and no caching at all and the db-lookup times are quite fast I must say, 0.16s. And of course a site with that many users would use some kind of object cache anyways...), the fact that no one has replied with similar problems for more than two years is also telling. I'm closing this as fixed, and if fragment caching of widget output would be desirable in the future I think it'd deserve it's own ticket. :D

#13 @r-a-y
8 years ago

  • Milestone Future Release deleted
Note: See TracTickets for help on using tickets.