Skip to:
Content

BuddyPress.org

Opened 7 years ago

Closed 5 weeks ago

Last modified 4 weeks ago

#7534 closed enhancement (maybelater)

Remove read notifications from the DB table after a specified time period

Reported by: r-a-y's profile r-a-y Owned by: espellcaste's profile espellcaste
Milestone: Priority: normal
Severity: normal Version:
Component: Performance Keywords: dev-feedback
Cc:

Description

I've thought about this for awhile now.

Once you've read a notification, chances are you will never go back to to the "Notifications > Read" tab to view it again.

So with that in mind, does it make sense to keep read notifications indefinitely in the database? What if you have a rather, large community that has a ton of notifications? (See https://buddypress.org/2017/05/largest-turkish-recipe-site-spiced-up-with-buddypress/)

To be conservative, we could purge read notifications from the site after a specified time period. I don't know what a good value would be (3 months, 6 months?), but thought it would be good bring this up for discussion.

Attachments (1)

7534.01.patch (6.9 KB) - added by r-a-y 7 years ago.
Changed deletion limit to 1000; add cron job on root blog only

Download all attachments as: .zip

Change History (20)

#1 @hnla
7 years ago

Makes a lot of sense, notifications are somewhat ephemeral, but the period would need to err on the longer side 6 months, then purge.

#2 @johnjamesjacoby
7 years ago

A few things:

  • Looks like Facebook goes back about 20 days
  • Twitter notifications technically are just mentions, and they go back seemingly forever
  • LinkedIn went back about 3 months

We could:

  • Have a global maximum for the entire community
  • Allow users to set their own preference (maybe some users don't want to keep them ever?)
  • Give users a "Delete read notifications instantly" type setting

We'd probably need to use a cron job to clean these up, which we've avoided needing to use so far, because of known issues with the WPCron API on multisite.

Other thoughts:

  • The query to remove notifications for all users based on their last activity is a scary one. We'd need to know the last_activity date for all users to purge the notifications table of old read ones across the board.
  • I don't think deleting all read notifications that are over 30 days old makes sense, because that clean-up routine would need to run every day across all users
  • If we are going to start running periodic clean-up routines, at the risk of making this a bigger deal, there are other things that could benefit (stale friend requests, groups, etc...)
Last edited 7 years ago by johnjamesjacoby (previous) (diff)

#3 @r-a-y
7 years ago

The query to remove notifications for all users based on their last activity is a scary one. We'd need to know the last_activity date for all users to purge the notifications table of old read ones across the board.

I was thinking of doing a date query against the notification table's date_notified column and check is_new = 0.

But:

Allow users to set their own preference (maybe some users don't want to keep them ever?)

.

Give users a "Delete read notifications instantly" type setting

A per-user setting would make this more complicated.

because that clean-up routine would need to run every day across all users

Perhaps run every week or every month?

If we are going to start running periodic clean-up routines, at the risk of making this a bigger deal, there are other things that could benefit (stale friend requests, groups, etc...)

Friend requests and groups are more pertinent data. If a group is empty with no activity, then sure.

Last edited 7 years ago by r-a-y (previous) (diff)

#4 @DJPaul
7 years ago

  • Milestone changed from Under Consideration to 3.0

We gotta do something here.

#5 @r-a-y
7 years ago

  • Keywords has-patch added

01.patch is an attempt to remove read notifications older than 30 days.

Patch does the following:

  • Fixes notification date queries(!). We were referencing the wrong date column in the BP_Notifications_Notification::get_date_query_sql() method.
  • Introduces a cronjob to delete stale notifications every week - bp_notifications_cron(). Can probably add a filter to change the interval if desired.
  • Introduces a utility function to delete stale notifications - bp_notifications_delete_stale(). This function deletes stale notifications older than 30 days, 1000 items at a time by default. If there are more stale notifications to remove, the function runs recursively until all stale notifications are removed.
  • Adds a new method to delete multiple notifications by ID - BP_Notifications_Notification::delete_by_ids(). I couldn't use the existing delete() method as it can only delete one notification at a time, which can be a hassle when deleting a ton of notifications at once.

Let me know what you think.

Last edited 7 years ago by r-a-y (previous) (diff)

@r-a-y
7 years ago

Changed deletion limit to 1000; add cron job on root blog only

#6 @DJPaul
7 years ago

Fixes notification date queries(!)

Can you commit the fix for this, as its own thing?

(EDIT: this was done in r11840)

Last edited 6 years ago by DJPaul (previous) (diff)

#7 @johnjamesjacoby
7 years ago

1000 seems high for most traditional WordPress hosting company servers. It’s possible trying to delete 1000 records will lock the database table for a few seconds.

If this is a stab in the dark, we should probably be more conservative: 250, but more frequently?

I’m also reluctant to push this change to core, like at all. Facebook/Twitter/LinkedIn and others don’t purge my old notifications/mentions/etc… The point of storing them is to keep them around, and removing them with a cron means it will happen somewhat randomly to end users without their acknowledgement.

#8 @r-a-y
7 years ago

We can drop the deletion limit to 250 or lower and can increase the date limit to something older like three months if that helps.

WordPress purges trashed posts with cron (see wp_scheduled_delete(), wp_delete_auto_drafts()). I understand that read notifications are different and are not exactly "trashed", but once you've marked a notification as read, the likelihood of ever going back to an older, read notification is extremely rare.

If we want to give some acknowledgment, we can also add a message on the "Notifications > Read" screen that read notifications older than X days will be removed.

If we don't want to run this with WP cron, then maybe this can be added as a tool under "Tools > BuddyPress" in the admin area or as a wp-cli command.

#9 @DJPaul
7 years ago

  • Milestone changed from 3.0 to Under Consideration

#10 @espellcaste
7 months ago

  • Milestone changed from Under Consideration to Up Next
  • Owner set to espellcaste
  • Status changed from new to assigned

I'll be reviewing this ticket...

#11 @shawfactor
6 months ago

I suggest a slightly different approach.

  1. Add an expiration date field to the notifications table (that can be null)
  2. Add a setting that allows admins to set a defaults expiration period and a filter that allows devs to set different expirations on a per notification or per notification type basis

This would not only address the above issue but also make it much easier to clear unread notifications that are no longer relevant (right now that involved quite a bit of custom code)

Last edited 6 months ago by shawfactor (previous) (diff)

#12 @imath
3 months ago

  • Milestone changed from Up Next to 15.0.0

#13 @espellcaste
3 months ago

  • Keywords needs-unit-tests needs-refresh added; has-patch removed

I've been thinking about this ticket and this is what I think.

The original source of the problem was a large site where keeping millions notifications in the database was not desired. But we should not assume this should be the default behavior for every BuddyPress site.

I agree with @johnjamesjacoby here that we should not remove those notifications from the database automatically. Without end users acknowledgement. Or the community's owner, since most of them will be unaware this action is happening on the background.

With trash posts, a limited number of users performs the removal action (editors inside WordPress). And the intent of the action, trashing a post, is different from having read a notification. A notification record can have any type of content that a user might want to reach back.

The option to show/hide, or delete, them, depends a lot on the decision of the community's owner. It could be performance, retention policy (legal), analytics, etc.

--

To me, the removal of notifications should work similarly to the way WordPress removes revisions. So here is what I suggest:

  1. Allow an unlimited number of notifications to be saved in the database by default.
    • Let's confirm if this page is paginated: https://site.test/members/user/notifications/read/
  2. Add option to limit the number of notifications per user that's saved in the database.
    • A custom CLI command to remove the notifications using a bulk task;
    • A cron job that uses this command and runs in a filterable schedule (weekly by default;
    • An indication here about the retention limit: https://site.test/members/user/notifications/read/

I think a CLI command is better here because it'll perform the removal in a better way than the current patch. Looping notifications, handle failures, can be fired without a cron job too (for those sites that rely on WP CLI), can be used in cron jobs, etc.

Thoughts?!

Last edited 3 months ago by espellcaste (previous) (diff)

This ticket was mentioned in Slack in #buddypress by espellcaste. View the logs.


6 weeks ago

#15 @vapvarun
6 weeks ago

We can implement an automatic removal of read notifications after 30 days or upon reaching a set cap, along with CLI support for flexible management.

#16 @imath
6 weeks ago

+1 for Cli command @espellcaste I agree it’s the right tool to deal with it. A repair tool is another option…

I’m not a fan of automatically deleting notifications after an amount of time.

There could be a link into the Unread user’s nav menu to « empty » these like we empty trash…

#17 @johnjamesjacoby
6 weeks ago

Maybe it will help to define what the reason is BuddyPress has this goal/intention, and then make sure we are doing those things.

Here, we’ve explained what we want to do, but not really what results we want to see to call it successful or a priority.

Do we think having fewer rows in this database will make it faster? If so, how many rows makes it too slow, and what is “too slow” and what default metrics do we consider reasonable, plus now we are introducing new options to the Notifications component, and is that the best thing for all installs?

Even though we can make safe assumptions, and even though we are acting with good intentions of making things faster & better, we don’t know if (as one example) maybe instead of permanently deleting rows we can add a custom “pruned” status that omits those rows from the indexing, and gain performance without losing data.

Does permanently deleting a notification unintentionally make anything worse from the members perspective? Does it unintentionally lock the database while those rows are deleted and the indexes are recalculated? Does it create a situation where a member might get a Notification Email that gets deleted by the time they get to click the link and read it? What happens now if someone tries to view a deleted notification via email?

Or, can we routinely prune notifications from the members perspective without permanently impacting other connected experiences? 🤔

Last edited 6 weeks ago by johnjamesjacoby (previous) (diff)

#18 @espellcaste
5 weeks ago

  • Keywords needs-unit-tests needs-refresh removed
  • Milestone 15.0.0 deleted
  • Resolution set to maybelater
  • Status changed from assigned to closed

I agree with @johnjamesjacoby. I was working with the ticket's original problem in mind (if you read the article and the history of the ticket, you'll get a better undertanding of what's being solved and why) and my suggestion, and others, was more about how to solve it.

At the same time, this is an old ticket and the problem mentioned here will mostly affect high-trafficked sites.

I'll resolve this ticket as maybelater and then if new information comes up that answers @johnjamesjacoby's questions specifically, we can revisit the solution.

:)

This ticket was mentioned in Slack in #buddypress by espellcaste. View the logs.


4 weeks ago

Note: See TracTickets for help on using tickets.