Skip to:

Opened 8 months ago

Last modified 2 weeks ago

#7523 assigned defect (bug)

Messages are not deleted from DB, when user is deleted

Reported by: slaFFik Owned by: r-a-y
Milestone: 3.0 Priority: normal
Severity: normal Version: 1.0
Component: Messages Keywords: has-patch


Here is one of the usecases:
If spammer sends messages and we delete him - his messages are left in DB.

There are some others as well, I believe, when user decided to delete his account by himself for some reason.
As we are currently wiping all of the data of that user, why are we not deleting messages?
From code: because they are unread, but that's a problem in this scenario.
What if we introduce a $force statement and somehow pass it through bp_core_delete_account() to all corresponding places?

Attachments (2)

7523.01.patch (13.9 KB) - added by r-a-y 8 months ago.
7523.deleted.patch (2.4 KB) - added by r-a-y 8 months ago.

Download all attachments as: .zip

Change History (17)

#1 @r-a-y
8 months ago

  • Keywords has-patch added; 2nd-opinion removed
  • Milestone changed from Awaiting Review to 2.9
  • Owner set to r-a-y
  • Status changed from new to assigned
  • Version set to 1.0

Nice catch. Surprised we never added a cleanup routine for the Messages component!

Your $force parameter sounds like a good way to fix this. The other problem is querying for all message threads for the deleted user.

01.patch does the following:

  • Adds $force parameter to BP_Message_Thread::delete() to make message thread deletion easier.
  • Adds some new parameters to BP_Message_Threads::get_all_current_threads_for_user() to query for message threads. In particular, $deleted is the most important one. $fields and $count_total are enhancements to only query for the message thread IDs and to omit the total count.
  • Refactors BP_Message_Threads::get_all_current_threads_for_user() to work better with multiple WHERE conditions.
  • Includes a unit test.
Version 0, edited 8 months ago by r-a-y (next)

8 months ago

#2 @johnjamesjacoby
8 months ago

I think the reason we don't purge deleted user data is because it can sometimes intersect into deleting related data from other users being orphaned, or not making sense anymore when 1 person's messages suddenly disappear without a trace for the others in a thread.

Activity, Messages, and Groups are the offenders.

Notifications, Friends, XProfile, and Settings could probably safely be purged.

This is partly why the "Anonymous" type user code exists in some places. To be able to say "this user is no longer around" without eluding as to what might have happened to them.

#3 @r-a-y
8 months ago

This is partly why the "Anonymous" type user code exists in some places. To be able to say "this user is no longer around" without eluding as to what might have happened to them.

01.patch doesn't handle individual messages, but message threads authored by the deleted user.

I don't think it makes sense to keep message threads authored by the deleted user, however I can see why individual messages should switch over to an anonymous user context.

Maybe we should switch from deleting content to wiping out content. This could be a good feature enhancement. FWIW, I know Reddit changes the user and content to [deleted] whenever some content has been removed.

This ticket was mentioned in Slack in #buddypress by r-a-y. View the logs.

8 months ago

#5 @r-a-y
8 months ago

In today's dev chat, we talked about deleting message threads from the removed user from the DB, but keeping individual messages from the removed user intact, but switching the content of the message to [deleted] instead of what was actually written.

Switching out the content turned out to be pretty straightforward. See deleted.patch. Patch also fixes an issue with bp_core_get_core_userdata() not caching invalid users.

8 months ago

This ticket was mentioned in Slack in #buddypress by r-a-y. View the logs.

7 months ago

#7 @johnjamesjacoby
7 months ago

Approach in the patch looks OK to me.

For final commit, can we move that false === $userdata ? 0 bit out from the function call? Inline ternaries like that are tricky to read and easily overlooked, at least by my eyes.


#8 @hnla
7 months ago

@r-a-y Are you wishing to commit this then given Johns approval or punt to 3.0?

#9 @r-a-y
7 months ago

In 11611:

Members: Cache invalid users in bp_core_get_core_userdata().

This commit caches non-existent user attempts for the
bp_core_get_core_userdata() function, so subsequent calls for the same
user ID will not ping the database again.

See #7523.

#10 @r-a-y
7 months ago

In 11612:

Messages: Do not show message content in message threads from deleted users.

If a user no longer exists, do not show the message content and replace
with a [deleted] marker.

See #7523.

#11 @r-a-y
7 months ago

  • Milestone changed from 2.9 to 3.0

I've committed deleted.patch for 2.9, which just handles message content from deleted users.

We didn't really come to a consensus on whether we should outright delete message threads authored by the deleted user, so I haven't commited that yet.

Moving to 3.0.

#12 @DJPaul
2 weeks ago

I think showing a deleted users’ messages is a good behaviour; it also helps from a audit/moderation perspective (it looks like what we did in the last release might hide the content from a site admin, for example). I’d rather keep the message chain intact.

How about something like, changing the user ID of messages-to-be-deleted to -1, and treating such in the message loop as a “deleted user”, and showing that appropriately in the templates?

#13 @johnjamesjacoby
2 weeks ago

I’d prefer we keep everything pristine if possible. What if we leave the user ID as is but add a piece of meta data that indicates the user was “deleted”?

In a multisite instance, a user might be marked as deleted but not actually deleted. Also, changing the user ID reduces the integrity of the data, and then assigns every deleted message to a ghost user.

Not to mention I don’t think our database tables do (so should) allow a negative number in those columns.

A “user_status” meta key could then be used to communicate to the UI where appropriate that the author data of that message is not available in several ways, not just “deleted” but “hidden” or something else.

#14 @DJPaul
2 weeks ago

Complexities there are around a join against message meta, and lots of duplicate-value message meta records (depending on volume of deleted messages a user has), and some sort of inverse cache strategy for “this is not actually a user”.

#15 @johnjamesjacoby
2 weeks ago

We wouldn't need to join, exactly, because we aren't excluding those messages from threads. All we need to do is say: if the user_status meta key for this message ID equals deleted then show some "Deleted" type user name instead. I don't remember if we lazy load message meta or not, but ideally if we are, there's nothing gained or lost this way.

Note: See TracTickets for help on using tickets.