Skip to:
Content

BuddyPress.org

Opened 11 years ago

Closed 10 years ago

#5303 closed defect (bug) (fixed)

DISTINCT causing poor performance on user queries

Reported by: johnjamesjacoby's profile johnjamesjacoby Owned by: johnjamesjacoby's profile johnjamesjacoby
Milestone: 2.1 Priority: normal
Severity: normal Version: 1.2
Component: Core Keywords:
Cc:

Description

On large installations with thousands/millions of users, a few of our user queries are looking for DISTINCT results on columns that already contain unique data; namely, the id column.

See: BP_User_Query::prepare_user_ids_query()

Related, Andy had noticed this years ago in r3023 and removed one, but it eventually crept back in, either in a later merge, or copy/pasted from a different query.

Change History (18)

#1 @johnjamesjacoby
11 years ago

I propose we remove DISTINCT's where they are not necessary. Patch incoming.

#2 follow-up: @johnjamesjacoby
11 years ago

In 7698:

Remove ignored DISTINCT from BP_XProfile_Group::get() queries. See #5303.

#3 @johnjamesjacoby
11 years ago

In 7699:

Remove unused DISTINCT from BP_Blogs_Blog::is_hidden() query. See #5303.

#4 @johnjamesjacoby
11 years ago

In 7700:

Remove unnecessary DISTINCT usages from bp-core-classes.php. See #5303.

#6 @johnjamesjacoby
11 years ago

In 7701:

Use correct ID column name rather than id. Reverts testing code accidentally committed in r7700. See #5303.

#7 in reply to: ↑ 2 @cmmarslender
11 years ago

Replying to johnjamesjacoby:

In 7698:

Remove ignored DISTINCT from BP_XProfile_Group::get() queries. See #5303.

This change caused profile fields to repeat when viewing a member profile

#8 @johnjamesjacoby
11 years ago

In 7708:

Revert r7698. Causing profile fields to repeat themselves in the loop. Needs unit tests before revisiting. See #5303.

#9 @johnjamesjacoby
11 years ago

In 7709:

Revert r7707, and add inline documentation to BP_Notifications_Notification:get_total_count(). See #5303.

Version 0, edited 11 years ago by johnjamesjacoby (next)

#10 @lenasterg
11 years ago

Hi.
I think also in /bp-activity/bp-activity-classes.php at line 295
the

// Select conditions
        $select_sql = "SELECT DISTINCT a.id";

should change to

// Select conditions
        $select_sql = "SELECT a.id";

#11 @boonebgorges
11 years ago

lenasterg - That DISTINCT is necessary. See #5210. I'm hopeful that the changes in r7765 (#5349) will have introduced enough performance improvements to the activity query that that particular DISTINCT won't matter.

#12 @lenasterg
11 years ago

Sorry, about the false-positive.

Is it the same case at /bp-groups/bp-groups-classes.php on line 2033 ?

$group_sql = $wpdb->prepare( "SELECT DISTINCT group_id FROM {$bp->groups->table_name_members} WHERE user_id = %d AND is_confirmed = 1 AND is_banned = 0{$pag_sql}", $user_id );

and line 2034

$total_groups = $wpdb->get_var( $wpdb->prepare( "SELECT COUNT(DISTINCT group_id) FROM {$bp->groups->table_name_members} WHERE user_id = %d AND is_confirmed = 1 AND is_banned = 0", $user_id ) );

#13 @boonebgorges
11 years ago

Yes, we can probably drop DISTINCT in these cases. There should be blocks elsewhere in BP to ensure that each group-user pair can only receive one membership. Though, fwiw, the impact of this particular DISTINCT is probably very, very small, because we're only selecting the heavily indexed group_id column.

#14 @boonebgorges
10 years ago

  • Milestone changed from 2.0 to 2.1

#15 @boonebgorges
10 years ago

In 8289:

Reintroduce DISTINCT keyword to BP_Core_User::get_users()

The lack of DISTINCT was causing duplicate entries when joining against the
xprofile fields for the purposes of alphabetical sorts.

Reverts part of r7700. See #5303

Fixes #5550

#16 @boonebgorges
10 years ago

In 8290:

Reintroduce DISTINCT keyword to BP_Core_User::get_users()

The lack of DISTINCT was causing duplicate entries when joining against the
xprofile fields for the purposes of alphabetical sorts.

Reverts part of r7700. See #5303

Fixes #5550

#17 @DJPaul
10 years ago

  • Keywords early removed

Is this done now? Are the remaining DISTINCTs required? The issue behind the commit that Boone mentions above indicates we should fix up that SQL query, and then remove the DISTINCT, but I'd prefer that on a separate ticket as this ticket is fairly general.

#18 @boonebgorges
10 years ago

  • Resolution set to fixed
  • Status changed from new to closed

Let's close this ticket, as (a) I think we've gotten the worse offenders, and (b) it's unlikely that anyone is going to do the kind of systematic review of all instances of DISTINCT that would be required to do more work toward the ticket as described.

Please open separate tickets for individual instances of DISTINCT that are causing problems.

Note: See TracTickets for help on using tickets.