Opened 13 years ago
Closed 7 years ago
#4041 closed enhancement (maybelater)
Use SELECT FOUND_ROWS() for 'total' queries instead of a second query
Reported by: | boonebgorges | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | |
Component: | Performance | Keywords: | trac-tidy-2018 |
Cc: | DJPaul |
Description
Throughout BP, we do two queries for content loops: one that is paginated (LIMIT) and one that does COUNT(*) for a total count. The second one is unnecessary; we should be using SELECT FOUND_ROWS() instead, to avoid a second, potentially costly, query.
Note that this will cause backpat problems in the case of filters on _total_ queries. I recommend the following course of action:
- Remove the _total_ filters altogether
- Mark the _paged_ filters as deprecated (in the inline docs)
- Add new filters that make no reference to _paged_ or _total_ since it will no longer be relevant.
Attachments (1)
Change History (17)
#3
@
13 years ago
I recently experienced bad results using SELECT FOUND_ROWS().
SQL_CALC_FOUND_ROWS seems to trigger a filesort when ORDER BY is used.
This behavior should be fixed in mysql > 5.6.2, http://bugs.mysql.com/bug.php?id=18454
For now I think it's better/faster to just keep the 2nd query.
The ORDER BY and SORT parts should be removed however, since it's slow(er) and useless.
#4
@
13 years ago
webraket - Thanks for the pointer. WordPress uses SELECT FOUND_ROWS() in WP_Query - have you found similar performance issues?
I would like to benchmark this before moving forward with any broad changes. Someone should write a script that will write a few million records into the activities table (I will do it this weekend if I get a chance) and then test webraket's patch for the activity query against SELECT FOUND_ROWS(). If they are close or the same, we should go with SELECT FOUND_ROWS() (all things being equal, it's better to do what WP does, better to hand work off to the DB, better to have fewer queries). But if we find that webraket's results are reproducible, then we should probably think about keeping what we currently have in place, at least for components (like Activity) where the table is liable to get very, very, very large.
#5
@
13 years ago
When I used SQL_CALC_FOUND_ROWS, that query ended up multiple times in slow-query.log
The avg. server-load spiked around 4.0
# Query_time: 20.422221 Lock_time: 0.000089 Rows_sent: 20 Rows_examined: 1836712 SELECT SQL_CALC_FOUND_ROWS a.*, u.user_email, u.user_nicename, u.user_login, u.display_name FROM wp_bp_activity a LEFT JOIN wp_users u ON a.user_id = u.ID WHERE a.hide_sitewide = 0 AND a.type != 'activity_comment' ORDER BY a.date_recorded DESC LIMIT 0, 20; # Query_time: 13.388473 Lock_time: 0.013897 Rows_sent: 20 Rows_examined: 1836778 SELECT SQL_CALC_FOUND_ROWS a.*, u.user_email, u.user_nicename, u.user_login, u.display_name FROM wp_bp_activity a LEFT JOIN wp_users u ON a.user_id = u.ID WHERE a.hide_sitewide = 0 AND a.type != 'activity_comment' ORDER BY a.date_recorded DESC LIMIT 0, 20;
1.8million rows examined seems way too much for 600.000 activity items I currently have.
Maybe SQL_CALC_FOUND_ROWS performs the wp_user JOIN for al activity items it needs to count?
Phpmyadmin shows "using where;using filesort;" with EXPLAIN SELECT...
Once I remove (SQL_CALC_FOUND_ROWS) or (ORDER BY a.date_recorded DESC) it only shows "using where;"
I went back to the count() query, resulting in server-load around 0.5 - 1.0 ;)
It's still not optimal however:
# Query_time: 15.334062 Lock_time: 0.000034 Rows_sent: 1 Rows_examined: 613581 SELECT count(a.id) FROM wp_bp_activity a WHERE a.hide_sitewide = 0 AND a.type != 'activity_comment';
PS: I also see the WP_QUERY using SQL_CALC_FOUND_ROWS in slow-query.log
Many Wordpress users reported problems with it:
http://core.trac.wordpress.org/ticket/10469
http://core.trac.wordpress.org/ticket/10964
# Query_time: 10.427559 Lock_time: 0.106090 Rows_sent: 10 Rows_examined: 90 SELECT SQL_CALC_FOUND_ROWS wp_12958_posts.* FROM wp_12958_posts WHERE 1=1 AND wp_12958_posts.post_type = 'post' AND (wp_12958_posts.post_status = 'publish') ORDER BY wp_12958_posts.post_date DESC LIMIT 0, 10; # Query_time: 12.887792 Lock_time: 0.020614 Rows_sent: 10 Rows_examined: 12 SELECT SQL_CALC_FOUND_ROWS wp_23379_posts.* FROM wp_23379_posts WHERE 1=1 AND wp_23379_posts.post_type = 'post' AND (wp_23379_posts.post_status = 'publish') ORDER BY wp_23379_posts.post_date DESC LIMIT 0, 10;
NOTE: These test results come directly from a busy production server.
Someone should test it in a local environment to ensure query's are not locking up each other to get the best measurements possible.
#7
@
13 years ago
The ticket with an almost complete solution has ended up at #WP18536. The solution was to split things into two queries. It's not complete as jjj reported a bug, but scribu has a patch up to solve that.
#8
@
13 years ago
- Milestone changed from 1.6 to Future Release
Thanks, cnorris23. I had seen that ticket but had not put two and two together as regards this one.
I've done a bit of testing on my own setups with massive data sets and have found results similar to what webraket has suggested.
In light of this, I want to put this off until the next dev cycle, at which point we can revisit and see if there's something from WP's (finished) version that we can riff off of in our own queries.
#12
@
8 years ago
- Component changed from Core to Performance
- Type changed from defect (bug) to enhancement
The second one is unnecessary; we should be using SELECT FOUND_ROWS() instead, to avoid a second, potentially costly, query.
WP already does a count of sorts with $wpdb->num_rows
for the last query:
https://codex.wordpress.org/Class_Reference/wpdb#Class_Variables
If this works, this sounds like a good performance enhancement!
#13
@
8 years ago
- Milestone changed from Future Release to 2.7
Tempted to see if we can do this for 2.7.
#15
@
7 years ago
- Keywords trac-tidy-2018 added
We're closing this ticket because it has not received any contribution or comments for at least two years. We have decided that it is better to close tickets that are good ideas, which have not gotten (or are unlikely to get) contributions, rather than keep things open indefinitely. This will help us share a more realistic roadmap for BuddyPress with you.
Everyone very much appreciates the time and effort that you spent sharing your idea with us. On behalf of the entire BuddyPress team, thank you.
If you feel strongly that this enhancement should still be added to BuddyPress, and you are able to contribute effort towards it, we encourage you to re-open the ticket, or start a discussion about it in our Slack channel. Please consider that time has proven that good ideas without contributions do not get built.
For more information, see https://bpdevel.wordpress.com/2018/01/21/our-awaiting-contributions-milestone-contains/
or find us on Slack, in the #buddypress channel: https://make.wordpress.org/chat/
Fine by me.