Skip to:
Content

BuddyPress.org

Opened 9 years ago

Closed 6 years ago

#6346 closed enhancement (maybelater)

Upgrade DB tables to utf8mb4

Reported by: djpaul's profile DJPaul Owned by:
Milestone: Priority: high
Severity: major Version:
Component: Core Keywords: needs-patch, trac-tidy-2018
Cc: mail@…

Description

Per https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/, we should consider switching our DB tables to use utf8mb4. Hopefully, we can re-use a lot of the new logic in WordPress 4.2 to make this easy.

If we do this, we need to remember to check the size of the indexes. For example, WP's meta tables have a change, and our meta tables should probably be resized to keep things consistent.

A notable consequence of this is that it would bump our minimum WordPress version up to 4.2.

Attachments (4)

6346.01.patch (5.6 KB) - added by johnjamesjacoby 9 years ago.
First pass
6346.02.patch (1.6 KB) - added by johnjamesjacoby 9 years ago.
Just add meta_key index lengths
6346.hex-unit-test.patch (840 bytes) - added by r-a-y 8 years ago.
6346.hex.patch (1.7 KB) - added by r-a-y 8 years ago.

Download all attachments as: .zip

Change History (34)

#1 @boonebgorges
9 years ago

  • Milestone changed from Under Consideration to Future Release

Yes, we should do this at some point.

A notable consequence of this is that it would bump our minimum WordPress version up to 4.2.

I imagine there's a way to do this so that it doesn't require 4.2.

  • Shrinking the indexes does not depend on 4.2
  • We could check WP version before attempting to use WP functions for the table conversion. If the minimum version were not met, the migrator would attempt to run again on the next BP update.
  • There is some magic that should happen to ensure that invalid characters should be stripped before insert/update - see wpdb::strip_invalid_text_for_column() - but we can probably safely wrap this in an is_callable() wrapper, as WP itself does in one or two places. This would give us better error checking when 4.2 is available, but would leave the status quo in place on < 4.2.

Requiring 4.2 will mean that we can't do this in BP for at least another couple major versions. I think it's worth exploring ways to do it sooner - even for 2.3, if we figure out the relevant workarounds in time.

@johnjamesjacoby
9 years ago

First pass

#2 @johnjamesjacoby
9 years ago

6346.01.patch does a few things:

  • Introduces a convenience function for getting the max index length (could be upstream patch)
  • Adds max index length for our meta_key columns
  • Moves id PRIMARY KEY to its own line
  • Makes our id columns unsigned to match WordPress core and get what little speed boost comes from it

Note that dbDelta() appears to not play nicely with our queries, at least on my test installations. I don't quite understand why yet.

Version 2, edited 9 years ago by johnjamesjacoby (previous) (next) (diff)

@johnjamesjacoby
9 years ago

Just add meta_key index lengths

#3 @johnjamesjacoby
9 years ago

6346.02.patch is a more conservative approach that we can easily roll into 2.3 today, and it silences our dbDelta() issues.

I'm going to commit the 02 patch immediately because:

  • I'm really sick of seeing the notices in my logs
  • It is extremely low risk – see: almost none – WordPress core confidently did this, so I believe we can also
  • It lets us work towards more schema tweaks proposed in the 01 patch

#4 @johnjamesjacoby
9 years ago

In 9695:

Core: Add max index length to meta_key indices for new installations.

  • More to do regarding database upgrade paths later
  • Fixes notices while running phpunit tests

See #6346.

#5 @johnjamesjacoby
9 years ago

We have maybe_convert_table_to_utf8mb4() at our disposal for upgrades too, we just don't have an easy way to iterate through all of our tables, and it will take a bit of refactoring to make it so.

#6 @netweb
9 years ago

  • Milestone changed from Future Release to Under Consideration
  • Severity changed from normal to major

I just had the following notices on an existing install after svn up'ing this morning:

WordPress database error: [Duplicate key name 'meta_key']
ALTER TABLE wp_bp_activity_meta ADD KEY meta_key (meta_key(191))

WordPress database error: [Duplicate key name 'meta_key']
ALTER TABLE wp_bp_groups_groupmeta ADD KEY meta_key (meta_key(191))

WordPress database error: [Duplicate key name 'meta_key']
ALTER TABLE wp_bp_xprofile_meta ADD KEY meta_key (meta_key(191))

WordPress database error: [Duplicate key name 'meta_key']
ALTER TABLE wp_bp_user_blogs_blogmeta ADD KEY meta_key (meta_key(191))

That's 4 of the 6 in r9695

BuddyPress was already activated so these alter db queries did not occur because it was a "new installation" per the commit message of r9695, these occurred on an existing installation.

#7 @netweb
9 years ago

To add some more context:

This was my Windows PC running native PHP, MariaDB and Apache (i.e. Not VVV) at r9683 yesterday:

F:\BuddyPressPS>svn up
Updating '.':
U    src\bp-core\bp-core-functions.php
U    src\bp-loader.php

Fetching external item into 'src\bp-forums\bbpress':

Fetching external item into 'src\bp-forums\bbpress\bb-includes\backpress':

Fetching external item into 'src\bp-forums\bbpress\bb-includes\backpress\pomo':
Updated external to revision 191.

At revision 361.
At revision 5670.
At revision 9683.

F:\BuddyPressPS>phpunit
Installing...
Running as single site... To run multisite, use -c tests/phpunit/multisite.xml
Installing BuddyPress...
Not running ajax tests... To execute these, use --group ajax.
PHPUnit 4.2.6 by Sebastian Bergmann.

Configuration read from F:\BuddyPress\phpunit.xml.dist

...............................................................  63 / 951 (  6%)
............................................................... 126 / 951 ( 13%)
............................................................... 189 / 951 ( 19%)

I was not seeing the WordPress DB error that @JohnJamesJacoby saw here reported in Slack:

vagrant@vvv:/srv/www/wordpress-develop/src/wp-content/plugins/buddypress$ phpunit
Installing...
Running as single site... To run multisite, use -c tests/phpunit/multisite.xml
Installing BuddyPress...
WordPress database error Duplicate key name 'meta_key' for query ALTER TABLE wptests_bp_notifications_meta ADD KEY meta_key (meta_key) made by bp_version_updater, bp_core_install, bp_core_install_notifications, dbDelta
<div id='error'>
            <p class='wpdberror'><strong>WordPress database error:</strong> [Duplicate key name &#039;meta_key&#039;]<br />
            <code>ALTER TABLE wptests_bp_notifications_meta ADD KEY meta_key (meta_key)</code></p>
            </div>Not running ajax tests. To execute these, use --group ajax.
Not running ms-files tests. To execute these, use --group ms-files.
Not running external-http tests. To execute these, use --group external-http.
PHPUnit 4.0.20 by Sebastian Bergmann.

#8 @johnjamesjacoby
9 years ago

In 9703:

XProfile: Quick clean-up to BP_XProfile_Query

  • Remove unused $wpdb global reference
  • Remove unused $context parameter in get_sql() method
  • Add brackets to conditionals

See #6346.

#9 @boonebgorges
9 years ago

The issue is that dbDelta() is attempting to change the indexes on the tables, but it doesn't have proper support for DROP INDEX. See https://core.trac.wordpress.org/ticket/22023#comment:52 (whoops), https://core.trac.wordpress.org/ticket/30795, https://core.trac.wordpress.org/ticket/26801, https://core.trac.wordpress.org/ticket/31869.

We need an equivalent of bp_pre_schema_upgrade(). johnjamesjacoby is correct that we don't have an easy way to loop through tables, but at bp_pre_schema_upgrade() the components haven't yet installed/registered themselves anyway, so a hardcoded array seems fine to me. I'm going to commit a scaffold for this, and we can add tables as necessary.

#10 @boonebgorges
9 years ago

In 9716:

Introduce bp_pre_schema_upgrade() and use it to drop old meta_key indexes.

After [9695], BP tries to update existing indexes using dbDelta(), but
dbDelta() cannot DROP INDEX properly. So we must run a separate routine
before the regular schema updates. See WP's pre_schema_upgrade().

See #6346.

#11 @boonebgorges
9 years ago

In 9717:

In bp_pre_schema_upgrade(), only drop indexes if the table exists.

See [9716]. See #6346.

#12 @boonebgorges
9 years ago

In 9718:

In bp_pre_schema_upgrade(), use bp_esc_like() rather than $wpdb->esc_like().

The $wpdb method was introduced in WP 4.0. bp_esc_like() is a backward-
compatible wrapper.

See [9716]. See #6346.

#13 @DJPaul
9 years ago

  • Milestone changed from Under Consideration to 2.3

#14 @boonebgorges
9 years ago

  • Milestone changed from 2.3 to 2.4

Since 4.2, there have been a number of weird tickets opened on WP Trac related to utf8mb4 updates, including the loss of data in some cases where one odd collation is changed to another. The benefits of making the move in BP are pretty modest, so I'd say it's not worth the risk of rushing into this. Let's continue the task in 2.4, once things have settled down in WP.

#15 follow-up: @DJPaul
9 years ago

Looking forward to getting this in now.

#16 in reply to: ↑ 15 @wpdennis
9 years ago

  • Cc mail@… added
  • Keywords dev-feedback added

Didn't want to open a new ticket, but I think Emojii support is more than an enhancement and kind of necessary to prevent unexpected behaviour.

If someone is posting content with Emojies in e.g. bbpress, $wpdb->query() returns false and BuddyPress fails silently without creating the activity in BP_Activity_Activity->save():

if ( false === $wpdb->query( $q ) ) {
    return false;
}

So, in cases where Emojis are "injected" by bp_activity_add() BuddyPress doesn't work as expected. Depending on the level of usage of activites (e.g. "What's new?" widgets), this could be an issue.

#17 @DJPaul
8 years ago

  • Keywords needs-patch added; dev-feedback removed
  • Milestone changed from 2.4 to 2.5

One day we'll have emojis, but not today. :(

#18 @r-a-y
8 years ago

If someone is posting content with Emojies in e.g. bbpress, $wpdb->query() returns false and BuddyPress fails silently without creating the activity in BP_Activity_Activity->save()

This isn't just specific to emojis.

If a hex character prefixed with A-F is used in the content:
http://www.ascii.cl/htmlcodes.htm

bp_activity_add() fails silently as well. I've attached a unit test exhibiting this problem. This test can be fixed temporariliy by using htmlentities() before inserting the activity item:
add_filter( 'bp_activity_content_before_save', 'htmlentities' ); (This doesn't fix the emojis problem)

I ran into this problem with the WP Front-end Editor plugin with BP activity integration.

Upgrading our activity DB tables might also fix this.

FYI, wp_insert_post() also fails with this test content as well. So maybe we don't want to do anything here and leave it to plugins to properly encode characters before insertion?

#19 @boonebgorges
8 years ago

The decision that the WP core team made, when introducing the strip-invalid-chars stuff, was to let controllers handle this. See edit_post(): https://core.trac.wordpress.org/browser/tags/4.3.1/src/wp-admin/includes/post.php?marks=369-381#L365 We could do something similar; we'd need to do it in all the various places where we insert multibyte-friendly content.

#20 @r-a-y
8 years ago

Thanks for the edit_post() reference!

Updated patch adds that logic to bp_activity_add() and is only applicable for WP 4.2+.

As for emojis, at the moment, if you try to paste emojis into the activity update form, they will be stripped.

@r-a-y
8 years ago

#21 @DJPaul
8 years ago

  • Milestone changed from 2.5 to 2.6

#22 @DJPaul
8 years ago

  • Milestone changed from 2.6 to Future Release

#23 @DJPaul
8 years ago

  • Component changed from Component - Any/All to Core

#24 @timfield
8 years ago

People on mobile devices are quick to use Emoji's these days given they are part of the keyboard as a rule, so this is becoming quite a problem.

I manually updated my wp_bp_activity*, wp_bp_messages* tables using maybe_convert_table_to_utf8mb4 And I'm now able to create activity feeds and messages that contain Emoji. Is this likely to have other implications or is this considered a safe way to fix the issue?

#25 @DJPaul
8 years ago

  • Milestone changed from Future Release to 2.7
  • Priority changed from normal to high

When BP 2.7 comes out in a couple months, #7117 fixed the default charset; we'll default to what WordPress uses (utf8mb4). So new installs will be fine. Up until then, BuddyPress has been using the default charset as specified by MySQL. On my laptop, that's latin1.

WordPress migrated its tables to utf8mb4 with an appropriate charset, if (via https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/):

  • You’re currently using the utf8 character set.
  • Your MySQL server is version 5.5.3 or higher (including all 10.x versions of MariaDB).
  • Your MySQL client libraries are version 5.5.3 or higher. If you’re using mysqlnd, 5.0.9 or higher.

The first part of this is implemented in WP's maybe_convert_table_to_utf8mb4().
The other schema change that was required were restrictions on some indexes' sizes. I've just checked, and we've done those changes correctly in r9695.

A remaining task is to run strip_invalid_text_for_column where appropriate, as @boonebgorges and @r-a-y discussed above, about a year ago.

There is an argument to be made that we should upgrade existing site's BP table schemas from <anything> to utf8mb4, but because <anything> could and will be different across different servers, this sounds technically complicated to do in a robust way, and if we get it wrong, we'll corrupt our users' data. There is probably a Good Reason why WordPress only updates utf8 to utf8mb4.

This ticket was mentioned in Slack in #buddypress by mercime. View the logs.


8 years ago

#27 @mercime
8 years ago

  • Milestone changed from 2.7 to Future Release

This ticket was mentioned in Slack in #buddypress by slaffik. View the logs.


7 years ago

#29 @DJPaul
6 years ago

  • Keywords trac-tidy-2018 added

We're closing this ticket because it has not received any contribution or comments for at least two years. We have decided that it is better to close tickets that are good ideas, which have not gotten (or are unlikely to get) contributions, rather than keep things open indefinitely. This will help us share a more realistic roadmap for BuddyPress with you.

Everyone very much appreciates the time and effort that you spent sharing your idea with us. On behalf of the entire BuddyPress team, thank you.

If you feel strongly that this enhancement should still be added to BuddyPress, and you are able to contribute effort towards it, we encourage you to re-open the ticket, or start a discussion about it in our Slack channel. Please consider that time has proven that good ideas without contributions do not get built.

For more information, see https://bpdevel.wordpress.com/2018/01/21/our-awaiting-contributions-milestone-contains/
or find us on Slack, in the #buddypress channel: https://make.wordpress.org/chat/

#30 @DJPaul
6 years ago

  • Milestone Awaiting Contributions deleted
  • Resolution set to maybelater
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.