Opened 9 years ago
Closed 7 years ago
#6346 closed enhancement (maybelater)
Upgrade DB tables to utf8mb4
Reported by: | DJPaul | Owned by: | |
---|---|---|---|
Milestone: | Priority: | high | |
Severity: | major | Version: | |
Component: | Core | Keywords: | needs-patch, trac-tidy-2018 |
Cc: | mail@… |
Description
Per https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/, we should consider switching our DB tables to use utf8mb4. Hopefully, we can re-use a lot of the new logic in WordPress 4.2 to make this easy.
If we do this, we need to remember to check the size of the indexes. For example, WP's meta tables have a change, and our meta tables should probably be resized to keep things consistent.
A notable consequence of this is that it would bump our minimum WordPress version up to 4.2.
Attachments (4)
Change History (34)
#2
@
9 years ago
6346.01.patch does a few things:
- Introduces a convenience function for getting the max index length (could be upstream patch)
- Adds max index length for our
meta_key
columns - Moves
id
PRIMARY KEY
to its own line - Makes our
id
columnsunsigned
to match WordPress core and get what little speed boost comes from it
dbDelta()
runs our updates correctly, but doesn't seem to recognize they aren't necessary after the initial run, and I don't quite understand why that is (if it's a problem with dbDelta()
or our usage of it.)
#3
@
9 years ago
6346.02.patch is a more conservative approach that we can easily roll into 2.3 today, and it silences our dbDelta()
issues.
I'm going to commit the 02
patch immediately because:
- I'm really sick of seeing the notices in my logs
- It is extremely low risk – see: almost none – WordPress core confidently did this, so I believe we can also
- It lets us work towards more schema tweaks proposed in the
01
patch
#5
@
9 years ago
We have maybe_convert_table_to_utf8mb4()
at our disposal for upgrades too, we just don't have an easy way to iterate through all of our tables, and it will take a bit of refactoring to make it so.
#6
@
9 years ago
- Milestone changed from Future Release to Under Consideration
- Severity changed from normal to major
I just had the following notices on an existing install after svn up'ing this morning:
WordPress database error: [Duplicate key name 'meta_key'] ALTER TABLE wp_bp_activity_meta ADD KEY meta_key (meta_key(191)) WordPress database error: [Duplicate key name 'meta_key'] ALTER TABLE wp_bp_groups_groupmeta ADD KEY meta_key (meta_key(191)) WordPress database error: [Duplicate key name 'meta_key'] ALTER TABLE wp_bp_xprofile_meta ADD KEY meta_key (meta_key(191)) WordPress database error: [Duplicate key name 'meta_key'] ALTER TABLE wp_bp_user_blogs_blogmeta ADD KEY meta_key (meta_key(191))
That's 4 of the 6 in r9695
BuddyPress was already activated so these alter db queries did not occur because it was a "new installation" per the commit message of r9695, these occurred on an existing installation.
#7
@
9 years ago
To add some more context:
This was my Windows PC running native PHP, MariaDB and Apache (i.e. Not VVV) at r9683 yesterday:
F:\BuddyPressPS>svn up Updating '.': U src\bp-core\bp-core-functions.php U src\bp-loader.php Fetching external item into 'src\bp-forums\bbpress': Fetching external item into 'src\bp-forums\bbpress\bb-includes\backpress': Fetching external item into 'src\bp-forums\bbpress\bb-includes\backpress\pomo': Updated external to revision 191. At revision 361. At revision 5670. At revision 9683. F:\BuddyPressPS>phpunit Installing... Running as single site... To run multisite, use -c tests/phpunit/multisite.xml Installing BuddyPress... Not running ajax tests... To execute these, use --group ajax. PHPUnit 4.2.6 by Sebastian Bergmann. Configuration read from F:\BuddyPress\phpunit.xml.dist ............................................................... 63 / 951 ( 6%) ............................................................... 126 / 951 ( 13%) ............................................................... 189 / 951 ( 19%)
I was not seeing the WordPress DB error that @JohnJamesJacoby saw here reported in Slack:
vagrant@vvv:/srv/www/wordpress-develop/src/wp-content/plugins/buddypress$ phpunit Installing... Running as single site... To run multisite, use -c tests/phpunit/multisite.xml Installing BuddyPress... WordPress database error Duplicate key name 'meta_key' for query ALTER TABLE wptests_bp_notifications_meta ADD KEY meta_key (meta_key) made by bp_version_updater, bp_core_install, bp_core_install_notifications, dbDelta <div id='error'> <p class='wpdberror'><strong>WordPress database error:</strong> [Duplicate key name 'meta_key']<br /> <code>ALTER TABLE wptests_bp_notifications_meta ADD KEY meta_key (meta_key)</code></p> </div>Not running ajax tests. To execute these, use --group ajax. Not running ms-files tests. To execute these, use --group ms-files. Not running external-http tests. To execute these, use --group external-http. PHPUnit 4.0.20 by Sebastian Bergmann.
#9
@
9 years ago
The issue is that dbDelta()
is attempting to change the indexes on the tables, but it doesn't have proper support for DROP INDEX
. See https://core.trac.wordpress.org/ticket/22023#comment:52 (whoops), https://core.trac.wordpress.org/ticket/30795, https://core.trac.wordpress.org/ticket/26801, https://core.trac.wordpress.org/ticket/31869.
We need an equivalent of bp_pre_schema_upgrade()
. johnjamesjacoby is correct that we don't have an easy way to loop through tables, but at bp_pre_schema_upgrade()
the components haven't yet installed/registered themselves anyway, so a hardcoded array seems fine to me. I'm going to commit a scaffold for this, and we can add tables as necessary.
#14
@
9 years ago
- Milestone changed from 2.3 to 2.4
Since 4.2, there have been a number of weird tickets opened on WP Trac related to utf8mb4 updates, including the loss of data in some cases where one odd collation is changed to another. The benefits of making the move in BP are pretty modest, so I'd say it's not worth the risk of rushing into this. Let's continue the task in 2.4, once things have settled down in WP.
#16
in reply to:
↑ 15
@
9 years ago
- Cc mail@… added
- Keywords dev-feedback added
Didn't want to open a new ticket, but I think Emojii support is more than an enhancement and kind of necessary to prevent unexpected behaviour.
If someone is posting content with Emojies in e.g. bbpress, $wpdb->query() returns false and BuddyPress fails silently without creating the activity in BP_Activity_Activity->save():
if ( false === $wpdb->query( $q ) ) { return false; }
So, in cases where Emojis are "injected" by bp_activity_add() BuddyPress doesn't work as expected. Depending on the level of usage of activites (e.g. "What's new?" widgets), this could be an issue.
#17
@
9 years ago
- Keywords needs-patch added; dev-feedback removed
- Milestone changed from 2.4 to 2.5
One day we'll have emojis, but not today. :(
#18
@
9 years ago
If someone is posting content with Emojies in e.g. bbpress,
$wpdb->query()
returns false and BuddyPress fails silently without creating the activity inBP_Activity_Activity->save()
This isn't just specific to emojis.
If a hex character prefixed with A-F is used in the content:
http://www.ascii.cl/htmlcodes.htm
bp_activity_add()
fails silently as well. I've attached a unit test exhibiting this problem. This test can be fixed temporariliy by using htmlentities()
before inserting the activity item:
add_filter( 'bp_activity_content_before_save', 'htmlentities' );
(This doesn't fix the emojis problem)
I ran into this problem with the WP Front-end Editor plugin with BP activity integration.
Upgrading our activity DB tables might also fix this.
FYI, wp_insert_post()
also fails with this test content as well. So maybe we don't want to do anything here and leave it to plugins to properly encode characters before insertion?
#19
@
9 years ago
The decision that the WP core team made, when introducing the strip-invalid-chars stuff, was to let controllers handle this. See edit_post()
: https://core.trac.wordpress.org/browser/tags/4.3.1/src/wp-admin/includes/post.php?marks=369-381#L365 We could do something similar; we'd need to do it in all the various places where we insert multibyte-friendly content.
#20
@
9 years ago
Thanks for the edit_post()
reference!
Updated patch adds that logic to bp_activity_add()
and is only applicable for WP 4.2+.
As for emojis, at the moment, if you try to paste emojis into the activity update form, they will be stripped.
#24
@
8 years ago
People on mobile devices are quick to use Emoji's these days given they are part of the keyboard as a rule, so this is becoming quite a problem.
I manually updated my wp_bp_activity*, wp_bp_messages* tables using maybe_convert_table_to_utf8mb4 And I'm now able to create activity feeds and messages that contain Emoji. Is this likely to have other implications or is this considered a safe way to fix the issue?
#25
@
8 years ago
- Milestone changed from Future Release to 2.7
- Priority changed from normal to high
When BP 2.7 comes out in a couple months, #7117 fixed the default charset; we'll default to what WordPress uses (utf8mb4
). So new installs will be fine. Up until then, BuddyPress has been using the default charset as specified by MySQL. On my laptop, that's latin1
.
WordPress migrated its tables to utf8mb4 with an appropriate charset, if (via https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/):
- You’re currently using the utf8 character set.
- Your MySQL server is version 5.5.3 or higher (including all 10.x versions of MariaDB).
- Your MySQL client libraries are version 5.5.3 or higher. If you’re using mysqlnd, 5.0.9 or higher.
The first part of this is implemented in WP's maybe_convert_table_to_utf8mb4()
.
The other schema change that was required were restrictions on some indexes' sizes. I've just checked, and we've done those changes correctly in r9695.
A remaining task is to run strip_invalid_text_for_column
where appropriate, as @boonebgorges and @r-a-y discussed above, about a year ago.
There is an argument to be made that we should upgrade existing site's BP table schemas from <anything>
to utf8mb4, but because <anything>
could and will be different across different servers, this sounds technically complicated to do in a robust way, and if we get it wrong, we'll corrupt our users' data. There is probably a Good Reason why WordPress only updates utf8
to utf8mb4
.
This ticket was mentioned in Slack in #buddypress by mercime. View the logs.
8 years ago
This ticket was mentioned in Slack in #buddypress by slaffik. View the logs.
8 years ago
#29
@
7 years ago
- Keywords trac-tidy-2018 added
We're closing this ticket because it has not received any contribution or comments for at least two years. We have decided that it is better to close tickets that are good ideas, which have not gotten (or are unlikely to get) contributions, rather than keep things open indefinitely. This will help us share a more realistic roadmap for BuddyPress with you.
Everyone very much appreciates the time and effort that you spent sharing your idea with us. On behalf of the entire BuddyPress team, thank you.
If you feel strongly that this enhancement should still be added to BuddyPress, and you are able to contribute effort towards it, we encourage you to re-open the ticket, or start a discussion about it in our Slack channel. Please consider that time has proven that good ideas without contributions do not get built.
For more information, see https://bpdevel.wordpress.com/2018/01/21/our-awaiting-contributions-milestone-contains/
or find us on Slack, in the #buddypress channel: https://make.wordpress.org/chat/
Yes, we should do this at some point.
I imagine there's a way to do this so that it doesn't require 4.2.
wpdb::strip_invalid_text_for_column()
- but we can probably safely wrap this in anis_callable()
wrapper, as WP itself does in one or two places. This would give us better error checking when 4.2 is available, but would leave the status quo in place on < 4.2.Requiring 4.2 will mean that we can't do this in BP for at least another couple major versions. I think it's worth exploring ways to do it sooner - even for 2.3, if we figure out the relevant workarounds in time.