#1445 closed defect (bug) (fixed)
Forums install as latin1 not utf8
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | 1.5 | Priority: | critical |
Severity: | | Version: | |
Component: | Core | Keywords: | confirmed |
Cc: | kayue | | |
Description
Thread: http://buddypress.org/forums/topic/forums-section-displaying-like
Tested on BP 1.1.3 and WPMU 2.8.6.
The wp_bp_ DB tables are utf8 but the wp_bb_ tables are latin1. This causes problems with non-English character sets on the forums. See the thread linked above for screenshots. Looking at the BP/BB code for the one-click forum install (tested only against a "new" forum install, not an upgrade), it looks like the default ought to be utf8.
In my bb-config.php in the root of the site, the charset IS set to UTF-8, but the DB tables were still installed as latin1. I'm not sure whether this is a BP or BB issue; the forums component is probably the weakest area of my BuddyPress knowledge. I've had a look at the code and I can't see anything obviously wrong.
A further issue is that I suspect the majority of installs will have this problem. Converting existing tables to utf8 is apparently a non-trivial task. There is a blog post at http://yihui.name/en/2009/05/convert-mysql-database-to-utf-8-in-wordpress/ (the WP Codex is a mess on this subject), but looking at that plugin's code, it doesn't seem to handle the data already in the database. The WP Codex suggests changing each column to a binary type before changing the charset, then changing it back again.
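To make the Codex suggestion concrete, here is a minimal sketch of the statements that a binary round trip would involve for a single text column. The helper name, the type mapping, and the example table and column are illustrative assumptions, not code from BuddyPress, bbPress, or the Codex. `MODIFY` takes a full column definition, so attributes such as NOT NULL would have to be restated in a real migration; they are left out here.

```python
# Hypothetical helper (not part of BuddyPress or bbPress): builds the two
# ALTER statements for a binary round trip on one text column.

# Text types and the binary types of the same size class.
BINARY_EQUIVALENT = {
    "TINYTEXT": "TINYBLOB",
    "TEXT": "BLOB",
    "MEDIUMTEXT": "MEDIUMBLOB",
    "LONGTEXT": "LONGBLOB",
}


def round_trip_statements(table, column, text_type,
                          charset="utf8", collation="utf8_general_ci"):
    """Return [to-binary, back-to-text] statements for one column.

    Assumption: the column holds UTF-8 bytes under a latin1 label, so the
    bytes must be kept untouched and only the label changed.
    """
    text_type = text_type.upper()
    binary_type = BINARY_EQUIVALENT[text_type]  # KeyError for unsupported types
    return [
        # Step 1: drop the charset label; the stored bytes are kept as-is.
        f"ALTER TABLE `{table}` MODIFY `{column}` {binary_type}",
        # Step 2: reattach a label, this time the charset the bytes are in.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} "
        f"CHARACTER SET {charset} COLLATE {collation}",
    ]


if __name__ == "__main__":
    # Table and column names are examples only.
    for statement in round_trip_statements("wp_bb_posts", "post_text", "text"):
        print(statement + ";")
```

This only prints SQL; anyone trying it should take a backup and run the output against a copy of the database first.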
Change History (6)
#2 @ 15 years ago
- Milestone changed from 1.2 to 1.2.1
Need to look at this in more detail. There is not enough time before 1.2, so punting to 1.2.1.
#3 @ 15 years ago
- Component set to Core
This is an important ticket to get fixed in 1.3 for non-English installs.
Looking into the latter issue (how to fix the DB type for existing installs):
Tables need to be changed, something like:
ALTER TABLE $table DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
Any binary columns need to change too. I think text fields are OK, but someone needs to read the MySQL documentation carefully to confirm; don't take my word for it. Something like:
ALTER TABLE $table CHANGE $field_name $field_name $field_type CHARACTER SET utf8 COLLATE utf8_bin
It is possible to store UTF8 data in a non-UTF8 database, and from looking at BuddyPress (and WPMU) I think that is the case here. If it were non-UTF8 data in a non-UTF8 database, the process for converting it to UTF8 seems to be: convert everything to a binary type, change the charset, then switch it back.
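A small sketch of what "UTF8 data in a non-UTF8 database" means at the byte level, and why the conversion route matters. It uses Python's `latin-1` codec as a stand-in for the column's charset label; MySQL's latin1 is closer to cp1252, so the exact garbage characters on a real install may differ. The function names are made up for illustration.

```python
# Illustration only: simulates charset labels with Python codecs.

def misread_as_latin1(text):
    """UTF-8 bytes read back through a latin1 label (what the forum shows)."""
    return text.encode("utf-8").decode("latin-1")


def direct_conversion(text):
    """Converting latin1 -> utf8 while trusting the wrong label.

    Each misread character is re-encoded, so the data ends up double-encoded.
    """
    return misread_as_latin1(text).encode("utf-8")


def binary_round_trip(text):
    """Keep the bytes untouched and only change the label to utf8."""
    stored_bytes = text.encode("utf-8")   # bytes as the application wrote them
    return stored_bytes.decode("utf-8")   # same bytes, correct label


if __name__ == "__main__":
    word = "café"
    print(misread_as_latin1(word))   # garbled two-character sequence for "é"
    print(direct_conversion(word))   # longer byte string than the original
    print(binary_round_trip(word))   # original text recovered
```

If the stored bytes really are UTF-8, only the label is wrong, which is why a conversion that re-encodes the characters would make things worse rather than better.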
In a nutshell, I think we can just do an ALTER TABLE $table DEFAULT CHARACTER SET and Bob's your uncle.
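A rough sketch of generating that statement for a set of forum tables. The table names are examples and the helper names are hypothetical. One caveat worth checking against the MySQL manual before relying on this: `DEFAULT CHARACTER SET` changes the table default that applies to columns added later, and existing columns keep the charset they were created with, so the per-column statements above may still be needed.

```python
# Hypothetical helpers: build table-level default-charset statements.

def table_default_statement(table, charset="utf8", collation="utf8_general_ci"):
    """Statement that changes only the table's default charset and collation."""
    return (f"ALTER TABLE `{table}` "
            f"DEFAULT CHARACTER SET {charset} COLLATE {collation}")


def statements_for(tables, charset="utf8", collation="utf8_general_ci"):
    """One statement per table, in the order given."""
    return [table_default_statement(t, charset, collation) for t in tables]


if __name__ == "__main__":
    # Example table names only; a real upgrade routine would list the
    # wp_bb_ tables that actually exist on the install.
    for statement in statements_for(["wp_bb_posts", "wp_bb_topics"]):
        print(statement + ";")
```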