Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#1445 closed defect (bug) (fixed)

Forums install as latin1 not utf8

Reported by: DJPaul
Owned by:
Milestone: 1.5
Priority: critical
Severity:
Version:
Component: Core
Keywords: confirmed
Cc: kayue

Description

Thread: http://buddypress.org/forums/topic/forums-section-displaying-like
Tested on BP 1.1.3 and WPMU 2.8.6.

The wp_bp_ DB tables are utf8 but the wp_bb_ tables are latin1. This causes problems with non-English character sets on the forums. See the thread linked above for screenshots. Looking at the BP/BB code for the one-click forum install (tested only against a "new" forum install, not an upgrade), it looks like the default ought to be UTF8.

In my bb-config.php in the root of the site, the charset IS set to UTF8, but the DB tables were still installed as latin1. I'm not sure if this is a BP or BB issue; the forums component is probably the weakest area of my BuddyPress knowledge. I've had a look at the code and I can't see anything obviously wrong.

A further issue is that I suspect the majority of installs will have this problem. Converting DB tables to utf8 is apparently a non-trivial task. There is a blog post at http://yihui.name/en/2009/05/convert-mysql-database-to-utf-8-in-wordpress/ (the WP Codex is a mess on this subject), but looking at that plugin's code, it doesn't seem to handle the existing database data. The WP Codex suggests changing each column to a binary type before changing the charset, then swapping it back again.
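The binary round trip described in the Codex can be sketched as a pair of statements per column. This is a minimal Python sketch that only builds the SQL strings. `binary_roundtrip_sql` is a hypothetical helper (not part of WordPress or bbPress), and the table and column names are illustrative assumptions.

```python
# Hypothetical helper: builds the two ALTER statements for the
# "convert to binary, then back with the new charset" approach.
# Table, column and type names below are illustrative assumptions.

# Text column types and their binary counterparts in MySQL.
BINARY_EQUIVALENT = {
    "TINYTEXT": "TINYBLOB",
    "TEXT": "BLOB",
    "MEDIUMTEXT": "MEDIUMBLOB",
    "LONGTEXT": "LONGBLOB",
}


def binary_roundtrip_sql(table, column, text_type, charset="utf8"):
    """Return the statements that relabel one text column's charset."""
    text_type = text_type.upper()
    binary_type = BINARY_EQUIVALENT[text_type]
    return [
        # Step 1: drop the charset label by switching to a binary type.
        f"ALTER TABLE `{table}` MODIFY `{column}` {binary_type}",
        # Step 2: switch back to the text type with the new charset.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} CHARACTER SET {charset}",
    ]


for stmt in binary_roundtrip_sql("wp_bb_posts", "post_text", "text"):
    print(stmt)
```

Note that MODIFY replaces the whole column definition, so attributes such as NOT NULL would have to be restated in a real migration. Back up the database before trying anything like this.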

Change History (6)

comment:1 DJPaul 4 years ago

Looking into the latter issue (how to fix the DB type for existing installs):

Tables need to be changed, something like:
ALTER TABLE $table DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci

I think text fields are OK, but someone needs to read the MySQL documentation carefully to confirm; don't take my word for it. Any binary columns need to change, something like:
ALTER TABLE $table CHANGE $field_name $field_name $field_type CHARACTER SET utf8 COLLATE utf8_bin

It is possible to store UTF8 data in a non-UTF8 database. Looking at BuddyPress (and WPMU), I think this will be the case. If it were non-UTF8 data in a non-UTF8 database, the process to convert it to UTF8 seems to be to convert everything to a binary type, change the charset, then switch it back.
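To illustrate why the charset label and the stored bytes are separate concerns, here is a small Python sketch. It does not talk to MySQL; it only mimics the byte handling for the first case above, UTF8 bytes sitting in a column labelled latin1. The sample string is an assumption chosen for illustration.

```python
# Models UTF-8 bytes stored in a column that is labelled latin1.
original = "café"
stored_bytes = original.encode("utf-8")  # what the application wrote

# Transcoding: interpret the bytes as latin1 text, then re-encode as
# UTF-8. Each non-ASCII character ends up double-encoded.
transcoded = stored_bytes.decode("latin-1").encode("utf-8")

# Relabelling: leave the bytes untouched (the "binary" step) and only
# change how they are interpreted.
relabelled = stored_bytes.decode("utf-8")

print(len(stored_bytes), len(transcoded))  # byte counts before and after
print(transcoded.decode("utf-8"))          # garbled text
print(relabelled)                          # original text
```

The transcoded value grows from 5 to 7 bytes and no longer reads as the original, while the relabelled value is unchanged. That is the reason to check what is actually stored before choosing a conversion method.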

In a nutshell, I think we can just do an ALTER TABLE $table DEFAULT CHARACTER SET and Bob's your uncle.
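A minimal sketch of generating that statement for several tables, in Python. `default_charset_sql` is a hypothetical helper and the table names are illustrative assumptions; the real list would come from the install's wp_bb_ tables.

```python
# Hypothetical helper: builds one ALTER TABLE ... DEFAULT CHARACTER SET
# statement per table. Table names below are illustrative assumptions.

def default_charset_sql(tables, charset="utf8", collation="utf8_general_ci"):
    """Return a statement per table that changes its default charset."""
    return [
        f"ALTER TABLE `{table}` DEFAULT CHARACTER SET {charset} COLLATE {collation}"
        for table in tables
    ]


for stmt in default_charset_sql(["wp_bb_forums", "wp_bb_topics", "wp_bb_posts"]):
    print(stmt)
```

One caveat worth checking against the MySQL documentation: DEFAULT CHARACTER SET changes the table default used for columns added later, while existing columns keep their own charset unless they are altered as well (for example with CONVERT TO CHARACTER SET or a per-column change).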

comment:2 apeatling 4 years ago

  • Milestone changed from 1.2 to 1.2.1

Need to look at this in more detail. There is not enough time now before 1.2, so punting to 1.2.1.

comment:3 DJPaul 4 years ago

  • Component set to Core

This is an important ticket to get fixed in 1.3 for non-English installs.

comment:4 kayue 3 years ago

  • Cc kayue added
  • Priority changed from major to critical

This needs to be fixed soon.

comment:5 djpaul 3 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [3349]) Fixes bbPress DB installation to use BBDB_CHARSET. Fixes #1445. (trunk)

comment:6 djpaul 3 years ago

(In [3350]) Fixes bbPress DB installation to use BBDB_CHARSET. Fixes #1445. (branch)
