Opened 15 years ago
Closed 14 years ago
#1222 closed defect (bug) (fixed)
Message excerpts don't work on inbox/sentbox under i18n environment
| Reported by: | takuya | Owned by: | |
|---|---|---|---|
| Milestone: | 1.5 | Priority: | normal |
| Severity: | | Version: | |
| Component: | Core | Keywords: | i18n, excerpt, needs-patch |
| Cc: | | | |
Description
When messages are written in Japanese (or any other language that uses special characters), the inbox and sentbox display the full content of each message instead of an excerpt. Like the other excerpt problems I reported, the excerpt function doesn't work when BuddyPress is used in UTF-8 or in certain languages, mostly Asian ones.
Here's the i18n discussion topic for reference.
http://buddypress.org/forums/topic/buddypress-i18n-topics
Change History (5)
#2
15 years ago
- Milestone changed from 1.2 to 1.2.1
I need a patch for this from someone who is familiar with the issue, otherwise it's probably not going to get fixed.
#3
14 years ago
- Component set to Core
- Priority changed from major to normal
The problem appears to be with the way that Asian languages use spaces. bp_create_excerpt() creates an excerpt and attempts not to break any words in half, and it does this by assuming that words correspond to the characters that appear between spaces or punctuation. When a language has sentences that are 40 or 50 characters long, bp_create_excerpt() interprets it as a single word. Thus, when bp_create_excerpt() attempts to truncate the text to 10 words (as in the case of message excerpts), it ends up making an excerpt of 10 whole sentences, which in many cases is the entire message.
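To make the failure mode concrete, here is a minimal sketch in Python rather than BuddyPress's PHP. `word_excerpt` is a hypothetical stand-in, not the real bp_create_excerpt(), and it splits on whitespace only (the real function reportedly also considers punctuation), so treat it as an illustration of the idea and nothing more.

```python
def word_excerpt(text: str, max_words: int = 10) -> str:
    """Hypothetical word-based excerpt: keep at most max_words tokens.

    Tokens are whatever sits between runs of whitespace, which is the
    assumption that breaks down for text written without spaces.
    """
    words = text.split()
    if len(words) <= max_words:
        # Fewer "words" than the limit, so nothing is truncated.
        return text
    return " ".join(words[:max_words]) + "..."


english = "one two three four five six seven eight nine ten eleven twelve"
# Two Japanese sentences with no whitespace anywhere.
japanese = "これは日本語のメッセージです。単語の間に空白がありません。"

print(word_excerpt(english))   # truncated after the tenth word
print(word_excerpt(japanese))  # comes back whole: it counts as one "word"
```

The Japanese message contains no whitespace, so the splitter sees a single token and the word limit is never reached.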
One solution is for the value of $excerpt_length to be interpreted in characters rather than in words. We'd still probably use a space as a delimiter to ensure that words (in alphabetic languages) don't get split in half, but instead of always having an excerpt of 10 words, it'd be an excerpt of (say) 75 letters, rounded up to the nearest space. This might break some plugins currently using bp_create_excerpt() with a specific value for $excerpt_length.
Before writing a patch, I would like to get feedback from someone with more knowledge of i18n best practices.
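A rough sketch of the character-based idea, again in Python rather than PHP. `char_excerpt` and its `slack` parameter are hypothetical: the proposal only says "rounded up to the nearest space", and the bounded look-ahead is an assumption added here so that text without spaces still gets cut at the limit.

```python
def char_excerpt(text: str, max_chars: int = 75, slack: int = 20) -> str:
    """Hypothetical character-based excerpt.

    Cuts at the first space at or after max_chars, so words in
    space-delimited languages are not split. The look-ahead is capped
    at `slack` characters (an assumption, not part of the proposal)
    so that unspaced text falls back to a hard cut at max_chars.
    """
    if len(text) <= max_chars:
        return text
    window_end = max_chars + slack
    cut = text.find(" ", max_chars, window_end)
    if cut == -1:
        if len(text) <= window_end:
            # The text ends inside the look-ahead window: keep it whole.
            return text
        # No space nearby (e.g. Japanese): cut exactly at the limit.
        cut = max_chars
    return text[:cut] + "..."


english = " ".join(["word"] * 30)  # 149 characters, spaces every 5th
japanese = "あ" * 200              # 200 characters, no spaces at all

print(char_excerpt(english))
print(char_excerpt(japanese))
```

Python's `len` and slicing count characters, not bytes. A PHP version would presumably need multibyte-aware string functions to get the same behavior on UTF-8 text.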
Bumping the ticket priority down because this problem can be worked around by filtering 'bp_create_excerpt' or 'bp_get_message_thread_excerpt'.
#4
14 years ago
Got some helpful feedback from http://buddypress.org/community/groups/localization/forum/topic/word-count-vs-character-count-for-automated-excerpts-feedback-wanted/#post-83508
In my fix, I also modify all the places throughout BP where bp_create_excerpt() is called with an explicit $excerpt_length, as well as the default value of $excerpt_length. I used a multiplier of 4.5 to convert word lengths to character lengths, as a bit of internet research (along with http://buddypress.org/community/groups/localization/forum/topic/word-count-vs-character-count-for-automated-excerpts-feedback-wanted/#post-83458) made it sound like it'd be a good cross-language compromise. It seems to make excerpts look good everywhere I checked.
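For illustration, the word-to-character conversion amounts to something like the following Python sketch. `words_to_chars` is a hypothetical helper, and rounding up to a whole character is an assumption; the comment above doesn't say how fractional results are handled in the patch.

```python
import math

# Multiplier quoted in the comment above.
WORD_TO_CHAR_MULTIPLIER = 4.5


def words_to_chars(word_count: int) -> int:
    """Convert a word-based excerpt length to a character-based one.

    Rounds up to a whole character (an assumption, not taken from the patch).
    """
    return math.ceil(word_count * WORD_TO_CHAR_MULTIPLIER)


# The 10-word message excerpt becomes a 45-character excerpt.
print(words_to_chars(10))
```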
Very similar to http://trac.buddypress.org/ticket/654.