BuddyPress.org

Opened 15 years ago

Closed 14 years ago

#1222 closed defect (bug) (fixed)

Message excerpts don't work on inbox/sentbox under i18n environment

Reported by: takuya Owned by:
Milestone: 1.5 Priority: normal
Severity: Version:
Component: Core Keywords: i18n, excerpt, needs-patch
Cc:

Description

When messages are written in Japanese (or any other language that uses special characters), the inbox and sentbox display the full content of each message instead of an excerpt. Like the other excerpt problems I reported, the excerpt function doesn't work when BuddyPress is used with UTF-8 content in certain languages, mostly Asian ones.

Here's the i18n discussion topic for reference:
http://buddypress.org/forums/topic/buddypress-i18n-topics

Change History (5)

#1 @DJPaul
15 years ago

  • Keywords needs-patch added

#2 @apeatling
15 years ago

  • Milestone changed from 1.2 to 1.2.1

I need a patch for this from someone who is familiar with the issue, otherwise it's probably not going to get fixed.

#3 @boonebgorges
14 years ago

  • Component set to Core
  • Priority changed from major to normal

The problem appears to be with the way that Asian languages use spaces. bp_create_excerpt() creates an excerpt and attempts not to break any words in half, and it does this by assuming that words correspond to the characters that appear between spaces or punctuation. When a language writes a sentence of 40 or 50 characters without any spaces, bp_create_excerpt() interprets the whole sentence as a single word. Thus, when bp_create_excerpt() attempts to truncate the text to 10 words (as in the case of message excerpts), it ends up making an excerpt of 10 whole sentences, which in many cases is the entire message.
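The failure mode can be reproduced with a minimal sketch. This is Python for illustration only, not the BuddyPress PHP source; `word_excerpt` is a hypothetical stand-in that splits on whitespace alone (the real function is described above as also considering punctuation), and the sample strings are invented.

```python
def word_excerpt(text: str, max_words: int = 10) -> str:
    """Keep at most max_words whitespace-delimited tokens (hypothetical helper)."""
    words = text.split()
    if len(words) <= max_words:
        # Nothing to trim: the text already fits within the word budget.
        return text
    return " ".join(words[:max_words]) + "..."


# Invented sample messages, used only to show the difference in behavior.
english = "The quick brown fox jumps over the lazy dog and then keeps running far away"
japanese = "これは日本語で書かれたメッセージです。単語の間に空白はありません。"

print(word_excerpt(english))   # trimmed to the first 10 words
print(word_excerpt(japanese))  # returned in full: the text contains no spaces
```

Because the Japanese sample contains no whitespace, it counts as one "word" and is never shortened.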

One solution is for the value of $excerpt_length to be interpreted in characters rather than in words. We'd still probably use a space as a delimiter to ensure that words (in alphabetic languages) don't get split in half, but instead of always having an excerpt of 10 words, it'd be an excerpt of (say) 75 letters, rounded up to the nearest space. This might break some plugins currently using bp_create_excerpt() with a specific value for $excerpt_length.
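A rough sketch of the character-based approach, again in Python for illustration rather than as the eventual patch. `char_excerpt`, its parameter names, and the fallback used when no later space exists are all assumptions; only the 75-character figure comes from the comment above.

```python
def char_excerpt(text: str, max_chars: int = 75, ending: str = "...") -> str:
    """Trim text to roughly max_chars characters (hypothetical helper)."""
    if len(text) <= max_chars:
        return text
    # Round up to the next space so an alphabetic word is not split in half.
    cut = text.find(" ", max_chars)
    if cut == -1:
        # Assumption: with no later space (e.g. unspaced Japanese text),
        # fall back to a hard cut at the limit. This can also split the
        # final word of a spaced text when that word runs to the end.
        cut = max_chars
    return text[:cut].rstrip() + ending


print(char_excerpt("The quick brown fox jumps over the lazy dog", max_chars=10))
print(char_excerpt("あ" * 100))
```

Python's `len()` and slicing count Unicode code points, so the limit applies per character rather than per byte; an implementation in another language would need multibyte-aware string functions to get the same behavior.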

Before writing a patch, I would like to get feedback from someone with more knowledge of i18n best practices.

Bumping the ticket priority down because this problem can be worked around by filtering 'bp_create_excerpt' or 'bp_get_message_thread_excerpt'.

#4 @boonebgorges
14 years ago

Got some helpful feedback from http://buddypress.org/community/groups/localization/forum/topic/word-count-vs-character-count-for-automated-excerpts-feedback-wanted/#post-83508

In my fix, I also modify all the places throughout BP where bp_create_excerpt() is called with an explicit $excerpt_length, as well as the default value of $excerpt_length. I used a multiplier of 4.5 to convert word lengths to character lengths, as a bit of internet research (along with http://buddypress.org/community/groups/localization/forum/topic/word-count-vs-character-count-for-automated-excerpts-feedback-wanted/#post-83458) made it sound like it'd be a good cross-language compromise. It seems to make excerpts look good everywhere I checked.
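As a worked example of the conversion, the 10-word message excerpt mentioned earlier becomes 45 characters under the 4.5 multiplier. The helper name and the choice to round up are assumptions made for this sketch; the ticket does not say how fractional results were handled.

```python
import math

CHARS_PER_WORD = 4.5  # multiplier quoted in the comment above


def words_to_chars(word_count: int) -> int:
    """Convert a word-based excerpt length to characters (hypothetical helper)."""
    # Assumption: round fractional results up so the excerpt never gets shorter.
    return math.ceil(word_count * CHARS_PER_WORD)


print(words_to_chars(10))  # 10 words * 4.5 = 45 characters
```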

#5 @boonebgorges
14 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [3610]) Changes bp_create_excerpt() so that excerpts are generated using character counts rather than word counts. Modifies excerpt_length throughout BP to account for the change. Fixes #1222
