Opened 15 years ago
Closed 15 years ago
#1660 closed defect (bug) (fixed)
bp_create_excerpt can split html tags
| Reported by: | junsuijin | Owned by: | junsuijin |
|---|---|---|---|
| Milestone: | 1.2 | Priority: | major |
| Severity: | | Version: | |
| Component: | | Keywords: | has-patch, needs testing |
| Cc: | | | |
Description
Exploding content on a space delimiter can split HTML tags such as 'a' or 'img' and cut them off at the 55-word limit, leaving ugly half-finished tags at the end of excerpts. A preg_split should be used instead, so that <a...</a>, <img ... >, <object...</object> and other inline elements such as span are counted as single words. I have tested this code extensively for trimming unstripped blog post excerpts, but I have not tested it with BuddyPress specifically. A decision is also needed on whether to allow <object>s into excerpts at all (I haven't looked, but perhaps these tags are removed at some other point). This code does not handle iframes. If desired, I can use the 'x' flag on the regex to try to make it more readable.
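For illustration, here is a minimal sketch of the failure described above. The sample text is made up, and the 5-word limit stands in for the real 55-word limit so the effect is easy to see. It is not taken from bp_create_excerpt itself.

```php
<?php
// Hypothetical sample content; the short limit stands in for the 55-word limit.
$text  = 'Read the <a href="http://example.org/page" title="My page">full story</a> here today.';
$limit = 5;

// Naive approach: every space is a word boundary, including the spaces
// that separate attributes inside the <a ...> tag.
$words   = explode( ' ', $text );
$excerpt = implode( ' ', array_slice( $words, 0, $limit ) );

echo $excerpt, "\n";
// Prints: Read the <a href="http://example.org/page" title="My
```

The excerpt ends in the middle of the opening tag, which is the kind of half-tag this ticket is about.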
Attachments (2)
Change History (5)
#2
15 years ago
bp_create_excerpt.2.patch is updated to remove object support.
A short explanation of the regex pattern:
The pattern splits the string on <tag ...>[optional non-space chars], and also on runs of space characters, much as a plain explode would. When it splits on a <tag>, with or without a word attached after it (the <opening tag>word case above), it captures that delimiter, discards the surrounding space characters, and adds the captured text to the array as another word. I have tested this regex extensively on unstripped WordPress post excerpts, and the pattern has to be written in this order to function properly. The force_balance_tags function must also be used (it is currently added to the bp_create_excerpt filter), because the word limit might be reached after <opening tag>word another-word... but before the closing tag occurs.
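The behaviour described above can be sketched roughly as follows. The pattern and the helper name sketch_split_words are assumptions made for illustration; they are not the regex from the attached patch, and the pattern is a simplification that will misread a literal '>' inside an attribute value. The sample text and the 4-word limit are also made up.

```php
<?php
/**
 * Hypothetical helper, not the code from the attached patch.
 * Splits text into "words", keeping each <tag ...> (plus any non-space
 * characters glued to its end) together as a single word.
 */
function sketch_split_words( $text ) {
	// First alternative: a tag with optional trailing non-space chars,
	// captured so that it is returned as its own array element.
	// Second alternative: plain whitespace, which is simply discarded.
	$pattern = '/\s*(<[^>]+>\S*)\s*|\s+/';

	return preg_split(
		$pattern,
		trim( $text ),
		-1,
		PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
	);
}

$text    = 'Read the <a href="http://example.org/page" title="My page">full story</a> here today.';
$words   = sketch_split_words( $text );
$excerpt = implode( ' ', array_slice( $words, 0, 4 ) );

echo $excerpt, "\n";
// Prints: Read the <a href="http://example.org/page" title="My page">full story
```

With this sketch the opening tag survives intact, but the limit is reached before the closing </a>, so the excerpt is left with an unclosed element. That is the case in which force_balance_tags is still needed, as noted above.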
Object tags are handled before this. Could you update the patch to remove the object support? Thanks.