BuddyPress.org


Timestamp:
06/23/2015 01:42:40 PM
Author:
boonebgorges
Message:

Improve the HTML-handling logic of bp_create_excerpt().

  • The regular expression that detects tags should be less generous, to avoid false matches.
  • Word boundary detection, as used when exact=false, should never return a space character inside of an HTML tag (such as the space in <a href="http://example.com">).

These changes fix some tag-parsing issues introduced in [9523].
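
To illustrate the first point, here is a minimal sketch that runs the tag-detection pattern from before and after this changeset against one input. The sample string and the variable names are assumptions chosen for illustration; they do not come from the ticket.

```php
<?php
// Sample input chosen for illustration; it is not taken from the ticket.
$text = '<b>bold</b> text';

// Pattern before r9962: ".+?" must consume at least one character after the
// first tag-name character, and "." also matches ">".
$old_pattern = '/(<\/?([\w!].+?)[^>]*>)?([^<>]*)/';

// Pattern after r9962: the tag name is limited to word characters, "+" and "!".
$new_pattern = '/(<\/?([\w+!]+)[^>]*>)?([^<>]*)/';

preg_match_all( $old_pattern, $text, $old, PREG_SET_ORDER );
preg_match_all( $new_pattern, $text, $new, PREG_SET_ORDER );

// First match: [1] is the whole tag, [2] the tag name, [3] the text after it.
var_dump( $old[0][1], $old[0][2], $old[0][3] ); // "<b>bold</b>", "b>", " text"
var_dump( $new[0][1], $new[0][2], $new[0][3] ); // "<b>", "b", "bold"
```

With the old pattern, a one-letter tag name forces `.+?` to swallow the closing `>`, so the "tag" runs on to the next `>` and takes the content with it. The new pattern stops at the end of the tag name.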

Fixes #6517.

File:
1 edited

  • trunk/src/bp-core/bp-core-template.php

--- trunk/src/bp-core/bp-core-template.php	(r9844)
+++ trunk/src/bp-core/bp-core-template.php	(r9962)
@@ -814,5 +814,5 @@
 
         // Find all the tags and HTML comments and put them in a stack for later use
-        preg_match_all( '/(<\/?([\w!].+?)[^>]*>)?([^<>]*)/', $text, $tags, PREG_SET_ORDER );
+        preg_match_all( '/(<\/?([\w+!]+)[^>]*>)?([^<>]*)/', $text, $tags, PREG_SET_ORDER );
 
         foreach ( $tags as $tag ) {
@@ -866,5 +866,33 @@
     // If $exact is false, we can't break on words
     if ( empty( $r['exact'] ) ) {
-        $spacepos = mb_strrpos( $truncate, ' ' );
+        // Find the position of the last space character not part of a tag.
+        preg_match_all( '/<[a-z\!\/][^>]*>/', $truncate, $truncate_tags, PREG_OFFSET_CAPTURE );
+        $rtruncate = strrev( $truncate );
+        $spacepos = false;
+        for ( $i = strlen( $rtruncate ) - 1; $i >= 0; $i-- ) {
+            if ( ' ' !== $rtruncate[ $i ] ) {
+                continue;
+            }
+
+            // Convert rpos to negative offset on forward-facing string.
+            $pos = -1 - $i;
+
+            // If there are no tags in the string, the first space found is the right one.
+            if ( empty( $truncate_tags[0] ) ) {
+                $spacepos = $pos;
+                break;
+            }
+
+            // Look at each tag to see if the space is inside of it.
+            foreach ( $truncate_tags[0] as $truncate_tag ) {
+                $start = $truncate_tag[1];
+                $end   = $start + strlen( $truncate_tag[0] );
+                if ( $pos > $start && $pos < $end ) {
+                    $spacepos = $pos;
+                    break 2;
+                }
+            }
+        }
+
         if ( false !== $spacepos ) {
             if ( $r['html'] ) {
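
The word-boundary goal described in the commit message (never return a space that sits inside an HTML tag) can be sketched in isolation as follows. This is a simplified illustration, not the loop committed above: `last_space_outside_tags()` is a hypothetical helper name, the sample string is made up, and offsets are plain byte offsets on the forward string.

```php
<?php
/**
 * Hypothetical helper: byte offset of the last space that is not inside an
 * HTML tag, or false when there is none.
 */
function last_space_outside_tags( $truncate ) {
	// Same tag pattern as the changeset; PREG_OFFSET_CAPTURE records where each tag starts.
	preg_match_all( '/<[a-z\!\/][^>]*>/', $truncate, $tags, PREG_OFFSET_CAPTURE );

	// Walk backwards so the first qualifying space found is the last one in the string.
	for ( $pos = strlen( $truncate ) - 1; $pos >= 0; $pos-- ) {
		if ( ' ' !== $truncate[ $pos ] ) {
			continue;
		}

		// Check whether this space falls between the "<" and ">" of any tag.
		$inside_tag = false;
		foreach ( $tags[0] as $tag ) {
			$start = $tag[1];
			$end   = $start + strlen( $tag[0] );
			if ( $pos > $start && $pos < $end ) {
				$inside_tag = true;
				break;
			}
		}

		if ( ! $inside_tag ) {
			return $pos;
		}
	}

	return false;
}

$truncate = 'Visit <a href="http://example.com">';
var_dump( strrpos( $truncate, ' ' ) );            // int(8): the space inside the <a> tag
var_dump( last_space_outside_tags( $truncate ) ); // int(5): the space after "Visit"
```

Cutting at offset 8 would split the `<a>` tag in two, which is the failure the commit message describes. Cutting at offset 5 leaves the tag intact.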