Opened 13 years ago
Closed 13 years ago
#3452 closed defect (bug) (fixed)
Long Inserted Image in Post doesn't Render in Activity Stream
Reported by: | nahummadrid | Owned by: | |
---|---|---|---|
Milestone: | 1.5 | Priority: | normal |
Severity: | normal | Version: | 1.5 |
Component: | Activity | Keywords: | has-patch |
Cc: |
Description
Let's say I insert an image in the body of a post and the html is unusually long because the title, alt, or just the image file name, the posted image thumbnail doesn't render correctly in the activity stream item for that post. The html gets cut off because of the excerpt length and the only thing that shows in the activity is the cutoff html <img src="file...[...]
Attachments (4)
Change History (21)
#1
@
13 years ago
- Component changed from Core to Activity
- Milestone changed from Awaiting Review to 1.5
#2
@
13 years ago
So.. this isn't a regression, or a bug introduced in 1.5? Maybe punt to 1.5.1 unless someone comes up with a patch?
#3
@
13 years ago
It's introduced in 1.5 with the "Read More" functionality. I don't think we excerpted comments before 1.5.
#4
@
13 years ago
CakePHP uses this to truncate while preserving HTML:
http://api.cakephp.org/view_source/text-helper/#line-165
License is MIT.
#5
@
13 years ago
- Keywords has-patch added
That function from CakePHP is pretty awesome. I was able to drop it in more or less as-is, and it seems to be working great. I modified it a bit to match BP coding standards, to add filters, and to do some backward compatibility.
As for function arguments. In 1.2.9, the parameters looked like this:
function bp_create_excerpt( $text, $length = 123, $filter_shortcodes = true )
The previous BP 1.5 version added a fourth parameter, $append_text, which defaulted to ' […]'. The Cake function worked a bit differently:
function truncate( $text, $length = 123, $options = array() )
where $options is an array (which includes 'ending', corresponding to our $append_text). I decided to switch over to the Cake method, to minimize refactoring. I've maintained backward compatibility with BP 1.2.9's function arguments (see the first line, where I do is_bool() on $options). The correct arguments from here on out are spelled out in the PHPDoc for the function.
Please give it a go with some different HTML-strewn content and see if you get what you'd expect.
#6
@
13 years ago
Some thoughts:
- we should i18n "[…]"
- the "class_exists( 'Multibyte' );" check won't do anything since BP/WP doesn't have that class (guess it's from cakephp).
- it uses mb_substr, mb_strlen, mb_strrpos. WP core has fallback version of mb_substr, but not the others.
#7
@
13 years ago
Good call on the first two points.
Re: mb_ string functions. Why do we need fallbacks? The official PHP.net docs say that all three functions have been around since PHP 4.0.6. I'm guessing WP's fallback in compat.php is left over from when PHP < 4.0 was supported. (Or am I missing something, and the multibyte string functions need an extra PHP dependency?)
#8
@
13 years ago
Issue is that mbstring is a non-default extension. You have to set it when PHP is compiled (or find an appropriate binary). Looks like PHP 5.4 is the first release which will have it enabled by default (http://www.php.net/releases/NEWS_5_4_0_alpha1.txt, under "General Improvements")
#9
@
13 years ago
OK, good call. I hadn't seen that.
In 3452.02.patch I implement your first two suggestions.
As for the multibyte issue. In most places, I think that the non-multibyte equivalent will be fine.
- Using strlen() instead of mb_strlen() means that sometimes the excerpt will end up being shorter than intended (because multibyte characters will have each of their bytes counted), but this is not the end of the world. (In languages where it makes a huge difference, because most characters are multibyte, a filter can be placed on 'bp_excerpt_length' to make the number larger by default. Imprecise, but better than nothing.)
- Using strrpos() instead of mb_strrpos() - doesn't really make a difference, as long as we're consistent. mb_strrpos() is used in Cake to get the final position of a whitespace character, so that words aren't broken. strrpos( $truncate, ' ' ); and mb_strrpos( $truncate, ' ' ); will give different values, but as long as we're consistent in how they're used (and as long as we use either substr or mb_substr, as appropriate, when trimming) I don't think there will be an issue.
The remaining problem with multibyte, aside from excerpt length, is that MB characters can sometimes get cut in the middle. When !$exact (ie when we only break on word breaks) it won't be an issue, since we'll be chopping everything after the last space anyway. For other cases, I've stolen some code from wp_html_excerpt() to remove partial entities at the end of the string.
In 3452.02.patch, I have simply taken out the mb_ functions. What do you think instead of providing our own versions of these functions (maybe in bp-core-wpabstraction.php or something)? They would simply return the non-multibyte return value if the MB function doesn't exist. That way, the mb_ functions could be used if they exist, which it seems like will be more likely to be true in parts of the world where it's more important.
#10
@
13 years ago
In languages where it makes a huge difference, because most characters are multibyte, a filter can be placed on 'bp_excerpt_length' to make the number larger by default.
I think it's a very bad idea to expect users to create a filter just to get their language parsed right. The average BP user shouldn't know or care about any of the filters.
What do you think instead of providing our own versions of these functions (maybe in bp-core-wpabstraction.php or something)? They would simply return the non-multibyte return value if the MB function doesn't exist.
+1.
#11
@
13 years ago
3473.03.patch restores the multibyte functions, and adds fallback versions (taken from MediaWiki) of the necessary functions, in wp-core-abstraction.php.
#12
@
13 years ago
DJPaul points out that I have to remove the function prefixes. See 3473.04.patch.
#13
@
13 years ago
I just tested with a blog post with pictures and a lot of other body text. I get the picture and the generated excerpt in the activity stream fine, but when I click the "read more" button, it doesn't return any extra text. e.g. the javascript seems to work, but the rest of the post isn't returned.
#14
@
13 years ago
It's fetching the *activity item*, not the blog post. The activity item is itself an excerpt of the blog post. (Maybe this suggests that we shouldn't be doing Read More excerpting for activity items of type new_blog_post)
#15
@
13 years ago
Let's update the patch to not add the "read more" link for new blog posts and blog comments, and probably forum updates, too. I think those are the only types where we link out to the original content? Other than this, the patch seemed to work fine for me.
#16
@
13 years ago
Instead of making an explicit exception for certain activity types, I went with a slightly different method. The check
if ( strlen( $excerpt ) > $excerpt_length ) )
wasn't working right anymore, because the excerpt could contain additional characters due to HTML tags that would make this conditional true, even if the rendered result (when Read More was clicked) would show content which is less than $excerpt_length. So instead, I run *every* item through bp_create_excerpt(). If it's too short, then bp_create_excerpt() just returns the original string; otherwise it truncates. Then, before appending the 'Read More' link, I check to see whether the return value of bp_create_excerpt() is equivalent to the original string - a true test of whether it's been truncated. It seems much more elegant and efficient now.
#17
@
13 years ago
- Resolution set to fixed
- Status changed from new to closed
(In [4989]) Modifies bp_create_excerpt() to take account of HTML tag length when truncating long texts. Thanks to CakePHP for the function. Refactors BP use of bp_create_excerpt(), where necessary, to account for new function argument structure. Reworks bp_activity_truncate_entry() so that already-truncated activity items (like blog posts) do not get a Read More link when they contain HTML. Introduces fallback functions for some multibyte string functions, to make bp_create_excerpt() work properly. Thanks to MediaWiki for these fallbacks. Fixes #3452. Props DJPaul for all the help with this patch.
I noticed this in one of my plugins as well. We should be able to fix it by doing some selective HTML removal before creating the excerpt.
There should be two steps: