Opened 6 years ago
Last modified 6 years ago
#7948 new enhancement
HTML sanitization for user-generated content in notification emails
Reported by: | boonebgorges | Owned by: | DJPaul |
---|---|---|---|
Milestone: | Awaiting Contributions | Priority: | normal |
Severity: | normal | Version: | |
Component: | Emails | Keywords: | 2nd-opinion |
Cc: |
Description
[10479] migrated email notifications over to bp_send_email()
. In the old non-HTML system, we stripped all tags from user-provided content before assembling the email, because HTML wasn't supported. We maintained this convention after the change, and we run content through wp_strip_all_tags()
before passing it to bp_send_email()
.
I'm hoping that @djpaul can provide some of the original motivation for this. I'm assuming it was mostly because rendering HTML in emails is really hard, and it was one more thing to worry about; also, we were putting user-provided content into a blockquote
, which made the problem of embedded content even more potentially complex.
I've implemented bp_send_email()
integration in a few plugins, and have run into problems when I don't strip tags from usermessage
. HTML generally works fine, but there are certain things that can break. Nested blockquotes
cause problem, as do inline images that are wider than 600px (the width of the BP email template). Since bp_send_email()
doesn't do any validation or sanitization of usermessage
- it doesn't need to, for core emails - these issues are allowed to bleed through into the email content.
I can work around these issues in the plugins, but the issues are likely to be shared by any plugin using bp_send_email()
, so I wanted to gauge whether others (especially @djpaul) think it's worth exploring the introduction of a few pieces of validation into BP core itself. I was thinking of starting by a tool that parses img
tags and ensures that width
can never be more than 600
. But if others think that this is insane to try to handle in BP, we could just add something to the docs that explains that usermessage
is expected to be plaintext, and clients are responsible for their own sanitization if they choose to pass HTML.
Change History (4)
#1
@
6 years ago
- Component changed from Core to Emails
- Milestone changed from Awaiting Review to Under Consideration
- Owner set to DJPaul
#2
@
6 years ago
Thanks for the thoughts, Paul.
Regarding a limited set of HTML elements, BP's default is a combination of WP's default $allowedtags
https://core.trac.wordpress.org/browser/tags/4.9.8/src/wp-includes/kses.php#L418 plus a handful of others https://buddypress.trac.wordpress.org/browser/tags/3.2.0/src/bp-core/bp-core-functions.php#L3812 I think we probably want to be more conservative than this, disallowing the following from that combined list:
span
blockquote
cite
q
The rest are inline entities that should either be supported by, or ignored by, email clients. Does that seem like a sensible place to start?
Yes.
Image width is going to be dependant on the template. We could do support for the template that we ship with, but how then does that PHP adapt to the custom email template that a theme may add?
We could support a limited set of HTML elements via KSES, if you want to come up with a sensible list.
As you allude to, this was not done originally, for simplicity and time reasons.