Skip to:
Content

BuddyPress.org

Opened 14 months ago

Last modified 11 months ago

#7948 new enhancement

HTML sanitization for user-generated content in notification emails

Reported by: boonebgorges Owned by: DJPaul
Milestone: Awaiting Contributions Priority: normal
Severity: normal Version:
Component: Emails Keywords: 2nd-opinion
Cc:

Description

[10479] migrated email notifications over to bp_send_email(). In the old non-HTML system, we stripped all tags from user-provided content before assembling the email, because HTML wasn't supported. We maintained this convention after the change, and we run content through wp_strip_all_tags() before passing it to bp_send_email().

I'm hoping that @djpaul can provide some of the original motivation for this. I'm assuming it was mostly because rendering HTML in emails is really hard, and it was one more thing to worry about; also, we were putting user-provided content into a blockquote, which made the problem of embedded content even more potentially complex.

I've implemented bp_send_email() integration in a few plugins, and have run into problems when I don't strip tags from usermessage. HTML generally works fine, but there are certain things that can break. Nested blockquotes cause problem, as do inline images that are wider than 600px (the width of the BP email template). Since bp_send_email() doesn't do any validation or sanitization of usermessage - it doesn't need to, for core emails - these issues are allowed to bleed through into the email content.

I can work around these issues in the plugins, but the issues are likely to be shared by any plugin using bp_send_email(), so I wanted to gauge whether others (especially @djpaul) think it's worth exploring the introduction of a few pieces of validation into BP core itself. I was thinking of starting by a tool that parses img tags and ensures that width can never be more than 600. But if others think that this is insane to try to handle in BP, we could just add something to the docs that explains that usermessage is expected to be plaintext, and clients are responsible for their own sanitization if they choose to pass HTML.

Change History (4)

#1 @DJPaul
14 months ago

  • Component changed from Core to Emails
  • Milestone changed from Awaiting Review to Under Consideration
  • Owner set to DJPaul

rendering HTML in emails is really hard
user-provided content

Yes.

it's worth exploring the introduction of a few pieces of validation into BP core itself

Image width is going to be dependant on the template. We could do support for the template that we ship with, but how then does that PHP adapt to the custom email template that a theme may add?

We could support a limited set of HTML elements via KSES, if you want to come up with a sensible list.
As you allude to, this was not done originally, for simplicity and time reasons.

#2 @boonebgorges
12 months ago

Thanks for the thoughts, Paul.

Regarding a limited set of HTML elements, BP's default is a combination of WP's default $allowedtags https://core.trac.wordpress.org/browser/tags/4.9.8/src/wp-includes/kses.php#L418 plus a handful of others https://buddypress.trac.wordpress.org/browser/tags/3.2.0/src/bp-core/bp-core-functions.php#L3812 I think we probably want to be more conservative than this, disallowing the following from that combined list:

  • span
  • blockquote
  • cite
  • q

The rest are inline entities that should either be supported by, or ignored by, email clients. Does that seem like a sensible place to start?

#3 @DJPaul
12 months ago

A list of HTML elements supported by a range of modern email clients is pretty hard to come across.
I don't disagree with the list(s) you've proposed but testing them cross-client is going to be a huge timesink, so most of these are going to be shipped untested. Which I'm OK with.

#4 @DJPaul
11 months ago

  • Milestone changed from Under Consideration to Awaiting Contributions
Note: See TracTickets for help on using tickets.