Notes on contentEditable and HTML injection

In the bad old days, all user-supplied text in a web page was entered using one of two form elements: input for short texts such as names, and textarea for texts that may span multiple lines, such as comments or user feedback. These elements, while extremely useful and serviceable, didn’t always fit in with the webpages they occupied, and had no capacity for WYSIWYG editing of formatted text.

But now, using the contentEditable attribute (standardised in HTML5, though it first appeared in Internet Explorer 5.5), almost any element on the page can be used for user-supplied text. This is used to great effect in Matthew Butterick’s font demos and the Medium post editor.

The real power of contentEditable is the way it seamlessly blends editable content into a webpage. As an example, click anywhere in this paragraph and put some words in my mouth. Neat, eh?

But there’s more! Once you’ve had your fill of editing my words, try selecting and dragging some of this styled text into the editable paragraph above. See what happens? The formatting is preserved! And that’s how it’s possible to use contentEditable to create rich text editors in the web browser. Because contentEditable is just an attribute you can attach to any element, rather than a special text-input box, it can contain any HTML the browser will render.

I emphasise the word any because it includes <script> tags, which should set alarm bells ringing for anyone who knows about Cross-Site Scripting (XSS for short). What’s interesting here is that, unlike your standard input and textarea boxes, a contentEditable element will automatically HTML-entity-encode dangerous characters like < and > when you type them.¹

An initial, naive XSS test of just typing <script>alert(1)</script> into a contentEditable and submitting it to the server for inclusion in the page on a refresh will fail. If HTML entities are being encoded on the server side (as they should be), you’ll end up with the ugliness of double-encoded characters, i.e.

&amp;lt;script&amp;gt;alert(1)&amp;lt;/script&amp;gt;
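The double encoding is easy to reproduce outside the browser. The sketch below uses Python's standard html.escape as a stand-in for both encoding steps; it assumes the page submits the element's serialised HTML and that the server applies ordinary entity encoding, neither of which is spelled out above.

```python
from html import escape, unescape

# What the user types into the contentEditable element.
typed = "<script>alert(1)</script>"

# First pass: stand-in for the browser, which stores typed text as
# entity-encoded markup (assumption: the page submits that markup).
from_browser = escape(typed)

# Second pass: stand-in for a server that entity-encodes everything
# it receives before writing it into the page.
stored = escape(from_browser)

print(from_browser)
print(stored)

# Decoding once, as the browser does when rendering, only undoes the
# server's pass, so the visitor sees literal "&lt;" sequences.
rendered_text = unescape(stored)
```

The second print gives the string shown above: every & produced by the first pass is itself encoded by the second.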

This might tempt you to remove your server-side entity encoding. Don’t: try the drag test without it and you’ll get some nice HTML injection, because dragged content arrives as real markup rather than encoded text. So keep your server-side validation.

There’s no way to drag executable JavaScript into your contentEditable, and there doesn’t need to be (though it would be kinda cool). As I explained in my first post on CSRF, a web application’s frontend is not really the most straightforward way to interact with it, and if the application is not securely developed, the frontend provides only a fraction of the application’s real (largely unintended) functionality. It’s all about the text that’s sent to the server.

So instead of waiting for <script> tags to become draggable, all you need to do is pop open curl (if you’re a masochist) or an intercepting proxy like Burp Suite and use it to change the harmless <span>s and <div>s in your request into executable JavaScript.
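To make that concrete, here is a rough Python equivalent of the curl approach. The endpoint URL and the form field name are made up for illustration, and the requests are only built, never sent. The point is simply that the server sees a string in a POST body and has no idea whether a contentEditable element produced it.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Hypothetical endpoint and field name, for illustration only.
ENDPOINT = "https://example.test/comments"
FIELD = "body"


def build_request(markup: str) -> Request:
    """Build (but do not send) a form POST carrying arbitrary markup."""
    data = urlencode({FIELD: markup}).encode("ascii")
    return Request(
        ENDPOINT,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# What the editor might send after a drag: harmless formatting.
from_editor = build_request("<span>styled text</span>")

# What an attacker sends instead: same endpoint, same field, new markup.
tampered = build_request("<script>alert(1)</script>")

# The server decodes both bodies the same way and cannot tell them apart.
received = parse_qs(tampered.data.decode("ascii"))[FIELD][0]
print(received)
```

Passing either request to urllib.request.urlopen would send it; an intercepting proxy does the same job interactively.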

contentEditable’s behaviour is fairly reasonable and logical, but it came as a bit of a surprise when I first used it, especially the dragging. Amusingly, if you had no server-side input validation, replacing all of your input and textarea boxes with contentEditable divs would make XSS slightly more difficult, as you would need more than just your browser to exploit it. But that’s not a security feature or anything you should rely on.

So validate your inputs, and treat contentEditable with care. You may need to let some HTML through in order for your rich text editor to work, but always whitelist: name the tags you accept and reject everything else, rather than trying to list what is dangerous.
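As a rough idea of what whitelisting can look like on the server, here is a minimal sketch built on Python's standard html.parser. The tag list is an arbitrary assumption, all attributes are thrown away, and the class and function names are invented for this example. For anything real, a maintained sanitising library is a safer bet than a hand-rolled one.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: adjust to what your editor actually needs.
ALLOWED_TAGS = {"b", "i", "em", "strong", "u", "p", "br", "div", "span"}
VOID_TAGS = {"br"}  # allowed tags that take no closing tag
DROP_CONTENT = {"script", "style"}  # discard these tags and their contents


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            # Attributes are never copied, so onclick, style, href etc. vanish.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and tag not in VOID_TAGS and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-encoded, so stray < and > can never become tags.
        if not self.skip_depth:
            self.out.append(escape(data))


def sanitize(fragment: str) -> str:
    """Return fragment with only whitelisted, attribute-free tags kept."""
    parser = WhitelistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


dirty = '<b onclick="evil()">hi</b><script>alert(1)</script><img src=x onerror=alert(1)>'
print(sanitize(dirty))
```

Tags outside the whitelist are dropped but their text is kept (escaped), while script and style lose their contents too. An unclosed script tag swallows everything after it, which errs on the side of dropping input.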

¹ Which actually makes total sense, because how else would it display them? Having pairs of <>s straight-up disappearing in your site’s textbox would be a bit of an antifeature.