Possible XSS attack using data: protocol in rich-text fields 

Issue date: 23-03-2018
Affects versions: 12.1, 12.0, 11.2, 10.2

Issue ID: SECURITY-40

Affected Product Version(s)
This vulnerability applies to CMS 10.2.8, CMS 11.2.4, CMS 12.0.3, CMS 12.1.0, and earlier versions.

Severity 
low
 

Description

In rich-text fields, the data: protocol can be exploited, likely in combination with base64 encoding, to inject JavaScript-based XSS attacks through a link (<a>) href attribute or an <object> data attribute.

This vulnerability is only exploitable by an authenticated CMS user who can enter and modify rich-text fields in documents. As such, the severity of this vulnerability is low.
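
To illustrate why base64 encoding makes such content hard to spot by eye, here is a minimal sketch in plain Java (standard library only). The class and method names are hypothetical and not part of the CMS. It wraps a well-known harmless test string in a base64 data: URI and decodes it back to the markup it carries.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Illustration only: hypothetical helper, not part of the CMS or of the fix.
final class DataUriDecodeDemo {

    /**
     * Decodes the payload of a "data:<mediatype>;base64,<payload>" URI.
     * Returns null when the value is not a base64-encoded data: URI.
     */
    static String decodeBase64DataUri(String value) {
        String trimmed = value.trim();
        // The scheme is matched case-insensitively.
        if (!trimmed.regionMatches(true, 0, "data:", 0, 5)) {
            return null;
        }
        int comma = trimmed.indexOf(',');
        if (comma < 0) {
            return null;
        }
        String header = trimmed.substring(5, comma).toLowerCase(Locale.ROOT);
        if (!header.endsWith(";base64")) {
            return null;
        }
        byte[] decoded = Base64.getDecoder().decode(trimmed.substring(comma + 1));
        return new String(decoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Harmless sample markup; the encoded form reveals nothing about its content.
        String markup = "<script>alert(1)</script>";
        String uri = "data:text/html;base64,"
                + Base64.getEncoder().encodeToString(markup.getBytes(StandardCharsets.UTF_8));
        System.out.println(uri);                      // the opaque attribute value
        System.out.println(decodeBase64DataUri(uri)); // the markup hidden inside it
    }
}
```

A helper like this can also be used to inspect attribute values reported by the check script further down, provided the reported value was not truncated.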

Instructions

For all currently supported CMS versions, this vulnerability has been fixed through code changes only. The fix only requires updating to the latest maintenance release: CMS 10.2.9, CMS 11.2.5, CMS 12.0.4 or CMS 12.1.1.

The applied fix, HHP-24 (and backports thereof), prevents usage of the data: protocol for these two specific use cases by actively removing it.

Note that this fix is only active for server-side HTML cleaning of (by default) rich-text HTML fields and, for CMS 12 and higher, only when the omitJavascriptProtocol htmlprocessor setting is not disabled (default: enabled; before CMS 12: always enabled).
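
The rule described above can be sketched roughly as follows. This is not the actual HHP-24 implementation; it is a hypothetical illustration in plain Java with made-up class and method names. The two tag/attribute pairs are the ones checked by the report script in this advisory, and the trimming and case-insensitive matching of the protocol prefix are assumptions.

```java
import java.util.Locale;

// Hypothetical sketch of the rule, NOT the actual HHP-24 code.
final class DataProtocolRuleSketch {

    /** Assumption: leading whitespace and letter case are ignored when matching the prefix. */
    static boolean isDataProtocol(String value) {
        return value != null
                && value.trim().toLowerCase(Locale.ROOT).startsWith("data:");
    }

    /**
     * Decides whether an attribute would be removed during server-side cleaning.
     * The boolean parameter mirrors the omitJavascriptProtocol setting:
     * when it is disabled, nothing is removed.
     */
    static boolean shouldRemove(String tag, String attr, String value,
                                boolean omitJavascriptProtocol) {
        if (!omitJavascriptProtocol) {
            return false;
        }
        boolean linkHref = "a".equalsIgnoreCase(tag) && "href".equalsIgnoreCase(attr);
        boolean objectData = "object".equalsIgnoreCase(tag) && "data".equalsIgnoreCase(attr);
        return (linkHref || objectData) && isDataProtocol(value);
    }

    public static void main(String[] args) {
        System.out.println(shouldRemove("a", "href", "data:text/html;base64,AAAA", true));
        System.out.println(shouldRemove("a", "href", "https://example.org/", true));
        System.out.println(shouldRemove("a", "href", "data:text/html;base64,AAAA", false));
    }
}
```

The sketch only covers the two use cases named above; other tags and attributes are deliberately left alone.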

While it is unlikely that this vulnerability has been exploited, the fix only prevents future misuse, i.e. it is applied when new rich-text content is added or existing content is modified.

To make sure existing content is free of data: protocol usage, customers are strongly recommended to have an administrator run the check/report script below in the Updater Editor after upgrading to the latest maintenance release.

HtmlDataProtocolCheck script

The following reporting-only Groovy script can be executed by an administrator in the CMS Updater Editor after the maintenance upgrade (the script depends on some of the fixes and improvements) using the following settings:

Name: HtmlDataProtocolCheck
Select node using: Updater
Batch Size: 1000

Script content:

package org.hippoecm.frontend.plugins.cms.admin.updater;
import org.htmlcleaner.HtmlCleaner
import org.htmlcleaner.TagNode
import org.onehippo.repository.update.BaseNodeUpdateVisitor

import javax.jcr.Node
import javax.jcr.NodeIterator
import javax.jcr.RepositoryException
import javax.jcr.Session
import javax.jcr.query.Query
import javax.jcr.query.QueryManager

class HtmlDataProtocolCheck extends BaseNodeUpdateVisitor {

  private HtmlCleaner cleaner = new HtmlCleaner();
  private NodeIterator nodeIterator;

  Node firstNode(final Session session) throws RepositoryException {
    final QueryManager queryManager = session.getWorkspace().getQueryManager();
    final Query jcrQuery = queryManager.createQuery("//element(*, hippostd:html)", "xpath");
    nodeIterator = jcrQuery.execute().getNodes();
    return nextNode();
  }

  Node nextNode() throws RepositoryException {
    return nodeIterator.hasNext() ? nodeIterator.next() : null;
  }

  boolean doUpdate(Node node) throws RepositoryException {
    TagNode rootNode = cleaner.clean(node.getProperty("hippostd:content").getString());
    StringBuilder msg = new StringBuilder();
    checkContent(rootNode, msg);
    if (msg.length() > 0) {
      log.info("Found \"data:\" protocol usage in property hippostd:content at "+node.getPath()+":\n"+msg.toString());
    }
    // Report-only script: returning false tells the updater that the node was not modified.
    return false;
  }

  void checkContent(TagNode tagNode, StringBuilder msg) {
    checkDataProtocol(tagNode, "a", "href", msg);
    checkDataProtocol(tagNode, "object", "data", msg);
    for (final TagNode childNode : tagNode.getChildTags()) {
      checkContent(childNode, msg);
    }
  }

  // Appends a report line when the given tag carries the given attribute with a value starting with "data:".
  private void checkDataProtocol(TagNode tagNode, String tagName, String attrName, StringBuilder msg) {
    if (tagName.equals(tagNode.getName()) &&
            tagNode.hasAttribute(attrName) &&
            tagNode.getAttributeByName(attrName).startsWith("data:")) {
      String attrValue = tagNode.getAttributeByName(attrName);
      // Truncate long values (typically base64 payloads) to keep the report readable.
      attrValue = attrValue.length() <= 70 ? attrValue : attrValue.substring(0, 67) + "...";
      msg.append("- <"+tagNode.getName()+" "+attrName+"=\""+attrValue+"\"\n");
    }
  }

  boolean logSkippedNodePaths() {
    return false;
  }

  boolean skipCheckoutNodes() {
    return true;
  }

  boolean undoUpdate(Node node) {
    throw new UnsupportedOperationException();
  }
}

The above script might take some time to run, depending on the number of documents and rich-text fields per document. Once it has finished, it reports the rich-text fields in which data: protocol usage has been detected, if any.
Note that the above updater script verifies all rich-text fields (node type: hippostd:html), regardless of whether they are actively cleaned by a rich-text htmlprocessor upon editing through the CMS.
Reported usages can then be fixed manually by opening and saving, and if needed (re)publishing, the document(s) these fields are part of.
Alternatively, an administrator can use the reported rich-text HTML node paths to navigate to the content property and modify it directly through the CMS Console.
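
When the report is long, the node paths can be pulled out of the log output mechanically. The following is a minimal sketch in plain Java, assuming the log text contains the message exactly as built by the log.info call in the script above. The class name, method name, and the sample paths in main are hypothetical, and any logger prefix (timestamp, level) is simply ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for post-processing the report; not part of the CMS.
final class ReportPathExtractor {

    // Matches the fixed message text produced by the check script and captures the node path.
    // The path itself may contain colons (namespace prefixes), so the capture is greedy and
    // only the final colon before the end of the line is treated as the terminator.
    private static final Pattern FOUND_LINE = Pattern.compile(
            "Found \"data:\" protocol usage in property hippostd:content at (.+):[ \\t]*$",
            Pattern.MULTILINE);

    static List<String> extractNodePaths(String logText) {
        List<String> paths = new ArrayList<>();
        Matcher matcher = FOUND_LINE.matcher(logText);
        while (matcher.find()) {
            paths.add(matcher.group(1));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Made-up sample input following the script's message format.
        String sample =
                "Found \"data:\" protocol usage in property hippostd:content at /content/documents/sample/news/item/item/sample:body:\n"
              + "- <object data=\"data:text/html;base64,AAAA\"\n"
              + "Found \"data:\" protocol usage in property hippostd:content at /content/documents/sample/about/about/sample:intro:\n"
              + "- <a href=\"data:text/html;base64,BBBB\"\n";
        for (String path : extractNodePaths(sample)) {
            System.out.println(path);
        }
    }
}
```

The resulting list of paths can then be worked through one by one in the CMS or the CMS Console.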