Node name encoding

Introduction

By default, documents and folders in the CMS have two name values:

  • display name - a translatable string value that is meant for display on a webpage (like a breadcrumb value, a link name or a page title) or in the document listing in the CMS.
  • node name - the actual node name as stored in the repository, which is also the value used by the HST for constructing a URL.

Both values are encoded using an implementation of the org.hippoecm.repository.api.StringCodec interface to ensure that unsupported characters are either removed or replaced with a supported character.

For display names, the class org.hippoecm.repository.api.StringCodecFactory$IdentEncoding is used, which simply returns the input value unchanged. For node names, the class org.hippoecm.repository.api.StringCodecFactory$UriEncoding is used, which, as its name implies, performs a one-way encoding (no decoding is possible) that translates any UTF-8 string into a set of characters suitable for use in URIs. See this page for a detailed explanation.

Configuration

Both codecs can be configured in the repository. The repository location depends on the version of Hippo CMS.

Hippo CMS version   Repository location
12.1 and older      /hippo:configuration/hippo:frontend/cms/cms-services/settingsService/codecs
12.2 and newer      /hippo:configuration/hippo:modules/stringcodec/hippo:moduleconfig

Both locations accept two properties named encoding.display and encoding.node. Each property value must be the class name of the codec as returned by Class.getName(), e.g. org.hippoecm.repository.api.StringCodecFactory$UriEncoding.
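The '$' in the example value comes from how Java names nested classes: Class.getName() returns the binary name, which separates an outer class and a nested class with '$' rather than '.'. The following minimal sketch illustrates this with a hypothetical stand-in class (CodecNameDemo and its nested UriEncoding are placeholders, not Hippo classes).

```java
public class CodecNameDemo {

    // Hypothetical nested class, standing in for a codec nested in a factory class.
    static class UriEncoding { }

    // Returns the value that would go into the encoding.* property.
    static String propertyValueFor(Class<?> codecClass) {
        return codecClass.getName();
    }

    public static void main(String[] args) {
        // Binary name: outer and nested class are joined by '$'.
        System.out.println(propertyValueFor(UriEncoding.class));
        // Canonical name: joined by '.', which is NOT the form the property expects.
        System.out.println(UriEncoding.class.getCanonicalName());
    }
}
```

When copying a class name from an IDE or from source code, check that a nested class is written with '$' and not with '.'.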

Different node name encoding per locale

In some cases it is desirable to use a different StringCodec for encoding node names per locale. This way, the URLs constructed by the HST will be in the format that users (and machines) expect for that locale. For example, 'ä' is generally encoded as 'a', but in German it should be 'ae'.

To support this, the configuration option for setting a node name codec has been extended. For example, a StringCodec for the German language can be configured with a property named encoding.node.de. If you need to be more specific (for example, separate StringCodecs for Germany and Austria), add two properties named encoding.node.de_de and encoding.node.de_at.
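The property naming scheme above can be pictured as a lookup from most specific to least specific. The sketch below is an illustration only: the fallback order (language_country, then language, then plain encoding.node) is an assumption, not documented behavior, and the class names used as values (com.example....) are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class NodeCodecLookup {

    static final String BASE = "encoding.node";

    // Resolves the configured codec class name for a locale.
    // ASSUMPTION: most-specific-first fallback; the real lookup may differ.
    static String resolveCodecClass(Map<String, String> config, Locale locale) {
        if (locale != null) {
            String language = locale.getLanguage().toLowerCase(Locale.ROOT);
            String country = locale.getCountry().toLowerCase(Locale.ROOT);
            if (!language.isEmpty() && !country.isEmpty()) {
                String value = config.get(BASE + "." + language + "_" + country);
                if (value != null) {
                    return value;
                }
            }
            if (!language.isEmpty()) {
                String value = config.get(BASE + "." + language);
                if (value != null) {
                    return value;
                }
            }
        }
        // No locale-specific property found: use the general node name codec.
        return config.get(BASE);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("encoding.node", "org.hippoecm.repository.api.StringCodecFactory$UriEncoding");
        config.put("encoding.node.de", "com.example.GermanCodec");      // hypothetical class
        config.put("encoding.node.de_at", "com.example.AustrianCodec"); // hypothetical class

        System.out.println(resolveCodecClass(config, Locale.forLanguageTag("de-AT")));
        System.out.println(resolveCodecClass(config, Locale.forLanguageTag("de-CH")));
        System.out.println(resolveCodecClass(config, Locale.FRENCH));
    }
}
```

Under the assumed fallback order, de-AT resolves to the Austrian codec, de-CH falls back to the German codec, and French falls back to the general encoding.node value.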

Bloomreach Experience Manager ships with a default StringCodec implementation. If you decide to configure a custom StringCodec, you will have to implement it yourself. A good starting point is the class org.hippoecm.repository.api.StringCodecFactory$UriEncoding, which can be found in the Hippo Repository project.
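As a rough illustration of what a German-specific codec might do, the sketch below transliterates umlauts before reducing the name to URL-friendly characters. It is not the UriEncoding implementation. To stay self-contained it declares a local NodeNameCodec interface as a stand-in; a real codec would implement org.hippoecm.repository.api.StringCodec instead, whose encode/decode shape is assumed here. The exact replacement rules are also assumptions.

```java
import java.text.Normalizer;
import java.util.Locale;

public class GermanNodeNameCodec {

    // Local stand-in (assumption) for org.hippoecm.repository.api.StringCodec.
    interface NodeNameCodec {
        String encode(String plain);
        String decode(String encoded);
    }

    static final NodeNameCodec GERMAN = new NodeNameCodec() {
        @Override
        public String encode(String plain) {
            // Compose characters first so that 'a' + combining diaeresis becomes a single umlaut.
            String s = Normalizer.normalize(plain.trim(), Normalizer.Form.NFC)
                    .toLowerCase(Locale.GERMAN)
                    .replace("\u00e4", "ae")   // a-umlaut
                    .replace("\u00f6", "oe")   // o-umlaut
                    .replace("\u00fc", "ue")   // u-umlaut
                    .replace("\u00df", "ss");  // sharp s
            // Strip any remaining diacritics (e.g. e-acute becomes e).
            s = Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
            // Collapse every run of other characters into a single dash.
            s = s.replaceAll("[^a-z0-9]+", "-");
            // Remove leading and trailing dashes.
            return s.replaceAll("^-+|-+$", "");
        }

        @Override
        public String decode(String encoded) {
            // One-way encoding: the original value cannot be recovered.
            return encoded;
        }
    };

    public static void main(String[] args) {
        System.out.println(GERMAN.encode("\u00dcber uns"));          // ueber-uns
        System.out.println(GERMAN.encode("Stra\u00dfe & Caf\u00e9")); // strasse-cafe
    }
}
```

A class like this would then be referenced by its Class.getName() value in a property such as encoding.node.de.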
