Configuring Replication
Replication is implemented using a push mechanism on the source side. Changes in the source repository trigger the packaging and sending of those changes to the target. The bulk of the processing, the control, and the configuration is located on the source. The root of the replication configuration is located at /hippo:configuration/hippo:modules/replication/hippo:moduleconfig.
Configuring the replication target
In order for the source to know where to replicate to, it needs to be configured with the right URL to connect to and the credentials of the (target) user to connect with. Currently only one replication target is supported.
Below /hippo:configuration/hippo:modules/replication/hippo:moduleconfig/targets add a node of type hipposys:moduleconfig.
Specify the URL of the target replication REST service as the value of the property location. This should be something like http://host[:port]/cmsapp/ws/replication-target.
Also specify the credentials of the target user to connect with, using the properties username and password. These must belong to a valid repository user on the target who is also authorized with the restuser role for the /hippo:configuration/hippo:domains/replication-rest domain (by default, all users from the admin group are already authorized).
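Because a mistyped location only shows up once replication fails to connect, it can help to sanity-check the value before saving it. The following is a minimal sketch, not part of the replication add-on: the class name TargetLocationCheck and its method are hypothetical, and the check on the /ws/replication-target suffix is an assumption derived from the example URL above, not a documented requirement.

```java
import java.net.URI;

// Hypothetical pre-flight check for the "location" property; not part of the add-on.
final class TargetLocationCheck {

    // Assumed suffix, taken from the example URL in the text above.
    static final String EXPECTED_SUFFIX = "/ws/replication-target";

    // Returns null when the location looks plausible, otherwise a short problem description.
    static String problemWith(String location) {
        final URI uri;
        try {
            uri = URI.create(location);
        } catch (IllegalArgumentException e) {
            return "not a valid URI";
        }
        String scheme = uri.getScheme();
        if (scheme == null
                || !(scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))) {
            return "scheme must be http or https";
        }
        if (uri.getHost() == null) {
            return "host is missing";
        }
        String path = uri.getPath();
        if (path == null || !path.endsWith(EXPECTED_SUFFIX)) {
            return "path does not end with " + EXPECTED_SUFFIX;
        }
        return null;
    }
}
```

Calling TargetLocationCheck.problemWith with the configured value returns null for a plausible URL. The check does not contact the target or verify the credentials.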
Tweaking the replication process
As properties directly under the root of the configuration tree at /hippo:configuration/hippo:modules/replication/hippo:moduleconfig, you can tweak the following settings:
Property name | Multi-valued | Default | Needs Restart | Description |
monitorinterval | no | 10000 | no | How often the repository journal is monitored for changes, specified in milliseconds. |
packagerinterval | no | 10000 | no | How often detected and queued changes are processed and sent to the target, specified in milliseconds. |
syncbundlesize | no | 100 | no | Setting that controls how many replication scopes are bundled in a single replication package during a sync. |
ignoredeventpathpatterns | yes | | yes | List of regular expression patterns of events that the change monitor should ignore. |
maxbinarysize | no | 10485760 (10 MB) | yes | If the size of a binary exceeds this limit, it is not replicated. Specify -1 for no limit. |
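Two of these settings describe simple decision rules, which the following sketch spells out. It is an illustration only, not the add-on's code: the class and method names are hypothetical, and it assumes that an ignoredeventpathpatterns entry must match the whole event path. The source does not say whether a partial match is enough.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of two settings from the table; not the add-on's code.
final class ReplicationSettingsSketch {

    // Default of maxbinarysize as listed in the table (10 MB).
    static final long DEFAULT_MAX_BINARY_SIZE = 10485760L;

    // A binary is replicated unless its size exceeds the limit; -1 disables the limit.
    static boolean isBinaryReplicated(long sizeInBytes, long maxBinarySize) {
        return maxBinarySize == -1L || sizeInBytes <= maxBinarySize;
    }

    // Assumption: a pattern must match the entire event path for the event to be ignored.
    static boolean isEventIgnored(String eventPath, List<Pattern> ignoredPatterns) {
        for (Pattern pattern : ignoredPatterns) {
            if (pattern.matcher(eventPath).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Under this reading, a binary of exactly 10485760 bytes is still replicated with the default limit, and one byte more is not.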
Including and excluding nodes
For a node to be replicated, it needs to meet a number of criteria. The main one is that it must be located below a path that is configured to be included, and must not be located below a path that is configured to be excluded or ignored. This is configured on the node /hippo:configuration/hippo:modules/replication/hippo:moduleconfig/metadata with the multi-valued properties excludedpaths and includedpaths. No restart is needed when changing these properties.
Ignoring nodes
Excluding paths from replication not only means that the nodes below those paths are not replicated from source to target; it also means that those paths are removed from the target if they exist there. For example, if comments are added through the site and therefore exist only on the target, replication removes them when they are stored in a subtree below an included path, even if the path to those comments is configured to be excluded on the source. For this use case, preventing content that is not on the source from being removed from the target, you must add the path to the multi-valued ignoredpaths property on the node /hippo:configuration/hippo:modules/replication/hippo:moduleconfig/metadata. Note that such ignored paths must be present on the source for this to have the desired effect.
For example, let's say the target stores user comments inside the folder /content/documents/comments (e.g. /content/documents/comments/2016/01/07/usercomment1). To prevent /content/documents/comments and all the user comments stored inside it from being removed from the target by replication, you must:
- Create the folder /content/documents/comments on the source.
- Add the folder /content/documents/comments to both the excludedpaths and the ignoredpaths property of the node /hippo:configuration/hippo:modules/replication/hippo:moduleconfig/metadata on the source.
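The interplay of includedpaths, excludedpaths and ignoredpaths can be pictured with a small model. This is a sketch of the rules as described in this section, not the add-on's implementation. The class name and the Decision values are hypothetical. It assumes that paths match on whole node names and that an ignored path takes precedence over an excluded one, which is consistent with the comments example, where the folder is listed in both properties. It does not model the requirement that ignored paths exist on the source.

```java
import java.util.List;

// Hypothetical model of the path rules described above; not the add-on's code.
final class ReplicationPathRules {

    enum Decision { REPLICATE, REMOVE_FROM_TARGET, LEAVE_UNTOUCHED, OUT_OF_SCOPE }

    private final List<String> includedPaths;
    private final List<String> excludedPaths;
    private final List<String> ignoredPaths;

    ReplicationPathRules(List<String> includedPaths,
                         List<String> excludedPaths,
                         List<String> ignoredPaths) {
        this.includedPaths = includedPaths;
        this.excludedPaths = excludedPaths;
        this.ignoredPaths = ignoredPaths;
    }

    // True if path equals root or lies below it, matching on whole node names.
    static boolean isAtOrBelow(String path, String root) {
        if (root.equals("/")) {
            return path.startsWith("/");
        }
        return path.equals(root) || path.startsWith(root + "/");
    }

    private static boolean matchesAny(String path, List<String> roots) {
        for (String root : roots) {
            if (isAtOrBelow(path, root)) {
                return true;
            }
        }
        return false;
    }

    Decision decide(String path) {
        if (!matchesAny(path, includedPaths)) {
            return Decision.OUT_OF_SCOPE;        // not below any included path
        }
        if (matchesAny(path, ignoredPaths)) {
            return Decision.LEAVE_UNTOUCHED;     // assumption: ignored wins over excluded
        }
        if (matchesAny(path, excludedPaths)) {
            return Decision.REMOVE_FROM_TARGET;  // excluded: not replicated, removed on target
        }
        return Decision.REPLICATE;
    }
}
```

With /content/documents included and /content/documents/comments both excluded and ignored, the path /content/documents/comments/2016/01/07/usercomment1 resolves to LEAVE_UNTOUCHED in this model. Without the ignoredpaths entry it would resolve to REMOVE_FROM_TARGET.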
Authentication
To prevent the target from accepting content from an untrusted source, Tomcat can be configured to use two-way SSL authentication for the replication REST service. This ensures that the connection between the source and target can only be established when valid and known certificates are used on both ends.
For information on how to set up Tomcat for this, see the Tomcat SSL/TLS Configuration HOW-TO document.
Replication SSL authentication configuration example
The following example configuration forces the target to use SSL and to validate the client certificate before accepting the source connection. A self-signed certificate can be used, but the server name used in the certificate must match the actual server name.
To log information about connection issues, configure log4j to log org.apache.cxf at level info.
Replication target configuration
As an example of how the replication target can be configured to accept connections only from a trusted source, the following Tomcat connector configuration accepts only connections that present a certificate contained in the configured truststore:
<Connector port="8443"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="150"
           SSLEnabled="true"
           scheme="https"
           secure="true"
           clientAuth="true"
           sslProtocol="TLS"
           truststoreFile="/home/hippo/.keystore"
           truststorePass="changeit"/>
Replication source configuration
The source can be configured to provide a certificate to the target using Java system properties. The target validates this certificate against its own truststore. Example of a source keystore configuration using Cargo systemProperties in a test environment:
<javax.net.ssl.keyStore>/home/hippo/.keystore</javax.net.ssl.keyStore>
<javax.net.ssl.keyStorePassword>changeit</javax.net.ssl.keyStorePassword>
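Outside Cargo, the same two standard JSSE system properties can be passed to the JVM as -Djavax.net.ssl.keyStore=... and -Djavax.net.ssl.keyStorePassword=... arguments. If the source fails to present a certificate, a first thing to rule out is that one of them is unset. The following is a minimal sketch of such a check. The class and method names are hypothetical, and it only tests for presence. It does not open the keystore or validate the certificate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical diagnostic helper; not part of the replication add-on.
final class SourceKeystoreCheck {

    // Standard JSSE system properties used in the example above.
    static final String[] REQUIRED = {
        "javax.net.ssl.keyStore",
        "javax.net.ssl.keyStorePassword"
    };

    // Returns the names of required properties that are unset or blank.
    static List<String> missingProperties(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = props.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Calling SourceKeystoreCheck.missingProperties(System.getProperties()) inside the source JVM returns an empty list when both properties are set.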