Automatic Export Add-on
Introduction
Goal
Automatically export changes made in the development content repository to the repository data modules in your project.
Background
Experience has shown that one of the tricky tasks for a developer working on a project is keeping track of the changes made to the local development repository and exporting those changes to the relevant files before committing them to a version control system.
In order to help you with that task, the automatic export mechanism tracks changes you make to the repository and exports those changes to your file system.
Getting Started
Enable Automatic Export
In order to enable automatic export, you need to tell it where to find and store export files. This is done by passing two Java system properties when starting the CMS.
The first property, project.basedir, is specified in the primary project POM:
    <profile>
      <id>cargo.run</id>
      <build>
        <plugins>
          <plugin>
            <groupId>org.codehaus.cargo</groupId>
            <artifactId>cargo-maven2-plugin</artifactId>
            <configuration>
              <snip/>
              <container>
                <systemProperties>
                  <!-- enables auto export: -->
                  <project.basedir>${project.basedir}</project.basedir>
                </systemProperties>
              </container>
            </configuration>
          </plugin>
        </plugins>
      </build>
    </profile>
The project.basedir system property must be set to the absolute path (starting with a /) of the root of your project. The easiest way to do this is by using the Maven property project.basedir, as shown in the example above. The second property, repo.autoexport.allowed, must be set to true, which is the default provided in the hippo-cms-project POM from which the project POM inherits. When you run your newly built CMS project using the cargo.run profile, these properties enable automatic export by default. If you log in to the Console, you can press the Disable/Enable auto export button in the top menu to turn this function off and on.
Best practice setup of application, development, and webfiles modules
It is considered a best practice to set up multiple Maven modules to contain repository data for your project. A new project created using the archetype for Bloomreach Experience Manager will contain five modules: repository-data/application, repository-data/development, repository-data/site, repository-data/site-development and repository-data/webfiles. The development and site-development modules are intended to separate data files that should be deployed only to development and testing environments from data files that should be deployed to production. This distinction is also supported via the Maven profile without-development-data, which is intended to be used with cargo.run in situations where you want to avoid using development data in your local environment, and via the dist-with-development-data module, which is intended for situations where you want to deploy development data into a remote testing environment.
The webfiles module typically does not need to be updated via automatic export, since it is more common to make edits to those files directly on the filesystem within the development project and allow automatic reload to import the changed files into the repository. See Using Web Files for more information.
Restriction: No Downstream Module dependencies allowed
Automatic export can only be enabled if no downstream module (a module loaded or explicitly ordered after the modules configured for auto export) contains config, content, or namespace definitions, unless that module is itself configured for auto export. Modules that contain only webfilebundle definitions are exempt from this rule.
Note: this might occur if you have additional modules in your project which are not explicitly dependent on other modules, in which case they will be ordered via lexical sorting by module name. This might happen if multiple development teams are working on different independent modules within the same project.
Possible solutions include:
- configure all modules for auto export, if and only if that is the intended purpose (which typically should not be the case for a webfiles module), or
- make sure the application module is ordered after any module not being exported, via the hcm-module.yaml file. See the last line in the following example:
    group:
      name: myproject
      after: hippo-cms
    project: myproject
    module:
      name: repository-data-application
      after: repository-data-other-team
Disabling Automatic Export from the Console
Sometimes you might want to add or remove nodes and properties in the repository for testing purposes, or import some batch content for local development only, without having these changes exported. In these cases you can temporarily disable automatic export from the Console so that it stops updating your configuration files.
Note that when you re-enable automatic export, any changed nodes will be examined for differences and exported, even for nodes that were changed while automatic export was disabled. If you do not want specific nodes to be exported, you must ensure that they are removed from your repository before re-enabling automatic export. Alternatively, you may explicitly ignore specific nodes by adding the appropriate .meta:category: runtime or content declarations to the YAML-based Configuration Model, or use the autoexport:excluded property described below.
Configuration Options
The automatic export configuration is located at /hippo:configuration/hippo:modules/autoexport/hippo:moduleconfig in the repository. Log in to the Console and browse to that node. The options below are defined as properties on that node.
Exclusion Patterns
Instead of disabling and enabling automatic export from the Console every time you need to, it may be more convenient to configure exclusion patterns for paths you never want to have exported. This is done using the multi-valued property autoexport:excluded. You can specify wildcard patterns as the values of this property, where * means any path element and ** means any path. Say, for instance, that you want to exclude all content below /foo/bar. You would then specify /foo/bar/** as your exclusion pattern. Often, however, you want to exclude not only everything below a certain path but also the path itself. To also exclude /foo/bar itself, you have to specify that explicitly by adding /foo/bar as a second value.
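As a minimal sketch, the /foo/bar example above could look as follows when written in the YAML notation this page uses for the archetype's default module configuration. Only the two pattern values come from the text; the surrounding layout is an assumption, and any values already present on autoexport:excluded in your project would need to be kept alongside them.

```yaml
/hippo:configuration/hippo:modules/autoexport:
  /hippo:moduleconfig:
    # Assumed layout; both entries are needed to exclude the node and its subtree.
    autoexport:excluded:
      - '/foo/bar'      # the node itself
      - '/foo/bar/**'   # everything below it
```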
Override .meta:residual-child-node-category
During local development, you may want to use a different setting for .meta:residual-child-node-category for a specific node than the value defined in the configuration model. The multi-valued string property autoexport:overrideresidualchildnodecategory allows you to configure such overrides. Values for this property should follow the pattern <path>: <category>, where:
- <path>: a node path (the same wildcards as in autoexport:excluded can be used)
- <category>: can be config, content or system
A few overrides are added into the configuration model by default, but you can adjust this if your project or workflow so requires. As an example: the default for .meta:residual-child-node-category for the node /hst:hst/hst:configurations is content. This is because new channels created from blueprints should be classified as content rather than config. However, during local development when you add a new channel it is more likely those nodes and properties should go into the config tree. For this reason, the value /hst:hst/hst:configurations: config is added by default through the hst configuration model.
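For illustration, the default override mentioned above might be expressed like this. The value string is taken from the text; the YAML layout mirrors the module configuration example elsewhere on this page and is an assumption, as is the idea that you would list it in your own project files rather than inherit it.

```yaml
/hippo:configuration/hippo:modules/autoexport:
  /hippo:moduleconfig:
    # Pattern is "<path>: <category>". The value is quoted because it contains ": ".
    autoexport:overrideresidualchildnodecategory:
      - '/hst:hst/hst:configurations: config'
```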
Inject .meta:residual-child-node-category
During local development, you may want to automatically transition from config to content at certain paths when importing data or copying trees in the Console. The multi-valued string property autoexport:injectresidualchildnodecategory allows you to configure this behavior. Values for this property should follow the pattern <path>: <category>, where:
- <path>: a node path (the same wildcards as in autoexport:excluded can be used). An additional capability for these paths is that they can optionally end with [<primarytype>] (see example below); if such a primary type string is present, the pattern only matches if the node has that exact primary type
- <category>: can be config, content or system
When AutoExport is exporting a path that matches the pattern (including the optional primary type), the node itself is exported as config, with all of its properties, and an additional .meta:residual-child-node-category definition is added with the pattern's category (typically content). Any child nodes are exported as content, or ignored (when using category system).
A few values are added into the configuration model by default, but you can adjust this if your project or workflow so requires. One example is the setting **/hst:workspace/**[hst:containercomponent]: content, which makes sure that new componentcontaineritems in the HST workspace are serialized as content definitions.
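A sketch of how the example setting above might appear in YAML. The value is quoted from the text; the layout follows the module configuration example on this page and is an assumption.

```yaml
/hippo:configuration/hippo:modules/autoexport:
  /hippo:moduleconfig:
    # "<path>[<primarytype>]: <category>": matching nodes are exported as config,
    # with a .meta:residual-child-node-category of content for their children.
    autoexport:injectresidualchildnodecategory:
      - '**/hst:workspace/**[hst:containercomponent]: content'
```

The quotes matter here: the value starts with * and contains a colon followed by a space, both of which have special meaning in unquoted YAML.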
Filtering out UUIDs
When nodes carry the mix:referenceable mixin, export would normally generate jcr:uuid properties for those nodes. Because these UUIDs change whenever the node is redefined, and because UUIDs are not strictly needed in some contexts, this may lead to unnecessary SCM conflicts. For this reason, you may configure automatic export to filter out these jcr:uuid properties using the multi-valued property autoexport:filteruuidpaths. The values specified here use the same wildcard patterns, where * means any path element and ** means any path. So, for instance, /hst:*/** will filter out jcr:uuid properties for all nodes below a top-level node whose name starts with hst:, such as /hst:hst.
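The /hst:*/** pattern from the paragraph above could be configured roughly as follows. This is a sketch only: the pattern comes from the text, while the YAML layout is assumed from the module configuration example on this page.

```yaml
/hippo:configuration/hippo:modules/autoexport:
  /hippo:moduleconfig:
    # jcr:uuid properties on nodes matching these patterns are left out of the export.
    autoexport:filteruuidpaths:
      - '/hst:*/**'
```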
Configuring AutoExport Modules
Typically, projects have their content and/or configuration split up in multiple modules. Automatic export supports this via the multi-valued property autoexport:modules which accepts string values of the following format: mymodule:/repositorypath. Here, mymodule is the file path relative to the project.basedir to the Maven module's root directory, and /repositorypath is a path in the repository. Everything below /repositorypath, including /repositorypath itself, will be exported to the module mymodule.
It is also possible to specify a value with no repository path, like mymodule. In this case, existing definitions in that module will be updated, but no new definitions will be created in that module. This can be useful for test data used during development, which can be moved manually to a development module and then kept up to date automatically.
One typical use case for multiple content modules is to have separate namespaces go into separate modules. To configure auto export for this, add a value such as foocontent:/hippo:namespaces/foo to the autoexport:modules property. All document type definitions for that namespace, along with the namespace definition and CND, will be exported to module foocontent.
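As a hedged sketch, the namespace example might be combined with a catch-all application module like this. The foocontent value comes from the text and repository-data/application:/ from the archetype default shown on this page; treating foocontent as a module directory directly under project.basedir, and omitting the other default entries, are simplifying assumptions.

```yaml
/hippo:configuration/hippo:modules/autoexport:
  /hippo:moduleconfig:
    autoexport:modules:
      # Format: <module dir relative to project.basedir>:<repository path>
      - 'repository-data/application:/'
      # New definitions under the foo namespace go to the foocontent module.
      - 'foocontent:/hippo:namespaces/foo'
```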
Note that auto export will not move definitions from one module to another when they already exist. When changes are made to existing definitions, it will simply update them in their current location. It will only add new items to the configured module when these need to be created. If you want to move items you must do that manually and then restart the project.
The Bloomreach Experience Manager project archetype provides the following default configuration for the autoexport modules:
    /hippo:configuration/hippo:modules/autoexport:
      /hippo:moduleconfig:
        autoexport:modules: ['repository-data/application:/', repository-data/development, 'repository-data/site:myproject:/hst:myproject', 'repository-data/site:myproject']
Changing the module configuration requires a restart.
File Structure
Automatic export reads your project's files and updates these when something changes to the nodes serialized in these files. However, sometimes there is no appropriate file to write changes to and in this case automatic export creates a new file for you. In order to determine on what nodes to add a new file and where to put these files in the directory hierarchy, it tries to follow a best-practices convention.
Limitations
1. Content roots that have Same-Name Siblings
Auto-export may create invalid YAML configuration files or fail to notice all repository changes in some situations involving content roots that define a root node that has same-name siblings. Using the default file conventions, this will only occur when same-name siblings are used in one of the top three repository levels below /content, which typically define document or gallery folders. Avoid using same-name siblings in this context. In general, the use of same-name siblings is discouraged for forward-compatibility reasons.
2. Content root Ordering
The ordering of content root nodes is not correctly exported in all situations. This is likely to occur for overlapping content definitions (for example, one content definition defining /content/documents/myproject/news and another defining /content/documents/myproject/news/2017/10) or for content definitions that are contributed by upstream modules. Manual editing of YAML configuration files may be needed to produce the desired node ordering.
3. Local resource files are not deleted
Binary resources in the repository are represented in YAML by jcr:data elements that have a reference to an actual resource file:
    jcr:data:
      type: binary
      resource: /content/gallery/myproject/myimage/myimage_original.jpg
The known limitation here is that if an image or file is deleted from the repository, the YAML file is updated correctly, but the referenced file in the file system is not deleted. You need to delete the file manually to avoid polluting the project's sources.