Configuring brX Content Search Integration

Configuration

The Content Feed addon can be configured to create full and delta content feeds automatically on a recurring schedule. For more information on the differences between these feed types, please see the brSM Content Search documentation.

The Content Feed addon uses dynamic bean generation when serialising brXM content, so dynamic beans must not be disabled in your project. Refer to the relevant documentation page and ensure dynamic beans are enabled.

If you are using the Taxonomy or Selections plugins, make sure the hst-beans-annotated-classes parameter in the cms webapp's web.xml is up to date. For example, this entry should be added for Taxonomy:
<context-param>
    <param-name>hst-beans-annotated-classes</param-name>
    <param-value>classpath*:org/onehippo/taxonomy/**/*.class</param-value>
</context-param>

Configure Content Feed

When installing the addon, the following default configuration is bootstrapped under /hippo:configuration/hippo:modules/content-feed:

definitions:
  config:
    /hippo:configuration/hippo:modules/content-feed/hippo:moduleconfig/control:
      fullFeedTime: '01:00'
    /hippo:configuration/hippo:modules/content-feed/hippo:moduleconfig/metadata:
      feedCronExpression: 0 */5 * ? * *
      enabled: false
      includedPaths: [/content/documents]
      excludedDocumentTypes: ['resourcebundle:resourcebundle', 'robotstxt:section']
      numberOfSiteWebapps: 1
      referenceBeanDepthLimit: 2
      referenceBeanTotalLimit: 50
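
The default feedCronExpression above has six fields and uses the ? character, which suggests Quartz-style cron syntax rather than the five-field Unix form. That is an assumption, not something this page states. Under that assumption the fields read seconds, minutes, hours, day of month, month, day of week, so the default would fire at second 0 of every fifth minute. The hypothetical helper below only labels the fields; it does not evaluate the schedule.

```python
# Field order assumed from Quartz-style cron (seconds first). The optional
# seventh "year" field is accepted but not required.
QUARTZ_FIELDS = (
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
)


def label_cron_fields(expression: str) -> dict:
    """Split a cron expression and pair each part with its assumed field name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    # zip stops at the shorter sequence, so "year" is dropped for 6 fields.
    return dict(zip(QUARTZ_FIELDS, parts))


if __name__ == "__main__":
    for name, value in label_cron_fields("0 */5 * ? * *").items():
        print(f"{name:>13}: {value}")
```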

The parameters for each section are described below.

Section control

| Parameter | Default value | Description |
|---|---|---|
| fullFeedTime | 01:00 | Time of day at which the daily full feed is triggered. |
| maxFullFeedRetryCount | 3 | Number of retries if the delivery of the created feed file fails. |
| jobStatusRetryCount | 20 | Number of times to poll the status of a feed send or feed create operation running on DataConnect. While the operation is in running or queued status, the status is checked again every 30 seconds until this retry count is reached. |
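
As a rough worked example, the jobStatusRetryCount description implies an upper bound on how long the addon keeps polling DataConnect. The sketch below assumes each retry is preceded by one full 30-second wait; whether the first status check happens immediately is not stated here, so treat the result as approximate. The function name is hypothetical.

```python
POLL_INTERVAL_SECONDS = 30  # interval stated in the jobStatusRetryCount description


def max_status_wait_seconds(job_status_retry_count: int,
                            poll_interval_seconds: int = POLL_INTERVAL_SECONDS) -> int:
    """Approximate longest time spent polling before the retries run out."""
    if job_status_retry_count < 0:
        raise ValueError("retry count cannot be negative")
    return job_status_retry_count * poll_interval_seconds


if __name__ == "__main__":
    seconds = max_status_wait_seconds(20)  # 20 is the documented default
    print(f"about {seconds} seconds ({seconds // 60} minutes)")
```

With the default of 20 retries this comes to roughly 10 minutes. If feed operations on your account regularly take longer, that is the value to raise.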

Section metadata

| Parameter | Default value | Description |
|---|---|---|
| feedCronExpression | 0 */5 * ? * * | The schedule for delta feeds, in cron format. |
| enabled | false | Whether the content feed is generated. Set this value to ‘true’ to enable the content feed service; leave it at ‘false’ to disable the entire service temporarily. When using a blue-green deployment strategy, enable the service only in the current production environment and always disable it in the current stand-by environment. |
| includedPaths | [/content/documents] | Multiple-value “allow list” property that lists the paths where the content feed service looks for documents to include in the search index. |
| excludedDocumentTypes | ['resourcebundle:resourcebundle', 'robotstxt:section'] | Multiple-value “block list” property that contains the document types to exclude from the content feed. |
| numberOfSiteWebapps | 1 | Number of site webapps that should have been registered before feed execution begins. |
| referenceBeanDepthLimit | 2 | When including values from linked documents in the content feed, this value limits how far removed a referenced document can be while still being included. The default value of “2” includes documents that are indirectly referenced 2 steps from the current document. A value of 1 includes only values from directly referenced documents. |
| referenceBeanTotalLimit | 50 | When including values from linked documents in the content feed, this value sets the maximum number of referenced documents that are included, separate from the depth limit above. Referenced documents beyond the depth limit are not included, even if this total has not been reached. |
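
To illustrate how referenceBeanDepthLimit and referenceBeanTotalLimit interact, here is a minimal sketch of a traversal bounded by both values. It is not the addon's implementation: the breadth-first order, the function name, and the links mapping (document id to referenced ids) are all assumptions made for the example. How the addon chooses which documents fill the total limit is not described on this page.

```python
from collections import deque


def collect_referenced(start, links, depth_limit=2, total_limit=50):
    """Return referenced document ids reachable from `start` within both limits.

    `links` maps a document id to the ids it references (hypothetical shape).
    """
    included = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(included) < total_limit:
        doc, depth = queue.popleft()
        if depth == depth_limit:
            continue  # references of this document would exceed the depth limit
        for ref in links.get(doc, ()):
            if ref in seen:
                continue
            if len(included) >= total_limit:
                break  # total limit reached; stop adding documents
            seen.add(ref)
            included.append(ref)
            queue.append((ref, depth + 1))
    return included


if __name__ == "__main__":
    # a -> b, c ; b -> d ; d -> e   (e is three steps away from a)
    links = {"a": ["b", "c"], "b": ["d"], "d": ["e"]}
    print(collect_referenced("a", links))                  # depth 2, total 50
    print(collect_referenced("a", links, depth_limit=1))   # direct references only
    print(collect_referenced("a", links, total_limit=2))   # capped by the total
```

In this sketch the document "e" is never included with the default depth of 2, regardless of how large the total limit is, which matches the rule that the total limit does not override the depth limit.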

Configure Environment-Specific Properties

The properties below should be added to an environment-specific HST properties file for the CMS/Platform, such as the cms webapp's hst-config.properties. See the HST container configuration docs for details.

# The HST host group containing mount(s) that serve live content.
# This should be configured such that appropriate links are generated for
# content search results. For example, a production search index should use
# the host group that matches the production brXM environment. E.g.: prod, staging.
brx.contentfeed.environment = dev-localhost

Configure DataConnect Feed Transfer

In this section you can configure the destination for the content feed. Currently, the addon uses an SFTP connection to deliver the feed to brSM. You will need to configure an SSH key pair that is unique for your brSM account.

brx.dataconnect.transfer.connection.host = sftp-staging.connect.bloomreach.com (staging) or sftp.connect.bloomreach.com (production)
[Host of the SFTP server]
brx.dataconnect.transfer.connection.port = 22
[Port of the SFTP server]
brx.dataconnect.transfer.connection.username =
[SFTP account username, provided to client by a Bloomreach representative.]
brx.dataconnect.transfer.connection.location = /home/<username>
[Directory on the remote SFTP server. The value should begin with /home and contain the username defined in brx.dataconnect.transfer.connection.username.]
brx.dataconnect.transfer.connection.privatekeypath =
[The file path of the private key used for authenticating against the remote host (e.g. ${brc.appconfigpath}/id_rsa.properties). See the notes about how to configure the private key and the required key file format below.]
brx.dataconnect.transfer.connection.privatekeypassphrase = [The passphrase of the private key, in case one is needed.]
brx.dataconnect.transfer.connection.serverhostkey =
AAAAB3NzaC1yc2EAAAADAQABAAABAQCBaNJJwEuhk4QxT4BWqPcsDvbaAR6JWB/Ypq/RV+nifYyBRplhfrgwGtr5EFKM2xK/yo0qO0uSYE8uBiaGAFefJ/f1tNMmwqMdVvp9oWLSvH1e+gqHGqubSKazUCX6AKXJ1ZPw+uG9lM267OwIKTvM8iKteXCafNRvY04bh3oCejWu9Djiu44BaTIjIl4QV9qKLm+lTa4vxvDZPGJDKOIfbuHEFz0H2vXOT478g+C4nojDAsZxbU6mAs/Bqa81BsZKHZiNK0UvYJvtIr0y3f0FFUR6JBUoHY4q0RLw+Mit7DenuGIJdw/t1nDEIikAQ8eGcseZcx+vvh/PmayDtdlh
[The host key (public key) of the server that will be connected to.]
brx.dataconnect.transfer.connection.connect.timeout = 60
[Timeout duration in seconds for the SFTP connection. If the connection cannot be established within this duration, feed execution fails.]
To see how a private key can be configured in the cloud environment, see Set Environment Configuration Properties. The private key file should have a .properties extension, for example id_rsa.properties.
The value provided above for the property brx.dataconnect.transfer.connection.serverhostkey is the public key of Bloomreach's production server. Contact your Bloomreach representative if you are using a different server or an environment other than production.
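
The constraints stated in the property descriptions can be checked before deployment. The following is a minimal sketch of such a check, assuming the properties have already been loaded into a dictionary; the function name and the wording of the messages are hypothetical and not part of the addon.

```python
PREFIX = "brx.dataconnect.transfer.connection."


def check_transfer_settings(props: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    username = props.get(PREFIX + "username", "").strip()
    location = props.get(PREFIX + "location", "").strip()
    key_path = props.get(PREFIX + "privatekeypath", "").strip()

    if not username:
        problems.append("username is empty")
    if not location.startswith("/home"):
        problems.append("location should begin with /home")
    # Compare against whole path segments so "acme" does not match "acme2".
    if username and username not in location.split("/"):
        problems.append("location should contain the username")
    if not key_path.endswith(".properties"):
        problems.append("private key file should have a .properties extension")
    return problems


if __name__ == "__main__":
    example = {
        PREFIX + "username": "acme",  # placeholder account name
        PREFIX + "location": "home/acme",
        PREFIX + "privatekeypath": "/etc/brx/id_rsa.properties",
    }
    for problem in check_transfer_settings(example):
        print("WARNING:", problem)
```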

The Content Feed SFTP connection supports RSA-type SSH private keys and some other types, but it does not support private keys in the OpenSSH format. On many recent Linux and macOS systems, the ssh-keygen command creates private keys in the OpenSSH format by default. To check an existing private key, look at the first line of the key file: an OpenSSH-type key starts with -----BEGIN OPENSSH PRIVATE KEY-----, whereas an RSA-type key starts with -----BEGIN RSA PRIVATE KEY-----.
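
The first-line check described above is easy to script. The sketch below classifies a key by its header line only; it does not validate the key material, and the function names are hypothetical.

```python
OPENSSH_HEADER = "-----BEGIN OPENSSH PRIVATE KEY-----"
RSA_HEADER = "-----BEGIN RSA PRIVATE KEY-----"


def classify_private_key(first_line: str) -> str:
    """Return 'rsa', 'openssh', or 'unknown' based on the key's header line."""
    header = first_line.strip()
    if header == RSA_HEADER:
        return "rsa"
    if header == OPENSSH_HEADER:
        return "openssh"
    return "unknown"


def classify_private_key_file(path: str) -> str:
    """Read only the first line of the file at `path` and classify it."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return classify_private_key(handle.readline())


if __name__ == "__main__":
    import sys

    for key_path in sys.argv[1:]:
        print(key_path, "->", classify_private_key_file(key_path))
```

A result of "openssh" means the key has to be converted or regenerated before the Content Feed addon can use it.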

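A new RSA key can be created with the following command. The -m PEM option makes recent ssh-keygen versions write the key in the RSA format instead of the OpenSSH format:

ssh-keygen -t rsa -m PEM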

To convert an existing OpenSSH-type key to an RSA one, use the following command. It rewrites the key file in place, so back up the original first:

ssh-keygen -p -N <your_new_password> -m pem -f <path-of-your-private-key>

The RSA private key should not be passphrase protected. If your RSA private key is passphrase protected, an unencrypted copy of the private key can be created with the following command:

openssl rsa -in <original_private_key_file> -out <new_unencrypted_private_key_file>

Configure DataConnect API

The last step in setting up feed generation is to configure the client for calls to brSM’s DataConnect API. The DataConnect API is used to trigger the creation of a new content search index from an uploaded content feed. The values described below will be provided by a Bloomreach Support representative.

brx.dataconnect.api.baseurl = 
https://api-staging.connect.bloomreach.com/dataconnect/api/v1/ (staging) or
https://api.connect.bloomreach.com/dataconnect/api/v1/ (production)
[the DataConnect service url]
brx.dataconnect.api.key = [your DataConnect API key]
brx.dataconnect.accountId = [your brSM account id]
brx.dataconnect.catalog = [your brSM catalog name]

Configure Query API

The following properties configure the site’s access to the brX Content Search API. These properties must additionally be defined in the site webapp's hst-config.properties.

brx.search.uri: [your brSM search API endpoint]
brx.search.accountId: [your brSM account id]
brx.search.cache.enabled: false
brx.search.catalogs: [your SM catalogs (comma separated)]
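
Because brx.search.catalogs holds several catalog names in one comma-separated value, it can help to see how such a value breaks down. The sketch below uses a deliberately simplified properties parser (no line continuations or escapes) and hypothetical catalog names and URL. It illustrates the value's shape only and is not how the addon reads its configuration.

```python
def parse_simple_properties(text: str) -> dict:
    """Parse 'key = value' or 'key: value' lines; skip blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        # Split at whichever separator (= or :) appears first on the line.
        positions = [i for i in (line.find("="), line.find(":")) if i != -1]
        if not positions:
            continue
        cut = min(positions)
        props[line[:cut].strip()] = line[cut + 1:].strip()
    return props


def catalog_names(props: dict) -> list:
    """Split the comma-separated brx.search.catalogs value into clean names."""
    raw = props.get("brx.search.catalogs", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


if __name__ == "__main__":
    sample = """
    # placeholder values for illustration only
    brx.search.uri: https://example.invalid/search
    brx.search.cache.enabled: false
    brx.search.catalogs: content_en, content_nl
    """
    print(catalog_names(parse_simple_properties(sample)))
```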

 
