SQL Data Store
Introduction
The Relevance Module requires a SQL database for persisting visitor data, request log data, and statistics data.
Supported SQL database engines
To provide a high performance and reliable storage solution that supports accumulation of a lot of data, the SQL store database engine must support SQL MERGE (UPSERT) statements.
The Relevance SQL Data Store currently supports three SQL database engines for production usage:
SQL Database Engine |
---|
MySQL (also through Amazon RDS) |
Oracle (also through Amazon RDS) |
PostgreSQL |
For development the embedded H2 database engine is used, just like for the repository.
The Relevance SQL database may be served by the same SQL database server as configured for the repository if one of the above database engines is used. Otherwise a separate SQL database server needs to be setup for this purpose.
It also is advised to use a separate database (schema) for the Relevance data, not to store it in the same database (schema) as used for the repository data, because the vastly different performance and storage characteristics between these two usages.
Configuration
A Relevance SQL Store connects to its database through a JNDI data source lookup which needs to be defined on container level, e.g. Apache Tomcat.
Similar to how a JNDI data source for the CMS repository database is configured, a separate JNDI data source for the Relevance SQL data store needs to be configured in the container environment.
An example MySQL-specific JNDI data source definition, to be added to the Tomcat conf/context.xml is:
<Resource name="jdbc/targetingDS" auth="Container" type="javax.sql.DataSource" maxTotal="100" maxIdle="10" initialSize="10" maxWaitMillis="10000" testWhileIdle="true" testOnBorrow="false" validationQuery="SELECT 1" timeBetweenEvictionRunsMillis="10000" minEvictableIdleTimeMillis="60000" username="DBUSER" password="DBPASSWORD" driverClassName="com.mysql.cj.jdbc.Driver" url="jdbc:mysql://DBHOST:DBPORT/DBNAME?characterEncoding=utf8"/>
In the above example the DBUSER, DBPASSWORD, DBHOSTDBPORT, and DBNAME should be adjusted for the specific database used.
Furthermore, the database JDBC driver jar needs to be provided in the Tomcat common (shared) classloader.
This Relevance-specific JNDI Datasource, in the above example configured with the default name jdbc/targetingDS, then can be used to as lookup in the SQL Store specific configurations.
This JNDI data source name must be configured for each Relevance SQL store (targetingdata, requestlog, statistics), like below default bootstrapped configuration:
/targeting:targeting/targeting:datastores/targeting:targetingdata: dataSource: jdbc/targetingDS targeting:storefactoryclass: com.onehippo.cms.targeting.storage.sql.DelegatingSqlStoreFactory /targeting:targeting/targeting:datastores/targeting:requestlog: dataSource: jdbc/targetingDS targeting:storefactoryclass: com.onehippo.cms.targeting.storage.sql.DelegatingSqlStoreFactory /targeting:targeting/targeting:datastores/targeting:statistics: dataSource: jdbc/targetingDS targeting:storefactoryclass: com.onehippo.cms.targeting.storage.sql.DelegatingSqlStoreFactory
The above-configured targeting:storefactoryclass com.onehippo.cms.targeting.storage.sql.DelegatingSqlStoreFactory will look up the configured JNDI data source and from its connection dynamically derive the used database engine. It thereafter delegates to a database engine specific SQLStoreFactory instance.
The SQL Stores then will automatically create needed database tables on first access, named visitors, requestlog, and personastatistics respectively.
Optionally these database table names can be prefixed by specifying additional property tablePrefix.
A configuration like for example:
/targeting:targeting/targeting:datastores/targeting:targetingdata: dataSource: jdbc/targetingDS tablePrefix: hippo_ targeting:storefactoryclass: com.onehippo.cms.targeting.storage.sql.DelegatingSqlStoreFactory
will result in using hippo_visitors as table name.
As long as the JNDI data source name(s) are not changed nor the tablePrefix(es) the above configuration will be valid for either development, test, acceptance or production environment: which database is used and with which settings is fully derived from the application container provided JNDI data source configuration.
For completeness, the table below lists all available configuration properties:
Property | Type | Default Value | Description |
dataSource | String | n/a | JNDI data source as configured at container level. |
tablePrefix | String | empty string | Optional table name prefix. |
targeting:storefactoryclass | String | n/a | Fully qualified name of store factory Java class. |
maxAgeDays | long |
35 for requestlogs 60 fr targetingdata |
Records older than this are deleted. 0 means records are never deleted. |
cleanupJobCronTrigger1 | String | n/a | Valid cron expression defining the interval at which data store cleanup jobs run. The cleanup jobs will only run if maxAgeDays is greater than 0. If the cleanupJobCronTrigger property is absent then the jobs execute with a fixed delay of one hour. |
1 Available since version 13.4.0.
Operational processing parameters which are not data store specific, like the number of synchronous retrieve or asynchronous store threads and retrieve timeout, can be configured through a separate Visitor Service configuration.
Example development configuration
For initial development setup using an embedded H2 database, the following example JNDI data source configuration can be used to copy/paste in the (archetype-based) implementation project's conf/context.xml:
<Resource name="jdbc/targetingDS" auth="Container" type="javax.sql.DataSource" maxTotal="100" maxIdle="10" initialSize="10" maxWaitMillis="10000" testWhileIdle="true" testOnBorrow="false" validationQuery="SELECT 1" timeBetweenEvictionRunsMillis="10000" minEvictableIdleTimeMillis="60000" username="sa" password="" driverClassName="org.h2.Driver" url="jdbc:h2:${repo.path}/targeting/targeting;MVCC=TRUE"/>
The above configuration will store the targeting H2 database as sibling of the repository database under the same storage folder (${repo.path}).
The H2 database jar doesn't need to be deployed with Cargo as this already is done by default.
If one of the other supported databases is used for local development then configure the Maven Cargo Plugin in your project root pom.xml (within the cargo.run profile) to deploy the specific database driver to the Cargo "extra" classpath (Tomcat common/lib folder).
For example, for MySQL add the following:
... <profile> <id>cargo.run</id> <dependencies> <dependency> <groupId>com.mysql</groupId> <artifactId>mysql-connector-j</artifactId> <version>${mysql-connector.version}</version> <scope>provided</scope> </dependency> </dependencies> ... <build> <plugins> ... <plugin> <groupId>org.codehaus.cargo</groupId> <artifactId>cargo-maven2-plugin</artifactId> <configuration> ... <container> <dependencies combine.children="append"> <dependency> <groupId>com.mysql</groupId> <artifactId>mysql-connector-j</artifactId> <classpath>extra</classpath> </dependency> ...