HST Page Caching (Context-Aware Cache)
Bloomreach Experience Manager's delivery tier (HST) includes a page cache valve. Its primary focus is:
- Make sure hotspot pages can be served at very high volumes and frequently from cache.
- Make sure thundering herds are less harmful, because the page cache is blocking: only a single request per page cache key is processed at a time. All concurrent requests for the exact same page are blocked and then served the response rendered by that single request.
The goals above explicitly mention hotspot pages. The reason is that the HST page cache is not suited for 'fixing' poorly performing pages (for example, pages that have to wait on very slow server-side REST calls to remote services). The primary goal of the HST page cache is to make pages that already perform well blisteringly fast.
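The blocking behavior can be illustrated with a minimal sketch. This is not the HST implementation: the class BlockingPageCacheSketch, its get method and the renderer function are hypothetical names, and eviction and time-to-live are left out. It only shows the idea that the first request for a cache key renders the page while concurrent requests for the same key wait for that result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Illustrative only: a tiny "blocking" cache. This is NOT the HST page cache;
 * all names are hypothetical.
 */
final class BlockingPageCacheSketch {

    // One future per cache key; concurrent callers for the same key share it.
    private final ConcurrentMap<String, CompletableFuture<String>> pages =
            new ConcurrentHashMap<>();

    String get(String cacheKey, Function<String, String> renderer) {
        CompletableFuture<String> created = new CompletableFuture<>();
        CompletableFuture<String> existing = pages.putIfAbsent(cacheKey, created);
        if (existing != null) {
            // Another request is rendering (or has rendered) this page: wait for its result.
            return existing.join();
        }
        try {
            // This request registered the future first, so it renders the page once.
            String page = renderer.apply(cacheKey);
            created.complete(page);
            return page;
        } catch (RuntimeException e) {
            // Do not keep a failed rendering; waiting requests receive the error.
            pages.remove(cacheKey, created);
            created.completeExceptionally(e);
            throw e;
        }
    }
}
```

Calling get twice with the same key invokes the renderer only once; the second call returns the stored response.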
An Enterprise Caching module is available which supports, in addition to the open source (first level) page caching, also second level page caching and stale page caching, making the page cache much more powerful and cluster-aware.
The page caching works well for personalized sites as well, for example through targeting.
Serving 20.000+ pages per second
The HST page cache serves more than 20.000 pages per second once a page is cached. This is an enormous throughput, especially considering that even for a request with a cached response, the HST still does the following:
- Matches the host, mount and sitemap item.
- Looks up the component configuration and checks whether the component tree is cacheable.
- Executes all the initialization valves, all the processing valves up to the pageCachingValve, and all the cleanup valves.
Because the HST design scales very well up and out, page caching can serve such a large throughput while still doing all the processing mentioned above.
Seamlessly integrated with personalized pages
When using Hippo Relevance to support personalized pages, the page cache still works, because the computed profile for the visitor is included in the cache key.
By default, page caching is not enabled. It can be enabled at runtime in the repository. Caching is always disabled for the following requests:
- action URLs
- preview sites
- sites in cms channel manager / template composer
- requests that are processed with subjectbasedsession.
The caching valve is currently included in the following pipelines:
- DefaultSitePipeline (normal HTML page rendering)
- JaxrsRestContentPipeline (see RESTful JAXRS component support)
- JaxrsRestPlainPipeline (see RESTful JAXRS component support)
- restApiPipeline (see Content REST API)
- PageModelPipeline (see SPA++ Page Model JSON API)
Caching for normal live sites can be switched on and off on different levels of HST configuration. It can be switched on/off per:
- Hosts, Host, Mount or SitemapItem configuration.
- HST component configuration
A rendered response is cacheable if and only if the matched sitemap item is cacheable and the root component that the matched sitemap item points to is cacheable.
Cacheable configuration on Hosts, Host, Mount and SitemapItem level
To switch on caching globally for all pages of all channels, set the property hst:cacheable=true on hst:hosts. Every hst:host, sub host, mount, sub mount and sitemap item then inherits this property. When the property is not configured on hst:hosts, it defaults to false. Every host, sub host, mount, sub mount or sitemap item can override the cacheable property, and everything below the overriding item inherits the overridden value.
For example, assume your configuration looks like:
/hst:hosts:
  hst:cacheable: false
  /dev:
    /localhost:
      hst:cacheable: true
      /hst:root:
        /nl:
          hst:cacheable: false
    /127.0.0.2:
      /hst:root:
        /nl:
          hst:cacheable: true
In this example, both localhost/hst:root and 127.0.0.2/hst:root/nl have caching set to true, while localhost/hst:root/nl and 127.0.0.2/hst:root have caching set to false.
In a sitemap, whether items are cacheable by default depends on the hst:mount: if the mount is cacheable, the sitemap items are cacheable by default.
For example, assume the mount has hst:cacheable = true, and the sitemap looks as follows:
/hst:sitemap:
  /home:
  /news:
    /*:
      hst:cacheable: false
      /*:
        /**.html:
          hst:cacheable: true
Then the items home, news and news/*/*/**.html are cacheable, and the items news/* and news/*/* are uncacheable.
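The inheritance rule can be sketched as a walk up the configuration tree: a node uses its own hst:cacheable value if it has one, otherwise the value of the nearest ancestor that has one, otherwise false. The class ConfigNode below is a hypothetical model for illustration only and is not part of the HST API.

```java
/**
 * Illustrative model of resolving an inherited hst:cacheable value.
 * ConfigNode is a hypothetical class, not part of the HST API.
 */
final class ConfigNode {
    private final ConfigNode parent;   // null for the top-level node
    private final Boolean cacheable;   // null means "not configured on this node"

    ConfigNode(ConfigNode parent, Boolean cacheable) {
        this.parent = parent;
        this.cacheable = cacheable;
    }

    /** Creates a child node; pass null when the child does not set hst:cacheable. */
    ConfigNode child(Boolean cacheable) {
        return new ConfigNode(this, cacheable);
    }

    /** Own value if configured, else the nearest configured ancestor's value, else false. */
    boolean isCacheable() {
        for (ConfigNode node = this; node != null; node = node.parent) {
            if (node.cacheable != null) {
                return node.cacheable;
            }
        }
        return false;
    }
}
```

Modeling the sitemap example with a mount created as new ConfigNode(null, true), the nodes for home and news resolve to cacheable, news/* and news/*/* resolve to uncacheable, and news/*/*/**.html resolves to cacheable again.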
Cacheable configuration on HST component configuration
By default, every hst:component is cacheable. When an hst:component needs to be uncacheable, you can indicate this through hst:cacheable=false.
A composite tree of HST components is only cacheable if every HST component in the composite structure that will be rendered during the request is cacheable.
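As a minimal sketch of this rule, the check can be written as a recursive walk over the component tree. ComponentSketch is a hypothetical stand-in for an hst:component configuration and is not the HST API; it also assumes that every child in the tree is rendered during the request.

```java
import java.util.List;

/** Hypothetical stand-in for an hst:component configuration node; not the HST API. */
final class ComponentSketch {
    private final boolean cacheable;
    private final List<ComponentSketch> children;

    ComponentSketch(boolean cacheable, ComponentSketch... children) {
        this.cacheable = cacheable;
        this.children = List.of(children);
    }

    /** True only if this component and every descendant is cacheable. */
    boolean isTreeCacheable() {
        if (!cacheable) {
            return false;
        }
        for (ComponentSketch child : children) {
            if (!child.isTreeCacheable()) {
                return false;
            }
        }
        return true;
    }
}
```

A single descendant with hst:cacheable=false is therefore enough to make the whole page uncacheable.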
Marking pages during rendering as uncacheable
There are two situations in which a page that is marked as cacheable by its sitemap item and HST component configuration still won't be cached:
- Setting one or more non-caching headers: if, during rendering (HstComponent / jsp / ftl), a developer sets a Pragma no-cache header, a Cache-Control no-cache header or an Expires of 0 or lower on the response, the rendered response won't be cached.
- Setting cookies on the response: if a developer sets cookies on the response during rendering (HstComponent / jsp / ftl), the rendered response won't be cached. Note that cookies set by the HST framework don't influence caching. Also note that HttpSession cookies do not influence caching: if you create an HttpSession as a developer and the page must therefore not be cached, you should either set a non-caching header or mark the HST component as uncacheable.
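The two rules above can be summarized in a small decision function. This is a simplified model and not the HST implementation: ResponseCacheabilitySketch and its parameters are hypothetical, header values are matched naively, and an Expires value is only interpreted when it is a plain number.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/**
 * Illustrative check mirroring the two rules above. Hypothetical helper,
 * not the HST implementation; header parsing is deliberately simplified.
 */
final class ResponseCacheabilitySketch {

    /**
     * @param headers          response headers set during rendering (name to value)
     * @param developerCookies names of cookies set by component or template code
     *                         (framework and HttpSession cookies are not included)
     */
    static boolean isCacheable(Map<String, String> headers, List<String> developerCookies) {
        if (!developerCookies.isEmpty()) {
            return false; // rule 2: cookies set during rendering
        }
        for (Map.Entry<String, String> header : headers.entrySet()) {
            String name = header.getKey().toLowerCase(Locale.ROOT);
            String value = header.getValue().trim().toLowerCase(Locale.ROOT);
            boolean noCache = (name.equals("pragma") || name.equals("cache-control"))
                    && value.contains("no-cache");
            if (noCache || (name.equals("expires") && isZeroOrNegative(value))) {
                return false; // rule 1: non-caching header
            }
        }
        return true;
    }

    private static boolean isZeroOrNegative(String value) {
        try {
            return Long.parseLong(value) <= 0;
        } catch (NumberFormatException e) {
            return false; // for example an HTTP date; not interpreted in this sketch
        }
    }
}
```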
Page Cache Configuration options
The following hst config properties (see HST container configuration) can be set and have the following values by default:
pageCache.maxSize = 1000
pageCache.timeToLiveSeconds = 3600
pageCache.clearOnContentChange = true (since CMS 11.2.0)
pageCache.clearOnHstConfigChange = true (since CMS 11.2.0)
The maxSize is the maximum number of page responses kept in memory (LRU eviction once the maximum is reached) and is 1000 by default. If you make this value larger, make sure that your application has enough memory. The timeToLiveSeconds is the maximum time a cached response is valid. After this time, a page is always recreated.
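To make the meaning of maxSize and timeToLiveSeconds concrete, here is a minimal sketch of an LRU cache with a time-to-live, built on java.util.LinkedHashMap. It is not the cache implementation used by the HST: LruTtlCacheSketch is a hypothetical class, it is not thread-safe, and the current time is passed in explicitly to keep the example deterministic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative LRU cache with a time-to-live. Hypothetical class, not the
 * HST page cache. Not thread-safe.
 */
final class LruTtlCacheSketch {

    private static final class CachedPage {
        final String page;
        final long createdAtMillis;

        CachedPage(String page, long createdAtMillis) {
            this.page = page;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final long timeToLiveMillis;
    private final Map<String, CachedPage> entries;

    LruTtlCacheSketch(int maxSize, long timeToLiveSeconds) {
        this.timeToLiveMillis = timeToLiveSeconds * 1000L;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<String, CachedPage>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedPage> eldest) {
                return size() > maxSize; // evict the LRU entry once maxSize is exceeded
            }
        };
    }

    void put(String key, String page, long nowMillis) {
        entries.put(key, new CachedPage(page, nowMillis));
    }

    /** Returns the cached page, or null if it is absent or older than the time-to-live. */
    String get(String key, long nowMillis) {
        CachedPage cached = entries.get(key);
        if (cached == null) {
            return null;
        }
        if (nowMillis - cached.createdAtMillis >= timeToLiveMillis) {
            entries.remove(key); // expired: the page has to be rendered again
            return null;
        }
        return cached.page;
    }

    int size() {
        return entries.size();
    }
}
```

With maxSize = 2, storing a third page evicts whichever of the first two was used least recently, and any page older than timeToLiveSeconds is treated as a miss.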
The clearOnContentChange property means that on any content change, the entire cache is flushed. The reason for this is that the HST cannot know on which pages the changed, updated, new or deleted content should be shown; hence, the entire cache is flushed by default on any content change. We could have chosen not to clear the entire cache, but instead, for example, to use a timeToLiveSeconds of 300 (5 minutes) and accept that a content change becomes visible on a live site after at most 5 minutes. There is however one downside to this strategy: in a clustered setup you could end up with two cluster nodes serving a different page for the same URL for some time. This is in general unacceptable, unless your load balancer uses node affinity anyway, making sure that the same visitor is always directed to the same cluster node. In this scenario, it might be acceptable that changes become visible after, say, 5 minutes. In that case, configure:
pageCache.timeToLiveSeconds = 300
pageCache.clearOnContentChange = false
Obviously, you should never configure clearOnContentChange = false when you use Enterprise Caching with cluster-wide second level page caching enabled, as the second level page caching itself takes care of making sure the entire cluster always serves the same cached responses.
The clearOnHstConfigChange property works the same as clearOnContentChange, but for changes in HST configuration.