Search documents within the CMS
Use the search box above the folders to search documents, images or assets for one or more words. Type a word in the search box and press Enter; the results are shown in the document listing. The search is limited to the documents below the folder currently selected in the folder view, which is also indicated in the grey area below the search box after a search. To broaden the search to all documents in the CMS, click 'All'.
Clicking the cross in the search box returns the CMS to browse mode. You can tell which mode you are in by whether the search box shows the magnifying glass icon or the cross.
Searching for documents within the CMS
The document browse perspective contains a search box that returns matching documents in the currently selected folder and all its child folders. Some of the search behavior is configurable; see the Configure CMS Search page.
The text entered in the search box is split into tokens (words) separated by spaces. The tokens are interpreted as follows:
- All tokens are ANDed
- The characters '?' and '*' are removed from all tokens
- Tokens of at least the minimal length (3 characters by default, configurable) get a * wildcard appended, so they match all documents containing a word that starts with the token
- Tokens shorter than the minimal length do not get a * wildcard appended. Only documents that contain the token as a whole word match
- Stop words (the, a, an) will be ignored
- Searches are case-insensitive ("Hippo" and "hiPPO" give the same hits)
- Searches are fully diacritics (accents) agnostic ("très" gives the same hits as "tres"). Older versions are only partly diacritics agnostic: only for whole words, so agnostic for "très" but not for the partial word "trè"
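The interpretation rules above can be sketched in a few lines of Python. This is only an illustration of the listed rules, not the CMS implementation: the function names `interpret_query` and `strip_diacritics` are hypothetical, and the order in which the rules are applied is an assumption.

```python
import unicodedata
from typing import List

STOP_WORDS = {"the", "a", "an"}  # the stop words listed above
MIN_WILDCARD_LENGTH = 3          # default from the text; configurable in the CMS


def strip_diacritics(text: str) -> str:
    """Remove combining accent marks, so that "très" becomes "tres"."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def interpret_query(query: str, min_length: int = MIN_WILDCARD_LENGTH) -> List[str]:
    """Turn raw search box text into a list of terms that are all required (ANDed).

    Hypothetical helper: the rule order used here is an assumption.
    """
    terms = []
    for token in query.split():                          # tokens are separated by spaces
        token = token.replace("?", "").replace("*", "")  # '?' and '*' are removed
        token = strip_diacritics(token.lower())          # case and accents are ignored
        if not token or token in STOP_WORDS:             # stop words are dropped
            continue
        if len(token) >= min_length:                     # long enough: prefix match
            token += "*"
        terms.append(token)
    return terms
```

Under these assumptions, `interpret_query("The très h*ip? go")` yields `["tres*", "hip*", "go"]`: the stop word is dropped, the accent and the special characters are removed, and only "go" stays without a wildcard because it is shorter than three characters.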
Search query examples:
matches documents containing the word "hippo"
does not match documents containing the word "shipment"
matches documents containing both the words "hippo" and "things"
does not match documents containing only the word "hippo" or only the word "things"
matches documents containing "hippopotamus"
does not match documents containing the word "government" ("go" gets no wildcard appended because it is too short)
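The matching behavior in these examples can be imitated with a small Python sketch. It assumes the terms have already been interpreted (a trailing * marks a prefix term), and it splits document text on whitespace only, which is a simplification. The names `term_matches` and `document_matches` are hypothetical and not part of the CMS.

```python
def term_matches(term, word):
    """A term ending in * matches any word with that prefix; other terms must equal the word."""
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term


def document_matches(terms, text):
    """All terms are ANDed: each one must match at least one word in the text."""
    words = text.lower().split()  # simplification: whitespace splitting only
    return all(any(term_matches(term, word) for word in words) for term in terms)
```

With this sketch, the prefix term `hip*` matches a text containing "hippo" or "hippopotamus" but not one containing only "shipment", because the wildcard is added at the end of the token only. The short term `go` does not match "government", since it carries no wildcard.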
How to search for two words that are OR-ed
If you write OR (in capitals) between two words, documents that contain at least one of the words are found. Documents that contain both words generally score higher and therefore appear nearer the top of the results. However, this also depends on the length of the documents (search the web for "Lucene scoring" if you want to know more).
OR Search query example
"hip OR thing"
- matches documents that contain "hip" or "thing" or words that start with "hip" or "thing"
By default, the best scoring documents are shown first in the results. Words that match exactly score better than words that match partially (i.e. the ones with the automatically added * wildcard).
In case of OR searches, documents that contain all words in general score higher than documents that contain some of the words. However, the exact score of a document also depends on the length of the document and how many times the specific words occur in the entire repository.
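As a rough illustration of the OR behavior, the Python sketch below counts how many of the OR-ed words match a document. It assumes the query consists only of words joined by an uppercase OR, and the count is a crude stand-in for ranking, not the scoring used by the search engine, which also takes document length and word frequency into account. All function names are hypothetical.

```python
def expand(token, min_length=3):
    """Lowercase the token and append * when it is long enough (default 3, as described above)."""
    token = token.lower()
    return token + "*" if len(token) >= min_length else token


def term_matches(term, word):
    """A term ending in * matches any word with that prefix; other terms must equal the word."""
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term


def count_matching_alternatives(query, text):
    """Count how many OR-ed words match the text; 0 means the document is not found.

    Assumption: every word in the query is joined by an uppercase OR.
    """
    words = text.lower().split()
    alternatives = [expand(token) for token in query.split() if token != "OR"]
    return sum(1 for term in alternatives if any(term_matches(term, w) for w in words))
```

For the query "hip OR thing", a text containing both "hippo" and "things" gets a count of 2, a text containing only "hippo" gets 1, and a text containing only "shipment" gets 0 and is therefore not a hit.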
Hyphens and punctuation
The search engine interprets search queries as follows:
- Words are split at punctuation characters, and the punctuation is removed. However, a dot that is not followed by whitespace is considered part of a search term.
- Words are split at hyphens, unless there is a number in the search term. In that case the whole term is interpreted as a "product number" and is not split.
- Email addresses and internet hostnames are recognized and each becomes a single search term.
As a result, hyphens cannot be searched for, unless the word containing a hyphen also contains a number. For example, searching for "foo-bar" is not possible, but searching for "foo-bar-5" is. However, it is possible to search for the part of a word after a hyphen. For example, searching for "bar" will match documents containing "foo-bar".
Punctuation cannot be searched for either, except for dots within words. For example, searching for "foo,bar" is not possible, but searching for "foo.bar" is.
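The splitting rules in this section can be approximated with the following Python sketch. It is a simplified imitation for illustration only, not the tokenizer used by the search engine: the function name `tokenize`, the email pattern and the set of edge punctuation are all assumptions.

```python
import re

# Assumption: a deliberately simple email pattern, good enough for illustration.
EMAIL = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def tokenize(text):
    """Split text into search terms, loosely following the rules described above."""
    terms = []
    for chunk in text.split():
        # Punctuation at the edges (such as a dot followed by whitespace) is dropped.
        chunk = chunk.strip(".,;:!?\"'()[]")
        if not chunk:
            continue
        if EMAIL.match(chunk):
            terms.append(chunk.lower())        # an email address stays one term
            continue
        # Split at punctuation, but keep dots and hyphens for the checks below.
        for part in re.split(r"[^\w.\-]+", chunk):
            part = part.strip(".-")
            if not part:
                continue
            if "-" in part and any(ch.isdigit() for ch in part):
                terms.append(part.lower())     # "product number": not split
            else:
                for piece in part.split("-"):  # ordinary words split at hyphens
                    piece = piece.strip(".")
                    if piece:
                        terms.append(piece.lower())
    return terms
```

In this sketch "foo-bar" becomes the two terms "foo" and "bar", while "foo-bar-5" stays a single term. Likewise "foo,bar" is split into two terms, but "foo.bar" and a hostname such as "www.example.com" are kept whole because their dots are not followed by whitespace.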