Users need to be able to search for features and feature values. Features must therefore be indexed.

Introduction

censhare indexes a feature based on its value type and the index configuration. The index is necessary to return meaningful search results. For example, users should be able to search for individual words or partial words in a string or a text. For referenced assets, it should be possible to search for the assets themselves, not for the resource key. A search for a date should find not only exact matches but also time periods. In the censhare Admin Client, you can define how censhare indexes a feature in the censhare database (cdb).

The "Custom configuration" area

You configure the index for a feature in the "Index configuration" area. Click the "Add" button. A new "Custom configuration" area appears. A feature can contain multiple custom indexes, for example for different servers.

In the system features, you can also change the index configuration by creating a "Custom configuration". To do so, open the Admin Mode in the censhare Admin Client: go to the "censhare Admin Client" menu and open the dialog "About censhare Admin Client". Then press and hold the ALT key and close the dialog by clicking OK. Below the "censhare default configuration" area you will now see the "+" button. Click it to add another "Custom configuration" area. We recommend accepting all of the censhare default configuration settings in the custom configuration and then making your desired additions or changes.


Index configuration

Field

Mandatory

Description

Server names

No

This field defines whether the index configuration applies to a specific server. Using the "Add" button at the end of the dialog, you can create further index configurations and assign them to other servers.

Disable on this server

No

In certain cases, an index is not required on a server because users do not log in to it. This is the case, for example, if a server acts as a proxy in a DMZ or runs only technical processes. To increase performance and reduce memory usage, indexing can be disabled. In this case, select in the "Server names" field the server on which no index is to be created for this feature. If you want to disable the index on several servers, you must add a separate "Custom configuration" for each server.

Rebuild Step

No

In this field, you can set the priority of the index when the censhare database (cdb) is rebuilt. 1 is the highest priority. The higher the value, the later the index is created. If the field is empty, the full-text index is created at the very end.

All indexes that are important for the operation of censhare already have the value 1 by default. Once censhare has created the indexes for all features with the value 1, the cdb is available and users can work with censhare. All other indexes are then created step by step during ongoing operation. Be careful when assigning priority level 1. The more features you assign this value to, the longer it takes for the cdb to become available again. The advantage of creating indexes step by step can thus be nullified.

The "Rebuild Step" column of the Features table shows which features have the value 1. For all others, there is no further prioritization at the moment.

Tip: If you have changed only one feature that requires a cdb update, you do not need to rebuild the whole cdb. In this case, you can rebuild just the index for the changed feature using the server action "Rebuild embedded database index". If there is a full-text index, it also needs to be updated.
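The ordering described for "Rebuild Step" can be pictured with a minimal Python sketch. It is not censhare code: the feature keys are hypothetical, and it assumes that features with an empty rebuild step are simply processed last.

```python
from typing import Dict, List, Optional


def rebuild_order(steps: Dict[str, Optional[int]]) -> List[str]:
    """Return feature keys in the order their indexes would be rebuilt."""
    # Lower step numbers come first; features without a step (None) sort last.
    return sorted(steps, key=lambda key: (steps[key] is None, steps[key] or 0, key))


# Hypothetical feature keys and rebuild steps, for illustration only.
steps = {
    "example:asset.name": 1,
    "example:keyword": 3,
    "example:description": None,
    "example:asset.type": 1,
}

order = rebuild_order(steps)
# In this model, the cdb becomes available once these indexes exist.
needed_before_available = [key for key in order if steps[key] == 1]
```

Every feature added to the step 1 group lengthens the list that must be finished before users can work again, which is why the value 1 should be assigned sparingly.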

Type

No

In the "Type" field you can change the behavior of the index if an alternative configuration to the standard is desired.

Use ISNULL column

No

When you select this field, an additional column is created. It only stores whether the feature is not defined (ISNULL). The column accelerates queries that check that a feature does not exist, but it also uses more memory.

Use NOTNULL column

No

When you select this field, an additional column is created. It only stores whether the feature is defined (NOTNULL). The column accelerates queries that check whether a feature exists, but it also uses more memory.
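The trade-off behind the ISNULL and NOTNULL columns can be sketched in Python as two precomputed sets of asset IDs. This is an illustrative model only: the asset data and the feature key are hypothetical, and the actual column layout in the cdb may differ.

```python
# Hypothetical assets: asset ID -> features that are set on the asset.
assets = {
    1: {"example:color": "red"},
    2: {},
    3: {"example:color": "blue"},
    4: {},
}


def build_presence_columns(assets, feature):
    """Precompute which assets have the feature and which do not."""
    not_null = {asset_id for asset_id, feats in assets.items() if feature in feats}
    is_null = set(assets) - not_null
    return is_null, not_null


# Built once at indexing time; this is the extra memory the columns cost.
is_null, not_null = build_presence_columns(assets, "example:color")

# At query time, "feature is not defined" becomes a lookup instead of a scan.
assets_without_color = is_null
```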

Use sorting columns

No

If this field is activated, the sorting is lexicographical. For instance, umlauts and upper and lower case are taken into account. This is important for country-specific sorting. If you use a sorting column, you must also select the field "Use Sort values".

Use Sort values

No

If you select this field, the sorting of large search results is accelerated. Without this parameter, sorting takes longer because censhare first has to load all records in order to extract the values. If you select the field, the values for this feature are also stored separately. However, this consumes additional storage space. Select this field if you expect the search results for this feature to become very large.
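The following Python sketch illustrates the general idea behind sorting columns and sort values: a sort key is computed once per value and stored, so that sorting does not need to derive it from the full records. The key function is a deliberately simplified stand-in (it strips accents and ignores case). Real country-specific collation rules are more involved, and how censhare derives its keys is not described here.

```python
import unicodedata


def sort_key(value: str) -> str:
    """Simplified collation key: drop accents, ignore upper and lower case."""
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


names = ["Zebra", "Ärger", "apfel", "Österreich", "Ober"]

# Stored once at indexing time (extra storage, faster sorting later).
sort_values = {name: sort_key(name) for name in names}

by_code_point = sorted(names)
by_sort_value = sorted(names, key=sort_values.__getitem__)
```

Sorting by code point places all capitalized ASCII names first and the umlauts last, while sorting by the stored keys yields the order a reader would expect in an alphabetical list.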

Hierarchical feature mode

No

The field defines for hierarchical features how a user can select a specific feature for searching. For more information on creating hierarchical features, see the section

Store flat: No feature hierarchy is presented. Users can directly select subordinate features.

Store hierarchical: The feature hierarchy is presented. Users must first select a higher-ranking feature. After that, subordinate features can be selected from a different list.

Store flat or hierarchical: This option combines the two possibilities. Please note that double the amount of memory is needed for this.

For example, for an "Address" it makes sense to organize the feature hierarchically. Users can select child features such as "Business" or "Private" and then, on another hierarchy level, search for a feature like "Street" or "Postal code". If, on the other hand, "Store flat" is selected, users can search directly for a street. However, it is then not possible to search specifically for a street in a private address.
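The difference between the three modes can be modeled in Python with index keys that either include or omit the parent feature. This is a conceptual sketch with hypothetical data, not the actual cdb storage format.

```python
from collections import defaultdict

# Hypothetical "Address" feature with two child groups.
address = {
    "Business": {"Street": "Main Street 1", "Postal code": "10115"},
    "Private": {"Street": "Garden Lane 7", "Postal code": "80331"},
}


def index_entries(asset_id, features, mode):
    """Build index keys for one asset in 'flat', 'hierarchical' or 'both' mode."""
    index = defaultdict(set)
    for parent, children in features.items():
        for child, value in children.items():
            if mode in ("flat", "both"):
                # The parent is dropped: "Street" is searchable directly.
                index[(child, value)].add(asset_id)
            if mode in ("hierarchical", "both"):
                # The parent is kept: "Private" > "Street" is searchable.
                index[(parent, child, value)].add(asset_id)
    return dict(index)


flat = index_entries(42, address, "flat")
hierarchical = index_entries(42, address, "hierarchical")
both = index_entries(42, address, "both")
```

In this model the combined mode holds twice as many entries as either single mode, which mirrors the doubled memory requirement noted above.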


Relevance mode


No

This field selects the algorithm that calculates the relevance of a feature for the search.

By default, censhare uses the BM25 algorithm. It takes into account how often a term occurs in a document and how often it occurs across all documents. If a text contains the searched term several times, the relevance of the text increases. If the searched word occurs in many texts, it contributes less to the relevance. In contrast to BM25, the BM25flat algorithm does not take the frequency of words into account.
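To make the difference concrete, here is a small Python sketch of the textbook BM25 score for a single term, next to a "flat" variant. The parameters k1 = 1.2 and b = 0.75 are common defaults, not values taken from censhare, and the flat variant is an assumption about what BM25flat means: the term only counts as present or absent.

```python
import math


def idf(n_docs: int, doc_freq: int) -> float:
    """Inverse document frequency: rare terms weigh more than common ones."""
    return math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25(tf, doc_len, avg_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of one term in one document."""
    if tf == 0:
        return 0.0
    length_norm = k1 * (1 - b + b * doc_len / avg_len)
    return idf(n_docs, doc_freq) * tf * (k1 + 1) / (tf + length_norm)


def bm25_flat(tf, n_docs, doc_freq):
    """Assumed flat variant: ignores how often the term occurs in the document."""
    return idf(n_docs, doc_freq) if tf > 0 else 0.0


# A term found in 10 of 1000 documents, in a document of average length.
once = bm25(tf=1, doc_len=100, avg_len=100, n_docs=1000, doc_freq=10)
thrice = bm25(tf=3, doc_len=100, avg_len=100, n_docs=1000, doc_freq=10)
common = bm25(tf=1, doc_len=100, avg_len=100, n_docs=1000, doc_freq=500)
```

With standard BM25, repeating the term raises the score, while a term that appears in half of all documents scores lower than a rare one. The flat variant returns the same score regardless of how often the term is repeated.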

Sort by XPATH expression

No

Enter an XPath expression. Its result is used for sorting. However, evaluating the expression reduces sorting performance.
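Why sorting by an expression is costly can be seen in a small Python sketch. It uses the limited XPath subset of Python's `xml.etree.ElementTree` and a hypothetical, heavily simplified asset XML structure. The real censhare asset XML and XPath engine are not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified asset documents.
ASSET_XML = [
    "<asset id='1'><title>Gamma</title></asset>",
    "<asset id='2'><title>Alpha</title></asset>",
    "<asset id='3'><title>Beta</title></asset>",
]


def sort_by_path(assets, path):
    """Sort assets by the text that the path expression selects."""
    # The expression is evaluated once per asset at sort time,
    # which is the extra cost compared with a stored sort value.
    return sorted(assets, key=lambda asset: asset.findtext(path, default=""))


assets = [ET.fromstring(xml) for xml in ASSET_XML]
sorted_ids = [asset.get("id") for asset in sort_by_path(assets, "./title")]
```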


The "Notes" area

Here you can add information for the search engine, for example, for optimization.

Index types

The index type tells you what kind of data censhare indexes. Select the index type for a feature according to its feature value type:

Index type

Description

Default

This entry defines the standard behavior.

Hierarchical

Choose this entry if the value type of the feature is hierarchical.

Pair

This entry is selected when the value type for the feature consists of a pair. Be aware of the effect on performance for database queries for this type.

Coordinates

Choose this entry if the value type of the feature has coordinates.

Range

The entry is still available for compatibility reasons and should no longer be used.

Numeric TRIE

This entry is selected when the value type for the feature consists of a range. It replaces the "Range" value. "Numeric TRIE" works more efficiently than "Range". If you select this value, the additional "Decomposition in bits" field appears. It controls how the index tree for a search is built. The default value is 6 bits.
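One way to picture "Decomposition in bits" is the precision step used by trie-encoded numeric fields in search libraries: each value is indexed several times, each time with more low-order bits removed. The Python sketch below assumes that censhare's setting works along these lines, which the description above does not confirm. It handles non-negative 32-bit integers only.

```python
def trie_terms(value, bits_per_step=6, total_bits=32):
    """Return (shift, prefix) index terms for a non-negative integer."""
    if value < 0 or value >= 1 << total_bits:
        raise ValueError("value out of range for this sketch")
    # One term per level of the tree: full precision first, then coarser prefixes.
    return [(shift, value >> shift) for shift in range(0, total_bits, bits_per_step)]


terms = trie_terms(1000)

# All values from 960 to 1023 share one prefix at shift 6, so a range query
# covering that block can match a single coarse term instead of 64 exact values.
block_prefixes = {value >> 6 for value in range(960, 1024)}
```

In this model, a smaller step produces more terms per value (a larger index) but lets range queries be covered with fewer terms. A larger step does the opposite.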

Full-text index

Select this entry when the feature is to be indexed as a full text. If you select this entry, a range of additional configuration parameters is provided, as well as the additional section "Settings for fuzzy searches".

Virtual Fulltext

This entry combines multiple full-text indexes. For example, the quick search in censhare Web uses a virtual full-text index. When a search is executed, the different full-text indexes it combines are searched.
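Conceptually, a virtual full-text index can be thought of as a thin layer that forwards a search term to several underlying indexes and merges the hits. The Python sketch below models this with plain dictionaries. The index names and contents are hypothetical, and the real implementation also involves relevance ranking, which is omitted here.

```python
# Hypothetical full-text indexes: term -> IDs of assets containing the term.
name_index = {"report": {1, 2}, "annual": {1}}
description_index = {"report": {3}, "budget": {2, 3}}

# The virtual index stores no terms itself; it only refers to other indexes.
virtual_index = {"name": name_index, "description": description_index}


def search_virtual(indexes, term):
    """Search every underlying index and merge the matching asset IDs."""
    hits = set()
    for index in indexes.values():
        hits |= index.get(term.lower(), set())
    return hits


report_hits = search_virtual(virtual_index, "Report")
```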