How to rebuild the censhare embedded database (cdb) on a Linux operating system.


Introduction

A cdb rebuild is necessary when:

  • the configuration of the cdb or its indexes has changed.

  • new calculated features have been added.

If the reason for the cdb rebuild is not production-critical, schedule the rebuild outside main working hours!

Hints

  • The cdb is rebuilt from the connected database (Oracle or PostgreSQL). The "currversion" of the "asset" table is triggered by the table asset_ccn_counter.

  • Do not update the jvm/cdb configuration while the censhare Server is running during main working hours!

  • Deactivate the automatic CCN sync in the censhare Admin Client during the cdb rebuild at: Configuration/Modules/Administration/Embedded Database/Embedded database sync using CCNs (automatic).

  • Make sure that no deletion or archiving jobs run during the rebuild, and do not shut down the server. Otherwise the cdb will be incomplete and you have to start the rebuild from scratch.

  • cdb rebuilds can cause the deletion of content of online channel websites.
    Reason: The new online channel versions use the app server's cdb instead of the Oracle database to synchronize the content of the web servers. Because a rebuild starts with an empty cdb, the content on the delivery servers is also deleted and gradually reappears as the rebuild proceeds.
    Solution: Disconnect the RMI connection between the online channel satellite and the app server during the rebuild to prevent the deletion of the content on the web servers. This can be achieved by changing the URL in the satellite configuration file on the web servers to an invalid domain. Note that the functionality of the web servers might be limited during that time, because updates (e.g. user-generated content) cannot be written and commands cannot be called from the web servers.

  • Note down the old configuration values, as you have to restore them after the rebuild.
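
One way to keep the old configuration is to copy the config files before editing them. The following is a minimal sketch, not part of the censhare tooling: the function name backup_configs and the suffix .before-rebuild are hypothetical, and the file paths in the usage comment are the ones edited later in this article.

```shell
# Hypothetical helper: copies each given config file to <file>.before-rebuild.
# It refuses to overwrite an existing backup so the original values survive.
backup_configs() {
  for f in "$@"; do
    if [ -e "$f.before-rebuild" ]; then
      echo "backup already exists: $f.before-rebuild" >&2
      return 1
    fi
    cp -p "$f" "$f.before-rebuild" || return 1
  done
}

# Usage:
#   backup_configs ~/cscs/app/config/launcher.$CSS_ID.xml \
#                  ~/cscs/app/services/assetstore/config.$CSS_ID.xml
```

After the rebuild, the saved copies can be compared with the edited files (for example with diff) to see which values have to be changed back.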

The embedded DB infrastructure uses the following services:

  • AssetStoreService: This service manages the local storage of assets in cdb

  • AssetQueryService: This service manages the retrieval of assets from local storage via queries

  • IIndexService: This service manages direct access to inverted index storage for additional embedded database users like translation and terminology

Disable the “Content fulltext” flag to improve speed of cdb rebuilds

The Content (full text) feature indexes the content of asset files (for example, text documents). It can be searched from a field in the General tab of the Detail search in both web client and Java client, called “Content (full text)”.

To reduce the time of the cdb rebuild, you can temporarily disable this feature and restore it afterwards:

  1. In the censhare Admin Client, activate the Admin mode.

  2. Open the Master data/Features table and search for the censhare:text.content feature.

  3. To open the feature configuration dialog, double-click the entry.

  4. At the bottom of the feature configuration dialog, under Custom configuration / Fulltext index, disable the flag “Content fulltext”.

  5. Rebuild the cdb (and sync it to the remote servers, if applicable).

  6. Re-enable the flag “Content fulltext”.

  7. Enable the server action “Embedded database fulltext index sync” under Configuration / Administration / Embedded database on the Master server.

  8. Run the server action on the Master server, then on the other servers, enabling and disabling it on each server as you go. Note: It may be necessary to log off and on again to see the server action on the remote server. Make sure to sync the configuration to the remote servers. Alternatively, you can sync the cdb from the Master server once the full-text index sync is finished.

Increase jvm heap space

Before changing the jvm and cdb settings you need to check for the available RAM:

free -g | head -2
             total       used       free     shared    buffers     cached
Mem:            15          4         11          0          0          0

With 11GB free RAM, 10GB for jvm is sufficient.
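
The check can be scripted. The following is a minimal sketch that assumes a headroom of 1 GB is left to the operating system, as in the example above (11 GB free, 10 GB for the jvm). The function name suggest_heap_gb and the headroom rule are illustrative assumptions, not a censhare recommendation.

```shell
# Hypothetical helper: reads `free -g` output on stdin and prints a heap size
# in GB. Assumption: the "free" value is the 4th column of the "Mem:" line and
# 1 GB of it is kept as headroom. Exits with status 1 if less than 2 GB is free.
suggest_heap_gb() {
  awk '$1 == "Mem:" {
    gb = $4 - 1
    if (gb < 1) exit 1
    print gb
    exit
  }'
}

# Usage:
#   free -g | suggest_heap_gb
```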

Adapt the jvm settings:

vi ~/cscs/app/config/launcher.$CSS_ID.xml

Search for the active preconfigured jvm settings, marked with "current". For example:

<!-- current: machines with 8 or more threads 64Bit Java, RAM greater-equal 4GB -->

Active configuration:

<jvmarg value="-Xms2556m" enabled="true"/>
<jvmarg value="-Xmx2556m" enabled="true"/>

New configuration:

<jvmarg value="-Xms10g" enabled="true"/>
<jvmarg value="-Xmx10g" enabled="true"/>
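
Because the launcher file contains several preconfigured settings, it can help to list only the heap arguments that are actually enabled after editing. The following is a minimal sketch; the function name show_active_heap is hypothetical, and it assumes the attributes appear in the order value, enabled, as in the examples above.

```shell
# Hypothetical helper: prints the enabled -Xms/-Xmx jvmarg elements of a
# launcher XML file, one per line.
# Assumption: attribute order is value="..." enabled="true", as shown above.
show_active_heap() {
  grep -E -o '<jvmarg value="-Xm[sx][^"]*" enabled="true"/>' "$1"
}

# Usage:
#   show_active_heap ~/cscs/app/config/launcher.$CSS_ID.xml
```

If more than one -Xms or -Xmx line is printed, more than one preconfigured block is active and the file should be checked by hand.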

Increase data-cache and node-cache

Use up to half of the jvm heap for data-cache and a quarter for node-cache.

vi ~/cscs/app/services/assetstore/config.$CSS_ID.xml

With a jvm heap of 10GB, good values for data-cache and node-cache are 5GB and 2.5GB.

Default configuration:

node-cache-size-mb="300" data-cache-size-mb="800"

New configuration:

node-cache-size-mb="2500" data-cache-size-mb="5000"
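
The rule of half the heap for data-cache and a quarter for node-cache can be worked out as follows. This is a minimal sketch; the function name cache_sizes_mb is hypothetical, and it counts 1 GB as 1000 MB to match the values 5000 and 2500 used above.

```shell
# Hypothetical helper: prints the cache attributes for a jvm heap size in GB.
# Assumption: 1 GB is counted as 1000 MB, matching the 5000/2500 example.
cache_sizes_mb() {
  heap_gb=$1
  data_mb=$((heap_gb * 1000 / 2))   # half of the heap for data-cache
  node_mb=$((heap_gb * 1000 / 4))   # a quarter of the heap for node-cache
  printf 'node-cache-size-mb="%s" data-cache-size-mb="%s"\n' "$node_mb" "$data_mb"
}

# Usage:
#   cache_sizes_mb 10
```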

Save cdb. Set prerequisites

Stop the censhare Server.

censhare.rc stop

Keep a copy of the existing cdb by moving the folder aside. The server automatically creates ~/work/cdb during server start if it does not exist.

mv ~/work/cdb ~/work/cdb.savedbeforerebuild
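
If a saved folder from an earlier rebuild still exists, a plain mv would move the current cdb folder into it instead of renaming it. The following is a minimal sketch of a guarded variant; the function name move_cdb_aside is hypothetical, and the suffix is the one used in the command above.

```shell
# Hypothetical helper: moves a cdb folder aside and refuses to run if the
# source is missing or a saved copy already exists.
# $1: cdb folder, for example ~/work/cdb
move_cdb_aside() {
  cdb_dir=$1
  saved_dir="$cdb_dir.savedbeforerebuild"
  if [ ! -d "$cdb_dir" ]; then
    echo "no cdb folder at $cdb_dir" >&2
    return 1
  fi
  if [ -e "$saved_dir" ]; then
    echo "saved copy $saved_dir already exists" >&2
    return 1
  fi
  mv "$cdb_dir" "$saved_dir"
}

# Usage (with the censhare Server stopped):
#   move_cdb_aside ~/work/cdb
```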

Start rebuild

Start the censhare Server. The cdb then rebuilds itself:

censhare.rc start

Watch rebuild

Monitor the rebuild:

tail -F ~/work/logs/server-0.0.log | grep -iE "updater wrote|cdb"
AssetStoreUpdater: AssetStoreService: AssetStoreUpdater: Updater wrote 200 (200 high) updates in 200ms, 5000 tasks in update queue, 64 tasks in buildQueue, 1000 tasks in fulltextBuildQueue, 2594183 events in queue.

Explanation of the AssetStoreUpdater log entries

Updater wrote 200 (high) updates in 200ms

The first part shows how many updates were written in how many milliseconds. The AssetStoreUpdater generally needs 1-2ms per update, so 1 update per ms is a good value. The "high" in parentheses stands for high-priority assets; there can be low-priority assets as well. Maximum value: 200.

5000 tasks in update queue

The update queue contains write instructions for the cdb. If this queue is regularly full, this points to too little cdb cache memory or a slow hard drive. If the update queue is empty while the event queue still contains entries, the database is not delivering updates fast enough to fill the update queue. Maximum value: 5000.

64 tasks in buildQueue
and
1000 tasks in fulltextBuildQueue

The buildQueue and fulltextBuildQueue contain tasks for the calculation logic. This logic splits assets, such as text (fulltextBuildQueue), into data fragments for the cdb. Maximum value for each queue: 1000.

2594183 events in queue

This is the number of current asset versions that still have to be updated. A high number of events is normal during a rebuild.
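
To follow these counters over time, the values can be extracted from the log lines. The following is a minimal sketch; the function name queue_counters is hypothetical, and the pattern assumes the wording of the AssetStoreUpdater lines shown in this article.

```shell
# Hypothetical helper: reads server log lines on stdin and prints, for each
# "Updater wrote" line, five numbers separated by spaces:
#   updates  update-queue  buildQueue  fulltextBuildQueue  events
# Assumption: the log wording matches the sample lines in this article.
queue_counters() {
  sed -n -E 's/.*Updater wrote ([0-9]+) .* ([0-9]+) tasks in update queue, ([0-9]+) tasks in buildQueue, ([0-9]+) tasks in fulltextBuildQueue, ([0-9]+) events in queue.*/\1 \2 \3 \4 \5/p'
}

# Usage:
#   tail -F ~/work/logs/server-0.0.log | queue_counters
```

A steadily falling last column (events) indicates that the rebuild is progressing.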

As soon as the rebuild is finished, the log shows 0 tasks in all queues:

AssetStoreUpdater: AssetStoreService: AssetStoreUpdater: Updater wrote 1 updates (1 high) in 0ms, 0 tasks in update queue, 0 tasks in buildQueue, 0 tasks in fulltextBuildQueue, 0 events in queue.

Then wait for the next checkpoint. A checkpoint is written every 10 minutes.

AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: writing checkpoint
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: Evicted 0 bytes from data cache temp
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: Evicted 0 bytes from node cache temp
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: verified read node cache size OK: 16548411
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: verified temp node cache size OK: 0
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: verified total node cache size: 16548411, target max size 314572800
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: verified read data cache size OK: 63141939
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: verified temp data cache size OK: 0
AssetStoreUpdater: CDBEnvironment: AssetStoreUpdater: verified total data cache size: 63141939, target max size 838860800
AssetStoreUpdater: AssetStoreService: AssetStoreUpdater: 44 Transactions done, checkpoint finished in 82ms.
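
The two conditions (all queues at 0, followed by a finished checkpoint) can be checked automatically. The following is a minimal sketch; the function name wait_for_final_checkpoint is hypothetical, and the patterns assume the log wording shown in this article.

```shell
# Hypothetical helper: reads server log lines on stdin. Exits with status 0 at
# the first "checkpoint finished" line that follows an "Updater wrote" line
# with all queues at 0; exits with status 1 if the input ends before that.
wait_for_final_checkpoint() {
  awk '
    /Updater wrote/ {
      # Remember whether the most recent updater line shows empty queues.
      queues_empty = ($0 ~ /, 0 tasks in update queue, 0 tasks in buildQueue, 0 tasks in fulltextBuildQueue, 0 events in queue/)
    }
    /checkpoint finished/ && queues_empty { ok = 1; exit }
    END { exit (ok ? 0 : 1) }
  '
}

# Usage:
#   tail -F ~/work/logs/server-0.0.log | wait_for_final_checkpoint && echo "rebuild complete"
```

When fed from tail -F, the pipeline only returns once tail writes its next line after the match, so the message can appear slightly later than the checkpoint itself.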

Rollback settings

As soon as the checkpoint is finished, stop the censhare Server, revert the changes made to the config files, and start the server again.

Troubleshooting

cdb update during runtime

The server action at Configuration|Modules|Administration|Embedded Database|Embedded database sync using CCNs in the censhare Admin Client syncs the Oracle database table with the censhare embedded database during runtime.

Unlike a rebuild from scratch (moving the ~/work/cdb folder aside), the server action does not delete the existing cdb files; it only appends new ones. The cdb folder is therefore larger afterwards.

If you have changed assets via SQL directly in the Oracle database, insert the IDs of the changed assets into the asset_ccn_counter table and then execute the server action mentioned above:

SQL> INSERT INTO asset_ccn_counter (asset_id) VALUES ([asset-ids]);
SQL> commit;
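
For several assets, the statements can be generated from a list of IDs. The following is a minimal sketch that writes one INSERT per asset ID, which is an assumption about how the [asset-ids] placeholder is meant to be filled. The function name ccn_insert_sql is hypothetical, and asset IDs are assumed to be plain integers.

```shell
# Hypothetical helper: prints one INSERT per asset id, followed by a commit.
# Assumption: asset ids are plain integers; anything else is rejected before
# any SQL is printed.
ccn_insert_sql() {
  for id in "$@"; do
    case $id in
      ''|*[!0-9]*) echo "not a numeric asset id: $id" >&2; return 1 ;;
    esac
  done
  for id in "$@"; do
    printf 'INSERT INTO asset_ccn_counter (asset_id) VALUES (%s);\n' "$id"
  done
  printf 'commit;\n'
}

# Usage: review the output, then run it in your SQL client.
#   ccn_insert_sql 4711 4712 > ccn_insert.sql
```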

After restart: AssetStore Service not available

After a restart, the AssetStore Service must read all cdb files up to the last checkpoint before it becomes available. Depending on the cdb folder size, this can take several minutes, and warning/error messages are possible until then. The time can be reduced by increasing the fill level of the cdb files (cleaner-usage-threshold), because fewer cdb files then exist and have to be read.

After restart: Client fuzzy search not possible

Even after the AssetStore Service has started successfully, fuzzy search does not work until the KnownValues are loaded. Depending on the cdb folder size, loading can take several minutes, and warning/error messages are possible until then. The following line appears in the server log file once the KnownValues have been loaded successfully:

LoadKnownValues: AssetStoreService: no-context: async: loadKnownValues done.
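
Whether this line has already been written can be checked from the shell. The following is a minimal sketch; the function name known_values_loaded is hypothetical, and the log path in the usage comment is the one used for monitoring the rebuild above.

```shell
# Hypothetical helper: succeeds (status 0) if the given log file contains the
# "loadKnownValues done" message, fails otherwise.
known_values_loaded() {
  grep -q 'loadKnownValues done' "$1"
}

# Usage:
#   known_values_loaded ~/work/logs/server-0.0.log && echo "fuzzy search ready"
```

Note that the log file may also contain the message from an earlier server start, so check the timestamp of the matching line if in doubt.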

Show and delete cdb statistics

The server action at Configuration|Modules|Administration|Embedded Database|Embedded database statistics in the censhare Admin Client shows the current statistics in a window and allows you to delete them.

Interpret cdb statistics

Decrease disk usage by changing cleaner-usage-threshold parameter

The parameter cleaner-usage-threshold can only be changed in the XML file at Configuration/Services/Embedded Database. For example, with a value of 30: if only 30% of the content of a cdb file is still in use, that 30% is moved into the most recently created cdb file and the old file is deleted, which frees up disk space.

Formula:
totalFileUsage% : cleaner-usage-threshold = factor
Current cdb folder size x factor = maximum cdb folder size

Example: If totalFileUsage is 60% with 30GB of used disk space and cleaner-usage-threshold = 10:
60 : 10 = 6
30GB x 6 = 180GB
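
The formula can be evaluated on the command line. The following is a minimal sketch; the function name max_cdb_size is hypothetical, and the result is in the same unit as the folder size that is passed in.

```shell
# Hypothetical helper: prints the maximum cdb folder size.
# $1: current cdb folder size (any unit, for example GB)
# $2: totalFileUsage in percent
# $3: cleaner-usage-threshold
max_cdb_size() {
  awk -v size="$1" -v usage="$2" -v threshold="$3" 'BEGIN {
    if (threshold <= 0) exit 1          # avoid division by zero
    printf "%g\n", size * usage / threshold
  }'
}

# Usage, with the values from the example above:
#   max_cdb_size 30 60 10
```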