Configure hotfolder actions
Hotfolder actions configure automated commands that handle files transferred to the censhare Server. Hotfolder actions are specific to each module.
Purpose
For some modules in censhare, hotfolder actions are available to process files that are transferred to the censhare Server. Usually, hotfolder actions are an automated variant of the respective manual action.
Context
Hotfolder actions are configured in the censhare Admin Client in the respective Configuration/Modules configuration.
Prerequisites
You must be familiar with the module for which you want to configure a hotfolder action, and with the node schema of the configuration.
Introduction
Besides the manual file upload in censhare, the censhare Server can import and process files that are uploaded to defined directories on the file system (aka hotfolders). The in directory of a hotfolder action can be connected to third-party systems or accessed via FTP.
If a hotfolder action is configured for a module, the censhare Server scans the corresponding in directory. If one or more files are uploaded to that directory, the hotfolder action processes the files, creates or updates the respective assets, and imports the files into the file system.
In some special use cases, a hotfolder action accesses the data in the uploaded files and writes these data into the target assets. In this case, no assets are created from the uploaded file. This type of hotfolder action is used for example in the TMX importer module of the Translation with Memory application.
Hotfolder configuration dialogs
The configuration dialogs of hotfolder actions in the censhare Admin Client show the necessary fields to configure the respective action. Therefore, dialogs in different hotfolder actions can show different fields and options. However, the underlying node schema is the same for all hotfolder actions.
The General setup of hotfolder actions is similar to the general setup of other modules in censhare. For a reference of this part of the setup, see the General setup section below.
The Filesystem setup is required for all hotfolder actions. For a reference of this part of the setup, see the Filesystem setup section below.
For configuration fields other than those covered in the General setup and Filesystem setup sections, see the Node schema and Attributes sections.
Key steps
General setup
In the General setup area, configure the hotfolder action as follows:
Field | Default | Description |
---|---|---|
Server Name | - | In a censhare cluster or master server configuration, you can restrict the server action to one server or create different configurations on different servers. By default, the action is enabled on all servers. |
Enabled | false | Must be enabled to run the hotfolder action. |
Title | - | If a resource placeholder exists, do not change. For custom hotfolder actions without resource placeholders, enter a name. |
Description | - | If a resource placeholder exists, do not change. For custom hotfolder actions without resource placeholders, enter a description. |
Version | - | For internal use only. |
Interval | 30 seconds | See <listen-events/>. |
Process user | - | Optional. To process different kinds of data or files, you can create multiple configurations of the same hotfolder action. In this case, we recommend selecting a process user. |
Filesystem setup
Field | File system | Path |
---|---|---|
Input dir | The file system to which files are uploaded. See scan-dir-filesystem in <scan/>. | The relative path to uploaded files. See scan-dir-relpath in <scan/>. |
Working dir | The file system to which files are moved while they are processed. See work-dir-filesystem in <scan/>. | The relative path to files while they are processed. See work-dir-relpath in <scan/>. |
Output dir | The file system to which files are moved after they have been processed successfully. See completed-dir-filesystem in <import-manager/>. | The relative path to files that have been imported successfully. See completed-dir-relpath in <import-manager/>. |
Error dir | The file system to which files are moved if they could not be processed. See error-dir-filesystem in <import-manager/>. | The relative path to files that could not be processed. See error-dir-relpath in <import-manager/>. |
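As the table indicates, the four directories are stored as attributes of the <scan/> and <import-manager/> nodes. The following is a minimal sketch of that mapping, not a complete configuration. It assumes a hypothetical hotfolder named my-hotfolder on the interfaces file system. The out directory name is an assumption made for illustration.

```xml
<cmd>
  <!-- Other nodes of the hotfolder command are omitted in this sketch. -->
  <!-- Input dir and Working dir are attributes of the <scan/> node. -->
  <scan scan-dir-filesystem="interfaces"
        scan-dir-relpath="file:my-hotfolder/in"
        work-dir-filesystem="interfaces"
        work-dir-relpath="file:my-hotfolder/work"/>
  <!-- Output dir and Error dir are attributes of the <import-manager/> node. -->
  <!-- "my-hotfolder" and the "out" directory are hypothetical names. -->
  <import-manager completed-dir-filesystem="interfaces"
                  completed-dir-relpath="file:my-hotfolder/out"
                  error-dir-filesystem="interfaces"
                  error-dir-relpath="file:my-hotfolder/error"/>
</cmd>
```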
Node schema
The underlying node schema of all hotfolder actions is as follows. For a reference of the [ATTRIBUTES] that are available in each node, see the Attributes section below.
<cmd xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://www.censhare.com/xml/3.0.0/command.xsd"
xmlns:corpus="http://www.censhare.com/xml/3.0.0/corpus"
xmlns:new-val="http://www.censhare.com/xml/3.0.0/new-val"
xmlns:new-fct="http://www.censhare.com/xml/3.0.0/new-fct">
<xml-info title="${reg-title-hotfolder-[MODULE]}"
description="${reg-description-hotfolder-[MODULE]}"
localize="true" knowledge-level="3"
special-licence="false"
feature-help=""
version="">
<property-resources>
<base url="file:module"/>
<base url="file:../../common/global"/>
</property-resources>
</xml-info>
<cmd-info name="hotfolder.[MODULE]"
type="auto-execute"
enabled="false"
mode="loop"
ignore-error="true">
<error-log module="hotfolder"/>
</cmd-info>
<admin-info is-template="true" dialog-id="config" dialog-url="file:config-dialog.xml"/>
<params storage-item-creation-external-process-error="${storage-item-creation-external-process-error}"/>
<listen-events>
<timer [ATTRIBUTES]/>
<cron [ATTRIBUTES]/>
</listen-events>
<commands currentstep="0">
<command target="EventManager" method="waitevent" key="listen-events"/>
<command target="ImportManager" method="scan" key="scan"/>
<command target="ImportManager" method="file-group-rollback-transaction" key="import-manager"/>
<command target="ScriptletManager" scriptlet="modules.hotfolder.Hotfolder" method="checkinFiles"/>
</commands>
<scan [ATTRIBUTES]>
<match [ATTRIBUTES]>
<data [ATTRIBUTES]>
<asset-duplicate-check [ATTRIBUTES]/>
<asset-attribute [ATTRIBUTES]/>
</data>
</match>
</scan>
<import-manager [ATTRIBUTES]/>
<group-asset-importer [ATTRIBUTES]>
<asset-attribute [ATTRIBUTES]/>
</group-asset-importer>
</cmd>
Attributes
<listen-events/>
The event listener is a standard method that is required for the automatic execution of commands. In this node, use either a <timer/> node for intervals or a <cron/> node for cron expressions to control the execution of the command.
<timer/>
The timer node sets a fixed interval (in seconds) for the execution of the hotfolder command.
Attribute | Description |
---|---|
delta-sec | Scans the in directory every x seconds. Default is 30. |
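A minimal sketch of a timer-based listener follows. The value 60 is an arbitrary example that replaces the default of 30 seconds.

```xml
<listen-events>
  <!-- Scan the in directory every 60 seconds instead of the default 30. -->
  <timer delta-sec="60"/>
</listen-events>
```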
<cron/>
Attribute | Description |
---|---|
pattern | Cron expression for the automatic execution of the hotfolder command. |
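A sketch of a cron-based listener might look as follows. The exact cron syntax that the server accepts is not described in this article. The pattern below assumes a standard five-field expression (minute, hour, day of month, month, day of week) and should be checked against your censhare version before use.

```xml
<listen-events>
  <!-- Assumption: five-field cron syntax. Runs at minute 0 of every hour. -->
  <cron pattern="0 * * * *"/>
</listen-events>
```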
<scan/>
This method creates a command that processes files that are stored in the defined in directory.
Attribute | Description |
---|---|
group-result | If true, results are grouped by <match/> nodes and sub-directories. |
group-export-key | Default is file-export-group. Must be identical to the group-export-key in the <import-manager/> node (cross-reference). |
fork | If true, the command creates a copy of itself for each process, and the original command remains in waiting state. If false, only one command is used to process all files. Default is true. |
scan-dir-filesystem | The file system where uploaded files are stored. This file system must be accessible from external locations, for example via FTP. The default is interfaces. File systems are configured in the Master data/File systems table in the censhare Admin Client. |
scan-dir-relpath | The relative path that points to the directory that the hotfolder action scans for incoming files. Use the pattern file:[hotfolder]/in (replace [hotfolder] by the name of the hotfolder action). |
work-dir-filesystem | The file system of the working directory. Files are moved to the working directory before a command is executed. Recommended, because it prevents files from being queued a second time when the in directory is scanned again before the last invoked command has finished. Default is interfaces. |
work-dir-relpath | The relative path to the working directory. Default is file:[hotfolder]/work (replace [hotfolder] by the name of the hotfolder action). |
out-filesystem-key | XPath expression to retrieve the filesystem key. Default is file@filesystem. |
out-relpath-key | XPath expression to retrieve the file URL on the filesystem. Default is file@url. |
out-url-key | Optional. The attribute key where the file URL is placed in the executed command template. |
context-url | Optional. The context url for relative urls in the dir and work-dir attributes. If context-url itself is relative, then the path is relative to the ImportManager context. |
max-files | The maximum number of files that are moved from the in directory to the work directory. |
max-queue-size | The maximum number of commands that can run in parallel, regardless of the command state. If fork = false, the max-queue-size is always 1. If max-files = 1, the max-queue-size is ignored. |
create-unique-url | If set to true, files that are moved to the working directory cannot be overwritten accidentally. Default is true. |
last-modification-delay-sec | Only processes a file when its last modification date is older than the delay (in seconds). The delay ensures that the file upload is complete before a command starts processing the file. This attribute is necessary for UNIX systems because it is not possible to determine whether a file transfer is finished or still in progress. |
scan-subdirectories | If true, files to be processed can be placed in nested directories. |
include-base-path | Optional. If scan-subdirectories is set to true, this attribute can be used to specify a regular expression that matches sub-directories including their relative path from the scan-dir-relpath attribute. |
include-directory-regex-pattern | If scan-subdirectories is set to true, this attribute can be used to specify a regular expression for sub-directories to be included. Default is .* (meaning, any sub-directory is included). |
include-directory-glob-pattern | Optional. If scan-subdirectories is set to true, this attribute can be used to specify a wildcard expression for sub-directories to be included. |
exclude-directory-regex-pattern | Optional. If scan-subdirectories is set to true, this attribute can be used to specify a regular expression for sub-directories to be skipped. |
exclude-directory-glob-pattern | Optional. If scan-subdirectories is set to true, this attribute can be used to specify a wildcard expression for sub-directories to be skipped. |
ignore-non-matching-files | If true, files that do not match the include-file-regex-pattern or exclude-file-regex-pattern file pattern are skipped. Otherwise, non-matching files are moved to the error directory. |
add-data-node | Optional. If true, adds a <data/> node to each <match/> node for each executed command. |
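The following sketch combines several of the attributes above in one <scan/> node. All values are illustrative assumptions rather than recommendations, and my-hotfolder is a hypothetical hotfolder name.

```xml
<!-- Fork one command per process, move at most 10 files per scan, -->
<!-- and run at most 5 commands in parallel. -->
<!-- Wait until a file has been unmodified for 10 seconds before processing it. -->
<!-- Scan nested directories, but skip those whose names start with "tmp". -->
<scan scan-dir-filesystem="interfaces"
      scan-dir-relpath="file:my-hotfolder/in"
      work-dir-filesystem="interfaces"
      work-dir-relpath="file:my-hotfolder/work"
      fork="true"
      max-files="10"
      max-queue-size="5"
      create-unique-url="true"
      last-modification-delay-sec="10"
      scan-subdirectories="true"
      exclude-directory-glob-pattern="tmp*"
      ignore-non-matching-files="true"
      add-data-node="true"/>
```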
<match/>
In the match node, you can define criteria in regular expressions that must be matched to process a file, or criteria that must be matched to skip a file.
Attribute | Description |
---|---|
include-file-regex-pattern | Only files that match this regular expression are processed. Either include-file-regex-pattern or include-file-glob-pattern must be defined. |
include-file-glob-pattern | Only files that match this wildcard expression are processed. Either include-file-regex-pattern or include-file-glob-pattern must be defined. |
exclude-file-regex-pattern | Optional. Files that match this regular expression are skipped. |
exclude-file-glob-pattern | Optional. Files that match this wildcard expression are skipped. |
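As an illustration, the following sketch processes JPEG files and skips hidden files. It assumes that the patterns are matched against the file name. This article does not state whether matching is case-sensitive.

```xml
<scan scan-dir-filesystem="interfaces" scan-dir-relpath="file:my-hotfolder/in">
  <!-- Include files ending in .jpg or .jpeg. -->
  <!-- Exclude files whose names start with a dot (glob pattern). -->
  <match include-file-regex-pattern=".*\.jpe?g"
         exclude-file-glob-pattern=".*"/>
</scan>
```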
<data/>
In the data node, you can write attributes that are applied to each file that is processed. In the <scan/> node, the add-data-node attribute must be true for this method.
Attribute | Description |
---|---|
create-assets | Set to true to create assets from imported files. In this case, you must also set the asset type to be created. If you do not want to create assets (for example: TMX import), you can set the attribute to false, or simply remove the <data/> element. |
use-duplicate-check | Set to true to perform a duplicate check. If true, also set the perform-update attribute. |
perform-update | If use-duplicate-check=true, this attribute defines how to handle duplicates: Set to true to force updating the storage item, set to false to use the existing storage item. |
checkout-checkin-options | Set to outin to perform a check-out and check-in of assets. This creates a new asset version. Set to update to update the storage item. This does not create a new version. |
replace-files | Set to true to replace existing storage items (non-master files). |
type | If create-assets=true, set the asset type to be created with this attribute. |
map-metadata | Optional. If true, file metadata are written into asset properties. Note: This requires metadata mapping. |
application | Optional. Sets an application to edit the file in censhare. If no specific application is available, set to default. |
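A sketch of a <data/> node that creates assets and updates duplicates could look like this. The asset type key picture. is an assumption for illustration, so use a type key that exists in your system. The add-data-node attribute of the enclosing <scan/> node must be true.

```xml
<match include-file-regex-pattern=".*\.jpe?g">
  <!-- Create one asset per file. "picture." is an assumed asset type key. -->
  <!-- If a duplicate is found, update its storage item without creating -->
  <!-- a new asset version. -->
  <data create-assets="true"
        type="picture."
        use-duplicate-check="true"
        perform-update="true"
        checkout-checkin-options="update"
        application="default"/>
</match>
```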
<asset-duplicate-check/>
This method defines which attributes and values are used for the duplicate check. The <asset-duplicate-check/> node is only required if use-duplicate-check is set to true in the <data/> node.
Attribute | Description |
---|---|
key | The asset attribute against which the duplicate check is performed. Default is @id_extern. |
src-pattern | Regular expression that retrieves the string used for the duplicate check against the key attribute. |
replacement | Optional replacement pattern to transform the src-pattern string. |
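The following sketch compares the file name without its extension to the external ID of existing assets. It assumes that src-pattern is applied to the file name and that replacement uses Java-style group references such as $1. Neither is confirmed in this article. The asset type key picture. is also an assumption.

```xml
<data create-assets="true" type="picture."
      use-duplicate-check="true" perform-update="true">
  <!-- Capture everything before the last dot and use it as the check value. -->
  <asset-duplicate-check key="@id_extern"
                         src-pattern="(.*)\.[^.]+"
                         replacement="$1"/>
</data>
```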
<asset-attribute/>
This method can be used to retrieve any value from the processed file and write it into an asset attribute.
Attribute | Description |
---|---|
key | The asset attribute key. For example: @name. |
src-pattern | A regular expression that retrieves the desired string value from the source file. |
replacement | Optional. A replacement pattern to transform the string retrieved with src-pattern. |
use-path | Optional. If true, the context path to the source is included. |
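As a sketch, the following node writes the file name without its extension into the asset name. It assumes that src-pattern is applied to the file name and that replacement accepts Java-style group references such as $1. Both assumptions should be verified against your system. The asset type key picture. is likewise assumed.

```xml
<data create-assets="true" type="picture.">
  <!-- Capture everything before the last dot and write it to @name. -->
  <asset-attribute key="@name"
                   src-pattern="(.*)\.[^.]+"
                   replacement="$1"
                   use-path="false"/>
</data>
```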
<import-manager/>
The <import-manager/> method defines how to handle processed files.
Attribute | Description |
---|---|
group-export-key | Default is file-export-group. Must be identical to the group-export-key in the <scan/> node (cross-reference). |
completed-dir-filesystem | The file system in which successfully processed files are stored. Default is interfaces. |
completed-dir-relpath | The relative URL to successfully processed files. Default is file:[hotfolder]/work (replace [hotfolder] by the name of the hotfolder action). |
error-dir-filesystem | The file system in which files that caused an error during processing are stored. Default is interfaces. |
error-dir-relpath | The relative URL to files in the error directory. Default is file:[hotfolder]/error (replace [hotfolder] by the name of the hotfolder action). |
transaction-group-name | Value is hotfolder-rollback-transaction-group - do not change! |
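The cross-reference between the two nodes can be sketched as follows. Directory attributes and all other nodes are omitted here for brevity, so this fragment is not a complete configuration.

```xml
<cmd>
  <!-- group-export-key must have the same value in both nodes. -->
  <scan group-export-key="file-export-group"/>
  <!-- transaction-group-name keeps its fixed value. -->
  <import-manager group-export-key="file-export-group"
                  transaction-group-name="hotfolder-rollback-transaction-group"/>
</cmd>
```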
<group-asset-importer/>
The <group-asset-importer/> node contains the configuration for the processing of sub-folders. Optionally, you can add one or multiple <asset-attribute/> nodes inside this node.
Attribute | Description |
---|---|
group-asset-setup | If true, imported assets from sub-folders are grouped with the corresponding Group assets. |
include-group-regex-pattern | Enter a regular expression that matches sub-folders to be included. |
exclude-file-group-pattern | Enter a regular expression that matches sub-folders to be excluded. |
create-asset-tree | If true, assets from sub-folders are assigned as child assets. |
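checkout-checkin-options | See the respective attribute of the <data/> node in the <scan/> section. |
replace-files | See the respective attribute of the <data/> node in the <scan/> section. |
create-assets | See the respective attribute of the <data/> node in the <scan/> section. |
perform-update | See the respective attribute of the <data/> node in the <scan/> section. |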
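A sketch of a <group-asset-importer/> node with one nested <asset-attribute/> node is shown below. The sub-folder pattern campaign-.* is hypothetical. The example also assumes that src-pattern is applied to the sub-folder name and that replacement uses Java-style group references.

```xml
<!-- Group assets from sub-folders named "campaign-..." with their -->
<!-- Group assets and assign them as child assets. -->
<group-asset-importer group-asset-setup="true"
                      include-group-regex-pattern="campaign-.*"
                      create-asset-tree="true">
  <!-- Assumption: the matched name is written to the asset name. -->
  <asset-attribute key="@name" src-pattern="(.*)" replacement="$1"/>
</group-asset-importer>
```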
Result
The censhare Server runs the hotfolder action and processes the files according to the configuration. Updated and newly created assets are available in the system.