Configure import & export of terminology
Dictionaries help "Translation with memory" users produce translations with consistent terminology. They contain terms, brand names, and proper names. This article describes how to set up and carry out imports and exports of dictionary entries in the XML-based TermBase eXchange (TBX) format.
What's new
You can import TBX files with custom locale definition.
Import dictionary entries
Import dictionary entries (automatic)
Export dictionary entries
Context
The manual import and export can be executed in the censhare Client and the censhare Admin Client.
The configuration of the manual and the automatic server actions is done in the censhare Admin Client.
Prerequisites
Permission to access the Configuration folder in the censhare Admin Client.
Introduction
Dictionaries contain special terminology and proper names that need to be consistently translated in specific contexts. The censhare translation memory accesses dictionaries and identifies and marks the terms found in the text to translate. Using dictionaries, you can ensure that specialty terms, brand names, and editorial guidelines specific to your organization are standardized.
Unlike segments, dictionaries contain entries of individual words, terms or compound terms that should always be translated the same way. In addition to that, they can also contain proper names or brand names that are never to be translated.
Dictionaries are also referred to as glossaries or terminology lists in certain applications or contexts. In censhare they all mean the same thing. Dictionaries that you import into censhare are shown in the translation memory or in translation management in the Terminology Widget.
In the censhare Admin Client module Dictionary you can configure the import and export of dictionary entries. You can also set up an automatic import. First you have to activate and set up the desired server actions in the Dictionary module in the "Configuration/module" directory in the censhare Admin Client.
Proceed as follows to configure the individual server actions:
Import dictionary entries
Import dictionary entries (automatic)
Export dictionary entries
TBX format
The XML-format TermBase eXchange or TBX is standardized by ISO 30042. It helps you capture terminology in a structured way and is used by most translation memory systems.
Example of a TBX file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE martif SYSTEM "TBXBasiccoreStructV02.dtd">
<martif>
<martifHeader>
<fileDesc>
<titleStmt><title/><note/></titleStmt>
<sourceDesc><p></p></sourceDesc>
</fileDesc>
<encodingDesc><p type="XCSURI">TBXBasicXCSV02.xcs</p></encodingDesc>
</martifHeader>
<text><body>
<termEntry id="1001">
<descrip type="definition">Use this term when the relation is created in
the table "asset_rel".</descrip>
<langSet xml:lang="de">
<tig><term>Asset-Verknüpfung</term></tig>
</langSet>
<langSet xml:lang="en">
<tig><term>asset relation</term></tig>
</langSet>
</termEntry>
</body></text>
</martif>
The <martifHeader> element at the top of the file contains information for translators and agencies. This information is optional. In the text body, every entry is represented by a <termEntry> element. The ID is generated automatically by censhare. Optionally, you can add a description with notes for translators to individual entries in <descrip> elements. Every entry can contain any number of languages, each in its own <langSet> element. The language is stored there in the xml:lang attribute. Each <langSet> element contains a further container element, <tig>. The actual term is entered in the <term> element. Optionally, you can add <termNote> elements to every <tig> element with notes for the translation, such as "Proper name" or "Synonym".
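The structure described above can be read with a few lines of Python. The following is a minimal sketch using only the standard library, not part of censhare; the function name read_entries and the shortened sample are assumptions made for illustration. It shows how the <termEntry>, <langSet>, <tig>, and <term> elements nest and how the language code is read from the xml:lang attribute.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened version of the sample file (header and DOCTYPE omitted).
SAMPLE_TBX = """<martif>
  <text><body>
    <termEntry id="1001">
      <descrip type="definition">Use this term when the relation is created in the table "asset_rel".</descrip>
      <langSet xml:lang="de"><tig><term>Asset-Verknüpfung</term></tig></langSet>
      <langSet xml:lang="en"><tig><term>asset relation</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


def read_entries(tbx_text):
    """Return {entry_id: {language_code: [terms]}} for every <termEntry>."""
    root = ET.fromstring(tbx_text)
    entries = {}
    for entry in root.iter("termEntry"):
        languages = {}
        for lang_set in entry.findall("langSet"):
            code = lang_set.get(XML_LANG)
            # A <langSet> may hold several <tig> containers, each with one <term>.
            languages[code] = [term.text for term in lang_set.findall("tig/term")]
        entries[entry.get("id")] = languages
    return entries


ENTRIES = read_entries(SAMPLE_TBX)
print(ENTRIES)
```

A quick check like this before an import can help you confirm which language codes a TBX file actually contains.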
Import dictionary entries
Configure server action
To activate the import function for terminology lists, go to the Dictionary directory in the censhare Admin Client, double click the entry Import dictionary entries and configure the action as follows:
Field | Required | Description |
Note | No | This field is for documentation purposes only. |
General Settings | ||
Specify the desired settings for the import in the dialog. | ||
Language mapping | ||
From | Yes | Language mapping allows censhare to assign the language attributes found in the import file to a censhare system language. censhare creates the mapping according to the languages and language codes stored in the system. You can change this configuration and add new assignments if the desired assignment is not stored in the system. For example, if you only use "English" in your system, you can indicate the source languages British English (en-GB), American English (en-US), and English without country code (en) for the target language (English). |
To | Yes | |
Import Parameter | ||
Domains | No | Enter the name of the domain to which you want to add the dictionary entries. The translation memory can save dictionary entries in different domain trees (sister domains) and keep them separate from one another. Users only see entries from the domain tree in which their standard domain is located. It doesn't matter whether an entry is in the user domain or in a parent or child domain. |
2. Domain | No | Enter a second domain here for the import. For more information, see the description of the Domains field. |
XPath expression for the Description field (optional). | No | If the file to be imported contains descriptions and notes about individual entries, enter the XPath expression that references these elements here. censhare then imports these entries to the database and shows them in the Terminology widget. For instance, the XPath expression for the <descrip> element in the code sample would be martif/text/body/termEntry/descrip/@type="definition". The expression references all <descrip> elements that are located in a <termEntry> element and have a type attribute with the value definition. |
Delete all existing terms upon import | No | Activate this field if all of the existing entries should be deleted when you carry out imports. If you select this option, you should be sure that the TBX file you are importing contains the entire dictionary or the entire terminology list. Otherwise, existing entries will be lost from the memory if they are not contained in the import file. |
Name | No | censhare makes the entry in these two fields automatically. The Name field shows the script command that initiates the deletion. The Value field contains Boolean variables that were set to "true" or "false" depending on whether the previous field was activated or deactivated. |
Value | No |
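To see which elements such an expression is meant to reach, you can try the selection outside censhare. The sketch below uses Python's standard library, which supports only a subset of XPath and writes the attribute test as a predicate in square brackets. Whether the configuration field accepts the same notation is an assumption you should verify in your system. The second <descrip> element with type "context" is a hypothetical addition to show that it is not selected.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<martif><text><body>
  <termEntry id="1001">
    <descrip type="definition">Use this term for relations in "asset_rel".</descrip>
    <descrip type="context">Hypothetical second note, not selected.</descrip>
    <langSet xml:lang="en"><tig><term>asset relation</term></tig></langSet>
  </termEntry>
</body></text></martif>"""

root = ET.fromstring(SAMPLE)  # root is the <martif> element

# The path is relative to <martif>; the predicate keeps only type="definition".
matches = root.findall("text/body/termEntry/descrip[@type='definition']")
DESCRIPTIONS = [match.text for match in matches]
print(DESCRIPTIONS)
```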
Save the configuration by clicking OK. The changed configuration is shown in the Dictionary directory. You can also add more system-specific configurations, for example for another server or for another role. Then you need to update the server so the configuration will take effect.
Icon | Action |
Update server action |
If you work with a master server configuration, configure the TBX import, update the master server, and then synchronize the remote server so that the changes take effect there as well:
Icon | Action |
Synchronize remote server |
Execute import
To carry out an import, open the Server actions menu in the censhare Admin Client or censhare Client and select the option Import dictionary entries.
Icon | Action |
Server actions |
In the dialog, select the file to open and confirm your selection. The Import Dictionary Entries dialog opens:
Field | Required | Definition |
Setup | ||
Domains | No | Enter the name of the domain into which you want to import the TBX file. The translation memory can save dictionary entries in different domain trees (sister domains) and keep them separate from one another. Entries in a domain tree are only accessible to users who are logged in to the associated domains. It doesn't matter whether an entry is in the user domain itself or in a parent or child domain. |
2. Domains | No | Enter a second domain for the import. For more information, see the description of the Domains field. |
Language mapping | ||
Term | n/a | Shows the language code for a language that has been found in the import file. The language code cannot be changed. |
Mapping | Yes | Select a language defined in censhare for the language code on the left side. This language is then mapped to every dictionary entry with this language code in the import file. If a mapping is predefined in the configuration of the server action, censhare preselects it. Change this setting if needed. If there is no predefined mapping, the selection field is empty and you make your own choice. You can select any language that is available in censhare. It does not matter whether this is a language with a standard code or a custom definition. |
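Conceptually, the language mapping is a lookup from the language code found in the TBX file to a language in censhare. The following sketch illustrates the idea only. The table contents and the function name map_language are hypothetical and do not reflect how censhare stores the mapping.

```python
# Hypothetical mapping table: TBX language code -> censhare language.
LANGUAGE_MAPPING = {
    "en": "English",
    "en-GB": "English",
    "en-US": "English",
    "de": "German",
}


def map_language(code, mapping=LANGUAGE_MAPPING):
    """Return the predefined language, or None if the code has no mapping."""
    return mapping.get(code)


# Several source codes can point to the same target language.
print(map_language("en-GB"))
# A code without a predefined mapping corresponds to an empty selection field.
print(map_language("fr-CA"))
```

Mapping several source codes to one target language mirrors the case described above, where a system only uses "English".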
Click OK to continue. A second Import Dictionary Entries dialog shows a summary of the import. Click OK to complete the import. The imported entries are now available in the translation memory.
Dialog "Import dictionary entries" | ||
Number of imported entries | When importing, censhare does not check for duplicates. If an existing entry is re-imported with a different ID, censhare creates a new entry in the terminology list. | |
Number of updated entries | When importing, censhare compares the IDs in the TBX file with the IDs in the terminology list. If censhare finds a deviation in one or more languages for a term with a known ID, the entry will be shown as "updated". | |
Number of deleted entries | censhare deletes existing entries during an import only if the field Delete all existing terms upon import has been activated in the configuration of the server action. |
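The three counters can be modeled as an ID-based merge. The sketch below is an illustrative model, not censhare's implementation. The function name import_entries and the data layout are assumptions, and so is the rule that the deleted counter equals the number of entries that existed before the import when the delete option is active.

```python
def import_entries(existing, incoming, delete_existing=False):
    """Merge incoming entries into existing ones by ID.

    Both arguments map an entry ID to a {language: term} dictionary.
    Returns the resulting dictionary and the three summary counters.
    """
    counts = {"imported": 0, "updated": 0, "deleted": 0}
    if delete_existing:
        # Assumption: every entry that existed before the import counts as deleted.
        counts["deleted"] = len(existing)
        result = {}
    else:
        result = dict(existing)

    for entry_id, languages in incoming.items():
        if entry_id not in result:
            counts["imported"] += 1  # unknown ID: new entry, no duplicate check
        elif result[entry_id] != languages:
            counts["updated"] += 1   # known ID, at least one language differs
        else:
            continue                 # known ID, identical content: nothing to do
        result[entry_id] = languages
    return result, counts


existing = {"1001": {"de": "Asset-Verknüpfung", "en": "asset relation"}}
incoming = {
    "1001": {"de": "Asset-Verknüpfung", "en": "asset link"},      # changed term
    "2002": {"de": "Asset-Verknüpfung", "en": "asset relation"},  # same terms, new ID
}
RESULT, COUNTS = import_entries(existing, incoming)
print(COUNTS)
```

In this model, the entry with the ID 2002 is counted as imported although its terms already exist under the ID 1001, which corresponds to the missing duplicate check described in the table.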
Import dictionary entries (automatic)
censhare can also import dictionary entries automatically: Configure the Import dictionary entries (automatic) module.
Note: To ensure that the import is executed correctly, check that the TBX files to be imported use the correct encoding and language assignment.
Field | Required | Description |
Note | No | This field is for documentation purposes only. |
General settings | ||
Server names | No | In a cluster system, you can limit the import function to one server or set up different configurations for different servers. By default, the function is activated for all servers. |
Activated | Yes | Set this field to enable the automatic server action. |
Run only on master server | No | This field is not relevant for TBX imports. |
Version | No | This field is not relevant for TBX imports. |
Cron pattern | Yes | In this field, you define the interval for the automatic import as a cron expression. The default value is "0 0 * * *", which triggers a daily import at 00:00. The five fields stand for minute, hour, day of the month, month, and weekday. The first two zeros indicate 00:00. The asterisk is a wildcard that matches every value, that is, every day of the month, every month, and every weekday. For example, to set up a weekly import on Sundays at 23:59, the cron expression is "59 23 * * 0". Zero stands for Sunday here. |
Language mapping | ||
From | Yes | Language mapping allows censhare to assign the language attributes found in the import file to a censhare system language. censhare creates an automatic mapping for the languages stored in the system. You can also change this selection and add new assignments. For example, if you only use English in your system, you can indicate the source languages British English (en-GB), American English (en-US), and English without country code (en) for the target language (English). |
To | Yes | |
File system settings | ||
Entry directory | Yes | Set the directories for the individual steps of the automatic import. Save the files to be imported to the entry directory. To process them, censhare moves them to the working directory. If the import is successful, censhare moves the file to the output directory. If errors occur, censhare moves the file to the errata directory. Select Temp dir for the File system field. Enter the path relative to the selected File system. It always starts with the "file:" prefix, for example, "file:in/" for Input dir. Only change the default setting if you are sure that the directory exists and that the import will continue to work. |
Working directory | Yes | |
Output directory | Yes | |
Errata directory | Yes | |
Import parameter | ||
For more information, see Import parameter in the "Import dictionary entries" table. |
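To check how the five fields of a cron pattern are read, you can evaluate them against a point in time. The following sketch is a simplified matcher that only understands "*" and single numbers. It is not the scheduler that censhare uses, and the function names are assumptions made for illustration.

```python
from datetime import datetime


def field_matches(field, value):
    """A field matches if it is the wildcard '*' or equals the given number."""
    return field == "*" or int(field) == value


def cron_matches(pattern, moment):
    """Check a five-field pattern (minute hour day month weekday) against a datetime."""
    minute, hour, day, month, weekday = pattern.split()
    # Cron counts Sunday as 0; isoweekday() returns 7 for Sunday.
    cron_weekday = moment.isoweekday() % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


# 7 January 2024 is a Sunday, 8 January 2024 is a Monday.
SUNDAY_NIGHT = cron_matches("59 23 * * 0", datetime(2024, 1, 7, 23, 59))
MONDAY_NIGHT = cron_matches("59 23 * * 0", datetime(2024, 1, 8, 23, 59))
DAILY_MIDNIGHT = cron_matches("0 0 * * *", datetime(2024, 1, 8, 0, 0))
print(SUNDAY_NIGHT, MONDAY_NIGHT, DAILY_MIDNIGHT)
```

Ranges, lists, and step values such as "*/15" are deliberately left out of this sketch.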
Export dictionary entries
Configure server action
You can export dictionary entries from the censhare translation memory to a TBX file, for example, to edit them in an external system or to import them into another translation program.
Specify the desired settings for the export in the configuration dialog.
Icon | Action |
Update server action |
Execute export
To carry out an export, open the Server actions menu in the censhare Admin Client or censhare Client and select Export dictionary entries. This action starts the Export dictionary entries dialog:
Field | Required | Definition |
Languages | ||
Select All | No | Select if you want to export all languages in the dictionary. |
LANGUAGE | No | Select a LANGUAGE if you want to export its dictionary entries. The dialog shows all languages that exist in the dictionary. If you do not select any LANGUAGE, no dictionary entries are exported. |
Mapping | Yes | Enter an ISO 639-1 language code. This can be a lowercase two-letter code in the format "xx", for example, "de" or "en". Or, extend the code to "xx-XX" for country-specific information, for example, "de-DE" or "en-US". |
Select a place to save the TBX file on your computer. The export generates a TBX file that you can open and edit with an XML editor or import into another translation memory system.
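The two accepted notations for the Mapping field, "xx" and "xx-XX", can be checked with a regular expression before you enter a value. This is a minimal sketch with a hypothetical function name. It only checks the shape of the code, not whether the code is actually assigned in ISO 639-1, and it is not a validation that censhare itself performs.

```python
import re

# Two lowercase letters, optionally followed by a hyphen and two uppercase letters.
LANGUAGE_CODE = re.compile(r"[a-z]{2}(-[A-Z]{2})?")


def is_valid_mapping(code):
    """Return True if the code has the shape 'xx' or 'xx-XX'."""
    return LANGUAGE_CODE.fullmatch(code) is not None


for candidate in ("de", "en-US", "EN", "en_US", "eng"):
    print(candidate, is_valid_mapping(candidate))
```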
Result
You can configure the server actions for the import, the automatic import, and the export of dictionary entries. You know how to import and export TBX files.