HCMS Schema - Storage Items

Explains the referencing of files in Headless CMS schemas, also known as storage items.

Asset storage items can be mapped in the schema by specifying the file type id using the property "cs:storage.item".

They are typically mapped as a string either containing a link to the actual storage item content or the actual content itself.

Storage Item mapped as string

Storage Item as Link

With this mapping, storage items are always represented as a URL link that can be used to download the content. As a consequence, the JSON type of the target property needs to be string. The following properties are used to configure the download link:

"cs:storage.item": String containing the storage item file type id (required)
"cs:link": Must be present to indicate that the download link is created. The value of the property is ignored and can be of any JSON type.
"cs:storage.max_age": How long should the content be cached by a client, in seconds. Directly sets value in Cache-Control: header (max-age).
- Negative value disables caching.
- By default, caching is disabled.
"cs:storage.public": Should the Cache-Control computed from one of the "cs:storage.max_age" settings be sent with public flag?
- true means public, false means private
- Default value is false - all Cache-Control headers are by default sent with private flag. This is the safer option.
- When the "cs:storage.max_age" is not set, this directive has no effect.

When reading the entity, a URL pointing to the desired storage item is created by the Headless CMS. On creation or update, one of the following methods can be used to set the data:

A data: direct value URI (as specified by RFC 2397)
A URL (scheme http or https) from which the Headless CMS downloads the binary data. The content size for downloads as part of updates is limited to 200 MB and the transfer must be finished within two minutes. If the URL generated by the Headless CMS is sent back to it during an update, the update of the storage item file is skipped.
A URL with the scheme formdata followed by a colon and the name of the part in the formdata/multipart request used to update the JSON entity
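
The first and third methods can be sketched in Python as follows. This is only an illustration of how a client might build the string values: the helper names to_data_uri and formdata_ref and the part name "upload" are hypothetical and not part of the Headless CMS.

```python
import base64


def to_data_uri(content: bytes, mime_type: str = "application/octet-stream") -> str:
    """Build an RFC 2397 data: URI that carries the content base64-encoded."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def formdata_ref(part_name: str) -> str:
    """Build a reference to a named part of the multipart request (scheme formdata)."""
    return f"formdata:{part_name}"


# Entity payloads for a schema that maps "image" as a storage item link.
entity_with_inline_data = {"image": to_data_uri(b"hello", "text/plain")}
entity_with_form_part = {"image": formdata_ref("upload")}  # "upload" is a hypothetical part name
```

With the formdata variant, the binary data itself travels in the named part of the same multipart request that carries the JSON entity.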

Example schema:

JSON

{
  "type" : "object",
  "properties" : {
    "image": {
        "type" : "string",
        "cs:storage.item" : "master",
        "cs:link" : ""

    }
  }
}

Example entity:

JSON

{
  "image": "http://localhost:8080/hcms/v2.1/entity/image/67982/storage/MDY3OTgyLzAvdGh1bWJuYWls"
}

Storage Item inline

With this mapping the storage item file content is mapped to a JSON type string encoded as base64. The following properties are used to configure it:

"cs:storage.item": String containing the storage item file type id (required)
"cs:$string_type": Must be set to base64

Example schema:

JSON

{
  "type" : "object",
  "properties" : {
    "image": {
        "type" : "string",
        "cs:storage.item" : "master",
        "cs:$string_type" : "base64"
    }
  }
}

Example entity:

JSON

{
  "image": "iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKBAMAAAB/HNKOAAAAD1BMVEUAAAD///9fX18/Pz8fHx9fJrE7AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAIElEQVQImWNgwAEUgJiJQcGAmcGBQYEJxIOQBkARKAAAIiIBnc84ujIAAAAASUVORK5CYII="
}
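
A client consuming such an entity has to decode the string before using the file content. The following Python sketch shows one possible way to do that. The helper names are hypothetical, and the PNG signature check is only there to illustrate that the decoded value is the raw file content.

```python
import base64

# The first eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_inline(entity: dict, prop: str) -> bytes:
    """Return the raw file content of a base64-mapped storage item property."""
    return base64.b64decode(entity[prop])


def encode_inline(content: bytes) -> str:
    """Encode raw file content for a create or update request."""
    return base64.b64encode(content).decode("ascii")


# Round trip: what is sent on update is what is read back after decoding.
entity = {"image": encode_inline(PNG_SIGNATURE + b"rest of the file")}
content = decode_inline(entity, "image")
is_png = content.startswith(PNG_SIGNATURE)
```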

XML Content

XML content can be directly mapped inside string typed properties. In addition, the XML can also be mapped split into several fragments.

"cs:storage.item": The storage item file type id, type string, (required)
"cs:$string_type": Specifies the presentation of the XML, (optional):
- "plain": Plain unicode text
  - Default value
  - Mime type text/plain
  - Stored by using UTF-8 encoding
- "xml": xml document
  - Mime type text/xml
"cs:storage.$path": Path inside the XML document
- This declares that the value is not a complete storage item content, but just a single fragment of the XML document stored in the storage item
- This is not XPath, just a simple dot-separated list of element names forming the path to the final element
- It cannot be used alone, always requires "cs:storage.$xml"
- Note when mapped as a singular "type": "string", the path corresponds to exactly one element. If there are several elements with the same path, only the first element will be used in the JSON.
  - Since version 4.2, it is possible to use this feature inside JSON array type. In that case, all matching elements in the XML are found.
    - Note that updates will work in this case, but the exact behavior may be surprising if the intermediate path elements are not unique.
    - Note that the cs: directives must be put inside the "items" - see the example below.
  - When the string type is xml, the element name is always the same as the last element of the path. This is also true on updates: the new value must be a valid XML element, but only its content and attributes are used. In other words, the element name on PUT can be different and is completely ignored.
"cs:storage.$xml": ID of the defined XML structure
- It can be used in combination with "cs:storage.$path" or alone

Special care is taken for XML documents, because they are used to store structured data and/or formatted text. A typical example is the article content, which contains both. The structure of the XML document must be defined beforehand (usually at the schema root) and referenced by its id. This allows the same XML structure to be used at different places.

Any level of JSON schema can contain the property "cs:$xml" with any number of new structures defined. The property name is the id of the structure, the value is a JSON object containing:

"initial": The initial content of the XML document ("empty" value)
- required, it must contain at least the root element
"processor": Name of the processor used to convert XML between internal and external format
- Currently there is only one possible value:
  - CenshareXLink: Converts all censhare URLs in XLinks to external (REST API) links and vice versa

A single XML document can be split into several fragments that are separately mapped to different JSON properties. Each of these properties specifies the same "cs:storage.item" and "cs:storage.$xml", but a different "cs:storage.$path". On export, the value is obtained by finding the element specified by the path (and the property is missing if there is no such element). On import, this element is created or deleted (as needed) and its content is updated.

A fragment can be both plain and xml; in the former case, the text value of the element is actually used. In the latter case, the value in JSON string is the selected element with all the content. Note that on import, the name of the element is always the last element of the XML path; the name of the root element in JSON string is actually ignored.

Example schema:

JSON

{
  "type": "object",
  "cs:asset.type": "text.",
  "cs:$xml": {
    "article": {
      "processor": "CenshareXLink",
      "initial": "<article><title/><content/></article>"
    }
  },
  "properties": {
    "title": {
      "type": "string",
      "cs:storage.item": "master",
      "cs:storage.$xml": "article",
      "cs:storage.$path": "title",
      "cs:$string_type": "plain"
    },
    "content": {
      "type": "string",
      "cs:storage.item": "master",
      "cs:storage.$xml": "article",
      "cs:storage.$path": "content",
      "cs:$string_type": "xml"
    }
  }
}

Example entity:

JSON

{
    "title": "The Title",
    "content": "<content>Some <bold>content</bold></content>"
}

Example schema with multi-element path matching:

JSON

{
  "type": "object",
  "cs:asset.type": "text.",
  "cs:$xml": {
    "article-nonstandard": {
      "initial": "<article><title/><content/></article>"
    }
  },
  "properties": {
    "title": {
      "type": "string",
      "cs:storage.item": "master",
      "cs:storage.$xml": "article-nonstandard",
      "cs:storage.$path": "title",
      "cs:$string_type": "plain"
    },
    "texts": {
      "type": "array",
      "items": {
        "cs:storage.item": "master",
        "cs:storage.$path": "content.text",
        "cs:storage.$xml": "article-nonstandard",
        "cs:$string_type": "xml",
        "type": "string"
      }
    }
  }
}

Example entity:

JSON

{
    "title": "The Title",
    "texts": [
      "<text>Main text, generic version</text>",
      "<text variant=\"print\">Main text, print version</text>",
      "<text variant=\"web\">Main text, online version</text>"
    ]
}
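
The export side of the path mapping can be approximated with a few lines of Python. This is a sketch of the behavior described above, not the Headless CMS implementation. The function names are hypothetical, and it assumes that the path starts below the root element (as the path "title" in the examples suggests) and that plain means the direct text of the element.

```python
import copy
import xml.etree.ElementTree as ET


def find_by_path(root, path):
    """Follow a dot-separated list of element names, starting below the root element."""
    current = [root]
    for name in path.split("."):
        current = [child for parent in current for child in parent if child.tag == name]
    return current


def render(element, string_type):
    """Render one matched element as "xml" (whole element) or "plain" (its text)."""
    if string_type == "xml":
        clone = copy.copy(element)
        clone.tail = None  # do not serialize text that follows the element
        return ET.tostring(clone, encoding="unicode")
    return element.text or ""


def export_fragment(root, path, string_type="plain", as_array=False):
    """Array mapping returns all matches; singular mapping returns only the first."""
    matches = find_by_path(root, path)
    if as_array:
        return [render(match, string_type) for match in matches]
    return render(matches[0], string_type) if matches else None


article = ET.fromstring(
    "<article><title>The Title</title><content>"
    "<text>Main text, generic version</text>"
    '<text variant="print">Main text, print version</text>'
    "</content></article>"
)
title = export_fragment(article, "title")
texts = export_fragment(article, "content.text", "xml", as_array=True)
```

A singular mapping of "content.text" would yield only the first text element, which matches the note about several elements sharing the same path.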

Storage Item mapped as object

If the storage item is mapped as the JSON type object, it is expected to contain a proper JSON document (object) serialized into text form. This JSON document is a seamless part of the main document, both on import and export.

Note that the schema itself can contain further properties used to validate the JSON document.
This allows tighter control over the content of the stored JSON document. Any cs: properties in this subtree are ignored.

Example schema:

JSON

{
    "title": "Text",
    "type": "object",
    "cs:asset.type": "text.json.",
    "properties": {
        "id": {
            "type": "integer",
            "cs:feature.key": "censhare:asset.id"
        },
        "name": {
            "type": "string",
            "cs:feature.key": "censhare:asset.name"
        },
        "content": {
            "type": "object",
            "cs:storage.item": "master"
        }
    }
}

Direct inclusion of a potentially unrestricted tree structure is not allowed by GraphQL. For this reason, the GraphQL schema provides the storage item content as a string (serialized JSON object).

Storage Item Attributes

Some attributes of storage items can be mapped by using the property "cs:storage.$attribute" with the attribute name as a value. These mappings are read-only; the values are always ignored on creation or update.

Available storage item attributes:

mimetype: string, always available
filelength: integer, always available
hashcode: string, available for all types
width_px and height_px: integer, only for images, not usable in query by default
dpi: integer, only for images
colordepthbits: integer, only for some images
color: string, only for images, contains the color schema, typically "rgb"; not usable in query by default
annotation, language: string, rarely used
charcount, wordcount, linecount: string, some texts, rarely used
web_url, web_expiry_date: rarely used
audio_format, video_format: string, video and audio files (if their processing is available and configured on the server)
duration_sec, frames_per_second, bitrate_mbps: number, video and audio files (if their processing is available and configured on the server)

Other attributes might be available too (especially for patched servers), but they are considered undocumented and their use discouraged.

The special directive "cs:feature.index" can be used to allow searching: its value is the key of the feature that is actually used to search. This feature must already exist in masterdata and it must be searchable (by single string value).

Except for width_px, height_px and color, it is possible to use attribute mapping in queries and search for values - but only if the storage item is master. Attributes of other storage items are not searchable at all, unless "cs:feature.index" is specified.

Search (when available) is implemented by using a special asset feature. This feature can be explicitly changed (or added) by using the directive "cs:feature.index" with a feature key. This index feature must already exist, and it must be searchable. A calculated feature is usually the most useful for this purpose. Note that adding an index feature can make even an attribute that normally cannot be used in queries searchable.

A storage item attribute can either be used standalone, in conjunction with the property "cs:storage.item" to specify the storage item type:

JSON

{
    "mime": {
        "type": "string",
        "cs:storage.$attribute": "mimetype",
        "cs:storage.item": "master"
    }
}

Or it can be mapped together with a storage mapping in a common JSON object using the property "cs:storage.$value_property" on the storage item mapping:

JSON

{
    "storage": {
        "type": "object",
        "cs:storage.item": "master",
        "cs:storage.$value_property": "content",
        "cs:$string_type": "plain",
        "properties": {
            "content": {
                "type": "string"
            },
            "mime": {
                "type": "string",
                "cs:storage.$attribute": "mimetype"
            }
        }
    }
}

Media Mapping

The media mapping can be used to expose pre-generated and on-demand generated variants of media content for consumption, e.g. image cropping variants or video size variants. This mapping is read-only and does nothing on creation and update. The media mapping writes a base URL into the property; appending a variant id to this base URL yields the URL of that variant. The admissible variants are part of the schema configuration. Since the mapping returns URLs, the property type needs to be string, and the media mapping is indicated by the child property cs:media of type object. This child object contains the variant ids as properties of type object, each in turn containing the configuration of that variant. The variant type needs to be defined in the property "type".

There are two possible JSON types for this mapping: string and object. The string type generates a "media link" which can be used to obtain data about variants; the object type provides all data about all variants (including correctly computed sizes). The string type creates smaller entities; the object type is useful in schemas directly consumed by a frontend application (which would otherwise have to invoke one additional request for each and every entity to get computed image sizes).

Below is an example illustrating the abstract principle. Schema:

JSON

{
  "type" : "object",
  "properties" : {
    "image": {
        "type" : "string",
        "cs:media" : {
          "small" : {
            "type" : "…"
          },
          "large" : {
            "type" : "…"
          }
        }      
    }
  }
}

It results in an entity like this:

JSON

{
  "image": "http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w" 
}

And the URLs of the actual variants are http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w/small and http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w/large.
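
Building a variant URL on the client side is a plain string operation. A minimal Python sketch, assuming the variant id is simply appended as one more path segment (the function name variant_url is hypothetical):

```python
from urllib.parse import quote


def variant_url(media_link: str, variant_id: str) -> str:
    """Append a variant id to the base media link taken from the entity."""
    # Quote the id so that unusual characters cannot break the path.
    return media_link.rstrip("/") + "/" + quote(variant_id, safe="")


entity = {"image": "http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w"}
small = variant_url(entity["image"], "small")
large = variant_url(entity["image"], "large")
```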

The URL may also be used without specifying a variant id to retrieve the available variants of an entity, with metadata, as a 300 Multiple Choices response:

JSON

{
  "variants": {
    "small": {
      "link": "http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w/small",
      "type": "image/png",
      "width": 640,
      "height": 480,
      "size": 12340
    },
    "large": {
      "link": "http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w/large",
      "type": "image/jpeg",
      "width": 1280,
      "height": 960,
      "size": 1590150
    }
  } 
}

The link and type properties for the actual variant URL and MIME type are always available. width, height and size for the width/height in pixels and the size in bytes are only available for some variant types.
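
A frontend could use such a listing to choose a suitable variant. The Python sketch below picks the narrowest variant that is still at least a requested width and skips variants that do not report a width. The function name pick_variant and the selection rule are illustrative assumptions, not part of the API.

```python
def pick_variant(listing: dict, min_width: int):
    """Return (variant id, variant data) of the narrowest variant that is wide enough."""
    candidates = [
        (data["width"], variant_id, data)
        for variant_id, data in listing["variants"].items()
        if "width" in data and data["width"] >= min_width  # width is optional
    ]
    if not candidates:
        return None
    _, variant_id, data = min(candidates, key=lambda candidate: candidate[0])
    return variant_id, data


listing = {
    "variants": {
        "small": {"link": "http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w/small",
                  "type": "image/png", "width": 640, "height": 480, "size": 12340},
        "large": {"link": "http://localhost:8080/hcms/v2.1/entity/image/68824/media/MDA2ODgyNC8w/large",
                  "type": "image/jpeg", "width": 1280, "height": 960, "size": 1590150},
    }
}
chosen = pick_variant(listing, 600)
```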

Variant Type Storage

The variant type storage provides access to a storage item specified by the property storageKey:

Example schema:

JSON

{
  "cs:media": {
    "preview": {
      "type": "storage",
      "storageKey": "preview"             
    }
  }
}

Variant Type Dynamic Image

The variant type dynamic_image provides access to dynamically generated images. The dynamic image generation needs to be enabled in the global service configuration. It supports the following options:

aspectRatio: Aspect ratio for the generated image e.g. 16:9, value type string
width: Width for the generated image in pixels, value type integer
height: Height for the generated image in pixels, value type integer
cropKey: ID of the censhare image crop to use for the selection of the clipping, value type string
fileFormat: Format of the generated image (either "jpg" or "png"), value type string
quality: Quality for JPEG images as number between 0 and 100, value type integer
backgroundColor: Color to fill the transparencies when converting to JPEG images as hexadecimal RGB color (e.g. "#ff00ff"), value type string
watermarkAssetId: ID of an image asset to overlay as a watermark, value type integer
watermarkOpacity: Opacity of the watermark as floating point number between 0 and 1.0 inclusive, value type number
noCrop: Set to true if no croppings should be read from the entity (this can also be toggled on a per-entity basis by the feature censhare:module.oc.dic-image-no-crop). The image is resized in its original aspect ratio and padded to fit the expected size.
boundingBox: When set to true together with noCrop: true, the returned image has the aspect ratio of the original image, within the specified width/height.
- Note that the bounding box itself must be explicitly specified by width and height, without aspectRatio.

At least width or height must be specified; either can be combined with aspectRatio. width, height and aspectRatio must not all be specified together.
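
These constraints can be checked before a schema is deployed. The following Python sketch validates one variant configuration object against the rules stated above. The function name and error messages are hypothetical, and the Headless CMS may apply further checks that are not covered here.

```python
def validate_dynamic_image(variant: dict) -> list:
    """Return a list of problems found in a dynamic_image variant configuration."""
    errors = []
    has_width = "width" in variant
    has_height = "height" in variant
    has_ratio = "aspectRatio" in variant

    if not (has_width or has_height):
        errors.append("at least one of width or height must be specified")
    if has_width and has_height and has_ratio:
        errors.append("width, height and aspectRatio must not all be specified together")
    if variant.get("boundingBox"):
        # The bounding box needs explicit width and height, without aspectRatio.
        if not (has_width and has_height) or has_ratio:
            errors.append("boundingBox requires width and height without aspectRatio")
    if "fileFormat" in variant and variant["fileFormat"] not in ("jpg", "png"):
        errors.append("fileFormat must be jpg or png")
    if "quality" in variant and not 0 <= variant["quality"] <= 100:
        errors.append("quality must be between 0 and 100")
    return errors


ok = validate_dynamic_image({"type": "dynamic_image", "width": 400, "aspectRatio": "4:3"})
bad = validate_dynamic_image({"type": "dynamic_image", "aspectRatio": "16:9"})
```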

Example schema:

JSON

{
  "cs:media": {
    "medium": {
      "type": "dynamic_image",
      "width": 400,
      "aspectRatio": "4:3",
      "fileFormat": "jpg"            
    }
  }
}

Storage Item Direct Download

A schema may specify a name ("my-download" in the example below) for a file id to make the storage item downloadable with a permanent link like /entity/image/54301/storage/my-download. All existing preconditions for allowing a download at all must be met the same way as they are for a downloadLink.

JSON

{
  "type" : "object",
  "properties" : {
    "image": {
        "type" : "string",
        "cs:storage.item" : "master",
        "cs:storage.$direct-file-name" : "my-download",
        "cs:link" : "download"
    }
  }
}

Cache control and expiration

Generated links always reflect the image content and change when the image itself changes, but any old link is still valid and can be used at any time (as long as the image variant it represents is valid). Handling of correct and obsolete links is configured by the following meta-properties:

"cs:media.max_age": How long should the content be cached by a client, in seconds. This value is used only if the link is the current, latest one.
- Negative value disables caching (this is also reflected in other directives of Cache-Control).
- Default value is 31536000 (365 days).
"cs:media.changed.max_age": How long should the content be cached by a client, in seconds. This value is used in a response to older, obsolete link.
- Default value is -1 (no caching is allowed).
"cs:media.changed": How should the old link be actually handled. Value must be one of the following (case-insensitive):
- "CONTENT": Current content is sent, with Cache-Control and Expires headers computed from the "cs:media.changed.max_age" setting.
  - This is the default setting.
- "LOCATION": The same as "CONTENT", but Location header is also set with URL of the correct, latest link.
- "REDIRECT": Redirect (301 Moved Permanently) is sent instead of the content, with Location being the correct latest link.
"cs:media.friends": Which media assets are allowed even if they are not properly part of this entity.
- This is used in a very special use case: the media mapping is actually inside an inlined relation/reference mapping and thus represents attached media (usually an image). Standard behavior in this case is to accept only the exact link as generated (i.e. the currently attached image). When the relation/reference is changed to a different media asset, the link changes and the old one will no longer work.
- Value of this property is an array of schema names (strings). If defined, all links to any media asset are accepted, as long as the asset itself is accessible as an entity in one of these schemas.
- Note that this allows old links to be used (usually because they are cached somewhere), but it also opens a possibility for completely forged URLs. The declared schema list at least makes sure that only correct assets are used and no information can actually leak this way.
"cs:media.public": Should the Cache-Control computed from one of the max_age settings be sent with public flag?
- true means public, false means private
- Default value is false - all Cache-Control headers are by default sent with private flag. This is the safer option.
- Note that setting this to true value on entity that is not completely public might lead to a data leak.
"cs:media.no_transform": Should the Cache-Control header contain no-transform flag?
- true means that it is always present, false means that it is always missing
- The default behavior is that no-transform is present when the caching is disabled (no-cache, no-store, no-transform) and missing when the caching is enabled (positive max-age).
- Note that this behavior has been added in version 4 and in older versions of HCMS, the no-transform flag is always present!
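
How these settings interact can be illustrated with a small Python sketch that computes a Cache-Control value from them. It follows the defaults described above, but the exact directive order, the treatment of a max_age of zero as "disabled", and the omission of the public/private flag when caching is disabled are assumptions. The function is not the Headless CMS implementation.

```python
def cache_control(max_age: int = 31536000, public: bool = False, no_transform=None) -> str:
    """Compute an illustrative Cache-Control value from the media cache settings."""
    caching_enabled = max_age > 0  # assumption: zero is treated like a negative value
    if caching_enabled:
        parts = ["public" if public else "private", f"max-age={max_age}"]
    else:
        parts = ["no-cache", "no-store"]
    # Default: no-transform is present only when caching is disabled.
    if no_transform is None:
        no_transform = not caching_enabled
    if no_transform:
        parts.append("no-transform")
    return ", ".join(parts)


default_header = cache_control()
disabled_header = cache_control(max_age=-1)
one_minute_public = cache_control(max_age=60, public=True)
```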

Note that the default behavior (images are cached for one full year) is probably undesirable in any application where frequent changes are expected. Example of a media mapping with shorter caching (one minute only):

TEXT

            "cs:media.max_age": 60,
            "cs:media": {

Asset Elements

To handle assets with element structures, e.g. PDFs with their individual pages, you can map elements to an array of objects using "cs:element.key": "actual.". Currently this mapping is read-only and only the actual elements are supported.

The mapping does not generate anything besides the empty object for each element. You need to either use the storage or media mappings mentioned above or use the element attribute. The element attribute mapping is configured using the property cs:element.$attribute with one of the following values:

kind: the element kind; must be mapped to a JSON string. Internally it is a relation type identifier (so it must end with a dot).
width_mm, height_mm, xoffsmm or yoffsmm: the element geometry; must be mapped to a JSON number.
Other legacy attributes, except those marked as "_internal". The full list of these attributes is intentionally not provided in this documentation, because they are considered deprecated and should not be used unless absolutely necessary.

Example schema:

JSON

{
  "type" : "object",
  "properties" : {
    "pages": {
      "type" : "array",
      "items": {
        "type": "object",
        "cs:element.key": "actual.",
        "properties": {
          "preview": {
            "type": "string",
            "cs:storage.item": "preview",
            "cs:link": ""
          },
          "kind": {
            "type": "string",
            "cs:element.$attribute": "kind"            
          }      
        }
      }
    }
  }
}

Example entity:

JSON

{
  "pages": [
    {
      "preview": "http://localhost:8080/hcms/v2.1/entity/image/67982/storage/MDA2ODgyNS8x",
      "kind": "page"
    },
    {
      "preview": "http://localhost:8080/hcms/v2.1/entity/image/67982/storage/MDA2ODgyNS8y",
      "kind": "page"
    }
  ]
}
