censhare uses Google Cloud Natural Language, Google Cloud Vision AI, and Google Cloud Video Intelligence API for content and media analyses. Google Cloud AI services can be used with censhare Web and censhare WP.


Prerequisites

  • Google Developer Account
  • Without activating the modules described below, the analysis actions will be hidden from the censhare standard workspace.


Note: Google Cloud Storage, Google Cloud Natural Language, Google Cloud Vision AI, and Google Cloud Video Intelligence API are services provided by Google. The use of one or all of these Google services can result in additional costs that are invoiced directly by Google. censhare cannot influence or control these costs and therefore cannot be held responsible for them.

Google Cloud object detection with Google Cloud Vision is supported as of censhare 2020.3.1.

Overview

The Google Cloud offers extensive capabilities to analyze various media files. With the Google Cloud AI integration in censhare, you can analyze:

  • Texts. For example, find known entities.

  • Images. For example, find keywords, text (OCR), or web pages that contain the image. Find and identify multiple objects in the image.

  • Videos. For example, find keywords, or inadequate content.

The following media and content assets can be analyzed with the Google Cloud AI integration in censhare:


| Google Cloud service | Asset type | File type/MIME type |
|---|---|---|
| Natural Language | Text | plain, ICML, XML |
| Natural Language | Text | DOCX |
| Natural Language | Image | PDF |
| Natural Language | PDF | PDF |
| Natural Language | Document | PowerPoint, Excel, Word, PDF |
| Vision AI | Image & sub-asset types | JPEG, GIF, PNG (8 Bit and 24 Bit) |
| Video Intelligence API | Video | Any MIME type if configured with JPEG previews |


Analyses can be executed manually in censhare Web and the censhare Client, or automatically via trigger events.

Analysis results are shown in the Analysis tab of an asset page in censhare Web.


Integration

Google Cloud integration consists of two parts:

  • censhare Google Cloud AI service: The service connects with the Google Cloud, uploads the media files for the analysis, and receives the results from Google Cloud AI.

  • Google module: The module is part of the censhare Server. It provides the server actions that start an analysis. A manual and an automatic server action are available.

censhare Google Cloud AI integration



The Google module contains synchronous and asynchronous server actions. The synchronous server actions wait for the result from Google Cloud AI after they have sent the request. 

The following server actions are available:


| Name | Execution mode | Comment |
|---|---|---|
| Analyze via Google Natural Language | Synchronous (1) | Starts the analysis of a Text via Google Cloud Natural Language manually. |
| Analyze via Google Natural Language (automatic) | Synchronous (1) | Starts the analysis of a Text via Google Cloud Natural Language automatically. The execution is defined by the asset events configured for asset automation. |
| Analyze via Google Vision API | Synchronous (1) | Starts the analysis of an image via Google Cloud Vision AI manually. |
| Analyze via Google Vision API (automatic) | Synchronous (1) | Starts the analysis of an image via Google Cloud Vision AI automatically. The execution is defined by the asset events configured for asset automation. |
| Analyze via Google Video Intelligence API | Asynchronous (2) | Starts the analysis of a video manually. The first execution for a video sends the request. Every following execution requests the processing status. |
| Analyze via Google Video Intelligence API (automatic) | Asynchronous (2) | Starts the analysis of a video via Google Cloud Video Intelligence API automatically. The execution is defined by the asset events configured for asset automation. The server action only starts the execution. For updates on the execution, use Status check via Google Video Intelligence API (automatic). |
| Status check via Google Video Intelligence API (automatic) | Asynchronous (2) | Regularly checks the execution status of videos that are currently being analyzed via Google Cloud Video Intelligence API. |


(1)   The synchronous server actions wait for the result from Google Cloud AI after they have sent the request. During this time, the widget is disabled.

(2)   When a server action sends the request, the action does not wait for the response. To receive an update of the processing status, configure the automatic server action for status checks.
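The asynchronous pattern in note (2) can be illustrated with a minimal Python sketch. It is not the censhare implementation: `FakeVideoService`, `analyze_video`, and the operation handle format are hypothetical names that only model the "first execution sends the request, every following execution asks for the status" behavior.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeVideoService:
    """Hypothetical stand-in for a long-running remote analysis service."""
    polls_until_done: int = 2
    _polls: dict = field(default_factory=dict)

    def start(self, asset_id: str) -> str:
        # Send the request and return immediately with an operation handle.
        operation_id = f"op-{asset_id}"
        self._polls[operation_id] = 0
        return operation_id

    def status(self, operation_id: str) -> Optional[dict]:
        # Return None while processing, and the result once finished.
        self._polls[operation_id] += 1
        if self._polls[operation_id] < self.polls_until_done:
            return None
        return {"operation": operation_id, "keywords": ["example"]}


def analyze_video(service, pending: dict, asset_id: str) -> Optional[dict]:
    """First call sends the request; later calls only ask for the status."""
    if asset_id not in pending:
        pending[asset_id] = service.start(asset_id)
        return None
    result = service.status(pending[asset_id])
    if result is not None:
        del pending[asset_id]  # analysis finished, nothing left to poll
    return result
```

A synchronous action would instead block inside a single call until the result is available, which is why the widget is disabled during that time.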

Using Google Cloud AI

Authentication

The censhare Google AI service uses a service account key to authenticate itself to Google Cloud AI and start the various analysis tasks.

You must create your own key and provide it to the censhare Google AI service.
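Before providing a key to the service, it can help to check that the file really is a service account key. The following Python sketch only inspects the JSON structure that Google uses for such keys. The function name `check_service_account_key` and the selection of checked fields are assumptions for illustration, and the sketch says nothing about how the censhare Google AI service itself reads the key.

```python
import json

# Fields found in a Google service account key file in JSON format.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list:
    """Return a list of problems; an empty list means the key looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not key.get(name)]
    key_type = key.get("type")
    if key_type and key_type != "service_account":
        # For example, an OAuth user credential instead of a service account.
        problems.append(f"unexpected key type: {key_type}")
    return problems
```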

Google Cloud Storage

To use the Google AI analyses, a bucket within Google Cloud Storage is required. If you have not set up a bucket yet, you must do so before you can use the censhare Google AI service.

The censhare Google AI service first uploads the media file to a bucket within the Google Cloud Storage. From there, the file is transferred to the respective analysis service in the Google Cloud. This has the advantage that there is no file size restriction for the file to upload.
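Once a file is in a bucket, Google services address it by a URI of the form `gs://bucket/object`. The sketch below shows how such a reference could be built. The object naming scheme (`analysis/<asset id>/<random name>`) and both function names are hypothetical and are not taken from the censhare service.

```python
import uuid
from pathlib import PurePosixPath


def staging_object_name(asset_id: int, filename: str) -> str:
    """Build a unique object name that keeps the original file extension."""
    suffix = PurePosixPath(filename).suffix
    return f"analysis/{asset_id}/{uuid.uuid4().hex}{suffix}"


def gcs_uri(bucket: str, object_name: str) -> str:
    """Return the gs:// URI that identifies an object in a bucket."""
    return f"gs://{bucket}/{object_name}"
```

Passing a reference instead of the file content is what removes the size limit of a single analysis request.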

Text analyses

For a text, Google Cloud Natural Language returns content categories and calculates a confidence value for each category. Google also searches for known entities such as public persons or companies and calculates a salience value for each entity. There is a threshold defined in censhare for confidence and salience.
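As a rough illustration of how such thresholds act on a result, the following Python sketch filters categories by confidence and entities by salience. The threshold values, the type names, and the assumption that results below a threshold are dropped are illustrative only. The actual values are defined in censhare.

```python
from typing import NamedTuple


class Category(NamedTuple):
    name: str
    confidence: float


class Entity(NamedTuple):
    name: str
    entity_type: str
    salience: float


# Assumed example thresholds, not the censhare defaults.
MIN_CONFIDENCE = 0.5
MIN_SALIENCE = 0.1


def filter_text_results(categories, entities,
                        min_confidence=MIN_CONFIDENCE,
                        min_salience=MIN_SALIENCE):
    """Keep only categories and entities that reach their threshold."""
    kept_categories = [c for c in categories if c.confidence >= min_confidence]
    kept_entities = [e for e in entities if e.salience >= min_salience]
    return kept_categories, kept_entities
```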

Note: Google currently supports 10 languages for entity analysis. For a list of supported languages, see cloud.google.com/natural-language/docs/languages.

Category mapping

Returned categories are stored in an asset reference. For this purpose, censhare provides a default set of Content category assets.

Each Content category is identified by an External source ID that contains the respective Google content category. If there is no asset for a found content category name, censhare skips this result.

Note: Currently, Google only supports content categories in English.
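The lookup described above amounts to a dictionary access keyed by the External source ID. A minimal sketch, in which `map_categories` and the example asset IDs are hypothetical:

```python
def map_categories(found_names, assets_by_external_id):
    """Split returned category names into asset references and skipped names.

    assets_by_external_id maps an External source ID (the Google category
    name) to the ID of the matching Content category asset.
    """
    references, skipped = [], []
    for name in found_names:
        asset_id = assets_by_external_id.get(name)
        if asset_id is None:
            skipped.append(name)       # no asset for this category: skip it
        else:
            references.append(asset_id)
    return references, skipped
```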

Entity mapping

Due to the huge number of possible entities, censhare does not provide a default set of entity assets. It uses the entity type that Google delivers for a found entity, for example, "CONSUMER_GOOD" or "PERSON".

Entities are mapped to an asset type and category via a mapping table. The mapping table is stored in the censhare Admin Client and is editable through an XML file.

If the mapping table contains a mapping definition for the found entity type, censhare creates an asset.

If Google returns a Wikipedia page that refers to an entity, censhare creates a Wikipedia web page asset.
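The entity mapping can be pictured as a lookup from the entity type to a target asset type and category. In the sketch below, the mapping entries and all identifiers on the right-hand side are placeholders, not real censhare values, and the real table is an XML file rather than a Python dictionary. Treating the Wikipedia page as independent of the type mapping is also an assumption.

```python
from typing import NamedTuple, Optional


class FoundEntity(NamedTuple):
    name: str
    entity_type: str                      # e.g. "PERSON" or "CONSUMER_GOOD"
    wikipedia_url: Optional[str] = None


# Hypothetical mapping table: entity type -> (asset type, category).
ENTITY_MAPPING = {
    "PERSON": ("example-person-asset-type", "example-person-category"),
    "CONSUMER_GOOD": ("example-product-asset-type", "example-product-category"),
}


def plan_entity_assets(entities, mapping=ENTITY_MAPPING):
    """Return descriptions of the assets that would be created."""
    planned = []
    for entity in entities:
        target = mapping.get(entity.entity_type)
        if target is not None:
            asset_type, category = target
            planned.append({"name": entity.name,
                            "asset_type": asset_type,
                            "category": category})
        if entity.wikipedia_url:
            planned.append({"name": entity.name,
                            "asset_type": "example-wikipedia-web-page",
                            "url": entity.wikipedia_url})
    return planned
```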

Image analyses

The following table shows the categories that Google Vision returns and their mapping in censhare.

Note: Results are only shown if the functionality is activated in the censhare Admin Client.


| Google category | Google sub-category | censhare result | censhare sub-result |
|---|---|---|---|
| Web | Web Entities | Keywords | - |
| Web | Pages with Matched Images | Matching images | Full matching image page (URL) |
| Web | Fully Matched Images | Matching images | Full matching image |
| Web | Partially Matched Images | Matching images | Partial matching image (URL) |
| Properties | Dominant Colors | Content Colors | - |
| Safe Search | - | Safe Search | - |
| Landmarks | - | Locations | - |
| Logos | - | Brands | - |
| Document | - | Texts | - |
| Object | | Marker | |


When censhare receives the results from Google Cloud Vision, it checks, depending on the category, whether:

  • There is a threshold for relevance score.

  • There is a limit to the number of results.

Text recognition

censhare stores the recognized text as plain text in a storage item and assigns the storage item to the image asset. The storage item key is text-preview and the MIME type is application/xhtml+xml. The text is indexed, and users can search for it.

Object detection 

Object Detection in Google Cloud Vision AI detects objects in images. Each object is classified and the location of the object in the image is calculated. Google calculates a confidence value for the classification.

For each object that Google returns, censhare creates a Marker asset and corresponding keywords.

Markers

To generate a Marker, the calculated relevance score must be above the threshold, and the maximum number of objects must not be exceeded. The name of the Marker asset is taken from the name of the object that Google returns. censhare adds a number at the end of the name. If there is more than one object with the same type, the number is increased with each additional object. The Marker asset stores the outline of the object (position and size of the outline box in millimeters).
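The Marker rules above (score above the threshold, maximum number of objects, numbered names per object type) can be sketched as follows. The function name, the name format "Name N" with a space as separator, and the processing order are assumptions.

```python
from collections import Counter


def plan_markers(objects, min_score, max_objects):
    """Return (marker name, score) pairs for the objects that qualify.

    objects is a list of (object name, relevance score) pairs in the order
    in which the service returned them.
    """
    counters = Counter()
    markers = []
    for name, score in objects:
        if len(markers) >= max_objects:
            break                      # maximum number of objects reached
        if score <= min_score:
            continue                   # score must be above the threshold
        counters[name] += 1            # number increases per object type
        markers.append((f"{name} {counters[name]}", score))
    return markers
```

For example, two dogs and one bicycle above the threshold would yield the names "Dog 1", "Dog 2", and "Bicycle 1".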

Keywords

For each Marker, censhare creates a Keyword asset and references the Keyword in the Marker together with the calculated relevance score. Users can search for all images by keyword to find images that contain a certain object type. The keyword assets have an external source ID from the Google service. The query for existing keyword assets is done based on this external source ID. There is no resource key for this kind of keyword.

Video analyses

censhare supports the following functionality from the Google Cloud Video Intelligence API:

  • Keywords

  • Safe search

  • Transcription

Each function can be activated individually in the censhare Admin Client.

For keyword detection, there is a threshold defined in censhare. Results from Google below this threshold are not shown. Besides that, censhare defines a limit for the number of shown keywords.  
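A minimal sketch of the two rules for video keywords, threshold first and limit second. The function name and the choice to keep the highest-scoring keywords when the limit applies are assumptions.

```python
def select_keywords(labels, min_confidence, max_keywords):
    """Filter (keyword, confidence) pairs by threshold, then cap the count."""
    kept = [(name, conf) for name, conf in labels if conf >= min_confidence]
    # Highest confidence first, so the limit drops the weakest results.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_keywords]
```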

Video transcriptions

censhare stores the result of the transcription in the Time text storage item for a video asset. The text is stored in the WebVTT format (Web Video Text Tracks, file extension .vtt).

By default, transcriptions include punctuation and create correct sentences. However, Google does not support punctuation for all languages. This can affect the result in the VTT file.
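To show what a VTT file looks like, the following sketch turns timed transcript segments into WebVTT text. The cue tuples and function names are assumptions for illustration. How censhare segments the transcription into cues is not described here.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues) -> str:
    """Build WebVTT text from (start seconds, end seconds, text) tuples."""
    lines = ["WEBVTT", ""]             # header line followed by a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")               # a blank line ends each cue
    return "\n".join(lines)
```

Without punctuation support for a language, the cue text would simply be an unpunctuated run of words.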

censhare stores the returned text as plain text in a storage item and assigns the storage item to the video asset. The storage item key is text-preview and the MIME type is application/xhtml+xml. The text is indexed, and users can search for it. Select the Content (full text) field in the Detailed search or Expert search.

Update from previous versions

What is new?

  • New functionality: Google Cloud Video Intelligence API

  • No restrictions for file size to analyze texts, images, or videos.

  • The censhare Google Cloud AI service uses a Google service account key to authenticate to the Google Cloud. Before 2020.1, censhare used an API key.

  • A Google Cloud Storage bucket is needed.

  • All Google Cloud AI-related server actions have a new Host configuration parameter to connect to the censhare Google Cloud AI service. This setting replaces the obsolete Google API Key configuration.

  • Workspace configurations are updated: manual actions are moved to the right side of the screen, widgets are disabled during calls, and the indicators of progress and status are improved.

Steps to do after the update to 2020.1

  • Previously configured manual server actions and asset automation must be removed and reconfigured.

  • Recreate text previews for plain text assets: With 2020.1, the malformed HTML header of the generated text previews is corrected. As a result, the Google Natural Language analysis does not work with previews in the previous format.

  • Check whether XML content should also be analyzed automatically with Google Natural Language analysis: Due to a bug fix in 2020.1, censhare now also generates a preview-done event for generated text previews of master files with the 'text/xml' MIME type. If you configured asset automation for natural language analysis before version 2020.1, check whether XML content should also be analyzed automatically.

  • Ensure that your workspace configurations, both for static and asset pages, work according to your expectations.

Result

You understand the integration of Google Cloud services and the Google Cloud AI functionality. You know about the server actions that execute an analysis and how the server actions work.

You know how to install/configure:

  • Access to Google Cloud AI and storage space in Google Cloud Storage

  • The manual and automatic server actions to use the Google Cloud AI functionality

  • The censhare Google Cloud AI service

Further steps

  • Configure the manual and automatic server actions in the censhare Admin Client that you want to use with Google Cloud AI.

  • Install and configure the censhare Google Cloud AI service.