censhare uses Google Cloud AI services to analyze texts, images, and videos. There are various server actions to submit requests to Google Cloud AI. To use the services in censhare, you have to configure them in Google, the censhare Server, and the censhare Google Cloud AI service.

Google Cloud Storage, Google Cloud Natural Language, Google Cloud Vision AI, and Google Cloud Video Intelligence API are services provided by Google. The use of one or all of these Google services can result in additional costs that are invoiced directly by Google. censhare cannot influence or control these costs and therefore cannot be held responsible for them.

Context

  • censhare integrates the analysis tools Google Cloud Natural Language, Google Cloud Vision AI, and Google Cloud Video Intelligence API.

  • Each of these Google Cloud AI services can be executed manually or automatically.

  • The left navigation provides static pages for media and content assets for easy access.

  • Workspace view snippets provide the necessary search and list views.

  • Google Cloud AI services can be used with censhare Web and censhare WP.

Prerequisites

  • Google Developer Account

Introduction

Google Cloud offers extensive capabilities to analyze media files. censhare uses the following Google Cloud AI services:

  • Google Cloud Natural Language for text analysis

  • Google Cloud Vision AI for image analysis

  • Google Cloud Video Intelligence API for video analysis


In censhare, the Google Cloud AI integration consists of two parts:

  • censhare Google Cloud AI service: The service uploads media to Google Cloud Storage, requests the analysis, and removes the media after the results of the analysis are delivered.

  • Google module: The module provides the server actions to start the analysis of a media or content asset. A manual and an automatic server action are available. Each server action is configured separately.

censhare Server

Configure the Google module in the censhare Admin Client in Configuration/Modules:

  • Configure the general part: This is different for manual and for automatic server actions.

Configuration in Google

Google Storage

Create a bucket at Google Cloud Storage:

  1. Open the Cloud Storage browser in the Google Cloud Console.

  2. Click Create bucket to open the bucket creation form.

  3. Specify a bucket name.

    Note: Bucket names in Google Cloud Storage share a single global namespace, so the name must be globally unique.

  4. Select a Default storage class for the bucket.

  5. Select an Access control model.

  6. Optionally, you can add bucket labels, set a retention policy, and choose an encryption method.

  7. Click Done.

You also need the bucket name for the configuration of the censhare Google Cloud AI service.

Enable Google Cloud APIs

The use of Google Cloud APIs requires a Google project. To enable an API for a project, do the following:

  • Go to the Cloud Console API Library.

  • From the projects list, select a project, or create a new one.

  • In the API Library, enable the APIs that you want to use with Google Cloud AI:

    • Cloud Natural Language API

    • Cloud Vision API

    • Cloud Video Intelligence API

    • Cloud Speech-to-Text. This is used for speech-to-text detection in videos.

  • In the API Library, you must also enable:

    • Cloud Storage

  • If you need help to find the API, use the search field and/or the filters.

  • On the API page, click ENABLE.

Google Authentication

Create a service account key:

  1. Open Google Console.

  2. Select your project.

  3. On the Dashboard, you get an overview of the available APIs and their usage.

  4. Go to the Credentials tab.

  5. Click Create Credentials and select Service account.

  6. Enter the account details in the mask and click CREATE.

  7. Add roles: Owner and Storage Admin and then click CONTINUE.

  8. Click CREATE KEY, select JSON as Key type, and click CREATE. A file with the key is stored on the local computer.

Configuration in the censhare Admin Client

Note: The analysis action options are visible in the user interface only when the module is enabled in the censhare Admin Client.

Configure access to the service

The censhare Server accesses the censhare Google Cloud AI service via HTTP/HTTPS requests through REST endpoints. The Host field contains the network address of the host that runs the service and port to access it:

http://SERVER-ADDRESS:SERVICE-PORT

For example:

http://censhare.myCompany.de:8033

Check the configuration file of the censhare Google Cloud AI service to see which port is defined.
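
As a minimal sketch, the Host value can be assembled and sanity-checked before you enter it in the server action. The function name build_host_url is hypothetical and is not part of censhare; it only illustrates the scheme://address:port format described above.

```python
from urllib.parse import urlsplit


def build_host_url(address: str, port: int, secure: bool = False) -> str:
    """Return a Host value in the form scheme://address:port."""
    # Assumption: the address is a plain host name or IPv4 address.
    if not address or "/" in address or ":" in address:
        raise ValueError(f"not a plain host name or IPv4 address: {address!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    scheme = "https" if secure else "http"
    url = f"{scheme}://{address}:{port}"
    # Round-trip through urlsplit to confirm the parts are read back as intended.
    parts = urlsplit(url)
    if parts.hostname != address.lower() or parts.port != port:
        raise ValueError(f"URL does not parse as expected: {url}")
    return url


print(build_host_url("censhare.myCompany.de", 8033))
```

The port passed in must match the server.port value defined in the configuration file of the censhare Google Cloud AI service.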

Configure the External Source Provider

As there can be other cloud services from the same or other service providers, you must identify:

  • The service provider

  • The analysis service

In censhare, there are two configuration assets for this purpose:

  • The Google asset (asset type Module/External provider) stores the service provider.

  • For each analysis service, a dedicated Module/Module Interface asset is required to identify the corresponding service provider API.

They are configured in the Analyzer setup section of the server action.


| Field | Google service | Value (Resource key) | Referenced asset name |
|---|---|---|---|
| External provider key | | censhare:external-source.google | Google |
| Configuration asset key | Text | censhare:interface.google-language-api | Google Natural Language API |
| Configuration asset key | Image | censhare:interface.google-vision-api | Google Vision API |
| Configuration asset key | Video | censhare:interface.google-video-ai | Google Video AI |


Note: In most cases, you do not need to change these values.

Configure file selection for analysis

For video and image analyses, you can select which storage item is transmitted and analyzed:

  • Select the storage item type in Storage type.

  • Check Fallback to master to use the master file if the selected storage type is not available for the current asset.

Note: Be aware that the master file can be larger than the selected storage type. This can result in much higher costs for the analysis by Google Cloud AI.

Hints for the selection of the Storage type:

  • Automatic analysis of images using Object Detection: If you select Preview or Thumbnail as storage type, you must configure Asset preview done as Asset event in Trigger events in Analyze via Google Vision API (automatic). This ensures that the selected storage item exists when the automatic analysis is triggered.

  • Analysis of image formats that are not supported by Google: Select Preview or Thumbnail as the storage type. These storage types have the MIME Type JPEG. This can be analyzed by Google. If Preview or Thumbnail do not exist, no analysis is possible.

  • Large master files: Be aware that the master file can be larger than the selected storage type. This then can result in higher costs for the analysis by Google Cloud AI.

Google Natural Language analysis

For text analyses, the following settings are available:

  • Salience threshold for detected entities: Enter the desired threshold into Entity salience threshold. Entities are stored as assets and then related to the analyzed asset. If an entity does not exist, it is created.

  • Confidence threshold for detected content categories: Enter the desired threshold into Entity salience threshold.

  • Mapping of entity types provided by Google to asset types in the censhare Server: For each found entity, Google also delivers an entity type. censhare uses this information to map the entity type to an asset type. This asset type is also used to create a new entity asset if no existing asset is found. The mapping is stored in the Google to censhare type mappings section.

The default XML configuration for type mapping:

                                                                            

The src attribute stores the entity type delivered from Google. You find the complete entity type list at Google reference for Natural Language. The dest attribute stores the asset type that is created in censhare for this entity. The default mapping only uses Keyword assets as entities. The category attribute stores the classification of the keyword.
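
To illustrate how such a mapping is read, here is a minimal sketch. Only the attribute names src, dest, and category come from the description above. The element names (mappings, mapping) and all attribute values are hypothetical placeholders and do not reproduce the shipped default configuration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

# Hypothetical mapping XML: only the src, dest, and category attribute names
# are taken from the documentation; element names and values are illustrative.
MAPPING_XML = """
<mappings>
  <mapping src="PERSON" dest="keyword" category="person"/>
  <mapping src="LOCATION" dest="keyword" category="location"/>
  <mapping src="ORGANIZATION" dest="keyword" category="organization"/>
</mappings>
"""


def load_type_mappings(xml_text: str) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each Google entity type (src) to (asset type, category)."""
    root = ET.fromstring(xml_text)
    return {
        m.attrib["src"]: (m.attrib["dest"], m.attrib.get("category"))
        for m in root.iter("mapping")
    }


mappings = load_type_mappings(MAPPING_XML)
print(mappings.get("PERSON"))  # entity type that has a mapping
print(mappings.get("EVENT"))   # entity type without a mapping
```

In this sketch, several source entity types point to the same target asset type, which corresponds to mapping two or more source entity types to one target asset type.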


To edit the configuration, click Edit type mappings.

In the mapping you can do the following:

  • Add additional entity types.

  • Map different source entity types.

  • Map source entities to other target asset types.

  • Map two or more different source entity types to one target asset type.

Google Cloud Vision

Control the results of the analysis

Google Cloud Vision provides several functions and can deliver a large number of results to censhare. In the configuration, you can control the results shown in censhare in different ways:

  • Enable/disable individual functions: Check/Uncheck Enabled.

  • Set a relevance threshold for a function: Enter a value in Threshold.

  • Limit the number of results for a function: Enter a value in Max result.
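
The effect of the three settings can be sketched as follows. This is an illustration under assumptions, not the censhare implementation: it assumes that each result carries a relevance score between 0.0 and 1.0, that results at or above the threshold are kept, and that the highest-scoring results are preferred when the list is limited. The names Result and select_results are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    label: str
    score: float  # relevance of the result, assumed to range from 0.0 to 1.0


def select_results(results, enabled=True, threshold=0.0, max_results=None):
    """Apply the Enabled, Threshold, and Max result settings to a result list."""
    if not enabled:
        return []
    kept = [r for r in results if r.score >= threshold]
    # Assumption: the most relevant results are kept when the list is limited.
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept if max_results is None else kept[:max_results]


logos = [Result("Brand A", 0.92), Result("Brand B", 0.41), Result("Brand C", 0.77)]
print([r.label for r in select_results(logos, threshold=0.5, max_results=1)])
```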

The following table shows which items are analyzed:


| Label Admin Client | Functionality | Activate | Threshold | Number of results |
|---|---|---|---|---|
| Logo detection | Identify brands | x | x | x |
| Landmark detection | Identify locations | x | x | x |
| Text detection | Recognize text | x | o | o |
| Dominant colors detection | Identify the main content colors | x | o | x |
| Safe search detection | Check for inadequate content | x | o | o |
| Web detection | Assign keywords | x | x | x 1) |
| | Detect web full matching images | x | o | |
| | Detect partially matching images | x | o | |
| | Detect partially matching images | x | o | |
| Object detection | Detects objects in the image and classifies them | x | o | x |


1) The value limits each result list in Web detection.

Note: For Web detection, there is only one threshold and one maximum result number. The values are valid for all four result lists.

Results for brands, locations, content colors, and keywords are stored as assets, and an asset relation is created to the image. If an asset does not exist, censhare creates a new one.

Color

In the default configuration, censhare maps the returned colors from Google to the standard 16-color palette. The mapping is based on the RGB values of each color. censhare calculates the distances of the red, green and blue values between the Google color and the censhare color palette. The closest match in the censhare 16-color palette is mapped to the result and assigned to the image asset. 
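
The closest-match idea can be sketched as follows. Two assumptions apply: the 16 basic HTML/VGA colors stand in for the censhare palette, whose actual values are defined by the Feature item assets, and the distance is the squared Euclidean distance over the red, green, and blue values. The exact formula that censhare uses is not stated here.

```python
# Assumption: the 16 basic HTML/VGA colors stand in for the censhare palette.
PALETTE = {
    "black": (0, 0, 0), "silver": (192, 192, 192), "gray": (128, 128, 128),
    "white": (255, 255, 255), "maroon": (128, 0, 0), "red": (255, 0, 0),
    "purple": (128, 0, 128), "fuchsia": (255, 0, 255), "green": (0, 128, 0),
    "lime": (0, 255, 0), "olive": (128, 128, 0), "yellow": (255, 255, 0),
    "navy": (0, 0, 128), "blue": (0, 0, 255), "teal": (0, 128, 128),
    "aqua": (0, 255, 255),
}


def closest_palette_color(rgb, palette=PALETTE):
    """Return the palette name with the smallest squared RGB distance to rgb."""
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))
    return min(palette, key=distance)


print(closest_palette_color((250, 10, 20)))  # a strong red
print(closest_palette_color((10, 20, 120)))  # a dark blue
```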

The censhare 16-color palette is a dynamic value list. Each color is represented by a “Feature item” asset (asset type: “Module/Feature/Feature item”). For the definition of the assets, see the folder on the censhare Server:

~/censhare/censhare-Server/install/system/required/features/content-color

Google Cloud Video Intelligence API

Features

To store the state of the analysis for a video asset, the censhare Server uses the following asset features:

  • Google service transaction ID: ID that Google returns to reference the analysis while it is ongoing.

  • Google service completion state: completion state of the analysis for the video asset

Note: You must update the database before the features are available.

The status update for videos

When the manual or the automatic server action starts the analysis of a video, Google returns a transaction ID that is stored with the video. The automatic status check server action requests the status using the transaction ID if the transaction has not yet finished. 

For each status request, Google returns a completeness percentage for keywords, safe search, and speech transcription. These percentages are calculated into an average completion percentage and stored in the video asset. 
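
The averaging step can be sketched as follows. The key names are hypothetical, and whether censhare rounds or truncates the average is not stated; this sketch truncates to a whole percentage.

```python
def average_completion(progress: dict) -> int:
    """Average the per-task completion percentages into one whole percentage."""
    if not progress:
        return 0
    # Assumption: the average is truncated, not rounded.
    return sum(progress.values()) // len(progress)


# Hypothetical status: two tasks are finished, transcription has not started.
status = {"keywords": 100, "safe_search": 100, "speech_transcription": 0}
print(average_completion(status))
print(average_completion(status) == 100)  # True only when every task is complete
```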

For the automatic status check, configure the Trigger events in the Status check via Google Video Intelligence API (automatic) server action as follows:

Prevention of a repeated analysis

Once a video has been analyzed, the censhare Server prevents a new request from being started. The reason is that a repeated analysis of a video can produce very high costs. This is especially important for automatic server actions.

For this purpose, the censhare Server checks if the Google service transaction ID and the Google service completion state feature exist for video assets.

Note: To allow a new analysis of a video that was already analyzed, you must manually remove the features and start a new request.

Keyword detection

Configuration:  

  • Enable keywords: Check Enabled.

  • Set a relevance threshold for keywords: Enter a value in Threshold.

  • Limit the number of results for keywords: Enter a value in Max result.

Keywords are stored as assets and then related to the analyzed video. If a keyword does not exist, it is created.   

Safe search

  • Enable safe search: Check Enabled.

Speech transcription

  • Enable speech transcription: Check Enabled.

To allow transcription, censhare sends the language code to Google. Google Speech-to-Text requires the language code with language and region value, for example, en-US.

censhare allows you to define languages in various ways. For example, you can define language codes that are only valid within a company.

Therefore, a mapping is required to create the correct output format for the Google service. By default, the following mappings are available:


| Language code in censhare | Language-region code |
|---|---|
| en | en-US |
| de | de-DE |
| fr | fr-FR |
| it | it-IT |
| ja | ja-JP |


To add other languages to the mapping table, do the following:

  1. Change to the Admin mode in the censhare Admin Client.

  2. Go to the manual or automatic server action in the Google module and mark it.

  3. Click Show/edit XML file in the Admin menu in the censhare Admin Client.

  4. Go to the tag and add the mapping in the following format:             

    zz is the language code defined in censhare. xx is the internationally defined language code. YY is the respective language region code.

There is a priority for the selection of the language code:

  1. Content language of the video asset if defined

  2. Language defined in Default language code in manual/automatic server action.

  3. Language in Default language code

  4. Language code en-US as a fallback if there is no entry in Default language code. This is hard-coded and cannot be changed.
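
The selection order can be sketched as follows, using the default mappings from the table above. Several points are assumptions of this sketch and not confirmed behavior: a value without a mapping entry is skipped, a value that already contains a hyphen is passed through unchanged, and the two Default language code entries in the list are treated as a single value. The function name resolve_language_code is hypothetical.

```python
# Default mappings: censhare language code -> language-region code.
DEFAULT_MAPPING = {
    "en": "en-US", "de": "de-DE", "fr": "fr-FR", "it": "it-IT", "ja": "ja-JP",
}
FALLBACK = "en-US"  # hard-coded fallback named in the priority list


def resolve_language_code(content_language, default_language_code,
                          mapping=DEFAULT_MAPPING):
    """Pick the first usable language in priority order and map it for Google."""
    for candidate in (content_language, default_language_code):
        if not candidate:
            continue
        if candidate in mapping:
            return mapping[candidate]
        # Assumption: a value with a hyphen is already a language-region code.
        if "-" in candidate:
            return candidate
    return FALLBACK


print(resolve_language_code("de", "en"))   # content language wins
print(resolve_language_code(None, "fr"))   # default language code is used
print(resolve_language_code(None, None))   # fallback
```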

Enable new analysis

  1. In the censhare Client: Change to Admin mode.

  2. Search for the video asset in the censhare Client.

  3. Open the edit dialogue for metadata for this asset.

  4. Go to Features (internal section) on the Features tab:

    • Delete Google service completion state.

    • Delete Google service transaction ID.

Note: Be aware that the analysis of a video with Google Cloud Video Intelligence API can lead to high costs charged by Google!

Permissions

To manually execute the analysis through Google Cloud AI, one of the following permission keys is needed:


| ID (Permission key) | Name | Comment |
|---|---|---|
| app_google_all | Google tools (all) | Permission to use all Google Cloud AI tools |
| app_google_nl | Google natural language | Permission to use Google Cloud Natural Language |
| app_google_vision | Google vision | Permission to use Google Vision |
| app_google_video | Google video intelligence | Permission to use Google Video Intelligence |


Monitoring

censhare Server

The censhare Server writes log messages upon requesting an analysis from Google Cloud AI. 

Use the command name for the respective server action to find the entries in the server log:


| Name | Command name |
|---|---|
| Analyze via Google Natural Language | google_nl.update-data-action |
| Analyze via Google Natural Language (automatic) | google_nl.update-data |
| Analyze via Google Vision API | google_vision.update-data-action |
| Analyze via Google Vision API (automatic) | google_vision.update-data |
| Analyze via Google Video Intelligence API | google_video_intelligence.update-data-action |
| Analyze via Google Video Intelligence API (automatic) | google_video_intelligence.update-data |
| Status check via Google Video Intelligence API (automatic) | google_video_intelligence.update-data-status |


Video analysis can take a long time. To follow a video that is being analyzed, Google returns a transaction ID. This ID is also written into the log:

AAGoogleVideoIntelligence.serverActionSetup: GoogleAiService:    
SERVER_NAME.20200304.144508.206[USER]: assetId[42942]:
Google AI Video Intelligence Analyze Async:
Processing RequestId = dd2eb205-c0d2-40cd-9dfe-c3a755169a1c

This ID is then used to request the status of the analysis for the related video:

AAGoogleVideoIntelligence.serverActionSetup: GoogleAiService:    
SERVER_NAME.20200304.144537.604[USER]:
requestId[dd2eb205-c0d2-40cd-9dfe-c3a755169a1c]:
Google AI Video Intelligence Status Async: Processing progress percent = 66
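
To follow a video analysis in the log, the ID and the progress can be extracted with regular expressions. The following is a minimal sketch based only on the two sample entries above. The function names are hypothetical, and the patterns assume that the actual log uses exactly the wording shown in the samples.

```python
import re

UUID = r"[0-9a-fA-F-]{36}"
START = re.compile(rf"RequestId = ({UUID})")
STATUS = re.compile(rf"requestId\[({UUID})\]")
PROGRESS = re.compile(r"Processing progress percent = (\d+)")


def parse_request_id(entry: str):
    """Return the ID from an analysis start entry, or None."""
    match = START.search(entry)
    return match.group(1) if match else None


def parse_progress(entry: str):
    """Return (request_id, percent) from a status entry, or None."""
    rid = STATUS.search(entry)
    pct = PROGRESS.search(entry)
    if rid and pct:
        return rid.group(1), int(pct.group(1))
    return None


entry = (
    "AAGoogleVideoIntelligence.serverActionSetup: GoogleAiService:\n"
    "SERVER_NAME.20200304.144537.604[USER]:\n"
    "requestId[dd2eb205-c0d2-40cd-9dfe-c3a755169a1c]:\n"
    "Google AI Video Intelligence Status Async: Processing progress percent = 66\n"
)
print(parse_progress(entry))
```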

Google Cloud AI service

The censhare Google Cloud AI service writes log messages.

Flags when a text or an image is analyzed:


| Flag | Comment |
|---|---|
| [analyze] | Analysis start. |
| [uploadFile] | Upload file to Google Cloud Storage. |
| [getStorageUrl] | Get the URL of the file at the Google Cloud storage. |
| [analyzeStorageUrlSync] | Analyze files through Google Cloud AI. |
| [deleteDirectory] | Delete the uploaded file. |
| [cleanupDirectory] | Cleanup. The bucket is empty again. |
| [analyze] result / output | Return the result. |


Flags when the request to analyze a video file is started:


| Flag | Comment |
|---|---|
| [analyze] | Request start. |
| [uploadFile] | Upload file to Google Cloud Storage. |
| [getStorageUrl] | Get the URL of the file at the Google Cloud storage. |
| [analyzeStorageUrlAsync] | Analyze files through Google Cloud AI. analyzeStorageUrlAsync contains the Google Long-Run processing ID. |
| [analyze] result | No result is returned. The requests are started separately to get status updates and return the results. |


Flags when a status update request is sent:


| Flag | Comment |
|---|---|
| [asyncStatus] | Request start. |
| [getAsyncOperationStatus] | Gets status in percent for one of the following analysis tasks: EXPLICIT_CONTENT_DETECTION, SPEECH_TRANSCRIPTION, LABEL_DETECTION, AVERAGE (average status result from all analyze tasks) |


Flags after the video analysis is finished (average status = 100 %):


| Flag | Comment |
|---|---|
| [asyncResult] | Starts to finish the request. |
| [deleteDirectory] | Delete the uploaded file. |
| [cleanupDirectory] | Cleanup. The bucket is empty again. |
| [processAnalyseResult] | Update video asset with the results. |


censhare Google Cloud AI service

Introduction

The censhare Google Cloud AI service is running as a standalone service. It uses Apache Tomcat that runs on a defined port. The service calls the censhare Server through a REST API. The REST API is also used to download the storage item from the censhare Server. Storage items are uploaded to Google Cloud Storage to the bucket that is configured in the configuration file. 

The censhare Google Cloud AI service and the censhare Server can run on the same machine, or on dedicated servers. Both machines must be located in the same intranet. 

The service handles only one Google account. It is not multi-tenant.

By using Google Cloud AI services, you accept the terms and conditions of Google. Additional costs for analyses are invoiced directly by Google to the account that you configured in the service.

Installation

The censhare Google Cloud AI service is provided as an RPM package.

If the censhare Server is installed as an RPM package, the installation process also installs the analysis service. 


If you want to install the Google Cloud AI service manually, use the following command:

yum install censhare-google-ai

Note: You must have an RPM repository with the censhare RPM sources configured. For more information, contact censhare support.

The Google module for the Google Cloud AI configuration is automatically installed with the censhare Server 2020.1.

Configuration

During the installation, the following local directory for the configuration is created:

/opt/censer/google-ai 

The installation directory contains the following files:

  • Jar file for the Google Cloud AI service

  • application.yml for the configuration of the Google Cloud AI service

There is also a DB file for storage purposes. This file is created on demand and is not part of the installation.

The service is configured to start automatically. Before it can start for the first time, you must do the following steps:

  • Install the Google service API key.

  • Edit the configuration file.

To install the Google service API key:

  1. Obtain the JSON file for the Google service API key. 

    For more information, see Google Authentication.

  2. Copy the file into the following directory:

    /opt/censer/google-ai

  3. Rename the file to google-api-key.json.

To update the configuration file:

  1. Obtain the bucket name for Google Cloud Storage. For more information, see Google Storage.


  2. Open the application.yml file in a text editor.

  3. Go to # Google Storage Name.

  4. Enter the bucket name after the variable censhare.microservices.google-ai.google-storage.

To start the service, use the following command:

systemctl start censhare.google-ai.service

censhare.google-ai.service is the service name. 

To check if the service is running, use the following command:

systemctl status censhare.google-ai.service

The variables in the application.yml configuration file:


| Variable | Comment |
|---|---|
| server.port | This port is used to reach the service from the outside. You must enter the port in the Host field in the configuration of the respective server action. |
| censhare.microservices.standalone-web-server.host | URL to send requests to the censhare Server, for example, https://censhare.yourCompany.com:9443. This URL is used to access the censhare Server via REST calls. |
| censhare.microservices.standalone-web-server.username | User name to access the REST API of the censhare Server |
| censhare.microservices.standalone-web-server.password | Password of the user |
| censhare.microservices.standalone-web-server.timeout-ms | Default: 500. Timeout in milliseconds to wait for a connection to the censhare Server. |
| censhare.microservices.google-ai.google-storage | Name of the bucket in the Google Cloud Storage to use with censhare Google Cloud AI service |
| censhare.microservices.google-ai.google-credentials | Relative path and name of the file that contains the Google service account key, for example, "./google-api-key.json". |
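
Before the first start, it can help to check that no variable was left empty. The following sketch works on the settings after they have been loaded into a nested dictionary; how the YAML file is parsed is out of scope here. It assumes that the dotted variable names correspond to nested keys in application.yml and that every variable except timeout-ms, which has a default, must be set. The sample values service-user and my-bucket are placeholders.

```python
# Assumption: every variable without a documented default must be set.
REQUIRED = [
    "server.port",
    "censhare.microservices.standalone-web-server.host",
    "censhare.microservices.standalone-web-server.username",
    "censhare.microservices.standalone-web-server.password",
    "censhare.microservices.google-ai.google-storage",
    "censhare.microservices.google-ai.google-credentials",
]


def lookup(config: dict, dotted_key: str):
    """Walk a nested dict along a dotted key; return None if a part is missing."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_settings(config: dict) -> list:
    """List the required variables that are absent or empty."""
    return [key for key in REQUIRED if lookup(config, key) in (None, "")]


# Placeholder configuration with an empty password.
config = {
    "server": {"port": 8033},
    "censhare": {"microservices": {
        "standalone-web-server": {
            "host": "https://censhare.yourCompany.com:9443",
            "username": "service-user",
            "password": "",
        },
        "google-ai": {
            "google-storage": "my-bucket",
            "google-credentials": "./google-api-key.json",
        },
    }},
}
print(missing_settings(config))
```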


Result

You understand the architecture of censhare Server and censhare Google Cloud AI service to use the Google analysis functionality. You know that some server actions work synchronously and others asynchronously.

You know how to install/configure:

  • Google Cloud AI and Google Cloud Storage

  • The manual and automatic server actions to use the Google Cloud AI functionality

  • The censhare Google Cloud AI service