Rules Engine 3

Rules Engine allows you to create dynamic rules in the system from a set of predefined actions. Rules can be triggered manually, by items matching a search, or by notifications, such as a job finishing.

The actual rules are defined in the Cantemo UI and executed by Activiti running in Tomcat.

Basic usage

Creating Rules from Cantemo UI

Rules are created from the Rules Engine section under System.

../../_images/create_jobs_new_items.png

Rules for Jobs, new items, Saved Searches, Collections, and Metadata changes, as well as Format Distribution rules and Manual Processes, can be created from the Rules Admin View.

Note

All created rules are run as the admin user.

Creating Rules - Actions

Rules are created by selecting a combination of actions from the form:

../../_images/create_custom_rule.png

The actions include:

Set metadata

Set metadata on the item. If a field is left empty, that value is not modified. Values also support Tags Based on Item Metadata.

Set access

Set access on the item. Read/Write/All/None permissions can be added for a number of users or groups.

Priority applies if multiple rules set ACLs on the same item.

Access Rights Method is either “Dynamic” or “Static”:

“Dynamic” method sets Access Rights dynamically, so that they are removed if the rule is deleted, or for search based rules if an item does not match the search anymore. “Static” method sets Access Rights directly on the item, and the rights remain even if the rule is deleted.

Transcode

Transcode item to a target format. Items of a media type different than the selected one will be ignored, for example if a video item is handled and an image target format is selected.

Replace transcode - Recreate the target shape if it already exists.

Move/Copy files

Moves or Copies item files of selected shapes to a number of storages. See chapter Format Distribution Rules for more information about the settings.

Run shell script

Execute one of the Rules Engine 3 shell scripts uploaded to the system, see Shell Script Files for Rules.

Shell Script Arguments - Command line arguments passed to the shell script. Supports quotes to include spaces in a single argument, for example: foo and bar becomes three arguments foo, and, bar. With quotes "foo and bar" is handled as single argument foo and bar.

Wait for job to finish - Select this to add a step after your script that waits for any Vidispine jobs on the item to finish. Useful only for scripts that start Vidispine jobs.
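The quoting behaviour described for Shell Script Arguments matches ordinary POSIX shell word splitting. As a minimal sketch, assuming the arguments are split in that shell-like way (the actual parser used by Rules Engine is not shown in this document), Python's standard shlex module reproduces the two cases from the example. The function name split_arguments is hypothetical.

```python
import shlex


def split_arguments(argument_string):
    """Split an argument string using POSIX shell quoting rules.

    Illustration only: this mirrors the documented behaviour and is not
    the parser used by Rules Engine itself.
    """
    return shlex.split(argument_string)


# Without quotes every word is its own argument.
print(split_arguments('foo and bar'))
# With quotes the whole phrase is a single argument.
print(split_arguments('"foo and bar"'))
```

This can be a quick way to check how a planned argument string would be split before saving it in a rule.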

Export

Export the item to an Export Location with the selected Format. Items of a media type different than the selected one will be ignored: for example if a video item is handled and an image target format is selected.

Archive item

Archive the item using the Cantemo Archive Framework. The archive policy can be selected, as well as whether the files should be purged from online storage.

Restore item

Restore an item that was archived using the Cantemo Archive Framework.

Delete from archive

Delete the asset data from archive. Only completes successfully if a copy still exists on online storage.

Delete item

Moves the item to Recycle Bin for deletion.

Grace period (hours) - Decides how long items will stay in the recycle bin before being permanently deleted. Initially set to the system default.

Send email

Send an email notification to the given Email Address. The predefined email template includes a link to the item.

Add to collection

Add the item to a collection. Collection name also supports Tags Based on Item Metadata.

Tags Based on Item Metadata

Item metadata can be used in the following rule actions:

  • Set Metadata: in any values for textual metadata fields, including String, Textarea, Tags

  • Move/Copy files: in the Target Directory parameter, when Modify Directory is enabled

  • Add to collection: in the Collection parameter to define the target collection name

They are included by entering ${tagName} in the value. Available tags include:

  • ${title} - Item title

  • ${itemId} - Item ID

  • ${mediaType} - Media type (audio/video/etc.)

  • ${originalFilename} - Original filename of the item

  • ${user} - Username of the user who ingested the item

  • ${field_id} - Value of any metadata field of the item, for example ${portal_mf619153} for the value of the default Description field in Film metadata group. Note that for key/value type of fields this is the key - so for example en for language ENGLISH in example field ${portal_mf257027}.

  • ${field_id_value} - Display value of a metadata field of the item. For example ${portal_mf257027_value} for the display value of the default Language field in Film metadata group. Result could be ENGLISH. Available for Checkbox, Date, Dropdown, Lookup, Radio, Timecode, Timestamp, and Workstep fields.

Metadata values in non-referenced subgroups can also be used by their Field ID. In case of multiple subgroup instances, values from the last one will be used.
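The ${tagName} syntax happens to be the same placeholder syntax that Python's string.Template uses, so the substitution can be sketched with the standard library. This is only an illustration of the idea, not Cantemo's implementation. The function name expand_tags and the sample metadata values are hypothetical, and how Cantemo treats unknown tags is not stated in this document (the sketch leaves them untouched).

```python
from string import Template


def expand_tags(value, item_metadata):
    """Replace ${tagName} placeholders with values from item_metadata.

    safe_substitute leaves unknown tags as they are instead of raising.
    That choice is an assumption made for this sketch.
    """
    return Template(value).safe_substitute(item_metadata)


# Hypothetical metadata for one item, using tag names from the list above.
metadata = {
    "title": "Interview",
    "itemId": "VX-13",
    "mediaType": "video",
    "portal_mf257027": "en",             # key of a key/value field
    "portal_mf257027_value": "ENGLISH",  # display value of the same field
}

print(expand_tags("${mediaType}/${portal_mf257027_value}/${title}", metadata))
```

A value such as the one above could be used, for example, as a Target Directory or a Collection name.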

Format Distribution Rules

Format Distribution Rules modify the storage of all files in the system, based on shape tags. For example, they can move all video proxy files to a certain storage, or place all original files in at least two storages.

Exact target storages do not need to be selected: if omitted, the system will select the storages with the most space available before the high water-mark. The rules will never fill a storage beyond its high water-mark.
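The automatic storage selection can be pictured with a small sketch. Everything here is an assumption made for illustration: the Storage class, its field names, and the pick_storages function are hypothetical, and the real selection is done by the system, not by user code.

```python
from dataclasses import dataclass


@dataclass
class Storage:
    """Hypothetical description of a storage, sizes in bytes."""
    name: str
    capacity: int
    used: int
    high_watermark_percent: int

    def free_before_watermark(self):
        # Space that can still be used without passing the high water-mark.
        limit = self.capacity * self.high_watermark_percent // 100
        return limit - self.used


def pick_storages(storages, count, file_size):
    """Pick the `count` storages with the most room before the water-mark.

    Storages that would be filled beyond the water-mark are skipped.
    """
    candidates = [s for s in storages if s.free_before_watermark() >= file_size]
    candidates.sort(key=lambda s: s.free_before_watermark(), reverse=True)
    return candidates[:count]


storages = [
    Storage("media1", capacity=1000, used=200, high_watermark_percent=90),
    Storage("media2", capacity=1000, used=850, high_watermark_percent=90),
    Storage("media3", capacity=2000, used=500, high_watermark_percent=80),
]
print([s.name for s in pick_storages(storages, count=2, file_size=100)])
```

In this sample, media2 is skipped because a file of size 100 would push it past its water-mark.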

For example to move all Original shapes to storage “media1”, in path “hires” (e.g. /srv/media1/hires/), select: Format: original, Number of storages: 1, set “Remove from other storages”, set “Modify directory”, Target directory: “hires”.

The same settings are available for the Move/Copy files action in other types of rules.

When moving an item to multiple destinations, Rules Engine will use Copy jobs followed by a Delete job once finished. However if there is only a single target storage, a Move job will be used. This means that if both the target and source storages point to folders on same physical storage, a single file move will be a “rename” on the OS level, and will not duplicate file data.
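The choice between a Move job and Copy plus Delete jobs can be summarised as a short sketch. The function name plan_move and the tuple layout are hypothetical and only restate the behaviour described above. They are not the Rules Engine implementation.

```python
def plan_move(source, targets):
    """Return the list of jobs needed to move files from source to targets.

    Each job is a (job_type, source_storage, target_storage) tuple.
    """
    if len(targets) == 1:
        # Single target: one Move job, which can be a cheap rename when both
        # storages are folders on the same physical storage.
        return [("MOVE", source, targets[0])]
    # Several targets: copy to each one, then delete the source copy.
    jobs = [("COPY", source, target) for target in targets]
    if jobs:
        jobs.append(("DELETE", source, None))
    return jobs


print(plan_move("media1", ["media2"]))
print(plan_move("media1", ["media2", "media3"]))
```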

Target directory also supports Tags Based on Item Metadata.

../../_images/create_format_distribution_rule.png

Warning

When you create Format Distribution Rules or rules that Move/Copy files: Please make sure you do not create rules that contradict each other. That can lead to unpredictable behaviour.

Rules Admin View

To administrate rules, go to the Rules Engine 3 configuration panel in the Admin section. In this view you can:

  • List all currently active rules, and their status

  • Create rules

  • List shell scripts available for use in rules

../../_images/rulesengine_admin.png

Deployed Rules list

This list contains the following information about every deployed rule:

Description

Textual description of the rule, with links to the item source if available (e.g. a saved search)

Handled items

How many items have been handled by this rule

Dynamic ACLs

How many items have dynamic access applied from this rule. See Creating Rules - Actions / Set access for more information.

Last handled

When the last item was handled by this rule

Issues

Any issues that have occurred with this rule, such as the rule failing on some items, or the rule having stalled

Deployed

The time when this rule was deployed

The following actions are available for each rule:

Details

Show more detailed information about this rule and its status

Redeploy

Delete this rule and deploy it again. This can help if a rule has issues.

Delete rule/trigger

“Delete rule” deletes a whole rule, including polling/notification trigger and the action process. “Delete trigger” deletes the trigger part of the rule, but leaves the action process in place.

Shell Script Files for Rules

Shell scripts can be used as actions in the rules. They can be Bash scripts, Python, or anything executable in the server environment.

This list in the admin view can be used to upload new scripts, view their content, and delete existing ones.

Note

If a shell script is used in an action and deleted, that part of the rule will fail in all future executions.

On execution, Rules Engine passes arguments to shell scripts as environment variables. These are all prefixed with portal_. Example variables when executed from a polling based rule:

portal_itemId=VX-13
portal_rule_uuid=6b154509-d691-49cd-8e07-4fba0371dd93
  • portal_itemId contains the ID of the item that the script should handle.

  • portal_rule_uuid is the UUID of the rule that triggered this execution.

For Job based rules, all job notification variables are passed to the script. Example when executed from a job finished change notification:

portal_rule_uuid=6b154509-d691-49cd-8e07-4fba0371dd93
portal_filePathMap=VX-1554=Screen Shot 2015-10-06_lowimage.jpg
portal_transcodeEstimatedTimeLeft=0.0
portal_fileIdsPerTag0=VX-1554
portal_user=admin
portal_transcoderId=VX-1
portal_overridePresetSourceTag=true
portal_currentStepNumber=4
portal_transcodeProgress=0.0
portal_transcoder=http://localhost:8888/
portal__portalJobId=912f7e2f-44a6-24b4-a7ac-346d169a8a2b
portal_mimeType=image/png
portal_transcodeDone=true
portal_tags=lowimage
portal_transcodeMediaTimes=10000@10000
portal_sequenceNumber=0
portal_status=FINISHED
portal_fileIds=VX-1554
portal_action=STOP
portal_jobId=VX-1279
portal_currentStepStatus=FINISHED
portal_baseUri=http://127.0.0.1:8080/API/
portal_growing=false
portal_itemId=VX-149
portal_signal_name=job_stop
portal_fastStartRequire=true
portal_originalShapeId=VX-323
portal_started=2015-10-08T10:22:41.540Z
portal_fileShapeMap=VX-1554=VX-325
portal_type=TRANSCODE
portal_createThumbnails=true
portal_item=VX-149
portal_jobDocument=<?xml ...
portal_transcodeWallTime=0.379286
portal_bytesWritten=0
portal_username=admin
portal_progress-200-0-0=percent 0.0/100
portal_shapeIds=VX-325
portal_closeProxyURI5=file:///srv/media3/Screen%20Shot%202015-10-06.png
portal_totalSteps=4
portal_jobStatusDocument=<?xml ...
  • portal_user is the name of the user who started the job

  • portal_type is the job type
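A script attached to a job based rule usually needs only a few of these variables. The following minimal sketch reads them from a dictionary that behaves like os.environ. The helper names job_summary and is_finished_transcode are hypothetical, and the sample values are taken from the listing above.

```python
import os


def job_summary(environ):
    """Collect a few job notification variables into one dictionary."""
    return {
        "item_id": environ.get("portal_itemId"),
        "job_id": environ.get("portal_jobId"),
        "job_type": environ.get("portal_type"),
        "status": environ.get("portal_status"),
        "user": environ.get("portal_user"),
    }


def is_finished_transcode(environ):
    """True if the notification is for a transcode job that has finished."""
    return (environ.get("portal_type") == "TRANSCODE"
            and environ.get("portal_status") == "FINISHED")


# Sample values copied from the example notification above. In a real
# script, pass os.environ instead of this dictionary.
sample = {
    "portal_itemId": "VX-149",
    "portal_jobId": "VX-1279",
    "portal_type": "TRANSCODE",
    "portal_status": "FINISHED",
    "portal_user": "admin",
}
print(job_summary(sample))
print(is_finished_transcode(sample))
print(is_finished_transcode(os.environ))
```

Using .get() rather than indexing keeps the script from crashing when it is started by a rule type that does not set a given variable.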

A shell script that logs all the variables coming from Cantemo, log_to_tmp_example.sh in the default installation:

#!/bin/bash
# Output current date and all the script variables to a log file
log_file=/tmp/re3_script_example.log

printf "\n\n*** Script started at %s\n" "$(date)" >> $log_file
echo "Script variables from Cantemo:" >> $log_file
printenv | grep ^portal_ >> $log_file

A Python script that shows how to set up the environment for using Cantemo classes, settings and logging to standard Cantemo logs. This is available as python_example.py in the default installation:

#!/opt/cantemo/python/bin/python
# Add Cantemo classes to path and setup Django environment.
import os
import sys

import django

sys.path.append("/opt/cantemo/portal")
os.environ['DJANGO_SETTINGS_MODULE'] = 'portal.settings'

django.setup()
# Now Cantemo/Django environment is setup and helper classes are available

import logging

# Logging through standard Cantemo logging, i.e. to /var/log/cantemo/portal/portal.log
log = logging.getLogger('portal.plugins.rulesengine3.shellscripts')

# Access to variables from the Rules Engine through os.environ
log.info('All script variables:')
for key, value in os.environ.items():
    if key.startswith('portal_'):
        log.info(' %s=%r', key, value)

item_id = os.environ.get('portal_itemId')
if item_id:
    from portal.vidispine.iitem import ItemHelper

    ith = ItemHelper()
    log.info('Item title: %s', ith.getItem(item_id).getTitle())
else:
    log.info('portal_itemId not set')

A more complex example that integrates Google Vision API with Cantemo is available at https://github.com/Cantemo/GoogleVisionRule

Tips for Rule Engine Shell Script development

A shell script in Rules Engine 3 must never execute for more than 5 minutes, otherwise Activiti will consider the action failed and will restart the process. This causes errors with a message such as “Exception in job: JobEntity [id=123456] was updated by another transaction concurrently”.
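One way to stay inside the 5 minute limit is to give the script its own time budget and stop starting new work when the budget is used up. The sketch below is only a suggestion. The names TIME_BUDGET_SECONDS and process_with_budget are hypothetical, the 240 second margin is an assumed value, and what to do with the unprocessed work is left to the script author.

```python
import time

# Assumed safety margin, chosen to stay well below the 5 minute limit.
TIME_BUDGET_SECONDS = 240


def process_with_budget(work_items, handle, budget=TIME_BUDGET_SECONDS,
                        clock=time.monotonic):
    """Handle work items one by one until they run out or time is up.

    Returns the items that were not handled, so the caller can log them
    or leave them for a later run. A single slow call to `handle` can still
    overrun the budget, because the clock is only checked between items.
    """
    deadline = clock() + budget
    remaining = list(work_items)
    while remaining and clock() < deadline:
        handle(remaining.pop(0))
    return remaining


handled = []
left_over = process_with_budget(["a", "b", "c"], handled.append)
print(handled, left_over)
```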

If a rule script execution fails, all output it has made to STDOUT and STDERR will be logged to the Tomcat log, by default /var/log/tomcat/catalina.out. If the script returns without errors, output to STDOUT and STDERR is ignored. The above Python example uses Cantemo standard logging to output to /var/log/cantemo/portal/portal.log with proper log level filtering.

The script files are stored on the Cantemo server in the folder /opt/cantemo/portal/portal/plugins/rulesengine3/shellscripts/. If there is shell access to the server, these files can be modified in place. This is especially useful during development of a script.

Any changes to the files will affect all rules using the script in Shell Script actions.

The scripts can also be executed directly from a shell, in the same way as Rules Engine executes them. The following shell command executes python_example.py as if it was in a rule started on item VX-100, providing immediate feedback:

portal_itemId=VX-100 /opt/cantemo/portal/portal/plugins/rulesengine3/shellscripts/python_example.py
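The same kind of test run can be driven from Python, which is convenient when several variables need to be set at once. This is a minimal sketch: run_like_rule is a hypothetical helper, and the demo command is a stand-in that only prints the variable, so that the example runs on any machine. On a Cantemo server the command would be the path of the script under test.

```python
import os
import subprocess
import sys


def run_like_rule(command, item_id, extra_vars=None):
    """Run a command with portal_ variables set in its environment.

    Keys in extra_vars are given without the portal_ prefix.
    """
    env = dict(os.environ)
    env["portal_itemId"] = item_id
    for key, value in (extra_vars or {}).items():
        env["portal_" + key] = value
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Stand-in for a real script: prints the item ID it received.
demo_command = [sys.executable, "-c",
                "import os; print(os.environ['portal_itemId'])"]
result = run_like_rule(demo_command, "VX-100")
print(result.returncode, result.stdout.strip())
```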

Rule Details View

This view shows detailed information about a process, such as

  • Activiti process diagram, and a link to the BPMN 2.0 XML file that has been deployed to Activiti.

  • When the process was deployed

  • How long the last execution of this rule took

  • If there are any pending jobs for this rule, or if the rule has stalled

  • Any errors encountered when executing this rule

  • List of items that have Dynamic Access Rights applied by this rule

The handled items list shows each item that has been handled by this rule. Polling rules are usually applied only once per item. The list can be cleared from this view, which for rules other than manual ones means the rule will be executed on the items again, if they still match the criteria. This can be especially useful if some action of the rule has failed for some items, and the rule should be executed again on those items.

../../_images/rule_details.png

Rule details view for a polling rule that has handled one item.

Rules Engine 3 Settings

Settings are accessed with the “Rules Engine 3 Settings” button in the admin view.

../../_images/rulesengine_settings.png

Available settings:

Activiti URL

The URL that Cantemo uses to communicate with Activiti. The default, http://localhost:8008/activiti-rest/service/, works with the Activiti installed by the Cantemo installer.

Activiti Username, Password

Username and password used by Cantemo to communicate with Activiti. This Activiti user is created on Cantemo installation. By default the username is admin, and password is actiadm1n. The password can be set when installing Cantemo or later from the Activiti Explorer view.

Activiti Tenant

In an Activiti instance with multiple services, this can be used to separate Cantemo processes and signals from other services. Default value is empty, meaning Cantemo does not use a tenantId in Activiti. More information in the Activiti User Guide section Multitenancy: http://activiti.org/userguide/index.html#advanced.tenancy

Note

If a non-empty value is set, Activiti Explorer will not be able to deploy rules to Cantemo since Activiti Explorer is not multi-tenant aware.

Cantemo URL

The URL to access Cantemo from Activiti, default http://localhost:9000/. This should be a local HTTP URL, i.e. not HTTPS.

Polling Interval

Default polling interval for items and job status, in seconds.

Note

Default Polling interval cannot be lower than 10 seconds

Maximum total concurrency

This is the maximum number of items that will be handled in parallel. The default value is 100 items. Setting this to 0 temporarily disables Rules Engine polling.

Individual Rule Settings

Each individual rule that periodically polls for new items has the following settings:

../../_images/rulesengine_rule_settings.png

Rule name

The name of the Rule

Polling suspended

Temporarily suspend this rule from Polling for new Items.

Polling interval

The polling interval of this rule. This overrides the global Polling Interval setting.

Note

Think through your polling intervals before applying them, so a Low priority rule does not poll just ahead of a High priority rule

Note

Polling interval cannot be lower than 10 seconds

Execution priority

The priority of this rule.

Note

You need to set Maximum total concurrency for this to have effect

Maximum concurrency

Maximum number of items this rule will process in parallel. If omitted it will use the default value of 50.

Note

No more than Maximum total concurrency items will ever be handled in parallel

Tips on configuring Rules Engine

There are a couple of things you need to think of when configuring Rules Engine 3.

  • Make sure that Maximum concurrency for rules that create Vidispine jobs is not higher than Concurrent jobs in the system. A higher value only creates processes within Activiti that wait for available job slots.

  • Make sure that the Polling Interval is higher than the shortest time it takes to execute a Vidispine job or a script.

Rules Engine Internals

Rules Engine Architecture

Rules Engine 3 comes with a free, open source, Business Process Management (BPM) system, Activiti. It allows for complex business processes to be modelled with visual diagrams, and executed directly from those definitions.

Each rule is deployed to Activiti as a Process. Cantemo uses the Activiti REST interface to communicate with Activiti.
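For troubleshooting, the same REST interface can be queried directly. The sketch below only builds an authenticated request for the list of deployed process definitions. It assumes the repository/process-definitions resource and the tenantId query parameter described in the Activiti REST documentation. The function name is hypothetical, and the credentials shown are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_process_definitions_request(base_url, username, password,
                                      tenant_id=""):
    """Build a GET request for the deployed process definitions.

    base_url is the Activiti URL from Rules Engine 3 Settings and must end
    with a slash. The resource path is taken from the Activiti REST
    documentation and should be verified against the installed version.
    """
    url = urllib.parse.urljoin(base_url, "repository/process-definitions")
    if tenant_id:
        url += "?" + urllib.parse.urlencode({"tenantId": tenant_id})
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Accept", "application/json")
    return request


request = build_process_definitions_request(
    "http://localhost:8008/activiti-rest/service/", "admin", "secret")
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen on the Cantemo server would perform the actual call. It is left out here so that the sketch runs without a live Activiti instance.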

Useful links for more information about Activiti:

Cantemo Polling

Polling is done through Celery by adding Celerybeat tasks that execute every 10 seconds.

Rules are polled based on their Execution priority, and the number of parallel rule processes is limited by the rule's own Maximum concurrency and the global Maximum total concurrency.

This means that if Maximum total concurrency is not set, the rules do not affect each other, and their priority is not relevant.
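The interaction between the two limits can be illustrated with a simplified model. This is a sketch under stated assumptions and not the scheduler Cantemo uses: the function allocate_slots and the rule dictionaries are hypothetical, and a larger priority number is assumed to mean a more important rule.

```python
def allocate_slots(rules, max_total):
    """Distribute processing slots to rules in priority order.

    Each rule is a dict with the keys name, priority, max_concurrency and
    pending (number of items waiting). Returns {rule name: slots granted}.
    """
    remaining = max_total
    allocation = {}
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        # A rule never gets more than its own limit or its pending items.
        wanted = min(rule["pending"], rule["max_concurrency"])
        granted = min(wanted, remaining)
        allocation[rule["name"]] = granted
        remaining -= granted
    return allocation


rules = [
    {"name": "low", "priority": 1, "max_concurrency": 50, "pending": 80},
    {"name": "high", "priority": 2, "max_concurrency": 50, "pending": 80},
]
# With a global limit of 60 the high priority rule is served first.
print(allocate_slots(rules, max_total=60))
# With a limit large enough for both rules, priority makes no difference.
print(allocate_slots(rules, max_total=1000))
```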

Accessing Activiti

Activiti is installed with Rules Engine 3 to a Tomcat instance running on the Cantemo server.

Activiti Explorer is a visual interface for Activiti. The default Cantemo installation allows access to it only from localhost.

To access the GUI from another computer, open an SSH tunnel, e.g. ssh root@<YOUR PORTAL ADDRESS> -L 8008:localhost:8008, and then open http://localhost:8008/activiti-explorer/ on your computer.

By default the username is admin, and password is actiadm1n. The password can be set when installing Cantemo or later from the Activiti Explorer view. If you change the password in Activiti, remember to also change it in Rules Engine Settings.

Access settings for Activiti are set in Tomcat, /etc/tomcat/server.xml.

Activiti Explorer also includes Activiti Modeler, a visual tool for creating and editing processes.

Detailed information about Activiti Explorer can be found in the Activiti User Guide: http://activiti.org/userguide/index.html#activitiExplorer

Useful views for Rules Engine 3 administration:

Processes / Deployed process definitions

This lists all Activiti processes. Cantemo rules are listed as polling_UUID, signal_UUID and process_UUID. With Convert to editable model, the process can be inspected and edited with Activiti Modeler.

Processes / Model workspace

This lists all Activiti process models. Cantemo rules can be converted to editable models from the Deployed process definitions list. Edit opens a model in Activiti Modeler.

Manage / Jobs

This is a list of pending Jobs in Activiti, for execution of rule actions. This view also shows details if a job has failed.

Rule Execution

Rules Engine rules are started in one of the following ways:

  • Polling a Cantemo endpoint, detecting new items that match the search

  • From Vidispine notifications, such as “job finished”

  • Manually from search results, item page, etc.

../../_images/start_process_search.png

Starting a rule manually from search results

  • In the end, all rule action executions are triggered by a Celery polling process

  • For notifications and manual triggering, the event adds items to a waiting list, from which they are picked up by Celery

The actions to take on items are deployed as separate action processes, which are started by the polling and signal processes. One action process is started for each item.

../../_images/diagram_action_process.png

A process with actions.

Every step in such an action process is modelled as a Script Task that executes Python code in Activiti. The script code can be viewed either via Activiti Modeler (after converting the process to an editable model), or from the deployed BPMN 2.0 XML.

All script tasks share a common set of utility functions. These are omitted from the example below.

The script below requests a list of item IDs from the /rulesengine3/savedsearches/VX-42 endpoint.

If new matching items are found, the code uses Activiti functions to start a new process instance of target_key for all items. Also their processing status is updated inside Cantemo. These statuses are visible in Cantemo UI as the “Processing” and “Handled items”-lists in Rule Details View.

Additionally, search polling checks for items that have been handled by this rule but do not match the search anymore. Such items are marked for rehandling, so they are handled again if they match the search later.
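The bookkeeping described above comes down to two set differences. The following sketch models it outside of Cantemo. In the real system this logic runs on the server, which returns the results in the items and not_matching_anymore fields of the polling response. The function poll_once and its arguments are hypothetical.

```python
def poll_once(matching_ids, handled_ids):
    """Compare the current search result with the handled items list.

    Returns (new_items, not_matching_anymore, updated_handled).
    """
    matching = set(matching_ids)
    handled = set(handled_ids)
    # Items in the search that the rule has not handled yet.
    new_items = sorted(matching - handled)
    # Handled items that dropped out of the search: forget them, so that
    # they are treated as new if they match the search again later.
    not_matching_anymore = sorted(handled - matching)
    updated_handled = (handled - set(not_matching_anymore)) | set(new_items)
    return new_items, not_matching_anymore, updated_handled


new_items, gone, handled = poll_once(
    matching_ids=["VX-1", "VX-2", "VX-3"],
    handled_ids=["VX-1", "VX-4"],
)
print(new_items, gone, sorted(handled))
```

Here VX-4 was handled earlier but no longer matches, so it is dropped from the handled list and would be processed again if it matched the search in a later poll.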

These scripts output log information with print. These log messages appear in the Tomcat logs, by default /var/log/tomcat/catalina.out.

Polling Script Task XML definition:

<scriptTask id="sid-06e4f2e5-ca2a-4128-a45d-cfa48da7d33e"
            name="Poll endpoint for changes every 30 seconds, start
                  process_1ba320bb-58ad-4c5f-868d-e9d5fbef5d45 on new items"
            scriptFormat="python"
            activiti:async="false"
            activiti:autoStoreVariables="false">
<script>
<![CDATA[
[... Utility functions omitted ...]

target_key = 'process_1ba320bb-58ad-4c5f-868d-e9d5fbef5d45'
poll_interval = 30
poll_url = '/rulesengine3/savedsearches/VX-42?'

# Script code
# Poll an endpoint and start new process instance for all items received.
try:
    # Get search results from server, also updates ItemStatus on results.
    response = server_post("%s&task=%s" % (poll_url, get_execution_rule_uuid()),
                           {'update_item_status': True})
    items = response['items']
    not_matching_anymore = response['not_matching_anymore']
    log.info('polling_action, items: {}, not_matching_anymore: {}',
             items, not_matching_anymore)

    if items:
        # Start a new action set process instance for all items
        rs = execution.getEngineServices().getRuntimeService()
        for itemId in items:
            new_process = rs.startProcessInstanceByKey(target_key,
                                                       {'itemId': itemId})
            log.info('started process {}', new_process)
except Exception:
    log.error(traceback.format_exc())


# Store local error list variable (potentially with new values) to execution
execution.setVariable("actionErrors", localActionErrors)

# Default action end log
log.info('Finished, itemId: {}, actionErrors: {}, took {} seconds.',
    itemId,
    localActionErrors,
    time.time()-start_time)
</script>
</scriptTask>

When the target process has finished handling an item, this is signaled to Cantemo in the Script Task “mark items handled”. This also notifies Cantemo of any errors that have occurred in the individual actions for that item.

Within Activiti, all rule actions are executed synchronously, one after another. All rule actions must execute within 5 minutes, otherwise Activiti assumes the thread has failed and stops it. For actions which trigger a job, such as transcoding, the rule will wait until all jobs on an item and its shapes are finished before proceeding to the next action. These jobs are allowed to take more than 5 minutes.

Note

If the IP address changes, the Rules Engine Vidispine notifications need to be set up again with /opt/cantemo/portal/manage.py setup_rules_engine_notifications

Starting, stopping and restarting Activiti

To start or stop the Activiti REST service and Explorer, including the task executor used by Rules Engine, enable or disable the Rules Engine 3 application from System Overview / Registered Apps. Please note that disabling it will immediately interrupt any currently running rules.

This can also be done with the following command:

sudo service tomcat [start|stop|restart]