Searching¶
New in Cantemo 2.0
Cantemo ships with the Elasticsearch engine, which drives most search functionality in Cantemo. While the search engine can be accessed directly, the recommended way to search is through the /API/V2/search REST API.
Search API Reference¶
Celery tasks¶
Low level elastic function¶
- portal.search.elastic.acl_filter(username, user_field_name='access_control_user', group_field_name='access_control_group', nested_user_field_name=False, nested_group_field=False)¶
Applies ACL filters before sending a request to Elastic. If the user is part of an Admin group, a match_all ACL filter is applied.
Otherwise, if simple_access_control is set to True, it checks the user and the user's groups against the users and groups fields and the negative users and groups fields. If simple_access_control is set to False, it adds an ACL filter requiring that the user is part of ACL_USER_SEARCH_FIELD_NAME.
If simple_access_control is True, make sure to nest the user/group fields.
- portal.search.elastic.add_advanced_search_filters(search, querydict, escape=None, searchcollections=False, username=None, excludesubcollections=False)¶
- Parameters
search – instance of elastic-dsl Search
querydict –
escape –
searchcollections –
username –
excludesubcollections –
- Returns
- portal.search.elastic.append_metadata_filtering(key, value, range_type, escape, filters, queries, path=None)¶
Appends filters or queries based on one metadata field to the given filters/queries lists.
- Parameters
key – Metadata field name, e.g. ‘portal_mf793831’
value – Value to filter for
range_type – More info about the filtering, understood values: ‘before’, ‘after’, ‘between’, ‘passed’, ‘not_passed’
filters – List to append filters to
queries – List to append queries to
path – A list of the parent groups this field is in
- Returns
None
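As an illustration of how a range_type value could turn into an Elasticsearch range clause, here is a minimal sketch. It is not the Portal implementation: the helper name range_clause_sketch, the mapping of 'before' and 'after' to lte and gte, and the assumption that 'between' receives a (start, end) pair are all hypothetical. 'passed' and 'not_passed' are left out because their semantics are not described here.

```python
def range_clause_sketch(key, value, range_type):
    """Build an Elasticsearch range clause for one field (hypothetical helper)."""
    if range_type == "before":
        bounds = {"lte": value}
    elif range_type == "after":
        bounds = {"gte": value}
    elif range_type == "between":
        start, end = value  # assumption: 'between' takes a (start, end) pair
        bounds = {"gte": start, "lte": end}
    else:
        # 'passed' / 'not_passed' are not covered by this sketch
        raise ValueError("range_type not covered by this sketch: %r" % (range_type,))
    return {"range": {key: bounds}}


# Mirrors the "append to the given list" style described above.
filters = []
filters.append(range_clause_sketch("portal_mf793831", "2016-05-10", "before"))
```

The real function may translate the metadata field name into an Elastic field name before building the clause; the sketch uses the key unchanged.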
- portal.search.elastic.construct_include_exclude_filter(include_ids, exclude_ids, elastic_field_name)¶
Construct Elastic filters based on lists of ids that the field should or should not have.
- Parameters
include_ids – list of ids that should be in the list, or *. Star means that the field should be non-empty.
exclude_ids – list of ids that should not be in the list, or *. Star means that the field should not have values.
elastic_field_name – elastic field name
- Returns
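The include/exclude semantics described above can be sketched with plain Elasticsearch query DSL dictionaries. This is an assumption-laden illustration rather than the actual return value of the function: the helper name, the use of a bool query, and passing the star as the string "*" are all hypothetical choices.

```python
def include_exclude_filter_sketch(include_ids, exclude_ids, elastic_field_name):
    """Return a bool query dict expressing include/exclude rules (hypothetical)."""
    must, must_not = [], []

    if include_ids == "*":
        # Star on the include side: the field must have some value.
        must.append({"exists": {"field": elastic_field_name}})
    elif include_ids:
        must.append({"terms": {elastic_field_name: list(include_ids)}})

    if exclude_ids == "*":
        # Star on the exclude side: the field must not have any value.
        must_not.append({"exists": {"field": elastic_field_name}})
    elif exclude_ids:
        must_not.append({"terms": {elastic_field_name: list(exclude_ids)}})

    return {"bool": {"must": must, "must_not": must_not}}
```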
- portal.search.elastic.create_index(index_type)¶
Create the index
- Parameters
index_type ("item" | "collection" | "file" | "subclip") – The type of the index to create.
- Returns
- portal.search.elastic.date_string_to_iso(date_str)¶
Convert a date_str to .isoformat(), or return it unchanged if it cannot be parsed.
- Parameters
date_str – The date string, e.g. “2016-05-10”, “2016-05-17T16:03:15.903385”, “NOW-5DAYS”
- Returns
Date in .isoformat(), or date_str unchanged if it is not parseable (e.g. “NOW-5DAYS”)
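The parse-or-pass-through behaviour can be sketched with the standard library. This is only an approximation under stated assumptions: the helper name is hypothetical, it relies on datetime.fromisoformat, and the exact output the real function produces for a date-only input is not documented here.

```python
from datetime import datetime


def date_string_to_iso_sketch(date_str):
    """Return date_str in ISO format, or unchanged if it cannot be parsed."""
    try:
        return datetime.fromisoformat(date_str).isoformat()
    except ValueError:
        # Relative expressions such as "NOW-5DAYS" are passed through as-is.
        return date_str
```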
- portal.search.elastic.delete_all_indexes()¶
Delete all indexes
- portal.search.elastic.delete_index(index_type)¶
Delete the index.
- Parameters
index_type ("item" | "collection" | "file" | "subclip") – The type of the index to delete.
- portal.search.elastic.escape_query_string(qs)¶
Escape an elastic search query string.
The following characters are escaped: &|(){}[]^":/
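A minimal sketch of this kind of escaping, assuming each listed character is prefixed with a backslash (the usual convention for query string syntax). The helper name and the constant are hypothetical, and the quote character in the list is assumed to be a straight double quote.

```python
# Characters taken from the list above; the quote is assumed to be a plain ".
SPECIAL_CHARS = '&|(){}[]^":/'


def escape_query_string_sketch(qs):
    """Prefix every special character in qs with a backslash (hypothetical helper)."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in qs)
```

For example, escape_query_string_sketch("a/b") yields the three characters a, backslash, slash followed by b, so the slash is no longer read as the start of a regular expression.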
- portal.search.elastic.fix_query_string(query_string, escape)¶
Normalizes certain words in a query string (for example, “or” becomes “OR”). Escapes special characters if escape is True.
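The operator normalization could look roughly like the following. This is a sketch only: the source names just the “or” to “OR” case, so handling “and” and “not” the same way is an assumption, as is the helper name. The sketch also ignores quoted phrases, which a real implementation would probably leave untouched.

```python
import re

# Whole-word, lowercase operators only; "order" or "android" are not affected.
_OPERATOR_RE = re.compile(r"\b(and|or|not)\b")


def fix_query_string_sketch(query_string):
    """Upper-case boolean operator words in a query string (hypothetical helper)."""
    return _OPERATOR_RE.sub(lambda match: match.group(1).upper(), query_string)
```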
- portal.search.elastic.get_cantemo_max_result_window() int ¶
Get the smallest max_result_window value set in Cantemo search indexes, or the OpenSearch default of 10000 if nothing is defined.
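The selection logic can be sketched over a settings dictionary shaped like the response of the index settings API, where each index name maps to a nested "settings" / "index" section and values arrive as strings. The helper name is hypothetical, and the sketch reads the description literally: indexes without an explicit value are ignored unless no index defines one.

```python
DEFAULT_MAX_RESULT_WINDOW = 10000


def smallest_max_result_window(settings_by_index):
    """Return the smallest explicitly set max_result_window, or the default."""
    values = []
    for index_settings in settings_by_index.values():
        value = (
            index_settings.get("settings", {})
            .get("index", {})
            .get("max_result_window")
        )
        if value is not None:
            values.append(int(value))  # settings values are returned as strings
    return min(values) if values else DEFAULT_MAX_RESULT_WINDOW
```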
- portal.search.elastic.get_default_operator()¶
Get the system default operator, AND or OR.
- portal.search.elastic.get_elastic_cluster_nodes_data()¶
Retrieves a dict with a “nodes” key, which is a list containing information about each node in the Elastic cluster, including details useful for tracking the nodes' status in the UI.
- portal.search.elastic.get_field_mappings(field_name)¶
Get field mappings for a field in Elastic. This can be used to check what information Elastic contains for a field.
Note: This returns mappings for all document types.
Example return value for an unknown field:
{
    u'portal_7': {
        u'mappings': {}
    }
}
For ‘f_portal_mf619153_str’, the Description field:
{
    u'portal_7': {
        u'mappings': {
            u'item': {
                u'f_portal_mf619153_str': {
                    u'full_name': u'f_portal_mf619153_str',
                    u'mapping': {
                        u'f_portal_mf619153_str': {
                            u'fields': {
                                u'analyzed': {
                                    u'copy_to': [u'suggest'],
                                    u'index_analyzer': u'custom_english',
                                    u'search_analyzer': u'custom_english_search',
                                    u'type': u'string'},
                                u'lower': {
                                    u'analyzer': u'case_insensitive_sort',
                                    u'ignore_above': 1024,
                                    u'type': u'string'}},
                            u'ignore_above': 1024,
                            u'index': u'not_analyzed',
                            u'type': u'string'}
                    }
                }
            }
        }
    }
}
Note that the above would include ‘collections’ on the same level as ‘item’ if the Description field were used in a collection object.
- Parameters
field_name – Elastic field name.
- Returns
A mappings dictionary, see https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-field-mapping.html
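To check which document types know about a field, the returned dictionary can be walked as in the sketch below. The helper doc_types_with_field is hypothetical and assumes the index / 'mappings' / document type / field name nesting shown in the example return values above.

```python
def doc_types_with_field(field_mappings, field_name):
    """List the document types whose mappings contain field_name (hypothetical helper)."""
    found = set()
    for index_data in field_mappings.values():
        for doc_type, fields in index_data.get("mappings", {}).items():
            if field_name in fields:
                found.add(doc_type)
    return sorted(found)


unknown = {"portal_7": {"mappings": {}}}
known = {
    "portal_7": {
        "mappings": {
            "item": {"f_portal_mf619153_str": {"full_name": "f_portal_mf619153_str"}}
        }
    }
}
item_types = doc_types_with_field(known, "f_portal_mf619153_str")
```

An empty list signals that Elastic holds no mapping for the field, which matches the unknown-field example.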
- portal.search.elastic.get_index_version(index_type)¶
Returns the current index version for the specified index_type
- Parameters
index_type ("item" | "collection" | "file" | "subclip") – The type of the index.
- portal.search.elastic.get_search_connection(new_connection_pool=False)¶
- Parameters
new_connection_pool – Forces creation of a new connection pool
- Returns
an elastic connection
- Return type
elasticsearch.Elasticsearch
- portal.search.elastic.is_cantemo_index(index: str) bool ¶
Tests if an index is owned by Cantemo Portal.
- portal.search.elastic.is_vidispine_index(index: str) bool ¶
Tests if an index is owned by Vidispine.
- portal.search.elastic.metadata_filter(metadata_dict, escape=False, path=None)¶
This constructs a filter from what is entered in the metadata form.
- portal.search.elastic.mget_elastic_by_ids(ids, doc_type, fields=None)¶
Run a multi-get query against Elastic.
- Parameters
ids –
doc_type –
fields –
- Returns
- portal.search.elastic.parse_string_to_bool(key)¶
Checks if the given value can be returned as a boolean. We can use this to beautify boolean objects in the response data.
- Parameters
key – Value to check, e.g. ‘true’
- Returns
Boolean if value should be represented as a boolean value, else the initial value is returned
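A small sketch of the described contract: boolean-looking strings become booleans and everything else is returned untouched. The helper name is hypothetical, and accepting 'false' as well as case and whitespace variations is an assumption beyond the ‘true’ example given above.

```python
def parse_string_to_bool_sketch(key):
    """Return True/False for boolean-looking strings, else the original value."""
    if isinstance(key, str):
        lowered = key.strip().lower()
        if lowered == "true":
            return True
        if lowered == "false":
            return False
    # Anything else is returned as-is, as the Returns description states.
    return key
```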
- portal.search.elastic.postprocess_search(elastic_hits, user)¶
Given an Elastic search result, fetch all referenced items and collections from Vidispine and return them. SubClips are also parsed out from the item metadata and returned inline in the item list.
- portal.search.elastic.qdr_filter(qdr)¶
Adds a filter for the create date
- portal.search.elastic.query_elastic(query, first=0, number=25, doc_type=None, fields=None, **kwargs)¶
This performs a raw query to Elastic.
- Parameters
query – The elastic search document
first – Starting offset
number – Number of results to return
doc_type – Document types to search for, default is [‘item’, ‘subclip’, ‘collection’]
fields – Elastic fields to fetch, default is None meaning all fields are returned
- Returns
The JSON structure returned from Elastic
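To make the paging arguments concrete, the sketch below shows how first, number and fields might map onto a standard search request body ("from", "size" and "_source"). It does not call Elastic, the helper name is hypothetical, and whether query_elastic builds its request this way is an assumption. The field name title in the example query is also made up.

```python
def build_search_request_sketch(query, first=0, number=25, fields=None):
    """Combine a query document with paging and field selection (hypothetical)."""
    body = dict(query)      # shallow copy so the caller's dict is not modified
    body["from"] = first    # starting offset
    body["size"] = number   # number of hits to return
    if fields is not None:
        body["_source"] = list(fields)
    return body


query = {"query": {"query_string": {"query": "title:sunset", "default_operator": "AND"}}}
# Second page of 25 results.
page_two = build_search_request_sketch(query, first=25, number=25)
```

Keeping first + number below the value reported by get_cantemo_max_result_window() avoids requesting results beyond the index's result window.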
- portal.search.elastic.recreate_all_indexes_if_needed(force=False)¶
Recreate all Portal Elastic indexes if they don’t exist or if they have the wrong version.
- Parameters
force – Force recreation even if not needed
- Returns
A list of indexes which were recreated
- Return type
list(str)
- portal.search.elastic.recreate_index_if_needed(index_type, force=False)¶
Recreate the Elastic index of the given type if needed.
- Parameters
index_type ("item" | "collection" | "file" | "subclip") – The type of the index to recreate.
force – Force recreation of the index
- Returns
True if the index was recreated, False otherwise
- portal.search.elastic.refresh_index(index: str, allow_no_indices: bool = None)¶
Force immediate refresh of index.
- portal.search.elastic.refresh_vidispine_index()¶
Force immediate refresh of Vidispine index.
- portal.search.elastic.remove_cluster_nodes_readonly()¶
Removes the read_only setting on the Elastic cluster nodes, which is set automatically when the remaining disk space is critically low.
This is useful after disk space has been freed, so that the nodes can be written to again.
- portal.search.elastic.search_criteria_filter(search_criteria, escape, searchcollections=False, excludesubcollections=False)¶
- Parameters
search_criteria – dictionary with search criteria
escape –
searchcollections –
excludesubcollections –
- Returns
- portal.search.elastic.test_read_only_and_nodes_space(is_for_healthcheck_page)¶
Checks whether any indexes have the read_only setting, and returns whether that is the case together with some messages.
- Parameters
is_for_healthcheck_page – a boolean which should be True when used from the health page view, so it retrieves the proper messages and links to fix the indexes with the read_only setting.
- Returns
A tuple of the form (a boolean specifying if read_only is present on any index, a list of messages with information about the status of the Elastic cluster nodes).
- portal.search.elastic.user_has_access_to_document(username, document)¶
- If simple_access_control is set to True:
Returns True if the user is part of an admin group or is in the list of users with access (ACL_USER_SEARCH_FIELD_NAME).
- If simple_access_control is set to False:
Returns True if the user is in ACL_USER_SEARCH_FIELD_NAME and not in NEGATIVE_ACL_USER_SEARCH_FIELD_NAME, or if one of the user's groups is in ACL_GROUP_SEARCH_FIELD_NAME and not in NEGATIVE_ACL_GROUP_SEARCH_FIELD_NAME.
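The second rule (positive and negative user and group lists) can be written out as a small predicate. This is a sketch under assumptions: the document keys below stand in for the ACL_* field name constants, whose actual values are not given here, and the caller supplies the user's groups explicitly instead of having them looked up from the username.

```python
def has_access_sketch(username, groups, document):
    """Evaluate positive/negative ACL lists on a document dict (hypothetical)."""
    # Placeholder keys; the real field names come from the ACL_* constants.
    allowed_users = set(document.get("acl_users", []))
    denied_users = set(document.get("negative_acl_users", []))
    allowed_groups = set(document.get("acl_groups", []))
    denied_groups = set(document.get("negative_acl_groups", []))

    user_ok = username in allowed_users and username not in denied_users
    group_ok = any(
        group in allowed_groups and group not in denied_groups for group in groups
    )
    return user_ok or group_ok


doc = {
    "acl_users": ["alice"],
    "negative_acl_users": ["bob"],
    "acl_groups": ["editors", "interns"],
    "negative_acl_groups": ["interns"],
}
```

With this document, alice has access through the user list, a member of editors has access through the group list, and a user whose only group is interns does not.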
- portal.search.elastic.validate_query(query)¶
Validates a query against Elastic without performing it.
- Parameters
query –
- Returns
True if the query is valid