Azul Metastore
Azul Metastore enables storage and retrieval of binary files and plugin execution results
- Store plugin results + info via ingestors
- Deletion of old data via age-off
- Retry of failed execution
- Manual data deletion
- Expose functionality via REST API using azul-restapi-server
The REST API supplies endpoints for:
- binary submission (malpz and cart support)
- download of files (selectable neutering format)
- download of other plugin artifacts, reports, etc.
- query whether content (still) exists for a binary
- re-enqueuing processing
- metadata interaction / search
Usage
Most functionality is available via the command line script.
$ azul-metastore --help
Usage: azul-metastore [OPTIONS] COMMAND [ARGS]...
  Entrypoint to the program.
Options:
  --help  Show this message and exit.
Commands:
  age-off                  Delete expired indices.
  force-update-templates   Force opensearch templates to be added.
  ingest-binary            Ingest binary events from dispatcher.
  ingest-plugin            Ingest plugin events from dispatcher.
  ingest-status            Ingest status events from dispatcher.
  process-lost-tasks       Retry failed processing tasks from dispatcher.
  purge                    Purge metadata and data.
  apply-opensearch-config  Create roles in Opensearch that are required by Azul to function.
Commands in depth
age-off
Deletes expired indices from OpenSearch, based on the source configuration. This minimises the amount of data held in OpenSearch by removing data that is no longer required.
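As a rough illustration of the age-off idea, and not the metastore's actual implementation, the sketch below selects indices whose newest data is older than a per-source retention period. The expired_indices helper, the index names and the shape of the retention mapping are all assumptions made for this example.

```python
from datetime import date, timedelta


def expired_indices(indices, retention_days, today):
    """Return the names of indices that have aged off.

    indices: mapping of index name -> (source, date of newest data in the index)
    retention_days: mapping of source -> days to keep (hypothetical config shape)
    """
    expired = []
    for name, (source, newest) in indices.items():
        keep = retention_days.get(source)
        if keep is None:
            continue  # no expiry configured for this source, keep forever
        if newest < today - timedelta(days=keep):
            expired.append(name)
    return sorted(expired)


# Illustrative data only; real index names and sources will differ.
example_indices = {
    "samples-2023-01": ("samples", date(2023, 1, 31)),
    "samples-2024-05": ("samples", date(2024, 5, 31)),
    "reference-2020-01": ("reference", date(2020, 1, 31)),
}
to_delete = expired_indices(example_indices, {"samples": 365}, date(2024, 6, 15))
```

Sources without a configured retention period are never selected in this sketch, which is a deliberately conservative choice for a destructive operation.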
force-update-templates
Updates the OpenSearch templates for the current OpenSearch indices. This is useful when a significant change has been made to the OpenSearch model and you need to increment the index prefix: the command creates the new templates to be used by the new indices.
ingest-binary
Runs as a pod in Azul and queries Kafka, through dispatcher, for binary topics that have new data. That data is transformed into OpenSearch documents, which are then inserted into OpenSearch.
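The transform-then-insert step can be sketched as follows. This is a minimal example, not the ingestor's real code: the event fields (sha256, source, timestamp), the index name and the to_bulk_body helper are hypothetical. The output follows the newline-delimited format of the OpenSearch bulk API, where each document is preceded by an action line.

```python
import json


def to_bulk_body(events, index):
    """Turn a batch of events into a newline-delimited bulk request body."""
    lines = []
    for event in events:
        # Hypothetical document shape; the real mapping is defined by metastore.
        doc = {
            "sha256": event["sha256"],
            "source": event["source"],
            "timestamp": event["timestamp"],
        }
        # Action line: index the document under a deterministic id.
        lines.append(json.dumps({"index": {"_index": index, "_id": event["sha256"]}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


example_events = [
    {"sha256": "abc", "source": "demo", "timestamp": "2024-01-01T00:00:00Z"},
]
body = to_bulk_body(example_events, "example-binary")
```

Using the hash as the document id means that re-ingesting the same event overwrites the existing document instead of creating a duplicate; whether metastore does this is not stated here.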
ingest-plugin
Same as the binary ingestor, but for plugin events.
ingest-status
Same as the binary ingestor, but for status events.
process-lost-tasks
Runs as a pod in Azul and looks for events that were dequeued but have no associated completion event. When such an event is found, a message is sent to dispatcher to retry it.
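The detection logic amounts to a set difference between dequeued and completed tasks. The sketch below shows that idea under stated assumptions: the find_lost_tasks helper, the task identifiers and the grace period (used so that tasks still in flight are not retried too early) are illustrative and not taken from the metastore source.

```python
from datetime import datetime, timedelta


def find_lost_tasks(dequeued, completed, now, timeout):
    """Return ids of tasks that were dequeued but never completed.

    dequeued: mapping of task id -> time the task was dequeued
    completed: set of task ids that have a completion event
    timeout: grace period before a task counts as lost (assumed parameter)
    """
    return sorted(
        task_id
        for task_id, started in dequeued.items()
        if task_id not in completed and now - started > timeout
    )


now = datetime(2024, 1, 1, 12, 0)
dequeued = {
    "task-a": datetime(2024, 1, 1, 11, 0),   # old, never completed
    "task-b": datetime(2024, 1, 1, 11, 0),   # old, completed
    "task-c": datetime(2024, 1, 1, 11, 55),  # recent, may still be running
}
lost = find_lost_tasks(dequeued, {"task-b"}, now, timedelta(minutes=30))
```

Each id in the result would then be sent back to dispatcher for a retry.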
purge
Removes all metadata and binary data about a particular hash from Azul. It does this by deleting the data from S3 and OpenSearch, through dispatcher and metastore.
apply-opensearch-config
Creates the roles in OpenSearch associated with the current security configuration, along with the necessary default roles. To modify the roles this command creates, update the security labels.
There is also a REST API component, which can only be used via the azul-restapi-server project.
Configuration
Configuration is controlled through environment variables. See azul_metastore/settings.py for more info.
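For readers unfamiliar with the pattern, here is a minimal sketch of environment-driven settings. It is not a copy of azul_metastore/settings.py: the ExampleSettings class, its fields, its default values and the EXAMPLE_METASTORE_ prefix are all hypothetical, so consult the real settings module for the actual variable names.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExampleSettings:
    # Hypothetical fields and defaults, for illustration only.
    opensearch_url: str = "http://localhost:9200"
    index_prefix: str = "example"


def load_settings(environ=None, prefix="EXAMPLE_METASTORE_"):
    """Build settings from environment variables, falling back to defaults."""
    environ = os.environ if environ is None else environ
    overrides = {}
    for field in fields(ExampleSettings):
        # e.g. index_prefix is read from EXAMPLE_METASTORE_INDEX_PREFIX
        value = environ.get(prefix + field.name.upper())
        if value is not None:
            overrides[field.name] = value
    return ExampleSettings(**overrides)


settings = load_settings({"EXAMPLE_METASTORE_INDEX_PREFIX": "dev"})
```

Passing the environment in as a mapping keeps the loader easy to test without touching the real process environment.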
Library usage
The Azul team do not recommend using the metastore as a library, as there is no guarantee of the stability of any public functions.
Testing
Running Unit Tests
Run unit tests via pytest tests/unit
Running Integration Tests
To set up a local instance of OpenSearch, see demo-cluster/readme.md.
Run all tests via pytest tests.
Project Structure
common/
classes and utilities shared between other parts of the project
encoders/
handle conversion between metastore searchable format and dispatcher message format
query/
handle the querying of data from OpenSearch
restapi/
expose azul-metastore functions via REST API
scripts/
Assorted scripts to assist different kinds of development. Not intended for use in production systems.
Running the REST API locally
To run metastore's REST API locally, install azul-restapi-server and a development version of metastore.
Refer to azul-restapi-server for how to start up the server locally.