Workflow - Overview

Jobs & Tasks

The PI can define jobs that are executed periodically at a fixed time. To do so, the PI sets the frequency and the interval between executions. The system supports hourly, daily, and weekly jobs; if a job should run every second hour, the PI sets the frequency to hourly and the interval to two. Each job has several TaskManagers, each dedicated to handling one kind of task. For example, a job could have two task managers: the first imports the metadata (e.g., instrument data) and the second the series. To resolve dependencies, the PI can order the jobs and the managers. During the execution of a job, the task manager first collects all tasks and stores them in a queue; afterwards, the tasks are executed by the managers.
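
The scheduling model can be pictured with a minimal Python sketch (the names Job, Frequency, and next_run are illustrative assumptions, not the actual implementation):

    from dataclasses import dataclass
    from datetime import datetime, timedelta
    from enum import Enum

    class Frequency(Enum):
        HOURLY = timedelta(hours=1)
        DAILY = timedelta(days=1)
        WEEKLY = timedelta(weeks=1)

    @dataclass
    class Job:
        name: str
        frequency: Frequency
        interval: int   # interval=2 with HOURLY means "every second hour"
        order: int      # position used to resolve dependencies between jobs

        def next_run(self, last_run: datetime) -> datetime:
            # next execution time = last execution + frequency * interval
            return last_run + self.frequency.value * self.interval

    job = Job("import-metadata", Frequency.HOURLY, interval=2, order=1)
    print(job.next_run(datetime(2024, 1, 1)))  # -> 2024-01-01 02:00:00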

Execution of a task

[Figure: execution of a task (../_images/tasks.svg)]
  1. Request data that should be imported (TransferHandler)

  2. Read the data and parse it into an interpretable structure (Reader); then import and process the data and perform some basic tests for flagging (Importer)

  3. Perform advanced tests (QA/QC)

  4. Inform the PI, who checks the plausibility of the data; the PI can change the evaluation method and rerun the data processing

  5. Prepare the export data (Exporter) and write the data to a specific format (Writer)

  6. Export the data to an external system (TransferHandler)
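
Put together, a task passes through the components in this order (a hypothetical Python sketch; the objects and method names are assumptions derived from the steps above, not the real API):

    def execute_task(task, transfer, reader, importer, qaqc,
                     notify_pi, exporter, writer):
        raw = transfer.fetch(task.source)           # 1. request the external data
        parsed = reader.read(raw)                   # 2. parse into an interpretable structure
        records = importer.run(parsed)              # 2. import, process, basic flag tests
        flagged = qaqc.run_advanced_tests(records)  # 3. advanced tests
        notify_pi(task, flagged)                    # 4. PI checks plausibility, may rerun
        prepared = exporter.prepare(flagged)        # 5. prepare the export data
        payload = writer.write(prepared)            # 5. write into the target format
        transfer.send(task.target, payload)         # 6. export to the external system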

TransferHandler

The TransferHandler handles the communication between the application and external systems. It is used by the ImportManager and the ExportManager. The handler combines three components: TransferType, ConnectionHandler, and DataProvider. TransferType defines the transfer protocol (e.g., REST); it is an interface that can be extended with new protocols if needed. The ConnectionHandler handles the authentication and the connection to the external systems. Because the authentication method often differs between systems for security reasons, this component provides an interface for adding new authentication methods. In many use cases, only the authentication method differs: to obtain an authentication token, for example, you can use OAuth2 or connect directly with your credentials, but once the token is obtained, the REST requests are the same. The protocol and the authentication are therefore split into two components that can be combined individually. Finally, the DataProvider component records only some information about the external data provider.
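
The composition of the three components might look roughly like this (an illustrative Python sketch with assumed names; the real interfaces may differ):

    from abc import ABC, abstractmethod

    class TransferType(ABC):                 # protocol interface, e.g. REST
        @abstractmethod
        def request(self, headers: dict, url: str) -> bytes: ...

    class ConnectionHandler(ABC):            # authentication interface
        @abstractmethod
        def auth_headers(self) -> dict: ...

    class OAuth2Handler(ConnectionHandler):
        def auth_headers(self) -> dict:
            token = "example-token"          # placeholder, no real OAuth2 flow
            return {"Authorization": f"Bearer {token}"}

    class BasicAuthHandler(ConnectionHandler):
        def auth_headers(self) -> dict:
            return {"Authorization": "Basic dXNlcjpwYXNz"}  # "user:pass"

    class TransferHandler:
        """Combine protocol, authentication, and provider metadata."""
        def __init__(self, transfer_type: TransferType,
                     connection: ConnectionHandler, provider: dict):
            self.transfer_type = transfer_type   # how to talk (REST, ...)
            self.connection = connection         # how to authenticate
            self.provider = provider             # DataProvider: descriptive info only

        def fetch(self, url: str) -> bytes:
            # the request itself is identical regardless of the auth method
            return self.transfer_type.request(self.connection.auth_headers(), url)

Because the protocol and the authentication are separate objects, an OAuth2 handler and a credentials-based handler can be paired with the same REST transfer type.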

ImportManager

The ImportManager combines the components Reader and Importer. The Reader reads the data and parses it into a standardized format that the Importer can interpret. The Importer imports and processes the parsed data. Since the Importer expects the standardized format, it does not need to know the original format; this allows you to implement different Readers for different formats and use the same Importer. If you import already processed or final data, you can skip the data-processing part. Within the IAGOS project, the raw data is read and then processed. The data processing is part of the Importer because in some use cases the data needs to be synchronized; it is recommended to synchronize the data first and then import only the synchronized data.
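
The separation can be sketched as follows (a minimal Python sketch with assumed names: one Reader for a concrete format plus a format-agnostic Importer):

    import csv
    import io
    from typing import Iterable

    class CsvReader:
        """One possible Reader: parses CSV into the standardized record format."""
        def read(self, raw: bytes) -> list[dict]:
            return list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))

    class Importer:
        """Consumes standardized records; never sees the original file format."""
        def run(self, records: Iterable[dict], process: bool = True) -> list[dict]:
            records = list(records)
            if process:  # skip for already-processed or final data
                records = self._synchronize(records)
            return records

        def _synchronize(self, records: list[dict]) -> list[dict]:
            # placeholder for synchronizing raw series onto a common time axis
            return sorted(records, key=lambda r: r.get("time", ""))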

QA/QC Algorithms

See also

The system uses the QA/QC framework Autom8QC: https://autom8qc.readthedocs.io/en/latest/index.html

ExportManager

The ExportManager is the last stage of the task execution and combines the components Exporter and Writer. The Exporter accesses the data in the database and prepares it for export (e.g., by resampling). The Writer writes the prepared data in a specific format (e.g., netCDF).
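
A minimal sketch of these two steps, assuming the data is handled with pandas and xarray (the actual Exporter and Writer components may be built differently):

    import pandas as pd
    import xarray as xr

    def prepare(df: pd.DataFrame, freq: str = "1h") -> pd.DataFrame:
        """Exporter step: e.g. resample the time series for the export."""
        return df.set_index("time").resample(freq).mean()

    def write_netcdf(df: pd.DataFrame, path: str) -> None:
        """Writer step: serialize the prepared data to netCDF."""
        xr.Dataset.from_dataframe(df).to_netcdf(path)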