*******************
Workflow - Overview
*******************

Jobs & Tasks
============

The PI can define jobs that are executed periodically at a fixed time. To do
so, the PI sets the frequency and the interval between executions. The system
supports hourly, daily, and weekly jobs. If a job should run every second
hour, the PI sets the frequency to hourly and the interval to two.

Each job has several **TaskManagers**. A manager is dedicated to handling one
kind of task. For example, a job could have two task managers: the first
manager imports the metadata *(e.g., instruments data)*, and the second
imports the series. To resolve dependencies, the PI can order the jobs and
the managers. During the execution of a job, the task manager first collects
all tasks and stores them in a queue. Afterwards, the managers execute the
queued tasks.

Execution of a task
-------------------

.. figure:: ../graphics/tasks.svg
   :width: 100%

1. Request the data that should be imported (**TransferHandler**)
2. Read the data and parse it into an interpretable structure (**Reader**)
3. After reading, import and process the data and perform some basic tests
   for flagging (**Importer**)
4. Perform advanced tests
5. Inform the PI, who checks the plausibility of the data and can change the
   evaluation method and rerun the data processing
6. Prepare the export data (**Exporter**) and write the data to a specific
   format (**Writer**)
7. Export the data to an external system (**TransferHandler**)

TransferHandler
===============

The **TransferHandler** handles the communication between the application and
external systems. It is used by the **ImportManager** and the
**ExportManager**. The handler combines the three components **TransferType**,
**ConnectionHandler**, and **DataProvider**.

The component **TransferType** defines the transfer protocol *(e.g., REST)*.
Note that the transfer type is an interface that can be extended with new
protocols if needed.

The **ConnectionHandler** handles the authentication and the connection to
the external systems. Note that the authentication method often differs
between systems for security purposes. Therefore the component provides an
interface that allows adding new authentication methods. In many use cases,
only the authentication method differs. Assume you have to request a token
for authentication; you can use *OAuth2* or connect directly with your
credentials. But after the token is requested, the REST requests are the
same. Thus the transfer type and the authentication are split into two
components that can be combined individually.

Finally, there is the component **DataProvider**. It only records some
information about the external data provider.

ImportManager
=============

The **ImportManager** combines the components **Reader** and **Importer**.
The **Reader** reads the data and parses it into a standardized format that
the **Importer** can interpret. The **Importer** imports and processes the
parsed data. Since the importer expects the standardized format, it doesn't
have to know the source format. This allows you to implement different
**Readers** for different formats and to use the same importer. If you import
already processed or final data, you can skip the data processing part.

Within the **IAGOS** project, the raw data is read and then processed. The
data processing is part of the **Importer** because in some use cases the
data needs to be synchronized. It is recommended to synchronize the data
first and then to import only the synchronized data. A minimal sketch of the
Reader/Importer split is shown below.
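The following Python sketch illustrates the idea under stated assumptions:
all class and method names here are illustrative, not the project's actual
API. A ``Reader`` parses a raw payload (which in the real system would come
from the **TransferHandler**) into a standardized structure, and the
``Importer`` consumes that structure without knowing the source format.

.. code-block:: python

   # Illustrative sketch only -- names and signatures are assumptions,
   # not the actual API of the workflow system.
   from abc import ABC, abstractmethod
   from dataclasses import dataclass


   @dataclass
   class Record:
       """Standardized structure the Importer understands."""
       timestamp: str
       value: float


   class Reader(ABC):
       """Parses raw payloads into the standardized format."""

       @abstractmethod
       def read(self, payload: str) -> list[Record]: ...


   class CsvReader(Reader):
       """One possible Reader; a JSON or XML Reader could be swapped in."""

       def read(self, payload: str) -> list[Record]:
           records = []
           for line in payload.strip().splitlines():
               timestamp, value = line.split(",")
               records.append(Record(timestamp, float(value)))
           return records


   class Importer:
       """Consumes the standardized format; it never sees the source format."""

       def run(self, records: list[Record]) -> None:
           for record in records:
               # Basic tests for flagging would happen here; this sketch
               # only flags obviously invalid values.
               flag = "suspect" if record.value < 0 else "ok"
               print(f"{record.timestamp}: {record.value} [{flag}]")


   # In the real system the payload would be fetched by the TransferHandler.
   payload = "2024-01-01T00:00,1.5\n2024-01-01T01:00,-2.0"
   Importer().run(CsvReader().read(payload))

Because the ``Importer`` only depends on the standardized ``Record``
structure, adding support for a new source format means adding a new
``Reader`` implementation and nothing else.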
QA/QC Algorithms
================

.. seealso::

   The system uses the QA/QC framework **Autom8QC**. See also:
   https://autom8qc.readthedocs.io/en/latest/index.html

ExportManager
=============

The **ExportManager** is the last instance of the task execution and combines
the components **Exporter** and **Writer**. The **Exporter** accesses the
data in the database and prepares it for the export *(e.g., resamples the
data)*. The **Writer** writes the prepared data into a specific format
*(e.g., netCDF)*.
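As a rough illustration of this split (again with assumed names, not the
actual API), an ``Exporter`` could resample the data with pandas before
handing it to a ``Writer`` that serializes it, e.g. to netCDF via xarray:

.. code-block:: python

   # Illustrative sketch only -- names are assumptions, not the actual API.
   import pandas as pd
   import xarray as xr


   class Exporter:
       """Prepares data for the export (here: hourly resampling)."""

       def prepare(self, df: pd.DataFrame) -> pd.DataFrame:
           return df.resample("1h").mean()


   class NetCDFWriter:
       """Writes the prepared data into a specific format (here: netCDF)."""

       def write(self, df: pd.DataFrame, path: str) -> None:
           xr.Dataset.from_dataframe(df).to_netcdf(path)


   # Minute-resolution dummy data standing in for the database contents.
   index = pd.date_range("2024-01-01", periods=120, freq="1min", name="time")
   df = pd.DataFrame({"value": range(120)}, index=index)

   prepared = Exporter().prepare(df)
   NetCDFWriter().write(prepared, "export.nc")

Keeping the preparation and the serialization in separate components mirrors
the Reader/Importer split on the import side: a different output format only
requires a different **Writer**, while the **Exporter** stays unchanged.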