Import your Data
The ImportManager combines two components: the Reader and the Importer. The Reader reads the data and parses it into a standardized format that the Importer can interpret. The Importer then imports and processes the parsed data. Because the Importer only ever sees the standardized format, it does not need to know the original format. This lets you implement a separate Reader for each format while reusing the same Importer. If you import processed or final data, you can skip the data processing step. Within the IAGOS project, raw data is read first and then processed. Data processing is part of the Importer because in some use cases the data needs to be synchronized. We recommend synchronizing the data first and then importing only the synchronized data.
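The separation described above can be illustrated with a minimal, self-contained sketch. All class names here (SketchReader, CsvSketchReader, JsonSketchReader, SketchImporter) and the record layout are hypothetical and are not part of the IAGOS code base; the point is only that two readers for different formats can feed the same importer.

```python
from abc import ABC, abstractmethod
import csv
import io
import json


class SketchReader(ABC):
    """Hypothetical stand-in for a reader; not the real BaseReader."""

    def __init__(self):
        # Assumed standardized format: a list of (parameter, value) tuples.
        self.records = []

    @abstractmethod
    def read(self, source):
        """Parse the source text into self.records."""


class CsvSketchReader(SketchReader):
    def read(self, source):
        rows = csv.DictReader(io.StringIO(source))
        self.records = [(row["parameter"], float(row["value"])) for row in rows]


class JsonSketchReader(SketchReader):
    def read(self, source):
        items = json.loads(source)
        self.records = [(item["parameter"], float(item["value"])) for item in items]


class SketchImporter:
    """Consumes only the standardized records, never the raw format."""

    def run(self, reader):
        return {parameter: value for parameter, value in reader.records}


csv_reader = CsvSketchReader()
csv_reader.read("parameter,value\nH2O,12.5\n")

json_reader = JsonSketchReader()
json_reader.read('[{"parameter": "H2O", "value": 12.5}]')

importer = SketchImporter()
csv_result = importer.run(csv_reader)
json_result = importer.run(json_reader)
```

Both readers produce the same records, so the importer returns the same result for either input.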
Reader
Interface
- class IAGOS.apps.workflow.imports.readers.base.BaseReader
Bases: abc.ABC
Every reader has to inherit from this abstract base class. This approach ensures that new readers can be integrated into the workflow without modifying the existing code. Each reader has to provide its metadata (NAME, DESCRIPTION) and has to implement the abstract methods read and read_list. These methods read the data and parse it into a standardized format that the importer can interpret. In addition, each reader has to provide a name that describes the source (e.g., filename or flight ID).
Warning
If you add attributes in an inherited class, make sure that you reset them in the method read. Also make sure that you call the super-constructor.
- Parameters
NAME (string) – Name of the reader
DESCRIPTION (string) – Description of the reader
- IAGOS.apps.workflow.imports.readers.base.BaseReader.read(self, source)
Reads the data and parses it into a standardized structure.
- Parameters
source (string) – Source of the data
- Returns
None
- Return type
None
- IAGOS.apps.workflow.imports.readers.base.BaseReader.read_list(source, additional_source, additional_info)
Parses the given source and prepares it for the TransferHandler. The method is needed since the TransferHandler can’t interpret the responses from external servers. For example, if the TransferHandler checks for new flights (e.g., http://example.com/all-flights/?timestamp=2020-01-01), it will receive a list with the flights that were performed after the given timestamp. These flight IDs will be used to create the requests for the server (e.g., http://example.com/flight/data/?param={id}). In that case, the method will return a list with all created requests for each flight.
Important
Some TransferTypes don’t need this functionality (e.g., DirectoryTransfer). In that case, the method returns just the source.
- Parameters
source (string) – Source of the data
additional_source (string) – Additional source to create requests
additional_info (string) – Additional information for parsing
- Returns
Sources
- Return type
List(string)
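As a rough sketch of the behaviour described for read_list, the functions below turn a list of flight IDs into one request per flight. The function names, the assumption that the server response contains one flight ID per line, and the "{id}" placeholder handling are all illustrative; the real response format is not specified here. The pass-through variant wraps the source in a list only to match the documented return type, which is also an assumption.

```python
def build_flight_requests(source, additional_source, additional_info):
    """Hypothetical read_list body.

    source            -- server response, assumed to hold one flight ID per line
    additional_source -- URL template that contains the placeholder "{id}"
    additional_info   -- unused in this sketch
    """
    flight_ids = [line.strip() for line in source.splitlines() if line.strip()]
    return [additional_source.format(id=flight_id) for flight_id in flight_ids]


def passthrough_read_list(source, additional_source, additional_info):
    """Variant for transfer types that need no request building."""
    return [source]


requests = build_flight_requests(
    "F001\nF002\n",
    "http://example.com/flight/data/?param={id}",
    "",
)
unchanged = passthrough_read_list("/data/incoming", "", "")
```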
- IAGOS.apps.workflow.imports.readers.base.BaseReader.get_source_name(self)
Returns an additional name (e.g., the filename of the dataset) that describes the source. The SeriesImporter appends this name to the series name so that series of the same parameter do not all end up with the same name.
- Returns
Name
- Return type
string
Template
from IAGOS.apps.workflow.imports.readers.base import BaseReader


class ExampleReader(BaseReader):
    """
    Here you can describe the functionality of your reader!
    """
    NAME = "Example"
    DESCRIPTION = "Example for demonstrations"

    def __init__(self):
        super().__init__()

    @staticmethod
    def read_list(source, additional_source, additional_info):
        pass

    def read(self, source):
        pass

    def get_source_name(self):
        pass
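The following sketch shows one way the template could be filled in, with emphasis on the warning above: attributes added by the subclass are reset at the start of read. To keep the example runnable without the IAGOS package, BaseReaderStandIn is a hypothetical replacement for BaseReader, and the attributes values and source_name as well as the one-value-per-line file layout are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
import os
import tempfile


class BaseReaderStandIn(ABC):
    """Hypothetical stand-in for BaseReader so the sketch runs on its own."""
    NAME = None
    DESCRIPTION = None

    def __init__(self):
        pass

    @staticmethod
    @abstractmethod
    def read_list(source, additional_source, additional_info):
        ...

    @abstractmethod
    def read(self, source):
        ...

    @abstractmethod
    def get_source_name(self):
        ...


class LineReader(BaseReaderStandIn):
    NAME = "Lines"
    DESCRIPTION = "Reads one numeric value per line (illustrative only)"

    def __init__(self):
        super().__init__()  # always call the super-constructor
        self.values = []
        self.source_name = ""

    @staticmethod
    def read_list(source, additional_source, additional_info):
        return [source]

    def read(self, source):
        # Reset subclass attributes so data from an earlier call cannot leak.
        self.values = []
        self.source_name = os.path.basename(source)
        with open(source) as handle:
            for line in handle:
                if line.strip():
                    self.values.append(float(line))

    def get_source_name(self):
        return self.source_name


with tempfile.TemporaryDirectory() as folder:
    first = os.path.join(folder, "first.txt")
    second = os.path.join(folder, "second.txt")
    with open(first, "w") as handle:
        handle.write("1.0\n2.0\n")
    with open(second, "w") as handle:
        handle.write("3.0\n")

    reader = LineReader()
    reader.read(first)
    reader.read(second)  # reuses the same instance
    values_after_second_read = list(reader.values)
    name_after_second_read = reader.get_source_name()
```

Without the reset in read, the second call would return the values of both files.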
Example & Registration
To enable a smooth start, we have prepared simple examples. We have prepared a reader to read the metadata of the ICH unit (instrument, deployment, calibration) from an Excel file, ich_metadata.xls, and a reader to read a measured time series from a netCDF file, H2O2018061616204002.nc. With the following steps, you can register them.
Download the Python module demo.py
Move the module to the folder application/IAGOS/apps/workflow/imports/readers
Open the module application/IAGOS/apps/workflow/imports/readers/all_readers.py
Import the readers
from IAGOS.apps.workflow.imports.readers.demo import ICHMetaDataReader
from IAGOS.apps.workflow.imports.readers.demo import ICHSeriesReader
Add your reader to the dictionary
READERS = {
    ICHMetaDataReader.NAME: ICHMetaDataReader,
    ICHSeriesReader.NAME: ICHSeriesReader,
}
Register them by executing the following command in the project directory
make update-data
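Because the dictionary is keyed by NAME, two readers with the same NAME would silently overwrite each other in a dictionary literal. The sketch below shows one possible guard against that. The helper build_registry and the two stub classes with their NAME values are hypothetical and only stand in for real readers; how the workflow itself consumes the dictionary is not covered here.

```python
class FirstStubReader:
    """Hypothetical stub; only the NAME attribute matters for this sketch."""
    NAME = "First"


class SecondStubReader:
    """Hypothetical stub; only the NAME attribute matters for this sketch."""
    NAME = "Second"


def build_registry(*classes):
    """Build a NAME -> class dictionary and reject duplicate names."""
    registry = {}
    for cls in classes:
        if cls.NAME in registry:
            raise ValueError(
                f"Duplicate NAME {cls.NAME!r}: "
                f"{registry[cls.NAME].__name__} and {cls.__name__}"
            )
        registry[cls.NAME] = cls
    return registry


READERS = build_registry(FirstStubReader, SecondStubReader)
```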
Importer
Interface
- class IAGOS.apps.workflow.imports.importer.base.BaseImporter
Bases: abc.ABC
Every importer has to inherit from this abstract base class. This approach ensures that new importers can be integrated into the workflow without modifying the existing code. Each importer has to provide its metadata (NAME, DESCRIPTION) and has to implement the abstract methods run and rerun.
Warning
If you add attributes in an inherited class, make sure that you reset them in the method run. Also make sure that you call the super-constructor.
- Parameters
NAME (string) – Name of the importer
DESCRIPTION (string) – Description of the importer
error_status (Status) – Error status that describes why the import failed
process_status (List(ProcessStatus)) – All process statuses of the import
imported_metadata (List(Metadata)) – All imported metadata
imported_series (List(DataSeries)) – All imported series
evaluation_method (EvaluationMethod) – Evaluation method
level (integer) – Level of the data
- IAGOS.apps.workflow.imports.importer.base.BaseImporter.run(self, reader, parameters, flags, addition)
Runs the import process. The reader provides the data in an interpretable structure. With the parameter parameters, you can define which parameters should be imported. This is necessary because some datasets offer more parameters than needed. If you import series that are already flagged, you can use the parameter flags to define the relationship between the flags and the parameters. Some importers need additional parameters for the import process; you can pass these with the parameter addition.
- Parameters
reader (BaseReader) – Reader that has already parsed the data into the required structure
parameters (models.ManyToManyField) – Parameters that should be imported
flags (models.ManyToManyField) – Relations between the flags and the parameters
addition (dict) – Additional parameters for the import.
- Returns
None
- Return type
None
- IAGOS.apps.workflow.imports.importer.base.BaseImporter.rerun(self, task, reader, parameters, flags, addition)
Reruns the import process.
Important
If your Importer processes data, make sure that a new task is generated during the rerun process. The existing series must not be deleted, because otherwise the data could no longer be reconstructed.
- Parameters
task (Task) – Related task
reader (BaseReader) – Reader that has already parsed the data into the required structure
parameters (models.ManyToManyField) – Parameters that should be imported
flags (models.ManyToManyField) – Relations between the flags and the parameters
addition (dict) – Additional parameters for the import.
- Returns
None
- Return type
None
Template
from IAGOS.apps.workflow.imports.importer.base import BaseImporter


class ExampleImporter(BaseImporter):
    """
    Here you can describe the functionality of your importer!
    """
    NAME = "Example"
    DESCRIPTION = "Example for demonstrations"

    def __init__(self):
        super().__init__()

    def run(self, reader, parameters, flags, addition):
        pass

    def rerun(self, task, reader, parameters, flags, addition):
        pass
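A filled-in importer might look roughly like the sketch below. It is not the IAGOS implementation: BaseImporterStandIn replaces BaseImporter so the code runs on its own, DATABASE is a plain dictionary standing in for persistent storage, tasks are plain integers, and parameters is a simple list of names rather than a models.ManyToManyField. The sketch illustrates two points from the interface description: run imports only the requested parameters and resets its state first, and rerun stores its result under a new task without touching the series of the earlier task.

```python
from abc import ABC, abstractmethod

DATABASE = {}  # hypothetical storage: task id -> imported series


class BaseImporterStandIn(ABC):
    """Hypothetical stand-in for BaseImporter so the sketch runs on its own."""
    NAME = None
    DESCRIPTION = None

    def __init__(self):
        self.imported_series = []

    @abstractmethod
    def run(self, reader, parameters, flags, addition):
        ...

    @abstractmethod
    def rerun(self, task, reader, parameters, flags, addition):
        ...


class FilteringImporter(BaseImporterStandIn):
    NAME = "Filtering"
    DESCRIPTION = "Imports only the requested parameters (illustrative only)"

    def __init__(self):
        super().__init__()  # always call the super-constructor

    def run(self, reader, parameters, flags, addition):
        self.imported_series = []  # reset state from an earlier run
        wanted = set(parameters)
        for name, values in reader.series.items():
            if name in wanted:
                self.imported_series.append((name, list(values)))

    def rerun(self, task, reader, parameters, flags, addition):
        new_task = max(DATABASE) + 1  # new task; the old one stays untouched
        self.run(reader, parameters, flags, addition)
        DATABASE[new_task] = list(self.imported_series)


class StubReader:
    """Hypothetical reader that already holds parsed series."""

    def __init__(self, series):
        self.series = series


reader = StubReader({"H2O": [1.0, 2.0], "CO": [0.1, 0.2]})
importer = FilteringImporter()

importer.run(reader, ["H2O"], flags=[], addition={})
DATABASE[1] = list(importer.imported_series)

importer.rerun(1, reader, ["H2O", "CO"], flags=[], addition={})
```

After the rerun, the series of task 1 are still available next to the series of the newly created task 2.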
Example
To enable a smooth start, we have prepared simple examples. We have prepared an importer for the parsed metadata and an importer for the time series (see also: Example & Registration).
Download the Python module demo_importer.py
Move the module to the folder application/IAGOS/apps/workflow/imports/importer
Open the module application/IAGOS/apps/workflow/imports/importer/all_importer.py
Import your importer
from IAGOS.apps.workflow.imports.importer.demo import ICHMetadataImporter
from IAGOS.apps.workflow.imports.importer.demo import ICHSeriesImporter
Add them to the dictionary
IMPORTERS = {
    ICHMetadataImporter.NAME: ICHMetadataImporter,
    ICHSeriesImporter.NAME: ICHSeriesImporter,
}
Register them by executing the following command in the project directory
make update-data