Import your Data

The ImportManager combines two components: the Reader and the Importer. The Reader reads the data and parses it into a standardized format that the Importer can interpret. The Importer then imports and processes the parsed data. Because the Importer expects only the standardized format, it doesn’t need to know the original format of the data. This allows you to implement different Readers for different formats and to use the same Importer for all of them. If you import processed or final data, you can ignore the data processing part. Within the IAGOS project, the raw data is read and then processed. Data processing is part of the Importer because in some use cases the data needs to be synchronized. It’s recommended to synchronize the data first and then to import only the synchronized data.
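The separation described above can be illustrated with a minimal, self-contained sketch. It does not use the project's classes: the two reader functions, the importer function, and the list-of-dicts "standardized format" are all assumptions made for illustration. The point is only that two readers for different input formats can feed the same importer.

```python
import csv
import io
import json


def read_csv(text):
    """Hypothetical reader: parse CSV text into the standardized structure."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"parameter": row["name"], "value": float(row["value"])} for row in rows]


def read_json(text):
    """Hypothetical reader: parse a JSON object into the same structure."""
    return [{"parameter": key, "value": float(val)} for key, val in json.loads(text).items()]


def import_records(records):
    """Hypothetical importer: it only ever sees the standardized structure."""
    return {record["parameter"]: record["value"] for record in records}


# Two different input formats, one importer.
csv_result = import_records(read_csv("name,value\nH2O,1.5\nCO,0.2\n"))
json_result = import_records(read_json('{"H2O": 1.5, "CO": 0.2}'))
```

Supporting a further format in this sketch would mean adding another reader function; `import_records` would stay unchanged.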

[Figure: import steps (import_steps.svg)]

Reader

Interface

class IAGOS.apps.workflow.imports.readers.base.BaseReader

Bases: abc.ABC

Every reader has to inherit from this abstract base class. This approach ensures that new readers can be integrated into the workflow without modifying the existing code. Each reader has to provide its metadata (NAME, DESCRIPTION) and has to implement the abstract methods read and read_list. These methods read the data and parse it into a standardized format that the importer can interpret. In addition, each reader has to provide an additional name (e.g., filename or flight ID) through the method get_source_name.

Warning

If you add additional attributes in the inherited classes, make sure that you reset them in the method read. Moreover, ensure that you call the super-constructor.

Parameters
  • NAME (string) – Name of the reader

  • DESCRIPTION (string) – Description of the reader
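The warning above can be sketched as follows. The BaseReader class in this block is a simplified stand-in defined only so that the example runs on its own; it is not the project's class, which also declares read_list and get_source_name. The LineReader class, its lines attribute, and the choice to pass the text content directly as source are likewise assumptions for illustration.

```python
from abc import ABC, abstractmethod


class BaseReader(ABC):
    """Simplified stand-in for the project's BaseReader (NOT the real class)."""

    NAME = ""
    DESCRIPTION = ""

    def __init__(self):
        pass

    @abstractmethod
    def read(self, source):
        ...


class LineReader(BaseReader):
    """Hypothetical reader that collects the non-empty lines of a text."""

    NAME = "Lines"
    DESCRIPTION = "Collects the non-empty lines of a text source"

    def __init__(self):
        super().__init__()  # call the super-constructor
        self.lines = []     # additional attribute of the inherited class

    def read(self, source):
        self.lines = []     # reset, so a reused instance starts clean
        for line in source.splitlines():
            if line.strip():
                self.lines.append(line.strip())


reader = LineReader()
reader.read("a\nb\n")
reader.read("c\n")  # second call on the same instance
```

Without the reset in read, the second call would leave the lines of the first source in reader.lines.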

IAGOS.apps.workflow.imports.readers.base.BaseReader.read(self, source)

Reads the data and parses it into a standardized structure.

Parameters

source (string) – Source of the data

Returns

None

Return type

None

IAGOS.apps.workflow.imports.readers.base.BaseReader.read_list(source, additional_source, additional_info)

Parses the given source and prepares it for the TransferHandler. The method is needed because the TransferHandler can’t interpret the responses from external servers. For example, if the TransferHandler checks for new flights (e.g., http://example.com/all-flights/?timestamp=2020-01-01), it receives a list of the flights that were performed after the given timestamp. These flight IDs are then used to create the requests for the server (e.g., http://example.com/flight/data/?param={id}). In that case, the method returns a list containing one request per flight.

Important

Some TransferTypes don’t need this functionality (e.g., DirectoryTransfer). In that case, the method returns just the source.

Parameters
  • source (string) – Source of the data

  • additional_source (string) – Additional source to create requests

  • additional_info (string) – Additional information for parsing

Returns

Sources

Return type

List(string)
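A minimal sketch of the flight-list case described above might look like this. The response format is an assumption: the sketch treats source as a JSON array of flight IDs and additional_source as a URL template containing {id}; the real server response and the meaning of additional_info are not specified here. The pass-through variant wraps the source in a list to match the documented return type, which is also an assumption.

```python
import json


def read_list(source, additional_source, additional_info):
    """Turn a flight-list response into one request URL per flight.

    Assumptions for this sketch: `source` is the response body as a JSON
    array of flight IDs, `additional_source` is a URL template containing
    `{id}`, and `additional_info` is unused.
    """
    flight_ids = json.loads(source)
    return [additional_source.format(id=flight_id) for flight_id in flight_ids]


def read_list_passthrough(source, additional_source, additional_info):
    """Variant for transfer types that need no expansion: return the source."""
    return [source]


requests = read_list(
    '["F001", "F002"]',
    "http://example.com/flight/data/?param={id}",
    "",
)
unchanged = read_list_passthrough("/data/incoming", "", "")
```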

IAGOS.apps.workflow.imports.readers.base.BaseReader.get_source_name(self)

Returns an additional name (e.g., the filename of the dataset) that describes the source. The SeriesImporter uses this method to append a suffix to the series name, so that series of the same parameter do not all share the same name.

Returns

Name

Return type

string
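As an illustration, a reader for file-based sources could derive the name from the path it was given. The FileReader class below is hypothetical and does not inherit from the project's BaseReader; the way the series name is composed at the end is also an assumption, since the SeriesImporter's naming scheme is not described here.

```python
import os


class FileReader:
    """Hypothetical reader that remembers the path of the file it read."""

    def __init__(self):
        self.source = None

    def read(self, source):
        self.source = source  # remember the source; parsing is omitted here

    def get_source_name(self):
        # Use the bare filename as the additional name.
        return os.path.basename(self.source) if self.source else ""


reader = FileReader()
reader.read("/data/incoming/H2O2018061616204002.nc")

# Assumed naming scheme: parameter name plus the source name in brackets.
series_name = f"H2O ({reader.get_source_name()})"
```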

Template

from IAGOS.apps.workflow.imports.readers.base import BaseReader

class ExampleReader(BaseReader):
   """
   Here you can describe the functionality of your reader!
   """

   NAME = "Example"
   DESCRIPTION = "Example for demonstrations"

   def __init__(self):
      super().__init__()

   @staticmethod
   def read_list(source, additional_source, additional_info):
      pass

   def read(self, source):
      pass

   def get_source_name(self):
      pass

Example & Registration

To help you get started, we have prepared two simple examples: a reader that reads the metadata of the ICH unit (instrument, deployment, calibration) from the Excel file ich_metadata.xls, and a reader that reads a measured time series from the netCDF file H2O2018061616204002.nc. You can register them with the following steps.

  1. Download the Python module demo.py

  2. Move the module to the folder application/IAGOS/apps/workflow/imports/readers

  3. Open the module application.IAGOS.apps.workflow.imports.readers.all_readers.py

  4. Import the readers

    from IAGOS.apps.workflow.imports.readers.demo import ICHMetaDataReader
    from IAGOS.apps.workflow.imports.readers.demo import ICHSeriesReader
    
  5. Add the readers to the dictionary

    READERS = {
       ICHMetaDataReader.NAME: ICHMetaDataReader,
       ICHSeriesReader.NAME: ICHSeriesReader,
    }
    
  6. Register them by executing the following command in the project directory

    make update-data
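The following sketch shows one reason for keying the dictionary by NAME: a reader class can then be looked up from a stored name and instantiated on demand. How the project actually resolves readers is not shown in this guide, so the create_reader helper, the placeholder classes, and their NAME values are assumptions for illustration only.

```python
class ICHMetaDataReader:
    """Placeholder with an assumed NAME; the real class lives in demo.py."""

    NAME = "ICH metadata"


class ICHSeriesReader:
    """Placeholder with an assumed NAME; the real class lives in demo.py."""

    NAME = "ICH series"


READERS = {
    ICHMetaDataReader.NAME: ICHMetaDataReader,
    ICHSeriesReader.NAME: ICHSeriesReader,
}


def create_reader(name):
    """Hypothetical helper: look up a reader class by name and instantiate it."""
    try:
        reader_class = READERS[name]
    except KeyError:
        raise ValueError(f"Unknown reader: {name!r}") from None
    return reader_class()


reader = create_reader("ICH series")
```

Because the dictionary is keyed by NAME, two readers with the same NAME would silently overwrite each other, so each NAME should be unique.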
    

Importer

Interface

class IAGOS.apps.workflow.imports.importer.base.BaseImporter

Bases: abc.ABC

Every importer has to inherit from this abstract base class. This approach ensures that new importers can be integrated into the workflow without modifying the existing code. Each importer has to provide its metadata (NAME, DESCRIPTION) and has to implement the abstract methods run and rerun.

Warning

If you add additional attributes in the inherited classes, make sure that you reset them in the method run. Moreover, ensure that you call the super-constructor.

Parameters
  • NAME (string) – Name of the importer

  • DESCRIPTION (string) – Description of the importer

  • error_status (Status) – Error status that describes why the import failed

  • process_status (List(ProcessStatus)) – All process statuses of the import

  • imported_metadata (List(Metadata)) – All imported metadata

  • imported_series (List(DataSeries)) – All imported series

  • evaluation_method (EvaluationMethod) – Evaluation method

  • level (integer) – Level of the data

IAGOS.apps.workflow.imports.importer.base.BaseImporter.run(self, reader, parameters, flags, addition)

Runs the import process. The reader provides the data in an interpretable structure. With the parameter parameters, you can define which parameters should be imported. This is necessary because some datasets offer more parameters than needed. If you import series that are already flagged, you can use the parameter flags to define the relationship between the flags and the parameters. Some importers need additional parameters for the import process; you can pass these with the parameter addition.

Parameters
  • reader (BaseReader) – Reader that has already parsed the data into the needed structure

  • parameters (models.ManyToManyField) – Parameters that should be imported

  • flags (models.ManyToManyField) – Relations between the flags and the parameters

  • addition (dict) – Additional parameters for the import.

Returns

None

Return type

None
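The role of parameters and addition can be sketched as follows. Everything in this block is a simplified stand-in: the BaseImporter shown is not the project's class, parameters is a plain list of names rather than a ManyToManyField, the reader exposes its parsed data as a dictionary, and the "scale" key in addition is invented for illustration. Flags are ignored in this sketch.

```python
from abc import ABC, abstractmethod


class BaseImporter(ABC):
    """Simplified stand-in for the project's BaseImporter (NOT the real class)."""

    NAME = ""
    DESCRIPTION = ""

    def __init__(self):
        self.imported_series = []

    @abstractmethod
    def run(self, reader, parameters, flags, addition):
        ...


class FakeReader:
    """Hypothetical reader exposing parsed data as {parameter: values}."""

    def __init__(self, data):
        self.data = data


class SelectiveImporter(BaseImporter):
    NAME = "Selective"
    DESCRIPTION = "Imports only the requested parameters"

    def run(self, reader, parameters, flags, addition):
        self.imported_series = []             # reset state from earlier runs
        wanted = set(parameters)              # parameters that should be imported
        scale = addition.get("scale", 1.0)    # assumed additional parameter
        for name, values in reader.data.items():
            if name in wanted:                # skip parameters nobody asked for
                self.imported_series.append((name, [v * scale for v in values]))


importer = SelectiveImporter()
dataset = FakeReader({"H2O": [1.0, 2.0], "CO": [3.0], "NOy": [4.0]})
importer.run(dataset, ["H2O", "CO"], [], {"scale": 2.0})
```

In this sketch the dataset offers three parameters, but only the two requested ones end up in imported_series.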

IAGOS.apps.workflow.imports.importer.base.BaseImporter.rerun(self, task, reader, parameters, flags, addition)

Reruns the import process.

Important

If your Importer processes data, make sure that a new task is generated during the rerun process. It’s important that the existing series are not deleted, since the data could otherwise no longer be reconstructed.

Parameters
  • task (Task) – Related task

  • reader (BaseReader) – Reader that has already parsed the data into the needed structure

  • parameters (models.ManyToManyField) – Parameters that should be imported

  • flags (models.ManyToManyField) – Relations between the flags and the parameters

  • addition (dict) – Additional parameters for the import.

Returns

None

Return type

None
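The note above about generating a new task can be sketched like this. The Task dataclass, the process function, and the free-standing rerun function are hypothetical and far simpler than the project's models and method signature; the sketch only shows the idea that a rerun adds a new task linked to the old one and leaves the existing series untouched.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_task_ids = count(1)


@dataclass
class Task:
    """Hypothetical stand-in for the project's Task model."""

    series: list = field(default_factory=list)
    id: int = field(default_factory=lambda: next(_task_ids))
    rerun_of: Optional["Task"] = None


def process(raw_values):
    """Placeholder for the data processing step (here: doubling)."""
    return [value * 2 for value in raw_values]


def rerun(task, raw_values):
    """Create a NEW task for the rerun; the old task and its series stay intact."""
    return Task(series=process(raw_values), rerun_of=task)


first = Task(series=process([1.0, 2.0]))
second = rerun(first, [1.0, 2.0, 3.0])
```

Keeping the link in rerun_of is one possible way to trace which result replaced which; it is a design choice of this sketch, not a documented feature.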

Template

from IAGOS.apps.workflow.imports.importer.base import BaseImporter

class ExampleImporter(BaseImporter):
   """
   Here you can describe the functionality of your importer!
   """

   NAME = "Example"
   DESCRIPTION = "Example for demonstrations"

   def __init__(self):
      super().__init__()

   def run(self, reader, parameters, flags, addition):
      pass

   def rerun(self, task, reader, parameters, flags, addition):
      pass

Example

To help you get started, we have prepared two simple examples: an importer that imports the parsed metadata and an importer that imports the time series (see also: Example & Registration).

  1. Download the Python module demo_importer.py

  2. Move the module to the folder application/IAGOS/apps/workflow/imports/importer

  3. Open the module application.IAGOS.apps.workflow.imports.importer.all_importer.py

  4. Import your importer

    from IAGOS.apps.workflow.imports.importer.demo_importer import ICHMetadataImporter
    from IAGOS.apps.workflow.imports.importer.demo_importer import ICHSeriesImporter
    
  5. Add them to the dictionary

    IMPORTERS = {
       ICHMetadataImporter.NAME: ICHMetadataImporter,
       ICHSeriesImporter.NAME: ICHSeriesImporter,
    }
    
  6. Register them by executing the following command in the project directory

    make update-data