.. include::

.. _series:

********************
Import a Time Series
********************

Purpose of this Chapter
=======================

The aim of this chapter is to explain how to import a time series. You will
learn how to create entries for the metadata and how to import data points.
Building on that, you will learn how to create relations between series,
calibrations, and functions. Note that time series can't be created via the
web interface; you have to use the REST API or write an importer.

.. figure:: ../../graphics/dataseries_overview.svg
    :width: 100%

Create Data Levels
==================

A data level describes the overall state of the time series, and each series
has to refer to a level. A level has a unique *name* and a short
*description*. Within the **IAGOS** projects, there are three levels:
*Raw data*, *Preliminary data*, and *Final data*.

Via Python
----------

The following code snippet creates an entry for the level *Final data*.

.. code-block:: python

    from IAGOS.apps.database.dataseries import models as dataseries

    level, _ = dataseries.DataLevel.objects.get_or_create(
        id=2, name="Final data", description="Calibrated data"
    )

Via Web Interface
-----------------

1. Make sure that you have the permissions to create new entries *(admin)*.
2. Go to the menu *Time Series* |rarr| *Levels*.
3. Create the data level **Final data** (ID: 2) by clicking the button in
   the top right corner.

Create Validities
=================

Validities are used to flag data points. A validity has a unique *name* and
a *description*. Within the **IAGOS** projects, there are the validities
*Good*, *Limited*, *Erroneous*, *Not validated*, and *Missing value*.

Via Python
----------

In the following code snippet, the validities *Good*, *Limited*, and
*Erroneous* will be created.

.. code-block:: python

    from IAGOS.apps.database.dataseries import models as dataseries

    good_validity, _ = dataseries.DataValidity.objects.get_or_create(
        id=0, name="Good"
    )
    limited_validity, _ = dataseries.DataValidity.objects.get_or_create(
        id=2, name="Limited", description="Doubtful"
    )
    error_validity, _ = dataseries.DataValidity.objects.get_or_create(
        id=3, name="Erroneous", description="Invalid"
    )

Via Web Interface
-----------------

1. Make sure that you have the permissions to create new entries *(admin)*.
2. Go to the menu *Time Series* |rarr| *Validities*.
3. Create the data validities **Good** (ID: 0), **Limited** (ID: 2), and
   **Erroneous** (ID: 3) by clicking the button in the top right corner.

Import Series
=============

Generate Sample Data
--------------------

Usually, you read the data from files or other systems. The following code
snippet generates a time series for demonstration purposes.

.. code-block:: python

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    mu, sigma = 15e3, 100
    values = np.random.normal(mu, sigma, 400)
    index = pd.date_range(start="1/1/2021", periods=400, freq="min")
    series = pd.Series(values, index=index)
    series.iloc[10:390] -= 145e2

    series.plot()
    plt.grid(True)
    plt.show()

.. figure:: ../../graphics/time_series.svg
    :width: 100%

Import Metadata
---------------

First of all, you have to create the metadata of the time series. Each time
series needs the following attributes: *component_parameter*, *deployment*,
*data_description*, *period_start*, *period_end*, *timestamp*, *revision*,
and *data_level*.

The field *data_description* stores the *name* and the *description* of the
series. It is recommended that series *(with different revisions and levels)*
which describe the same data share the same *data_description* and have a
unique name. This gives you the possibility to create a history.
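Before creating the metadata entry, you typically derive the required period
attributes from the data itself. The following sketch uses plain pandas (no
IAGOS code) to show how *period_start* and *period_end* can be obtained from
the sample series generated above, as plain ``datetime`` objects:

```python
import numpy as np
import pandas as pd

# Recreate the demo series from the "Generate Sample Data" section.
index = pd.date_range(start="1/1/2021", periods=400, freq="min")
series = pd.Series(np.random.normal(15e3, 100, 400), index=index)

# Derive the period boundaries as plain datetime objects.
period_start = series.index[0].to_pydatetime()
period_end = series.index[-1].to_pydatetime()

print(period_start)  # 2021-01-01 00:00:00
print(period_end)    # 2021-01-01 06:39:00
```

Note that 400 one-minute steps span 6 hours and 39 minutes after the first
timestamp, which is why *period_end* falls on 06:39.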
If series of several deployments or flights always have the same name
*(e.g., H2O_gas)*, you have to use additional properties *(e.g., the time
period)* to distinguish them.

.. important::

    For performance reasons, each series refers to the related deployment.
    This makes it possible to locate the time series of the same deployment
    close together. See also: https://en.wikipedia.org/wiki/Database_index

.. code-block:: python

    from datetime import datetime

    from IAGOS.apps.database.components import models as components
    from IAGOS.apps.database.dataseries import models as dataseries

    data_description, _ = dataseries.DataDescription.objects.get_or_create(
        name="H2O_gas", description="Measured by ICH"
    )
    # "instances" and "deployment" are expected to exist from the previous
    # chapters.
    component_parameter = components.ComponentParameter.objects.get(
        component=instances["ICH"], parameter__name="H2O_gas"
    )
    db_series, _ = dataseries.DataSeries.objects.get_or_create(
        component_parameter=component_parameter,
        deployment=deployment,
        data_description=data_description,
        period_start=series.index[0].to_pydatetime(),
        period_end=series.index[-1].to_pydatetime(),
        timestamp=datetime(2020, 4, 1),
        revision=20200101,
        data_level=level,
    )

Import Data Points
------------------

As already mentioned, each data point has a *timestamp*, a *value*, and a
*validity*. The first step is to create a DataFrame that contains this
information for each data point. After that, you can use the method
*import_data_points_from_pandas* of the model *DataSeries*, which imports
the data points.

.. code-block:: python

    import numpy as np

    from IAGOS.apps.database.dataseries import models as dataseries

    df = series.to_frame(name="value")
    df["timestamp"] = series.index

    # Flag four data points as "Limited" (ID: 2); all others stay "Good".
    validities = np.zeros(400)
    validities[[42, 142, 242, 342]] = 2
    df["data_validity"] = validities
    df["data_validity"] = dataseries.DataValidity.get_validities(
        df["data_validity"]
    )

    points = db_series.import_data_points_from_pandas(df)

Import Errors
-------------

Within the IAGOS project, many series have an error value for each data point.
These errors are stored as **DataPointExtensions**. An extension is always
related to a data point and has the field **values**, which allows you to
store additional information in JSON format. The model **DataSeries**
provides the method **import_errors** to import these errors; it expects the
errors as a *pd.Series*.

.. code-block:: python

    import numpy as np
    import pandas as pd

    mu, sigma = 5, 2
    errors = pd.Series(np.random.normal(mu, sigma, 400), index=index)

    db_series.import_errors(errors)

Import Flags
------------

Besides the errors, additional flags can be stored as well. Especially when
you run QA/QC algorithms, it is recommended to use this functionality: you
can save the results for each data point as an extension. This can be
helpful to understand the resulting validities and can be used for further
analysis. The model **DataSeries** provides the method **import_flags**,
which expects a data frame. Note that the columns of the data frame will be
used to store the values.

.. code-block:: python

    import numpy as np
    import pandas as pd

    mu, sigma = 5, 2
    values = pd.Series(np.random.normal(mu, sigma, 400), index=index)

    # Combine the two boolean flags column-wise, one row per data point.
    # (np.concatenate(...).reshape(-1, 2) would interleave the flags and
    # pair them wrongly.)
    flags = np.column_stack([(values < 5), (values < 2)])
    flags = pd.DataFrame(flags, index=index, columns=["flag_a", "flag_b"])

    db_series.import_flags(flags)

Create Relations
================

If you apply a calibration function to the data or use other series to
process the data, it is highly recommended to create a relation for it. The
relation helps you to retrace which series, calibrations, and functions were
used for the processing. In the following code snippet, a relation between
the calibrations, the functions, and the series will be created.

.. note::

    If you want to refer to other series, use the following line:
    **relation.related_series.add(other_series)**.

.. code-block:: python

    from IAGOS.apps.database.dataseries import models as dataseries

    # "function", "pre_calibration", and "post_calibration" are expected to
    # exist from the previous chapters.
    relation, _ = dataseries.SeriesRelation.objects.get_or_create(
        series=db_series
    )
    relation.related_functions.add(function)
    relation.related_calibrations.add(pre_calibration)
    relation.related_calibrations.add(post_calibration)
    relation.save()
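Once a relation exists, it can also be read back to retrace how a series was
processed. The following is a minimal sketch, assuming the models shown in
this chapter; it only uses standard Django ORM reverse lookups, and the
printed labels are illustrative, not part of the official API:

```python
from IAGOS.apps.database.dataseries import models as dataseries

# Illustrative only: look up everything that was used to process db_series.
relation = dataseries.SeriesRelation.objects.get(series=db_series)

for calibration in relation.related_calibrations.all():
    print("calibration:", calibration)
for function in relation.related_functions.all():
    print("function:", function)
for other_series in relation.related_series.all():
    print("input series:", other_series)
```

This kind of read-back is useful when validating reprocessed data, because it
shows at a glance which calibrations and functions produced a given revision.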