Process of Reconstruction Across Datasets

If your workbook contains multiple datasets, you can reconstruct each one's missing or invalid observations of speed, speed standard deviation, direction, and temperature using the other datasets for reference. This approach can fill gaps within each dataset, and it can also extend each dataset’s period of record to match that of the longest dataset. This article describes the process, and a separate article describes a validation test demonstrating its performance.

The process of reconstruction across datasets consists of two phases:

Phase 1: MCP-based reconstruction across datasets

For each of the four reconstructable data types (speed, speed SD, direction, temperature):

Calculate the R² correlation coefficient between each pair of sensors of this type, across all datasets.
For each dataset, identify the primary data column of this type. This will serve as the target data column for this data type. For the vacant time steps in this target data column, perform the MCP-based reconstruction mechanism. Whether that process succeeds in reconstructing every vacant time step depends on the availability of MCP source data and the minimum R² setting.
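
The correlation step of Phase 1 can be illustrated with a minimal sketch. It assumes that each data column is a list of values on a shared time axis, with None marking a vacant time step, and that R² is the squared Pearson correlation over the time steps where both columns hold valid data. The column names, the function names, and the rule of ranking sources by R² are hypothetical choices made for illustration, not the software's actual implementation. The sketch suits linear quantities such as speed or temperature; direction is circular and would need different handling.

```python
def r_squared(a, b):
    """Squared Pearson correlation over the time steps where both series are valid.

    Returns None when there are fewer than 3 concurrent valid steps
    or when either series is constant over the concurrent period.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return (sxy * sxy) / (sxx * syy)


def pairwise_r2(columns):
    """Compute R² for every unordered pair of columns (name -> list of values)."""
    names = sorted(columns)
    table = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            table[(first, second)] = r_squared(columns[first], columns[second])
    return table


def eligible_sources(target, table, min_r2):
    """List the columns whose R² with the target meets min_r2, best first."""
    candidates = []
    for (first, second), r2 in table.items():
        if r2 is None or r2 < min_r2 or target not in (first, second):
            continue
        other = second if first == target else first
        candidates.append((r2, other))
    return [name for _, name in sorted(candidates, reverse=True)]


# Hypothetical speed columns from three datasets (None = vacant time step).
columns = {
    "A_speed_60m": [5.0, 6.0, None, 8.0, 9.0],
    "B_speed_60m": [5.5, 6.5, 7.5, 8.5, 9.5],
    "C_speed_40m": [3.0, 9.0, 4.0, 2.0, 7.0],
}
table = pairwise_r2(columns)
sources = eligible_sources("A_speed_60m", table, min_r2=0.5)
print(sources)
```

In this example only B_speed_60m clears the assumed minimum R² of 0.5, so it would be the only candidate source for filling the vacant step in A_speed_60m. If no column cleared the threshold, that step would remain vacant after Phase 1.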

Phase 2: Pattern-based reconstruction within each dataset

For each dataset, for each of the four reconstructable data types (speed, speed SD, direction, temperature):

Choose as the ‘anchor’ the data column of this type and subtype with the highest data coverage rate.
For the anchor column, perform the pattern-based reconstruction mechanism.
For all other columns of this type and subtype, perform the pattern-based reconstruction mechanism.
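
The ordering of Phase 2 can be sketched in the same way. This sketch assumes that the data coverage rate is the fraction of time steps holding a valid (non-None) value, and that ties go to the first column listed. The column names and function names are hypothetical, and the pattern-based reconstruction mechanism itself is not shown.

```python
def coverage_rate(values):
    """Fraction of time steps that hold a valid (non-None) value."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def reconstruction_order(columns):
    """Return column names with the anchor (highest coverage) first.

    columns maps a column name to its list of values. All columns are
    assumed to share the same data type and subtype within one dataset.
    """
    anchor = max(columns, key=lambda name: coverage_rate(columns[name]))
    others = [name for name in columns if name != anchor]
    return [anchor] + others


# Hypothetical speed columns within a single dataset (None = vacant time step).
speed_columns = {
    "speed_80m": [7.1, None, None, 6.8],
    "speed_60m": [6.5, 6.7, None, 6.2],
    "speed_40m": [None, None, None, 5.9],
}
order = reconstruction_order(speed_columns)
print(order)  # ['speed_60m', 'speed_80m', 'speed_40m']
```

Here speed_60m has the highest coverage rate (0.75), so it would be reconstructed first as the anchor, followed by the remaining columns.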

This approach minimizes the number of time steps reconstructed with the MCP-based mechanism, preferring instead the pattern-based mechanism whenever possible since it refers to measurements made at the target location rather than some other location. It uses MCP-based reconstruction only in the vacant time steps in which pattern-based reconstruction is impossible, and only for the primary data column of each type. Using pattern-based reconstruction for all other data columns ensures that the synthesized data obey the shear and directional patterns observed at the target location.