Data Reconstruction

Datasets often contain missing or invalid values. We use the term 'reconstruction' to describe the process of replacing those missing or invalid values with synthetic values. This article describes the mechanisms that Windographer uses to reconstruct data, and the processes that make use of those mechanisms.

Why Reconstruct?

You may wish to reconstruct your measured datasets simply to fill the gaps in the measurements. If you intend to use your datasets for time series simulations, for example, it may help to maximize data recovery first. The image below shows the effect reconstruction could have on a set of three datasets measured by three nearby met towers, where the reconstruction process fills most of the gaps in each dataset:

Another reason you might choose to reconstruct is to lengthen datasets beyond their original period of record (POR), either to reflect longer term conditions or to match the POR of another related dataset. The image below shows the effect reconstruction could have on a set of three related datasets where one has a longer POR than the other two, and the reconstruction process lengthens the two shorter datasets to match the longer, while simultaneously filling gaps in all three:

Three Reconstruction Mechanisms

Windographer employs three mechanisms for reconstructing missing or invalid data:

MCP-based mechanism for reconstruction across datasets

The MCP-Based Reconstruction Mechanism reconstructs speed, speed SD, direction, and temperature data columns by referring to data columns of the same type from other datasets, using the observed correlation between the data columns, and an MCP algorithm.
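As a rough illustration of the idea, the sketch below fills gaps in one column from a concurrent column in another dataset. It assumes a simple least-squares linear fit as the MCP algorithm, which is an assumption made for illustration only; this article does not say which MCP algorithm Windographer applies. The names `fit_line`, `mcp_fill`, `site_a`, and `site_b` are hypothetical.

```python
from typing import List, Optional, Tuple

Series = List[Optional[float]]  # None marks a missing or invalid value


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mcp_fill(target: Series, reference: Series) -> Series:
    """Fill gaps in target from a concurrent reference column (hypothetical helper)."""
    # Correlate: use only the time steps where both columns are valid.
    pairs = [(r, t) for r, t in zip(reference, target)
             if r is not None and t is not None]
    slope, intercept = fit_line([r for r, _ in pairs], [t for _, t in pairs])
    # Predict: replace each gap that has a valid reference value.
    filled: Series = []
    for t, r in zip(target, reference):
        if t is None and r is not None:
            filled.append(max(0.0, slope * r + intercept))  # speeds cannot be negative
        else:
            filled.append(t)
    return filled


site_a = [5.0, None, 7.0, 9.0, None]   # target speeds (m/s), assumed sample data
site_b = [4.0, 5.0, 6.0, 8.0, None]    # concurrent reference speeds (m/s)
print(mcp_fill(site_a, site_b))
```

In this sample the second time step is filled from the reference, while the last time step stays missing because the reference column has no valid value there either.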

Pattern-based mechanism for reconstruction within a dataset

The Pattern-Based Reconstruction Mechanism reconstructs speed, speed SD, direction, and temperature data columns using as reference other data columns of the same type within the same dataset, at different heights or different boom orientations, replicating the observed relationships between the data columns.
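A minimal sketch of the idea follows, assuming the "observed relationship" is nothing more than the ratio of mean speeds between two heights over the time steps where both are valid. That choice is an assumption made for illustration; the actual mechanism is described in its own article. The helper `pattern_fill` and the sample columns are hypothetical.

```python
from typing import List, Optional

Series = List[Optional[float]]  # None marks a missing or invalid value


def pattern_fill(target: Series, reference: Series) -> Series:
    """Fill gaps in target using its mean ratio to a sibling column (hypothetical helper)."""
    # Learn the relationship from time steps where both columns are valid.
    pairs = [(t, r) for t, r in zip(target, reference)
             if t is not None and r is not None]
    if not pairs:
        raise ValueError("no concurrent valid data to learn from")
    ratio = sum(t for t, _ in pairs) / sum(r for _, r in pairs)
    # Replicate that relationship wherever the target is missing.
    return [ratio * r if t is None and r is not None else t
            for t, r in zip(target, reference)]


speed_80m = [6.6, None, 8.8, 5.5]   # assumed sample data (m/s)
speed_60m = [6.0, 7.0, 8.0, 5.0]    # same tower, lower height
print(pattern_fill(speed_80m, speed_60m))
```

Here the 80 m column runs about 10% above the 60 m column, so the gap is filled with a value about 10% above the concurrent 60 m measurement.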

Markov-based mechanism for reconstruction within a data column

The Markov-Based Reconstruction Mechanism fills gaps in any data column by generating artificial data whose statistical properties match those of that column’s measured data. This is the most speculative of the three mechanisms because, unlike the other two, it uses no concurrent reference measurements. Instead, it fills each gap by synthesizing an artificial segment of time series data that exhibits the same statistical characteristics as the real data and that matches up with the real data at the start and end of the gap.
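The sketch below shows one way such a synthesis could work. It assumes a first-order Markov chain over 1-unit-wide bins, and it approximates "matching up" at the end of the gap by generating several candidate segments and keeping the one that ends closest to the next measured value. All of these choices, and the function names, are assumptions made for illustration and are not taken from Windographer.

```python
import random
from collections import defaultdict


def build_transitions(series, bin_width=1.0):
    """Count transitions between binned states over consecutive valid steps."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(series, series[1:]):
        if a is not None and b is not None:
            counts[int(a // bin_width)][int(b // bin_width)] += 1
    return counts


def synthesize(counts, start_state, length, rng):
    """Walk the chain for `length` steps, starting from the state before the gap."""
    states, state = [], start_state
    for _ in range(length):
        nxt = counts.get(state)
        if nxt:  # sample the next state in proportion to observed counts
            state = rng.choices(list(nxt), weights=list(nxt.values()))[0]
        states.append(state)
    return states


def markov_fill(series, bin_width=1.0, candidates=50, seed=0):
    """Fill each interior or trailing gap with a synthetic segment (hypothetical helper)."""
    rng = random.Random(seed)
    counts = build_transitions(series, bin_width)
    filled = list(series)
    i = 0
    while i < len(filled):
        if filled[i] is not None:
            i += 1
            continue
        j = i
        while j < len(filled) and filled[j] is None:
            j += 1  # the gap spans indices i .. j-1
        if i > 0:  # a leading gap has no start value; this sketch leaves it alone
            start = int(filled[i - 1] // bin_width)
            end = int(filled[j] // bin_width) if j < len(filled) else None
            best = None
            for _ in range(candidates):
                seg = synthesize(counts, start, j - i, rng)
                if best is None or (end is not None
                                    and abs(seg[-1] - end) < abs(best[-1] - end)):
                    best = seg
            for k, state in enumerate(best):
                filled[i + k] = (state + 0.5) * bin_width  # use the bin midpoint
        i = j
    return filled


speeds = [3.2, 4.1, 3.8, None, None, 4.4, 3.1, 4.6, 3.9]  # assumed sample data
print(markov_fill(speeds))
```

Because the synthetic values come only from the column's own transition statistics, two runs with different seeds can produce different but equally plausible segments, which is why this mechanism is the most speculative of the three.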

Two Reconstruction Processes

These mechanisms power two separate reconstruction processes. The Reconstruct Across Datasets window uses the first and second mechanisms to perform the process of reconstruction across datasets. The Reconstruct Single Dataset window uses the second and third mechanisms to perform the process of reconstruction within a dataset. Separate articles describe those processes.

Reconstruction Benefits Time Series Energy Modeling

Openwind can simulate wind farm energy production one time step at a time, using wind resource data in time series form rather than frequency distribution form. This time-series modeling approach offers many benefits. It can account for the effects of varying temperature, air density, turbulence, and energy price, and it can incorporate scheduled curtailments related to wildlife, noise, or shadow flicker.

Time-series energy modeling becomes particularly powerful when it incorporates wind resource data from multiple measurement locations and multiple measurement heights. Imperfect data coverage limits its benefits, however, particularly as more variables enter into the simulation. For any time step to be useful to the simulation, it must contain valid measurements of all relevant variables at all met towers and all measurement heights. A missing or invalid measurement in even a single variable therefore makes a time step unusable, and since quality control problems (like sensor malfunctions and icing events) tend to affect different sensors at different times, their cumulative effect can eliminate many time steps from a simulation.
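The cumulative effect is easy to quantify. The sketch below compares the coverage of each column with the fraction of time steps in which every column is valid at once. The column names and values are made-up sample data, and `coverage` and `usable_fraction` are hypothetical helpers.

```python
def coverage(column):
    """Fraction of time steps holding a valid value in one column."""
    return sum(v is not None for v in column) / len(column)


def usable_fraction(columns):
    """Fraction of time steps in which every column holds a valid value."""
    steps = list(zip(*columns.values()))
    usable = sum(all(v is not None for v in step) for step in steps)
    return usable / len(steps)


# None marks a missing or invalid measurement; gaps fall at different times.
columns = {
    "speed_tower_a": [6.1, None, 5.8, 6.4, 7.0, 7.2, 6.9, 6.5, 6.0, 5.7],
    "speed_tower_b": [5.9, 6.0, 5.5, 6.1, None, 7.0, 6.6, 6.3, 5.8, 5.5],
    "temperature":   [11.0, 11.2, 11.1, 10.9, 10.8, 10.7, 10.9, None, None, 11.3],
}

for name, column in columns.items():
    print(name, coverage(column))
print("usable time steps:", usable_fraction(columns))
```

In this sample the individual columns are 90%, 90%, and 80% complete, yet only 60% of the time steps are usable because the gaps do not overlap.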

Fortunately, atmospheric conditions at any point tend to correlate strongly with those measured nearby, so an algorithm that can ‘reconstruct’ missing values using nearby measurements may maximize data recovery and thereby help realize the full potential of time series energy modeling.

Treatment of Tower Shading

Windographer does not reconstruct tower-shaded data. Even if your settings indicate that you wish to reconstruct data flagged to exclude, the reconstruction process will not replace data segments flagged with the special tower shading flag. This behavior follows from the nature of tower shading and the intention of the reconstruction process.

A met tower is a collection of instruments. The reconstruction process aims to synthesize artificial values that accurately reflect what those instruments should have measured, but for some reason could not measure. In other words, the reconstruction process aims to understand the character of the measured data, and then to synthesize missing data segments so that they conform to that character. An ideal reconstruction process would replace missing or excluded data points with synthetic values that match precisely what the instruments would have measured in those time steps.

Tower shading is unique, or at least Windographer treats it as unique, in that it does not result from a failure of the measurement system. Tower-shaded measurements are not invalid; they are just distorted by the measurement system. They are not abnormal but normal, in the sense that whenever the wind blows from that direction, the shaded anemometer reports a speed lower than its co-located mate. The shading pattern is part of the character of the measured data.

If Windographer learned perfectly how the different instruments behaved relative to each other, and used that perfect knowledge to replace a tower-shaded wind speed value, it would replace it with the exact same number. That is to say, a successful reconstruction process would replace a tower-shaded wind speed with a tower-shaded wind speed.
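A toy example makes the argument concrete. The sketch below learns the ratio between two co-located anemometers in each direction sector, shaded sector included, and then "reconstructs" a shaded reading from its mate. The 30-degree sectors, the sample records, and the helper `sector_ratios` are assumptions made for illustration and do not describe how Windographer models the relationship.

```python
from collections import defaultdict


def sector_ratios(records, sector_width=30.0):
    """Mean ratio of anemometer A to anemometer B in each direction sector."""
    sums = defaultdict(lambda: [0.0, 0.0])
    for direction, speed_a, speed_b in records:
        sector = int(direction % 360 // sector_width)
        sums[sector][0] += speed_a
        sums[sector][1] += speed_b
    return {sector: a / b for sector, (a, b) in sums.items()}


# (direction in degrees, speed A, speed B); A is shaded for winds near 100 degrees.
records = [
    (100.0, 4.0, 5.0),
    (105.0, 6.4, 8.0),
    (110.0, 5.6, 7.0),
    (200.0, 6.0, 6.0),
    (210.0, 7.1, 7.0),
]

ratios = sector_ratios(records)
shaded_sector = int(105.0 % 360 // 30.0)
# "Reconstruct" the shaded reading of A at 105 degrees from its mate B.
reconstructed = ratios[shaded_sector] * 8.0
print(reconstructed)
```

The learned ratio in the shaded sector is about 0.8, so the reconstructed value is about 6.4, the same shaded speed that anemometer A measured in the first place.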

For this reason, the reconstruction windows do not replace tower-shaded data.

You can address tower shading by combining co-located anemometers as in the Combine Anemometers window, or by applying correction factors by direction sector as in the Correct Tower Distortion window. Reconstruction, especially across datasets, usually functions best after you have addressed tower shading, so that directional differences between met towers are free from the distortions of tower shading.

See also

Data coverage rate

MCP-based reconstruction mechanism

Pattern-based reconstruction mechanism

Markov-based reconstruction mechanism

Process of reconstruction across datasets

Process of reconstruction within a dataset

Reconstruct Single Dataset window

Reconstruct Across Datasets window


Written by: Tom Lambert
Contact: windographer.support@ul.com
Last modified: September 26, 2022