Assessing automated gap imputation of regional scale groundwater level data sets with typical gap patterns
Journal article, 2023

Large groundwater level (GWL) data sets are often patchy, with hydrographs containing contiguous gaps and irregular measurement frequencies. However, most statistical time series analyses require regular observations, so hydrographs with larger gaps are routinely excluded from further analysis despite the loss of coverage and representativity of an initially large data set. Missing values can be filled in with different imputation methods, yet the challenge is to assess the imputation performance of automated methods. Assessment of such methods tends to be carried out on randomly introduced missing values. However, large GWL data sets are commonly dominated by more complex patterns of missing values with longer contiguous gaps. This study presents a new artificial gap introduction approach (TGP, typical gap patterns) that improves our understanding of automated imputation performance by mimicking typical gap patterns found in regional scale groundwater hydrographs. The imputation performance of the machine learning algorithms missForest and imputePCA is then compared with commonly applied linear interpolation to prepare a gapless daily GWL data set for the Baltic states (Estonia, Latvia, Lithuania). We observed that imputation performance varied among different gap patterns, and that performance for all imputation algorithms declined when infilling previously unseen extremes and hydrographs influenced by groundwater abstraction. Further, the missForest algorithm substantially outperformed the other methods when infilling contiguous gaps (up to 2.5 years), while linear interpolation performed similarly for short random gaps. The TGP approach can be used to assess the complexity of missing observation patterns in a data set, and its value lies in assessing the performance of gap filling methods in a more realistic way. Thus, the approach aids the appropriate selection of imputation methods, a task not limited to groundwater level time series alone. The study further provides insights into region-specific data peculiarities that can assist groundwater analysis and modelling.
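
The contrast the abstract draws between randomly introduced missing values and longer contiguous gaps can be illustrated with a minimal sketch. This is not the TGP procedure from the study, and it does not use missForest or imputePCA. It only masks values in a synthetic seasonal series in two ways, fills them by linear interpolation, and scores the result on the removed values. The series, the gap length of 180 days, and all function names are assumptions made for illustration.

```python
import math
import random


def synthetic_hydrograph(n_days=730):
    """Hypothetical daily groundwater levels: a seasonal sine wave plus a slow trend."""
    return [10.0 + math.sin(2 * math.pi * d / 365.0) + 0.001 * d for d in range(n_days)]


def random_gaps(n, n_missing, rng):
    """Indices of missing values scattered at random (first and last value kept)."""
    return set(rng.sample(range(1, n - 1), n_missing))


def contiguous_gap(n, n_missing, rng):
    """Indices of one contiguous block of missing values (first and last value kept)."""
    start = rng.randrange(1, n - n_missing)
    return set(range(start, start + n_missing))


def linear_interpolate(values, missing):
    """Fill missing indices by linear interpolation between the nearest observed neighbours."""
    filled = list(values)
    observed = [i for i in range(len(values)) if i not in missing]
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


def rmse(truth, filled, missing):
    """Root-mean-square error over the artificially removed values only."""
    return math.sqrt(sum((truth[i] - filled[i]) ** 2 for i in missing) / len(missing))


rng = random.Random(42)  # fixed seed so the sketch is reproducible
levels = synthetic_hydrograph()
n_missing = 180  # assumed number of removed values in both scenarios

scattered = random_gaps(len(levels), n_missing, rng)
block = contiguous_gap(len(levels), n_missing, rng)

rmse_random = rmse(levels, linear_interpolate(levels, scattered), scattered)
rmse_block = rmse(levels, linear_interpolate(levels, block), block)
print(f"random gaps RMSE:    {rmse_random:.4f}")
print(f"contiguous gap RMSE: {rmse_block:.4f}")
```

With the same number of removed values, the error for the single long gap is larger than for the scattered gaps, because a straight line cannot follow the seasonal curve across half a year. This suggests why an assessment based only on random gaps may overstate how well a method handles realistic gap patterns.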

missing values

Baltic states

drought

abstraction

gap filling

Time series

Authors

Janis Bikse

Latvijas Universitate

Inga Retike

Latvijas Universitate

Ezra Haaf

Chalmers, Architecture and Civil Engineering, Geology and Geotechnics

Andis Kalvans

Latvijas Universitate

Journal of Hydrology

0022-1694 (ISSN)

Vol. 620 129424

När var hur? Identifikation av orsaker till hydrogeologiska störningar i undermarksbebyggelse (When, where, how? Identifying causes of hydrogeological disturbances in underground construction)

Trafikverket (TRV2019/45670), 2019-11-01 -- 2023-04-30.

Driving forces

Sustainable development

Subject categories

Interdisciplinary studies

Water engineering

Oceanography, hydrology, water resources

Medical image processing

Multidisciplinary geoscience

DOI

10.1016/j.jhydrol.2023.129424

More information

Last updated

2023-04-12