Free keywords:
-
Abstract:
The open data and FAIR data principles have seen widespread adoption in recent years. Unfortunately, the implementation of this vision often falls short of the original goals: seamless reuse is prevented by reliance on undocumented and inconsistent file formats and by a lack of metadata. Truly fulfilling the FAIR data vision requires standardized data formats and metadata. The necessary standardization work has been done, for example with the Climate and Forecast convention and the Attribute Convention for Data Discovery of the NetCDF ecosystem. But this is only one part of the solution; the other is that published datasets must actually implement such standards. We present our efforts in publishing a dataset that combines data from more than 50 Automatic Weather Stations, covering a period of 25 years and contributed by nine institutions. The raw data have been standardized into a common format with a common set of parameter names, units and a common metadata standard. This standardization process is described in a configuration file (one per station) that is used to dynamically generate the final dataset from the raw data. The challenges in setting up such a dataset ranged from finding the original data and parsing many different file formats to struggling to find the original measurement locations and guessing which maintenance operations had been performed long ago. Preparing this dataset therefore took much longer than originally planned, and we hope that this process can be made easier in the future.