Dataset Reduction Techniques to Speed Up SVD Analyses on Big Geo-Datasets

Bogaardt, Laurens; Goncalves, Romulo; Zurita-Milla, Raul; Izquierdo-Verdiguier, Emma

doi:10.3390/ijgi8020055

Journal Article

Authors

Bogaardt, Laurens
External Organizations;

Goncalves, Romulo
0 Pre-GFZ, Departments, GFZ Publication Database, Deutsches GeoForschungsZentrum;

Zurita-Milla, Raul
External Organizations;

Izquierdo-Verdiguier, Emma
External Organizations;

Citation

Bogaardt, L., Goncalves, R., Zurita-Milla, R., Izquierdo-Verdiguier, E. (2019): Dataset Reduction Techniques to Speed Up SVD Analyses on Big Geo-Datasets. - ISPRS International Journal of Geo-Information, 8, 2, 55.
https://doi.org/10.3390/ijgi8020055

Cite as: https://gfzpublic.gfz-potsdam.de/pubman/item/item_5025242

Abstract

The Singular Value Decomposition (SVD) is a mathematical procedure with multiple applications in the geosciences. For instance, it is used in dimensionality reduction and as a support operator for various analytical tasks applicable to spatio-temporal data. Performing SVD analyses on large datasets, however, can be computationally costly, time-consuming, and sometimes practically infeasible. Techniques exist that arrive at the same output, or at a close approximation, with far less effort. This article examines several such techniques in relation to the inherent scale of the structure within the data. When the values of a dataset vary slowly, e.g., in a spatial field of temperature over a country, the field is autocorrelated and contains large-scale structure. Datasets do not need a high resolution to describe such fields, and their analysis can benefit from alternative SVD techniques based on rank deficiency, coarsening, or matrix factorization approaches. We use both simulated Gaussian Random Fields with various levels of autocorrelation and real-world geospatial datasets to illustrate our study and to examine the accuracy of the various SVD techniques. As the main result, this article provides researchers with a decision tree that indicates which technique to use in which situation and predicts the resulting level of accuracy based on the dataset's structure scale.
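
The coarsening idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The synthetic field (a few smooth sine modes plus small noise) is an assumed stand-in for the Gaussian Random Fields used in the article, and the function names, array sizes, and coarsening factor are hypothetical choices made for illustration. The sketch averages neighbouring spatial columns and checks that the leading singular values of the coarsened matrix, rescaled by the square root of the coarsening factor, stay close to those of the full matrix.

```python
import numpy as np


def smooth_field(n_time=200, n_space=1024, n_modes=3, noise=1e-3, seed=0):
    """Build a synthetic (time x space) matrix with large-scale structure.

    Assumption: a few low-frequency sine modes plus weak noise stand in
    for a strongly autocorrelated spatial field.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_space)
    t = np.linspace(0.0, 1.0, n_time)
    field = np.zeros((n_time, n_space))
    for m in range(1, n_modes + 1):
        spatial = np.sin(np.pi * m * x)          # slowly varying in space
        temporal = np.cos(2 * np.pi * m * t) / m  # amplitude decays with m
        field += np.outer(temporal, spatial)
    return field + noise * rng.standard_normal(field.shape)


def coarsen_columns(data, factor):
    """Average each group of `factor` adjacent spatial columns."""
    n_time, n_space = data.shape
    if n_space % factor != 0:
        raise ValueError("n_space must be divisible by factor")
    return data.reshape(n_time, n_space // factor, factor).mean(axis=2)


def leading_singular_values(data, k):
    """Return the k largest singular values of `data`."""
    return np.linalg.svd(data, compute_uv=False)[:k]


field = smooth_field()
factor = 8

s_full = leading_singular_values(field, 3)
# Averaging `factor` nearly equal columns shrinks the column energy by
# roughly `factor`, so the singular values are rescaled by sqrt(factor).
s_coarse = np.sqrt(factor) * leading_singular_values(
    coarsen_columns(field, factor), 3
)

rel_err = np.abs(s_full - s_coarse) / s_full
print("relative error of leading singular values:", rel_err)
```

In this sketch the coarsened matrix has eight times fewer columns, so its SVD is cheaper, while the smoothness of the field keeps the leading singular values nearly unchanged. For fields with small-scale structure the same coarsening would be expected to lose accuracy, which is the trade-off the article's decision tree addresses.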