  Anomaly Detection in Seismic Data–Metadata Using Simple Machine-Learning Models

Zaccarelli, R., Bindi, D., Strollo, A. (2021): Anomaly Detection in Seismic Data–Metadata Using Simple Machine-Learning Models. - Seismological Research Letters, 92, 4, 2627-2639.
https://doi.org/10.1785/0220200339


Files

5006823.pdf (Postprint), 3 MB
Name: 5006823.pdf
Description: -
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -
License: -

Creators

 Creators:
Zaccarelli, Riccardo (1, 2), Author
Bindi, Dino (1, 2), Author
Strollo, Angelo (2, 3), Author
Affiliations:
(1) 2.6 Seismic Hazard and Risk Dynamics, 2.0 Geophysics, Departments, GFZ Publication Database, Deutsches GeoForschungsZentrum, ou_146032
(2) Publikationen aller GIPP-unterstützten Projekte, Deutsches GeoForschungsZentrum, Potsdam, ou_44021
(3) 2.4 Seismology, 2.0 Geophysics, Departments, GFZ Publication Database, Deutsches GeoForschungsZentrum, ou_30023

Content

Free keywords: -
 Abstract: In modern seismological analysis, it is not unusual to process huge amounts of data, as illustrated by two case studies exemplified in this work, both assessing the quality of several millions of segments selected for computing local and energy magnitudes. In this scenario, quality control tools to filter, discard, or rank data are of extreme importance and should ideally be simple, fast, and generalizable. Using machine‐learning tools, we present here a simple and efficient model based on the isolation forest algorithm for detecting amplitude anomalies on any seismic waveform segment, with no restriction on the segment record content (earthquake vs. noise) and no additional requirements than the segment metadata. By considering a simple feature space composed of amplitudes of each segment’s power spectral density (PSD) evaluated at selected periods suitable for both local and teleseismic applications, feature selection revealed that one single feature, the PSD at 5 s, is sufficient to achieve the best predicting performances. The evaluation results report average precision scores around 0.97, and maximum F1 scores above 0.9, both remarkable results with respect to the simplicity of the approach used and the generality of the problem tackled. The trained model producing the best evaluation results is the backbone of a publicly available software, which computes an amplitude anomaly score in [0, 1] for any given seismic waveform, and can be beneficial in several applications such as discarding anomalies from data sets, ideally in a preprocessing stage, and detecting potential metadata problems on data center side. When applied to our two case studies, the software was revealed to be fast and effective, and the computed anomaly scores allow additional flexibility in addition to the proven wide‐range applicability.
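
Illustration (not part of the record): the abstract describes an isolation forest trained on a single feature, the PSD at 5 s, that yields an anomaly score per segment. The idea can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the authors' released software: the function names, the synthetic PSD values in dB, and the parameter defaults are all hypothetical, and the score uses the standard isolation forest formula 2^(-E[h]/c(n)), which may differ from the scaling used in the published model.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant (approximation)


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively split 1-D feature values at uniformly random thresholds."""
    if len(values) <= 1 or depth >= max_depth:
        return {"size": len(values)}  # leaf node
    lo, hi = min(values), max(values)
    if lo == hi:
        return {"size": len(values)}  # cannot split identical values
    split = rng.uniform(lo, hi)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return {
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, x):
    """Depth at which x lands in a leaf, plus the usual correction for leaf size."""
    depth = 0
    while "split" in node:
        node = node["left"] if x < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def fit_forest(values, n_trees=100, sample_size=256, seed=0):
    """Fit an ensemble of isolation trees on random subsamples of the feature."""
    size = min(sample_size, len(values))
    if size < 2:
        raise ValueError("need at least two training values")
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(values, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def anomaly_score(forest, x):
    """Standard isolation forest score in (0, 1]; higher means more anomalous."""
    trees, size = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(size))


# Synthetic, purely illustrative "PSD at 5 s" values in dB (not data from the paper).
data_rng = random.Random(42)
train = [data_rng.gauss(-140.0, 8.0) for _ in range(500)]

forest = fit_forest(train)
typical = anomaly_score(forest, -140.0)  # value near the bulk of the training data
outlier = anomaly_score(forest, -20.0)   # value far outside the training range
```

In this sketch a segment whose PSD value lies far from the bulk of the training data is isolated after fewer random splits, so its score is higher than that of a typical segment. A threshold on the score, chosen by the user, could then be used to discard or rank segments.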

Details

Language(s):
 Dates: 2021-05-05; 2021
 Publication Status: Finally published
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1785/0220200339
GFZPOF: p4 MESI
GFZPOFWEITERE: p4 T3 Restless Earth
OATYPE: Green Open Access
 Degree: -

Source 1

Title: Seismological Research Letters
Source Genre: Journal, SCI, Scopus
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: 92 (4)
Sequence Number: -
Start / End Page: 2627 - 2639
Identifier: CoNE: https://gfzpublic.gfz-potsdam.de/cone/journals/resource/journals447
Publisher: Seismological Society of America (SSA)