This page summarizes the filtering applied in the geonet_to_gmdb.py file.


Filters

  • inventory_st is empty and there are no station magnitude pairs for the given network / station
  • If the preferred mag type is ml or mlv:
    • If the waveform has no data or the trace is all 0's, and no station_magnitude pairing is recorded for the event, the record is ignored
    • If the waveform has data, the trace is not all 0's and the SNR is not greater than 3, a highpass filter is applied and the SNR is recalculated. If the SNR is below 3 again, the record is ignored
    • In the same case, if the recalculated SNR is greater than 3 but the gain from the digital SOS filter is less than 1e-2, the record is ignored
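The filter list above can be sketched as a single decision function. This is a minimal sketch and not the code in geonet_to_gmdb.py: the function name, argument names and constants are hypothetical, the first rule is assumed to mean "ignore the record", an SNR of exactly 3 is assumed to fail, and the SNR and SOS gain values are assumed to be computed elsewhere and passed in.

```python
from typing import Optional, Sequence

SNR_THRESHOLD = 3.0  # SNR threshold quoted in the filter list
MIN_SOS_GAIN = 1e-2  # minimum SOS filter gain quoted in the filter list


def should_ignore(
    preferred_mag_type: Optional[str],
    inventory_found: bool,
    has_station_magnitude: bool,
    trace: Optional[Sequence[float]],
    snr: Optional[float] = None,
    snr_after_highpass: Optional[float] = None,
    sos_gain: Optional[float] = None,
) -> bool:
    """Return True when a record would be ignored (hypothetical helper)."""
    # Empty inventory_st and no station magnitude pairing (assumed to mean "ignore").
    if not inventory_found and not has_station_magnitude:
        return True

    # The waveform rules only apply when the preferred mag type is ml or mlv.
    if preferred_mag_type is None or preferred_mag_type.lower() not in ("ml", "mlv"):
        return False

    # No data, or an all-zero trace: ignored only without a station magnitude pairing.
    has_data = trace is not None and any(sample != 0 for sample in trace)
    if not has_data:
        return not has_station_magnitude

    # SNR already above the threshold: no highpass retry is needed.
    if snr is not None and snr > SNR_THRESHOLD:
        return False

    # Retry after the highpass filter: the SNR must now exceed the threshold
    # (assumption: a missing or exactly-3 value counts as failing).
    if snr_after_highpass is None or snr_after_highpass <= SNR_THRESHOLD:
        return True

    # The SNR passes after filtering, but a very low filter gain still rejects the record.
    return sos_gain is not None and sos_gain < MIN_SOS_GAIN
```

For example, should_ignore("ml", True, True, [1, 2], snr=2.0, snr_after_highpass=4.0, sos_gain=1e-3) returns True, because the filtered SNR passes but the gain is below 1e-2.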


Notes for diagram:

inventory_st is the result of a station fetch from the GeoNet FDSN Webservice Client for the specific network, station, channel filter and exact time of a given event pick.

preferred mag type can be None, mb, ml, mlv or m

The highpass filter call is below:

tr.filter("highpass", freq=1, corners=4, zerophase=True)


Questions to be answered:

1) Do we want to do the SNR steps for accepting or ignoring records?
2) If we want to do SNR, do we care that the phase arrival time selection comes from the old method (from the FDSN_Client)?
3) Do we care that records removed by this filtering won't appear in the final flatfile when it is merged with the final dataset from IM_calc etc.?

 
Values affected if we remove SNR from the flatfiles

If there is no epicentral distance measured then there will be no difference
If there is a measurement and SNR passes then these columns will change values:

columns=[
    "mag_corr",
    "mag_corr_method",
    "amp",
    "amp_peak",
    "amp_trough",
    "amp_max",
    "amp_unit",
    "SNR",
    "filtered",
    "amp_time",
]

All of these columns are None for other accepted record paths, except for the ones listed below:
1 - mag_corr_method - "uncorrected" in all other cases
2 - amp - Taken from the generic amplitude from the event catalogue
3 - amp_unit - Taken from the amplitude unit from the event catalogue

When SNR is taken into account, amp is calculated from the trace data and amp_unit is always "mm".
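The column behaviour described above can be sketched as follows. This is an illustration under assumptions, not the script's code: the function name and the snr_values argument are hypothetical, and how the SNR path actually derives each value (including mag_corr_method) is not covered by these notes, so those values are simply passed in by the caller.

```python
from typing import Any, Dict, Optional

AMPLITUDE_COLUMNS = [
    "mag_corr",
    "mag_corr_method",
    "amp",
    "amp_peak",
    "amp_trough",
    "amp_max",
    "amp_unit",
    "SNR",
    "filtered",
    "amp_time",
]


def amplitude_columns(
    catalogue_amp: Optional[float],
    catalogue_amp_unit: Optional[str],
    snr_values: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the SNR-related columns for one record (hypothetical helper)."""
    # Non-SNR paths: everything is None except the three columns listed above.
    row: Dict[str, Any] = {column: None for column in AMPLITUDE_COLUMNS}
    row["mag_corr_method"] = "uncorrected"
    row["amp"] = catalogue_amp
    row["amp_unit"] = catalogue_amp_unit

    # SNR path: trace-derived values replace the defaults, and the unit is "mm".
    if snr_values is not None:
        row.update(snr_values)
        row["amp_unit"] = "mm"
    return row
```

Calling amplitude_columns(0.5, "m/s") gives the catalogue amplitude with every trace-derived column left as None, which is what removing the SNR steps would produce for every record.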

Extra questions

What values do we actually use from the flatfiles? (Are there columns we could ignore to remove unneeded compute?)
https://www.dropbox.com/scl/fi/1910sgbuh0874ebmip6hy/v3p4.zip?rlkey=ons99y6mc7511d5d9rvfpwxkd&dl=0


Extra notes

  • Is the station_magnitude csv duplicated with the mags csv?
  • Not every record has SNR