Median filtering in sound event detection
Where the idea of the adaptive median filter comes from:
Quoted from Section 2.5, Post-processing:
To determine the sound event activation, we perform thresholding for the network output posterior. Then, we perform median filtering as post-processing to smooth the detected activation sequence. Since each sound event has different characteristics, such as temporal structures, the optimal post-processing parameters depend on the individual sound events. Hence, we determine the optimal postprocessing parameters for each sound event using the validation set.
We search the optimal threshold and median filter size from 0.1 to 0.9 in increments of 0.1, and from 1 to 31 in increments of 2, respectively.
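Since the duration differs from one event class to another, the median filter window is a hyperparameter that has to be tuned, and that tuning was done on the validation data. Is that the right reading?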
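The search described in the quote could be sketched roughly as follows for a single event class. This is a minimal pure-Python sketch, not the authors' implementation: the function names (`median_filter`, `detect`, `frame_f1`, `search_params`), the frame-level F1 selection criterion, and the edge-replication padding are all assumptions made for illustration. The quote itself only fixes the search ranges (threshold 0.1 to 0.9 in steps of 0.1, filter size 1 to 31 in steps of 2).

```python
from statistics import median


def median_filter(seq, size):
    """Sliding median over a non-empty sequence; `size` must be odd.

    Edges are padded by repeating the first/last value (an assumption).
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    padded = [seq[0]] * half + list(seq) + [seq[-1]] * half
    return [median(padded[i:i + size]) for i in range(len(seq))]


def detect(posterior, threshold, size):
    """Threshold the posterior, then smooth the binary activation."""
    binary = [1 if p > threshold else 0 for p in posterior]
    return [int(v) for v in median_filter(binary, size)]


def frame_f1(pred, ref):
    """Frame-level F1 score (hypothetical selection criterion)."""
    tp = sum(1 for p, r in zip(pred, ref) if p and r)
    fp = sum(1 for p, r in zip(pred, ref) if p and not r)
    fn = sum(1 for p, r in zip(pred, ref) if not p and r)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def search_params(posterior, ref):
    """Grid search over the ranges given in the quote, for one class.

    Returns (best_score, threshold, filter_size).
    """
    thresholds = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1 .. 0.9
    sizes = range(1, 32, 2)                                  # 1, 3, .. 31
    candidates = (
        (frame_f1(detect(posterior, t, s), ref), t, s)
        for t in thresholds
        for s in sizes
    )
    return max(candidates, key=lambda c: c[0])


# Toy validation data for one class (made up for illustration).
posterior = [0.1, 0.2, 0.9, 0.1, 0.8, 0.9, 0.7, 0.9, 0.2, 0.1]
reference = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]
smoothed = detect(posterior, threshold=0.5, size=3)
best_score, best_threshold, best_size = search_params(posterior, reference)
```

In the toy data, a filter of size 3 removes the isolated spike at frame 2 and fills the one-frame gap at frame 3. To tune every class, `search_params` would be called once per class on that class's validation posteriors.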
Quoted from the latter:
In all experiments, we employed adaptive median filtering (MF) technique. This approach involved the application of median filters with varying window sizes, denoted as $\mathrm{Win}$, based on the duration of real-life event categories $c$. The specific window sizes for each event category are presented below:
code: tex
\mathrm{Win}_c = \mathrm{duration}_c \times \beta_c \qquad (2)
In order to handle event categories with significant duration variation, we employed a dynamic approach by setting the median duration $\mathrm{duration}_c$ as the reference. For this purpose, we initially set the parameter $\beta_c = 1/3$ and fine-tuned the window sizes based on the development set, ensuring optimal performance.
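As a rough illustration of Eq. (2), the window for one class could be derived from the median of its annotated event durations. This is only a sketch under stated assumptions: the function name `adaptive_window`, the frame rate `frames_per_sec`, the rounding to an odd number of frames, and the example durations are not from the quoted paper, which gives only the relation between the window, the median duration, and the initial value of 1/3 for the scaling factor.

```python
from statistics import median


def adaptive_window(durations_sec, beta=1 / 3, frames_per_sec=50):
    """Window size in frames for one event class, following Eq. (2).

    durations_sec  : annotated event durations of the class, in seconds
    beta           : scaling factor (1/3 is the initial value in the quote)
    frames_per_sec : model output frame rate (assumed value)
    """
    reference = median(durations_sec)            # median duration of the class
    win = max(1, round(reference * beta * frames_per_sec))
    # Force an odd size so the window is centered (an assumption).
    return win if win % 2 == 1 else win + 1


# Hypothetical durations for a long-lasting and a very short class.
long_event_window = adaptive_window([1.5, 3.0, 6.0])
short_event_window = adaptive_window([0.01])
```

With these made-up durations, the long class gets a 51-frame window and the very short class falls back to a window of 1, which leaves the sequence unchanged. Fine-tuning on the development set would then adjust `beta` per class.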
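A median filter is used to consolidate the inference results, and its window is determined by factors such as the duration of the event
In the current model, it is used as post-processing at evaluation time
After the inference of each model
The evaluation outputs are post-processed with thresholding and a median filter
Is SBSS what replaces this post-processing entirely?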