Structured Training for Large-vocabulary Chord Recognition
ShuKumata.icon
The problem the authors try to solve: data scarcity in large-vocabulary chord recognition — rare chord classes make parameter estimation difficult, and most systems treat the task as plain multi-class classification without exploiting the structural similarities between chord classes.
One-page summary
0. Quick take
Abstract
Automatic chord recognition systems operating in the large-vocabulary regime must overcome data scarcity: certain classes occur much less frequently than others, and this presents a significant challenge when estimating model parameters. While most systems model the chord recognition task as a (multi-class) classification problem, few attempts have been made to directly exploit the intrinsic structural similarities between chord classes. In this work, we develop a deep convolutional-recurrent model for automatic chord recognition over a vocabulary of 170 classes. To exploit structural relationships between chord classes, the model is trained to produce both the time-varying chord label sequence as well as binary encodings of chord roots and qualities. This binary encoding directly exposes similarities between related classes, allowing the model to learn a more coherent representation of simultaneous pitch content. Evaluations on a corpus of 1217 annotated recordings demonstrate substantial improvements compared to previous models.
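As a concrete reading of the multi-task idea in the abstract, here is a minimal PyTorch sketch (not the authors' implementation): a convolutional-recurrent encoder with one head for the 170-class chord sequence and auxiliary heads for binary encodings of roots and qualities. The layer sizes, the 13-way root / 12-bit quality widths, and the loss weighting are illustrative assumptions, not details taken from the paper.
```python
# Minimal sketch of structured (multi-task) training for chord recognition.
# Dimensions and architecture details are assumptions for illustration only.
import torch
import torch.nn as nn

class StructuredChordNet(nn.Module):
    def __init__(self, n_bins=144, n_chords=170, n_roots=13, n_quality_bits=12):
        super().__init__()
        # Convolutional front end over the time-frequency input (batch, 1, time, freq)
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(5, 5), padding=(2, 2)),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=(5, 5), padding=(2, 2)),
            nn.ReLU(),
        )
        # Recurrent layer over time; the frequency axis is flattened into features
        self.rnn = nn.GRU(32 * n_bins, 128, batch_first=True, bidirectional=True)
        # Task heads: chord-class sequence plus binary root / quality encodings
        self.chord_head = nn.Linear(256, n_chords)
        self.root_head = nn.Linear(256, n_roots)
        self.quality_head = nn.Linear(256, n_quality_bits)

    def forward(self, x):
        # x: (batch, time, freq) spectrogram-like features
        h = self.conv(x.unsqueeze(1))            # (batch, 32, time, freq)
        h = h.permute(0, 2, 1, 3).flatten(2)     # (batch, time, 32 * freq)
        h, _ = self.rnn(h)                       # (batch, time, 256)
        return {
            "chord": self.chord_head(h),         # per-frame chord logits
            "root": self.root_head(h),           # per-frame root logits
            "quality": self.quality_head(h),     # per-frame quality-bit logits
        }

def structured_loss(outputs, targets, aux_weight=0.5):
    """Cross-entropy on the chord sequence plus auxiliary losses on the
    binary root / quality encodings (the 0.5 weighting is an assumption)."""
    ce = nn.functional.cross_entropy
    bce = nn.functional.binary_cross_entropy_with_logits
    loss = ce(outputs["chord"].flatten(0, 1), targets["chord"].flatten())
    loss += aux_weight * ce(outputs["root"].flatten(0, 1), targets["root"].flatten())
    loss += aux_weight * bce(outputs["quality"], targets["quality"].float())
    return loss
```
The point of the auxiliary heads is that chords sharing a root or quality receive partially overlapping training targets, which is one way the binary encoding can expose similarities between related classes.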
1. What is it? What problem does it address?
2. What makes it better than prior work?
3. What is the core of the technique or method?
4. How was its effectiveness validated?
5. Is there any discussion?
6. Which papers should be read next?
7. Notes
Links