CatBoost Target Encoding
Translation and summary of this article:
https://catboost.ai/docs/concepts/algorithm-main-stages_cat-to-numberic.html
CatBoost apparently performs target encoding internally, but how does it actually do it?
Before each split is selected in the tree (see Choosing the tree structure), categorical features are transformed to numerical. This is done using various statistics on combinations of categorical features and combinations of categorical and numerical features.
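So does it generate new features from each category × category and category × numeric combination?
What exactly are the "various statistics"? The quoted passage does not say.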
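One candidate for these statistics is an "ordered" target statistic, which is what the CatBoostEncoder in category_encoders (linked under Reference) implements. The sketch below is my own minimal version and is not taken from CatBoost's source: the function name `ordered_target_encode` and the smoothing weight `a` are assumptions made for illustration. Each row is encoded using only the targets of earlier rows in the same category, smoothed toward a prior, so a row's own target never leaks into its encoding.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, prior, a=1.0):
    """Hypothetical helper: encode each row from the targets of EARLIER rows only.

    encoded = (sum of earlier targets in this category + prior * a)
              / (number of earlier rows in this category + a)
    """
    sums = defaultdict(float)   # running target sum per category
    counts = defaultdict(int)   # running row count per category
    encoded = []
    for cat, y in zip(categories, targets):
        # Use the statistics accumulated so far, before seeing this row's target.
        encoded.append((sums[cat] + prior * a) / (counts[cat] + a))
        # Only now add this row's target to the running statistics.
        sums[cat] += y
        counts[cat] += 1
    return encoded


cats = ["a", "b", "a", "a", "b"]
ys = [1, 0, 0, 1, 1]
prior = sum(ys) / len(ys)  # global target mean, used here as the prior
enc = ordered_target_encode(cats, ys, prior)
print(enc)
```

The first occurrence of each category falls back to the prior because no earlier rows exist. In practice the rows would be shuffled first, since the result depends on row order. Whether CatBoost uses exactly this formula and these parameters internally should be checked against the linked docs.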
Reference
catboost
https://catboost.ai/docs/concepts/algorithm-main-stages_cat-to-numberic.html
https://catboost.ai/docs/concepts/python-reference_catboost_fit.html
https://catboost.ai/docs/concepts/algorithm-main-stages_choose-tree-structure.html#algorithm-main-stages_choose-tree-structure
category_encoders
https://contrib.scikit-learn.org/categorical-encoding/catboost.html
https://contrib.scikit-learn.org/categorical-encoding/_modules/category_encoders/cat_boost.html#CatBoostEncoder.fit
blog
https://copypaste-ds.hatenablog.com/entry/2019/09/05/184947