Interactions and Polynomials - Note for Machine Learning

Interactions and Polynomials

Overview

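One way to enrich a feature representation, and one that is particularly effective for linear models, is to add interaction features and polynomial features derived from the original data.

On binned data, a linear model learns a constant for each bin. With the right extra features, it can also learn a slope for each bin.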
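As a minimal sketch of the idea, the following builds bin indicators and their products with the original feature using plain NumPy. The toy array, the three-bin layout, and all variable names are assumptions made for illustration; they are not the dataset or the encoder used in the Coding section.

```python
import numpy as np

# Hypothetical toy input: five points of a single feature in [-3, 3).
x = np.array([[-2.5], [-0.5], [0.2], [1.4], [2.9]])

# Three equal-width bins over [-3, 3]: edges at -3, -1, 1, 3.
edges = np.linspace(-3, 3, 4)
bin_index = np.digitize(x[:, 0], bins=edges) - 1  # 0-based bin number per point

# One-hot indicators: column j is 1 where the point falls in bin j.
one_hot = np.eye(len(edges) - 1)[bin_index]

# Interaction features: the original feature times each indicator.
interaction = x * one_hot

# The indicators let a linear model learn an offset per bin;
# the products let it learn a slope per bin.
features = np.hstack([one_hot, interaction])
print(features.shape)
```

Each row has exactly one non-zero indicator and at most one non-zero product, so a linear model on these columns is a separate straight line inside each bin.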

Coding

Interaction features

code: Python

import matplotlib.pyplot as plt
import mglearn
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder
# Load the data first; X is needed for the binning below
X, y = mglearn.datasets.make_wave(n_samples=100)
# 11 edges give 10 equal-width bins over [-3, 3]
bins = np.linspace(-3, 3, 11)
which_bin = np.digitize(X, bins=bins)
# One-hot encode the bin index
# (scikit-learn 1.2 and later rename sparse=False to sparse_output=False)
encoder = OneHotEncoder(sparse=False)
encoder.fit(which_bin)
X_binned = encoder.transform(which_bin)
# Original feature plus the 10 bin indicators: 11 features
X_combined = np.hstack([X, X_binned])
line = np.linspace(-3, 3, 1000, endpoint=False).reshape(-1, 1)
line_binned = encoder.transform(np.digitize(line, bins=bins))
reg = LinearRegression().fit(X_combined, y)
line_combined = np.hstack([line, line_binned])
plt.plot(line, reg.predict(line_combined), label='linear regression combined')
# Draw the bin boundaries as dotted vertical lines
for bin in bins:
    plt.plot([bin, bin], [-3, 3], ':', c='k')
plt.legend(loc='best')
plt.ylabel('Regression output')
plt.xlabel('Input feature')
plt.plot(X[:, 0], y, 'o', c='k')
plt.show()

https://gyazo.com/814df03518ca36658912e347ee158033

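In the figure above, the slope is the same in every bin, because the model has only one coefficient for the x-axis feature. That is not very useful. To give each bin its own slope, add as features the interaction, that is the product, of the indicator of which bin a data point falls in and the feature giving its position on the x-axis.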

code: Python

# The number of features grows to 20: 10 bin indicators plus 10 products
X_product = np.hstack([X_binned, X * X_binned])
print(X_product.shape)
reg = LinearRegression().fit(X_product, y)
line_product = np.hstack([line_binned, line * line_binned])
plt.plot(line, reg.predict(line_product), label='linear regression product')
# Draw the bin boundaries as dotted vertical lines
for bin in bins:
    plt.plot([bin, bin], [-3, 3], ':', c='k')
plt.plot(X[:, 0], y, 'o', c='k')
plt.ylabel("Regression output")
plt.xlabel("Input feature")
plt.legend(loc="best")
plt.show()

--------------------------------------------------------------------------

(100, 20)

--------------------------------------------------------------------------

https://gyazo.com/b6368379242bbd3086ff11389e2f435b
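To see why the product features give one slope per bin, here is a small sketch on made-up, noise-free data with two bins. The data, the two-bin layout, and the use of `np.linalg.lstsq` in place of `LinearRegression` are assumptions chosen so that the fitted coefficients can be read directly; the column order follows X_product (indicators first, then products).

```python
import numpy as np

# Hypothetical noise-free data with a different line in each of two bins:
# y = 1 + 2x for x < 0, and y = -1 - 3x for x >= 0.
x = np.linspace(-2.95, 2.95, 60).reshape(-1, 1)
y = np.where(x[:, 0] < 0, 1 + 2 * x[:, 0], -1 - 3 * x[:, 0])

# One-hot bin indicators for the two bins [-3, 0) and [0, 3).
edges = np.array([-3.0, 0.0, 3.0])
bin_index = np.digitize(x[:, 0], bins=edges) - 1
one_hot = np.eye(2)[bin_index]

# Same layout as X_product: indicators first, then products.
features = np.hstack([one_hot, x * one_hot])

# Ordinary least squares without a separate intercept column, because the
# indicator columns already sum to 1 and play that role.
coef, *_ = np.linalg.lstsq(features, y, rcond=None)
offsets, slopes = coef[:2], coef[2:]
print("offset per bin:", offsets)
print("slope per bin:", slopes)
```

The first half of the coefficient vector holds the per-bin offsets and the second half the per-bin slopes, which is what produces the differently tilted segments in the figure.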

Polynomial features

code: Python

from sklearn.preprocessing import PolynomialFeatures
# Add polynomial terms up to x ** 10
# The default include_bias=True would also add a feature that is always 1
poly = PolynomialFeatures(degree=10, include_bias=False)
poly.fit(X)
X_poly = poly.transform(X)
# Compare the contents of X_poly with X
# (the first column of X_poly is X itself, i.e. x ** 1)
print("Entries of X:\n{}".format(X[:5]))
print("Entries of X_poly:\n{}".format(X_poly[:5]))

--------------------------------------------------------------------------

Entries of X:

Entries of X_poly:
[[-7.52759287e-01  5.66646544e-01 -4.26548448e-01  3.21088306e-01
  -2.41702204e-01  1.81943579e-01 -1.36959719e-01  1.03097700e-01
  -7.76077513e-02  5.84199555e-02]
 [ 2.70428584e+00  7.31316190e+00  1.97768801e+01  5.34823369e+01
   1.44631526e+02  3.91124988e+02  1.05771377e+03  2.86036036e+03
   7.73523202e+03  2.09182784e+04]
 [ 1.39196365e+00  1.93756281e+00  2.69701700e+00  3.75414962e+00
   5.22563982e+00  7.27390068e+00  1.01250053e+01  1.40936394e+01
   1.96178338e+01  2.73073115e+01]
 [ 5.91950905e-01  3.50405874e-01  2.07423074e-01  1.22784277e-01
   7.26822637e-02  4.30243318e-02  2.54682921e-02  1.50759786e-02
   8.92423917e-03  5.28271146e-03]
 [-2.06388816e+00  4.25963433e+00 -8.79140884e+00  1.81444846e+01
  -3.74481869e+01  7.72888694e+01 -1.59515582e+02  3.29222321e+02
  -6.79478050e+02  1.40236670e+03]]

--------------------------------------------------------------------------

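The meaning of each feature can be looked up with the get_feature_names method. (scikit-learn 1.0 and later provide get_feature_names_out for this, and get_feature_names was removed in 1.2.)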

code: Python

print('Polynomial feature names:\n{}'.format(poly.get_feature_names()))

--------------------------------------------------------------------------

Polynomial feature names:

['x0', 'x0^2', 'x0^3', 'x0^4', 'x0^5', 'x0^6', 'x0^7', 'x0^8', 'x0^9', 'x0^10']

--------------------------------------------------------------------------
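For a single input feature with include_bias=False, the columns named above are simply the powers x ** 1 through x ** 10. The sketch below reproduces them by hand with NumPy for one value, taken from the first entry of the first row of the X_poly output above (rounded to the digits printed there); the variable names are made up for this illustration.

```python
import numpy as np

# First value in the first row of the X_poly output (x0 for the first sample).
x = np.array([[-7.52759287e-01]])

# Stack x**1, x**2, ..., x**10 as columns, matching the feature names
# 'x0', 'x0^2', ..., 'x0^10'.
degree = 10
x_powers = np.hstack([x ** d for d in range(1, degree + 1)])

print(x_powers.shape)
print(x_powers[0, :3])
```

Comparing the printed values with the first row of X_poly is a quick check that each column is a power of the original feature.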

Combining polynomial features with a linear regression model gives the classical polynomial regression model.

code: Python

# Fit ordinary linear regression on the polynomial features
reg = LinearRegression().fit(X_poly, y)
line_poly = poly.transform(line)
plt.plot(line, reg.predict(line_poly), label='polynomial linear regression')
plt.plot(X[:, 0], y, 'o', c='k')
plt.ylabel('Regression output')
plt.xlabel('Input feature')
plt.legend(loc='best')
plt.show()

https://gyazo.com/fb39fb918e6018a663a98d5677c1618f
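The statement that linear regression on polynomial features is polynomial regression can be checked on a small made-up example. This sketch assumes a noise-free cubic and uses `np.linalg.lstsq` with an explicit constant column rather than `PolynomialFeatures` and `LinearRegression`, so that the recovered coefficients are the polynomial's own.

```python
import numpy as np

# Hypothetical noise-free cubic: y = 0.5 - x + 0.25 * x**3.
x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 0.5 - x[:, 0] + 0.25 * x[:, 0] ** 3

# Polynomial features up to degree 3, plus a constant column (x ** 0)
# standing in for the intercept that LinearRegression fits by default.
degree = 3
design = np.hstack([x ** d for d in range(0, degree + 1)])

# Linear least squares in the expanded features is polynomial regression.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
print(np.round(coef, 6))
```

The model stays linear in its coefficients; only the features are nonlinear in x. With degree 10 on real, noisy data the same fit tends to behave erratically near the edges of the data and where points are sparse.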

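With a more complex model, you can get predictions as complex as those of polynomial regression without explicitly transforming the features. Be careful: adding interaction features and polynomial features to such a model may actually reduce its performance.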
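As one possible illustration, the sketch below compares plain linear regression with a kernel SVM regressor on the raw, untransformed feature. The choice of `SVR` with an RBF kernel, the values of `gamma` and `C`, and the noise-free sine data are all assumptions made for this example; the note itself does not name a specific model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# Hypothetical noise-free data: y = sin(x) on [-3, 3].
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = np.sin(X[:, 0])

# Both models see only the raw feature: no binning, products, or powers.
linear = LinearRegression().fit(X, y)
svr = SVR(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)

# R^2 on the training data, only to compare how closely each model
# follows the curve (not an estimate of generalization).
linear_r2 = linear.score(X, y)
svr_r2 = svr.score(X, y)
print("LinearRegression R^2:", round(linear_r2, 3))
print("SVR (RBF kernel) R^2:", round(svr_r2, 3))
```

The kernel model follows the curve without any hand-made features, which is why adding polynomial or interaction columns on top of it is not guaranteed to help.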