Decision Tree Learning - Note for Machine Learning

Decision Tree Learning

Coding
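The plotting calls in this note rely on a helper named plot_decision_regions, which is not part of scikit-learn and is not defined on this page. The following is a minimal sketch of such a helper, assuming two input features, integer class labels, and at most five classes; the marker and colour choices and the default resolution are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


def plot_decision_regions(X, y, classifier, test_idx=None, resolution=0.02):
    """Plot the decision regions of a fitted classifier over two features."""
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    classes = np.unique(y)
    cmap = ListedColormap(colors[:len(classes)])

    # Build a grid that covers both features with a margin of 1 unit
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))

    # Predict the class of every grid point and shade the regions
    Z = classifier.predict(np.column_stack((xx1.ravel(), xx2.ravel())))
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # Scatter the samples, one marker and colour per class
    for idx, cl in enumerate(classes):
        plt.scatter(X[y == cl, 0], X[y == cl, 1], alpha=0.8,
                    color=colors[idx], marker=markers[idx], label=str(cl))

    # Circle the test samples if their row indices were given
    if test_idx is not None:
        test_idx = list(test_idx)
        plt.scatter(X[test_idx, 0], X[test_idx, 1], facecolors='none',
                    edgecolors='black', linewidths=1, marker='o', s=100,
                    label='test set')
```

The function only draws onto the current figure; labels, legend, and plt.show() are left to the caller, which matches how it is called in the code on this page.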

Decision tree

code: Python

import numpy as np

import matplotlib.pyplot as plt

from sklearn import datasets

from sklearn.model_selection import train_test_split


iris = datasets.load_iris()

# Use only petal length and petal width (columns 2 and 3)
X = iris.data[:, [2, 3]]

y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

from sklearn.tree import DecisionTreeClassifier

# Create a decision tree instance that uses Gini impurity as the criterion

tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)

# Fit the decision tree model to the training data

tree.fit(X_train, y_train)

# A decision tree splits numeric features by comparing values against a threshold, so standardization is not needed

X_combined = np.vstack((X_train, X_test))

y_combined = np.hstack((y_train, y_test))

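# plot_decision_regions is a plotting helper that must be defined beforehand (it is not part of scikit-learn)
plot_decision_regions(X=X_combined, y=y_combined, classifier=tree, test_idx=range(105, 150))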

plt.xlabel('petal length [cm]')

plt.ylabel('petal width [cm]')

plt.legend(loc='upper left')

plt.tight_layout()

plt.show()

https://gyazo.com/50ff4c73b821981add2802babebf7330
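To make the remark about standardization concrete, here is a small pure-Python sketch of how a single split can be chosen by Gini impurity. It is not scikit-learn's implementation; the functions gini and best_split and the toy data are made up for illustration. Because the split depends only on the ordering of the values, standardizing the feature changes the threshold but not which samples fall on each side.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, left_indices) minimising the weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    order = sorted(set(values))
    best = None
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2
        left = [i for i, v in enumerate(values) if v <= t]
        right = [i for i, v in enumerate(values) if v > t]
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right])) / len(values)
        if best is None or score < best[0]:
            best = (score, t, left)
    return best[1], best[2]


# Toy data (hypothetical values): one feature, two classes
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 5.1]
species = [0, 0, 0, 1, 1, 1]

# Standardise by hand: subtract the mean, divide by the standard deviation
mean = sum(petal_length) / len(petal_length)
std = (sum((v - mean) ** 2 for v in petal_length) / len(petal_length)) ** 0.5
scaled = [(v - mean) / std for v in petal_length]

t_raw, left_raw = best_split(petal_length, species)
t_scaled, left_scaled = best_split(scaled, species)

print(t_raw, left_raw)        # threshold in the original units
print(t_scaled, left_scaled)  # threshold in standard deviations
```

The two thresholds differ, but the lists of sample indices sent to the left branch are the same, which is why the tree above is trained on the raw features.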

code: Python

# Export the resulting splits as a PNG image

from pydotplus import graph_from_dot_data

from sklearn.tree import export_graphviz

dot_data = export_graphviz(tree, filled=True, rounded=True,
                           class_names=['Setosa', 'Versicolor', 'Virginica'],
                           feature_names=['petal length', 'petal width'],
                           out_file=None)

graph = graph_from_dot_data(dot_data)

graph.write_png('tree.png')
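If Graphviz and pydotplus are not available, the same splits can be inspected as plain text. The sketch below uses export_text from sklearn.tree, assuming a scikit-learn version that provides it (0.21 or later); it repeats the data loading and training so that it runs on its own.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Same data and split as in the decision tree example
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
tree.fit(X_train, y_train)

# One line per node: the threshold tested, or the predicted class at a leaf
rules = export_text(tree, feature_names=['petal length', 'petal width'])
print(rules)
```

Each indentation level in the printed text corresponds to one level of depth in the tree, so the output has at most four levels of nesting here.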

Random forest

code: Python

from sklearn.ensemble import RandomForestClassifier

# Create a random forest instance that uses Gini impurity as the criterion

forest = RandomForestClassifier(criterion='gini', n_estimators=25, random_state=1, n_jobs=2)

# Fit the random forest to the training data

forest.fit(X_train, y_train)

X_combined = np.vstack((X_train, X_test))

y_combined = np.hstack((y_train, y_test))

plot_decision_regions(X=X_combined, y=y_combined, classifier=forest, test_idx=range(105, 150))

plt.xlabel('petal length [cm]')

plt.ylabel('petal width [cm]')

plt.legend(loc='upper left')

plt.tight_layout()

plt.show()

https://gyazo.com/28deafaeff1b995bfbd504e03c312b36
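The plots compare the two models only visually. As a rough numerical check, the sketch below fits both classifiers with the same settings as above and prints their accuracy on the held-out test set, together with the forest's feature importances. It repeats the data preparation so that it runs on its own; the loop and the printed labels are illustrative choices, not part of the original note.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data and split as in the examples above
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

tree = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
forest = RandomForestClassifier(criterion='gini', n_estimators=25,
                                random_state=1, n_jobs=2)

# Fit each model and report the fraction of test samples classified correctly
for name, model in (('tree', tree), ('forest', forest)):
    model.fit(X_train, y_train)
    print(name, 'test accuracy:', model.score(X_test, y_test))

# Relative contribution of petal length and petal width (sums to 1)
print('feature importances:', forest.feature_importances_)
```

With only 45 test samples, a difference of one misclassified flower moves the accuracy by about two percentage points, so small gaps between the two models should not be over-interpreted.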