Kaggle Titanic Tutorial – 3

Posted: May 24, 2018. Updated: September 22, 2019.

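In this part we tune the parameters of the decision tree (DecisionTreeClassifier).

We start with max_depth, scoring each candidate depth by leave-one-out cross-validation.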


from sklearn.model_selection import LeaveOneOut
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# d_train is assumed to be the preprocessed training DataFrame
# built in the earlier parts of this tutorial.
MAX_DEPTH = 20
depths = range(1, MAX_DEPTH)  # max_depth = 1 .. 19

loo_Y = d_train["Survived"].values
loo_X = d_train[["Pclass", "Sex", "Age", "Fare", "Parch", "Embarked", "SibSp"]].values

accuracy_scores = []
for depth in depths:
    predicted_labels = []
    loo = LeaveOneOut()
    # Hold out one sample at a time and train on all the others.
    for train_index, test_index in loo.split(loo_X):
        X_train, X_test = loo_X[train_index], loo_X[test_index]
        y_train = loo_Y[train_index]
        clf = DecisionTreeClassifier(max_depth=depth)
        clf.fit(X_train, y_train)

        predicted_label = clf.predict(X_test)
        predicted_labels.append(predicted_label[0])

    # Accuracy over all held-out predictions for this depth.
    score = accuracy_score(loo_Y, predicted_labels)
    accuracy_scores.append(score)
    print('max depth={0}: {1}'.format(depth, score))

max depth=1: 0.7867564534231201
max depth=2: 0.6936026936026936
max depth=3: 0.8181818181818182
max depth=4: 0.8237934904601572
max depth=5: 0.8181818181818182
max depth=6: 0.8103254769921436
max depth=7: 0.8215488215488216
max depth=8: 0.8249158249158249
max depth=9: 0.8204264870931538
max depth=10: 0.8148148148148148
max depth=11: 0.8058361391694725
max depth=12: 0.8002244668911336
max depth=13: 0.797979797979798
max depth=14: 0.7934904601571269
max depth=15: 0.7912457912457912
max depth=16: 0.7755331088664422
max depth=17: 0.77665544332211
max depth=18: 0.7833894500561167
max depth=19: 0.7744107744107744
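
To compare the depths without reading the list by eye, the scores can be ranked in code. The sketch below is not part of the original notebook: it copies the accuracies printed above into a dictionary (the names scores, best_depth, and top_three are made up for illustration) and sorts them.

```python
# Leave-one-out accuracies copied from the printed output, keyed by max_depth.
scores = {
    1: 0.7867564534231201, 2: 0.6936026936026936, 3: 0.8181818181818182,
    4: 0.8237934904601572, 5: 0.8181818181818182, 6: 0.8103254769921436,
    7: 0.8215488215488216, 8: 0.8249158249158249, 9: 0.8204264870931538,
    10: 0.8148148148148148, 11: 0.8058361391694725, 12: 0.8002244668911336,
    13: 0.797979797979798, 14: 0.7934904601571269, 15: 0.7912457912457912,
    16: 0.7755331088664422, 17: 0.77665544332211, 18: 0.7833894500561167,
    19: 0.7744107744107744,
}

# Depth with the highest accuracy.
best_depth = max(scores, key=scores.get)

# The three best depths, highest accuracy first.
top_three = sorted(scores, key=scores.get, reverse=True)[:3]

print(best_depth)   # 8
print(top_three)    # [8, 4, 7]
```

Depth 4 is only about 0.001 behind depth 8, so the ranking between those two is close.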

We will use max_depth=8, which gave the highest leave-one-out accuracy (about 0.825).
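
The same kind of search can probably be written more compactly with scikit-learn's cross_val_score, which accepts a LeaveOneOut splitter through its cv argument. This is a minimal sketch, not the code used in this post: it runs on synthetic data from make_classification as a stand-in for d_train, and the helper loo_accuracy and the fixed random_state values are assumptions added for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption; the post itself uses the Titanic training set).
X, y = make_classification(n_samples=60, n_features=5, random_state=0)


def loo_accuracy(depth, X, y):
    """Mean leave-one-out accuracy of a decision tree limited to `depth` (hypothetical helper)."""
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    # Each fold scores a single held-out sample as 0.0 or 1.0,
    # so the mean over folds equals the overall accuracy.
    fold_scores = cross_val_score(clf, X, y, cv=LeaveOneOut(), scoring="accuracy")
    return fold_scores.mean()


# Evaluate a small range of depths and keep the best one.
depth_scores = {depth: loo_accuracy(depth, X, y) for depth in range(1, 6)}
best_depth = max(depth_scores, key=depth_scores.get)
print(best_depth, depth_scores[best_depth])
```

Fixing random_state on the classifier makes repeated runs give the same scores; the loop in this post leaves it unset, so its numbers may shift slightly from run to run.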


Author: admin
