Let's try out LightGBM, which is popular on Kaggle.
Installation

pip install lightgbm

It finished without any problems.
Only the relevant parts of the code are shown below.
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split

split_before_y = d_train["Survived"].values
split_before_x = d_train.drop("Survived", axis=1)
# note: test_size=0.008 leaves only a handful of rows for validation
X_train, X_test, y_train, y_test = train_test_split(split_before_x, split_before_y, test_size=0.008, random_state=0)

lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)
params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'metric': {'l2'},
    'num_leaves': 200,
    'learning_rate': 0.003,
    'feature_fraction': 0.50,
    'bagging_fraction': 0.80,
    'bagging_freq': 7,
    'verbose': 0
}
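As an aside: since Survived is a 0/1 label, LightGBM's binary objective is the more natural fit than regression plus a threshold. This post sticks with the regression setup above, but a hedged sketch of what a binary-classification parameter dict might look like (same tree settings, only the objective and metric swapped) is:

```python
# sketch: binary-classification variant of the params above (not what this post used)
params_binary = {
    'boosting_type': 'gbdt',
    'objective': 'binary',          # predicts survival probability directly
    'metric': 'binary_logloss',
    'num_leaves': 200,
    'learning_rate': 0.003,
    'feature_fraction': 0.50,
    'bagging_fraction': 0.80,
    'bagging_freq': 7,
    'verbose': 0,
}
```

With this objective, `gbm.predict` returns probabilities in [0, 1], so the same thresholding step still applies.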
# note: in LightGBM >= 4.0 the early_stopping_rounds keyword was removed;
# pass callbacks=[lgb.early_stopping(200)] instead
gbm = lgb.train(params, lgb_train, num_boost_round=1000, valid_sets=lgb_eval, early_stopping_rounds=200)
Y_pred = gbm.predict(d_test.drop("PassengerId",axis=1).copy(), num_iteration=gbm.best_iteration)
# threshold the regression output into 0/1 survival labels
for i in range(418):
    if Y_pred[i] >= 0.51:
        Y_pred[i] = 1
    else:
        Y_pred[i] = 0
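The thresholding loop above can also be written as a single vectorized NumPy expression, which avoids hard-coding the row count (418). A minimal sketch, using made-up prediction values for illustration:

```python
import numpy as np

# hypothetical regression outputs standing in for gbm.predict(...)
Y_pred = np.array([0.20, 0.60, 0.51, 0.49])

# same rule as the loop: >= 0.51 becomes 1, everything else 0
Y_bin = np.where(Y_pred >= 0.51, 1, 0)
# → [0, 1, 1, 0]
```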
kaggle_submission = pd.DataFrame({
    "PassengerId": d_test["PassengerId"],
    "Survived": Y_pred.astype('int64')
})
The result was a big improvement: the score rose to 0.80861.