Trying out LightGBM, which is popular on Kaggle.
Installation
pip install lightgbm
It finished without any problems.
Here is the code; only the relevant parts are shown.
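A quick way to confirm the install worked is to import the package and print its version (a minimal check; the exact version depends on your environment):

import lightgbm
print(lightgbm.__version__)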
# d_train / d_test are assumed to be the Titanic train / test DataFrames prepared earlier.
import pandas as pd
from sklearn.model_selection import train_test_split
import lightgbm as lgb

# Split the training data; only a very small holdout is kept for validation
split_before_y = d_train["Survived"].values
split_before_x = d_train.drop("Survived", axis=1)
X_train, X_test, y_train, y_test = train_test_split(
    split_before_x, split_before_y, test_size=0.008, random_state=0)

lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)

params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'metric': {'l2'},
    'num_leaves': 200,
    'learning_rate': 0.003,
    'feature_fraction': 0.50,
    'bagging_fraction': 0.80,
    'bagging_freq': 7,
    'verbose': 0
}

# Train with early stopping on the validation set
# (on LightGBM >= 4.0, pass callbacks=[lgb.early_stopping(200)] instead of early_stopping_rounds)
gbm = lgb.train(params,
                lgb_train,
                num_boost_round=1000,
                valid_sets=lgb_eval,
                early_stopping_rounds=200)

# Predict on the test set with the best iteration found by early stopping
Y_pred = gbm.predict(d_test.drop("PassengerId", axis=1).copy(),
                     num_iteration=gbm.best_iteration)

# Turn the regression output into 0/1 labels with a 0.51 threshold
# (418 is the number of rows in the Titanic test set)
for i in range(418):
    if Y_pred[i] >= 0.51:
        Y_pred[i] = 1
    else:
        Y_pred[i] = 0

kaggle_submission = pd.DataFrame({
    "PassengerId": d_test["PassengerId"],
    "Survived": Y_pred.astype('int64')
})
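The DataFrame above still has to be written out as a CSV before it can be submitted. A minimal sketch of that last step follows; the file name titanic_lgbm_submission.csv is my own placeholder, not from the original run.

# Write the submission file; Kaggle expects PassengerId,Survived with a header
kaggle_submission.to_csv("titanic_lgbm_submission.csv", index=False)

index=False keeps pandas from adding an extra row-number column to the file.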
The result was a big improvement: the score came to 0.80861.