Manual tuning

Analyze which parameters affect the model, choose a step size for each, and evaluate every setting with cross-validation.

We use a random forest as the example. This article uses the breast cancer dataset bundled with sklearn, builds a random forest, and tunes it based on the relationship between generalization error and model complexity, so that the model achieves a higher score. Generalization error measures how a model performs on unseen data: the lower the error, the higher the accuracy on new samples.

1. Import the packages

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.model_selection import GridSearchCV
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

2. Load the breast cancer dataset and build the model

The datasets bundled with sklearn are already clean, so no preprocessing is needed.

    # Load the breast cancer dataset
    data = load_breast_cancer()

    # Build the random forest
    rfc = RandomForestClassifier(n_estimators=100, random_state=90)

    # Baseline score: mean accuracy over 10-fold cross-validation
    score_pre = cross_val_score(rfc, data.data, data.target, cv=10).mean()
    score_pre

3. Tune the parameters

The main random forest parameters are n_estimators (number of trees), max_depth (maximum depth of each tree), min_samples_leaf (minimum number of samples in a leaf), min_samples_split (minimum number of samples required to split a node), and max_features (maximum number of features considered per split). The original post illustrated their effect on model complexity with a figure.

(Figure: effect of each parameter on random forest model complexity.)

n_estimators has the largest effect, so we tune it first:

    # Plot the score against n_estimators, the most influential parameter
    score_lt = []

    # Build one forest every 10 steps: n_estimators = 1, 11, ..., 191
    for i in range(0, 200, 10):
        rfc = RandomForestClassifier(n_estimators=i + 1, random_state=90)
        score = cross_val_score(rfc, data.data, data.target, cv=10).mean()
        score_lt.append(score)

    score_max = max(score_lt)
    print("Max score: {}".format(score_max),
          "Number of trees: {}".format(score_lt.index(score_max) * 10 + 1))

    # Plot the curve
    x = np.arange(1, 201, 10)
    plt.subplot(111)
    plt.plot(x, score_lt, "r-")
    plt.show()

As the plot shows, accuracy improves visibly as n_estimators grows from 1 to 21. This matches how random forests behave: within a certain range, more trees give a better model. As the number of trees keeps growing, accuracy fluctuates, and the maximum score is reached at n_estimators = 41.

Automatic tuning with a framework

Optuna is an automated hyperparameter optimization framework designed for machine learning, written in Python. This is only a brief introduction; for details see https://github.com/optuna/optuna

A minimal Optuna program has just three core concepts: the objective function (objective), a single trial (trial), and the study (study). The objective defines the function to optimize and specifies the ranges of the parameters or hyperparameters. A trial corresponds to a single execution of the objective. The study manages the optimization: it decides the optimization direction, sets the total number of trials, and records the results.

    study: an optimization session based on an objective function, made up of a series of trials.
    trial: a single execution of the objective function.
    After many trials, the study reports the best hyperparameters found.

Tuning a random forest on the iris dataset

    import optuna
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    x, y = load_iris(return_X_y=True)

    def objective(trial):
        # Split the data; the original uses only 30% for training
        X_train, X_test, y_train, y_test = train_test_split(x, y, train_size=0.3)
        # Search space for this trial
        param = {
            "n_estimators": trial.suggest_int("n_estimators", 5, 20),
            "criterion": trial.suggest_categorical("criterion", ["gini", "entropy"]),
        }
        dt_clf = RandomForestClassifier(**param)
        dt_clf.fit(X_train, y_train)
        pred_dt = dt_clf.predict(X_test)
        # Accuracy on the held-out set
        score = (y_test == pred_dt).sum() / len(y_test)
        return score

    study = optuna.create_study(direction="maximize")
    n_trials = 20  # run 20 trials
    study.optimize(objective, n_trials=n_trials)
    print(study.best_value)
    print(study.best_params)

Result:

    [I 2021-04-12 16:20:13,627] A new study created in memory with name: no-name-47fe20d7-e9c0-4bed-bc6d-8113edae0bec
    [I 2021-04-12 16:20:13,652] Trial 0 finished with value: 0.9523809523809523 and parameters: {'n_estimators': 15, 'criterion': 'gini'}. Best is trial 0 with value: 0.9523809523809523.
    [I 2021-04-12 16:20:13,662] Trial 1 finished with value: 0.9523809523809523 and parameters: {'n_estimators': 5, 'criterion': 'gini'}. Best is trial 0 with value: 0.9523809523809523.
    [I 2021-04-12 16:20:13,680] Trial 2 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 15, 'criterion': 'entropy'}. Best is trial 0 with value: 0.9523809523809523.
    [I 2021-04-12 16:20:13,689] Trial 3 finished with value: 0.9523809523809523 and parameters: {'n_estimators': 7, 'criterion': 'gini'}. Best is trial 0 with value: 0.9523809523809523.
    [I 2021-04-12 16:20:13,704] Trial 4 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 14, 'criterion': 'gini'}. Best is trial 0 with value: 0.9523809523809523.
    [I 2021-04-12 16:20:13,721] Trial 5 finished with value: 0.9714285714285714 and parameters: {'n_estimators': 14, 'criterion': 'gini'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,733] Trial 6 finished with value: 0.9619047619047619 and parameters: {'n_estimators': 10, 'criterion': 'gini'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,753] Trial 7 finished with value: 0.9619047619047619 and parameters: {'n_estimators': 18, 'criterion': 'gini'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,764] Trial 8 finished with value: 0.9714285714285714 and parameters: {'n_estimators': 8, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,771] Trial 9 finished with value: 0.9333333333333333 and parameters: {'n_estimators': 5, 'criterion': 'gini'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,795] Trial 10 finished with value: 0.9333333333333333 and parameters: {'n_estimators': 20, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,809] Trial 11 finished with value: 0.9333333333333333 and parameters: {'n_estimators': 9, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,827] Trial 12 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 12, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,842] Trial 13 finished with value: 0.9238095238095239 and parameters: {'n_estimators': 11, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,855] Trial 14 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 8, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,880] Trial 15 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 18, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,899] Trial 16 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 13, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,911] Trial 17 finished with value: 0.9714285714285714 and parameters: {'n_estimators': 7, 'criterion': 'gini'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,933] Trial 18 finished with value: 0.9428571428571428 and parameters: {'n_estimators': 17, 'criterion': 'entropy'}. Best is trial 5 with value: 0.9714285714285714.
    [I 2021-04-12 16:20:13,948] Trial 19 finished with value: 0.9523809523809523 and parameters: {'n_estimators': 11, 'criterion': 'gini'}. Best is trial 5 with value: 0.9714285714285714.
    0.9714285714285714
    {'n_estimators': 14, 'criterion': 'gini'}

Trials 4 and 5 use identical parameters yet score differently, because neither the train/test split nor the forest is seeded, so each trial sees a different random split.
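To make the relationship between objective, trial, and study more concrete, here is a minimal sketch in plain Python. It is not Optuna's implementation: ToyTrial, ToyStudy, and toy_objective are hypothetical names invented for this illustration, the sampling is plain uniform random search (Optuna's own samplers are more sophisticated), and the objective is a made-up formula standing in for a model score. Only the method names mirror the Optuna calls used above.

```python
import random


class ToyTrial:
    """One execution of the objective; records every value it suggests."""

    def __init__(self, rng):
        self._rng = rng
        self.params = {}

    def suggest_int(self, name, low, high):
        # Sample uniformly from the inclusive range [low, high].
        value = self._rng.randint(low, high)
        self.params[name] = value
        return value

    def suggest_categorical(self, name, choices):
        # Pick one of the given choices uniformly at random.
        value = self._rng.choice(choices)
        self.params[name] = value
        return value


class ToyStudy:
    """Runs the objective repeatedly and remembers the best trial."""

    def __init__(self, direction="maximize", seed=0):
        if direction not in ("maximize", "minimize"):
            raise ValueError("direction must be 'maximize' or 'minimize'")
        self._sign = 1 if direction == "maximize" else -1
        self._rng = random.Random(seed)
        self.best_value = None
        self.best_params = None
        self.history = []  # (value, params) for every finished trial

    def optimize(self, objective, n_trials):
        for _ in range(n_trials):
            trial = ToyTrial(self._rng)
            value = objective(trial)
            self.history.append((value, dict(trial.params)))
            is_better = (
                self.best_value is None
                or self._sign * value > self._sign * self.best_value
            )
            if is_better:
                self.best_value = value
                self.best_params = dict(trial.params)


def toy_objective(trial):
    # Hypothetical stand-in for a model score (not a real classifier).
    n = trial.suggest_int("n_estimators", 5, 20)
    criterion = trial.suggest_categorical("criterion", ["gini", "entropy"])
    bonus = 0.01 if criterion == "gini" else 0.0
    return 1.0 - abs(n - 14) / 100 + bonus


study = ToyStudy(direction="maximize", seed=0)
study.optimize(toy_objective, n_trials=20)
print(study.best_value)
print(study.best_params)
```

The objective only declares the search space and returns a score, each trial supplies one concrete set of parameters, and the study owns the loop and the bookkeeping. Swapping the random draws inside ToyTrial for a smarter sampling strategy would not require any change to the objective, which is the main appeal of this separation.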