Official tutorial for this post:
https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html

Model tuning

When you are not satisfied with your model's performance, you may want to improve it, either by tuning the hyperparameters or by switching to a model that suits your data better. This post walks through those steps. It covers three topics:

Hyperparameter tuning: every machine-learning model comes with default hyperparameters, but defaults cannot adapt themselves to your data and often fall short of the best achievable performance; manual tuning rarely finds the optimum either. mlr3 ships strategies for automated tuning; to use them you specify a search space (search_space), an optimization algorithm (the tuning method), an evaluation scheme (a resampling strategy), and a performance measure.
Feature selection: done mainly through the mlr3filters and mlr3fselect packages.
Nested resampling.
Many people joke that tuning is like alchemy, and that is about right; quite often the tuned result is no better than the defaults, like the gaming quip "a flurry of moves fierce as a tiger, then the scoreboard reads 0:5". Tuning must be grounded in an understanding of the algorithm and the data; it is not something you do at random.

We demonstrate with the famous Pima diabetes dataset. First create the task:

library(mlr3verse)

## Loading required package: mlr3

task <- tsk("pima")
print(task)
## <TaskClassif:pima> (768 x 9)
## * Target: diabetes
## * Properties: twoclass
## * Features (8):
## - dbl (8): age, glucose, insulin, mass, pedigree, pregnant, pressure,
## triceps

Choose an algorithm and view the hyperparameters it supports:

learner <- lrn("classif.rpart")
learner$param_set
## <ParamSet>
## id class lower upper nlevels default value
## 1: cp ParamDbl 0 1 Inf 0.01
## 2: keep_model ParamLgl NA NA 2 FALSE
## 3: maxcompete ParamInt 0 Inf Inf 4
## 4: maxdepth ParamInt 1 30 30 30
## 5: maxsurrogate ParamInt 0 Inf Inf 5
## 6: minbucket ParamInt 1 Inf Inf NoDefault[3]
## 7: minsplit ParamInt 1 Inf Inf 20
## 8: surrogatestyle ParamInt 0 1 2 0
## 9: usesurrogate ParamInt 0 2 3 2
## 10: xval ParamInt 0 Inf Inf 10 0
Here we choose to tune the complexity parameter cp and the minimum split parameter minsplit, and define the range to search for each:

search_space <- ps(
  cp = p_dbl(lower = 0.001, upper = 0.1),
  minsplit = p_int(lower = 1, upper = 10)
)
search_space
## <ParamSet>
## id class lower upper nlevels default value
## 1: cp ParamDbl 0.001 0.1 Inf NoDefault[3]
## 2: minsplit ParamInt 1.000 10.0 10 NoDefault[3]

Then choose a resampling strategy and a performance measure:

hout <- rsmp("holdout", ratio = 0.7)
measure <- msr("classif.ce")
Next comes the actual tuning, and there are two ways to do it. Method 1: training via TuningInstanceSingleCrit plus a Tuner.
library(mlr3tuning)
## Loading required package: paradox

evals20 <- trm("evals", n_evals = 20)  # set when to stop training

# gather everything in a tuning instance
instance <- TuningInstanceSingleCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measure = measure,
  terminator = evals20,
  search_space = search_space
)
instance
## <TuningInstanceSingleCrit>
## * State: Not optimized
## * Objective: <ObjectiveTuning:classif.rpart_on_pima>
## * Search Space:
## <ParamSet>
## id class lower upper nlevels default value
## 1: cp ParamDbl 0.001 0.1 Inf NoDefault[3]
## 2: minsplit ParamInt 1.000 10.0 10 NoDefault[3]
## * Terminator: TerminatorEvals
## * Terminated: FALSE
## * Archive:
## <ArchiveTuning>
## Null data.table (0 rows and 0 cols)

As for when to stop training, mlr3 offers five options (each sketched in code right after this list):

Terminate after a given time
Terminate after a given number of iterations
Terminate after a specific performance has been reached
Terminate when tuning does not find a better configuration for a given number of iterations (stagnation)
A combination of the above, in ALL or ANY fashion
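As a sketch, the corresponding constructors look like this (the terminator keys come from the bbotk package; exact argument names may vary slightly between versions):

trm("run_time", secs = 300)                      # stop after a given time (in seconds)
trm("evals", n_evals = 20)                       # stop after a given number of evaluations
trm("perf_reached", level = 0.2)                 # stop once the measure reaches 0.2
trm("stagnation", iters = 10, threshold = 1e-5)  # stop when no better configuration turns up
trm("combo",
    list(trm("evals", n_evals = 20), trm("run_time", secs = 300)),
    any = TRUE)                                  # stop as soon as ANY of the two fires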
You also need to choose a search strategy; mlr3tuning currently supports the following (the alternatives are sketched after the code below):

Grid search
Random search
Generalized simulated annealing
Non-linear optimization

Here we pick grid search:
tuner <- tnr("grid_search", resolution = 5)  # grid search
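The other tuners are constructed the same way; a sketch (note that "gensa" and "nloptr" additionally require the GenSA and nloptr packages, and the algorithm argument shown is just one possible choice):

tnr("random_search")                          # random search
tnr("gensa")                                  # generalized simulated annealing
tnr("nloptr", algorithm = "NLOPT_LN_BOBYQA")  # non-linear optimization via nloptr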
Now run the training. We set the grid resolution to 5 above, and there are 2 hyperparameters to tune, so in theory there are 5 * 5 = 25 combinations; but because the terminator was set to n_evals = 20, the search actually stops after 20 combinations have been evaluated.

# lgr::get_logger("mlr3")$set_threshold("warn")
# lgr::get_logger("bbotk")$set_threshold("warn")  # reduce console output
tuner$optimize(instance)
## INFO [20:51:28.312] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=20, k=0]'
## INFO [20:51:28.331] [bbotk] Evaluating 1 configuration(s)
## (output omitted)
## INFO [20:51:29.306] [bbotk] uhash
## INFO [20:51:29.306] [bbotk] 58eb421d-f0ed-4246-8430-3c1832ae615c
## INFO [20:51:29.309] [bbotk] Finished optimizing after 20 evaluation(s)
## INFO [20:51:29.310] [bbotk] Result:
## INFO [20:51:29.310] [bbotk] cp minsplit learner_param_vals x_domain classif.ce
## INFO [20:51:29.310] [bbotk] 0.02575 3 list[3] list[2] 0.2130435
## cp minsplit learner_param_vals x_domain classif.ce
## 1: 0.02575 3 list[3] list[2] 0.2130435

View the tuned hyperparameters:

instance$result_learner_param_vals
## $xval
## [1] 0
##
## $cp
## [1] 0.02575
##
## $minsplit
## [1] 3

View the model's performance:
instance$result_y
## classif.ce
## 0.2130435
View the result of every single iteration (there are only 20):

instance$archive
## <ArchiveTuning>
## cp minsplit classif.ce runtime_learners timestamp batch_nr
## 1: 0.026 3 0.21 0.02 2022-02-27 20:51:28 1
## 2: 0.075 8 0.21 0.00 2022-02-27 20:51:28 2
## 3: 0.050 5 0.21 0.00 2022-02-27 20:51:28 3
## 4: 0.001 1 0.30 0.00 2022-02-27 20:51:28 4
## 5: 0.100 3 0.21 0.02 2022-02-27 20:51:28 5
## 6: 0.026 5 0.21 0.02 2022-02-27 20:51:28 6
## 7: 0.100 8 0.21 0.01 2022-02-27 20:51:28 7
## 8: 0.001 8 0.27 0.00 2022-02-27 20:51:28 8
## 9: 0.001 5 0.28 0.00 2022-02-27 20:51:28 9
## 10: 0.100 5 0.21 0.02 2022-02-27 20:51:28 10
## 11: 0.075 10 0.21 0.00 2022-02-27 20:51:28 11
## 12: 0.050 10 0.21 0.01 2022-02-27 20:51:28 12
## 13: 0.075 5 0.21 0.00 2022-02-27 20:51:28 13
## 14: 0.050 8 0.21 0.01 2022-02-27 20:51:29 14
## 15: 0.001 10 0.26 0.00 2022-02-27 20:51:29 15
## 16: 0.050 3 0.21 0.00 2022-02-27 20:51:29 16
## 17: 0.050 1 0.21 0.02 2022-02-27 20:51:29 17
## 18: 0.100 10 0.21 0.00 2022-02-27 20:51:29 18
## 19: 0.075 1 0.21 0.01 2022-02-27 20:51:29 19
## 20: 0.026 1 0.21 0.00 2022-02-27 20:51:29 20
## warnings errors resample_result
## 1: 0 0 ResampleResult[22]
## 2: 0 0 ResampleResult[22]
## 3: 0 0 ResampleResult[22]
## 4: 0 0 ResampleResult[22]
## 5: 0 0 ResampleResult[22]
## 6: 0 0 ResampleResult[22]
## 7: 0 0 ResampleResult[22]
## 8: 0 0 ResampleResult[22]
## 9: 0 0 ResampleResult[22]
## 10: 0 0 ResampleResult[22]
## 11: 0 0 ResampleResult[22]
## 12: 0 0 ResampleResult[22]
## 13: 0 0 ResampleResult[22]
## 14: 0 0 ResampleResult[22]
## 15: 0 0 ResampleResult[22]
## 16: 0 0 ResampleResult[22]
## 17: 0 0 ResampleResult[22]
## 18: 0 0 ResampleResult[22]
## 19: 0 0 ResampleResult[22]
## 20: 0 0 ResampleResult[22]
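Before retraining, it can help to look over the search more comfortably; a sketch (as.data.table() flattens the archive, and autoplot() for tuning instances comes from the mlr3viz package):

as.data.table(instance$archive)[order(classif.ce)][1:3]  # the three best configurations
library(mlr3viz)
autoplot(instance)  # performance across the searched hyperparameter values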
Now the tuned hyperparameters can be applied to the learner and the model refit on the data:

learner$param_set$values <- instance$result_learner_param_vals
learner$train(task)
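The freshly trained learner can make predictions right away; a minimal sketch (scoring back on the training task only to show the mechanics):

pred <- learner$predict(task)
pred$score(msr("classif.acc"))  # accuracy on the training data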
The trained model can be used for prediction with learner$predict(), as sketched above. Written out like this the steps are a bit convoluted; compared with tidymodels they are not as concise and readable, and when I started learning I could never remember them. Luckily, after a version update there is now a compact form:

instance <- tune(
  task = task,
  learner = learner,
  resampling = hout,
  measure = measure,
  search_space = search_space,
  method = "grid_search",
  resolution = 5,
  term_evals = 25
)
## INFO [20:51:29.402] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=25, k=0]'
## INFO [20:51:29.403] [bbotk] Evaluating 1 configuration(s)
## INFO [20:51:29.411] [mlr3] Running benchmark with 1 resampling iterations
## (output omitted)
## INFO [20:51:30.535] [bbotk] 0.02575 10 list[3] list[2] 0.2347826

instance$result_learner_param_vals
## $xval
## [1] 0
##
## $cp
## [1] 0.02575
##
## $minsplit
## [1] 10
instance$result_y
## classif.ce
## 0.2347826
learner$param_set$values <- instance$result_learner_param_vals
learner$train(task)

mlr3 also supports specifying several performance measures at once:

measures <- msrs(c("classif.ce", "time_train"))  # several measures
evals20 <- trm("evals", n_evals = 20)
instance <- TuningInstanceMultiCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measures = measures,
  search_space = search_space,
  terminator = evals20
)
tuner$optimize(instance)
## INFO [20:51:30.595] [bbotk] Starting to optimize 2 parameter(s) with '<TunerGridSearch>' and '<TerminatorEvals> [n_evals=20, k=0]'
## INFO [20:51:30.597] [bbotk] Evaluating 1 configuration(s)
## (output omitted)

View the results:

instance$result_learner_param_vals
## [[1]]
## [[1]]$xval
## [1] 0
##
## [[1]]$cp
## [1] 0.0505
##
## [[1]]$minsplit
## [1] 1
##
##
## [[2]]
## [[2]]$xval
## [1] 0
##
## [[2]]$cp
## [1] 0.07525
##
## [[2]]$minsplit
## [1] 1
##
##
## [[3]]
## [[3]]$xval
## [1] 0
##
## [[3]]$cp
## [1] 0.07525
##
## [[3]]$minsplit
## [1] 10
##
##
## [[4]]
## [[4]]$xval
## [1] 0
##
## [[4]]$cp
## [1] 0.1
##
## [[4]]$minsplit
## [1] 8
##
##
## [[5]]
## [[5]]$xval
## [1] 0
##
## [[5]]$cp
## [1] 0.02575
##
## [[5]]$minsplit
## [1] 3
##
##
## [[6]]
## [[6]]$xval
## [1] 0
##
## [[6]]$cp
## [1] 0.07525
##
## [[6]]$minsplit
## [1] 8
##
##
## [[7]]
## [[7]]$xval
## [1] 0
##
## [[7]]$cp
## [1] 0.1
##
## [[7]]$minsplit
## [1] 3
##
##
## [[8]]
## [[8]]$xval
## [1] 0
##
## [[8]]$cp
## [1] 0.1
##
## [[8]]$minsplit
## [1] 5
##
##
## [[9]]
## [[9]]$xval
## [1] 0
##
## [[9]]$cp
## [1] 0.02575
##
## [[9]]$minsplit
## [1] 5
##
##
## [[10]]
## [[10]]$xval
## [1] 0
##
## [[10]]$cp
## [1] 0.07525
##
## [[10]]$minsplit
## [1] 5
##
##
## [[11]]
## [[11]]$xval
## [1] 0
##
## [[11]]$cp
## [1] 0.0505
##
## [[11]]$minsplit
## [1] 8
##
##
## [[12]]
## [[12]]$xval
## [1] 0
##
## [[12]]$cp
## [1] 0.0505
##
## [[12]]$minsplit
## [1] 3
##
##
## [[13]]
## [[13]]$xval
## [1] 0
##
## [[13]]$cp
## [1] 0.07525
##
## [[13]]$minsplit
## [1] 3
##
##
## [[14]]
## [[14]]$xval
## [1] 0
##
## [[14]]$cp
## [1] 0.0505
##
## [[14]]$minsplit
## [1] 5
##
##
## [[15]]
## [[15]]$xval
## [1] 0
##
## [[15]]$cp
## [1] 0.02575
##
## [[15]]$minsplit
## [1] 1
instance$rusult_y  # a typo in the original (it should be instance$result_y), hence the NULL below
## NULL
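With several measures there is usually no single best configuration, only trade-offs, so it pays to inspect all evaluated configurations yourself. A sketch (assuming the multi-criteria instance from above; the archive holds one column per measure):

library(data.table)
res <- as.data.table(instance$archive)
res[order(classif.ce), .(cp, minsplit, classif.ce, time_train)]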
That was the first method; now for the second. Method 2: training via AutoTuner.

This approach rolls tuning and applying the tuned hyperparameters to the model into one step, but the various ingredients still need to be set up in advance:

task <- tsk("pima")                       # create the task
learner <- lrn("classif.rpart")           # choose the learner
search_space <- ps(
  cp = p_dbl(0.001, 0.1),
  minsplit = p_int(1, 10)
)                                         # define the search space
terminator <- trm("evals", n_evals = 10)  # stopping criterion
tuner <- tnr("random_search")             # search method
resampling <- rsmp("holdout")             # resampling strategy
measure <- msr("classif.acc")             # performance measure

# train
at <- AutoTuner$new(
  learner = learner,
  resampling = resampling,
  search_space = search_space,
  measure = measure,
  tuner = tuner,
  terminator = terminator
)

Automatically select the best hyperparameters and apply them to the data:

at$train(task)
## INFO [20:51:31.873] [bbotk] Starting to optimize 2 parameter(s) with '<OptimizerRandomSearch>' and '<TerminatorEvals> [n_evals=10, k=0]'
## INFO [20:51:31.882] [bbotk] Evaluating 1 configuration(s)
## (a great deal of output omitted)
## INFO [20:51:32.332] [bbotk] 0.02278977 3 list[3] list[2] 0.7695312
at$predict(task)
## <PredictionClassif> for 768 observations:
## row_ids truth response
## 1 pos pos
## 2 neg neg
## 3 pos neg
## ---
## 766 neg neg
## 767 pos neg
## 768 neg neg

This method also has a compact form:

auto_learner <- auto_tuner(
  learner = learner,
  resampling = resampling,
  measure = measure,
  search_space = search_space,
  method = "random_search",
  term_evals = 10
)
auto_learner$train(task)
## INFO [20:51:32.407] [bbotk] Starting to optimize 2 parameter(s) with '<OptimizerRandomSearch>' and '<TerminatorEvals> [n_evals=10, k=0]'
## INFO [20:51:32.414] [bbotk] Evaluating 1 configuration(s)
## INFO [20:51:32.421] [mlr3] Running benchmark with 1 resampling iterations
## INFO [20:51:32.425] [mlr3] Applying learner classif.rpart on task pima (iter 1/1)
## (a great deal of output omitted)
auto_learner$predict(task)
## <PredictionClassif> for 768 observations:
## row_ids truth response
## 1 pos pos
## 2 neg neg
## 3 pos neg
## ---
## 766 neg neg
## 767 pos neg
## 768 neg neg
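As an aside, because an AutoTuner behaves like an ordinary learner, plugging it into resample() yields nested resampling, the third topic announced at the beginning. A minimal sketch (the inner holdout does the tuning, the outer 3-fold CV gives a performance estimate not contaminated by the tuning):

rr <- resample(task, auto_learner, rsmp("cv", folds = 3), store_models = TRUE)
rr$aggregate(msr("classif.ce"))  # outer-loop performance estimate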
Ways of specifying hyperparameters

Setting up the hyperparameter ranges separately every time can feel clumsy and tedious, so mlr3 also offers a way to declare the hyperparameters while choosing the learner.

# set hyperparameter ranges while choosing the learner
learner <- lrn("classif.svm")
learner$param_set$values$kernel <- "polynomial"
learner$param_set$values$degree <- to_tune(lower = 1, upper = 3)

print(learner$param_set$search_space())
## <ParamSet>
## id class lower upper nlevels default value
## 1: degree ParamInt 1 3 3 NoDefault[3]

This approach has a problem of its own, though: it requires you to know the algorithm well, remember all of its hyperparameters, and recall how each of them is spelled in mlr3, which is clearly asking a lot. So I still recommend the first approach of setting each range explicitly; whatever you cannot remember, you can look up.
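For completeness: a learner carrying to_tune() tokens can be handed to tune() without a separate search_space, which is then built from the tokens. A sketch (classif.svm requires the e1071 package; the arguments mirror the tune() call shown earlier):

instance <- tune(
  task = tsk("pima"),
  learner = learner,  # degree carries a to_tune() token
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  method = "grid_search",
  resolution = 3
)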
Parameter dependencies

Some hyperparameters only take effect under certain conditions. In a support vector machine (SVM), for example, the degree parameter is only meaningful when kernel is "polynomial"; this kind of dependency can also be declared in mlr3:

library(data.table)
search_space <- ps(
  cost = p_dbl(-1, 1, trafo = function(x) 10^x),          # values can be transformed
  kernel = p_fct(c("polynomial", "radial")),
  degree = p_int(1, 3, depends = kernel == "polynomial")  # declare the dependency
)
rbindlist(generate_design_grid(search_space, 3)$transpose(), fill = TRUE)
## cost kernel degree
## 1: 0.1 polynomial 1
## 2: 0.1 polynomial 2
## 3: 0.1 polynomial 3
## 4: 0.1 radial NA
## 5: 1.0 polynomial 1
## 6: 1.0 polynomial 2
## 7: 1.0 polynomial 3
## 8: 1.0 radial NA
## 9: 10.0 polynomial 1
## 10: 10.0 polynomial 2
## 11: 10.0 polynomial 3
## 12: 10.0 radial NA

Setting up hyperparameters
Hyperparameter definitions are implemented through the paradox package.
reference-based objects
paradox is a rewrite of ParamHelpers, built entirely on R6 objects.

library(paradox)
ps <- ParamSet$new()
ps2 <- ps                     # a reference to the same object
ps3 <- ps$clone(deep = TRUE)  # a deep copy
print(ps)                     # at this point ps2 and ps3 print the same as ps
## <ParamSet>
## Empty.

ps$add(ParamLgl$new("a"))
print(ps)
## <ParamSet>
## id class lower upper nlevels default value
## 1: a ParamLgl NA NA 2 NoDefault[3]
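A quick check of the reference semantics (a sketch using the ps, ps2, ps3 objects from above): ps2 points at the very same object as ps, so it now also contains "a", while the deep clone ps3 is unaffected:

ps2$ids()  # "a" (same underlying object as ps)
ps3$ids()  # character(0) (the deep copy is still empty)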
Defining parameter ranges (the parameter space)

The hyperparameters in the paradox package come in the following main types:

ParamInt: integers
ParamDbl: doubles (floating-point numbers)
ParamFct: factors
ParamLgl: logicals (TRUE / FALSE)
ParamUty: a parameter that can take arbitrary values

The full syntax for defining hyperparameter ranges (the previous posts used the shorthand):
library(paradox)
parA <- ParamLgl$new(id = "A")
parB <- ParamInt$new(id = "B", lower = 0, upper = 10, tags = c("tag1", "tag2"))
parC <- ParamDbl$new(id = "C", lower = 0, upper = 4, special_vals = list(NULL))
parD <- ParamFct$new(id = "D", levels = c("x", "y", "z"), default = "y")
parE <- ParamUty$new(id = "E", custom_check = function(x) checkmate::checkFunction(x))
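The individual parameters defined this way would then be collected into a ParamSet; a minimal sketch following the same R6-style paradox API as above:

ps <- ParamSet$new(list(parA, parB, parC, parD, parE))
print(ps)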