Trainers
Hyperopt trainers
class modelgym.trainers.hyperopt_trainer.HyperoptTrainer(model_spaces, algo=None, tracker=None)
Bases: modelgym.trainers.trainer.Trainer
HyperoptTrainer optimizes model hyperparameters using the hyperopt library.
Parameters:
- model_spaces (list of modelgym.models.Model or modelgym.utils.ModelSpaces) – list of model spaces (model classes and the parameter spaces to search). A list item that is a Model is converted to a ModelSpace with the default space and a name equal to the model class __name__.
- algo (function, e.g. hyperopt.rand.suggest or hyperopt.tpe.suggest) – algorithm to use for optimization
- tracker (modelgym.trackers.Tracker, optional) – tracker used to save optimization progress (and to load it, if any was saved)
Raises: ValueError if several model_spaces have the same name
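The conversion and the name check described above can be pictured with a small stand-alone sketch. It does not import modelgym: `NamedSpace`, `normalize_model_spaces`, and the two dummy model classes are hypothetical names introduced only for illustration, and the library itself may implement this differently. The sketch also assumes that the ValueError is triggered by exact duplicate names.

```python
from collections import Counter
from typing import NamedTuple


class NamedSpace(NamedTuple):
    """Hypothetical stand-in for a model space: a model class plus a name."""
    model_class: type
    name: str


def normalize_model_spaces(items):
    """Wrap bare classes into NamedSpace and reject duplicate names."""
    spaces = []
    for item in items:
        if isinstance(item, NamedSpace):
            spaces.append(item)
        else:
            # A bare model class gets a name equal to its __name__.
            spaces.append(NamedSpace(model_class=item, name=item.__name__))
    counts = Counter(space.name for space in spaces)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate model space names: {duplicates}")
    return spaces


class DummyForest:  # placeholder model class (assumption)
    pass


class DummyBoosting:  # placeholder model class (assumption)
    pass


spaces = normalize_model_spaces(
    [DummyForest, NamedSpace(DummyBoosting, "boosting-v2")]
)
names = [space.name for space in spaces]
```

Passing the same bare class twice to `normalize_model_spaces` raises ValueError in this sketch, because both entries would receive the same `__name__`; giving one of them an explicit name avoids the clash.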
crossval_optimize_params(opt_metric, dataset, cv=3, opt_evals=50, metrics=None, verbose=False, batch_size=10, client=None, **kwargs)
Find optimal hyperparameters for all models.
Parameters: - opt_metric (modelgym.metrics.Metric) – metric to optimize
- dataset (modelgym.utils.XYCDataset or None) – dataset
- cv (int or list of tuples of (XYCDataset, XYCDataset)) – if int, the number of cross-validation folds; otherwise, the cross-validation folds themselves.
- opt_evals (int) – number of cross-validation evaluations
- metrics (list of modelgym.metrics.Metric, optional) – additional metrics to evaluate
- verbose (bool) – Enable verbose output.
- batch_size (int) – periodicity of saving results to tracker
- client –
- **kwargs – ignored
Note
If cv is an int, then dataset is split into cv parts for cross-validation. Otherwise, the folds given in cv are used.
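The two ways of passing cv can be illustrated with a stdlib-only sketch. Plain Python lists stand in for XYCDataset, and `resolve_folds` is a hypothetical helper, not a modelgym function; the library may split, shuffle, or stratify differently. Only the branching between an integer and ready-made folds is shown.

```python
def resolve_folds(dataset, cv):
    """Return a list of (train, test) pairs.

    If cv is an int, split dataset into cv contiguous parts and use each
    part once as the test set. Otherwise assume cv already holds the folds.
    """
    if not isinstance(cv, int):
        return list(cv)
    if cv < 2 or cv > len(dataset):
        raise ValueError("cv must be between 2 and len(dataset)")
    # Spread the remainder over the first parts so sizes differ by at most one.
    base, extra = divmod(len(dataset), cv)
    folds, start = [], 0
    for i in range(cv):
        stop = start + base + (1 if i < extra else 0)
        test = dataset[start:stop]
        train = dataset[:start] + dataset[stop:]
        folds.append((train, test))
        start = stop
    return folds


rows = list(range(7))
int_folds = resolve_folds(rows, 3)  # cv as a fold count

custom = [([0, 1, 2], [3, 4]), ([3, 4], [0, 1, 2])]
custom_folds = resolve_folds(None, custom)  # cv as explicit folds
```

In the second call the dataset argument is not consulted at all, which matches the documented type of dataset (XYCDataset or None).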
get_best_results()
When training is complete, return the best parameters (and additional information) for each model space.
Returns: dict of shape:
{
    name (str): {
        "result": {
            "loss": float,
            "loss_variance": float,
            "status": "ok",
            "metric_cv_results": list,
            "params": dict
        },
        "model_space": modelgym.utils.ModelSpace
    }
}
name is the name of the corresponding model_space,
metric_cv_results contains dicts mapping metric names to the metric values calculated for each cross-validation fold,
params is the optimal parameters of the corresponding model,
model_space is the corresponding model_space.
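As a sketch of how the returned dict might be consumed, the snippet below picks the model space with the lowest loss. The `best_results` value is written by hand in the documented shape, all numbers and parameter names are made up for illustration, and `pick_lowest_loss` is a hypothetical helper. It assumes that a lower loss is better and that metric_cv_results holds one dict per fold.

```python
# Hand-written example in the documented shape; every value is illustrative.
best_results = {
    "model_a": {
        "result": {
            "loss": 0.31,
            "loss_variance": 0.002,
            "status": "ok",
            "metric_cv_results": [{"accuracy": 0.70}, {"accuracy": 0.68}],
            "params": {"depth": 4},
        },
        "model_space": None,  # would be a modelgym.utils.ModelSpace
    },
    "model_b": {
        "result": {
            "loss": 0.27,
            "loss_variance": 0.004,
            "status": "ok",
            "metric_cv_results": [{"accuracy": 0.74}, {"accuracy": 0.72}],
            "params": {"depth": 6},
        },
        "model_space": None,  # would be a modelgym.utils.ModelSpace
    },
}


def pick_lowest_loss(results):
    """Return (name, params) of the entry with the smallest loss."""
    name = min(results, key=lambda n: results[n]["result"]["loss"])
    return name, results[name]["result"]["params"]


best_name, best_params = pick_lowest_loss(best_results)
```

When losses are close, comparing "loss_variance" alongside "loss" may be a more cautious way to choose between model spaces.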
class modelgym.trainers.hyperopt_trainer.RandomTrainer(model_spaces, tracker=None)
Bases: modelgym.trainers.hyperopt_trainer.HyperoptTrainer
RandomTrainer is a HyperoptTrainer that uses random search.
class modelgym.trainers.hyperopt_trainer.TpeTrainer(model_spaces, tracker=None)
Bases: modelgym.trainers.hyperopt_trainer.HyperoptTrainer
TpeTrainer is a HyperoptTrainer that uses the Tree-structured Parzen Estimator (TPE) algorithm.
Skopt trainers
class modelgym.trainers.skopt_trainer.GPTrainer(model_spaces, tracker=None)
Bases: modelgym.trainers.skopt_trainer.SkoptTrainer
GPTrainer is a SkoptTrainer that performs Bayesian optimization using Gaussian processes.
class modelgym.trainers.skopt_trainer.RFTrainer(model_spaces, tracker=None)
Bases: modelgym.trainers.skopt_trainer.SkoptTrainer
RFTrainer is a SkoptTrainer that performs sequential optimization using decision trees.
class modelgym.trainers.skopt_trainer.SkoptTrainer(model_spaces, optimizer, tracker=None)
Bases: modelgym.trainers.trainer.Trainer
SkoptTrainer optimizes model hyperparameters using the skopt library.
Parameters:
- model_spaces (list of modelgym.models.Model or modelgym.utils.ModelSpaces) – list of model spaces (model classes and the parameter spaces to search). A list item that is a Model is converted to a ModelSpace with the default space and a name equal to the model class __name__.
- optimizer (function, e.g. forest_minimize or gp_minimize) – optimization function to use
- tracker (modelgym.trackers.Tracker, optional) – ignored
Raises: ValueError if several model_spaces have the same name
crossval_optimize_params(opt_metric, dataset, cv=3, opt_evals=50, metrics=None, verbose=False, **kwargs)
Find optimal hyperparameters for all models.
Parameters: - opt_metric (modelgym.metrics.Metric) – metric to optimize
- dataset (modelgym.utils.XYCDataset or None) – dataset
- cv (int or list of tuples of (XYCDataset, XYCDataset)) – if int, the number of cross-validation folds; otherwise, the cross-validation folds themselves.
- opt_evals (int) – number of cross-validation evaluations
- metrics (list of modelgym.metrics.Metric, optional) – additional metrics to evaluate
- verbose (bool) – Enable verbose output.
- **kwargs – ignored
Note
If cv is an int, then dataset is split into cv parts for cross-validation. Otherwise, the folds given in cv are used.
get_best_results()
When training is complete, return the best parameters (and additional information) for each model space.
Returns: dict of shape:
{
    name (str): {
        "result": {
            "loss": float,
            "metric_cv_results": list,
            "params": dict
        },
        "model_space": modelgym.utils.ModelSpace
    }
}
name is the name of the corresponding model_space,
metric_cv_results contains dicts mapping metric names to the metric values calculated for each cross-validation fold,
params is the optimal parameters of the corresponding model,
model_space is the corresponding model_space.
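Per-fold metric values are often easier to read once averaged. The sketch below assumes that metric_cv_results is a list with one {metric name: value} dict per fold, which is one reading of the description above; `mean_metrics` is a hypothetical helper and the fold values are made up for illustration.

```python
from statistics import mean


def mean_metrics(metric_cv_results):
    """Average each metric over folds.

    Expects a list of {metric_name: value} dicts, one per fold, all with
    the same keys. Returns an empty dict for an empty list.
    """
    if not metric_cv_results:
        return {}
    metric_names = metric_cv_results[0].keys()
    return {
        name: mean(fold[name] for fold in metric_cv_results)
        for name in metric_names
    }


# Illustrative per-fold values, not real results.
fold_metrics = [
    {"accuracy": 0.5, "roc_auc": 0.75},
    {"accuracy": 1.0, "roc_auc": 0.25},
]
averaged = mean_metrics(fold_metrics)
```

The same helper could be applied to the "metric_cv_results" entry of each model space in the dict returned by get_best_results, provided the assumed layout holds.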