Models¶
To use a model with our Trainer, it must be wrapped in a class that implements the Model interface described below.
We implement wrappers for several models: XGBoost, LightGBM, RandomForestClassifier, and Catboost.
We also implement an Ensemble Model.
Model interface¶
-
class
modelgym.models.model.
Model
(params=None)¶ Model is the base class for a factory of a specific ML algorithm implementation: it defines the algorithm-specific hyperparameter space and generic methods for model training and inference.
Parameters: params (dict or None) – parameters for model. -
fit
(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space
()¶ Returns: default parameter space Return type: dict from parameter name to hyperopt distribution
-
static
get_learning_task
()¶ Returns: task Return type: modelgym.models.LearningTask
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset)¶ Parameters: dataset (modelgym.utils.XYCDataset) – the input data, dataset.y may be None Returns: predictions Return type: np.array, shape (n_samples, )
-
predict_proba
(X)¶ Parameters: dataset (np.array, shape (n_samples, n_features)) – the input data Returns: predicted probabilities Return type: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
-
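As a rough illustration of the interface above, the sketch below implements a toy model with the same method names. It deliberately does not import modelgym: `ToyDataset` is a hypothetical stand-in for `modelgym.utils.XYCDataset`, `MeanRegressor` is an invented example class, and the pickle-based snapshot is an assumption, not necessarily the format the library uses.

```python
import os
import pickle
import tempfile
from collections import namedtuple

# Hypothetical stand-in for modelgym.utils.XYCDataset (only X and y are modelled).
ToyDataset = namedtuple("ToyDataset", ["X", "y"])


class MeanRegressor:
    """Toy model that mirrors the method names of the Model interface."""

    def __init__(self, params=None):
        self.params = dict(params or {})
        self.mean_ = None

    def fit(self, dataset, weights=None):
        # Weighted mean of the targets; uniform weights when none are given.
        if weights is None:
            weights = [1.0] * len(dataset.y)
        total = sum(w * y for w, y in zip(weights, dataset.y))
        self.mean_ = total / sum(weights)
        return self  # fit returns self, as in the interface

    def predict(self, dataset):
        # dataset.y may be None at prediction time; only X is used.
        return [self.mean_] * len(dataset.X)

    def is_possible_predict_proba(self):
        return False  # this toy regressor has no class probabilities

    def save_snapshot(self, filename):
        # Assumption: the snapshot is a pickle of the internal state.
        with open(filename, "wb") as f:
            pickle.dump({"params": self.params, "mean": self.mean_}, f)

    @staticmethod
    def load_from_snapshot(filename):
        with open(filename, "rb") as f:
            state = pickle.load(f)
        model = MeanRegressor(state["params"])
        model.mean_ = state["mean"]
        return model


def snapshot_round_trip():
    """Fit, save, reload, and predict with the reloaded model."""
    train = ToyDataset(X=[[0.0], [1.0], [2.0]], y=[1.0, 2.0, 3.0])
    model = MeanRegressor().fit(train)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.pkl")
        model.save_snapshot(path)
        restored = MeanRegressor.load_from_snapshot(path)
    return restored.predict(ToyDataset(X=[[5.0], [6.0]], y=None))
```

A real wrapper would subclass `modelgym.models.model.Model` and also provide the static `get_default_parameter_space` and `get_learning_task` methods, which are left out here because they depend on hyperopt and on `modelgym.models.LearningTask`.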
XGBoost¶
-
class
modelgym.models.xgboost_model.
XGBClassifier
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: params (dict) – parameters for model. -
fit
(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
-
-
class
modelgym.models.xgboost_model.
XGBRegressor
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – the type of task (classification, regression, …)
-
fit
(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
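Since classifier and regressor wrappers expose the same methods, calling code can use `is_possible_predict_proba()` to decide which prediction method to call. The sketch below shows that pattern with two hypothetical stub classes, `StubClassifier` and `StubRegressor`, so that it runs without modelgym or XGBoost installed. That regressor wrappers return `False` is an assumption based on the note that a regressor cannot predict probabilities.

```python
def predict_scores(model, dataset):
    """Return class probabilities when the model supports them, else plain predictions."""
    if model.is_possible_predict_proba():
        return model.predict_proba(dataset)
    return model.predict(dataset)


class StubClassifier:
    """Hypothetical stand-in for a classifier wrapper."""

    def is_possible_predict_proba(self):
        return True

    def predict(self, dataset):
        return [0 for _ in dataset]

    def predict_proba(self, dataset):
        # Two classes with equal probability for every sample.
        return [[0.5, 0.5] for _ in dataset]


class StubRegressor:
    """Hypothetical stand-in for a regressor wrapper."""

    def is_possible_predict_proba(self):
        return False

    def predict(self, dataset):
        return [0.0 for _ in dataset]

    def predict_proba(self, dataset):
        raise NotImplementedError("regressors cannot predict probabilities")
```

With this helper, `predict_scores(StubRegressor(), data)` falls back to `predict` and never reaches the method that raises.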
LightGBM¶
-
class
modelgym.models.lightgbm_model.
LGBMClassifier
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – the type of task (classification, regression, …)
-
fit
(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
-
class
modelgym.models.lightgbm_model.
LGBMRegressor
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – the type of task (classification, regression, …)
-
fit
(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
RandomForestClassifier¶
-
class
modelgym.models.rf_model.
RFClassifier
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – the type of task (classification, regression, …)
-
fit
(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
Catboost¶
-
class
modelgym.models.catboost_model.
CtBClassifier
(params=None)¶ Bases:
modelgym.models.model.Model
Wrapper for CatBoostClassifier
Parameters: params (dict) – parameters for model. -
fit
(dataset, weights=None, eval_dataset=None, **kwargs)¶ Parameters: - dataset (XYCDataset) – train
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset is None or
{'train': train_kwargs, 'eval': eval_kwargs}
otherwise
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
-
-
class
modelgym.models.catboost_model.
CtBRegressor
(params=None)¶ Bases:
modelgym.models.model.Model
Wrapper for CatBoostRegressor
Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – the type of task (classification, regression, …)
-
fit
(dataset, weights=None, eval_dataset=None, **kwargs)¶ Parameters: - dataset (XYCDataset) –
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset is None or
{'train': train_kwargs, 'eval': eval_kwargs}
otherwise
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Returns: serializable internal model state snapshot.
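The `kwargs` convention of the Catboost wrappers' `fit` can be read as follows: without an `eval_dataset`, all keyword arguments are CatBoost.Pool kwargs for the training pool; with one, they are given as separate `train` and `eval` dictionaries. The helper below is a sketch of that reading only. `split_pool_kwargs` is a hypothetical name, and the rejection of unexpected keys is an assumption, not documented behaviour of the wrappers.

```python
def split_pool_kwargs(kwargs, has_eval_dataset):
    """Return (train_kwargs, eval_kwargs) following the convention described for fit."""
    if not has_eval_dataset:
        # All keyword arguments go to the single training pool.
        return dict(kwargs), None
    unexpected = set(kwargs) - {"train", "eval"}
    if unexpected:
        # Assumption: only 'train' and 'eval' are meaningful in this mode.
        raise ValueError(f"unexpected keys with eval_dataset: {sorted(unexpected)}")
    return dict(kwargs.get("train", {})), dict(kwargs.get("eval", {}))
```

Under this reading, a call without an evaluation set might pass `cat_features=[0, 2]` directly, while a call with one might pass `train={'cat_features': [0, 2]}` and `eval={'cat_features': [0, 2]}`.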
Ensemble Model¶
-
class
modelgym.models.ensemble_model.
EnsembleClassifier
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: params (dict) – parameters for model. -
fit
(dataset, weights=None, **kwargs)¶ Parameters: - dataset (XYCDataset) – train
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset == None or
{'train': train_kwargs, 'eval': eval_kwargs}
otherwise
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
static
get_one_hot
(targets, nb_classes)¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename, models)¶ Parameters: filename – prefix for models’ files Returns: EnsembleClassifier
-
predict
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot
(filename)¶ Parameters: filename – prefix for models’ files Returns: serializable internal model state snapshot.
-
-
class
modelgym.models.ensemble_model.
EnsembleRegressor
(params=None)¶ Bases:
modelgym.models.model.Model
Parameters: params (dict) – parameters for model -
fit
(dataset, weights=None, **kwargs)¶ Parameters: - dataset (XYCDataset) – train
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset == None or
{'train': train_kwargs, 'eval': eval_kwargs}
otherwise
Returns: self
-
static
get_default_parameter_space
()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task
()¶
-
is_possible_predict_proba
()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot
(filename, models)¶ Parameters: filename – prefix for models’ files Returns: EnsembleRegressor
-
predict
(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba
(dataset, **kwargs)¶ Regressor can’t predict proba
-
save_snapshot
(filename)¶ Parameters: filename – prefix for models’ files Returns: serializable internal model state snapshot.
-
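`EnsembleClassifier.get_one_hot(targets, nb_classes)` is listed above without a description. Judging by its name and signature, it presumably converts integer class labels into one-hot rows. The pure-Python sketch below shows that encoding under that assumption. The library's own version may return a NumPy array and may handle invalid labels differently.

```python
def get_one_hot(targets, nb_classes):
    """Encode integer class labels as one-hot rows (illustrative sketch)."""
    rows = []
    for label in targets:
        if not 0 <= label < nb_classes:
            raise ValueError(f"label {label} is outside [0, {nb_classes})")
        row = [0] * nb_classes
        row[label] = 1  # mark the position of the true class
        rows.append(row)
    return rows
```

For example, labels `[0, 2, 1]` with three classes give the rows `[1, 0, 0]`, `[0, 0, 1]`, and `[0, 1, 0]`.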