Models¶
To use the Trainer, your model must be wrapped in a class that implements the Model interface described below.
Wrappers are provided for several models: XGBoost, LightGBM, RandomForestClassifier and CatBoost.
An Ensemble Model is also provided.
Model interface¶
-
class
modelgym.models.model.Model(params=None)¶ Model is a base class for a specific ML algorithm implementation factory, i.e. it defines the algorithm-specific hyperparameter space and generic methods for model training and inference.
Parameters: params (dict or None) – parameters for model. -
fit(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space()¶ Returns: default parameter space Return type: dict from parameter name to hyperopt distribution
-
static
get_learning_task()¶ Returns: task Return type: modelgym.models.LearningTask
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset)¶ Parameters: dataset (modelgym.utils.XYCDataset) – the input data, dataset.y may be None Returns: predictions Return type: np.array, shape (n_samples, )
-
predict_proba(X)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: predicted probabilities Return type: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
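The interface above can be illustrated with a minimal sketch. It does not import modelgym: `MeanRegressor` and the `Dataset` named tuple are hypothetical stand-ins (the latter for modelgym.utils.XYCDataset, assumed here to expose `X` and `y`), and pickle is only one possible snapshot format, not necessarily the one the library uses.

```python
import os
import pickle
import tempfile
from collections import namedtuple

# Hypothetical stand-in for modelgym.utils.XYCDataset (assumed to expose X and y)
Dataset = namedtuple("Dataset", ["X", "y"])


class MeanRegressor:
    """Hypothetical model that mirrors the Model interface; not part of modelgym."""

    def __init__(self, params=None):
        # params may be None, in which case an empty dict is used
        self.params = dict(params or {})
        self.mean_ = None

    def fit(self, dataset, weights=None):
        # Weighted mean of the targets; uniform weights when none are given
        if weights is None:
            weights = [1.0] * len(dataset.y)
        total = sum(weights)
        self.mean_ = sum(w * t for w, t in zip(weights, dataset.y)) / total
        return self  # fit returns self, as in the interface

    def predict(self, dataset):
        # One prediction per input row; dataset.y may be None here
        return [self.mean_ for _ in dataset.X]

    def is_possible_predict_proba(self):
        # A regressor has no class probabilities
        return False

    def save_snapshot(self, filename):
        # Persist only plain Python state so the snapshot is easy to reload
        with open(filename, "wb") as f:
            pickle.dump({"params": self.params, "mean": self.mean_}, f)

    @staticmethod
    def load_from_snapshot(filename):
        with open(filename, "rb") as f:
            state = pickle.load(f)
        model = MeanRegressor(state["params"])
        model.mean_ = state["mean"]
        return model


# Usage: fit, snapshot to a temporary file, and restore
train = Dataset(X=[[0], [1], [2]], y=[1.0, 2.0, 6.0])
model = MeanRegressor().fit(train)
snapshot_path = os.path.join(tempfile.mkdtemp(), "mean_regressor.pkl")
model.save_snapshot(snapshot_path)
restored = MeanRegressor.load_from_snapshot(snapshot_path)
```

Returning self from fit allows construction and training to be chained in one expression, as in the usage lines.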
-
XGBoost¶
-
class
modelgym.models.xgboost_model.XGBClassifier(params=None)¶ Bases:
modelgym.models.model.Model Parameters: params (dict) – parameters for model. -
fit(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
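For a classifier, predict_proba returns an array of shape (n_samples, n_classes), while predict returns one value per sample. A minimal sketch of how the two shapes relate, assuming class labels are the column indices of the probability array. The helper name `labels_from_proba` is hypothetical, and the source does not state that the wrappers derive labels this way.

```python
def labels_from_proba(proba):
    """Return, for each row, the index of the most probable class."""
    # Ties resolve to the lowest class index because max keeps the first maximum
    return [max(range(len(row)), key=row.__getitem__) for row in proba]


# Two samples, two classes: shape (n_samples, n_classes) becomes shape (n_samples,)
example_labels = labels_from_proba([[0.1, 0.9], [0.7, 0.3]])
```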
-
-
class
modelgym.models.xgboost_model.XGBRegressor(params=None)¶ Bases:
modelgym.models.model.Model Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – set type of task (classification, regression, …)
-
fit(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
LightGBM¶
-
class
modelgym.models.lightgbm_model.LGBMClassifier(params=None)¶ Bases:
modelgym.models.model.Model Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – set type of task (classification, regression, …)
-
fit(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
-
class
modelgym.models.lightgbm_model.LGBMRegressor(params=None)¶ Bases:
modelgym.models.model.Model Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – set type of task (classification, regression, …)
-
fit(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
RandomForestClassifier¶
-
class
modelgym.models.rf_model.RFClassifier(params=None)¶ Bases:
modelgym.models.model.Model Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – set type of task (classification, regression, …)
-
fit(dataset, weights=None)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset)¶ Parameters: X (np.array, shape (n_samples, n_features)) – the input data Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
CatBoost¶
-
class
modelgym.models.catboost_model.CtBClassifier(params=None)¶ Bases:
modelgym.models.model.Model Wrapper for CatBoostClassifier
Parameters: params (dict) – parameters for model. -
fit(dataset, weights=None, eval_dataset=None, **kwargs)¶ Parameters: - dataset (XYCDataset) – train
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset is None or
{'train': train_kwargs, 'eval': eval_kwargs} otherwise
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
-
-
class
modelgym.models.catboost_model.CtBRegressor(params=None)¶ Bases:
modelgym.models.model.Model Wrapper for CatBoostRegressor
Parameters: - params (dict or None) – parameters for model. If None default params are fetched.
- learning_task (str) – set type of task (classification, regression, …)
-
fit(dataset, weights=None, eval_dataset=None, **kwargs)¶ Parameters: - dataset (XYCDataset) –
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset is None or
{'train': train_kwargs, 'eval': eval_kwargs} otherwise
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename)¶ Loads the model from a serializable internal model state snapshot.
-
predict(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Returns: serializable internal model state snapshot.
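The kwargs convention documented for the CatBoost fit methods can be sketched as follows: when there is no evaluation dataset, all keyword arguments describe the single training pool; otherwise they are keyed by 'train' and 'eval'. The helper name `split_pool_kwargs` is hypothetical and is not claimed to exist in modelgym. It only illustrates the documented convention.

```python
def split_pool_kwargs(kwargs, has_eval_dataset):
    """Return (train_kwargs, eval_kwargs) following the documented convention."""
    if not has_eval_dataset:
        # No evaluation dataset: every keyword argument belongs to the training pool
        return dict(kwargs), None
    # With an evaluation dataset, kwargs is expected to be keyed by 'train' and 'eval'
    return dict(kwargs.get("train", {})), dict(kwargs.get("eval", {}))


# Without an evaluation dataset
train_only = split_pool_kwargs({"feature_names": ["a", "b"]}, has_eval_dataset=False)

# With an evaluation dataset
both = split_pool_kwargs(
    {"train": {"feature_names": ["a", "b"]}, "eval": {"feature_names": ["a", "b"]}},
    has_eval_dataset=True,
)
```

Copying the dictionaries keeps the caller's kwargs unchanged if the pools later modify their arguments.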
Ensemble Model¶
-
class
modelgym.models.ensemble_model.EnsembleClassifier(params=None)¶ Bases:
modelgym.models.model.Model Parameters: params (dict) – parameters for model. -
fit(dataset, weights=None, **kwargs)¶ Parameters: - dataset (XYCDataset) – train
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset == None or
{'train': train_kwargs, 'eval': eval_kwargs} otherwise
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
static
get_one_hot(targets, nb_classes)¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename, models)¶ Parameters: filename – prefix for models’ files Returns: EnsembleClassifier
-
predict(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, n_classes)
-
save_snapshot(filename)¶ Parameters: filename – prefix for models’ files Returns: serializable internal model state snapshot.
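The source gives no description for get_one_hot(targets, nb_classes). Assuming it performs conventional one-hot encoding of integer class labels, the idea can be sketched in plain Python. The function name `one_hot_sketch` is hypothetical, and the library's own return type (for example a NumPy array) may differ.

```python
def one_hot_sketch(targets, nb_classes):
    """Encode integer class labels as rows with a single 1 at the label index."""
    return [
        [1 if class_index == target else 0 for class_index in range(nb_classes)]
        for target in targets
    ]


# Three samples with labels 0, 2 and 1 over three classes
encoded = one_hot_sketch([0, 2, 1], 3)
```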
-
-
class
modelgym.models.ensemble_model.EnsembleRegressor(params=None)¶ Bases:
modelgym.models.model.Model Parameters: params (dict) – parameters for model. -
fit(dataset, weights=None, **kwargs)¶ Parameters: - dataset (XYCDataset) – train
- y (np.array, shape (n_samples, ) or (n_samples, n_outputs)) – the target data
- weights (np.array, shape (n_samples, ) or (n_samples, n_outputs) or None) – weights of the data
- eval_dataset – same as dataset
- kwargs – CatBoost.Pool kwargs if eval_dataset == None or
{'train': train_kwargs, 'eval': eval_kwargs} otherwise
Returns: self
-
static
get_default_parameter_space()¶ Returns: dict of DistributionWrappers
-
static
get_learning_task()¶
-
is_possible_predict_proba()¶ Returns: bool, whether model can predict proba
-
static
load_from_snapshot(filename, models)¶ Parameters: filename – prefix for models’ files Returns: EnsembleRegressor
-
predict(dataset, **kwargs)¶ Parameters: - X (np.array, shape (n_samples, n_features)) – the input data
- kwargs – CatBoost.Pool kwargs
Returns: np.array, shape (n_samples, ) or (n_samples, n_outputs)
-
predict_proba(dataset, **kwargs)¶ Regressor can’t predict proba
-
save_snapshot(filename)¶ Parameters: filename – prefix for models’ files Returns: serializable internal model state snapshot.
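For the ensemble models, filename is a prefix for the member models' files rather than a single path. A minimal sketch of that idea, assuming one pickle file per member named `<prefix>_<index>.pkl`. This naming scheme and the helpers `save_members` and `load_members` are hypothetical, and the actual file layout used by modelgym is not given in the source.

```python
import os
import pickle
import tempfile


def save_members(prefix, members):
    """Write each member to its own file derived from the prefix; return the paths."""
    paths = []
    for index, member in enumerate(members):
        path = "{}_{}.pkl".format(prefix, index)  # hypothetical naming scheme
        with open(path, "wb") as f:
            pickle.dump(member, f)
        paths.append(path)
    return paths


def load_members(prefix, count):
    """Read back `count` members saved under the same prefix, in order."""
    members = []
    for index in range(count):
        with open("{}_{}.pkl".format(prefix, index), "rb") as f:
            members.append(pickle.load(f))
    return members


# Usage: plain dicts stand in for the fitted member models
prefix = os.path.join(tempfile.mkdtemp(), "ensemble")
saved_paths = save_members(prefix, [{"weight": 0.4}, {"weight": 0.6}])
loaded = load_members(prefix, len(saved_paths))
```

A prefix lets a single argument address any number of member files, which fits a load_from_snapshot signature that takes both a filename and the list of models.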
-