Basic Tutorial

Welcome to the Modelgym Basic Tutorial.

As an example, we will show you how to use Modelgym for a binary classification problem.

In this tutorial we will go through the following steps:

  1. Choosing the models.

  2. Searching for the best hyperparameters on the default spaces, locally, using the TPE algorithm.

  3. Visualizing the results.

Define models we want to use

In this tutorial, we will use four models (Modelgym class names in parentheses):

  1. LightGBM (LGBMClassifier)
  2. XGBoost (XGBClassifier)
  3. Random forest (RFClassifier)
  4. CatBoost (CtBClassifier)

from modelgym.models import LGBMClassifier, XGBClassifier, RFClassifier, CtBClassifier
models = [LGBMClassifier, XGBClassifier, RFClassifier, CtBClassifier]

Get dataset

For tutorial purposes, we will use a toy dataset generated with scikit-learn.

from sklearn.datasets import make_classification
from modelgym.utils import XYCDataset
X, y = make_classification(n_samples=500, n_features=20, n_informative=10, n_classes=2)
dataset = XYCDataset(X, y)

Create a TPE trainer

from modelgym.trainers import TpeTrainer
trainer = TpeTrainer(models)
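
To give a rough feel for what a hyperparameter-search trainer does, here is a minimal sketch in plain Python. It uses random search, not TPE, and a made-up objective; `SPACE`, `toy_score`, and `random_search` are hypothetical names for illustration and are not part of Modelgym. TPE differs in that it uses the results of earlier trials to decide where to sample next, but the overall loop (sample, evaluate, keep the best) is the same.

```python
import random

# Hypothetical search space: parameter name -> (low, high) for uniform sampling.
SPACE = {"learning_rate": (0.01, 0.3), "subsample": (0.5, 1.0)}


def toy_score(params):
    """Made-up stand-in for a cross-validated metric; peaks at (0.1, 0.8)."""
    return (
        1.0
        - (params["learning_rate"] - 0.1) ** 2
        - (params["subsample"] - 0.8) ** 2
    )


def random_search(score_fn, space, n_trials, seed=0):
    """Sample n_trials points uniformly and return the best (score, params)."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


best_score, best_params = random_search(toy_score, SPACE, n_trials=200)
print(round(best_score, 3), best_params)
```

In a real run, the objective would be the chosen metric evaluated on the dataset, which is why the search is the expensive part of the tutorial.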

Optimize hyperparams

We use accuracy as the main metric to optimize when tuning hyperparameters.

Besides accuracy, we also keep track of RocAuc and F1 for the best models.
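
For reference, accuracy and F1 can be computed from confusion-matrix counts as in the sketch below. This is plain Python on illustrative labels, not Modelgym's implementation, and the helper names are hypothetical. ROC AUC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    # With no positives anywhere F1 is undefined; returning 0.0 is a convention.
    return 2 * tp / denom if denom else 0.0


# Illustrative labels: 3 true positives, 3 true negatives, 1 FP, 1 FN.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 0.75
print(f1(y_true, y_pred))        # 0.75
```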

Keep in mind that we are optimizing over the default hyperparameter space, so the hyperparameters found here are not necessarily optimal. For better results and a complete understanding, follow the advanced tutorial.

from modelgym.metrics import Accuracy, RocAuc, F1

Of course, it will take some time.

%%time
trainer.crossval_optimize_params(Accuracy(), dataset, metrics=[Accuracy(), RocAuc(), F1()])
CPU times: user 2h 2min 45s, sys: 47min 59s, total: 2h 50min 45s
Wall time: 28min 17s
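
The method name `crossval_optimize_params` suggests that each candidate set of hyperparameters is scored with cross-validation, which would help explain the running time: every candidate is trained and evaluated once per fold. A minimal sketch of k-fold index splitting in plain Python follows; `kfold_indices` is a hypothetical helper, and the number of folds Modelgym actually uses is not stated in this tutorial.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; each sample is in exactly one test fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 500 samples, as in the toy dataset; 5 folds is an assumption.
for train_idx, test_idx in kfold_indices(500, 5):
    print(len(train_idx), len(test_idx))
```

With these assumed settings, each fold trains on 400 samples and tests on the remaining 100.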

Report best results

from modelgym.report import Report
reporter = Report(trainer.get_best_results(), dataset, [Accuracy(), RocAuc(), F1()])

Report in text form

reporter.print_all_metric_results()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    accuracy    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                            tuned
LGBMClassifier   0.776002 (0.00%)
XGBClassifier    0.838059 (8.00%)
RFClassifier     0.800075 (3.10%)
CtBClassifier   0.861963 (11.08%)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    roc_auc    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                            tuned
LGBMClassifier   0.815768 (0.00%)
XGBClassifier   0.904991 (10.94%)
RFClassifier     0.875230 (7.29%)
CtBClassifier   0.926832 (13.61%)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    f1_score    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                            tuned
LGBMClassifier   0.777157 (0.00%)
XGBClassifier    0.835813 (7.55%)
RFClassifier     0.792136 (1.93%)
CtBClassifier   0.859078 (10.54%)
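
The percentages in parentheses appear to be the relative difference from the first row (LGBMClassifier, which is also the lowest-scoring model here). This reading is inferred from the numbers, not taken from Modelgym's documentation. The sketch below reproduces the accuracy column under that assumption; `relative_gain` is a hypothetical helper.

```python
# Accuracy values copied from the text report.
scores = {
    "LGBMClassifier": 0.776002,
    "XGBClassifier": 0.838059,
    "RFClassifier": 0.800075,
    "CtBClassifier": 0.861963,
}


def relative_gain(scores, baseline):
    """Percent difference of every score relative to scores[baseline]."""
    base = scores[baseline]
    return {name: 100.0 * (value / base - 1.0) for name, value in scores.items()}


for name, gain in relative_gain(scores, "LGBMClassifier").items():
    print(f"{name:<15} {scores[name]:.6f} ({gain:.2f}%)")
```

For example, XGBClassifier's 0.838059 is about 8.00% higher than the baseline 0.776002, which matches the value shown in the report.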

Report plots

reporter.plot_all_metrics()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    accuracy    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_images/basic_tutorial_20_1.png
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    roc_auc    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_images/basic_tutorial_20_3.png
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    f1_score    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_images/basic_tutorial_20_5.png

Report heatmaps for each metric

reporter.plot_heatmaps()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    accuracy    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_images/basic_tutorial_22_1.png
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    roc_auc    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_images/basic_tutorial_22_3.png
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~    f1_score    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
_images/basic_tutorial_22_5.png

That’s it!

If you liked it, continue with the advanced tutorial to learn about all the features Modelgym provides.