Model

This module provides tools to streamline data modeling workflows. It contains functions to set up pipelines, iterate over models, and evaluate and plot results.

Functions:
  • compare_models() - Find the best classification model and hyper-parameters for a dataset.

  • create_nn_binary() - Create a binary classification neural network model.

  • create_nn_multi() - Create a multi-class classification neural network model.

  • create_pipeline() - Create a custom pipeline for data preprocessing and modeling.

  • create_results_df() - Initialize the results_df DataFrame with the columns required for iterate_model.

  • eval_model() - Produce a detailed evaluation report for a classification model.

  • iterate_model() - Iterate and evaluate a model pipeline with specified parameters.

  • plot_acf_residuals() - Plot residuals, histogram, ACF, and PACF of a time series ARIMA model.

  • plot_results() - Plot the results of model iterations and select the best metric.

  • plot_train_history() - Plot the training and validation history of a fitted Keras model.

datawaza.model.compare_models(x: DataFrame, y: Series, models: List[str], config: Dict[str, Any], class_map: Dict[Any, Any] | None = None, pos_label: Any | None = None, test_size: float = 0.25, search_type: str = 'grid', grid_cv: int | str = 5, plot_perf: bool = False, scorer: str = 'accuracy', random_state: int = 42, decimal: int = 4, verbose: int = 4, title: str | None = None, fig_size: Tuple[int, int] = (12, 6), figmulti: float = 1.5, multi_class: str = 'ovr', average: str | None = None, legend_loc: str = 'best', model_eval: bool = False, svm_proba: bool = False, threshold: float = 0.5, class_weight: Dict[Any, float] | None = None, stratify: Series | None = None, imputer: str | None = None, impute_first: bool = True, transformers: List[str] | None = None, scaler: str | None = None, selector: str | None = None, cat_columns: List[str] | None = None, num_columns: List[str] | None = None, max_iter: int = 10000, rotation: int | None = None, plot_curve: bool = True, under_sample: float | None = None, over_sample: float | None = None, notes: str | None = None, svm_knn_resample: float | None = None, n_jobs: int | None = None, output: bool = True, timezone: str = 'UTC', debug: bool = False) -> DataFrame

Find the best classification model and hyper-parameters for a dataset by automating the workflow for multiple models and comparing results.

This function integrates a number of steps in a typical classification model workflow, and it does this for multiple models, all with one function call:

  • Auto-detecting single vs. multi-class classification problems

  • Option to under-sample or over-sample imbalanced data

  • Option to use a sub-sample of data for SVC or KNN, which can be computationally intensive

  • Ability to split the Train/Test data at a specified ratio

  • Creation of a multiple-step Pipeline, including Imputation, multiple Column Transformer/Encoding steps, Scaling, Feature Selection, and the Model

  • Grid Search of hyper-parameters, either full or random

  • Calculating performance metrics from the standard Classification Report (Accuracy, Precision, Recall, F1), plus ROC AUC and, if binary, True Positive Rate, True Negative Rate, False Positive Rate, and False Negative Rate

  • Evaluating this performance against a customizable Threshold

  • Visually showing performance by plotting (a) a Confusion Matrix and, if binary, (b) a Histogram of Predicted Probabilities, (c) an ROC Curve, and (d) a Precision-Recall Curve

  • Saving all the results in a DataFrame for reference and comparison

  • Option to plot the results to visually compare performance of the specified metric across multiple model pipelines with their best parameters

To use this function, first create a configuration dictionary that defines the desired model configurations and the hyper-parameters you want to search. When compare_models is run, for each model in the models parameter, the create_pipeline function will be called to create a pipeline from the specified parameters. Each model iteration will have the same pipeline construction, except for the final model, which will vary. Here are the major pipeline parameters, along with the config sections they map to:

  • imputer (str) is selected from config[‘imputers’]

  • transformers (list or str) are selected from config[‘transformers’]

  • scaler (str) is selected from config[‘scalers’]

  • selector (str) is selected from config[‘selectors’]

  • models (list or str) are selected from config[‘models’]

Here is an example of the configuration dictionary structure. It is based on what create_pipeline requires to assemble the pipeline. But it adds some additional configuration parameters referenced by compare_models, which are params (grid search parameters, required) and cv (cross-validation parameters, optional if grid_cv is an integer). The configuration dictionary is passed to compare_models as the config parameter:

>>> config = {  
...     'models' : {
...         'logreg': LogisticRegression(max_iter=max_iter,
...                   random_state=random_state, class_weight=class_weight),
...         'knn_class': KNeighborsClassifier(),
...         'tree_class': DecisionTreeClassifier(random_state=random_state,
...                       class_weight=class_weight)
...     },
...     'imputers': {
...         'simple_imputer': SimpleImputer()
...     },
...     'transformers': {
...         'ohe': (OneHotEncoder(drop='if_binary', handle_unknown='ignore'),
...                     ohe_columns)
...     },
...     'scalers': {
...         'stand': StandardScaler()
...     },
...     'selectors': {
...         'sfs_logreg': SequentialFeatureSelector(LogisticRegression(
...                       max_iter=max_iter, random_state=random_state,
...                       class_weight=class_weight))
...     },
...     'params' : {
...         'logreg': {
...             'logreg__C': [0.0001, 0.001, 0.01, 0.1, 1, 10, 100],
...             'logreg__solver': ['newton-cg', 'lbfgs', 'saga']
...         },
...         'knn_class': {
...             'knn_class__n_neighbors': [3, 5, 10, 15, 20, 25],
...             'knn_class__weights': ['uniform', 'distance'],
...             'knn_class__metric': ['euclidean', 'manhattan']
...         },
...         'tree_class': {
...             'tree_class__max_depth': [3, 5, 7],
...             'tree_class__min_samples_split': [5, 10, 15],
...             'tree_class__criterion': ['gini', 'entropy'],
...             'tree_class__min_samples_leaf': [2, 4, 6]
...         },
...     },
...     'cv': {
...         'kfold_5': KFold(n_splits=5, shuffle=True, random_state=42)
...     },
...     'no_scale': ['tree_class'],
...     'no_poly': ['knn_class', 'tree_class']
... }

In addition to the configuration dictionary, you will need to define any column lists if you want to target certain transformations at a subset of columns. For example, you might define an ‘ohe’ transformer for One-Hot Encoding, and reference ‘ohe_columns’ or ‘cat_columns’ in its definition in the config.
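
For example, here is a minimal sketch (the column names are hypothetical) that defines separate column lists for One-Hot and Ordinal Encoding and references them from their own transformer keys, assuming a config dictionary like the one above:

>>> from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder
>>> ohe_columns = ['gender', 'smoker']        # hypothetical nominal columns
>>> ord_columns = ['education_level']         # hypothetical ordinal column
>>> my_config['transformers'] = {
...     'ohe': (OneHotEncoder(drop='if_binary', handle_unknown='ignore'),
...             ohe_columns),
...     'ord': (OrdinalEncoder(), ord_columns)
... }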

Here is an example of how to call this function in an organized manner:

>>> results_df = dw.compare_models(  
...
...     # Data split and sampling
...     x=X, y=y, test_size=0.25, stratify=None, under_sample=None,
...     over_sample=None, svm_knn_resample=None,
...
...     # Models and pipeline steps
...     imputer=None, transformers=None, scaler='stand', selector=None,
...     models=['logreg', 'knn_class', 'svm_proba', 'tree_class',
...     'forest_class', 'xgb_class', 'keras_class'], svm_proba=True,
...
...     # Grid search
...     search_type='random', scorer='accuracy', grid_cv='kfold_5', verbose=4,
...
...     # Model evaluation and charts
...     model_eval=True, plot_perf=True, plot_curve=True, fig_size=(12,6),
...     legend_loc='lower left', rotation=45, threshold=0.5,
...     class_map=class_map, pos_label=1, title='Breast Cancer',
...
...     # Config, preferences and notes
...     config=my_config, class_weight=None, random_state=42, decimal=4,
...     n_jobs=None, debug=False, notes='Test Size=0.25, Threshold=0.50'
... )

Use this function when you want to find the best classification model and hyper-parameters for a dataset, after doing any required pre-processing or cleaning. It is a significant time saver, replacing numerous manual coding steps with one command.
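At a minimum, only the data, the list of models, and the config are required; every other parameter has a default. A minimal sketch, assuming a config dictionary like the one above:

>>> results_df = dw.compare_models(x=X, y=y, models=['logreg'],
...                                config=my_config)  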

Parameters:
  • x (pd.DataFrame) – The feature matrix.

  • y (pd.Series) – The target vector.

  • test_size (float, optional (default=0.25)) – The proportion of the dataset to include in the test split.

  • models (List[str]) – A list of model names to iterate over.

  • config (Dict[str, Any]) – A configuration dictionary that defines the pipeline steps, models, grid search parameters, and cross-validation functions. It should have the following keys: ‘imputers’, ‘transformers’, ‘scalers’, ‘selectors’, ‘models’, ‘params’, ‘cv’, ‘no_scale’, and ‘no_poly’.

  • class_map (Dict[Any, Any], optional (default=None)) – A dictionary to map class labels to new values.

  • search_type (str, optional (default='grid')) – The type of hyperparameter search to perform. Can be either ‘grid’ for GridSearchCV or ‘random’ for RandomizedSearchCV.

  • grid_cv (Union[int, str], optional (default=5)) – The number of cross-validation folds for GridSearchCV or RandomizedSearchCV, or a string to select a cross-validation function from config[‘cv’]. Default is 5.

  • plot_perf (bool, optional (default=False)) – Whether to plot the model performance.

  • scorer (str, optional (default='accuracy')) – The scorer to use for model evaluation.

  • pos_label (Any, optional (default=None)) – The positive class label.

  • random_state (int, optional (default=42)) – The random state for reproducibility.

  • decimal (int, optional (default=4)) – The number of decimal places to round the results to.

  • verbose (int, optional (default=4)) – The verbosity level for the search.

  • title (str, optional (default=None)) – The title for the plots.

  • fig_size (Tuple[int, int], optional (default=(12, 6))) – The figure size for the plots.

  • figmulti (float, optional (default=1.5)) – The multiplier for the figure size in multi-class classification.

  • multi_class (str, optional) – The method for handling multi-class ROC AUC calculation. Can be ‘ovr’ (one-vs-rest) or ‘ovo’ (one-vs-one). Default is ‘ovr’.

  • average (str, optional (default=None)) – The averaging method for multi-class classification metrics. Can be ‘macro’, ‘micro’, ‘weighted’, or ‘samples’.

  • legend_loc (str, optional (default='best')) – The location of the legend in the plots.

  • model_eval (bool, optional (default=False)) – Whether to perform a detailed model evaluation.

  • svm_proba (bool, optional (default=False)) – Whether to enable probability estimates for SVC.

  • threshold (float, optional (default=0.5)) – The classification threshold for binary classification.

  • class_weight (Dict[Any, float], optional (default=None)) – The class weights for balancing imbalanced classes.

  • stratify (pd.Series, optional (default=None)) – The stratification variable for train-test split.

  • imputer (str, optional (default=None)) – The imputation strategy.

  • impute_first (bool, optional (default=True)) – Whether to impute before other preprocessing steps.

  • transformers (List[str], optional (default=None)) – A list of transformers to apply.

  • scaler (str, optional (default=None)) – The scaling strategy.

  • selector (str, optional (default=None)) – The feature selection strategy.

  • cat_columns (List[str], optional (default=None)) – A list of categorical columns in X.

  • num_columns (List[str], optional (default=None)) – A list of numerical columns in X.

  • max_iter (int, optional (default=10000)) – The maximum number of iterations for the solvers.

  • rotation (int, optional (default=None)) – The rotation angle for the x-axis labels in the plots.

  • plot_curve (bool, optional (default=True)) – Whether to plot the learning curve for KerasClassifier.

  • under_sample (float, optional (default=None)) – The under-sampling ratio.

  • over_sample (float, optional (default=None)) – The over-sampling ratio.

  • notes (str, optional (default=None)) – Additional notes or comments.

  • svm_knn_resample (float, optional (default=None)) – The resampling ratio for SVC and KNeighborsClassifier.

  • n_jobs (int, optional (default=None)) – The number of parallel jobs to run.

  • output (bool, optional (default=True)) – Whether to print the progress and results.

  • timezone (str, optional) – Timezone to be used for timestamps. Default is ‘UTC’.

  • debug (bool, optional) – Flag to show debugging information.

Returns:

A DataFrame containing the performance metrics and other details for each model.

Return type:

pd.DataFrame

Examples

Prepare the data for the examples:

>>> pd.set_option('display.max_columns', None)  # For test consistency
>>> pd.set_option('display.width', None)  # For test consistency
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=1000, n_classes=2, n_features=20,
...                            weights=[0.4, 0.6], random_state=42)
>>> X = pd.DataFrame(X, columns=[f'Feature_{i+1}' for i in range(X.shape[1])])
>>> y = pd.Series(y, name='Target')
>>> class_map = {0: 'Malignant', 1: 'Benign'}

Define the configuration for the models:

>>> # Set some variables referenced in the config
>>> random_state = 42
>>> class_weight = None
>>> max_iter = 10000
>>>
>>> # Set column lists referenced in the config
>>> num_columns = list(X.columns)
>>> cat_columns = []
>>>
>>> # Create a custom configuration file with 3 models and grid search params
>>> my_config = {
...     'models' : {
...         'logreg': LogisticRegression(max_iter=max_iter,
...                   random_state=random_state, class_weight=class_weight),
...         'knn_class': KNeighborsClassifier(),
...         'tree_class': DecisionTreeClassifier(random_state=random_state,
...                       class_weight=class_weight),
...         'svm_proba': SVC(random_state=random_state, probability=True,
...                      class_weight=class_weight),
...     },
...     'imputers': {
...         'simple_imputer': SimpleImputer()
...     },
...     'transformers': {
...         'ohe': (OneHotEncoder(drop='if_binary', handle_unknown='ignore'),
...                     cat_columns),
...         'poly2': (PolynomialFeatures(degree=2, include_bias=False), num_columns)
...     },
...     'scalers': {
...         'stand': StandardScaler()
...     },
...     'selectors': {
...         'sfs_logreg': SequentialFeatureSelector(LogisticRegression(
...                       max_iter=max_iter, random_state=random_state,
...                       class_weight=class_weight))
...     },
...     'params' : {
...         'logreg': {
...             'logreg__C': [0.0001, 0.001, 0.01, 0.1, 1, 10, 100],
...             'logreg__solver': ['newton-cg', 'lbfgs', 'saga']
...         },
...         'knn_class': {
...             'knn_class__n_neighbors': [3, 5, 10, 15, 20, 25],
...             'knn_class__weights': ['uniform', 'distance'],
...             'knn_class__metric': ['euclidean', 'manhattan']
...         },
...         'tree_class': {
...             'tree_class__max_depth': [3, 5, 7],
...             'tree_class__min_samples_split': [5, 10, 15],
...             'tree_class__criterion': ['gini', 'entropy'],
...             'tree_class__min_samples_leaf': [2, 4, 6]
...         },
...         'svm_proba': {
...             'svm_proba__C': [0.01, 0.1, 1, 10, 100],
...             'svm_proba__kernel': ['linear', 'poly']
...         },
...     },
...     'cv': {
...         'kfold_5': KFold(n_splits=5, shuffle=True, random_state=42)
...     },
...     'no_scale': ['tree_class'],
...     'no_poly': ['knn_class', 'tree_class']
... }

Example 1: Compare models with default parameters:

>>> results_df = compare_models(
...
...     # Data split and sampling
...     x=X, y=y, test_size=0.25, stratify=None, under_sample=None,
...     over_sample=None, svm_knn_resample=None,
...
...     # Models and pipeline steps
...     imputer=None, transformers=None, scaler='stand', selector=None,
...     models=['logreg', 'knn_class', 'tree_class'], svm_proba=True,
...
...     # Grid search
...     search_type='random', scorer='accuracy', grid_cv='kfold_5', verbose=1,
...
...     # Model evaluation and charts
...     model_eval=True, plot_perf=True, plot_curve=True, fig_size=(12,6),
...     legend_loc='lower left', rotation=45, threshold=0.5,
...     class_map=class_map, pos_label=1, title='Breast Cancer',
...
...     # Config, preferences and notes
...     config=my_config, class_weight=None, random_state=42, decimal=2,
...     n_jobs=None, notes='Test Size=0.25, Threshold=0.50'
... )  

-----------------------------------------------------------------------------------------
Starting Data Processing - ... UTC
-----------------------------------------------------------------------------------------

Classification type detected: binary
Unique values in y: [0 1]

Train/Test split, test_size:  0.25
X_train, X_test, y_train, y_test shapes:  (750, 20) (250, 20) (750,) (250,)

-----------------------------------------------------------------------------------------
1/3: Starting LogisticRegression Random Search - ... UTC
-----------------------------------------------------------------------------------------

Fitting 5 folds for each of 10 candidates, totalling 50 fits

Total Time: ... seconds
Average Fit Time: ... seconds
Inference Time: ...
Best CV Accuracy Score: 0.88
Train Accuracy Score: 0.89
Test Accuracy Score: 0.86
Overfit: Yes
Overfit Difference: 0.03
Best Parameters: {'logreg__solver': 'saga', 'logreg__C': 0.1}

LogisticRegression Binary Classification Report

              precision    recall  f1-score   support

   Malignant       0.81      0.82      0.81        92
      Benign       0.89      0.89      0.89       158

    accuracy                           0.86       250
   macro avg       0.85      0.85      0.85       250
weighted avg       0.86      0.86      0.86       250

ROC AUC: 0.92

               Predicted:0         1
Actual: 0                75        17
Actual: 1                18        140

True Positive Rate / Sensitivity: 0.89
True Negative Rate / Specificity: 0.82
False Positive Rate / Fall-out: 0.18
False Negative Rate / Miss Rate: 0.11

Positive Class: Benign (1)
Threshold: 0.5

-----------------------------------------------------------------------------------------
2/3: Starting KNeighborsClassifier Random Search - ... UTC
-----------------------------------------------------------------------------------------

Fitting 5 folds for each of 10 candidates, totalling 50 fits

Total Time: ... seconds
Average Fit Time: ... seconds
Inference Time: ...
Best CV Accuracy Score: 0.86
Train Accuracy Score: 1.00
Test Accuracy Score: 0.84
Overfit: Yes
Overfit Difference: 0.16
Best Parameters: {'knn_class__weights': 'distance', 'knn_class__n_neighbors': 20, 'knn_class__metric': 'manhattan'}

KNeighborsClassifier Binary Classification Report

              precision    recall  f1-score   support

   Malignant       0.75      0.84      0.79        92
      Benign       0.90      0.84      0.87       158

    accuracy                           0.84       250
   macro avg       0.82      0.84      0.83       250
weighted avg       0.84      0.84      0.84       250

ROC AUC: 0.91

               Predicted:0         1
Actual: 0                77        15
Actual: 1                26        132

True Positive Rate / Sensitivity: 0.84
True Negative Rate / Specificity: 0.84
False Positive Rate / Fall-out: 0.16
False Negative Rate / Miss Rate: 0.16

Positive Class: Benign (1)
Threshold: 0.5

-----------------------------------------------------------------------------------------
3/3: Starting DecisionTreeClassifier Random Search - ... UTC
-----------------------------------------------------------------------------------------

Fitting 5 folds for each of 10 candidates, totalling 50 fits

Total Time: ... seconds
Average Fit Time: ... seconds
Inference Time: ...
Best CV Accuracy Score: 0.88
Train Accuracy Score: 0.93
Test Accuracy Score: 0.86
Overfit: Yes
Overfit Difference: 0.08
Best Parameters: {'tree_class__min_samples_split': 15, 'tree_class__min_samples_leaf': 6, 'tree_class__max_depth': 5, 'tree_class__criterion': 'entropy'}

DecisionTreeClassifier Binary Classification Report

              precision    recall  f1-score   support

   Malignant       0.76      0.89      0.82        92
      Benign       0.93      0.84      0.88       158

    accuracy                           0.86       250
   macro avg       0.84      0.86      0.85       250
weighted avg       0.87      0.86      0.86       250

ROC AUC: 0.92

               Predicted:0         1
Actual: 0                82        10
Actual: 1                26        132

True Positive Rate / Sensitivity: 0.84
True Negative Rate / Specificity: 0.89
False Positive Rate / Fall-out: 0.11
False Negative Rate / Miss Rate: 0.16

Positive Class: Benign (1)
Threshold: 0.5
>>> results_df.head()  
                    Model  Test Size Over Sample Under Sample Resample  Total Fit Time  Fit Count  Average Fit Time  Inference Time Grid Scorer                                        Best Params  Best CV Score  Train Score  Test Score Overfit  Overfit Difference  Train Accuracy Score  Test Accuracy Score  Train Precision Score  Test Precision Score  Train Recall Score  Test Recall Score  Train F1 Score  Test F1 Score  Train ROC AUC Score  Test ROC AUC Score  Threshold  True Positives  False Positives  True Negatives  False Negatives       TPR       FPR       TNR       FNR  False Rate            Pipeline                           Notes Timestamp
0      LogisticRegression       0.25        None         None     None               ...         50                 ...    Accuracy       {'logreg__solver': 'saga', 'logreg__C': 0.1}       0.877333        0.888       0.860     Yes               0.028                 0.888                0.860               0.903153              0.891720            0.907240           0.886076        0.905192       0.888889             0.935388            0.922675        0.5             140               17              75               18  0.886076  0.184783  0.815217  0.113924    0.298707     [stand, logreg]  Test Size=0.25, Threshold=0.50...
1    KNeighborsClassifier       0.25        None         None     None               ...         50                 ...    Accuracy  {'knn_class__weights': 'distance', 'knn_class_...       0.861333        1.000       0.836     Yes               0.164                 1.000                0.836               1.000000              0.897959            1.000000           0.835443        1.000000       0.865574             1.000000            0.911805        0.5             132               15              77               26  0.835443  0.163043  0.836957  0.164557    0.327600  [stand, knn_class]  Test Size=0.25, Threshold=0.50...
2  DecisionTreeClassifier       0.25        None         None     None               ...         50                 ...    Accuracy  {'tree_class__min_samples_split': 15, 'tree_cl...       0.882667        0.932       0.856     Yes               0.076                 0.932                0.856               0.955711              0.929577            0.927602           0.835443        0.941447       0.880000             0.974926            0.919889        0.5             132               10              82               26  0.835443  0.108696  0.891304  0.164557    0.273253        [tree_class]  Test Size=0.25, Threshold=0.50...
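
Because each model's results are appended to the returned DataFrame, you can rank the pipelines on any recorded metric. For example, using the 'Test Accuracy Score' column shown above:

>>> ranked = results_df.sort_values('Test Accuracy Score', ascending=False)
>>> best_model = ranked.iloc[0]['Model']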

Example 2: Compare models with more pipeline steps, stratification, under-sampling, and resampling for SVM, with SVM probabilities enabled:

>>> results_df = compare_models(
...
...     # Data split and sampling
...     x=X, y=y, test_size=0.25, stratify=y, under_sample=0.8,
...     over_sample=None, svm_knn_resample=0.2,
...
...     # Models and pipeline steps
...     imputer='simple_imputer', transformers=None, scaler='stand', selector=None,
...     models=['logreg', 'svm_proba'], svm_proba=True,
...
...     # Grid search
...     search_type='random', scorer='accuracy', grid_cv='kfold_5', verbose=1,
...
...     # Model evaluation and charts
...     model_eval=True, plot_perf=True, plot_curve=True, fig_size=(12,6),
...     legend_loc='lower left', rotation=45, threshold=0.5,
...     class_map=class_map, pos_label=1, title='Breast Cancer',
...
...     # Config, preferences and notes
...     config=my_config, class_weight=None, random_state=42, decimal=2,
...     n_jobs=None, notes='Test Size=0.25, Threshold=0.50'
... )  

-----------------------------------------------------------------------------------------
Starting Data Processing - ... UTC
-----------------------------------------------------------------------------------------

Classification type detected: binary
Unique values in y: [0 1]

Train/Test split, test_size:  0.25
X_train, X_test, y_train, y_test shapes:  (750, 20) (250, 20) (750,) (250,)

Undersampling via RandomUnderSampler strategy:  0.8
X_train, y_train shapes before:  (750, 20) (750,)
y_train value counts before:  Target
1    450
0    300
Name: count, dtype: int64
Running RandomUnderSampler on X_train, y_train...
X_train, y_train shapes after:  (675, 20) (675,)
y_train value counts after:  Target
1    375
0    300
Name: count, dtype: int64

-----------------------------------------------------------------------------------------
1/2: Starting LogisticRegression Random Search - ... UTC
-----------------------------------------------------------------------------------------

Fitting 5 folds for each of 10 candidates, totalling 50 fits

Total Time: ... seconds
Average Fit Time: ... seconds
Inference Time: ...
Best CV Accuracy Score: 0.87
Train Accuracy Score: 0.88
Test Accuracy Score: 0.86
Overfit: Yes
Overfit Difference: 0.01
Best Parameters: {'logreg__solver': 'saga', 'logreg__C': 0.1}

LogisticRegression Binary Classification Report

              precision    recall  f1-score   support

   Malignant       0.84      0.82      0.83       100
      Benign       0.88      0.89      0.89       150

    accuracy                           0.86       250
   macro avg       0.86      0.86      0.86       250
weighted avg       0.86      0.86      0.86       250

ROC AUC: 0.92

               Predicted:0         1
Actual: 0                82        18
Actual: 1                16        134

True Positive Rate / Sensitivity: 0.89
True Negative Rate / Specificity: 0.82
False Positive Rate / Fall-out: 0.18
False Negative Rate / Miss Rate: 0.11

Positive Class: Benign (1)
Threshold: 0.5

-----------------------------------------------------------------------------------------
2/2: Starting SVC Random Search - ... UTC
-----------------------------------------------------------------------------------------

Training data resampled to 20.0% of original for KNN and SVM speed improvement
X_train, y_train shapes after:  (135, 20) (135,)
y_train value counts after:  Target
1    75
0    60
Name: count, dtype: int64

Fitting 5 folds for each of 10 candidates, totalling 50 fits

Total Time: ... seconds
Average Fit Time: ... seconds
Inference Time: ...
Best CV Accuracy Score: 0.87
Train Accuracy Score: 0.90
Test Accuracy Score: 0.86
Overfit: Yes
Overfit Difference: 0.05
Best Parameters: {'svm_proba__kernel': 'linear', 'svm_proba__C': 0.01}

SVC Binary Classification Report

              precision    recall  f1-score   support

   Malignant       0.83      0.85      0.84       100
      Benign       0.90      0.88      0.89       150

    accuracy                           0.87       250
   macro avg       0.86      0.86      0.86       250
weighted avg       0.87      0.87      0.87       250

ROC AUC: 0.92

               Predicted:0         1
Actual: 0                85        15
Actual: 1                18        132

True Positive Rate / Sensitivity: 0.88
True Negative Rate / Specificity: 0.85
False Positive Rate / Fall-out: 0.15
False Negative Rate / Miss Rate: 0.12

Positive Class: Benign (1)
Threshold: 0.5
datawaza.model.create_nn_binary(hidden_layer_dim: int, dropout_rate: float, l2_reg: float, second_layer_dim: int | None = None, third_layer_dim: int | None = None, meta: Dict[str, Any] | None = None) -> Sequential

Create a binary classification neural network model.

This function allows for flexible configuration of the neural network structure for binary classification using the KerasClassifier in scikit-learn. It supports adding up to three hidden layers with customizable dimensions, dropout regularization, and L2 regularization.

Use this function to create a neural network model with a specific structure and regularization settings for binary classification tasks. It is set as the model parameter of a KerasClassifier instance referenced in the configuration file for compare_models.
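
For example, here is a minimal sketch of registering it in a config for compare_models, assuming the scikeras KerasClassifier wrapper (the compile and fit settings shown are illustrative assumptions, not requirements of this function). scikeras routes the custom keyword arguments (hidden_layer_dim, dropout_rate, l2_reg) to create_nn_binary and supplies meta itself:

>>> from scikeras.wrappers import KerasClassifier
>>> my_config['models']['keras_class'] = KerasClassifier(
...     model=create_nn_binary, hidden_layer_dim=32, dropout_rate=0.2,
...     l2_reg=0.01, loss='binary_crossentropy', optimizer='adam',
...     epochs=50, verbose=0, random_state=42
... )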

Parameters:
  • hidden_layer_dim (int) – The number of neurons in the first hidden layer.

  • dropout_rate (float) – The dropout rate to be applied after each hidden layer.

  • l2_reg (float) – The L2 regularization strength. If greater than 0, L2 regularization is applied to the kernel weights of the dense layers.

  • second_layer_dim (Optional[int], optional) – The number of neurons in an additional hidden layer. If not None, an additional hidden layer is added. Default is None.

  • third_layer_dim (Optional[int], optional) – The number of neurons in a third hidden layer. If not None, a third hidden layer is added. Default is None.

  • meta (Dict[str, Any], optional) – A dictionary containing metadata about the input features and shape. Default is None.

Returns:

The constructed neural network model for binary classification.

Return type:

keras.models.Sequential

Examples

>>> pd.set_option('display.max_columns', None)  # For test consistency
>>> pd.set_option('display.width', None)  # For test consistency
>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import train_test_split
>>> X, y = make_classification(n_samples=100, n_features=10, random_state=42)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
...                                                     random_state=42)
>>> meta = {"n_features_in_": 10, "X_shape_": (80, 10)}

Example 1: Create a basic neural network with default settings:

>>> model = create_nn_binary(hidden_layer_dim=32, dropout_rate=0.2, l2_reg=0.01,
...                       meta=meta)
>>> model_summary(model)  
        Item                  Name         Type Activation Output Shape  Parameters   Bytes
0      Model            Sequential   Sequential       None         None         NaN     NaN
1      Input                 Input  KerasTensor       None   (None, 10)         0.0     0.0
2      Layer              Hidden_1        Dense       relu   (None, 32)       352.0  1408.0
3      Layer             Dropout_1      Dropout       None   (None, 32)         0.0     0.0
4      Layer                Output        Dense    sigmoid    (None, 1)        33.0   132.0
5  Statistic          Total Params         None       None         None       385.0  1540.0
6  Statistic      Trainable Params         None       None         None       385.0  1540.0
7  Statistic  Non-Trainable Params         None       None         None         0.0     0.0

Example 2: Create a neural network with additional layers and regularization:

>>> model = create_nn_binary(hidden_layer_dim=64, dropout_rate=0.3, l2_reg=0.05,
...                       second_layer_dim=32, third_layer_dim=16, meta=meta)
>>> model_summary(model)  
         Item                  Name         Type Activation Output Shape  Parameters    Bytes
0       Model            Sequential   Sequential       None         None         NaN      NaN
1       Input                 Input  KerasTensor       None   (None, 10)         0.0      0.0
2       Layer              Hidden_1        Dense       relu   (None, 64)       704.0   2816.0
3       Layer             Dropout_1      Dropout       None   (None, 64)         0.0      0.0
4       Layer              Hidden_2        Dense       relu   (None, 32)      2080.0   8320.0
5       Layer             Dropout_2      Dropout       None   (None, 32)         0.0      0.0
6       Layer              Hidden_3        Dense       relu   (None, 16)       528.0   2112.0
7       Layer             Dropout_3      Dropout       None   (None, 16)         0.0      0.0
8       Layer                Output        Dense    sigmoid    (None, 1)        17.0     68.0
9   Statistic          Total Params         None       None         None      3329.0  13316.0
10  Statistic      Trainable Params         None       None         None      3329.0  13316.0
11  Statistic  Non-Trainable Params         None       None         None         0.0      0.0
datawaza.model.create_nn_multi(hidden_layer_dim: int, dropout_rate: float, l2_reg: float, second_layer_dim: int | None = None, third_layer_dim: int | None = None, meta: Dict[str, Any] | None = None) -> Sequential

Create a multi-class classification neural network model.

This function allows for flexible configuration of the neural network structure for multi-class classification using the KerasClassifier in scikit-learn. It supports adding up to three hidden layers with customizable dimensions, dropout regularization, and L2 regularization.

Use this function to create a neural network model with a specific structure and regularization settings for multi-class classification tasks. It is set as the model parameter of a KerasClassifier instance referenced in the configuration file for compare_models.
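
Similarly, a hedged sketch of registering the multi-class model and exposing its layer sizes to the grid search, assuming scikeras's model__ parameter routing (the 'keras_multi' key name is hypothetical):

>>> from scikeras.wrappers import KerasClassifier
>>> my_config['models']['keras_multi'] = KerasClassifier(
...     model=create_nn_multi, hidden_layer_dim=64, dropout_rate=0.2,
...     l2_reg=0.01, loss='sparse_categorical_crossentropy',
...     optimizer='adam', epochs=50, verbose=0, random_state=42
... )
>>> my_config['params']['keras_multi'] = {
...     'keras_multi__model__hidden_layer_dim': [32, 64, 128],
...     'keras_multi__model__dropout_rate': [0.2, 0.3]
... }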

Parameters:
  • hidden_layer_dim (int) – The number of neurons in the first hidden layer.

  • dropout_rate (float) – The dropout rate to be applied after each hidden layer.

  • l2_reg (float) – The L2 regularization strength applied to the kernel weights of the dense layers.

  • second_layer_dim (Optional[int], optional) – The number of neurons in an additional hidden layer. If not None, an additional hidden layer is added. Default is None.

  • third_layer_dim (Optional[int], optional) – The number of neurons in a third hidden layer. If not None, a third hidden layer is added. Default is None.

  • meta (Dict[str, Any], optional) – A dictionary containing metadata about the input features, shape, and number of classes. Default is None.

Returns:

The constructed neural network model for multi-class classification.

Return type:

keras.models.Sequential

Examples

>>> pd.set_option('display.max_columns', None)  # For test consistency
>>> pd.set_option('display.width', None)  # For test consistency
>>> from sklearn.datasets import load_iris
>>> from sklearn.model_selection import train_test_split
>>> X, y = load_iris(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
...                                                     random_state=42)
>>> meta = {"n_features_in_": 4, "X_shape_": (120, 4), "n_classes_": 3}

Example 1: Create a basic neural network with default settings:

>>> model = create_nn_multi(hidden_layer_dim=64, dropout_rate=0.2, l2_reg=0.01,
...                         meta=meta)
>>> model_summary(model)  
        Item                  Name         Type Activation Output Shape  Parameters   Bytes
0      Model            Sequential   Sequential       None         None         NaN     NaN
1      Input                 Input  KerasTensor       None    (None, 4)         0.0     0.0
2      Layer              Hidden_1        Dense       relu   (None, 64)       320.0  1280.0
3      Layer             Dropout_1      Dropout       None   (None, 64)         0.0     0.0
4      Layer                Output        Dense    softmax    (None, 3)       195.0   780.0
5  Statistic          Total Params         None       None         None       515.0  2060.0
6  Statistic      Trainable Params         None       None         None       515.0  2060.0
7  Statistic  Non-Trainable Params         None       None         None         0.0     0.0

Example 2: Create a neural network with an additional hidden layer:

>>> model = create_nn_multi(hidden_layer_dim=128, dropout_rate=0.3, l2_reg=0.05,
...                         second_layer_dim=64, meta=meta)
>>> model_summary(model)  
        Item                  Name         Type Activation Output Shape  Parameters    Bytes
0      Model            Sequential   Sequential       None         None         NaN      NaN
1      Input                 Input  KerasTensor       None    (None, 4)         0.0      0.0
2      Layer              Hidden_1        Dense       relu  (None, 128)       640.0   2560.0
3      Layer             Dropout_1      Dropout       None  (None, 128)         0.0      0.0
4      Layer              Hidden_2        Dense       relu   (None, 64)      8256.0  33024.0
5      Layer             Dropout_2      Dropout       None   (None, 64)         0.0      0.0
6      Layer                Output        Dense    softmax    (None, 3)       195.0    780.0
7  Statistic          Total Params         None       None         None      9091.0  36364.0
8  Statistic      Trainable Params         None       None         None      9091.0  36364.0
9  Statistic  Non-Trainable Params         None       None         None         0.0      0.0
datawaza.model.create_pipeline(imputer_key: str | None = None, transformer_keys: List[str] | str | None = None, scaler_key: str | None = None, selector_key: str | None = None, model_key: str | None = None, impute_first: bool = True, config: Dict[str, Any] | None = None, cat_columns: List[str] | None = None, num_columns: List[str] | None = None, random_state: int = 42, class_weight: Dict[int, float] | None = None, max_iter: int = 10000, debug: bool = False) -> Pipeline

Create a custom pipeline for data preprocessing and modeling.

This function allows you to define a custom pipeline by specifying the desired preprocessing steps (imputation, transformation, scaling, feature selection) and the model to use for predictions. Provide the keys for the steps you want to include in the pipeline. If a step is not specified, it will be skipped. The keys are defined in a configuration dictionary that is passed to the function. If no external configuration is provided, a default one will be used.

  • imputer_key (str) is selected from config[‘imputers’]

  • transformer_keys (list or str) are selected from config[‘transformers’]

  • scaler_key (str) is selected from config[‘scalers’]

  • selector_key (str) is selected from config[‘selectors’]

  • model_key (str) is selected from config[‘models’]

  • config[‘no_scale’] lists model keys that should not be scaled.

  • config[‘no_poly’] lists models that should not be polynomial transformed.

By default, the sequence of the Pipeline steps is: Imputer > Column Transformer > Scaler > Selector > Model. However, if impute_first is False, the data will be imputed after the column transformations. Scaling will not be done for any Model that is listed in config[‘no_scale’] (ex: for decision trees, which don’t require scaling).

A column transformer will be created based on the specified transformer_keys. Any number of column transformations can be defined here. For example, you can define transformer_keys = [‘ohe’, ‘poly2’, ‘log’] to One-Hot Encode some columns, Polynomial transform some columns, and Log transform others. Just define each of these in your config file to reference the appropriate column lists. By default, these will transform the columns passed in as cat_columns or num_columns. But you may want to apply different transformations to your categorical features. For example, if you One-Hot Encode some, but Ordinal Encode others, you could define separate column lists for these as ‘ohe_columns’ and ‘ord_columns’, and then define transformer_keys in your config dictionary that reference them.

Here is an example of the configuration dictionary structure:

>>> config = {  
...     'imputers': {
...         'knn_imputer': KNNImputer().set_output(transform='pandas'),
...         'simple_imputer': SimpleImputer()
...     },
...     'transformers': {
...         'ohe': (OneHotEncoder(drop='if_binary', handle_unknown='ignore'),
...                 cat_columns),
...         'ord': (OrdinalEncoder(), cat_columns),
...         'poly2': (PolynomialFeatures(degree=2, include_bias=False),
...                   num_columns),
...         'log': (FunctionTransformer(np.log1p, validate=True),
...                 num_columns)
...     },
...     'scalers': {
...         'stand': StandardScaler(),
...         'minmax': MinMaxScaler()
...     },
...     'selectors': {
...         'rfe_logreg': RFE(LogisticRegression(max_iter=max_iter,
...                                         random_state=random_state,
...                                         class_weight=class_weight)),
...         'sfs_linreg': SequentialFeatureSelector(LinearRegression())
...     },
...     'models': {
...         'linreg': LinearRegression(),
...         'logreg': LogisticRegression(max_iter=max_iter,
...                                      random_state=random_state,
...                                      class_weight=class_weight),
...         'tree_class': DecisionTreeClassifier(random_state=random_state),
...         'tree_reg': DecisionTreeRegressor(random_state=random_state)
...     },
...     'no_scale': ['tree_class', 'tree_reg'],
...     'no_poly': ['tree_class', 'tree_reg'],
... }

Use this function to quickly create a pipeline during model iteration and evaluation. You can easily experiment with different combinations of preprocessing steps and models to find the best performing pipeline. This function is utilized by iterate_model, compare_models, and compare_reg_models to dynamically build pipelines as part of that larger modeling workflow.
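
For instance, with the configuration above, pairing scaler_key='stand' with model_key='tree_reg' produces a pipeline without the scaling step, because 'tree_reg' is listed in config['no_scale']:

>>> pipeline = create_pipeline(scaler_key='stand', model_key='tree_reg',
...                            config=config, cat_columns=cat_columns,
...                            num_columns=num_columns)
>>> [name for name, _ in pipeline.steps]  
['tree_reg']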

Parameters:
  • imputer_key (str, optional) – The key corresponding to the imputer to use for handling missing values. If not provided, no imputation will be performed.

  • transformer_keys (list of str, str, or None, optional) – The keys corresponding to the transformers to apply to the data. This can be a list of string keys or a single string key. If not provided, no transformers will be applied.

  • scaler_key (str or None, optional) – The key corresponding to the scaler to use for scaling the data. If not provided, no scaling will be performed.

  • selector_key (str or None, optional) – The key corresponding to the feature selector to use for selecting relevant features. If not provided, no feature selection will be performed.

  • model_key (str, optional) – The key corresponding to the model to use for predictions.

  • impute_first (bool, default=True) – Whether to perform imputation before applying the transformers. If False, imputation will be performed after the transformers.

  • config (dict or None, optional) – A dictionary containing the configuration for the pipeline components. If not provided, a default configuration will be used.

  • cat_columns (list-like, optional) – List of categorical columns from the input dataframe. This is used in the default configuration for the relevant transformers.

  • num_columns (list-like, optional) – List of numeric columns from the input dataframe. This is used in the default configuration for the relevant transformers.

  • random_state (int, default=42) – The random state to use for reproducibility.

  • class_weight (dict or None, optional) – A dictionary mapping class labels to weights for imbalanced classification problems. If not provided, equal weights will be used.

  • max_iter (int, default=10000) – The maximum number of iterations for iterative models.

  • debug (bool, optional) – Flag to show debugging information.

Returns:

pipeline – The constructed pipeline based on the specified components and configuration.

Return type:

sklearn.pipeline.Pipeline

Examples

Prepare sample data for the examples:

>>> from sklearn.datasets import fetch_california_housing
>>> X, y = fetch_california_housing(return_X_y=True)
>>> cat_columns = ['ocean_proximity']
>>> num_columns = ['longitude', 'latitude', 'housing_median_age',
...                  'total_rooms', 'total_bedrooms', 'population',
...                  'households', 'median_income']

Example 1: Create a pipeline with Standard Scaler and Linear Regression:

>>> pipeline = create_pipeline(scaler_key='stand', model_key='linreg',
...                            cat_columns=cat_columns,
...                            num_columns=num_columns)
>>> pipeline.steps
[('stand', StandardScaler()), ('linreg', LinearRegression())]

Example 2: Create a pipeline with One-Hot Encoding, Standard Scaler, and a Logistic Regression model:

>>> pipeline = create_pipeline(transformer_keys=['ohe'],
...                            scaler_key='stand',
...                            model_key='logreg',
...                            cat_columns=cat_columns,
...                            num_columns=num_columns)
>>> pipeline.steps
[('ohe', ColumnTransformer(remainder='passthrough',
                  transformers=[('ohe',
                                 OneHotEncoder(drop='if_binary',
                                               handle_unknown='ignore'),
                                 ['ocean_proximity'])])), ('stand', StandardScaler()), ('logreg', LogisticRegression(max_iter=10000, random_state=42))]

Example 3: Create a pipeline with KNN Imputer, One-Hot Encoding, Polynomial Transformation, Log Transformation, Standard Scaler, and Gradient Boost Regressor for the model:

>>> pipeline = create_pipeline(imputer_key='knn_imputer',
...                            transformer_keys=['ohe', 'poly2', 'log'],
...                            scaler_key='stand',
...                            model_key='boost_reg',
...                            cat_columns=cat_columns,
...                            num_columns=num_columns)
>>> pipeline.steps
[('knn_imputer', KNNImputer()), ('ohe_poly2_log', ColumnTransformer(remainder='passthrough',
                  transformers=[('ohe',
                                 OneHotEncoder(drop='if_binary',
                                               handle_unknown='ignore'),
                                 ['ocean_proximity']),
                                ('poly2',
                                 PolynomialFeatures(include_bias=False),
                                 ['longitude', 'latitude', 'housing_median_age',
                                  'total_rooms', 'total_bedrooms', 'population',
                                  'households', 'median_income']),
                                ('log',
                                 FunctionTransformer(func=<ufunc 'log1p'>,
                                                     validate=True),
                                 ['longitude', 'latitude', 'housing_median_age',
                                  'total_rooms', 'total_bedrooms', 'population',
                                  'households', 'median_income'])])), ('stand', StandardScaler()), ('boost_reg', GradientBoostingRegressor(random_state=42))]
datawaza.model.create_results_df() -> DataFrame

Initialize the results_df DataFrame with the columns required for iterate_model.

This function creates a new DataFrame with the following columns: ‘Iteration’, ‘Train MSE’, ‘Test MSE’, ‘Train RMSE’, ‘Test RMSE’, ‘Train MAE’, ‘Test MAE’, ‘Train R^2 Score’, ‘Test R^2 Score’, ‘Pipeline’, ‘Best Grid Params’, ‘Note’, ‘Date’.

Create a results_df with this function, and then pass it as a parameter to iterate_model. The results of each model iteration will be appended to results_df.
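
A hedged sketch of the intended flow (the iterate_model arguments shown are hypothetical placeholders, not its documented signature):

>>> results_df = create_results_df()
>>> # Hypothetical call for illustration only; each run appends one row
>>> results_df = iterate_model(X_train, X_test, y_train, y_test,
...                            model='linreg', results_df=results_df)  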

Returns:

The initialized results_df DataFrame.

Return type:

pd.DataFrame

Examples

Create a DataFrame with the columns required for iterate_model:

>>> results_df = create_results_df()
>>> results_df.columns
Index(['Iteration', 'Train MSE', 'Test MSE', 'Train RMSE', 'Test RMSE',
       'Train MAE', 'Test MAE', 'Train R^2 Score', 'Test R^2 Score',
       'Pipeline', 'Best Grid Params', 'Note', 'Date'],
      dtype='object')
datawaza.model.eval_model(*, y_test: ndarray, y_pred: ndarray, class_map: Dict[Any, Any] | None = None, estimator: Any | None = None, x_test: ndarray | None = None, class_type: str | None = None, pos_label: Any | None = 1, threshold: float = 0.5, multi_class: str = 'ovr', average: str = 'macro', title: str | None = None, model_name: str = 'Model', class_weight: str | None = None, decimal: int = 2, bins: int = 10, bin_strategy: str | None = None, plot: bool = False, figsize: Tuple[int, int] = (12, 11), figmulti: float = 1.7, conf_fontsize: int = 14, return_metrics: bool = False, output: bool = True, debug: bool = False) -> Dict[str, int | float] | None

Evaluate a classification model’s performance and plot results.

This function provides a comprehensive evaluation of a binary or multi-class classification model based on y_test (the actual target values) and y_pred (the predicted target values). It displays a text-based classification report enhanced with True/False Positives/Negatives (if binary), and 4 charts if plot is True: Confusion Matrix, Histogram of Predicted Probabilities, ROC Curve, and Precision-Recall Curve.

If class_type is ‘binary’, it will treat this as a binary classification. If class_type is ‘multi’, it will treat this as a multi-class problem. If class_type is not specified, it will be detected based on the number of unique values in y_test. To plot the curves or adjust the threshold (default 0.5), both x_test and estimator must be provided so that probabilities can be calculated.

For binary classification, pos_label is required. This defaults to 1 as an integer, but can be set to any value that matches one of the values in y_test and y_pred. The class_map can be used to provide display names for the classes. If not provided, the actual class values will be used.

A number of classification metrics are shown in the report: Accuracy, Precision, Recall, F1, and ROC AUC. In addition, for binary classification, True Positive Rate, False Positive Rate, True Negative Rate, and False Negative Rate are shown. The metrics are calculated at the default threshold of 0.5, but can be adjusted with the threshold parameter.

You can customize the title of the report completely, or pass the model_name and it will be displayed in a dynamically generated title. You can also specify the number of decimal places to show, and the size of the figure (figsize). For multi-class problems, you can set a figmulti scaling factor for the plot.

You can set class_weight as a display-only string; it is not used in any calculations within eval_model. This is useful if you trained the model with a ‘balanced’ class_weight, and now want to show that setting in the report.

A dictionary of metrics can be returned if return_metrics is True, and the output can be disabled by setting output to False. These are used by parent functions (ex: compare_models) to gather the data into a DataFrame of the results.

Use this function to assess the performance of a trained classification model. You can experiment with different thresholds to see how they affect metrics like Precision, Recall, False Positive Rate and False Negative Rate. The plots make it easy to see if you’re getting good separation and maximum area under the curve.
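
For example, a quick sketch of a threshold sweep using only documented parameters, with plots and printed output disabled:

>>> for t in [0.3, 0.4, 0.5, 0.6]:  
...     m = eval_model(y_test=y_test, y_pred=y_pred, estimator=model,
...                    x_test=X_test, pos_label=1, threshold=t,
...                    return_metrics=True, output=False)
...     print(t, round(m['FPR'], 2), round(m['FNR'], 2))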

Parameters:
  • y_test (np.ndarray) – The true labels of the test set.

  • y_pred (np.ndarray) – The predicted labels of the test set.

  • class_map (Dict[Any, Any], optional) – A dictionary mapping class labels to their string representations. Default is None.

  • estimator (Any, optional) – The trained estimator object used for prediction. Required for generating probabilities. Default is None.

  • x_test (np.ndarray, optional) – The test set features. Required for generating probabilities. Default is None.

  • class_type (str, optional) – The type of classification problem. Can be ‘binary’ or ‘multi’. If not provided, it will be inferred from the number of unique labels. Default is None.

  • pos_label (Any, optional) – The positive class label for binary classification. Default is 1.

  • threshold (float, optional) – The threshold for converting predicted probabilities to class labels. Default is 0.5.

  • multi_class (str, optional) – The method for handling multi-class ROC AUC calculation. Can be ‘ovr’ (one-vs-rest) or ‘ovo’ (one-vs-one). Default is ‘ovr’.

  • average (str, optional) – The averaging method for multi-class classification metrics. Can be ‘macro’, ‘micro’, ‘weighted’, or ‘samples’. Default is ‘macro’.

  • title (str, optional) – The title for the plots. Default is None.

  • model_name (str, optional) – The name of the model for labeling the plots. Default is ‘Model’.

  • class_weight (str, optional) – The class weight settings used for training the model. Default is None.

  • decimal (int, optional) – The number of decimal places to display in the output and plots. Default is 2.

  • bins (int, optional) – The number of bins for the predicted probabilities histogram when bin_strategy is None. Default is 10.

  • bin_strategy (str, optional) – The strategy for determining the number of bins for the predicted probabilities histogram. Can be ‘sqrt’, ‘sturges’, ‘rice’, ‘freed’, ‘scott’, or ‘doane’. Default is None.

  • plot (bool, optional) – Whether to display the evaluation plots. Default is False.

  • figsize (Tuple[int, int], optional) – The figure size for the plots in inches. Default is (12, 11).

  • figmulti (float, optional) – The multiplier for the figure size in multi-class classification. Default is 1.7.

  • conf_fontsize (int, optional) – The font size for the numbers in the confusion matrix. Default is 14.

  • return_metrics (bool, optional) – Whether to return the evaluation metrics as a dictionary. Default is False.

  • output (bool, optional) – Whether to print the evaluation results. Default is True.

  • debug (bool, optional) – Whether to print debug information. Default is False.

Returns:

metrics – A dictionary containing the evaluation metrics. Returned only if return_metrics is True.

Return type:

Dict[str, Union[int, float]], optional

Examples

Prepare data and model for the examples:

>>> from sklearn.datasets import make_classification
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.svm import SVC
>>> X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.4, 0.6],
...                            random_state=42)
>>> class_map = {0: 'Malignant', 1: 'Benign'}
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
...                                                     random_state=42)
>>> model = SVC(kernel='linear', probability=True, random_state=42)
>>> model.fit(X_train, y_train)
SVC(kernel='linear', probability=True, random_state=42)
>>> y_pred = model.predict(X_test)

Example 1: Basic evaluation with default settings:

>>> eval_model(y_test=y_test, y_pred=y_pred)  

Binary Classification Report

              precision    recall  f1-score   support

           0       0.76      0.74      0.75        72
           1       0.85      0.87      0.86       128

    accuracy                           0.82       200
   macro avg       0.81      0.80      0.80       200
weighted avg       0.82      0.82      0.82       200

               Predicted:0         1
Actual: 0                53        19
Actual: 1                17        111

True Positive Rate / Sensitivity: 0.87
True Negative Rate / Specificity: 0.74
False Positive Rate / Fall-out: 0.26
False Negative Rate / Miss Rate: 0.13

Positive Class: 1 (1)
Threshold: 0.5

Example 2: Evaluation with custom settings:

>>> eval_model(y_test=y_test, y_pred=y_pred, estimator=model, x_test=X_test,
...            class_type='binary', class_map=class_map, pos_label=0,
...            threshold=0.35, model_name='SVM', class_weight='balanced',
...            decimal=4, plot=True, figsize=(13, 13), conf_fontsize=18,
...            bins=20)   

SVM Binary Classification Report

              precision    recall  f1-score   support

      Benign     0.9545    0.8203    0.8824       128
   Malignant     0.7444    0.9306    0.8272        72

    accuracy                         0.8600       200
   macro avg     0.8495    0.8754    0.8548       200
weighted avg     0.8789    0.8600    0.8625       200

ROC AUC: 0.9220

               Predicted:1         0
Actual: 1                105       23
Actual: 0                5         67

True Positive Rate / Sensitivity: 0.9306
True Negative Rate / Specificity: 0.8203
False Positive Rate / Fall-out: 0.1797
False Negative Rate / Miss Rate: 0.0694

Positive Class: Malignant (0)
Class Weight: balanced
Threshold: 0.35

Example 3: Evaluate model with no output and return a dictionary:

>>> metrics = eval_model(y_test=y_test, y_pred=y_pred, estimator=model,
...            x_test=X_test, class_map=class_map, pos_label=0,
...            return_metrics=True, output=False)
>>> print(metrics)
{'True Positives': 53, 'False Positives': 17, 'True Negatives': 111, 'False Negatives': 19, 'TPR': 0.7361111111111112, 'TNR': 0.8671875, 'FPR': 0.1328125, 'FNR': 0.2638888888888889, 'Benign': {'precision': 0.8538461538461538, 'recall': 0.8671875, 'f1-score': 0.8604651162790697, 'support': 128.0}, 'Malignant': {'precision': 0.7571428571428571, 'recall': 0.7361111111111112, 'f1-score': 0.7464788732394366, 'support': 72.0}, 'accuracy': 0.82, 'macro avg': {'precision': 0.8054945054945055, 'recall': 0.8016493055555556, 'f1-score': 0.8034719947592532, 'support': 200.0}, 'weighted avg': {'precision': 0.819032967032967, 'recall': 0.82, 'f1-score': 0.819430068784802, 'support': 200.0}, 'ROC AUC': 0.9219835069444444, 'Threshold': 0.5, 'Class Type': 'binary', 'Class Map': {0: 'Malignant', 1: 'Benign'}, 'Positive Label': 0, 'Title': None, 'Model Name': 'Model', 'Class Weight': None, 'Multi-Class': 'ovr', 'Average': 'macro'}

Prepare multi-class example data:

>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> X = pd.DataFrame(X, columns=['sepal_length', 'sepal_width', 'petal_length',
...                              'petal_width'])
>>> y = pd.Series(y)
>>> class_map = {0: 'Setosa', 1: 'Versicolor', 2: 'Virginica'}
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
...                                    random_state=42)
>>> model = SVC(kernel='linear', probability=True, random_state=42)
>>> model.fit(X_train, y_train)
SVC(kernel='linear', probability=True, random_state=42)
>>> y_pred = model.predict(X_test)

Example 4: Evaluate multi-class model with default settings:

>>> metrics = eval_model(y_test=y_test, y_pred=y_pred, class_map=class_map,
...               return_metrics=True)   

Multi-Class Classification Report

              precision    recall  f1-score   support

      Setosa       1.00      1.00      1.00        10
  Versicolor       1.00      1.00      1.00         9
   Virginica       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

Predicted   Setosa  Versicolor  Virginica
Actual
Setosa          10           0          0
Versicolor       0           9          0
Virginica        0           0         11

>>> print(metrics)
{'Setosa': {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 10.0}, 'Versicolor': {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 9.0}, 'Virginica': {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 11.0}, 'accuracy': 1.0, 'macro avg': {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 30.0}, 'weighted avg': {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 30.0}, 'ROC AUC': None, 'Threshold': 0.5, 'Class Type': 'multi', 'Class Map': {0: 'Setosa', 1: 'Versicolor', 2: 'Virginica'}, 'Positive Label': None, 'Title': None, 'Model Name': 'Model', 'Class Weight': None, 'Multi-Class': 'ovr', 'Average': 'macro'}
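
Example 5: Using the binary model and data from Examples 1–3, the histogram bin count can also be derived automatically by setting bin_strategy instead of a fixed bins value. This is an illustrative sketch using one of the documented strategies (‘sqrt’); output is omitted:

>>> eval_model(y_test=y_test, y_pred=y_pred, estimator=model, x_test=X_test,
...            class_map=class_map, pos_label=0, plot=True,
...            bin_strategy='sqrt')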
datawaza.model.iterate_model(x_train: DataFrame, x_test: DataFrame, y_train: Series, y_test: Series, model: str | None = None, imputer: str | None = None, transformers: List[str] | str | None = None, scaler: str | None = None, selector: str | None = None, drop: List[str] | None = None, config: Dict[str, Any] | None = None, iteration: str = '1', note: str = '', save: bool = False, save_df: DataFrame | None = None, export: bool = False, plot: bool = False, coef: bool = False, perm: bool = False, vif: bool = False, cross: bool = False, cv_folds: int = 5, grid: bool = False, grid_params: str | None = None, grid_cv: str | None = None, grid_score: str = 'r2', grid_verbose: int = 1, search_type: str = 'grid', random_state: int = 42, n_jobs: int | None = None, decimal: int = 2, lowess: bool = False, timezone: str = 'UTC', debug: bool = False) Tuple[DataFrame, Pipeline, Dict[str, Any] | None][source]#

Iterate and evaluate a model pipeline with specified parameters.

This function creates a pipeline from specified parameters for imputers, column transformers, scalers, feature selectors, and models. Parameters must be defined in a configuration dictionary containing the sections described below. If config is not defined, the create_pipeline function will revert to the default config embedded in its code. After creating the pipeline, the function fits it to the passed training data and evaluates performance with both test and training data. There are options to plot residuals and actual vs. predicted values, save results to a save_df with a user-defined note, display coefficients, calculate permutation feature importance and variance inflation factor (VIF), and perform cross-validation.

If grid is set to True, a Grid Search CV will run to find the best hyper-parameters. You must also specify a grid_params string that matches a key in the config[‘params’] dictionary, which must point to a dictionary whose keys exactly match the names of the pipeline steps and parameters you want to search (see the example config below). You can also specify a different grid_score and control the grid_verbose level (set it to 4 to see a full log). If you want to run a Randomized Grid Search instead, set search_type to ‘random’; random_state defaults to 42. n_jobs is None by default, but you can increase the number (although you may not see the real-time output of the search if grid_verbose is set high).

When iterate_model is run, the create_pipeline function is called to create a pipeline from the specified parameters:

  • imputer_key (str) is selected from config[‘imputers’]

  • transformer_keys (list or str) are selected from config[‘transformers’]

  • scaler_key (str) is selected from config[‘scalers’]

  • selector_key (str) is selected from config[‘selectors’]

  • model_key (str) is selected from config[‘models’]

  • config[‘no_scale’] lists model keys that should not be scaled.

  • config[‘no_poly’] lists models that should not be polynomial transformed.

Here is an example of the configuration dictionary structure. It is based on what create_pipeline requires to assemble the pipeline, plus some additional sections required only by iterate_model: params (grid search parameters) and cv (cross-validation definitions):

>>> config = {  
...     'imputers': {
...         'knn_imputer': KNNImputer().set_output(transform='pandas'),
...         'simple_imputer': SimpleImputer()
...     },
...     'transformers': {
...         'ohe': (OneHotEncoder(drop='if_binary', handle_unknown='ignore'),
...                 cat_columns),
...         'ord': (OrdinalEncoder(), cat_columns),
...         'poly2': (PolynomialFeatures(degree=2, include_bias=False),
...                   num_columns),
...         'log': (FunctionTransformer(np.log1p, validate=True),
...                 num_columns)
...     },
...     'scalers': {
...         'stand': StandardScaler(),
...         'minmax': MinMaxScaler()
...     },
...     'selectors': {
...         'rfe_logreg': RFE(LogisticRegression(max_iter=max_iter,
...                                         random_state=random_state,
...                                         class_weight=class_weight)),
...         'sfs_linreg': SequentialFeatureSelector(LinearRegression())
...     },
...     'models': {
...         'linreg': LinearRegression(),
...         'logreg': LogisticRegression(max_iter=max_iter,
...                                      random_state=random_state,
...                                      class_weight=class_weight),
...         'tree_class': DecisionTreeClassifier(random_state=random_state),
...         'tree_reg': DecisionTreeRegressor(random_state=random_state)
...     },
...     'no_scale': ['tree_class', 'tree_reg'],
...     'no_poly': ['tree_class', 'tree_reg'],
...     'params': {
...         'sfs': {
...             'Selector: sfs__n_features_to_select': np.arange(3, 13, 1),
...         },
...         'linreg': {
...             'Model: linreg__fit_intercept': [True],
...         },
...         'ridge': {
...             'Model: ridge__alpha': np.array([0.001, 0.1, 1, 10, 100, 1000, 10000, 100000]),
...         }
...     },
...     'cv': {
...         'kfold_5': KFold(n_splits=5, shuffle=True, random_state=42),
...         'kfold_10': KFold(n_splits=10, shuffle=True, random_state=42),
...         'skf_5': StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
...         'skf_10': StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
...     }
... }

In addition to the configuration dictionary, you will need to define any column lists if you want to target certain transformations at a subset of columns. For example, you might define an ‘ohe’ transformer for One-Hot Encoding and reference ‘ohe_columns’ or ‘cat_columns’ in its definition in the config, as in the sketch below.
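
For instance (an illustrative sketch; the column names here are hypothetical):

>>> cat_columns = ['color', 'size']     # targeted by the 'ohe' or 'ord' transformers
>>> num_columns = ['price', 'weight']   # targeted by the 'poly2' or 'log' transformers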

When iterate_model completes, it will print the results and performance metrics, as well as any requested charts. It will return the save_df DataFrame, the best model pipeline, and the grid search results (if a grid search was run). In addition, if save=True, it will append the results to the DataFrame passed as save_df, which should be created beforehand using create_results_df. If export=True, it will save the best model to disk using joblib dump with a timestamp.

Use this function to iterate and evaluate different model pipeline configurations, analyze their performance, and select the best model. With one line of code, you can quickly explore a change to the model pipeline, or grid search parameters, and see how it impacts performance. You can also track the results of these iterations in a results_df DataFrame that can be used to evaluate the best model, or to plot the progress you made from each iteration.

Parameters:
  • x_train (pd.DataFrame) – Training feature set.

  • x_test (pd.DataFrame) – Test feature set.

  • y_train (pd.Series) – Training target set.

  • y_test (pd.Series) – Test target set.

  • model (str, optional) – Key for the model to be used (ex: ‘linreg’, ‘lasso’, ‘ridge’).

  • imputer (str, optional) – Key for the imputer to be applied (ex: ‘simple_imputer’).

  • transformers (List[str], optional) – List of transformation keys to apply (ex: [‘ohe’, ‘poly2’]).

  • scaler (str, optional) – Key for the scaler to be applied (ex: ‘stand’).

  • selector (str, optional) – Key for the selector to be applied (ex: ‘sfs’).

  • drop (List[str], optional) – List of columns to be dropped from the training and test sets.

  • iteration (str, optional) – A string identifier for the iteration (default ‘1’).

  • note (str, optional) – Any note or comment to be added for the iteration.

  • save (bool, optional) – Boolean flag to save the results to the global results dataframe.

  • save_df (pd.DataFrame, optional) – DataFrame to store the results of each iteration.

  • export (bool, optional) – Boolean flag to export the trained model.

  • plot (bool, optional) – Flag to plot residual and actual vs predicted for train/test data.

  • coef (bool, optional) – Flag to print and plot model coefficients.

  • perm (bool, optional) – Flag to compute and display permutation feature importance.

  • vif (bool, optional) – Flag to calculate and display Variance Inflation Factor.

  • cross (bool, optional) – Flag to perform cross-validation and print results.

  • cv_folds (int, optional) – Number of folds for cross-validation if cross=True (default 5).

  • config (Dict[str, Any], optional) – Configuration dictionary for pipeline construction.

  • grid (bool, optional) – Flag to perform grid search for hyperparameter tuning.

  • grid_params (str, optional) – Key for the grid search parameters in the config dictionary.

  • grid_cv (str, optional) – Key for the grid search cross-validation in the config dictionary.

  • grid_score (str, optional) – Scoring metric for grid search (default ‘r2’).

  • grid_verbose (int, optional) – Verbosity level for grid search (default 1).

  • search_type (str, optional) – Choose type of grid search: ‘grid’ for GridSearchCV, or ‘random’ for RandomizedSearchCV. Default is ‘grid’.

  • random_state (int, optional) – Random state seed, necessary for reproducibility with RandomizedSearchCV. Default is 42.

  • n_jobs (int, optional) – Number of jobs to run in parallel for Grid Search or Randomized Search. Default is None.

  • decimal (int, optional) – Number of decimal places for displaying metrics (default 2).

  • lowess (bool, optional) – Flag to display lowess curve in residual plots (default False).

  • timezone (str, optional) – Timezone to be used for timestamps. Default is ‘UTC’.

  • debug (bool, optional) – Flag to show debugging information.

Returns:

A tuple containing the save_df DataFrame, the best model pipeline, and the grid search results (if grid=True, else None).

Return type:

Tuple[DataFrame, Pipeline, Optional[Dict[str, Any]]]

Examples

Prepare some sample data for the examples:

>>> from sklearn.datasets import make_regression
>>> from sklearn.model_selection import train_test_split
>>> X, y = make_regression(n_samples=100, n_features=5, noise=0.5,
...                        random_state=42)
>>> X_df = pd.DataFrame(X,
...                     columns=[f"Feature {i+1}" for i in range(X.shape[1])])
>>> y_df = pd.DataFrame(y, columns=['Target'])
>>> X_train, X_test, y_train, y_test = train_test_split(X_df, y_df,
...     test_size=0.2, random_state=42)

Create column lists and set some variables:

>>> num_columns = ['Feature 1','Feature 2','Feature 3','Feature 4','Feature 5']
>>> cat_columns = []
>>> random_state = 42

Create a dataframe to store the results of each iteration (optional):

>>> results_df = create_results_df()

Create a custom configuration file:

>>> my_config = {
...     'imputers': {
...         'simple_imputer': SimpleImputer()
...     },
...     'transformers': {
...         'poly2': (PolynomialFeatures(degree=2, include_bias=False),
...                   num_columns)
...     },
...     'scalers': {
...         'stand': StandardScaler()
...     },
...     'selectors': {
...         'sfs_linreg': SequentialFeatureSelector(LinearRegression())
...     },
...     'models': {
...         'linreg': LinearRegression(),
...         'ridge': Ridge(random_state=random_state)
...     },
...     'no_scale': [],
...     'no_poly': [],
...     'params': {
...         'linreg': {
...             'linreg__fit_intercept': [True],
...         },
...         'ridge': {
...             'ridge__alpha': np.array([0.1, 1, 10, 100]),
...         }
...     },
...     'cv': {
...         'kfold_5': KFold(n_splits=5, shuffle=True, random_state=42)
...     }
... }

Example 1: Iterate a linear regression model with default parameters:

>>> _, model, _ = iterate_model(X_train, X_test, y_train, y_test,
...                             model='linreg')  

ITERATION 1 RESULTS

Pipeline: linreg
...UTC

Predictions:
                          Train            Test
MSE:                       0.20            0.28
RMSE:                      0.45            0.53
MAE:                       0.36            0.42
R^2 Score:                 1.00            1.00

Example 2: Iterate a pipeline with transformers and scalers:

>>> results_df, model, grid = iterate_model(X_train, X_test, y_train, y_test,
...     transformers=['poly2'], scaler='stand', model='ridge', iteration='2',
...     grid=True, grid_params='ridge', grid_cv='kfold_5', plot=True,
...     coef=True, perm=True, vif=True, config=my_config,
...     save=True, save_df=results_df)  

ITERATION 2 RESULTS

Pipeline: poly2 -> stand -> ridge
...UTC

Grid Search:

Fitting 5 folds for each of 4 candidates, totalling 20 fits

Best Grid mean score (r2): 1.00
Best Grid parameters: ridge__alpha: 0.1

Predictions:
                          Train            Test
MSE:                       0.20            0.43
RMSE:                      0.45            0.66
MAE:                       0.37            0.50
R^2 Score:                 1.00            1.00

Permutation Feature Importance:
  Feature Importance Mean Importance Std
Feature 2            0.83           0.14
Feature 1            0.47           0.03
Feature 4            0.33           0.03
Feature 3            0.31           0.03
Feature 5            0.11           0.01

Variance Inflation Factor:
 Features  VIF Multicollinearity
Feature 1 1.03               Low
Feature 4 1.03               Low
Feature 5 1.02               Low
Feature 3 1.02               Low
Feature 2 1.01               Low


Coefficients:
                Feature Coefficient
1             Feature 1       65.68
2             Feature 2       90.96
3             Feature 3       53.72
4             Feature 4       56.56
5             Feature 5       33.85
6           Feature 1^2        0.02
7   Feature 1 Feature 2        0.03
8   Feature 1 Feature 3       -0.16
9   Feature 1 Feature 4       -0.08
10  Feature 1 Feature 5        0.03
11          Feature 2^2       -0.03
12  Feature 2 Feature 3       -0.03
13  Feature 2 Feature 4        0.07
14  Feature 2 Feature 5       -0.05
15          Feature 3^2       -0.06
16  Feature 3 Feature 4        0.03
17  Feature 3 Feature 5       -0.07
18          Feature 4^2        0.01
19  Feature 4 Feature 5       -0.04
20          Feature 5^2       -0.05
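
Example 3: Run a randomized search instead of a full grid search by setting search_type to ‘random’ (an illustrative sketch reusing my_config and results_df from above; output omitted):

>>> results_df, model, grid = iterate_model(X_train, X_test, y_train, y_test,
...     transformers=['poly2'], scaler='stand', model='ridge', iteration='3',
...     grid=True, grid_params='ridge', grid_cv='kfold_5',
...     search_type='random', random_state=42, config=my_config,
...     save=True, save_df=results_df)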
datawaza.model.plot_acf_residuals(results: Any, figsize: Tuple[float, float] = (12, 8), rotation: int = 45, bins: int = 30, lags: int = 40, legend_loc: str = 'best', show_std: bool = True, pacf_method: str = 'ywm', alpha: float = 0.7) None[source]#

Plot residuals, histogram, ACF, and PACF of a time series ARIMA model.

This function takes the results of an ARIMA model and creates a 2x2 grid of plots to visualize the residuals, their histogram, autocorrelation function (ACF), and partial autocorrelation function (PACF). The residuals are plotted with lines indicating standard deviations from the mean if show_std is True.

Use this function in time series analysis to assess the residuals of an ARIMA model and check for any patterns or autocorrelations that may indicate inadequacies in the model.

Parameters:
  • results (Any) – The result object typically obtained after fitting an ARIMA model. This object should have a resid attribute containing the residuals.

  • figsize (Tuple[float, float], optional) – The size of the figure in inches, specified as (width, height). Default is (12, 8).

  • rotation (int, optional) – The rotation angle for the x-axis tick labels in degrees. Default is 45.

  • bins (int, optional) – The number of bins to use in the histogram of residuals. Default is 30.

  • lags (int, optional) – The number of lags to plot in the ACF and PACF plots. Default is 40.

  • legend_loc (str, optional) – The location of the legend in the residual plot and histogram. Default is ‘best’.

  • show_std (bool, optional) – Whether to display the standard deviation lines in the residual plot and histogram. Default is True.

  • pacf_method (str, optional) – The method to use for the partial autocorrelation function (PACF) plot. Default is ‘ywm’. Other options include ‘ywadjusted’, ‘ywmle’ and ‘ols’.

  • alpha (float, optional) – The transparency of the histogram bars, between 0 and 1. Default is 0.7.

Returns:

The function displays a 2x2 grid of plots using matplotlib.

Return type:

None

Examples

Prepare the necessary data and model:

>>> from statsmodels.tsa.arima.model import ARIMA
>>> import numpy as np
>>> data = np.random.random(100)
>>> model = ARIMA(data, order=(1, 1, 1))
>>> results = model.fit()

Example 1: Plot residuals with default parameters:

>>> plot_acf_residuals(results)

Example 2: Plot residuals without standard deviation lines:

>>> plot_acf_residuals(results, show_std=False)

Example 3: Plot residuals with custom figsize, bins, and PACF method:

>>> plot_acf_residuals(results, figsize=(12, 10), bins=20, pacf_method='ols')
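
Example 4: Plot fewer lags with a fixed legend position (an illustrative sketch; the parameter values are arbitrary):

>>> plot_acf_residuals(results, lags=24, legend_loc='upper right', rotation=0)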
datawaza.model.plot_results(df: DataFrame, metrics: List[str] | str | None = None, select_metric: str | None = None, select_criteria: str = 'max', chart_type: str = 'line', decimal: int = 2, return_df: bool = False, x_column: str = 'Iteration', y_label: str | None = None, rotation: int = 45, title: str | None = None) DataFrame | None[source]#

Plot the results of model iterations and select the best metric.

This function creates line or bar charts to visualize the performance of a model over multiple iterations, or to compare the performance of multiple models. Specify one or more metrics columns to plot (ex: ‘Train MAE’, ‘Test MAE’) in a list, and specify the name of the x_column whose values will become the X axis of the plot. The default is ‘Iteration’, which aligns with the format of the ‘results_df’ DataFrame created by the create_results_df function. However, this can be any column in the provided df that you want to compare across (for example, ‘Model’, ‘Epoch’, ‘Dataset’).

In addition, if you specify select_metric (any metric column in the df) and select_criteria (‘min’ or ‘max’), the best result will be selected and plotted on the chart with a vertical line, dot, and a legend label that describes the value. The number of decimal places can be controlled by setting decimal (default is 2).

The title of the chart will be dynamically generated if y_label and x_column are defined. The title will be constructed in this format: ‘{y_label} over {x_column}’ (ex: ‘MSE over Iteration’). However, you can always pass a custom title by setting title to any string of text. If none of these are defined, there will be no title on the chart.

Use this function to easily visualize and compare the performance of a model across different metrics, and identify the best iteration based on a chosen metric and criteria.

Parameters:
  • df (pd.DataFrame) – The DataFrame containing the model evaluation results.

  • metrics (Optional[Union[str, List[str]]], optional) – The metric(s) to plot. If a single string is provided, it will be converted to a list. If None, an error will be raised. Default is None.

  • select_metric (Optional[str], optional) – The metric to use for selecting the best result. If None, then no best result will be selected. Default is None.

  • select_criteria (str, optional) – The criteria for selecting the best result. Can be either ‘max’ or ‘min’. Required if select_metric is specified. Default is ‘max’.

  • chart_type (str, optional) – The type of chart to plot. Currently only ‘line’ or ‘bar’ is supported. Default is ‘line’.

  • decimal (int, optional) – The number of decimal places to display in the plot and legend. Default is 2.

  • return_df (bool, optional) – Whether to return the melted DataFrame used for plotting. Default is False.

  • x_column (str, optional) – The column in df to use as the x-axis. Default is ‘Iteration’.

  • y_label (str, optional) – The text to display as the label for the y-axis, and to also include in the dynamically generated title of the chart. Default is None.

  • title (Optional[str], optional) – The title of the plot. If None, a default title will be generated from y_label and x_column, as described above. If y_label is also None, the title will be blank. Default is None.

  • rotation (int, optional) – The rotation angle for the x-axis tick labels in degrees. Default is 45.

Returns:

If return_df is True, returns the melted DataFrame used for plotting. Otherwise, returns None.

Return type:

Optional[pd.DataFrame]

Examples

Prepare some example data:

>>> df = pd.DataFrame({
...     'Iteration': [1, 2, 3, 4, 5],
...     'Train Accuracy': [0.8510, 0.9017, 0.8781, 0.9209, 0.8801],
...     'Test Accuracy': [0.8056, 0.8509, 0.8232, 0.8889, 0.8415]
... })

Example 1: Plot a single metric with default parameters:

>>> plot_results(df, metrics='Test Accuracy')

Example 2: Plot multiple metrics, select the best result based on the maximum value of ‘Test Accuracy’, and customize the Y-axis label:

>>> plot_results(df, metrics=['Train Accuracy', 'Test Accuracy'],
...              select_metric='Test Accuracy', select_criteria='max',
...              y_label='Accuracy')

Example 3: Plot multiple metrics, customize the title and decimal, and return the melted DataFrame:

>>> long_df = plot_results(df, metrics=['Train Accuracy', 'Test Accuracy'],
...              select_metric='Test Accuracy', select_criteria='max',
...              title='Train vs. Test Accuracy by Model Iteration',
...              return_df=True, decimal=4)
>>> long_df
   Iteration          Metric   Value
0          1  Train Accuracy  0.8510
1          2  Train Accuracy  0.9017
2          3  Train Accuracy  0.8781
3          4  Train Accuracy  0.9209
4          5  Train Accuracy  0.8801
5          1   Test Accuracy  0.8056
6          2   Test Accuracy  0.8509
7          3   Test Accuracy  0.8232
8          4   Test Accuracy  0.8889
9          5   Test Accuracy  0.8415

Example 4: Plot a single metric as a bar chart:

>>> plot_results(df, metrics='Test Accuracy', chart_type='bar')

Example 5: Plot multiple metrics as a bar chart:

>>> plot_results(df, metrics=['Train Accuracy', 'Test Accuracy'],
...              select_metric='Test Accuracy', select_criteria='max',
...              y_label='Accuracy', chart_type='bar')
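
Example 6: Compare a metric across models rather than iterations by changing x_column (an illustrative sketch; the data here is hypothetical):

>>> models_df = pd.DataFrame({
...     'Model': ['LogReg', 'Tree', 'SVM'],
...     'Test Accuracy': [0.8415, 0.8104, 0.8839]
... })
>>> plot_results(models_df, metrics='Test Accuracy', x_column='Model',
...              chart_type='bar', y_label='Accuracy')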
datawaza.model.plot_train_history(model=None, history=None, metrics: List[str] | None = None, plot_loss: bool = True) None[source]#

Visualize the training history of a fitted Keras model or history dictionary.

This function creates a grid of subplots to display the training and validation metrics over the epochs. You can pass a fitted model, in which case the history will be extracted from it. Alternatively, you can pass the history dictionary itself. This function will automatically detect the metrics present in the history and plot them all, unless a specific list of metrics is provided. The loss is plotted by default, but can be excluded by setting plot_loss to False.

Use this function to quickly analyze the model’s performance during training and identify potential issues such as overfitting or underfitting.

Parameters:
  • model (keras.Model, optional) – The fitted Keras model whose training history will be plotted. Default is None.

  • history (dict, optional) – A direct history dictionary obtained from the fitting process. Default is None.

  • metrics (List[str], optional) – A list of metric names to plot. If None, all metrics found in the history will be plotted, excluding ‘loss’ unless explicitly listed. Default is None.

  • plot_loss (bool, optional) – Whether to plot the training and validation loss. Default is True.

Returns:

The function displays the plot and does not return any value.

Return type:

None

Examples

Prepare a simple example model:

>>> model = Sequential([
...     Input(shape=(8,)),
...     Dense(10, activation='relu'),
...     Dense(1, activation='sigmoid')
... ])
>>> model.compile(optimizer='adam', loss='binary_crossentropy',
...               metrics=['accuracy', 'precision', 'recall'])

Fit the model on some random data:

>>> import numpy as np
>>> X = np.random.rand(100, 8)
>>> y = np.random.randint(0, 2, size=(100, 1))
>>> model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2,
...           verbose=0)  
<keras...callbacks.history.History object at 0x...>
>>> history = model.history.history

Example 1: Plot all metrics in the training history from a model:

>>> plot_train_history(model)

Example 2: Plot the training history with specific metrics:

>>> plot_train_history(model, metrics=['accuracy', 'precision'])

Example 3: Plot the training history without the loss:

>>> plot_train_history(model, plot_loss=False)

Example 4: Plot the training history of a model without validation data:

>>> model.fit(X, y, epochs=10, batch_size=32, verbose=0)  
<keras...callbacks.history.History object at 0x...>
>>> plot_train_history(model)

Example 5: Plot the training history from a history dictionary:

>>> plot_train_history(history=history)