Hyperparameter tuning - Tune Race, Win loss, and Sim anneal

Machine Learning Supervised Learning
Jasper Lok https://jasperlok.netlify.app/
04-07-2025

Photo by cottonbro studio

In this post, I will explore how to leverage alternative hyperparameter tuning techniques to accelerate the tuning process and identify the hyperparameter sets that yield the best model performance.

The three methods covered are tune_race_anova, tune_race_win_loss, and tune_sim_anneal.

These functions are part of the finetune package.

Below is a summary based on the finetune package documentation:

Both the tune_race_anova and tune_race_win_loss functions begin by evaluating all hyperparameter combinations on an initial set of resamples.

The tune_race_anova function uses a repeated measures ANOVA model to eliminate hyperparameter combinations that are unlikely to yield the best results. In contrast, tune_race_win_loss computes pairwise win/loss statistics and fits a Bradley-Terry model, a form of logistic regression, to estimate how likely each combination is to outperform the others (this is why the BradleyTerry2 package is loaded in the demonstration below).

tune_sim_anneal uses a method called Simulated Annealing, which is a global optimization technique.

For model tuning, it iteratively explores the parameter space to find optimal hyperparameter combinations. At each iteration, a new combination is generated by slightly perturbing the current parameters to remain within a local neighborhood.

This new combination is then used to fit a model, and the model’s performance is evaluated using resampling (or a validation set).

If the new settings yield better performance than the current ones, they are accepted, and the process continues. Crucially, worse settings can also be accepted, with a probability that shrinks as the performance loss grows and as the search progresses; this is what allows simulated annealing to escape local optima.
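To make the acceptance step concrete, below is a minimal sketch of a typical simulated annealing acceptance rule. The function accept_new, its arguments, and the exact cooling form are my own illustration, not finetune's internals:

accept_new <- function(new_perf, current_perf, iter, coef = 0.02) {
  # better results are always accepted
  if (new_perf > current_perf) {
    return(TRUE)
  }
  # worse results are accepted with a probability that shrinks as the
  # performance loss grows and as the iterations progress ("cooling")
  pct_loss <- (current_perf - new_perf) / current_perf
  runif(1) < exp(-coef * iter * pct_loss)
}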

Demonstration

First, I will import all the necessary packages into the environment.

pacman::p_load(tidyverse, tidymodels, janitor, finetune, future, bonsai, aorsf, BradleyTerry2)
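Note that the future package is loaded but not otherwise used below; recent versions of tune can run resamples in parallel through a future plan. A hedged example (the worker count is an assumption to adjust to your machine):

plan(multisession, workers = 4) # optional: parallelize resampling via the future backend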

In this demonstration, I will use the finetune package to perform the different types of hyperparameter tuning.

Import Data

I will be using this body performance dataset I found on Kaggle for the demonstration.

df <- read_csv("https://raw.githubusercontent.com/jasperlok/my-blog/refs/heads/master/_posts/2024-06-29-ordinal-regression/data/bodyPerformance.csv") %>%
  clean_names() %>% # clean up the column naming
  mutate(across(c(class, gender), as.factor)) # convert outcome and gender columns to factors
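Since the splits below are stratified on class, a quick look at the outcome distribution is a useful sanity check (illustrative):

df %>% count(class) # check how balanced the outcome classes are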

Train Test Split

Next, I will split the dataset into training and testing datasets.

df_split <- initial_split(df, prop = 0.7, strata = class)
df_train <- training(df_split)
df_test <- testing(df_split)
df_vfold <- vfold_cv(df_train, strata = class)
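Note that vfold_cv defaults to v = 10 folds, which is why the tuning results later report n = 10 resamples. For reproducible splits, a seed can be set before calling initial_split and vfold_cv (the value here is arbitrary):

set.seed(2025) # run before initial_split()/vfold_cv() for reproducible resamples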

Okay, let’s start building a model!

Model Building

First, I will define the formula for the model.

orsf_recipe <- 
  recipe(class ~ .
         ,data = df_train)

I will also specify which model I want to build and which hyperparameters to tune.

orsf_spec <-
  rand_forest(trees = tune(), min_n = tune(), mtry = tune()) %>% 
  set_engine("aorsf") %>% 
  set_mode("classification")

Next, I will define the model building workflow.

orsf_wf <-
  workflow() %>% 
  add_recipe(orsf_recipe) %>% 
  add_model(orsf_spec)

Hyperparameter tuning

I will also define the settings for hyperparameter tuning control.

race_ctrl <-
  control_race(
    save_pred = TRUE
    ,save_workflow = FALSE
    ,verbose_elim = TRUE
  )
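control_race also exposes racing-specific settings. The values below simply restate the documented defaults for illustration:

race_ctrl_custom <-
  control_race(
    save_pred = TRUE
    ,burn_in = 3 # resamples evaluated before any elimination starts
    ,alpha = 0.05 # significance level used to filter out poor combinations
    ,verbose_elim = TRUE
  )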

I will use grid_space_filling to generate the different combinations of hyperparameters.

orsf_param_list <- parameters(trees(c(1, 50)), min_n(c(1, 5)), mtry(c(1, 10)))

orsf_grid_param <-
  orsf_param_list %>% # use the explicit ranges defined above (mtry needs a fixed upper bound)
  grid_space_filling(size = 30)
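We can print the resulting grid to confirm it contains 30 candidate combinations within the specified ranges:

orsf_grid_param # a tibble of 30 candidate hyperparameter combinations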

Method 1: Racing method

Tune ANOVA

One of the racing methods involves using a repeated measures ANOVA model to eliminate combinations of hyperparameters that are unlikely to produce good model results.

orsf_tune_anova <- tune_race_anova(
  orsf_wf
  ,resamples = df_vfold
  ,grid = orsf_grid_param
  ,metrics = metric_set(roc_auc)
  ,control = race_ctrl
)

From the run log, we can see that at each fold, the remaining combinations of hyperparameters are evaluated.

Combinations that are unlikely to yield good results are dropped from the race.

For instance, after the 10th fold, 19 hyperparameter combinations were eliminated.

By doing this, we save time by not evaluating hyperparameters that are unlikely to result in the best model.

We can also visualize how the different combinations of hyperparameters perform across different folds.

plot_race(orsf_tune_anova)

From the graph, we can see that the majority of the hyperparameter sets were dropped early in the tuning process.

Just like in traditional hyperparameter tuning, we can extract the best set of hyperparameters using the show_best function.

show_best(orsf_tune_anova)
# A tibble: 2 × 9
   mtry trees min_n .metric .estimator  mean     n std_err .config    
  <int> <int> <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>      
1     8    50     3 roc_auc hand_till  0.901    10 0.00146 Preprocess…
2     8    44     1 roc_auc hand_till  0.900    10 0.00175 Preprocess…
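From here, the usual tidymodels steps apply: select the winning combination, finalize the workflow, and fit once on the full training data. A brief sketch:

# lock the best hyperparameters into the workflow
best_params <- select_best(orsf_tune_anova, metric = "roc_auc")
final_wf <- finalize_workflow(orsf_wf, best_params)

# fit on the training set and evaluate once on the test set
final_fit <- last_fit(final_wf, df_split)
collect_metrics(final_fit)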

Win Loss Statistics

Another racing method involves using win/loss statistics, which can be done with the tune_race_win_loss function from the finetune package.

orsf_tune_win_loss <- tune_race_win_loss(
  orsf_wf
  ,resamples = df_vfold
  ,grid = orsf_grid_param
  ,metrics = metric_set(roc_auc)
  ,control = race_ctrl
)

As in the previous section, we can plot the tuning results using the plot_race function.

plot_race(orsf_tune_win_loss)

We can use the show_best function to find the best combinations of hyperparameters.

show_best(orsf_tune_win_loss)
# A tibble: 5 × 9
   mtry trees min_n .metric .estimator  mean     n std_err .config    
  <int> <int> <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>      
1     8    50     3 roc_auc hand_till  0.900    10 0.00165 Preprocess…
2     8    44     1 roc_auc hand_till  0.899    10 0.00144 Preprocess…
3     6    41     5 roc_auc hand_till  0.899    10 0.00151 Preprocess…
4     7    31     3 roc_auc hand_till  0.897    10 0.00143 Preprocess…
5     6    33     1 roc_auc hand_till  0.897    10 0.00152 Preprocess…

Method 2: Simulated annealing

According to the finetune package documentation, tuning via simulated annealing optimization is an iterative search tool for finding good values.

First, I will define the model tuning control settings.

anneal_ctrl <-
  control_sim_anneal(
    save_workflow = FALSE
    ,verbose = TRUE
  )
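control_sim_anneal has its own annealing-specific knobs as well; the values below are illustrative rather than recommendations:

anneal_ctrl_custom <-
  control_sim_anneal(
    verbose = TRUE
    ,no_improve = 10 # stop after 10 iterations without improvement
    ,restart = 8 # restart from the best result after 8 poor iterations
    ,cooling_coef = 0.02 # controls how fast the acceptance probability decays
  )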

Then, I will use the tune_sim_anneal function to perform model tuning.

orsf_tune_anneal <-
     orsf_spec %>%
     tune_sim_anneal(class ~ .
                     ,resamples = df_vfold
                     ,param_info = orsf_param_list
                     ,iter = 30
                     ,metrics = metric_set(roc_auc)
                     ,control = anneal_ctrl
                     )

We can plot the results at each iteration using the autoplot function.

autoplot(orsf_tune_anneal, metric = "roc_auc", type = "performance") +
    theme_bw()
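Besides performance over iterations, autoplot can also show how the parameter values themselves evolved during the search:

autoplot(orsf_tune_anneal, type = "parameters") +
    theme_bw()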

Again, we can use the show_best function to find the hyperparameter combination that gives us the best model results.

show_best(orsf_tune_anneal)
# A tibble: 5 × 10
  trees min_n  mtry .metric .estimator  mean     n std_err .config
  <int> <int> <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>  
1    48     4     7 roc_auc hand_till  0.900    10 0.00123 Iter27 
2    48     2     6 roc_auc hand_till  0.900    10 0.00121 Iter25 
3    50     1     7 roc_auc hand_till  0.900    10 0.00134 Iter5  
4    46     2     8 roc_auc hand_till  0.900    10 0.00139 Iter6  
5    47     1     7 roc_auc hand_till  0.899    10 0.00164 Iter22 
# ℹ 1 more variable: .iter <int>

Conclusion

That’s all for the day!

Thanks for reading the post until the end.

Feel free to contact me through email or LinkedIn if you have any suggestions on future topics to share.

Refer to this link for the blog disclaimer.

Till next time, happy learning!

Photo by ANTONI SHKRABA production
