Local Interpretable Model-Agnostic Explanations (LIME)

Machine Learning Model Explainability

Don’t worry, you won’t feel any sourness while using the method

Jasper Lok https://jasperlok.netlify.app/
04-16-2022

In this post, I will be exploring local interpretable model-agnostic explanations (LIME), i.e. one of the model explainability methods.

Photo by Vino Li on Unsplash

This technique was first introduced back in 2016, in a paper with a rather interesting title: "Why Should I Trust You?" Explaining the Predictions of Any Classifier.

You may refer to the paper via this link.

LIME

One interesting point highlighted by the authors of LIME paper was “if the users do not trust a model or a prediction, they will not use it.” (Ribeiro, Singh, and Guestrin 2016).

Based on my practical experience, this is so true.

Most of the time, model metrics (e.g. confusion matrix, accuracy, etc.) alone are not sufficient to give users, especially non-technical users, the comfort that they can trust the model results.

Below are some of the questions I have encountered so far:

If a model cannot be explained, it will be seen as a black box.

The authors of the LIME paper also outlined four criteria that explanations must satisfy (purva91 2021):

But fear not! There is ongoing research to help data scientists explain their model results.

Below, I will explore LIME, one of the common methods used to explain model results.

Photo by Govinda Valbuena from Pexels

Don’t worry, you won’t feel any sourness while using the method in explaining the predictions!

What is LIME?

LIME is a local surrogate model that is used to explain the individual predictions of the black-box machine learning models (Molnar 2022).

One key assumption made in LIME method is that at the local scale, the model can be approximated by a simple linear model (Molnar 2022).
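This local linear approximation can be sketched in a few lines of R. The snippet below is purely an illustration of the idea, not the lime package's actual implementation: we perturb samples around the point of interest, weight them by proximity, and fit a weighted linear model as the local surrogate. All names here (`complex_model`, `x0`, the kernel width) are made up for the example.

```r
set.seed(42)

# A hypothetical "complex" model of one feature
complex_model <- function(x) sin(3 * x) + x

# The point whose prediction we wish to explain
x0 <- 0.5

# 1. Perturb samples around the point of interest
x_perturbed <- rnorm(500, mean = x0, sd = 0.5)
y_perturbed <- complex_model(x_perturbed)

# 2. Weight the perturbed samples by proximity to x0 (Gaussian kernel)
kernel_width <- 0.2
weights <- exp(-(x_perturbed - x0)^2 / kernel_width^2)

# 3. Fit a simple weighted linear model as the local surrogate
local_fit <- lm(y_perturbed ~ x_perturbed, weights = weights)

# The local slope approximates the complex model's behaviour near x0
coef(local_fit)
```

The kernel width controls how "local" the explanation is, which is why it reappears later as a tuning parameter.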

The graph below illustrates this:

In the graph, the black dotted line is the simple model that attempts to explain the predicted value for the selected data point (i.e. the black cross) from the complex classification model (i.e. the orange and green areas).

Pros and cons

Some of the advantages of the LIME method are (Biecek and Burzykowski 2021):

The following are some of the flaws of the LIME method (Molnar 2022):

Demonstration

In this demonstration, I will be using the employee attrition dataset from Kaggle.

Without further ado, let's begin the demonstration!

Setup the environment

First, I will set up the environment by calling all the packages I need for the analysis later.

packages <- c('tidyverse', 'readr', 'tidymodels', 'DALEXtra', 'themis', 
              'lime')

# Install any missing packages before loading them
for(p in packages){
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

For this demonstration, we will be using an R package called DALEXtra.

This package provides an interface to different implementations of the LIME method. Refer to the documentation page for the list of LIME packages this function supports.

As this package acts as a wrapper, I will import the lime package into the environment as well.

Import the data

First I will import the data into the environment.

df <- read_csv("https://raw.githubusercontent.com/jasperlok/my-blog/master/_posts/2022-03-12-marketbasket/data/general_data.csv")

I will set the random seed for reproducibility.

set.seed(1234)

Build a model

For simplicity, I will build a random forest model and attempt to explain its predictions using the LIME method.

Split data

Next, I will split the data into training and testing datasets.

df_split <- initial_split(df, 
                          prop = 0.6, 
                          strata = Attrition)

df_train <- training(df_split)
df_test <- testing(df_split)
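As a quick sanity check, we can verify that the stratified split preserved the class proportions. This is a sketch using dplyr's count (loaded earlier via tidyverse), assuming Attrition is the column name as in the dataset above.

```r
# Compare the Attrition class proportions in the two splits;
# stratified sampling should keep them roughly equal
df_train %>% count(Attrition) %>% mutate(prop = n / sum(n))
df_test %>% count(Attrition) %>% mutate(prop = n / sum(n))
```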

Random Forest

First I will define the recipe for the machine learning models.

ranger_recipe <- 
  recipe(formula = Attrition ~ ., 
         data = df_train) %>%
  step_impute_mean(NumCompaniesWorked,
                   TotalWorkingYears) %>%
  step_nzv(all_predictors()) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_upsample(Attrition)

In the recipe above, I have done the following:

- Imputed the missing values in NumCompaniesWorked and TotalWorkingYears with their respective means
- Removed predictors with near-zero variance
- Converted all nominal predictors into dummy variables
- Upsampled the minority class of Attrition to address the class imbalance
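To see what the recipe actually produces, we can prep and bake it. This is a quick sketch: it assumes the recipe preps cleanly (in particular, step_upsample requires the outcome to be a factor).

```r
# prep() estimates the preprocessing steps on the training data;
# bake(new_data = NULL) then returns the processed training set
ranger_recipe %>%
  prep() %>%
  bake(new_data = NULL) %>%
  glimpse()
```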

Next, I will define the model specs for the machine learning model I will be building.

ranger_spec <- 
  rand_forest(trees = 1000) %>% 
  set_mode("classification") %>% 
  set_engine("ranger") 

Next, I will build the workflow for the model building.

ranger_workflow <- 
  workflow() %>% 
  add_recipe(ranger_recipe) %>% 
  add_model(ranger_spec) 

Finally, I will start fitting the model.

ranger_fit <- ranger_workflow %>%
  fit(data = df_train)

Local Interpretable Model-agnostic Explanations (LIME)

Now, we will start using LIME to explain our model predictions!

Create explainer objects

To explain predictions with the LIME method through the DALEXtra package, we first use the explain_tidymodels function to create an explainer object.

ranger_explainer <- explain_tidymodels(ranger_fit,
                   data = select(df_train, -Attrition),
                   y = df_train$Attrition,
                   verbose = FALSE)

According to the documentation, this package also supports models built with other frameworks.

Aside from that, we need the following code to ensure the right explainers are used (Lendway).

model_type.dalex_explainer <- DALEXtra::model_type.dalex_explainer
predict_model.dalex_explainer <- DALEXtra::predict_model.dalex_explainer

Otherwise, the subsequent code will fail to run.

Explaining predictions

In this demonstration, I will attempt to explain the three observations with the highest predicted attrition using the LIME method.

To do so, I will use the predict function to generate predictions from the testing dataset, specifying type = "prob" so that attrition probabilities are returned.

Then, I will select the three data points with the highest predicted attrition using the slice_head function.

top_3_obs <- predict(ranger_fit, 
        df_test, 
        type = "prob") %>%
  bind_cols(df_test) %>%
  arrange(desc(.pred_Yes)) %>%
  slice_head(n = 3)

top_3_obs
# A tibble: 3 x 26
  .pred_No .pred_Yes   Age Attrition BusinessTravel    Department     
     <dbl>     <dbl> <dbl> <chr>     <chr>             <chr>          
1  0.00594     0.994    31 Yes       Travel_Frequently Research & Dev~
2  0.0128      0.987    31 Yes       Travel_Frequently Sales          
3  0.0134      0.987    37 Yes       Travel_Frequently Sales          
# ... with 20 more variables: DistanceFromHome <dbl>,
#   Education <dbl>, EducationField <chr>, EmployeeCount <dbl>,
#   EmployeeID <dbl>, Gender <chr>, JobLevel <dbl>, JobRole <chr>,
#   MaritalStatus <chr>, MonthlyIncome <dbl>,
#   NumCompaniesWorked <dbl>, Over18 <chr>, PercentSalaryHike <dbl>,
#   StandardHours <dbl>, StockOptionLevel <dbl>,
#   TotalWorkingYears <dbl>, TrainingTimesLastYear <dbl>, ...

As the new observation must be in the same format as the original data and cannot contain additional columns, I will drop the two prediction fields.

top_3_obs <- top_3_obs %>%
  select(-c(.pred_No, .pred_Yes))

Okay, let’s start to explain the predictions!

For this, I will use the predict_surrogate function from the DALEXtra package.

lime_rf_top_3 <- predict_surrogate(explainer = ranger_explainer,
                  new_observation = top_3_obs %>% 
                    select(-Attrition),
                  n_features = 8,
                  type = "lime")

Note that we need to remove the target variable from the data when running the predict_surrogate function; otherwise the code will return an error (Lendway).

One cool thing to note is that the output of the plot function is a ggplot object.

This allows us to modify the graph using the various ggplot2 functions.

For example, I would like to change the color to grey and make the bar color a bit more transparent.

To do so, I will add the scale_fill_manual function to the plot object as shown below.

plot(lime_rf_top_3 %>% filter(case == 2)) +
  labs(title = "Before Modification")
plot(lime_rf_top_3 %>% filter(case == 2)) +
  scale_fill_manual(values = alpha(c("grey", "black"), 0.6)) +
  labs(title = "After Modification")

Now, let's start analyzing the results!

As the graph can be very cluttered, I will use a for loop to plot the different observations separately.

for (i in 1:3){
  print(
    plot(lime_rf_top_3 %>% filter(case == i))
    )
}

The graphs above show us the top variables that are important in the local model.

The graph also shows how each variable "contributes" to the prediction: the higher the positive weight, the greater the variable's effect on the prediction.
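The weights shown on the plot can also be inspected directly, since the explanation object is a tibble. The column names below (feature, feature_value, feature_weight) follow the lime package's explanation output.

```r
# Read the per-feature weights for the first observation directly,
# sorted by the magnitude of their contribution
lime_rf_top_3 %>%
  filter(case == 1) %>%
  select(feature, feature_value, feature_weight) %>%
  arrange(desc(abs(feature_weight)))
```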

Aside from the "explanation" breakdown of the three predictions, the graph also contains the predicted output and the explanation fit.

The values under explanation fit indicate how well the LIME method explains the prediction of the relevant data point (Adyatama 2019).

From the graph, we can see that the explanation fit is rather poor, with values ranging between 35% and 55%.

This implies the LIME model is only able to explain about 35% to 55% of the variation in our fitted model's predictions around these points, which is not ideal.
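The explanation fit shown on each facet can also be pulled out of the explanation object itself; in the lime package's output it is stored in the model_r2 column (column name assumed from that output).

```r
# Extract the local surrogate's R-squared (the "explanation fit")
# for each of the three explained observations
lime_rf_top_3 %>%
  select(case, model_r2) %>%
  distinct()
```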

To resolve this, we can pass additional arguments to tune the parameters in lime.

Refer to the lime documentation page for the list of available arguments under lime package.

lime_rf_top_3 <- predict_surrogate(explainer = ranger_explainer,
                  new_observation = top_3_obs %>% 
                    select(-Attrition),
                  n_features = 8,
                  dist_fun = "manhattan",
                  kernel_width = 2,
                  type = "lime")

for (i in 1:3){
  print(
    plot(lime_rf_top_3 %>% filter(case == i))
    )
}

As shown above, the explanation fit values increase after I changed dist_fun to "manhattan" and kernel_width to 2.

From the graph, it seems the staff with higher predicted attrition share the following common characteristics:

Alternatively, the lime package offers an option to plot a condensed overview of all explanations. This helps identify the common features that influence the observations.

plot_explanations(lime_rf_top_3)

Similar to the earlier graphs, we can see that the top three employees are all single and have worked at their current company for less than 3 years.

Conclusion

That’s all for the day!

This post has demonstrated how we can use the lime package to explain model predictions.

Thanks for reading the post until the end.

Feel free to contact me through email or LinkedIn if you have any suggestions on future topics to share.

Refer to this link for the blog disclaimer.

Till next time, happy learning!

Photo by Wagner Soares from Pexels

References

Adyatama, Arga. 2019. "Interpreting Classification Model with LIME." https://algotech.netlify.app/blog/interpreting-classification-model-with-lime/.
Biecek, Przemyslaw, and Tomasz Burzykowski. 2021. Explanatory Model Analysis. Chapman; Hall/CRC, New York. https://pbiecek.github.io/ema/.
Lendway, Lisa. "Interpretable Machine Learning: This Tutorial Focuses on Local Interpretation." https://advanced-ds-in-r.netlify.app/posts/2021-03-31-imllocal/.
Molnar, Christoph. 2022. "9.2 Local Surrogate (LIME)." https://christophm.github.io/interpretable-ml-book/lime.html.
purva91. 2021. "ML Interpretability Using LIME in R." https://www.analyticsvidhya.com/blog/2021/01/ml-interpretability-using-lime-in-r/.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. "'Why Should I Trust You?' Explaining the Predictions of Any Classifier." https://arxiv.org/pdf/1602.04938.pdf.