Explainable and Interpretable AI
Explainable AI models “summarize the reasons […] for [their] behavior […] or produce insights about the causes of their decisions,” whereas Interpretable AI refers to AI systems which “describe the internals of a system in a way which is understandable to humans”
Explainable vs. Interpretable AI
GradCAM
Gradient-weighted Class Activation Mapping
one of the first explainability techniques; it generalizes CAM (Class Activation Mapping), which could only be used with certain model architectures
works by examining the gradient information flowing through the last (or any) convolutional layer of the network
produces a localization heatmap that highlights the image regions most relevant to the predicted class
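A minimal Grad-CAM sketch in PyTorch, to illustrate the gradient-weighting and heatmap steps described above. The ResNet-18 backbone, the choice of layer4, and the random placeholder input are illustrative assumptions, not part of the original notes.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Stand-in backbone; in practice load pretrained weights and a real image.
model = resnet18(weights=None).eval()

activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations["value"] = out.detach()        # feature maps of the hooked conv block

def bwd_hook(module, grad_in, grad_out):
    gradients["value"] = grad_out[0].detach()  # gradients flowing back into that block

# Hook the last convolutional stage (layer4 for ResNet-18).
model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

x = torch.randn(1, 3, 224, 224)                # placeholder input tensor
scores = model(x)
cls = scores.argmax(dim=1).item()
scores[0, cls].backward()                      # gradient of the class score w.r.t. the feature maps

# Global-average-pool the gradients to get one weight per channel, then form a
# weighted sum of the activations, clip negatives, and upsample to image size.
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalized heatmap in [0, 1]
```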
Challenges of GradCAM
GradCAM Advantages and Disadvantages
LIME
Local Interpretable Model-Agnostic Explanations
post hoc, perturbation-based explainability technique
can be used for regression or classification models
LIME works by changing regions of an image, turning them on or off and rerunning inference to see which regions are most influential on the model’s prediction
LIME fits an inherently interpretable model to measure the influence or importance of the input features.
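A from-scratch sketch of that perturbation idea for images (the real lime package implements the same procedure with many more options). The predict_proba function, the segments array of superpixel labels (e.g. from skimage.segmentation.slic), the grey fill value, and the similarity kernel are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_image_sketch(image, segments, predict_proba, target, num_samples=500):
    n_regions = segments.max() + 1
    rng = np.random.default_rng(0)

    # 1. Sample binary masks: each bit switches one superpixel region on or off.
    masks = rng.integers(0, 2, size=(num_samples, n_regions))

    # 2. Re-run the black box on each perturbed image (switched-off regions greyed out).
    perturbed = []
    for mask in masks:
        img = image.copy()
        for region in range(n_regions):
            if mask[region] == 0:
                img[segments == region] = 0.5
        perturbed.append(img)
    probs = predict_proba(np.stack(perturbed))[:, target]

    # 3. Weight samples by similarity to the original (all regions on).
    distances = 1.0 - masks.mean(axis=1)
    sample_weights = np.exp(-(distances ** 2) / 0.25)

    # 4. Fit an inherently interpretable (linear) surrogate locally; its
    #    coefficients rank the regions by influence on the prediction.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, probs, sample_weight=sample_weights)
    return surrogate.coef_
```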
LIME Advantages and Disadvantages
What is Explainable AI and why do we need it?
to reduce the black-box nature of models
to increase trust in AI decisions
to evaluate the model
What is the difference between explainable and interpretable AI?
Interpretability — If a business wants high model transparency and wants to understand exactly why and how the model is generating predictions, they need to observe the inner mechanics of the AI/ML method. This leads to interpreting the model’s weights and features to determine the given output. This is interpretability.
For example, an economist may build a multivariate regression model to predict the inflation rate; they can view the estimated parameters of the model’s variables to measure the expected output given different data examples. In this case, full transparency is given and the economist can answer the exact why and how of the model’s behavior.
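A minimal sketch of that economist example, assuming a tiny hypothetical dataset with made-up indicator columns ("unemployment", "money_supply_growth") and toy numbers; the point is that the fitted coefficients can be read off directly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy macroeconomic indicators (hypothetical values, for illustration only).
X = np.array([[5.1, 2.0], [4.8, 3.5], [6.0, 1.2], [5.5, 4.1]])
y = np.array([2.1, 3.0, 1.5, 3.4])  # inflation rate (%)

model = LinearRegression().fit(X, y)

# Each coefficient is the expected change in the predicted inflation rate per
# unit change of that input, holding the other input fixed: full transparency.
print(dict(zip(["unemployment", "money_supply_growth"], model.coef_)))
print("intercept:", model.intercept_)
```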
However, high interpretability typically comes at the cost of performance. If a company wants to achieve high performance but still wants a general understanding of the model’s behavior, model explainability starts to play a larger role.
Explainability — Explainability is how to take an ML model and explain its behavior in human terms. With complex models (for example, black boxes), you cannot fully understand how and why the inner mechanics impact the prediction. However, through model-agnostic methods (for example, partial dependence plots, SHapley Additive exPlanations (SHAP) dependence plots, or surrogate models) you can discover relationships between input data attributes and model outputs, which enables you to explain the nature and behavior of the AI/ML model.
For example, a news media outlet uses a neural network to assign categories to different articles. The news outlet cannot interpret the model in depth; however, they can use a model-agnostic approach to evaluate the input article data versus the model predictions. With this approach, they find that the model is assigning the Sports category to business articles that mention sports organizations. Although the news outlet did not use model interpretability, they were still able to derive an explainable answer that reveals the model’s behavior.
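A minimal sketch of the model-agnostic surrogate idea behind that example, using synthetic data and a small MLP as a stand-in black box; the dataset, models, and depth limit are illustrative assumptions, not the news outlet’s actual setup.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the article features and category labels.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# "Black box": a neural network whose inner mechanics we do not inspect.
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                          random_state=0).fit(X, y)

# Global surrogate: fit a shallow, interpretable model to mimic the black
# box's own predictions, then read its rules to explain the behavior.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(8)]))
print("fidelity to black box:", surrogate.score(X, black_box.predict(X)))
```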