Measuring and acting on uncertainty in deep neural networks: Selective prediction and calibrating confidence

Authors

Salem, Mahmoud

Publisher

University of Guelph

Abstract

Many machine learning (ML) models suffer from overconfidence and a mismatch between a model’s predicted confidence and the true likelihood of correctness. Outside of the laboratory setting, ML models usually operate within the context of larger systems, where a model’s inability to detect when it is uncertain about its prediction can lead to critical issues. There are different approaches to measuring and acting on uncertainty in predictions when deploying ML models in the real world. This thesis explores two in depth. The first approach is to integrate a reject option into ML models, allowing them to abstain from making predictions when they are uncertain. We adopt the framework known as selective networks. However, optimizing selective networks is challenging due to the non-differentiability of the binary selection function (the discrete decision of whether to predict or abstain). We propose an alternative framework to train selective networks that employs the Gumbel-softmax reparameterization trick to enable selection within an end-to-end differentiable training framework. Our framework showed better performance on different classification and regression benchmarks and resulted in a reduced gap between the selection rates at training and test time. The second approach is to train models to have well-calibrated confidences. Calibrated confidences are more robust in the real world because they are more representative of the true likelihood of correctness. We propose a framework that adapts knowledge distillation for training ML models to have better calibration in long-tailed recognition settings. Our approach achieves state-of-the-art results on various visual recognition benchmarks.
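
To illustrate the relaxed selection step described in the abstract, the following is a minimal NumPy sketch of Gumbel-softmax sampling over a two-way predict/abstain choice. It is not the thesis implementation: the function name `gumbel_softmax_select`, the two-column logit layout (column 0 = abstain, column 1 = predict), and the temperature parameter `tau` are assumptions made here for illustration only.

```python
import numpy as np

def gumbel_softmax_select(logits, tau=1.0, rng=None):
    """Relaxed sample of a predict/abstain decision (illustrative sketch).

    logits: array of shape (batch, 2); by assumption column 0 scores
            "abstain" and column 1 scores "predict".
    tau:    temperature; smaller values push the soft sample toward one-hot.
    Returns (soft, hard): soft has shape (batch, 2) with rows summing to 1,
    hard has shape (batch,) with entries in {0, 1}.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via inverse transform sampling; the lower bound
    # keeps log(u) finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    # Perturb the logits, scale by the temperature, apply a stable softmax.
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)
    soft = np.exp(y) / np.exp(y).sum(axis=-1, keepdims=True)
    # Discrete decision as it would be used at inference time.
    hard = soft.argmax(axis=-1)
    return soft, hard

# Usage: three inputs with hypothetical selection logits.
rng = np.random.default_rng(0)
logits = np.array([[-50.0, 50.0], [0.0, 0.0], [50.0, -50.0]])
soft, hard = gumbel_softmax_select(logits, tau=0.5, rng=rng)
```

In an automatic-differentiation framework, gradients would flow through the soft sample while the hard decision is used in the forward pass (the straight-through variant); PyTorch, for example, provides `torch.nn.functional.gumbel_softmax` with a `hard` flag for this purpose.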

Keywords

model calibration, knowledge distillation, long tail distribution, selective networks

Citation

Salem, M., Ahmed, M. O., Tung, F., & Oliveira, G. (2022). Gumbel-Softmax Selective Networks. https://doi.org/10.48550/arXiv.2211.10564