U-CE: Uncertainty-aware Cross-Entropy for Semantic Segmentation

Machine Vision Metrology (MVM)
Institute of Photogrammetry and Remote Sensing (IPF)
Karlsruhe Institute of Technology (KIT)
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences [2024]


Abstract

Deep neural networks have shown exceptional performance in various tasks, but their lack of robustness, reliability, and tendency to be overconfident pose challenges for their deployment in safety-critical applications like autonomous driving. In this regard, quantifying the uncertainty inherent to a model's prediction is a promising endeavour to address these shortcomings. In this work, we present a novel Uncertainty-aware Cross-Entropy loss (U-CE) that incorporates dynamic predictive uncertainties into the training process by pixel-wise weighting of the well-known cross-entropy loss (CE). Through extensive experimentation, we demonstrate the superiority of U-CE over regular CE training on two benchmark datasets, Cityscapes and ACDC, using two common backbone architectures, ResNet-18 and ResNet-101. With U-CE, we manage to train models that not only improve their segmentation performance but also provide meaningful uncertainties after training. Consequently, we contribute to the development of more robust and reliable segmentation models, ultimately advancing the state-of-the-art in safety-critical applications and beyond.

Methodology

U-CE Method Overview

A schematic overview of the U-CE training process. U-CE integrates the predictive uncertainties of a Monte Carlo Dropout (MC-Dropout) model into the training process to enhance segmentation performance. In contrast to most applications of MC-Dropout, U-CE utilizes the uncertainties not only at test time but also dynamically during training, applying pixel-wise weights to the regular cross-entropy loss.
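The core mechanism can be sketched as follows: several stochastic forward passes with dropout enabled yield per-pixel class probabilities, their predictive entropy serves as the uncertainty map, and that map scales the per-pixel cross-entropy terms. Note that the weighting scheme below (1 plus the normalized entropy) is an illustrative assumption for this sketch, not necessarily the exact formulation used in the paper.

```python
import numpy as np

def predictive_entropy(mc_probs):
    """Per-pixel predictive entropy from T MC-Dropout softmax samples.

    mc_probs: array of shape (T, C, H, W) holding class probabilities
    from T stochastic forward passes.
    Returns an (H, W) uncertainty map.
    """
    mean_probs = mc_probs.mean(axis=0)  # (C, H, W)
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=0)

def u_ce_loss(mc_probs, labels, num_classes):
    """Uncertainty-weighted cross-entropy (illustrative weighting).

    labels: (H, W) integer class map.
    The weight "1 + normalized entropy" is an assumption of this
    sketch; higher-uncertainty pixels contribute more to the loss.
    """
    mean_probs = mc_probs.mean(axis=0)
    unc = predictive_entropy(mc_probs) / np.log(num_classes)  # scale to [0, 1]
    weights = 1.0 + unc
    h, w = labels.shape
    # Probability assigned to the ground-truth class at each pixel.
    p_true = mean_probs[labels, np.arange(h)[:, None], np.arange(w)[None, :]]
    ce = -np.log(p_true + 1e-12)
    return (weights * ce).mean()
```

In a real training loop the same weighting would be applied to the differentiable cross-entropy of a framework such as PyTorch, with the uncertainty map treated as a constant (detached) factor.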

Quantitative Results

A detailed quantitative comparison between regular CE and U-CE on the Cityscapes dataset using a dropout ratio of 20%. The reported numbers are the mean Intersection over Union (mIoU, higher is better ↑), the Expected Calibration Error (ECE, lower is better ↓), and the mean predictive uncertainty (mUnc).
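For reference, the ECE reported above is the standard binned calibration error: predictions are grouped into confidence bins, and the gaps between each bin's accuracy and its mean confidence are averaged, weighted by bin size. A minimal sketch (bin count and binning by maximum softmax probability are conventional choices, assumed here):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: size-weighted average |accuracy - confidence| gap.

    confidences: 1-D array of per-pixel maximum softmax probabilities.
    correct: 1-D boolean array, True where the prediction matches the label.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += (mask.sum() / n) * gap
    return ece
```

A perfectly calibrated model (e.g. 75% accuracy at 0.75 confidence) yields an ECE of zero; overconfident models, which U-CE aims to mitigate, yield larger values.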

Qualitative Results

Qualitative Examples

Example images from the Cityscapes and ACDC validation sets (a), corresponding ground truth labels (b), the model's segmentation predictions (c), a binary accuracy map (d), and the predictive uncertainty (e). White pixels in the binary accuracy map are either incorrect predictions or void classes, which appear black in the ground truth label. For the uncertainty prediction, brighter pixels represent higher predictive uncertainties. The first three rows depict results from models with a ResNet-18 backbone and a dropout ratio of 20%, trained for 200 epochs on Cityscapes. The last three rows show examples from models with a ResNet-101 backbone and a dropout ratio of 20%, trained for 500 epochs on the ACDC dataset.
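The binary accuracy map in panel (d) can be derived directly from the prediction and the label, marking as "white" any pixel that is misclassified or belongs to the void class. A small sketch (the void label value 255 follows the common Cityscapes ignore-label convention and is an assumption here):

```python
import numpy as np

def accuracy_map(pred, label, void_class=255):
    """Binary accuracy map as in panel (d).

    True (rendered white) marks pixels that are either misclassified
    or belong to the assumed void/ignore class.
    """
    return (pred != label) | (label == void_class)
```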

Conclusion

In this paper, we introduced U-CE, a novel uncertainty-aware cross-entropy loss for semantic segmentation. U-CE incorporates predictive uncertainties, based on Monte Carlo Dropout, into the training process through pixel-wise weighting of the regular cross-entropy loss. As a result, we manage to train models that are naturally capable of predicting meaningful uncertainties after training while simultaneously improving their segmentation performance. Through extensive experimentation on the Cityscapes and ACDC datasets using ResNet-18 and ResNet-101 architectures, we demonstrated the superiority of U-CE over regular cross-entropy training. We hope that U-CE and our thorough discussion of potential limitations and future work contribute to the development of more robust and trustworthy segmentation models, ultimately advancing the state-of-the-art in safety-critical applications and beyond.

BibTeX

@article{landgraf2023u,
  title={U-CE: Uncertainty-aware cross-entropy for semantic segmentation},
  author={Landgraf, Steven and Hillemann, Markus and Wursthorn, Kira and Ulrich, Markus},
  journal={arXiv preprint arXiv:2307.09947},
  year={2023}
}