Motivation

In many online decision-making settings, well-calibrated predictions are crucial for the safe operation of systems. One way to achieve calibration is through adaptive risk control, which adjusts the uncertainty estimates of a machine learning model based on past feedback [1]. This method guarantees that the calibration error is controlled over an arbitrary sequence and that, in the long run, the model becomes statistically well-calibrated if the data points are independent and identically distributed [2]. However, these schemes only ensure calibration on average across the entire input space, which raises concerns about fairness and robustness. For instance, consider the figure below, which depicts a tumor segmentation model calibrated to identify potentially cancerous areas. If the model is calibrated using images from different datasets, marginal calibration may be achieved by prioritizing certain subpopulations at the expense of others.

A tumor segmentation model is calibrated using data from two sources to ensure that the marginal false negative rate (FNR) is controlled. However, as shown on the right, the error rate for one source is significantly lower than for the other, leading to unfair performance across subpopulations.
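
The feedback loop behind adaptive risk control can be sketched in a few lines. The following is a minimal illustration, assuming a single scalar threshold q applied to a score in [0, 1], a target miss rate alpha, and a fixed step size eta. The function name, parameter values, and synthetic scores are assumptions made for this example and are not taken from [1] or [2].

```python
import random


def adaptive_risk_control(scores, alpha=0.1, eta=0.05, q0=0.5):
    """Run the online update q <- q + eta * (err - alpha) over a score sequence.

    A miss (err = 1) occurs when the observed score exceeds the current
    threshold. Misses push the threshold up, while correct rounds pull it
    down slightly, so the long-run miss rate is driven toward alpha.
    """
    q = q0
    errors = []
    for s in scores:
        err = 1.0 if s > q else 0.0  # feedback observed after committing to q
        errors.append(err)
        q += eta * (err - alpha)     # single global threshold: no dependence on the input
    return q, errors


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic scores in [0, 1); purely illustrative.
    scores = [rng.random() for _ in range(5000)]
    q, errors = adaptive_risk_control(scores)
    print("final threshold:", q)
    print("empirical miss rate:", sum(errors) / len(errors))
```

Because the threshold is a single number shared by all inputs, the update can only control the miss rate averaged over the whole sequence. Nothing prevents the errors from concentrating on one data source, which is the behavior illustrated in the figure.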

Localized Adaptive Risk Control

To address this issue, our recent work, presented at NeurIPS 2024, proposes a method to localize uncertainty estimates by leveraging the connection between online learning in reproducing kernel Hilbert spaces [3] and online calibration methods. The key idea behind our approach is to use feedback to adjust a model’s confidence levels only in regions of the input space that are near observed data points. This allows for localized calibration, tailoring uncertainty estimates to specific areas of the input space. We demonstrate that, for adversarial sequences, the number of mistakes can be controlled. More importantly, the scheme provides asymptotic guarantees that are localized, meaning they remain valid under a wide range of covariate shifts, for instance those induced by restricting attention to specific subpopulations of the data.
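
The idea of adjusting confidence levels only near observed data points can be illustrated with a kernel-based threshold function. The sketch below is written in the spirit of online learning with kernels [3] and is not the exact algorithm from the paper: the class name, the RBF kernel, its length scale, the decay factor lam, and the fixed baseline c0 are all assumptions made for illustration.

```python
import math
import random


def rbf(x, z, length=0.1):
    """Gaussian (RBF) kernel on scalar inputs; equals 1 when x == z."""
    return math.exp(-((x - z) ** 2) / (2 * length ** 2))


class LocalizedThreshold:
    """Threshold g(x) = c0 + sum_i a_i * k(x_i, x), updated from online feedback."""

    def __init__(self, alpha=0.1, eta=0.05, lam=1e-3, c0=0.5, length=0.1):
        self.alpha, self.eta, self.lam = alpha, eta, lam
        self.c0, self.length = c0, length
        self.centers = []  # inputs observed so far
        self.coefs = []    # one kernel coefficient per observed input

    def __call__(self, x):
        # Cost grows linearly with the number of stored points (no truncation here).
        local = sum(a * rbf(x, z, self.length)
                    for a, z in zip(self.coefs, self.centers))
        return self.c0 + local

    def update(self, x, score):
        err = 1.0 if score > self(x) else 0.0
        # Shrink past coefficients slightly (regularization, assumed here).
        self.coefs = [(1.0 - self.eta * self.lam) * a for a in self.coefs]
        # The correction is a kernel bump centered at x, so it fades away from x.
        self.centers.append(x)
        self.coefs.append(self.eta * (err - self.alpha))
        return err


if __name__ == "__main__":
    rng = random.Random(0)
    g = LocalizedThreshold()
    for t in range(400):
        # Two synthetic subpopulations with different score ranges (illustrative only).
        if t % 2 == 0:
            x, s = 0.2, rng.uniform(0.0, 0.5)
        else:
            x, s = 0.8, rng.uniform(0.5, 1.0)
        g.update(x, s)
    print("threshold near x=0.2:", g(0.2))
    print("threshold near x=0.8:", g(0.8))
```

In this toy setup the two subpopulations need different thresholds to reach the same miss rate. A single global threshold would settle on a compromise between them, whereas the kernel-based threshold adapts separately around each input region.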

Experiments

Comparison between the coverage map obtained using adaptive risk control (on the left) and localized adaptive risk control (on the right). Adaptive risk control is unable to deliver uniform coverage across the deployment areas, leading to large regions where the SNR level is unsatisfactory. In contrast, localized adaptive risk control is capable of guaranteeing a more uniform SNR level, improving the overall system coverage.

To demonstrate the fairness improvements of our algorithm, we conducted a series of experiments using standard machine learning benchmarks as well as wireless communication problems. Specifically, in the wireless domain, we considered the problem of beam selection based on contextual information. Here, a base station must select a subset of communication beam vectors to guarantee a target signal-to-noise ratio (SNR) level across a deployment area. Standard calibration methods like adaptive risk control (on the left) result in substantial SNR variation across the area, creating regions where communication is impossible. In contrast, our localized adaptive risk control scheme (on the right) enables the base station to calibrate the beam selection algorithm to match the local uncertainty, providing more uniform coverage throughout the deployment area.

References

[1] Isaac Gibbs and Emmanuel Candes. Adaptive conformal inference under distribution shift. Advances in Neural Information Processing Systems, 34 (2021).

[2] Anastasios Nikolas Angelopoulos, Rina Barber and Stephen Bates. Online conformal prediction with decaying step sizes. Proceedings of the 41st International Conference on Machine Learning (2024).

[3] Jyrki Kivinen, Alex Smola and Robert C. Williamson. Online learning with kernels. Advances in Neural Information Processing Systems, 14 (2001).