{"id":871,"date":"2024-06-11T10:24:06","date_gmt":"2024-06-11T10:24:06","guid":{"rendered":"https:\/\/blogs.kcl.ac.uk\/kclip\/?p=871"},"modified":"2024-06-11T10:24:47","modified_gmt":"2024-06-11T10:24:47","slug":"cross-validation-conformal-risk-control","status":"publish","type":"post","link":"https:\/\/blogs.kcl.ac.uk\/kclip\/2024\/06\/11\/cross-validation-conformal-risk-control\/","title":{"rendered":"Cross-Validation Conformal Risk Control"},"content":{"rendered":"<h1>Motivation<\/h1>\n<p>Conformal risk control (CRC) [1] [2] is a recently proposed technique that is applied post-hoc to a conventional point predictor in order to provide calibration guarantees. Generalizing conformal prediction (CP) [3], CRC ensures calibration for a set predictor that is extracted from the point predictor so as to control a risk function, such as the probability of miscoverage or the false negative rate. The original CRC requires the available data set to be split into training and validation data sets. This can be problematic when data availability is limited, resulting in inefficient set predictors. In [4], a novel CRC method is introduced that is based on cross-validation, rather than on validation as in the original CRC. 
The proposed cross-validation CRC (CV-CRC) allows for the control of a broader range of risk functions; it is proved to offer theoretical guarantees on the average risk of the set predictor, while yielding a reduced average set size with respect to CRC when the available data are limited.<\/p>\n<h1>Cross-Validation Conformal Risk Control<\/h1>\n<p>The objective of CRC is to design a set predictor with a mean risk no larger than a predefined level \u03b1, i.e.,<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-873 aligncenter\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_eq_1.png\" alt=\"\" width=\"578\" height=\"85\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_eq_1.png 578w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_eq_1-300x44.png 300w\" sizes=\"auto, (max-width: 578px) 100vw, 578px\" \/><\/p>\n<p>where (x,y) is the test input-label pair and D is a set of N data pairs.<\/p>\n<p>The risk is defined between the true label y and a predictive set \u0393 of labels.<\/p>\n<p>VB-CRC generalizes VB-CP [3] in the sense that it allows the risk to take an arbitrary form, under technical conditions such as boundedness and monotonicity in the set. VB-CRC reduces to VB-CP in the special case of the miscoverage risk<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-874 aligncenter\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_eq_2.png\" alt=\"\" width=\"223\" height=\"61\" \/><\/p>\n<p>In this work, we introduce CV-CRC, a cross-validation-based version of VB-CRC. In the same manner that CV-CP [5] generalizes VB-CP, CV-CRC generalizes VB-CRC. See Fig. 
1 for illustration.<\/p>\n<div id=\"attachment_875\" style=\"width: 957px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-875\" class=\"wp-image-875 size-full\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_1.png\" alt=\"\" width=\"947\" height=\"570\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_1.png 947w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_1-300x181.png 300w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_1-768x462.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_1-676x407.png 676w\" sizes=\"auto, (max-width: 947px) 100vw, 947px\" \/><p id=\"caption-attachment-875\" class=\"wp-caption-text\">Fig. 1. (top) validation-based CRC (bottom) the proposed method, CV-CRC.<\/p><\/div>\n<p>In the top panel of Fig. 2, VB-CRC is illustrated: the available data are split into training data and validation data. The former is used to train a model, while the latter is used to post-process the model and control a threshold \u03bb. Upon observing a test input x, a predictive set \u0393 of labels y\u2019s is formed. In the bottom panel, CV-CRC is illustrated as a generalization: the available data are split into K\u2264N folds, and K leave-fold-out models are trained. 
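As a concrete illustration of the fold-based construction just described, the following is a minimal sketch of a CV-CRC-style predictor for the miscoverage special case (a jackknife+/CV+-flavoured construction). The Gaussian toy data and the 1-nearest-neighbour "models" are assumptions made only to keep the example self-contained and runnable; they are not the setup of [4]:

```python
# Toy sketch of a cross-validation conformal predictor for the
# miscoverage special case. The 1-NN "model" and Gaussian data are
# placeholder assumptions, not the paper's experimental setup.
import numpy as np

rng = np.random.default_rng(0)
N, K, alpha = 40, 8, 0.1

# Synthetic scalar regression data: y = 2x + noise.
X = rng.normal(size=N)
Y = 2.0 * X + 0.1 * rng.normal(size=N)

# Split the available data into K <= N folds.
folds = np.array_split(rng.permutation(N), K)

def predict(train_idx, x):
    # Leave-fold-out "model": 1-nearest-neighbour over the training subset
    # (a stand-in for any learning algorithm).
    j = train_idx[np.argmin(np.abs(X[train_idx] - x))]
    return Y[j]

# Score each held-out point under the model that never saw its fold.
scores = np.empty(N)
for k, fold in enumerate(folds):
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    for i in fold:
        scores[i] = abs(Y[i] - predict(train_idx, X[i]))

# Threshold: (1 - alpha) empirical quantile with the +1 conformal correction.
lam = np.sort(scores)[int(np.ceil((1 - alpha) * (N + 1))) - 1]

# Merged prediction set for a test input: union of the K per-fold intervals.
x_test = 0.3
mus = []
for k, fold in enumerate(folds):
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    mus.append(predict(train_idx, x_test))
interval = (min(mus) - lam, max(mus) + lam)
print(interval)
```

Because every one of the N points is scored under a model that did not see its fold, all N points contribute to calibrating the threshold \u03bb; this reuse of data is what makes the cross-validation variant more data-efficient than a single training/validation split.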
Then, K predictive sets are formed and merged via a threshold that is set using the trained models and the held-out folds.<\/p>\n<div id=\"attachment_876\" style=\"width: 957px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-876\" class=\"size-full wp-image-876\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_2.png\" alt=\"\" width=\"947\" height=\"1101\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_2.png 947w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_2-258x300.png 258w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_2-881x1024.png 881w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_2-768x893.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_2-676x786.png 676w\" sizes=\"auto, (max-width: 947px) 100vw, 947px\" \/><p id=\"caption-attachment-876\" class=\"wp-caption-text\">Fig. 2. (top) validation-based CRC (bottom) the proposed method, CV-CRC.<\/p><\/div>\n<h1>Experiments<\/h1>\n<p>To illustrate the main theorem, namely that the risk guarantee (1) is met while the average set size is reduced, two experiments were conducted. The first is a vector regression problem using maximum-likelihood learning, shown in Fig. 
3.<\/p>\n<div id=\"attachment_877\" style=\"width: 1505px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-877\" class=\"size-full wp-image-877\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_3.png\" alt=\"\" width=\"1495\" height=\"549\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_3.png 1495w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_3-300x110.png 300w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_3-1024x376.png 1024w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_3-768x282.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_3-676x248.png 676w\" sizes=\"auto, (max-width: 1495px) 100vw, 1495px\" \/><p id=\"caption-attachment-877\" class=\"wp-caption-text\">Fig. 3. VB-CRC and CV-CRC for the vector regression problem.<\/p><\/div>\n<p>The second problem is temporal point process prediction, in which a set predictor aims to produce sets that contain the future events of a temporal process with a false negative rate of no more than a predefined \u03b1. 
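The generic threshold selection behind controlling a bounded, monotone risk such as the false negative rate (FNR) can be sketched as follows. The synthetic per-event confidence scores are a stand-in assumption; this illustrates only the CRC thresholding rule of [1], not the point-process predictor used in the experiment:

```python
# Toy sketch of the generic CRC threshold rule for a bounded, monotone
# loss such as the false negative rate. The per-event confidence scores
# are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
N, alpha, B = 50, 0.2, 1.0   # B = upper bound on the loss

# Each calibration example has 10 true events with model confidences in
# [0,1]; the predicted set keeps events with confidence >= 1 - lam.
conf = rng.uniform(size=(N, 10))

def fnr(lam):
    # Empirical mean false negative rate: fraction of true events missed.
    # Nonincreasing in lam: larger lam -> larger sets -> fewer misses.
    return np.mean(conf < 1.0 - lam)

# CRC rule: smallest lam whose corrected empirical risk stays below alpha,
# i.e. (N * R_hat(lam) + B) / (N + 1) <= alpha.
grid = np.linspace(0.0, 1.0, 1001)
lam_hat = next(l for l in grid if (N * fnr(l) + B) / (N + 1) <= alpha)
print(lam_hat)
```

The +B/(N+1) correction plays the same role as the +1 quantile correction in conformal prediction; the cross-validation variant replaces the single validation-set risk estimate with a leave-fold-out one.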
As can be seen, in both problems CV-CRC is more data-efficient in the small-data regime, while satisfying the risk condition (1).<\/p>\n<div id=\"attachment_878\" style=\"width: 834px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-878\" class=\"size-full wp-image-878\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_4.png\" alt=\"\" width=\"824\" height=\"885\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_4.png 824w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_4-279x300.png 279w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_4-768x825.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/cvcrc_Fig_4-676x726.png 676w\" sizes=\"auto, (max-width: 824px) 100vw, 824px\" \/><p id=\"caption-attachment-878\" class=\"wp-caption-text\">Fig. 4. VB-CRC and CV-CRC for the temporal point process prediction problem.<\/p><\/div>\n<p>Full details can be found in the <a href=\"https:\/\/arxiv.org\/abs\/2401.11974\"><u>ISIT preprint<\/u><\/a> [4].<\/p>\n<h1>References<\/h1>\n<p>[1] A. N. Angelopoulos, S. Bates, A. Fisch, L. Lei, and T. Schuster, \u201cConformal Risk Control,\u201d in The Twelfth International Conference on Learning Representations, 2024.<\/p>\n<p>[2] S. Feldman, L. Ringel, S. Bates, and Y. Romano, \u201cAchieving Risk Control in Online Learning Settings,\u201d Transactions on Machine Learning Research, 2023.<\/p>\n<p>[3] V. Vovk, A. Gammerman, and G. Shafer, Algorithmic Learning in a Random World. Springer, New York, 2005.<\/p>\n<p>[4] K. M. Cohen, S. Park, O. Simeone, and S. Shamai Shitz, \u201cCross-Validation Conformal Risk Control,\u201d accepted to IEEE International Symposium on Information Theory (ISIT 2024), July 2024.<\/p>\n<p>[5] R. F. Barber, E. J. Candes, A. Ramdas, and R. J. 
Tibshirani, \u201cPredictive Inference with the Jackknife+,\u201d The Annals of Statistics, vol. 49, no. 1, pp. 486\u2013507, 2021.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Motivation Conformal risk control (CRC) [1] [2] is a recently proposed technique that applies post-hoc to a conventional point predictor to provide calibration guarantees. Generalizing conformal prediction (CP) [3], with CRC, calibration is ensured for a set predictor that is extracted from the point predictor to control a risk function such as the probability of [&hellip;]<\/p>\n","protected":false},"author":1227,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-871","post","type-post","status-publish","format-standard","hentry","category-uncategorized","post-preview"],"_links":{"self":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts\/871","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/users\/1227"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/comments?post=871"}],"version-history":[{"count":4,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts\/871\/revisions"}],"predecessor-version":[{"id":881,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts\/871\/revisions\/881"}],"wp:attachment":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/media?parent=871"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/categories?post=871"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/tags?post=871"}],"curies":[{"name"
:"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}