{"id":883,"date":"2024-06-28T16:56:08","date_gmt":"2024-06-28T16:56:08","guid":{"rendered":"https:\/\/blogs.kcl.ac.uk\/kclip\/?p=883"},"modified":"2024-06-28T16:56:08","modified_gmt":"2024-06-28T16:56:08","slug":"bayesian-optimization-with-formal-safety-guarantees-via-online-conformal-prediction","status":"publish","type":"post","link":"https:\/\/blogs.kcl.ac.uk\/kclip\/2024\/06\/28\/bayesian-optimization-with-formal-safety-guarantees-via-online-conformal-prediction\/","title":{"rendered":"Bayesian Optimization with Formal Safety Guarantees via Online Conformal Prediction"},"content":{"rendered":"<h1>Motivation<\/h1>\n<p>In the general formulation of black-box optimization problems, a designer sequentially attempts candidate solutions, receiving noisy feedback on the value of each attempt from the system. As illustrated in Fig. 1, we consider scenarios in which feedback is also provided on the <em>safety<\/em> of the attempted solution, and the optimizer is constrained to limit the number of unsafe solutions that are tried throughout the optimization process [1] [2]. Focusing on methods based on Bayesian optimization (BO), prior works provide a safety guarantee that <em>any<\/em> unsafe solution is excluded with a controllable probability with respect to feedback noise. This theoretical guarantee is, however, only valid if the optimizer has access to information about the constraint function, e.g., a reproducing kernel Hilbert space (RKHS) norm bound on the constraint function. 
In practice, specifying such information may be difficult, since the constraint function is a priori unknown.<\/p>\n<div id=\"attachment_885\" style=\"width: 516px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-885\" class=\"wp-image-885\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/intro_flow-890x1024.png\" alt=\"\" width=\"506\" height=\"582\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/intro_flow-890x1024.png 890w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/intro_flow-261x300.png 261w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/intro_flow-768x884.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/intro_flow-676x778.png 676w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/intro_flow.png 1024w\" sizes=\"auto, (max-width: 506px) 100vw, 506px\" \/><p id=\"caption-attachment-885\" class=\"wp-caption-text\">Fig. 1. Illustration of black-box optimization with safety constraints. We provide a formal safety guarantee that keeps the fraction of unsafe solutions attempted during the optimization process below a tolerated threshold.<\/p><\/div>\n<p>&nbsp;<\/p>\n<h1>Safe-BO via Online Conformal Prediction<\/h1>\n<p>In our recent work, to appear in the IEEE Journal of Selected Topics in Signal Processing, we study, for the first time, the use of online conformal prediction (CP) to provide <em>assumption-free<\/em> guarantees on the safety level of the attempted candidate solutions, while supporting any non-zero target safety violation level. As shown in Fig. 2, we introduce Safe-BOCP, which models the objective function and the constraint function using independent Gaussian processes (GPs) as surrogate models, and adaptively calibrates the credible intervals used to construct the safe sets based on the observation history via online CP [3] [4]. 
The key mechanism is to use safety feedback on the reliability of past decisions, in the form of a well-designed safety error signal, to adjust the post-processing of the probabilistic surrogate model&#8217;s outputs. In contrast to previous safe BO methods, which assume RKHS properties of the constraint function to ensure a strict safety guarantee, Safe-BOCP adopts a &#8220;caution-increasing&#8221; back-off strategy that compensates for the uncertainty on the boundaries of the safe regions without any assumptions.<\/p>\n<div id=\"attachment_886\" style=\"width: 586px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-886\" class=\"wp-image-886\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/general_safebo-300x188.png\" alt=\"\" width=\"576\" height=\"360\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/general_safebo-300x188.png 300w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/general_safebo-1024x640.png 1024w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/general_safebo-768x480.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/general_safebo-676x423.png 676w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/general_safebo.png 1190w\" sizes=\"auto, (max-width: 576px) 100vw, 576px\" \/><p id=\"caption-attachment-886\" class=\"wp-caption-text\">Fig. 2. Block diagram of the main steps: safe set creation, which produces the safe set, and acquisition, which selects the next iterate.<\/p><\/div>\n<p>&nbsp;<\/p>\n<h1>Experiments<\/h1>\n<p>We compare Safe-BOCP with the state-of-the-art SAFEOPT on a safe movie recommendation problem and a plug flow reactor (PFR) optimization problem. Fig. 3 plots the histograms of the ratings across all selected movies during the optimization procedure with varying target violation rates, showing that SAFEOPT does not meet the safety requirement (red dashed line) while D-SAFE-BOCP can correctly control the fraction of unsafe movies. 
As shown in Fig. 4, P-SAFE-BOCP is seen to meet the target reliability level irrespective of observation noise power, while SAFEOPT can only achieve it when the observation noise power is sufficiently large.<\/p>\n<div id=\"attachment_887\" style=\"width: 586px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-887\" class=\"wp-image-887\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison-300x214.png\" alt=\"\" width=\"576\" height=\"412\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison-300x214.png 300w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison-1024x732.png 1024w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison-768x549.png 768w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison-1536x1098.png 1536w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison-676x483.png 676w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/hist_comparison.png 1605w\" sizes=\"auto, (max-width: 576px) 100vw, 576px\" \/><p id=\"caption-attachment-887\" class=\"wp-caption-text\">Fig. 3. 
Histograms of the ratings of movies recommended by SAFEOPT, as well as by D-SAFE-BOCP, under different target violation rates.<\/p><\/div>\n<div id=\"attachment_888\" style=\"width: 586px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-888\" class=\"wp-image-888\" src=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/pfr_compare-204x300.png\" alt=\"\" width=\"576\" height=\"849\" srcset=\"https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/pfr_compare-204x300.png 204w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/pfr_compare-695x1024.png 695w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/pfr_compare-676x996.png 676w, https:\/\/blogs.kcl.ac.uk\/kclip\/files\/2024\/06\/pfr_compare.png 750w\" sizes=\"auto, (max-width: 576px) 100vw, 576px\" \/><p id=\"caption-attachment-888\" class=\"wp-caption-text\">Fig. 4. Probability of excessive violation rate (top) and optimality ratio (bottom) as a function of constraint observation noise power.<\/p><\/div>\n<p>&nbsp;<\/p>\n<h1>References<\/h1>\n<p>[1] Y. Sui, A. Gotovos, J. Burdick, and A. Krause, \u201cSafe exploration for optimization with Gaussian processes,\u201d in <em>Proceedings of International Conference on Machine Learning<\/em>, Lille, France, 2015.<br \/>[2] F. Berkenkamp, A. Krause, and A. P. 
Schoellig, \u201cBayesian optimization with safety constraints: Safe and automatic parameter tuning in robotics,\u201d <em>Machine Learning<\/em>, pp. 1\u201335, 2021.<br \/>[3] I. Gibbs and E. Candes, \u201cAdaptive conformal inference under distribution shift,\u201d in <em>Proceedings of Advances in Neural Information Processing Systems<\/em>, Virtual, 2021.<br \/>[4] S. Feldman, L. Ringel, S. Bates, and Y. Romano, \u201cAchieving risk control in online learning settings,\u201d <em>Transactions on Machine Learning Research<\/em>, 2023.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Motivation In the general formulation of black-box optimization problems, a designer sequentially attempts candidate solutions, receiving noisy feedback on the value of each attempt from the system. As illustrated in Fig. 
1, we consider scenarios in which feedback is also provided on the safety of the attempted solution, and the optimizer is constrained to limit [&hellip;]<\/p>\n","protected":false},"author":1313,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[33,28,15],"tags":[],"class_list":["post-883","post","type-post","status-publish","format-standard","hentry","category-bayesian-optimization","category-conformal-prediction","category-machine-learning","post-preview"],"_links":{"self":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts\/883","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/users\/1313"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/comments?post=883"}],"version-history":[{"count":9,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts\/883\/revisions"}],"predecessor-version":[{"id":896,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/posts\/883\/revisions\/896"}],"wp:attachment":[{"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/media?parent=883"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/categories?post=883"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/kclip\/wp-json\/wp\/v2\/tags?post=883"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}