Created At: [[2024-11-08]]

## Problem Setup

- Focus on maximising $f$; minimising $f$ is equivalent to maximising $-f$.
- We only observe input-output pairs: $x \rightarrow f \rightarrow f(x)$.
- Low RKHS norm => easier to optimise, fewer sharp points, not robust (continuous, smooth, bounded derivative)
- High RKHS norm => sharper
- [[Dirac Delta|Dirac Delta Function]]: not a well-defined function
- Noisy observation: $y = f(x) + \epsilon$, i.e. querying the same $x$ at different times may give different $y$ values; $\epsilon$ follows a Gaussian distribution.
- In addition, $f$ is expensive to query.

### Gaussian Process

- Defined by a mean function $\mu(\cdot)$ and a covariance (kernel) function $k(\cdot, \cdot)$.
- Common choice of [[Bayes Theorem|Prior]]: $\mathcal{GP}(0, k)$, i.e. zero mean with kernel $k$.
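
A minimal sketch of the setup above, under assumptions that are not in the note: the kernel is taken to be squared-exponential (RBF), and the names `rbf_kernel`, `length_scale`, `noise_std`, and the grid `x` are illustrative only. It draws one function $f$ from the zero-mean prior $\mathcal{GP}(0, k)$ on a grid, then queries it twice with Gaussian noise to show that the same $x$ can give different $y$.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel k(a, b). The kernel choice is an assumption."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)


rng = np.random.default_rng(0)

# Grid of inputs x at which the prior sample is evaluated.
x = np.linspace(0.0, 5.0, 50)

# Covariance matrix K[i, j] = k(x_i, x_j) for the prior GP(0, k).
K = rbf_kernel(x, x)

# Sample f ~ N(0, K) via a Cholesky factor; the small jitter on the
# diagonal is only there for numerical stability.
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))
f = L @ rng.standard_normal(len(x))

# Noisy observations y = f(x) + eps with eps ~ N(0, noise_std^2).
# Two queries at the same inputs give different values.
noise_std = 0.1
y_first = f + noise_std * rng.standard_normal(len(x))
y_second = f + noise_std * rng.standard_normal(len(x))

# Maximising f is the same problem as minimising -f.
best_index = int(np.argmax(f))
x_best = x[best_index]
```

In this sketch a larger `length_scale` gives smoother samples and a smaller one gives sharper samples, which is one informal way to picture the smooth versus sharp distinction in the list above.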