Created At: [[2024-11-08]]
## Problem Setup
- We focus on maximising $f$; minimising $f$ is equivalent to maximising $-f$.
- Query model: $x \rightarrow f \rightarrow f(x)$, i.e. we pass in an input $x$ and get back the value $f(x)$.
- Low RKHS norm => easier to optimise, fewer sharp points, not robust (continuous, smooth, bounded derivative)
- High RKHS norm => sharper function (more sharp points)
- [[Dirac Delta|Dirac Delta Function]]: not a well-defined function in the ordinary sense (it is a distribution); the limiting case of an infinitely sharp peak
- Noisy observation: $y = f(x) + \epsilon$, i.e. querying the same $x$ at different times may return different values of $y$; the noise $\epsilon$ follows a Gaussian distribution.
- In addition, $f$ is expensive to query.
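
The setup above can be sketched in a few lines of Python. This is only an illustration: the objective `f`, the helper `observe`, the zero-mean noise, and the value `noise_std=0.1` are all assumptions made for the example, not part of the notes.

```python
import random

def f(x):
    """Hypothetical objective; stands in for an expensive function."""
    return -(x - 0.3) ** 2

def observe(x, noise_std=0.1, rng=random):
    """Noisy observation y = f(x) + eps, with eps assumed ~ N(0, noise_std^2)."""
    return f(x) + rng.gauss(0.0, noise_std)

rng = random.Random(0)
# Querying the same x five times gives five different y values.
ys = [observe(0.3, rng=rng) for _ in range(5)]

# Minimising g is the same as maximising -g: both select the same x.
def g(x):
    return (x - 0.3) ** 2

xs = [i / 100 for i in range(101)]
x_min = min(xs, key=g)                # argmin of g
x_max = max(xs, key=lambda x: -g(x))  # argmax of -g
```

In practice each call to `observe` would be costly, which is why the number of queries matters.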
### Gaussian Process
- Defined by a mean function $\mu(\cdot)$ and a covariance (kernel) function $k(\cdot, \cdot)$.
- Common choice of [[Bayes Theorem|Prior]]: $\mathcal{GP}(0, k)$, i.e. zero mean with kernel $k$.
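
As a rough illustration of a zero-mean prior with kernel $k$, the sketch below draws a few sample functions on a grid using NumPy. The squared-exponential kernel, the `lengthscale` value, and the jitter term are assumptions chosen for the example; the notes do not fix a particular kernel.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 * lengthscale^2))."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

xs = np.linspace(0.0, 1.0, 50)   # grid of input points
mean = np.zeros_like(xs)         # zero prior mean
# Small jitter on the diagonal keeps the Cholesky factorisation stable.
cov = rbf_kernel(xs, xs) + 1e-6 * np.eye(len(xs))

rng = np.random.default_rng(0)
chol = np.linalg.cholesky(cov)           # cov = chol @ chol.T
z = rng.standard_normal((len(xs), 3))    # independent standard normals
samples = (mean[:, None] + chol @ z).T   # 3 prior draws of f evaluated at xs
```

Each row of `samples` is one function drawn from the prior, evaluated on the grid `xs`.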