
of the simulation. To facilitate such practice, we can convert the vector representation of the configuration variables, $\mathbf{X}$, into a continuous distribution via

$$\begin{aligned} p(\mathbf{X}) &= \mathcal{N}\left(\frac{\sum_i \mathbf{X}_i}{\sum_i N_i}\,\middle|\,\mathbf{0},\mathbf{I}\right) \\ &= \mathcal{N}\left(\mathbf{x}_i\,\middle|\,\mathbf{0},\frac{1}{\sum_i N_i}\mathbf{I}\right)\end{aligned}$$

Thus, for every $i$, we can infer $\mathbf{X}_i$ from the continuous distribution $p(\mathbf{X})$. As a result, the samples $\mathbf{X}$ in the artificial GP can be used to represent samples from $p(\mathbf{X})$ for all the artificial GP models at every iteration of the optimization.

As an example, in Fig. [fig:samp], $\mathbf{X}$ (dots) are randomly generated from the distribution $p(\mathbf{X})$ (grey dashed curve), while the estimated kernel function (solid black curve) and the GP model are trained on those samples. Gaussian noise of zero mean and unit variance is added to the samples to mimic the noisy training data used to train real GPs in practice.

The model-evaluation process for the original GP model can also be explained through the effective artificial GP. Consider the predictions $\mathbf{y}$ and the training data $\mathbf{X}$ in the original GP model. In general, the prediction $\mathbf{y}$ is a vector of $n$ predictions, usually accompanied by noise $\epsilon$. The conditional probability of $\mathbf{y}$ can be estimated by replacing the samples with noise as follows:

$$\begin{aligned} \mathcal{N}\left(\mathbf{y}\,\middle|\,\mathbf{X},\Sigma_\epsilon\right) &= \mathcal{N}\left(\mathbf{y}\,\middle|\,\mathbf{X},(\mathbf{K}+\sigma_\epsilon^2\mathbf{I})\right),\end{aligned}$$

where $\mathbf{K}$ is the kernel matrix evaluated on the training inputs and $\sigma_\epsilon^2$ is the noise variance.
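To make the sampling and evaluation steps concrete, here is a minimal sketch. The counts $N_i$, the one-dimensional input, the sine target, and the RBF kernel with unit lengthscale are all illustrative assumptions not specified in the text. It draws samples from $p(\mathbf{X})=\mathcal{N}(\mathbf{0},\frac{1}{\sum_i N_i}\mathbf{I})$, adds zero-mean unit-variance noise, and evaluates the Gaussian log-density of $\mathbf{y}$ under a zero-mean GP marginal with covariance $\mathbf{K}+\sigma_\epsilon^2\mathbf{I}$ (a standard GP likelihood, used here as a stand-in for the conditional above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical counts N_i for the configuration variables (not from the text).
N = np.array([10, 20, 30])
scale = np.sqrt(1.0 / N.sum())  # each component of p(X) has variance 1 / sum_i N_i

# Draw 200 one-dimensional samples from p(X) = N(0, I / sum_i N_i).
X = rng.normal(0.0, scale, size=(200, 1))

# Noisy observations: an illustrative target plus zero-mean, unit-variance noise,
# mimicking the noisy training data described in the text.
y = np.sin(3.0 * X).ravel() + rng.normal(0.0, 1.0, size=200)

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel on column vectors A (n,1) and B (m,1)."""
    d2 = (A - B.T) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

# Covariance of the GP marginal: kernel matrix plus noise variance on the diagonal.
sigma_eps2 = 1.0
C = rbf_kernel(X, X) + sigma_eps2 * np.eye(len(X))

# Gaussian log-density of y under N(0, C), via a Cholesky factorization.
L = np.linalg.cholesky(C)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
logdet = 2.0 * np.log(np.diag(L)).sum()
loglik = -0.5 * (y @ alpha + logdet + len(y) * np.log(2.0 * np.pi))
print(loglik)
```

The Cholesky route avoids forming $C^{-1}$ explicitly, which is the usual numerically stable way to evaluate a GP marginal likelihood.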