Note 3 of PRML: Linear Models for Regression

Linear models: functions that are linear in the adjustable parameters, not necessarily in the input variables. A model that is also linear in the inputs is just the simplest case.

From linear regression to linear models for regression

In general, a linear regression model has the form

$$y(x, w) = w_0 + w_1 x_1 + \dots + w_D x_D = \sum_{j=0}^{D} w_j x_j$$

where $x_0=1$ is a dummy input that pairs with the bias $w_0$.

We can extend this class of models by applying fixed nonlinear functions to the input $x$ while keeping the model linear in the weights $w$. This gives general linear models of the form

$$y(x, w) = \sum_{j=0}^{M-1} w_j \phi_j(x) = w^T \phi(x)$$

where the $\phi_j(x)$ are called basis functions (as before, we can define $\phi_0(x)=1$ so that $w_0$ acts as the bias). One particular example of a linear model is polynomial regression, which uses $\phi_j(x)=x^j$ as the basis functions.
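As a minimal sketch of this idea, the snippet below evaluates $y(x, w)=\sum_j w_j \phi_j(x)$ with the polynomial basis $\phi_j(x)=x^j$ for a scalar input. The function names `polynomial_basis` and `predict` and the weight values are hypothetical, chosen only for illustration.

```python
def polynomial_basis(x, M):
    """Return [phi_0(x), ..., phi_{M-1}(x)] with phi_j(x) = x**j."""
    return [x ** j for j in range(M)]


def predict(x, w, basis=polynomial_basis):
    """Evaluate y(x, w) = sum_j w_j * phi_j(x) for a scalar input x."""
    phi = basis(x, len(w))
    return sum(w_j * phi_j for w_j, phi_j in zip(w, phi))


# Hypothetical weights: y(x) = 1 - 2x + 0.5x^2
w = [1.0, -2.0, 0.5]
y = predict(3.0, w)
```

The model is nonlinear in `x` but linear in `w`: doubling every weight doubles the prediction.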

Many other basis functions are possible, for example

$$\phi_j(x) = \exp\left\{-\frac{(x-\mu_j)^2}{2s^2}\right\}$$

where $\mu_j$ controls the location of the basis function in input space and $s$ controls its spatial scale. These are referred to as ‘Gaussian basis functions’ because they have the shape of a Gaussian, although they carry no normalization coefficient. The coefficient is unimportant because each basis function is multiplied by its own adjustable parameter $w_j$.
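The following sketch builds a feature vector from Gaussian basis functions, assuming the standard unnormalized form $\phi_j(x)=\exp\{-(x-\mu_j)^2/(2s^2)\}$. The centers, the width, and the helper names `gaussian_basis` and `features` are illustrative assumptions, not values from the text.

```python
import math


def gaussian_basis(x, mu, s):
    """Unnormalized Gaussian bump centred at mu with spatial scale s."""
    return math.exp(-((x - mu) ** 2) / (2.0 * s ** 2))


def features(x, centers, s):
    """Feature vector [1, phi_1(x), ..., phi_K(x)]; the leading 1 is phi_0."""
    return [1.0] + [gaussian_basis(x, mu, s) for mu in centers]


centers = [-1.0, 0.0, 1.0]  # hypothetical locations mu_j
s = 0.5                     # hypothetical shared scale
phi = features(0.0, centers, s)
```

Each basis function peaks at 1 when `x` equals its center and falls off symmetrically, with `s` setting how quickly.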

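Another example is the sigmoid basis function of the form

$$\phi_j(x) = \sigma\left(\frac{x-\mu_j}{s}\right)$$

where $\sigma(a)$ is the logistic sigmoid function, defined by

$$\sigma(a) = \frac{1}{1+\exp(-a)}$$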
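A small sketch of the sigmoid basis, assuming the logistic sigmoid $\sigma(a)=1/(1+\exp(-a))$ and the basis $\phi_j(x)=\sigma((x-\mu_j)/s)$. The two-branch evaluation is an implementation choice made here to avoid overflow for large negative arguments; the function names are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic sigmoid 1 / (1 + exp(-a)), evaluated without overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # a < 0, so exp(a) cannot overflow
    return e / (1.0 + e)


def sigmoid_basis(x, mu, s):
    """phi_j(x) = sigmoid((x - mu_j) / s)."""
    return sigmoid((x - mu) / s)
```

The basis function equals 0.5 at `x == mu`, and `s` controls how sharply it switches from 0 to 1 around that point.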

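Regularization adds a penalty term to a cost function in order to control overfitting. A particular choice is known as weight decay. With a sum-of-squares error as the original cost function, it takes the form

$$\frac{1}{2}\sum_{n=1}^{N}\{t_n - w^T\phi(x_n)\}^2 + \frac{\lambda}{2} w^T w$$

where $t_n$ is the target value for input $x_n$ and $\lambda$ is the regularization coefficient that sets the relative importance of the penalty.

Weight decay is so named because it encourages weight values to decay towards zero unless supported by the data.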
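To illustrate the shrinking effect, the sketch below minimizes the cost $\frac{1}{2}\sum_n (t_n - w^T\phi(x_n))^2 + \frac{\lambda}{2} w^T w$ by plain gradient descent. The toy data, learning rate, step count, and function names are all assumptions for illustration; gradient descent is used only to keep the example dependency-free, and a closed-form solve would work equally well.

```python
def regularized_cost(w, Phi, t, lam):
    """Sum-of-squares error plus the weight decay penalty (lam / 2) * w^T w."""
    residuals = [
        t_n - sum(w_j * p for w_j, p in zip(w, phi_n))
        for phi_n, t_n in zip(Phi, t)
    ]
    penalty = 0.5 * lam * sum(w_j * w_j for w_j in w)
    return 0.5 * sum(r * r for r in residuals) + penalty


def gradient(w, Phi, t, lam):
    """Gradient of regularized_cost with respect to w."""
    grad = [lam * w_j for w_j in w]  # contribution of the penalty term
    for phi_n, t_n in zip(Phi, t):
        r = t_n - sum(w_j * p for w_j, p in zip(w, phi_n))
        for j, p in enumerate(phi_n):
            grad[j] -= r * p  # contribution of the data term
    return grad


def fit(Phi, t, lam, lr=0.01, steps=5000):
    """Gradient descent from w = 0; lr and steps are tuned for the toy data."""
    w = [0.0] * len(Phi[0])
    for _ in range(steps):
        g = gradient(w, Phi, t, lam)
        w = [w_j - lr * g_j for w_j, g_j in zip(w, g)]
    return w


# Toy data generated by t = 2x, with features phi(x) = [1, x]
xs = [0.0, 1.0, 2.0, 3.0]
Phi = [[1.0, x] for x in xs]
t = [2.0 * x for x in xs]

w_plain = fit(Phi, t, lam=0.0)   # ordinary least squares
w_decay = fit(Phi, t, lam=10.0)  # with weight decay
```

On this toy data `w_plain` recovers the generating weights, while `w_decay` has a smaller squared norm: the penalty pulls the weights towards zero at the price of a worse fit to the data.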