Note 3 of PRML: Linear Models for Regression
Linear models are linear functions of the adjustable parameters, not necessarily of the input variables. A model that is also linear in the inputs is just the simplest case.
From linear regression to linear models for regression
Generally, linear regression models take the form
$$y(x, w) = w_0 + w_1 x_1 + \dots + w_D x_D = \sum_{i=0}^{D} w_i x_i,$$
where $x_0=1$ is a dummy input paired with the bias $w_0$.
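As a minimal sketch of the bias convention, the snippet below prepends the dummy input $x_0=1$ and evaluates the weighted sum. The function name `linear_predict` and the example weights are hypothetical, chosen only for illustration.

```python
def linear_predict(w, x):
    """Evaluate y(x, w) = sum_i w_i * x_i with the dummy input x_0 = 1 prepended."""
    x_aug = [1.0] + list(x)  # x_0 = 1, so w[0] acts as the bias
    if len(w) != len(x_aug):
        raise ValueError("w must have exactly one more entry than x")
    return sum(w_i * x_i for w_i, x_i in zip(w, x_aug))


# Hypothetical weights: bias 0.5, then one weight per input dimension.
y = linear_predict([0.5, 2.0, -1.0], [3.0, 4.0])  # 0.5 + 2*3 - 1*4
```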
We can extend this class of models by making the model nonlinear in the input $x$ while keeping it linear in the weights $w$. This gives the general linear model
$$y(x, w) = \sum_{j=0}^{M-1} w_j \phi_j(x) = w^T \phi(x),$$
where the $\phi_j(x)$ are called basis functions (as before, we define $\phi_0(x)=1$ so that $w_0$ acts as the bias). One particular example is polynomial regression, which uses $\phi_j(x)=x^j$ as the basis function.
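To make the polynomial case concrete, here is a small sketch for a scalar input. The helper names `polynomial_basis` and `predict` are assumptions made for this example, not names from the text.

```python
def polynomial_basis(x, M):
    """Return [phi_0(x), ..., phi_{M-1}(x)] with phi_j(x) = x**j for scalar x."""
    return [x ** j for j in range(M)]


def predict(w, x):
    """Evaluate y(x, w) = sum_j w_j * phi_j(x) using the polynomial basis."""
    phi = polynomial_basis(x, len(w))
    return sum(w_j * phi_j for w_j, phi_j in zip(w, phi))


# Hypothetical weights for y(x) = 1 + 2x + 3x^2, evaluated at x = 2.
y = predict([1.0, 2.0, 3.0], 2.0)
```

The prediction is a nonlinear (quadratic) function of `x`, yet it remains a plain weighted sum in `w`, which is what makes the model linear in the sense used here.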
Other choices of basis function include
$$\phi_j(x) = \exp\left\{-\frac{(x-\mu_j)^2}{2s^2}\right\},$$
where $\mu_j$ controls the locations of the basis functions in input space and $s$ controls their spatial scale. These are referred to as ‘Gaussian basis functions’ because they have the shape of a Gaussian function, although they carry no normalization coefficient. The normalization is unimportant because each basis function is multiplied by an adjustable parameter $w_j$.
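The following sketch evaluates an unnormalized Gaussian basis function for a scalar input. The function name `gaussian_basis` and the example centres and scale are hypothetical values picked for illustration.

```python
import math


def gaussian_basis(x, mu, s):
    """Unnormalized Gaussian bump centred at mu with spatial scale s."""
    return math.exp(-((x - mu) ** 2) / (2.0 * s ** 2))


# Hypothetical centres spread over the input space, sharing one scale s.
centres = [-1.0, 0.0, 1.0]
features = [gaussian_basis(0.0, mu, s=0.5) for mu in centres]
```

Each function peaks at exactly 1 when `x == mu` rather than integrating to 1, which reflects the missing normalization coefficient.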
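Another example is the sigmoid basis function of the form
$$\phi_j(x) = \sigma\left(\frac{x-\mu_j}{s}\right),$$
where the logistic sigmoid function is defined by
$$\sigma(a) = \frac{1}{1+\exp(-a)}.$$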
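A possible implementation of the sigmoid basis function for a scalar input is sketched below. The names `sigmoid` and `sigmoid_basis` are assumptions for this example, and the branch on the sign of `a` is an added precaution against overflow in `math.exp`, not something the definition requires.

```python
import math


def sigmoid(a):
    """Logistic sigmoid 1 / (1 + exp(-a)), written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # a < 0, so exp(a) cannot overflow
    return e / (1.0 + e)


def sigmoid_basis(x, mu, s):
    """Sigmoid basis function centred at mu with slope controlled by s."""
    return sigmoid((x - mu) / s)
```

At `x == mu` the function returns 0.5, and a smaller `s` gives a steeper transition between 0 and 1.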
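Regularization adds a penalty term to the cost function in order to control overfitting. A particular choice, known as weight decay, takes the form (assuming the original cost function is the sum-of-squares error)
$$\tilde{E}(w) = \frac{1}{2}\sum_{n=1}^{N}\left\{t_n - w^T\phi(x_n)\right\}^2 + \frac{\lambda}{2} w^T w,$$
where $t_n$ is the target for input $x_n$ and $\lambda$ is the regularization coefficient that sets the relative weight of the penalty.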
It is so named because it encourages weights to decay towards zero unless supported by data.
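The decay behaviour can be illustrated with plain gradient descent on a sum-of-squares error plus a quadratic penalty. This is only a sketch under stated assumptions: gradient descent is used here for illustration (it is not prescribed by the text), and the names `fit_weight_decay`, `lam`, `eta`, and the toy data are hypothetical. Each update shrinks every weight by the factor `1 - eta * lam` before the data gradient is applied.

```python
def fit_weight_decay(Phi, t, lam, eta=0.05, steps=500, w_init=None):
    """Gradient descent on 0.5 * sum_n (w . phi_n - t_n)^2 + 0.5 * lam * ||w||^2."""
    M = len(Phi[0])
    w = list(w_init) if w_init is not None else [0.0] * M
    for _ in range(steps):
        # Residuals y_n - t_n for the current weights.
        residuals = [
            sum(w_j * p_j for w_j, p_j in zip(w, phi)) - t_n
            for phi, t_n in zip(Phi, t)
        ]
        # Gradient of the data term only.
        grad = [sum(r * phi[j] for r, phi in zip(residuals, Phi)) for j in range(M)]
        # Shrink each weight by (1 - eta * lam), then take the data-gradient step.
        w = [(1.0 - eta * lam) * w_j - eta * g_j for w_j, g_j in zip(w, grad)]
    return w


# Toy data: feature 0 explains the targets (t = 2 * phi_0); feature 1 is always zero.
Phi = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
t = [2.0, 4.0, 6.0]

w_decay = fit_weight_decay(Phi, t, lam=1.0, w_init=[0.0, 1.0])
w_plain = fit_weight_decay(Phi, t, lam=0.0, w_init=[0.0, 1.0])
```

In this toy setup the second weight receives no gradient from the data, so with `lam=1.0` it shrinks towards zero, while with `lam=0.0` it keeps its initial value. The first weight is supported by the data and settles slightly below the unregularized value of 2.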