A regression curve describes a general relationship between two or more quantitative variables. In a multivariate situation, vectors of explanatory variables as well as response variables may be present. For the simple case of one explanatory variable and one response variable, n data points S := {(Xᵢ, Yᵢ), i = 1, 2, …, n} are collected. The regression relationship can be modeled by Yᵢ = m(Xᵢ) + εᵢ, where m(x) = E(Y | X = x) is the unknown regression function and the εᵢ's are independent random errors with mean 0 and unknown variance σ².
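As a concrete illustration of this model, the following sketch simulates data from Yᵢ = m(Xᵢ) + εᵢ with a hypothetical choice of true regression function m(x) = sin(x) and error standard deviation σ = 0.1 (both are assumptions for illustration, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
m = np.sin  # hypothetical true regression function m(x)
X = rng.uniform(0.0, 3.0, n)              # explanatory variable
eps = rng.normal(0.0, 0.1, n)             # errors: mean 0, variance sigma^2
Y = m(X) + eps                            # responses from Y_i = m(X_i) + eps_i
```

The nonparametric task is then to recover m from the pairs (Xᵢ, Yᵢ) without assuming a parametric form.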
Nonparametric methods (i.e. smoothing methods) for obtaining consistent estimators m̂(x) of m(x) are reviewed.
Nonparametric methods relax traditional parametric assumptions and usually assume only that m belongs to an infinite-dimensional collection of smooth functions.
Several nonparametric estimators are discussed, mostly of a weighted average form. Several kernel and nearest neighbour approaches to the weight functions are considered. Each of these estimators depends on a smoothing parameter, and the critical issue of choosing it is discussed briefly.
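The weighted average form can be sketched with the Nadaraya–Watson kernel estimator, one standard example of this class (the Gaussian kernel and the bandwidth value below are illustrative assumptions, not choices made in the text):

```python
import numpy as np

def nadaraya_watson(x0, X, Y, h):
    """Kernel-weighted average estimate of m(x0).

    Each observation Y_i receives a weight that decays with the
    distance |X_i - x0|; h is the smoothing parameter (bandwidth),
    whose choice is the critical tuning issue.
    """
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)  # Gaussian kernel weights
    return np.sum(w * Y) / np.sum(w)        # weighted average of responses
```

A small h makes the estimate wiggly (low bias, high variance); a large h oversmooths (high bias, low variance), which is why bandwidth selection is emphasized.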
The performance of m̂(x) is assessed via methods involving the mean squared error (MSE) and the mean integrated squared error (MISE). Two methods of improving the performance of m̂(x) are "boosting" and "bagging", which are, respectively, an iterative computer-intensive method and an averaging method involving the generation of bootstrap samples. These methods are briefly introduced.
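The bagging idea can be sketched as follows: draw B bootstrap resamples of the data, fit the smoother to each, and average the resulting estimates at the point of interest. This is a minimal sketch assuming a kernel smoother as the base estimator (the kernel form, bandwidth, and B are illustrative assumptions):

```python
import numpy as np

def kernel_est(x0, X, Y, h):
    """Base smoother: Gaussian-kernel weighted average at x0."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)
    return np.sum(w * Y) / np.sum(w)

def bagged_est(x0, X, Y, h, B=100, seed=None):
    """Bagged estimate of m(x0): average the base smoother over
    B bootstrap resamples (sampling pairs with replacement)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    estimates = []
    for _ in range(B):
        idx = rng.integers(0, n, n)            # bootstrap sample of indices
        estimates.append(kernel_est(x0, X[idx], Y[idx], h))
    return float(np.mean(estimates))
```

Averaging over bootstrap fits reduces the variance component of the MSE, which is the motivation for bagging as a performance-improvement device.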