Supervised Classification

For multivariate data, a classification function predicts one or more output attributes (dependent variables) given the values of the input attributes. Depending on usage, the prediction can be "definite" (a single value) or probabilistic (a distribution over the possible values).
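The difference between a definite and a probabilistic prediction can be illustrated with a very small sketch. It assumes discrete input attributes and a simple frequency-table model with add-one smoothing; the names fit, predict_proba and predict, and the toy weather data, are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def fit(rows):
    """Count how often each label occurs for each tuple of input values."""
    counts = defaultdict(Counter)
    for inputs, label in rows:
        counts[inputs][label] += 1
    return counts

def predict_proba(counts, inputs, labels):
    """Probabilistic prediction: a distribution over the possible labels."""
    c = counts.get(inputs, Counter())
    total = sum(c.values()) + len(labels)  # add-one smoothing (an assumption)
    return {lab: (c[lab] + 1) / total for lab in labels}

def predict(counts, inputs, labels):
    """Definite prediction: the single most probable label."""
    probs = predict_proba(counts, inputs, labels)
    return max(labels, key=lambda lab: probs[lab])

# Toy training data: one input attribute (weather), one output attribute.
LABELS = ["play", "stay"]
TRAIN = [
    (("sunny",), "play"),
    (("sunny",), "play"),
    (("sunny",), "stay"),
    (("rainy",), "stay"),
]
model = fit(TRAIN)
sunny_probs = predict_proba(model, ("sunny",), LABELS)
sunny_label = predict(model, ("sunny",), LABELS)
```

Here sunny_probs gives a probability to every label, while sunny_label commits to one of them. Input values never seen in training fall back to a uniform distribution under the smoothing assumed above.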

A classification function is learned from, or fitted to, training data. It is then tested on (surprise) test data. Over-fitting is a risk: the model may fit the noise in the training data as well as the structure, and then predict poorly on new data. Techniques such as cross-validation can be used to provide a stopping criterion. Minimum message length (MML) inference has a natural stopping criterion and is generally resistant to over-fitting.
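As a rough illustration of cross-validation, the sketch below estimates the accuracy of two candidate models on held-out folds. It is only one way of doing this and is not an MML method. Everything in it is an assumption made for the example: the helper names, the two candidates (a majority-class predictor and a one-threshold "stump" on a single numeric input), and the toy data.

```python
from collections import Counter

def k_folds(n, k):
    """Split indices 0..n-1 into k disjoint, nearly equal folds."""
    return [list(range(i, n, k)) for i in range(k)]

def cross_val_accuracy(rows, k, fit, predict):
    """Fit on k-1 folds, test on the held-out fold, pool the results."""
    correct = 0
    for held_out in k_folds(len(rows), k):
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        model = fit(train)
        correct += sum(predict(model, rows[i][0]) == rows[i][1]
                       for i in held_out)
    return correct / len(rows)

# Candidate 1: always predict the most common training label.
def fit_majority(train):
    return Counter(label for _, label in train).most_common(1)[0][0]

def predict_majority(model, x):
    return model

# Candidate 2: a single threshold on the input (a decision stump).
def fit_stump(train):
    best = None
    for t in sorted({x for x, _ in train}):
        for lo, hi in ((0, 1), (1, 0)):
            acc = sum((lo if x < t else hi) == y for x, y in train)
            if best is None or acc > best[0]:
                best = (acc, t, lo, hi)
    return best[1:]  # (threshold, label below, label at or above)

def predict_stump(model, x):
    t, lo, hi = model
    return lo if x < t else hi

# Toy data: (input value, class label).
ROWS = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
acc_majority = cross_val_accuracy(ROWS, 4, fit_majority, predict_majority)
acc_stump = cross_val_accuracy(ROWS, 4, fit_stump, predict_stump)
```

The model with the better held-out accuracy would be preferred. Used as a stopping criterion, the same comparison is repeated as a model grows (for example, as a tree gains splits), and growth stops once the held-out accuracy no longer improves.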

The output attribute, its range of values, and the training data are given, hence "supervised classification".

Examples of classes of classification (decision-) functions:

Classification- (Decision-) Trees
Classification- (Decision-) Graphs
Artificial Neural Networks

(Also see unsupervised learning.)