Supervised Classification
For multivariate data, a classification function predicts one (or more) output attribute(s) (dependent variable(s)) given the values of the input attributes. Depending on usage, the prediction can be "definite" (a single value) or probabilistic (a probability distribution over the possible values).
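The difference between a "definite" and a probabilistic prediction can be illustrated with a minimal sketch. It assumes a single discrete input attribute and estimates probabilities by simple relative frequencies; the function names (fit_counts, predict_proba, predict) and the toy weather data are hypothetical, made up for illustration, and input values never seen in training are not handled.

```python
from collections import Counter, defaultdict

def fit_counts(rows):
    """Count how often each output value occurs with each input value."""
    counts = defaultdict(Counter)
    for x, y in rows:
        counts[x][y] += 1
    return counts

def predict_proba(counts, x):
    """Probabilistic prediction: relative frequencies of the outputs seen with x."""
    total = sum(counts[x].values())
    return {y: n / total for y, n in counts[x].items()}

def predict(counts, x):
    """Definite prediction: the single most probable output value."""
    proba = predict_proba(counts, x)
    return max(proba, key=proba.get)

# Hypothetical training data: (input attribute, output attribute) pairs.
training = [("sunny", "play"), ("sunny", "play"), ("sunny", "stay"),
            ("rainy", "stay"), ("rainy", "stay")]

model = fit_counts(training)
probs = predict_proba(model, "sunny")   # a distribution over "play" and "stay"
label = predict(model, "sunny")         # the most probable single value
```

The definite prediction is just the mode of the probabilistic one, so a probabilistic classifier can always be turned into a definite one, but not the reverse.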
A classification function is learned from, or fitted to, training data. It is then tested on (surprise) test data. Over-fitting, where the model fits both the structure and the noise in the training data, is a risk. Techniques such as cross-validation can be used to provide a stopping criterion, that is, to decide when to stop making the model more complex. Minimum message length (MML) inference has a natural stopping criterion and is generally resistant to over-fitting.
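As a rough sketch of why accuracy on the training data is a poor guide, the code below compares a trivial majority-class model with a "memoriser" that stores every training example. It uses k-fold cross-validation to estimate accuracy on held-out data. All names here (k_fold_indices, fit_majority, fit_memoriser, cross_validate) and the synthetic data are assumptions made for illustration; it does not show MML inference.

```python
from collections import Counter

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint folds (round-robin)."""
    return [list(range(i, n, k)) for i in range(k)]

def accuracy(predict, rows):
    """Fraction of rows whose output value is predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def fit_majority(rows):
    """Very simple model: always predict the most common output value."""
    majority = Counter(y for _, y in rows).most_common(1)[0][0]
    return lambda x: majority

def fit_memoriser(rows):
    """Over-fitted model: remember every training example, noise included."""
    table = {x: y for x, y in rows}
    fallback = fit_majority(rows)
    return lambda x: table.get(x, fallback(x))

def cross_validate(fit, rows, k=5):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(rows), k):
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        test = [rows[i] for i in held_out]
        scores.append(accuracy(fit(train), test))
    return sum(scores) / k

# Synthetic data: the structure is "always a"; every 7th label is noise.
data = [(i, "b" if i % 7 == 0 else "a") for i in range(20)]

train_acc_memoriser = accuracy(fit_memoriser(data), data)
train_acc_majority = accuracy(fit_majority(data), data)
cv_memoriser = cross_validate(fit_memoriser, data)
cv_majority = cross_validate(fit_majority, data)
```

On this data the memoriser is perfect on the training set because it has fitted the noise, yet under cross-validation it does no better than the majority model. A gap of this kind between training and held-out accuracy is what a cross-validation-based stopping criterion looks for.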
The output attribute, its range of values, and the training data are given, hence 'supervised classification'.
Examples of classes of classification (decision-) functions:
- Classification- (Decision-) Trees
- Classification- (Decision-) Graphs
- Artificial Neural Networks
(Also see unsupervised learning.)