Theoretical analyses of cross-validation error and voting in instance-based learning

Turney, Peter D. (1994) Theoretical analyses of cross-validation error and voting in instance-based learning. [Journal (Paginated)]

Full text available as: PDF

This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by taking a vote among the k nearest neighbors (the k most similar examples) in the training set. Voting is intended to increase the stability (resistance to noise) of instance-based learning, but a theoretical analysis shows that there are circumstances in which voting can be destabilizing. The theory suggests ways to minimize cross-validation error, by ensuring that voting is stable and does not adversely affect accuracy.
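The k-nearest-neighbor voting procedure described above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: it assumes symbolic attribute values compared by simple overlap (counting mismatched attributes), whereas the paper's instance-based learners may use a different similarity measure. The function names and data layout are hypothetical.

```python
from collections import Counter

def knn_predict(training, query, k=3):
    """Predict the designated attribute of `query` by voting among the
    k nearest neighbors in `training`.

    `training` is a list of (attributes, label) pairs, where attributes
    are tuples of symbolic values and label is the attribute to predict.
    """
    # Overlap distance: the number of attributes on which two examples
    # disagree (an illustrative choice for symbolic data).
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))

    # Take the k training examples most similar to the query ...
    neighbors = sorted(training, key=lambda ex: distance(ex[0], query))[:k]
    # ... and let them vote on the predicted value.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

training = [
    (("red", "round"), "apple"),
    (("red", "round"), "apple"),
    (("yellow", "long"), "banana"),
    (("yellow", "long"), "banana"),
    (("yellow", "round"), "apple"),
]
print(knn_predict(training, ("red", "round"), k=3))  # -> apple
```

With k = 1 the prediction tracks the single nearest neighbor, so a noisy training example can flip it; larger k trades that instability against the risk of outvoting locally correct neighbors, which is the tension the paper analyzes.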

Item Type: Journal (Paginated)
Subjects: Computer Science > Artificial Intelligence
Computer Science > Machine Learning
Computer Science > Statistical Models
ID Code: 1821
Deposited By: Turney, Peter
Deposited On: 13 Oct 2001
Last Modified: 11 Mar 2011 08:54

References in Article


Aha, D.W., Kibler, D., & Albert, M.K. (1991) Instance-based learning algorithms, Machine Learning, 6:37-66.

Cover, T.M., & Hart, P.E. (1967) Nearest neighbor pattern classification, IEEE Transactions on Information Theory, IT-13:21-27. Also in (Dasarathy, 1991).

Dasarathy, B.V. (1991) Nearest Neighbor Pattern Classification Techniques, edited collection (California: IEEE Press).

Fix, E., & Hodges, J.L. (1951) Discriminatory analysis: nonparametric discrimination: consistency properties, Project 21-49-004, Report Number 4, USAF School of Aviation Medicine, Randolph Field, Texas, 261-279. Also in (Dasarathy, 1991).

Fraser, D.A.S. (1976) Probability and Statistics: Theory and Applications (Massachusetts: Duxbury Press).

Kibler, D., Aha, D.W., & Albert, M.K. (1989) Instance-based prediction of real-valued attributes, Computational Intelligence, 5:51-57.

Langley, P. (1993) Average-case analysis of a nearest neighbor algorithm, Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Chambéry, France, in press.

Sakamoto, Y., Ishiguro, M., & Kitagawa, G. (1986) Akaike Information Criterion Statistics (Dordrecht, Holland: Kluwer Academic Publishers).

Tomek, I. (1976) A generalization of the k-NN rule, IEEE Transactions on Systems, Man, and Cybernetics, SMC-6:121-126. Also in (Dasarathy, 1991).

Turney, P.D. (1990) The curve fitting problem: a solution, British Journal for the Philosophy of Science, 41:509-530.

Turney, P.D. (1993) A theory of cross-validation error. Submitted to the Journal of Experimental and Theoretical Artificial Intelligence.

Weiss, S.M., & Kulikowski, C.A. (1991) Computer Systems that Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems (California: Morgan Kaufmann).
