Cogprints

Using Feature Weights to Improve Performance of Neural Networks

Iqbal, Ridwan Al (2011) Using Feature Weights to Improve Performance of Neural Networks. [Preprint]

Full text available as:

PDF - Submitted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

166Kb

Abstract

Different features have different relevance to a particular learning problem: some are less relevant, while others are very important. Instead of selecting the most relevant features through feature selection, an algorithm can be given this knowledge of feature importance directly, based on expert opinion or prior learning. Learning can be faster and more accurate if learners take feature importance into account. This paper presents Correlation aided Neural Networks (CANN), one such algorithm. CANN treats feature importance as the correlation coefficient between the target attribute and each feature, and modifies a normal feed-forward neural network to fit both the correlation values and the training data. Empirical evaluation shows that CANN is faster and more accurate than the two-step approach of applying feature selection and then using a normal learning algorithm.
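
The abstract does not give CANN's actual update rule, so the following is only a minimal sketch of the general idea of fitting both training data and feature-target correlations. It assumes the objective is a sum of mean squared error and a squared mismatch between each feature's correlation with the model output and an expert-supplied correlation. The names `pearson`, `combined_loss`, `prior_corr`, and `lam` are hypothetical and do not come from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant sequence; this sketch uses 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def combined_loss(predict, rows, targets, prior_corr, lam=1.0):
    """Assumed objective: data error plus a correlation-mismatch penalty.

    predict    -- any callable mapping a feature row to a number
    rows       -- list of feature rows (each a list of numbers)
    targets    -- list of target values, one per row
    prior_corr -- one expert-supplied correlation per feature (assumption)
    lam        -- trade-off weight between the two terms (assumption)
    """
    outputs = [predict(row) for row in rows]
    # Mean squared error on the training data.
    data_error = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(rows)
    # Squared gap between observed and expected feature-output correlations.
    penalty = 0.0
    for j, r_prior in enumerate(prior_corr):
        column = [row[j] for row in rows]
        penalty += (pearson(column, outputs) - r_prior) ** 2
    return data_error + lam * penalty


# Toy usage: a model that copies feature 0, with priors saying feature 0 is
# perfectly correlated with the target and feature 1 is uncorrelated.
rows = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
targets = [0.0, 1.0, 2.0, 3.0]
loss = combined_loss(lambda row: row[0], rows, targets, prior_corr=[1.0, 0.0])
```

In this toy case the data error is zero, so any remaining loss comes only from the correlation term. A training procedure would minimize such an objective over the network weights; how CANN does this is described in the full text, not here.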

Item Type: Preprint
Keywords: Feature weight, Feature ranking, Hybrid Learning, Correlation, Constraint learning
Subjects: Computer Science > Artificial Intelligence; Computer Science > Machine Learning; Computer Science > Neural Nets
ID Code: 7179
Deposited By: Iqbal, Ridwan Al
Deposited On: 16 Feb 2011 19:49
Last Modified: 11 Mar 2011 08:57

