Cogprints

Empirical learning aided by weak domain knowledge in the form of feature importance

Iqbal, Ridwan Al (2010) Empirical learning aided by weak domain knowledge in the form of feature importance. [Preprint] (Unpublished)

Full text available as:

PDF - Submitted Version
Available under License Creative Commons Attribution No Derivatives.

114 KB

Abstract

Standard hybrid learners that use domain knowledge require strong knowledge that is hard and expensive to acquire. Weaker domain knowledge, however, can still provide the benefits of prior knowledge while remaining cost-effective. Weak knowledge in the form of feature relative importance (FRI) is presented and explained. Feature relative importance is a real-valued approximation of a feature's importance, provided by experts. The advantage of using this knowledge is demonstrated by IANN, a modified multilayer neural network algorithm. IANN is a very simple modification of the standard neural network algorithm, yet it attains significant performance gains. Experimental results in the field of molecular biology show higher performance than other empirical learning algorithms, including standard backpropagation and support vector machines. IANN's performance is even comparable to that of KBANN, a theory refinement system that uses stronger domain knowledge. This shows that feature relative importance can significantly improve the performance of existing empirical learning algorithms with minimal effort.
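The abstract does not specify IANN's actual mechanism, so the following is only an illustrative sketch of the general idea of weak knowledge as feature relative importance: expert-provided, real-valued importance ratings biasing a network toward important features. The function name, the normalization, and the weight-scaling scheme here are all assumptions for illustration, not the paper's method.

```python
import numpy as np

def fri_scaled_init(n_in, n_hidden, fri, rng):
    """Initialize input-to-hidden weights, scaling each input row by the
    expert-provided feature relative importance (FRI) for that feature.

    fri: sequence of positive real-valued importance ratings, one per input
    feature (hypothetical expert-supplied values).
    """
    fri = np.asarray(fri, dtype=float)
    fri = fri / fri.max()                      # normalize ratings to (0, 1]
    W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    # Features rated more important start with proportionally larger weights,
    # so gradient descent initially relies on them more.
    return W * fri[:, None]

# Hypothetical expert ratings for three features: the first matters most.
rng = np.random.default_rng(0)
W = fri_scaled_init(3, 4, [1.0, 0.2, 0.6], rng)
```

A scheme like this leaves the rest of backpropagation untouched, which matches the abstract's claim that only a very simple modification of the standard algorithm is needed.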

Item Type:Preprint
Keywords:neural network, domain knowledge, prior knowledge, feature importance
Subjects:Computer Science > Machine Learning
ID Code:6855
Deposited By:Iqbal, Ridwan Al
Deposited On:06 Jun 2010 15:35
Last Modified:11 Mar 2011 08:57

References in Article


1. Winston, P. H. Learning structural descriptions from examples. MIT Technical Report, 1970.

2. Pazzani, M., Mani, S., and Shankle, W. R. Comprehensible knowledge discovery in databases. In CogSci-97 (1997).

3. Simard, P., Victorri, B., LeCun, Y., and Denker, J. Tangent Prop: a formalism for specifying selected invariances in an adaptive network. In Advances in Neural Information Processing Systems (San Mateo, CA, 1992), Morgan Kaufmann.

4. Pazzani, M., Brunk, C., and Silverstein, G. A knowledge-intensive approach to learning relational concepts. In Proceedings of the Eighth International Workshop on Machine Learning (San Francisco, 1991), 432-436.

5. Mahoney, J. J. and Mooney, R. J. Combining symbolic and neural learning to revise probabilistic theories. In Proceedings of the 1992 Machine Learning Workshop on Integrated Learning in Real Domains (1992).

6. Towell, G. G. and Shavlik, J. W. Knowledge-based artificial neural networks. Artificial Intelligence, 70 (1994), 50-62.

7. Fung, G., Mangasarian, O., and Shavlik, J. Knowledge-based support vector machine classifiers. In Proceedings of the Sixteenth Conference on Neural Information Processing Systems (NIPS) (Vancouver, Canada, 2002).

8. Scott, A., Clayton, J., and Gibson, E. A Practical Guide to Knowledge Acquisition. Addison-Wesley, 1991.

9. Marcus, S. (Ed.). Special issue on knowledge acquisition. Machine Learning, 4 (1989).

10. Bekkerman, R., El-Yaniv, R., Tishby, N., and Winter, Y. Distributional word clusters vs. words for text categorization. Journal of Machine Learning Research, 3 (2003), 1183-1208.

11. Ruck, D. W., Rogers, S. K., and Kabrisky, M. Feature selection using a multilayer perceptron. Journal of Neural Network Computing, 2 (1990), 40-48.

12. Guyon, I. and Elisseeff, A. An introduction to variable and feature selection. Journal of Machine Learning Research, 3 (2003), 1157-1182.

13. Friedman, J. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29 (2001), 1189-1232.

14. Zien, A., Krämer, N., Sonnenburg, S., and Rätsch, G. The feature importance ranking measure. In ECML 2009 (2009).

15. Mitchell, T. M. Artificial neural networks. In Machine Learning. McGraw-Hill, 1997.

16. Quinlan, J. R. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.

17. Aha, D., Kibler, D., and Albert, M. Instance-based learning algorithms. Machine Learning, 6 (1991), 37-66.

18. Vapnik, V. N. Statistical Learning Theory. Wiley, New York, 1998.

19. Towell, G., Shavlik, J., and Noordewier, M. Refinement of approximate domain theories by knowledge-based neural networks. In Proceedings of the Eighth National Conference on Artificial Intelligence (Boston, MA, 1990), 861-866.

20. Noordewier, M., Towell, G., and Shavlik, J. Training knowledge-based neural networks to recognize genes in DNA sequences. In Advances in Neural Information Processing Systems (Denver, CO, 1991), Morgan Kaufmann, 530-536.
