Learning Analogies and Semantic Relations

Turney, Peter and Littman, Michael (2003) Learning Analogies and Semantic Relations. [Departmental Technical Report]

We present an algorithm for learning from unlabeled text, based on the Vector Space Model (VSM) of information retrieval, that can solve verbal analogy questions of the kind found in the Scholastic Aptitude Test (SAT). A verbal analogy has the form A:B::C:D, meaning "A is to B as C is to D"; for example, mason:stone::carpenter:wood. SAT analogy questions provide a word pair, A:B, and the problem is to select the most analogous word pair, C:D, from a set of five choices. The VSM algorithm correctly answers 47% of a collection of 374 college-level analogy questions (random guessing would yield 20% correct). We motivate this research by relating it to work in cognitive science and linguistics, and by applying it to a difficult problem in natural language processing, determining semantic relations in noun-modifier pairs. The problem is to classify a noun-modifier pair, such as "laser printer", according to the semantic relation between the noun (printer) and the modifier (laser). We use a supervised nearest-neighbour algorithm that assigns a class to a given noun-modifier pair by finding the most analogous noun-modifier pair in the training data. With 30 classes of semantic relations, on a collection of 600 labeled noun-modifier pairs, the learning algorithm attains an F value of 26.5% (random guessing: 3.3%). With 5 classes of semantic relations, the F value is 43.2% (random: 20%). The performance is state-of-the-art for these challenging problems.
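The core of the VSM approach can be sketched in a few lines: each word pair is represented as a vector of pattern frequencies, and the chosen answer is the candidate pair whose vector has the highest cosine with the stem pair's vector. The toy vectors, pair names, and four-dimensional feature space below are invented purely for illustration; the actual system uses corpus-derived frequencies over many joining patterns.

```python
import math

def cosine(u, v):
    # Cosine of the angle between two feature vectors;
    # returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def best_choice(stem_vec, choice_vecs):
    # Select the candidate pair whose relation vector is most
    # similar (by cosine) to the stem pair's relation vector.
    return max(choice_vecs, key=lambda pair: cosine(stem_vec, choice_vecs[pair]))

# Hypothetical frequency vectors (each dimension stands in for one
# joining pattern, e.g. "X cuts Y"); values are made up.
stem = [12, 3, 0, 5]  # mason:stone
choices = {
    "carpenter:wood": [10, 4, 1, 6],
    "teacher:chalk":  [0, 9, 7, 1],
}
print(best_choice(stem, choices))  # carpenter:wood
```

The same similarity measure drives the supervised nearest-neighbour classifier described in the abstract: a new noun-modifier pair is assigned the semantic-relation label of the most cosine-similar pair in the training data.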

Item Type: Departmental Technical Report
Keywords: analogies, semantic relations, vector space model, noun-modifier expressions
Subjects: Computer Science > Language
Linguistics > Computational Linguistics
Linguistics > Semantics
Computer Science > Machine Learning
ID Code: 3084
Deposited By: Turney, Peter
Deposited On: 25 Jul 2003
Last Modified: 11 Mar 2011 08:55

References in Article

Baeza-Yates, R., and Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison-Wesley.

Barker, K., and Szpakowicz, S. (1998). Semi-automatic recognition of noun modifier relationships. Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics (COLING-ACL'98), Montréal, Québec, 96-102.

Berland, M., and Charniak, E. (1999). Finding parts in very large corpora. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL '99). ACL, New Brunswick, NJ, 57-64.

Church, K.W., and Hanks, P. (1989). Word association norms, mutual information and lexicography. Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, New Brunswick, NJ, 76-83.

Claman, C. (2000). 10 Real SATs. College Entrance Examination Board.

Dolan, W.B. (1995). Metaphor as an emergent property of machine-readable dictionaries. Proceedings of the AAAI 1995 Spring Symposium Series: Representation and Acquisition of Lexical Knowledge: Polysemy, Ambiguity and Generativity, 27-32.

Dunning, T. (1993). Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19, 61-74.

Fellbaum, C. (editor). (1998). WordNet: An Electronic Lexical Database. MIT Press.

French, R.M. (2002). The computational modeling of analogy-making. Trends in Cognitive Sciences, 6(5), 200-205.

Hearst, M.A. (1992). Automatic acquisition of hyponyms from large text corpora. Proceedings of the Fourteenth International Conference on Computational Linguistics, Nantes, France, 539-545.

Hofstadter, D., and the Fluid Analogies Research Group (1995). Fluid Concepts and Creative Analogies: Computer Models of the Fundamental Mechanisms of Thought. New York: Basic Books.

Lakoff, G., and Johnson, M. (1980). Metaphors We Live By. University of Chicago Press.

Lakoff, G. (1987). Women, Fire, and Dangerous Things. University of Chicago Press.

Landauer, T.K., and Dumais, S.T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.

Lesk, M.E. (1969). Word-word associations in document retrieval systems. American Documentation, 20(1): 27-38.

Lewis, D.D. (1991). Evaluating text categorization. Proceedings of the Speech and Natural Language Workshop, Asilomar, 312-318.

Martin, J. (1992). Computer understanding of conventional metaphoric language. Cognitive Science, 16, 233-270.

Nastase, V., and Szpakowicz, S. (2003). Exploring noun-modifier semantic relations. Fifth International Workshop on Computational Semantics (IWCS-5), Tilburg, The Netherlands, 285-301.

Pantel, P., and Lin, D. (2002). Discovering word senses from text. Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 613-619.

Reitman, W.R. (1965). Cognition and Thought: An Information Processing Approach. New York, NY: John Wiley and Sons.

Resnik, P. (1995). Using information content to evaluate semantic similarity in a taxonomy. Proceedings of the 14th International Joint Conference on Artificial Intelligence. Morgan Kaufmann, San Mateo, CA, 448-453.

Rosario, B., and Hearst, M. (2001). Classifying the semantic relations in noun-compounds via a domain-specific lexical hierarchy. Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing (EMNLP-01), 82-90.

Rosario, B., Hearst, M., and Fillmore, C. (2002). The descent of hierarchy, and selection in relational semantics. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02), Philadelphia, PA, 417-424.

Ruge, G. (1992). Experiments on linguistically-based term associations. Information Processing and Management, 28(3), 317-332.

Salton, G., and McGill, M.J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill, New York.

Salton, G. (1989). Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading, Massachusetts.

Salton, G., and Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5), 513-523.

Skeat, W.W. (1963). A Concise Etymological Dictionary of the English Language. Capricorn Books, New York.

Smadja, F. (1993). Retrieving collocations from Text: Xtract. Computational Linguistics, 19, 143-177.

Turney, P.D. (2001). Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. Proceedings of the Twelfth European Conference on Machine Learning. Springer-Verlag, Berlin, 491-502.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). Combining independent modules to solve multiple-choice synonym and analogy problems. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03). Borovets, Bulgaria.

van Rijsbergen, C.J. (1979). Information Retrieval (2nd edition), Butterworths, London.

Vanderwende, L. (1994). Algorithm for automatic interpretation of noun sequences. Proceedings of the Fifteenth International Conference on Computational Linguistics, Kyoto, Japan, 782-788.

Voorhees, E.M., and Harman, D.K. (1997). Overview of the fifth Text Retrieval Conference (TREC-5). Proceedings of the Fifth Text Retrieval Conference (TREC-5), pp. 1-28. NIST Special Publication 500-238.

Wong, S.K.M., Ziarko, W., and Wong, P.C.N. (1985). Generalized vector space model in information retrieval. Proceedings of the 8th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-85), 18-25.


