Cogprints

Expressing Implicit Semantic Relations without Supervision

Turney, Peter D. (2006) Expressing Implicit Semantic Relations without Supervision. [Conference Paper]

Abstract

We present an unsupervised learning algorithm that mines large text corpora for patterns that express implicit semantic relations. For a given input word pair X:Y with some unspecified semantic relations, the corresponding output list of patterns <P1,...,Pm> is ranked according to how well each pattern Pi expresses the relations between X and Y. For example, given X=ostrich and Y=bird, the two highest ranking output patterns are "X is the largest Y" and "Y such as the X". The output patterns are intended to be useful for finding further pairs with the same relations, to support the construction of lexicons, ontologies, and semantic networks. The patterns are sorted by pertinence, where the pertinence of a pattern Pi for a word pair X:Y is the expected relational similarity between the given pair and typical pairs for Pi. The algorithm is empirically evaluated on two tasks: solving multiple-choice SAT word analogy questions and classifying semantic relations in noun-modifier pairs. On both tasks, the algorithm achieves state-of-the-art results, performing significantly better than several alternative pattern ranking algorithms based on tf-idf.
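The pertinence ranking described in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes one plausible reading of "expected relational similarity": each pair is represented by its pattern-frequency vector, relational similarity is the cosine between two such vectors, and the expectation is taken over pairs weighted by how often they occur with the pattern. The counts, the function names (`cosine`, `pertinence`, `rank_patterns`), and the choice to include the given pair among the typical pairs are all hypothetical.

```python
import math

# Hypothetical toy data: how often each pattern was observed with each word pair.
COUNTS = {
    ("ostrich", "bird"): {"X is the largest Y": 4, "Y such as the X": 6, "X and Y": 2},
    ("lion", "cat"): {"X is the largest Y": 3, "Y such as the X": 5, "X and Y": 1},
    ("wheel", "car"): {"X of the Y": 7, "X and Y": 3},
}


def cosine(u, v):
    """Cosine similarity between two sparse pattern-frequency vectors (dicts)."""
    dot = sum(weight * v.get(pattern, 0) for pattern, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pertinence(pair, pattern, counts):
    """Expected similarity between `pair` and the pairs that occur with `pattern`.

    Assumption: p(other pair | pattern) is estimated as the share of the
    pattern's total count that comes from that other pair.
    """
    total = sum(vec.get(pattern, 0) for vec in counts.values())
    if total == 0:
        return 0.0  # pattern never observed, so nothing to average over
    return sum(
        (vec.get(pattern, 0) / total) * cosine(counts[pair], vec)
        for vec in counts.values()
    )


def rank_patterns(pair, counts):
    """Return all observed patterns, most pertinent for `pair` first."""
    patterns = {p for vec in counts.values() for p in vec}
    return sorted(patterns, key=lambda p: (-pertinence(pair, p, counts), p))


if __name__ == "__main__":
    for p in rank_patterns(("ostrich", "bird"), COUNTS):
        print(f"{pertinence(('ostrich', 'bird'), p, COUNTS):.3f}  {p}")
```

On this toy data the two top-ranked patterns for ostrich:bird are "X is the largest Y" and "Y such as the X", because they occur mostly with pairs whose pattern profiles resemble that of the given pair. A generic pattern such as "X and Y" ranks lower because it is shared with the dissimilar pair wheel:car.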

Item Type: Conference Paper
Keywords: analogies, semantic relations, vector space model, noun-modifier expressions, latent relational analysis, pertinence
Subjects: Computer Science > Language; Linguistics > Computational Linguistics; Linguistics > Semantics; Computer Science > Machine Learning; Computer Science > Artificial Intelligence
ID Code: 5039
Deposited By: Turney, Peter
Deposited On: 01 Aug 2006
Last Modified: 11 Mar 2011 08:56
