http://cogprints.org/4027/
Combining Independent Modules in Lexical Multiple-Choice Problems
Existing statistical approaches to natural language problems are very
coarse approximations to the true complexity of language processing.
As such, no single technique will be best for all problem
instances. Many researchers are examining ensemble methods that
combine the output of multiple modules to
create more accurate solutions. This paper examines three merging
rules for combining probability distributions: the familiar mixture
rule, the logarithmic rule, and a novel product rule.
These rules were applied with state-of-the-art results to two
problems used to assess human mastery of lexical
semantics -- synonym questions and analogy questions. All three
merging rules result in ensembles that are more accurate than any of
their component modules. The differences among the three rules are not statistically
significant, but it is suggestive that the popular mixture rule
is not the best rule for either of the two problems.
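The three merging rules named above can be sketched in a few lines. The abstract does not state the formulas, so the forms below are assumptions: the mixture rule is taken to be a weighted arithmetic average of the module distributions, the logarithmic rule a weighted geometric average, and the product rule a product of each module's distribution after smoothing it toward uniform by its weight. The function names, the example distributions, and the weights are all hypothetical and are not taken from the paper.

```python
import math


def normalize(scores):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def mixture_rule(dists, weights):
    """Assumed form: p(j) proportional to sum_i w_i * p_i(j)."""
    k = len(dists[0])
    return normalize(
        [sum(w * d[j] for d, w in zip(dists, weights)) for j in range(k)]
    )


def logarithmic_rule(dists, weights):
    """Assumed form: p(j) proportional to prod_i p_i(j) ** w_i."""
    k = len(dists[0])
    return normalize(
        [math.prod(d[j] ** w for d, w in zip(dists, weights)) for j in range(k)]
    )


def product_rule(dists, weights):
    """Assumed form: p(j) proportional to prod_i (w_i * p_i(j) + (1 - w_i) / k).

    Each module is blended with the uniform distribution over the k choices
    before multiplying, so a module with weight 0 has no influence.
    """
    k = len(dists[0])
    return normalize(
        [
            math.prod(w * d[j] + (1 - w) / k for d, w in zip(dists, weights))
            for j in range(k)
        ]
    )


# Hypothetical example: two modules scoring a four-choice question.
module_dists = [
    [0.7, 0.1, 0.1, 0.1],
    [0.4, 0.4, 0.1, 0.1],
]
module_weights = [0.5, 0.5]

merged_mixture = mixture_rule(module_dists, module_weights)
merged_log = logarithmic_rule(module_dists, module_weights)
merged_product = product_rule(module_dists, module_weights)
```

In this made-up example all three rules rank the first choice highest; they differ in how sharply they concentrate probability on it. How the weights would be fitted is not covered by this sketch.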
Turney, Peter D.
Littman, Michael L.
Bigham, Jeffrey
Shnayder, Victor
Statistical Models
Language
Computational Linguistics
Semantics
Machine Learning