Abstract
This paper introduces a method to improve supervised word sense disambiguation performance by including a new class of features that leverage contextual information from large unannotated corpora. This new feature class, selectors, contains words that appear in other corpora with the same local context as a given lexical instance. We show that support vector sense classifiers trained with selectors achieve higher accuracy than those trained only with standard features, producing error reductions of 15.4% and 6.9% on standard coarse-grained and fine-grained disambiguation tasks, respectively. Furthermore, we find an error reduction of 9.3% when including selectors for the classification step of named-entity recognition over a representative sample of OntoNotes. These significant improvements come free of any human annotation cost, requiring only unlabeled web-scale corpora.
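To make the notion of a selector concrete, the following is a minimal sketch and not the paper's implementation. It assumes that "same local context" means an exact match on a fixed window of tokens on each side of the target; the function name `find_selectors`, the `window` parameter, and the toy corpus are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_selectors(target_tokens, target_index, corpus_sentences, window=2):
    """Count words that occupy the target's slot in an identical local context.

    Assumption: a context matches only if the `window` tokens to the left and
    to the right are exactly equal to those around the target instance.
    """
    left = tuple(target_tokens[max(0, target_index - window):target_index])
    right = tuple(target_tokens[target_index + 1:target_index + 1 + window])

    selectors = Counter()
    for sentence in corpus_sentences:
        for i, word in enumerate(sentence):
            sent_left = tuple(sentence[max(0, i - window):i])
            sent_right = tuple(sentence[i + 1:i + 1 + window])
            if sent_left == left and sent_right == right:
                selectors[word] += 1
    return selectors


# Toy unannotated corpus (hypothetical data, already tokenized).
corpus = [
    "she stood on the shore of the lake".split(),
    "they camped on the edge of the forest".split(),
    "we walked on the shore of the sea".split(),
    "the bank approved the loan".split(),
]

instance = "he sat on the bank of the river".split()
selectors = find_selectors(instance, instance.index("bank"), corpus)
print(selectors.most_common())  # [('shore', 2), ('edge', 1)]
```

In this toy setting the selectors for "bank" are "shore" and "edge", which hint at the riverside sense. A classifier could take such counts as additional features alongside its standard ones; how the paper weights or filters selectors is not stated in the abstract.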
| Original language | English |
|---|---|
| Pages | 2423-2440 |
| Number of pages | 18 |
| State | Published - 2012 |
| Event | 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India; Duration: Dec 8 2012 → Dec 15 2012 |
Conference
| Conference | 24th International Conference on Computational Linguistics, COLING 2012 |
|---|---|
| Country/Territory | India |
| City | Mumbai |
| Period | 12/8/12 → 12/15/12 |
Keywords
- Lexical semantics
- Semi-supervised learning
- Word sense disambiguation
Title
Improving supervised sense disambiguation with web-scale selectors