Task #10: English Lexical Substitution


If you have questions, please write to diana at dianamccarthy dot co dot uk

What's NEW

Data including system output, and Resources

Overview for a General Audience

You can download a report written for a general audience.

Microsoft Word version available here

PDF version available here

The Task

Briefly, in this task both annotators and systems have to find a substitute word for the target word in a given sentence. For example, given the target match in the following sentence:

The ideal preparation would be a light meal about 2-2 1/2 hours pre-match , followed by a warm-up hit and perhaps a top-up with extra fluid before the match.

a response might be game

Three Sub Tasks

Please refer to the task documentation for details: task10documentation.pdf
  • best: The system provides the substitute or substitutes it believes are best for a given item. It can provide more than one substitute if it believes they are all equally fitting, but credit is divided by the number of responses given.
  • oot: Scoring for the best 10 substitutes for a given item. 10 responses are anticipated, and systems will not benefit from providing fewer responses.
  • mw: Precision and recall for detection and identification of multiwords in the input sentences.
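
The difference between best and oot credit can be illustrated with a minimal sketch. It assumes that the gold standard for an item is a set of substitutes with the number of annotators who proposed each one, and that credit is normalised by the total number of annotator responses for that item; the authoritative definitions are in task10documentation.pdf. The function names and the gold counts below are hypothetical and are not taken from the task data.

```python
def best_item_credit(responses, gold_counts):
    """Credit for one item under a 'best'-style measure (sketch).

    responses   -- list of substitutes proposed by the system
    gold_counts -- dict mapping substitute -> number of annotators who gave it
    """
    if not responses:
        return 0.0
    total_gold = sum(gold_counts.values())
    matched = sum(gold_counts.get(r, 0) for r in responses)
    # Credit is divided by the number of responses given.
    return matched / len(responses) / total_gold


def oot_item_credit(responses, gold_counts, limit=10):
    """Credit for one item under an 'oot'-style measure (sketch).

    Up to `limit` responses are considered and credit is NOT divided
    by the number of responses.
    """
    total_gold = sum(gold_counts.values())
    matched = sum(gold_counts.get(r, 0) for r in responses[:limit])
    return matched / total_gold


# Hypothetical gold standard for the target "match" (invented counts).
gold = {"game": 3, "contest": 1, "tournament": 1}

one_guess = best_item_credit(["game"], gold)
two_guesses = best_item_credit(["game", "contest"], gold)
oot_two = oot_item_credit(["game", "contest"], gold)
```

Under these assumptions a second, less frequent guess lowers the best credit (the matched counts are shared across two responses), while the same two guesses raise the oot credit, which is why providing fewer than 10 responses brings no benefit in oot.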

Data Format

Please see section 2 of the document task10documentation.pdf

Scoring and Baselines

Please see sections 3-5 of the document task10documentation.pdf

SemEval Release of Trial and Test data

available here

All Gold Standard and Scoring Data Now Available


Please use the following reference for this data:

McCarthy, D. and R. Navigli (2007) SemEval-2007 Task 10: English Lexical Substitution Task. In Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, pp 48-53. PDF version available

All System output Now Available


Please use the following reference for this data:

McCarthy, D. and R. Navigli (2009) The English Lexical Substitution Task. In Language Resources and Evaluation 43 (2) Special Issue on Computational Semantic Analysis of Language: SemEval-2007 and Beyond, Agirre, E., Màrquez, L. and Wicentowski, R. (Eds). pp 139-159 Springer.

Resources Provided by Participants


Annotator Guidelines

Guidelines given to the annotators are available here

Guidelines given to the post-hoc annotators are available here


The results are available here


We have listed all questions that we received from participants along with our answers here.

We have also listed issues with the trial data (thanks to those of you who spotted errors). A list is available here

For more information, visit the SemEval-2007 home page.