Our apologies if you receive multiple postings of this CFP.
Task 1: Coreference Resolution in Multiple Languages
The purpose of this e-mail is to encourage participation in the task
'Coreference Resolution in Multiple Languages' at the 5th International
Workshop on Semantic Evaluations, SemEval-2010.
GENERAL TASK DESCRIPTION
Using coreference information has been shown to be beneficial in a number
of NLP applications including Information Extraction, Text Summarization,
Question Answering and Machine Translation. This task is concerned with
automatic coreference resolution for six different languages: Catalan,
Dutch, English, German, Italian and Spanish. Two tasks are proposed for
each of the languages:
* Full task. Detection of full coreference chains, composed of named
entities, pronouns, and full noun phrases.
* Subtask. Pronominal resolution, i.e., finding the antecedents of the
pronouns in the text.
In particular, we aim:
(i) To study the portability of coreference resolution systems across
languages (Catalan, Dutch, English, German, Italian, Spanish)
* To what extent is it possible to implement a general system that is
portable to all six languages?
* How much language-specific tuning is necessary?
* Are there significant differences between Germanic and Romance
languages? And between languages of the same family?
(ii) To study how helpful morphology, syntax and semantics are for
resolving coreference
* How much preprocessing is needed?
* How much does the quality of the preprocessing modules (perfect
linguistic input vs. noisy automatic input) affect the performance of
state-of-the-art coreference resolution systems?
* Is morphology more helpful than syntax? Or semantics? Or is syntax
more helpful than semantics?
(iii) To compare four different evaluation metrics (MUC, B-CUBED, CEAF and
BLANC) for coreference resolution.
* Do all evaluation metrics provide the same ranking? Is there one that
provides a more accurate picture of a system's accuracy?
* Is there a strong correlation between them?
* Can statistical systems be optimized under all four metrics at the
same time?
Although we target general systems that address the full multilingual
task, participants may take part in any full task or subtask for any
language.
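
To illustrate why the metrics named in (iii) can rank systems
differently, here is a minimal Python sketch of two of them, MUC and
B-CUBED, following their standard published definitions. It is not the
task's official scorer: the function names and the toy mentions are
hypothetical, and the sketch assumes that the key (gold) and the response
(system output) contain exactly the same mentions.

```python
from typing import List, Set, Tuple

Chains = List[Set[str]]  # each set is one coreference chain of mention ids


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if p + r else 0.0


def muc_recall(key: Chains, response: Chains) -> float:
    """Link-based MUC recall: sum(|K| - |p(K)|) / sum(|K| - 1)."""
    chain_of = {m: i for i, chain in enumerate(response) for m in chain}
    numerator = denominator = 0
    for k in key:
        # p(K): the parts K is split into by the response chains.
        # A mention missing from the response counts as its own part.
        parts = {chain_of.get(m, ("unaligned", m)) for m in k}
        numerator += len(k) - len(parts)
        denominator += len(k) - 1
    return numerator / denominator if denominator else 0.0


def muc(key: Chains, response: Chains) -> Tuple[float, float, float]:
    """Return (precision, recall, F1); precision swaps key and response."""
    r = muc_recall(key, response)
    p = muc_recall(response, key)
    return p, r, f1(p, r)


def b_cubed(key: Chains, response: Chains) -> Tuple[float, float, float]:
    """Mention-based B-CUBED, averaged over all mentions.

    Assumption: key and response cover the same set of mentions.
    """
    key_of = {m: chain for chain in key for m in chain}
    resp_of = {m: chain for chain in response for m in chain}
    mentions = list(key_of)
    p_sum = r_sum = 0.0
    for m in mentions:
        overlap = len(key_of[m] & resp_of[m])
        p_sum += overlap / len(resp_of[m])
        r_sum += overlap / len(key_of[m])
    p, r = p_sum / len(mentions), r_sum / len(mentions)
    return p, r, f1(p, r)


if __name__ == "__main__":
    key = [{"a", "b", "c"}, {"d", "e"}]
    # A response that links nothing: every mention is a singleton.
    singletons = [{"a"}, {"b"}, {"c"}, {"d"}, {"e"}]
    print("MUC     :", muc(key, singletons))
    print("B-CUBED :", b_cubed(key, singletons))
```

On this toy input the all-singletons response earns no credit under MUC,
which only counts links, while B-CUBED still rewards its perfect
per-mention precision. This is one reason the two metrics need not agree
on a ranking.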
* Veronique Hoste (Hogeschool Gent)
* Lluís Màrquez (TALP, Universitat Politècnica de Catalunya)
* M. Antònia Martí (CLiC, University of Barcelona)
* Massimo Poesio (University of Essex / Università di Trento)
* Marta Recasens (CLiC, University of Barcelona)
* Emili Sapena (TALP, Universitat Politècnica de Catalunya)
* Mariona Taulé (CLiC, University of Barcelona)
* Yannick Versley (Universität Tübingen)
Training data release: February 10, 2010 (from
Deadline for submission of systems: April 2, 2010
The Workshop will be held in conjunction with ACL 2010, July 11-16, in
Uppsala, Sweden.
For more information, please visit: http://stel.ub.edu/semeval2010-coref/