[eu_members at aclweb dot org] Italy: ResPubliQA 2010, Question Answering Evaluation over European Legislation -- Preliminary Call for Participation
To whom it may concern. Our apologies if you receive multiple postings of
this call.

ResPubliQA 2010: Question Answering Evaluation over European Legislation

Following the success of ResPubliQA 2009, we are pleased to announce
ResPubliQA 2010, the second evaluation campaign of Question Answering
systems over European Legislation, to be held within the framework of the
CLEF 2010 conference. For further information and updates, visit the
ResPubliQA website.
We invite participation from IR and NLP practitioners and potential users
of QA systems concerned with European texts. Detailed guidelines
describing the task will be distributed to the participants and will be
downloadable from the ResPubliQA website. The results of the evaluation
campaign will be disseminated at the final workshop, organized in
conjunction with the CLEF 2010 conference, 20-23 September in Padua, Italy.
ResPubliQA 2010: TASK OVERVIEW
The aim of ResPubliQA 2010 is to capitalize on what was achieved in the
previous evaluation campaign while adding a number of refinements:

- the addition of new question types and the refinement of old ones;
- the opportunity to return both a paragraph and an exact answer;
- the addition of a new collection: EUROPARL.
Two separate tasks are proposed for the ResPubliQA 2010 campaign:

1. PARAGRAPH SELECTION (PS) TASK: retrieve one paragraph containing the
answer to a question in natural language. One of the following responses
must be returned:
a) ONE single paragraph containing the answer; or
b) the string NOA, to indicate that the system prefers not to answer the
question.

2. ANSWER SELECTION (AS) TASK: beyond retrieving a paragraph containing
the answer to a question in natural language, systems are also required to
demarcate the exact answer. One of the following responses must be
returned:
a) the exact answer highlighted inside one paragraph; or
b) the string NOA, to indicate that the system prefers not to answer the
question.
N.B. Systems that prefer to leave some questions unanswered can OPTIONALLY
also submit a candidate paragraph/answer, with the aim of evaluating their
validation performance.

The two tasks differ only in the required output; the document collection
and test data are the same for both.
DOCUMENT COLLECTION: the following multilingual, parallel-aligned document
collections are used:

o The ResPubliQA collection: a subset of JRC-Acquis with parallel-aligned
documents in 9 languages.
o A small subset of the EUROPARL collection with parallel-aligned
documents in 9 languages, created by crawling the website of the European
Parliament (starting from January 2009).

Both collections will be available at the ResPubliQA website.
The subject of the Acquis documents
is European legislation while EUROPARL deals with the parliamentary domain. The
two collections are different in style and content while being fully compatible
at the same time.
LANGUAGES: parallel-aligned documents are available in 9 languages:
Bulgarian, Dutch, English, French, German, Italian, Portuguese, Romanian
and Spanish. Only tasks with at least one registered participant will be
activated.
TEST DATA: a pool of 200 questions will be provided:

o independent questions that can be answered by a paragraph
o question types: factoid, definition, purpose, reason
o NO NIL questions; NO LIST questions
EVALUATION: each output of the PS and AS tasks is automatically evaluated
against a manually produced gold standard. Non-matching paragraphs and
answers are manually evaluated by native speakers. The adoption of the
c@1 evaluation metric encourages systems to maintain the number of correct
answers while reducing the number of incorrect ones by leaving some
questions unanswered (NOA). Answer Validation techniques (including
Machine Learning) are expected to be used for taking this final decision.
For more details, please read the ResPubliQA 2009 Overview, available at
the campaign website.
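To illustrate why leaving questions unanswered can pay off, here is a
minimal sketch of the c@1 measure as defined in the ResPubliQA 2009
Overview: unanswered (NOA) questions are credited at the system's observed
accuracy rate. The function name is illustrative, not part of any official
evaluation script.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """c@1 = (nR + nU * nR / n) / n.

    nR = correctly answered questions, nU = questions left unanswered
    (NOA), n = total questions. NOA responses are rewarded in proportion
    to the accuracy shown on the answered questions.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (n_correct + n_unanswered * n_correct / n_total) / n_total

# Answering 60 of 100 questions correctly with no NOA gives c@1 = 0.60.
# Keeping the same 60 correct answers but returning NOA for the other 40
# gives c@1 = (60 + 40 * 0.6) / 100 = 0.84.
```

In other words, a system that can detect its own likely-wrong answers and
replace them with NOA scores strictly higher, which is the incentive for
the Answer Validation step mentioned above.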
Systems may participate in one or both tasks, which will operate
simultaneously on the same input questions. A maximum of two runs in total
can be submitted, i.e. two PS runs, two AS runs, or one PS plus one AS run.
IMPORTANT DATES:

- Track guidelines: January 25
- Registration at the ResPubliQA website: by March
- Test set release: May
- Run submissions: May 27*
- Results to the participants: July 9
- Submission of papers: August 15
- Workshop: 20-23 September 2010, in Padua, Italy

*Participants will have 5 DAYS to upload their submissions, starting from
the moment the questions are downloaded.
- Anselmo Peñas, E.T.S.I. Informática de la UNED, Madrid, Spain
- Pamela Forner, CELCT, Trento, Italy
- Richard Sutcliffe, Dept. of Computer Science, University of Limerick,
  Limerick, Ireland
- Donna Harman (National Institute of Standards and Technology (NIST), USA)
- Maarten de Rijke (University of Amsterdam, The Netherlands)
- Dominique Laurent (Synapse Développement, France)