[eu_members at aclweb dot org] CORRECTED RTE-5: Call for Participation

                    RTE-5: Call for Participation



Since 2004, RTE Challenges have promoted research in textual
entailment recognition as a task that captures major semantic
inference needs across many natural language processing applications,
such as Question Answering (QA), Information Retrieval (IR),
Information Extraction (IE), and multi-document summarization. Over
the years the encouraging progress, in terms of both the number of
researchers involved and results achieved, has spurred the community
to further investigate the phenomena involved by adding innovations to
the challenge every year and moving it toward more realistic
scenarios.

Capitalizing on the favorable response obtained so far, the RTE
Organizing Committee is glad to launch the Fifth Recognizing Textual
Entailment Challenge, proposed for the second year as a track of the
Text Analysis Conference (TAC).

Organizations interested in participating in the RTE-5 Challenge are
invited to submit a track registration form by May 31, 2009, at the
TAC 2009 web site:



1) A Textual Entailment Search Pilot task will be proposed, based on
   the data used in the Summarization task at TAC 2008/2009.

2) The main RTE-5 task will be similar to the RTE-4 task, with the
   following changes:

   * Texts will be longer, usually corresponding to a portion of the
     source document that a reader would naturally select, such as a
     paragraph or a group of related sentences.

   * Texts will come from a variety of sources and will not be edited
     from their source documents. Thus, systems will be asked to
     handle real text that may include typographical errors and
     ungrammatical sentences.

   * A development set will be released.

   * The textual entailment recognition task will be based on only
     three application settings: QA, IE, and IR.

   * Mandatory ablation tests for major knowledge resources will be
     required for those systems that employ these resources.


RTE is the task of recognizing that the meaning of one text, termed
Hypothesis (H), can be inferred from the content of another, termed
Text (T). Given a set of T-H pairs as input, systems must recognize
whether each T entails the corresponding H, deciding whether:

   * T entails H
   * T contradicts H, or shows it false
   * the veracity of H is unknown on the basis of T.

The RTE-5 main task will consist of two sub-tasks:

1) The three-way RTE task, where the system must decide whether:

   * T entails H - in which case the pair will be marked as ENTAILMENT
   * T contradicts H - in which case the pair will be marked as CONTRADICTION
   * The truth of H cannot be determined on the basis of T - in which
     case the pair will be marked as UNKNOWN

2) The two-way RTE task is to decide whether:

   * T entails H - in which case the pair will be marked as ENTAILMENT
   * T does not entail H - in which case the pair will be marked as NO ENTAILMENT

Systems can decide whether to participate in either or both tasks.
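
The relation between the two label sets can be sketched as follows.
This is a minimal illustration, not part of the official guidelines:
it assumes that a two-way judgment can be derived from a three-way
one by treating both CONTRADICTION and UNKNOWN as "T does not entail
H". The helper name to_two_way is hypothetical.

```python
# Assumption: CONTRADICTION and UNKNOWN both count as "T does not entail H".
THREE_TO_TWO = {
    "ENTAILMENT": "ENTAILMENT",
    "CONTRADICTION": "NO ENTAILMENT",
    "UNKNOWN": "NO ENTAILMENT",
}


def to_two_way(label):
    """Collapse a three-way label into a two-way label (hypothetical helper)."""
    try:
        return THREE_TO_TWO[label]
    except KeyError:
        raise ValueError("unexpected three-way label: %r" % label)


if __name__ == "__main__":
    for three_way in ("ENTAILMENT", "CONTRADICTION", "UNKNOWN"):
        print(three_way, "->", to_two_way(three_way))
```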

System results will be compared to a human-annotated gold-standard
test corpus. Examples of three-way judgments are given at the end of
this message.
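
As a rough illustration of comparing system output with a gold
standard, the sketch below computes plain accuracy over pair ids.
The choice of accuracy as the measure is an assumption made here for
illustration; the official measures are those defined in the
guidelines for participants. The gold labels are those of pairs 16,
60 and 100 from the RTE-4 examples in this message; the predicted
labels are invented.

```python
def accuracy(gold, predicted):
    """Fraction of gold pairs whose predicted label matches the gold label.

    Both arguments map a pair id to a label. A pair missing from
    `predicted` counts as wrong.
    """
    if not gold:
        raise ValueError("gold standard is empty")
    correct = sum(
        1 for pair_id, label in gold.items() if predicted.get(pair_id) == label
    )
    return correct / len(gold)


# Gold labels taken from the RTE-4 examples (pairs 16, 60, 100).
gold = {"16": "ENTAILMENT", "60": "CONTRADICTION", "100": "UNKNOWN"}
# Invented system output, for illustration only.
predicted = {"16": "ENTAILMENT", "60": "UNKNOWN", "100": "UNKNOWN"}

if __name__ == "__main__":
    print("accuracy: %.3f" % accuracy(gold, predicted))
```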

As in previous challenges, the test data sets will be based on
multiple data sources, intended to be representative of typical
problems encountered by applied systems. Specifically, data types
corresponding to the following application areas will be used:

1) Question Answering (QA): simulating a QA scenario in which the
   hypothesized answer has to be inferred from the candidate text

2) Information Retrieval (IR): choosing propositional queries as
   hypotheses, and proposing relevant and irrelevant sentences
   retrieved by IR systems as texts

3) Information Extraction/Relation Extraction (IE): generating T-H
   pairs, picking positive and negative examples of typical outputs of
   IE systems

More details are provided in the guidelines for participants available
at the RTE-5 website (http://www.nist.gov/tac/2009/RTE/).


The Textual Entailment Search Pilot, representing a first step towards
more realistic scenarios in the Textual Entailment Recognition task,
is aimed at:

1) producing a data set which reflects the natural distribution of
   entailment in a corpus and presents problems that can arise when
   detecting textual entailment in a natural setting

2) analyzing the potential impact of textual entailment recognition on
   a real NLP application task, namely the Summarization task.

The Textual Entailment Search task consists in finding all the
sentences in a set of documents that entail a given Hypothesis.

The task is situated in the Summarization application setting, where
the Hypothesis (H) is taken from a Summary Content Unit (SCU), and the
systems must find all the entailing sentences (Ts) in a corpus of 10
newswire documents about a common topic.

The following example is taken from the development set:

<H_sentence>Russia requested international help to rescue the AS-28.</H_sentence>
<text doc_id="APW_ENG_20050806.0018" s_id="1" evaluation="YES">At Moscow's request, Japan has dispatched four naval vessels to help rescue a Russian submarine snagged on the floor of the Pacific Ocean, but the ships aren't expected to arrive at the scene until early next week.</text>
<text doc_id="APW_ENG_20050806.0018" s_id="7" evaluation="YES">Navy spokesman Capt. Igor Dygalo said the U.S. Navy has also been asked for assistance, the RIA-Novosti news agency reported.</text>
<text doc_id="APW_ENG_20050806.0726" s_id="6" evaluation="YES">Russian authorities hope British and American unmanned submersibles, sent after a Russian plea for help, can cut the submarine loose.</text>

As can be seen from the example above, in the Entailment Search task
both Text and Hypothesis are to be interpreted in the context of the
corpus and contain explicit and implicit references to entities,
events, dates, places, situations, etc. pertaining to the topic.
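
The development-set example can be read with a few lines of Python.
This is a sketch under one assumption: the fragment shown in this
message has no enclosing element, so a root element named "topic" is
added here purely to make it parseable; the real files may be
organized differently.

```python
import xml.etree.ElementTree as ET

# The <topic> wrapper is an assumption; the inner elements are copied
# from the example in this message.
SNIPPET = """<topic>
<H_sentence>Russia requested international help to rescue the AS-28.</H_sentence>
<text doc_id="APW_ENG_20050806.0018" s_id="1" evaluation="YES">At Moscow's request, Japan has dispatched four naval vessels to help rescue a Russian submarine snagged on the floor of the Pacific Ocean, but the ships aren't expected to arrive at the scene until early next week.</text>
<text doc_id="APW_ENG_20050806.0018" s_id="7" evaluation="YES">Navy spokesman Capt. Igor Dygalo said the U.S. Navy has also been asked for assistance, the RIA-Novosti news agency reported.</text>
<text doc_id="APW_ENG_20050806.0726" s_id="6" evaluation="YES">Russian authorities hope British and American unmanned submersibles, sent after a Russian plea for help, can cut the submarine loose.</text>
</topic>"""

root = ET.fromstring(SNIPPET)
hypothesis = root.findtext("H_sentence")

# Each entailing sentence is identified by its document id and sentence id.
entailing = [
    (t.get("doc_id"), int(t.get("s_id")))
    for t in root.iter("text")
    if t.get("evaluation") == "YES"
]

if __name__ == "__main__":
    print("H:", hypothesis)
    for doc_id, s_id in entailing:
        print("entailing sentence:", doc_id, s_id)
```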

As this Pilot requires the retrieval of entailing sentences only,
contradicting sentences are not to be taken into account, and thus the
entailment judgment may be seen as a two-way decision between "yes"
and "no" entailment.
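
Since the Pilot is a retrieval problem, one natural way to compare a
system's output with the gold standard is precision, recall and
F-measure over (doc_id, s_id) pairs. The sketch below assumes these
measures for illustration only; the official scoring is defined in
the guidelines. The gold set comes from the example above, while the
retrieved set, including sentence 2 of the second document, is
invented.

```python
def precision_recall_f1(gold, retrieved):
    """Return (precision, recall, F1) for two sets of sentence identifiers."""
    true_positives = len(gold & retrieved)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Entailing sentences from the development-set example.
gold = {
    ("APW_ENG_20050806.0018", 1),
    ("APW_ENG_20050806.0018", 7),
    ("APW_ENG_20050806.0726", 6),
}
# Invented system output: two correct sentences and one incorrect one.
retrieved = {
    ("APW_ENG_20050806.0018", 1),
    ("APW_ENG_20050806.0726", 6),
    ("APW_ENG_20050806.0726", 2),
}

if __name__ == "__main__":
    p, r, f = precision_recall_f1(gold, retrieved)
    print("precision=%.3f recall=%.3f f1=%.3f" % (p, r, f))
```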

The guidelines for participants, together with one topic taken from
the development set, are available at the RTE-5 website.


The RTE Resource Pool, set up for the first time during RTE-3, serves
as a portal and forum for publicizing and tracking resources, and
reporting on their use. All the RTE participants and other members of
the NLP community who develop or use relevant resources are encouraged
to contribute to this important resource.

This year we are also planning to update and integrate the RTE
Resource Pool with a section specifically dedicated to knowledge
resources used. The new page will mainly contain a list of the
"standard" RTE resources, that is, those most widely selected and
used in the design of RTE systems during the RTE challenges held so
far, together with links to the locations where they are made
available. Moreover, a shortlist of the "top" resources will be
provided, as well as some results of the data analyses which have been
conducted so far on the resources presented on the page.

Pilot Development Set release:            3 April 2009
Main Development Set release:             29 May 2009
Track registration deadline:              31 May 2009
Main and Pilot Test Set release:          2 September 2009
Submissions:                              9 September 2009
Release of individual evaluated results:  18 September 2009
TAC 2009 Workshop:                        16-17 November 2009


Luisa Bentivogli, CELCT and FBK, Italy (Track coordinator, bentivo at fbk dot it)
Ido Dagan, Bar Ilan University, Israel
Hoa Trang Dang, NIST, USA
Danilo Giampiccolo, CELCT, Italy (Track coordinator, giampiccolo at celct dot it)
Bernardo Magnini, FBK, Italy


Examples of main task three-way judgments taken from the RTE-4 test set (downloadable from http://www.nist.gov/tac/data/):

- <pair id="16" entailment="ENTAILMENT" task="IR">
  <t>A 66-year-old man has been sentenced to life in prison by a French court for murdering seven girls and young women. Michel Fourniret, dubbed the "Ogre of the Ardennes", had admitted kidnapping and killing his victims between 1987 and 2001.</t>
  <h>Michel Fourniret was sentenced to life imprisonment.</h>

- <pair id="60" entailment="CONTRADICTION" task="IR">
  <t>Syrian officials have said the bombed building was an empty military warehouse. They have refused to let nuclear inspectors visit the location, which was bulldozed after the bombing.</t>
  <h>Nuclear inspectors are to visit Syria.</h>

- <pair id="100" entailment="UNKNOWN" task="IR">
  <t>British and American diplomats were today attacked as they tried to investigate political violence in Zimbabwe, the US Embassy in Harare has said.</t>
  <h>Diplomats were detained in Zimbabwe.</h>

- <pair id="307" entailment="ENTAILMENT" task="QA">
  <t>African Union leaders ended their summit in Egypt yesterday refusing to condemn President Mugabe, cementing his hold on power even as they urged the establishment of a national unity government in Zimbabwe.</t>
  <h>African Union leaders had a meeting in Egypt.</h>

- <pair id="316" entailment="CONTRADICTION" task="QA">
  <t>Adopting just a couple of elements of the Mediterranean diet could cut the risk of cancer by 12%, say scientists. A study of 26,000 Greek people found just using more olive oil alone cut the risk by 9%.</t>
  <h>Mediterranean foods increase the risk of cancer because of olive oil.</h>

- <pair id="327" entailment="UNKNOWN" task="QA">
  <t>Speaking at a press conference held by video link from Lebanon, Shiekh Hassan Nasrallah said that the Shia Islamist group had also agreed to supply Israel with information on the airman Ron Arad, who went missing in 1986.</t>
  <h>Shiekh Hassan Nasrallah is from Lebanon.</h>

- <pair id="417" entailment="UNKNOWN" task="QA">
  <t>The acceleration of the shrinking of Arctic ice continues to threaten the survival of these animals. Scientists predict that the numbers of polar bears will fall by about a third, if sea ice in the Arctic continues to melt at its present rate.</t>
  <h>The level of Arctic ice will fall by a third.</h>

- <pair id="534" entailment="CONTRADICTION" task="SUM">
  <t>Much of the world has moved toward democracy and freedom, but China hasn't moved much and Russia seems headed in the opposite direction. Of the two, China is probably easier to deal with. It appears to have a collective leadership, which gives a certain continuity to its policy.</t>
  <h>China and Russia will move toward democracy.</h>

- <pair id="614" entailment="ENTAILMENT" task="SUM">
  <t>Political analyst Earl Ofari Hutchinson says Barack Obama has to capture the votes of Latinos for his Democratic presidential bid in the March 4 Texas primary.</t>
  <h>Latino voters are crucial for Obama in Texas.</h>

- <pair id="709" entailment="ENTAILMENT" task="IE">
  <t>A new report by the International Federation of Journalists (IFJ) documents 129 cases where media workers have been killed because of their work during 2004. They expect the number to increase as more information reaches them. This could make 2004 the deadliest year ever. 49 casualties (close to 40%) occurred in Iraq, making it by far the deadliest country for journalists. At least 20 of those appeared to be cases where journalists were directly targeted because of their profession.</t>
  <h>49 media workers were killed in Iraq in 2004.</h>
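
Pairs in this format can be loaded with Python's standard XML parser.
The sketch below makes two assumptions that go beyond what is shown
in this message: each pair is closed by a </pair> tag, and all pairs
sit inside a single root element, here called "entailment-corpus".
The function name read_pairs is hypothetical. The two pairs are
copied from the examples above.

```python
import xml.etree.ElementTree as ET

# Assumptions: the <entailment-corpus> root and the closing </pair> tags
# are added here so that the fragment is well-formed XML.
SAMPLE = """<entailment-corpus>
<pair id="60" entailment="CONTRADICTION" task="IR">
  <t>Syrian officials have said the bombed building was an empty military warehouse. They have refused to let nuclear inspectors visit the location, which was bulldozed after the bombing.</t>
  <h>Nuclear inspectors are to visit Syria.</h>
</pair>
<pair id="100" entailment="UNKNOWN" task="IR">
  <t>British and American diplomats were today attacked as they tried to investigate political violence in Zimbabwe, the US Embassy in Harare has said.</t>
  <h>Diplomats were detained in Zimbabwe.</h>
</pair>
</entailment-corpus>"""


def read_pairs(xml_text):
    """Return one dict per <pair> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": pair.get("id"),
            "label": pair.get("entailment"),
            "task": pair.get("task"),
            "text": pair.findtext("t"),
            "hypothesis": pair.findtext("h"),
        }
        for pair in root.iter("pair")
    ]


if __name__ == "__main__":
    for pair in read_pairs(SAMPLE):
        print(pair["id"], pair["task"], pair["label"], "|", pair["hypothesis"])
```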