Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications
Workshop at the ACL/IJCNLP 2009 Conference (Singapore)
August 6, 2009
The ACL 2009 Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications (MWE2009) will take place on August 6, 2009 in Singapore, immediately following the annual meeting of the Association for Computational Linguistics (ACL). This is the fifth time this workshop has been held in conjunction with ACL, following the meetings in 2003, 2004, 2006, and 2007.
The workshop will focus on Multi-Word Expressions (MWEs), which are an indispensable part of natural languages. New MWEs are coined daily and existing ones are continually paraphrased, which makes them important for many natural language applications. Unfortunately, while easily mastered by native speakers, MWEs are often non-compositional, which poses a major challenge for both foreign language learners and automatic analysis.
The growing interest in MWEs in the NLP community has led to many specialized workshops held every year since 2001 in conjunction with ACL, EACL and LREC; there have also been two recent special issues on MWEs published by leading journals: the International Journal of Language Resources and Evaluation, and the Journal of Computer Speech and Language.
As a result of the overall progress in the field, the time has come to move from basic preliminary research to actual applications in real-world NLP tasks. Thus, in MWE2009, we were interested in the overall process of dealing with MWEs, asking for original research on the following four fundamental topics:
(1) Identification. Identifying MWEs in free text is a very challenging problem. Due to the variability of expression, it does not suffice to collect and use a static list of known MWEs; complex rules and machine learning are typically needed as well.
(2) Interpretation. Semantically interpreting MWEs is a central issue. For some kinds of MWEs, e.g., noun compounds, it could mean specifying their semantics using a static inventory of semantic relations, e.g., one derived from WordNet. In other cases, the semantics of an MWE could be expressed by a suitable paraphrase.
(3) Disambiguation. Most MWEs are ambiguous in various ways. A typical disambiguation task is to determine whether an MWE is used non-compositionally (i.e., figuratively) or compositionally (i.e., literally) in a particular context.
(4) Applications. Identifying MWEs in context and understanding their syntax and semantics is important for many natural language applications, including but not limited to question answering, machine translation, information retrieval, information extraction, and textual entailment. Still, despite the growing research interest, there are not enough successful applications to real NLP problems, which we believe are the key to the advancement of the field.
Of course, the above topics largely overlap. For example, identification can require disambiguating between literal and idiomatic uses, since MWEs are typically required to be non-compositional by definition. Similarly, interpreting three-word noun compounds like morning flight ticket and plastic water bottle requires disambiguating between a left-bracketed and a right-bracketed syntactic structure, while interpreting two-word compounds like English teacher requires disambiguating between (a) a teacher who teaches English and (b) a teacher coming from England (who could teach any subject, e.g., math).
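To make the bracketing ambiguity concrete, the following is a minimal sketch, not a method proposed by the workshop. It assumes a simple comparison in the spirit of dependency-style bracketing models: the first word attaches to whichever of the other two words it co-occurs with more often. The function name `bracket` and the counts in `TOY_COUNTS` are hypothetical and invented purely for illustration; they are not corpus statistics.

```python
from typing import Dict, Tuple

# Hypothetical co-occurrence counts, invented for illustration only.
# A real system would estimate these from a large corpus.
TOY_COUNTS: Dict[Tuple[str, str], int] = {
    ("morning", "flight"): 50,
    ("morning", "ticket"): 2,
    ("plastic", "water"): 1,
    ("plastic", "bottle"): 40,
}


def bracket(w1: str, w2: str, w3: str,
            counts: Dict[Tuple[str, str], int] = TOY_COUNTS) -> str:
    """Choose a left or right bracketing for the compound 'w1 w2 w3'.

    Left bracketing  [[w1 w2] w3]: w1 modifies w2.
    Right bracketing [w1 [w2 w3]]: w1 modifies w3.
    """
    left_evidence = counts.get((w1, w2), 0)   # support for w1 modifying w2
    right_evidence = counts.get((w1, w3), 0)  # support for w1 modifying w3
    if left_evidence > right_evidence:
        return f"[[{w1} {w2}] {w3}]"
    # Ties fall back to right bracketing; this default is an arbitrary choice.
    return f"[{w1} [{w2} {w3}]]"


if __name__ == "__main__":
    print(bracket("morning", "flight", "ticket"))
    print(bracket("plastic", "water", "bottle"))
```

With these toy counts, morning flight ticket is bracketed to the left (a ticket for a morning flight) and plastic water bottle to the right (a water bottle made of plastic), matching the two readings discussed above.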
We received 18 submissions, and, given our limited capacity as a one-day workshop, we were only able to accept 9 full papers for oral presentation, an acceptance rate of 50%.
We would like to thank the members of the Program Committee for their timely reviews. We would also like to thank the authors for their valuable contributions.
List of accepted papers:
Statistically-Driven Alignment-Based Multiword Expression Identification for Technical Domains
Helena Caseli, Aline Villavicencio, André Machado and Maria José Finatto
Re-examining Automatic Keyphrase Extraction Approaches in Scientific Articles
Su Nam Kim and Min-Yen Kan
Verb Noun Construction MWE Token Classification
Mona Diab and Pravin Bhutada
Exploiting Translational Correspondences for Pattern-Independent MWE Identification
Sina Zarrieß and Jonas Kuhn
A Re-examination of Lexical Association Measures
Hung Huu Hoang, Su Nam Kim and Min-Yen Kan
Mining Complex Predicates In Hindi Using A Parallel Hindi-English Corpus
R. Mahesh K. Sinha
Improving Statistical Machine Translation Using Domain Bilingual Multiword Expressions
Zhixiang Ren, Yajuan Lü, Jie Cao, Qun Liu and Yun Huang
Bottom-up Named Entity Recognition using Two-stage Machine Learning Method
Hirotaka Funayama, Tomohide Shibata and Sadao Kurohashi
Abbreviation Generation for Japanese Multi-Word Expressions
Hiromi Wakaki, Hiroko Fujii, Masaru Suzuki, Mika Fukui and Kazuo Sumita
Program Committee:
* Inaki Alegria, University of the Basque Country (Spain)
* Timothy Baldwin, Stanford University (USA); University of Melbourne (Australia)
* Colin Bannard, Max Planck Institute (Germany)
* Francis Bond, National Institute of Information and Communications Technology (Japan)
* Gael Dias, Beira Interior University (Portugal)
* Ulrich Heid, Stuttgart University (Germany)
* Stefan Evert, University of Osnabrueck (Germany)
* Afsaneh Fazly, University of Toronto (Canada)
* Nicole Gregoire, University of Utrecht (The Netherlands)
* Roxana Girju, University of Illinois at Urbana-Champaign (USA)
* Kyo Kageura, University of Tokyo (Japan)
* Brigitte Krenn, Austrian Research Institute for Artificial Intelligence (Austria)
* Eric Laporte, University of Marne-la-Vallée (France)
* Rosamund Moon, University of Birmingham (UK)
* Diana McCarthy, University of Sussex (UK)
* Jan Odijk, University of Utrecht (The Netherlands)
* Stephan Oepen, University of Oslo (Norway)
* Darren Pearce, London Knowledge Lab (UK)
* Pavel Pecina, Charles University (Czech Republic)
* Scott Piao, University of Manchester (UK)
* Violeta Seretan, University of Geneva (Switzerland)
* Stan Szpakowicz, University of Ottawa (Canada)
* Beata Trawinski, University of Tuebingen (Germany)
* Peter Turney, National Research Council of Canada (Canada)
* Kiyoko Uchiyama, Keio University (Japan)
* Begona Villada Moiron, University of Groningen (The Netherlands)
* Aline Villavicencio, Federal University of Rio Grande do Sul (Brazil)
Organizers:
* Dimitra Anastasiou, Localisation Research Centre, Limerick University, Ireland
* Chikara Hashimoto, National Institute of Information and Communications Technology, Japan
* Preslav Nakov, National University of Singapore, Singapore
* Su Nam Kim, University of Melbourne, Australia