
[eu_members at aclweb dot org] Colorado: NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing -- Call for Participation


                      CALL FOR PARTICIPATION

                     NAACL HLT 2009 Workshop on
           Active Learning for Natural Language Processing

               June 5, 2009, Boulder, Colorado, USA

      Endorsed by the following ACL Special Interest Groups:
     • Special Interest Group on Natural Language Learning (SIGNLL)
     • Special Interest Group for Annotation (SIGANN)


Labeled data is a prerequisite for many popular algorithms in natural
language processing and machine learning.  While it is possible to
obtain large amounts of annotated data for well-studied languages in
well-studied domains and well-studied problems, labeled data is rarely
available for less common languages, domains, or problems.
Unfortunately, obtaining human annotations for linguistic data is
labor-intensive and typically the costliest part of the acquisition of
an annotated corpus.

Previous work has shown that active learning can reduce annotation
costs without sacrificing quality.  While diverse work
over the past decade has demonstrated the possible advantages of active
learning for corpus annotation and NLP applications, active learning is
not widely used in many ongoing data annotation tasks.  Much of the
machine learning literature on the topic has focused on active learning
for classification problems with less attention devoted to the kinds of
problems encountered in NLP.

This workshop aims to bring together researchers to explore the
challenges and opportunities of active learning for NLP tasks, language
acquisition, and language learning.

8:30     Welcome: Eric Ringger, Robbie Haertel, Katrin Tomanek

9:00     Invited talk: Active Learning for NLP: Past, Present, and
Future (Burr Settles, University of Wisconsin)

*Session 1: Anaphora Resolution*

10:00     Active learning for anaphora resolution (Caroline Gasperin)
10:30     Break

*Session 2: Multiple Annotators and Cost Considerations*

11:00     On Proper Unit Selection in Active Learning: Co-Selection
Effects for Named Entity Recognition (Katrin Tomanek, Florian Laws, Udo
Hahn and Hinrich Schütze)

11:30     Estimating Annotation Cost for Active Learning in a
Multi-Annotator Environment (Shilpa Arora, Eric H. Nyberg and Carolyn P.

12:00     Data Quality from Crowd-sourcing: A Study of Annotation
Selection Criteria for Sentiment Analysis (Pei-Yun Hsueh, Prem Melville
and Vikas Sindhwani)

12:30     Lunch

*Session 3: Real Annotators and Experts*

2:00     Evaluating automation strategies in language documentation
(Alexis Palmer, Jason Baldridge and Taesun Moon)

2:30     A Web Survey on the Use of Active Learning to support
Annotation of Text Data (Katrin Tomanek and Fredrik Olsson)

3:00     Invited talk: Return on Investment for Active Learning (Robbie
Haertel, Brigham Young University)

3:30     Break

*Session 4: New Methods*

4:00     Active Dual Supervision: Reducing the Cost of Annotating
Examples and Features (Prem Melville and Vikas Sindhwani)

4:30     Proactive Learning for Building Machine Translation Systems for
Minority Languages (Vamshi Ambati and Jaime Carbonell)

5:00     Discussion

5:30     End of Workshop


You can register online before May 23, or on-site in the foyer outside
the conference rooms.
More information can be found at the main conference website:

Organizers and contact

Eric Ringger, Brigham Young University, USA
Robbie Haertel, Brigham Young University, USA
Katrin Tomanek, University of Jena, Germany

Please address any queries regarding the workshop to:
al dot nlp2009 at googlemail dot com

Program committee

Shlomo Argamon (Illinois Institute of Technology, USA)
Jason Baldridge (University of Texas at Austin, USA)
Markus Becker (SPSS, UK)
Ken Church (Microsoft Research, USA)
Hal Daume (University of Utah, USA)
Robbie Haertel (Brigham Young University, USA)
Ben Hachey (University of Edinburgh, UK)
Udo Hahn (University of Jena, Germany)
Eric Horvitz (Microsoft Research, USA)
Rebecca Hwa (University of Pittsburgh, USA)
Ashish Kapoor (Microsoft Research, USA)
Mark Liberman (University of Pennsylvania/LDC, USA)
Prem Melville (IBM T.J. Watson Research Center, USA)
Ray Mooney (University of Texas at Austin, USA)
Miles Osborne (University of Edinburgh, UK)
Eric Ringger (Brigham Young University, USA)
Kevin Seppi (Brigham Young University, USA)
Burr Settles (University of Wisconsin, USA)
Victor Sheng (New York University, USA)
Katrin Tomanek (University of Jena, Germany)
Jingbo Zhu (Northeastern University, China)