
[eu_members at aclweb dot org] China: CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP2010), after Coling 2010 -- CFP and Bake-Off Tasks



*********************************************************
    We apologize if you receive duplicates of this CFP
  Please feel free to distribute it to those who might be interested.
*********************************************************
 
Deadline of registration for Bake-offs:    May 1st, 2010
 
Paper Submission Deadline:                May 30, 2010
Notice: Registration for the bake-offs closes on May 1st, 2010. Registrations may be submitted at any time before then.

 

Call for Papers: CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP2010)

 

http://www.cipsc.org.cn/clp2010/cfp.htm

Background and Goals

 

With the rapid expansion of Chinese language materials on the Internet, the use of natural language technology as a way of harnessing Chinese language content is drawing growing interest from researchers around the globe. The rise of China as a global power with increasing influence on the world stage is only fanning this interest. The Chinese language also has a number of characteristics that make Chinese language processing particularly challenging and intellectually rewarding. For example, written Chinese text does not have conventionalized word boundaries like English and other Western languages, and researchers have devoted an enormous amount of energy to figuring out the best way to identify words, which is generally considered to be the first step for more advanced language processing tasks. There have been four successful international Chinese word segmentation bakeoffs sponsored by the ACL Special Interest Group on Chinese Language Processing (SIGHAN), and they have drawn wide participation and have greatly advanced the state of the art in this area. The Chinese language is also characterized by the lack of formal devices such as morphological tense and number that often provide important clues for shallow language processing tasks like part-of-speech tagging and syntactic chunking. As a result, solutions to Chinese language processing problems often require more sophisticated language processing techniques that are capable of drawing inferences from more subtle information.

 

Against this backdrop, the first conference on Chinese Language Processing (CLP2010), jointly organized by the Chinese Information Processing Society of China (CIPS) and SIGHAN, will be held on August 28-29, 2010 in Beijing, right after COLING 2010 and at the same venue. The goal is to bring together both established and aspiring researchers from around the globe and provide a unified forum for them to showcase their research achievements, share their ideas, and frame research problems that are crucial to advancing the state of the art in Chinese language processing.

Papers are invited on substantial, original and unpublished research on all aspects of Chinese language processing, including but not limited to:

       word segmentation

       part-of-speech tagging

       syntactic chunking and parsing

       lexical semantics

       semantic role labeling

       word sense disambiguation

       lexicon acquisition

       corpus development and language resources

       evaluation methods and user studies

       computational models of discourse

       temporal and spatial information processing

       sentiment analysis and opinion mining

       language generation

 

 

Call for Participation in the Bake-off Tasks of CLP2010

 

 

The CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP2010) will also feature four international bake-offs in Chinese language processing:

Chinese Word Segmentation

Chinese Parsing

Chinese Personal Name Disambiguation

Chinese Word Sense Induction

 

Task 1: Chinese Word Segmentation

Building on the successes of previous SIGHAN-sponsored international bakeoffs, the CIPS-SIGHAN-2010 word segmentation task will draw its training and test data from different domains to improve the robustness of current systems. In addition, selected examples for various test points will be added to expose potential problems that need to be solved to take the state of the art to the next level. The evaluation will help improve the performance of automatic segmentation for Chinese by identifying crucial language resources and new natural language processing algorithms.

Organizers: Liu, Qun   Zhao, HongMei

 

Task 2: Chinese Parsing

Chinese syntactic parsing has been a highly active research area in recent years, and there is a pressing need for a common evaluation platform where different approaches can be compared and progress can be gauged. The purpose of the CIPS-ParsEval campaign is to provide such a platform. The first CIPS-ParsEval (CIPS-ParsEval-2009) was successfully held in Beijing in 2009. Building on this success, the second CIPS-ParsEval (CIPS-ParsEval-2010), jointly sponsored by CIPS and SIGHAN, will be held in the summer of 2010. The hope is that through such evaluation campaigns, more advanced Chinese syntactic parsing techniques will emerge, more effective Chinese language processing resources will be built, and the state of the art will be advanced as a result.

This evaluation includes two sub-tasks: sub-sentence parsing and complete sentence parsing. For complex sentences, the performance of automatic parsers will be evaluated at three different levels (phrase level, simple sentence level and complex sentence level).

For each sub-task, there are two tracks. 1) In the closed track, participants can only use training data provided by the organizers. 2) In the open track, participants can use any data source in addition to the training data provided by the organizers. Entries in the two tracks will be evaluated separately.

In addition, single systems and combined systems will be evaluated separately in the closed track. 1) Single system: parsers that use a single parsing model to accomplish the parsing task. 2) System combination: participants are allowed to combine multiple models to improve performance. Collaborative decoding methods will be regarded as a combination method.

Organizers: Zhou, Qiang    Zhu, Jingbo

 

Task 3: Chinese Personal Name Disambiguation

Personal names are usually highly ambiguous in text because different people may have the same name and the same name can be written in different ways. Solving this problem will have a huge impact on the accuracy of web search and potentially other natural language applications. There have been two recent Web People Search (WePS) evaluation campaigns on personal name disambiguation using data from English language web pages. Chinese personal name disambiguation is potentially more challenging due to the need for word segmentation, which could introduce errors that can in large part be avoided in the English task. The Chinese personal name disambiguation task will thus be an adapted version of the English WePS task that takes word segmentation into account.

Organizers: Li, Maggie   Huang, Chu-Ren    Chen, Ying   Jin, Peng

 

Task 4: Chinese Word Sense Induction

The use of word senses instead of word forms has been shown to improve performance in information retrieval, information extraction and machine translation. Word Sense Disambiguation generally requires the use of large-scale manually annotated lexical resources. Word Sense Induction (WSI) can overcome this limitation, and it has become one of the most important topics in current computational linguistics research.

Compared with European languages such as English, WSI and WSD have received far less study in Chinese. In addition, Chinese word senses have their own characteristics, so methods that work well for English may not work well for Chinese. This task is intended to promote the exchange of ideas among participants and improve the performance of Chinese WSI systems.

Organizers: Sun, Le    Dong, Qiang    Zhang, Zhenzhong

 

Please visit the website (http://www.cipsc.org.cn/clp2010/cfpa.htm) for details on these competitions.