FSMNLP 2005: call for papers
Finite-State Methods and Natural Language Processing
FSMNLP 2005
Fifth International Workshop
University of Helsinki, Finland
1 - 2 September 2005
http://www.ling.helsinki.fi/events/FSMNLP2005
Papers due: 9 May 2005 (EXTENDED DEADLINE)
The aim of FSMNLP 2005 is to bring together members of
the academic, research, and industrial community working on
finite-state based models in language technology, computational
linguistics, linguistics and cognitive science or on related theory
or methods in fields such as computer science and mathematics.
The workshop will be a forum for researchers working
- on NLP applications,
- on the theoretical and implementation aspects, or
- on their combination.
We invite novel, high-quality papers on themes
including, but not limited to:
- NLP applications and linguistic aspects of finite-state methods
The topic includes but is not restricted to:
– speech, sign language, phonology, hyphenation, prosody
– scripts, text normalization, segmentation,
tokenization, indexing
– morphology, stemming, lemmatisation, information
retrieval, spelling correction
– syntax, POS tagging, partial parsing, disambiguation,
information extraction
– machine translation, translation memories, glossing,
dialect adaptation
– annotated corpora and treebanks, semi-automatic
annotation, error mining, searching
- Finite-state models of language
With this more focused topic (within topic 1, NLP applications) we invite papers on
aspects that motivate the sufficiency of finite-state methods, or subsets of them,
for capturing various requirements of natural language processing.
The topic includes but is not restricted to:
– performance, linguistic applicability, finite-state
hypotheses
– Zipf's law and coverage, model checking against finite corpora
– regular approximations under parameterized complexity, limitations and definitions of relevant complexities such as ambiguity,
recursion, crossings, rule applications, constraint
violations, reduplication, exponents,
discontinuity, path-width, and induction depth
– similarity inferences, dissimilation, segmental length, counter-freeness, asynchronous machines
– garden-path sentences, deterministic parsing, expected
parses, Markov chains
– incremental parsing, uncertainty, reliability/variance in
stochastic parsing, linear sequential machines
- Practices for building lexical transducers for the world's
languages.
This topic addresses the usability of finite-state methods in NLP.
It includes but is not restricted to:
– required user training and consultation, learning curve of
non-specialists
– questionnaires, discovery methods, adaptive computer-aided
glossing and interlinearization
– example-based grammars, semi-automatic learning,
user-driven learning (see also topic 6, machine learning)
– low literacy level and restricted availability of training data,
writing systems/phonology under development, new non-Roman
scripts, endangered languages
– linguist's workbenches, stealth-to-wealth parser development
– experiences of using existing
tools (e.g. TWOL) for computational morphology and phonology
- Specification and implementation of sets, relations and
multiplicities in NLP using finite automata
The topic includes but is not restricted to:
– regular rule formalisms, grammar systems, expressions, operations, closure properties, complexities
– algorithms for compilation, approximation,
manipulation, optimization, and lazy evaluation of finite machines
– finite string and tree automata, transducers, morphisms and bimorphisms
– weights, registers, multiple tapes, alphabets, state covers
and partitions, representations
– locality, constraint propagation, star-free languages,
data vs. query complexity
– logical specification, MSO(SLR,matches),
FO(Str,<), LTL, generalized
restriction, local grammars
- Constraint-based grammars and k-ary regular relations
With this more focused topic (within topic 4, specification and implementation) we invite
researchers from related fields (computational linguists, mathematicians and computer
scientists) into a discussion motivated by constraint-based, declarative approaches to
morphology/phonology and the computational problems related to them. For example, regular
relations in general are not closed under intersection, but the restricted
use of intersection has proven useful in computational phonology and morphology
and in implementations such as KIMMO, PC-KIMMO, TWOLC, SEMHE, AMAR, and WFSC.
New useful approaches and implementations may emerge in the future.
These approaches may also spread to other application areas in natural
language processing, including finite-state syntax and query
languages for parallel annotations in linguistic corpora.
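The non-closure remark above can be illustrated with a textbook counterexample. The following is a minimal sketch, not taken from any of the tools named here; the function names r1, r2 and intersection_domain are hypothetical, and the relations are only enumerated up to a finite bound. R1 = {(a^n b^m, c^n)} and R2 = {(a^n b^m, c^m)} are both regular relations, yet their intersection pairs a^n b^n with c^n, and its input side {a^n b^n} is a well-known non-regular language.

```python
from itertools import product


def r1(bound):
    """Finite slice of R1 = {(a^n b^m, c^n)}: output length copies the a-count."""
    return {("a" * n + "b" * m, "c" * n)
            for n, m in product(range(bound + 1), repeat=2)}


def r2(bound):
    """Finite slice of R2 = {(a^n b^m, c^m)}: output length copies the b-count."""
    return {("a" * n + "b" * m, "c" * m)
            for n, m in product(range(bound + 1), repeat=2)}


def intersection_domain(bound):
    """Input strings of the pairs that belong to both relations, shortest first."""
    common = r1(bound) & r2(bound)
    return sorted((x for x, _ in common), key=len)


if __name__ == "__main__":
    # Only inputs with equally many a's and b's survive the intersection.
    print(intersection_domain(3))
```

The enumeration only shows the pattern on a finite slice and is not a proof. The general argument is that the input side of a regular relation is always a regular language, which {a^n b^n} is not. Same-length relations, by contrast, avoid this problem, which is one reason restricted intersection is usable in two-level systems.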
The topic includes but is not restricted to:
– multi-tape automata, same-length relations and partition-based
morphology, Semitic morphology
– autosegmental phonology, shuffle, trajectories, synchronization, segmental anchoring, alignment constraints, syllable
structure, partial-order reductions
– problems related to auto-intersection of multi-tape automata, e.g. the marked Post Correspondence Problem
– varieties of regular languages and relations,
descriptive complexity of finite-state based grammars
– automaton-based approaches to declarative constraint
grammars, constraints in optimality theory
– parallel corpus annotations, register automata, acyclic timed automata
- Machine learning of finite-state models of natural language
This topic includes but is not restricted to:
– learning regular rule systems, learning topologies of finite automata and transducers
– parameter estimation and smoothing, lexical openness
– computer-driven grammar writing, user-driven grammar
learning, discovery procedures
– data scarcity, realistic variations of Gold's model,
learnability and cognitive science
– incompletely specified finite-state networks
– model-theoretic grammars, gradient well/ill-formedness
- Finite-state manipulation software (with relevance to the above
themes)
This topic includes but is not restricted to:
– regular expression pre-compilers such as regexopt and xfst2fsa,
standards and interfaces for finite-state based software
components, conversion tools
– tools such as LEXC, Lextools, Intex, XFST, FSM, GRM, WFSC,
FIRE Engine, FADD, FSA/UTR, SRILM, FIRE Station and Grail
– free or almost free software such as MIT FST, Carmel, RWTH FSA,
FSA Utilities, Unitex, OpenFIRE, Vaucanson, SFST, PCKIMMO, MONA,
Hopskip, ASTL, UCFSM, HaLeX, SML, and WFST
– results obtainable with such exploration tools as automata,
Autographe, Amore, and TESTAS
– visualization tools such as Graphviz and Vaucanson-G
– language-specific resources and descriptions, freely available benchmarking resources
The descriptions of the topics above are not meant to be complete, and
should be read as extending to all traditional FSMNLP topics. Submitted
papers or abstracts may fall into several categories.
IMPORTANT DATES
| Paper/poster submissions due: | 9 May (extended) |
| Notifications sent out: | 31 May (extended) |
| Deadline for early registration: | 10 June |
| Abstracts for software demos due: | 10 June |
| Final versions due: | 20 June |
SUBMISSION PROCEDURE
We expect three kinds of submissions:
- full papers,
- interactive presentations (posters) and
- software demos.
Submissions are made electronically, in PDF format, via a web-based
submission server. Authors are encouraged to use the Springer LNCS styles
for LaTeX in producing the PDF document. Information about the
author(s) should be omitted from submitted papers.
PRE-PROCEEDINGS
The papers and abstracts will be included on a CD-ROM that will be
distributed to the participants of the workshop.
REFEREED POST-PROCEEDINGS
Revised versions of the papers will be published by Springer in the
FSMNLP 2005 post-proceedings. The FSMNLP 2005 post-proceedings will
appear in the series of Lecture Notes in Artificial Intelligence.
After earlier FSMNLP workshops, the following special journal
issues have been published:
This time, the refereed LNAI proceedings may suffice,
although a special journal issue
(consisting of some extended papers) will be considered if there is sufficient demand.
PROGRAM COMMITTEE
Steven Bird (University of Melbourne, Australia)
— Francisco Casacuberta (Universitat Politècnica de València, Spain)
— Jean-Marc Champarnaud (Université de Rouen, France)
— Jan Daciuk (Gdansk University of Technology, Poland)
— Jason Eisner (Johns Hopkins University, USA)
— Tero Harju (University of Turku, Finland)
— Arvi Hurskainen (Institute for Asian and African Studies, University of Helsinki, Finland)
— Juhani Karhumäki (University of Turku, Finland, co-chair)
— Lauri Karttunen (PARC and Stanford University, USA, co-chair)
— André Kempe (Xerox Research Centre Europe, France)
— George Anton Kiraz (Beth Mardutho: The Syriac Institute, USA)
— Andras Kornai (Budapest Institute of Technology, Hungary)
— Terence Langendoen (University of Arizona, USA)
— Eric Laporte (Université de Marne-la-Vallée, France)
— Mike Maxwell (Linguistic Data Consortium, USA)
— Mark-Jan Nederhof (University of Groningen, the Netherlands)
— Gertjan van Noord (University of Groningen, the Netherlands)
— Kemal Oflazer (Sabanci University, Turkey)
— Jean-Eric Pin (CNRS/University Paris 7, France)
— James Rogers (Earlham College, USA)
— Giorgio Satta (University of Padua, Italy)
— Jacques Sakarovitch (CNRS/ENST, France)
— Richard Sproat (University of Illinois at Urbana-Champaign, USA)
— Nathan Vaillette (University of Tübingen, Germany)
— Atro Voutilainen (Connexor, Finland)
— Bruce W. Watson (University of Pretoria, South Africa)
— Shuly Wintner (University of Haifa, Israel)
— Sheng Yu (University of Western Ontario, Canada)
— Lynette van Zijl (Stellenbosch University, South Africa)
REGISTRATION
Information about the registration procedure will be provided
later. The registration fee is normally 100 EUR, with a
reduced fee for students.
ORGANIZATION
The workshop will take place at the University of Helsinki. The
organizing institution is the Department of General
Linguistics at the University of Helsinki. The chair of the
organizing committee is Anssi Yli-Jyrä at CSC - Scientific
Computing Ltd. Several academic institutions and
disciplines are represented in the steering committee.
The workshop is a follow-up to earlier workshops and
continues their dynamic, changing tradition. FSMNLP workshops
have traditionally had tutorial lessons and/or invited
speakers. These workshops and courses have been held under different
names and at different time intervals:
CO-LOCATED EVENTS
Information on co-located events (AWL meeting, tutorial day and a
special colloquium) is available at http://www.ling.helsinki.fi/events/FSMNLP2005/satellite.shtml.