Finite-State Methods and
Natural Language Processing 2005

1 - 2 September 2005

The Fifth International Workshop
in the Series of Workshops on

Finite-State Methods in Natural Language Processing



Automata, Words and Logic (AWL) meeting
 —The program and talk abstracts of AWL
  31 August 2005
A Two-Level Morphology Day (TWOL Day)
 —The program and talk abstracts of TWOLDAY
  31 August 2005
The FSMNLP workshop
  1-2 September 2005
Colloquium: Words, Contexts and Constructs   2 September 2005 (5 p.m.)


The Department of General Linguistics, University of Helsinki, will organize FSMNLP 2005, the Finite-State Methods and Natural Language Processing workshop. The FSMNLP workshop will be held on the downtown premises of the University of Helsinki. CSC Scientific Computing Ltd, the Finnish IT center for science, is a co-organizer.





LATEST NEWS ON 31st August 2005

  • On 31st August, TWOLDAY and AWL start
  • On 31st August, CD preproceedings burned
  • On 30th August, preproceedings available
  • On 11th August, more information on the invited talks was added
  • On 10th August, decisions on the first TWOL Day papers
  • On 22nd July, TWOL Day Call for papers
  • On 2nd June, the notifications were sent out
  • On 9th May, the submission deadline passed
  • On 25th April, the submission server is available again.
  • On 19th April, the submission and notification deadlines were extended
  • On 18th - 22nd April, the submission service is down because of the relocation and reinstallation of the IBMSC supercomputer at CSC.
  • On 29th March, the submission server was opened and the second CFP was issued.
  • On 7th March 2005, we agreed with Springer on plans to publish the postproceedings of the FSMNLP 2005 workshop in the Lecture Notes in Artificial Intelligence.


FSMNLP 2005: call for papers

Second Call for Papers

Finite-State Methods and Natural Language Processing


Fifth International Workshop

University of Helsinki, Finland
1 - 2 September 2005

Papers due: 25th April 2005

The aim of FSMNLP 2005 is to bring together members of the academic, research, and industrial communities working on finite-state based models in language technology, computational linguistics, linguistics and cognitive science, or on related theory or methods in fields such as computer science and mathematics. The workshop will be a forum for researchers working

  • on NLP applications,
  • on the theoretical and implementation aspects, or
  • on their combination.

We invite novel high-quality papers related to themes including but not limited to:

  1. NLP applications and linguistic aspects of finite-state methods

    The topic includes but is not restricted to:
    – speech, sign language, phonology, hyphenation, prosody
    – scripts, text normalization, segmentation, tokenization, indexing
    – morphology, stemming, lemmatisation, information retrieval, spelling correction
    – syntax, POS tagging, partial parsing, disambiguation, information extraction
    – machine translation, translation memories, glossing, dialect adaptation
    – annotated corpora and treebanks, semi-automatic annotation, error mining, searching

  2. Finite-state models of language

    With this more focused topic (inside 1) we invite papers on aspects that motivate sufficiency of finite-state methods or their subsets for capturing various requirements of natural language processing.

    The topic includes but is not restricted to:
    – performance, linguistic applicability, finite-state hypotheses
    – Zipf's law and coverage, model checking against finite corpora
    – regular approximations under parameterized complexity, limitations and definitions of relevant complexities such as ambiguity, recursion, crossings, rule applications, constraint violations, reduplication, exponents, discontinuity, path-width, and induction depth
    – similarity inferences, dissimilation, segmental length, counter-freeness, asynchronous machines
    – garden-path sentences, deterministic parsing, expected parses, Markov chains
    – incremental parsing, uncertainty, reliability/variance in stochastic parsing, linear sequential machines

  3. Practices for building lexical transducers for the world's languages.

    The topic accounts for usability of finite-state methods in NLP. It includes but is not restricted to:
    – required user training and consultation, learning curve of non-specialists
    – questionnaires, discovery methods, adaptive computer-aided glossing and interlinearization
    – example-based grammars, semi-automatic learning, user-driven learning (see topic 6 too)
    – low literacy level and restricted availability of training data, writing systems/phonology under development, new non-Roman scripts, endangered languages
    – linguist's workbenches, stealth-to-wealth parser development
    – experiences of using existing tools (e.g. TWOL) for computational morphology and phonology

  4. Specification and implementation of sets, relations and multiplicities in NLP using finite automata

    The topic includes but is not restricted to:
    – regular rule formalisms, grammar systems, expressions, operations, closure properties, complexities
    – algorithms for compilation, approximation, manipulation, optimization, and lazy evaluation of finite machines
    – finite string and tree automata, transducers, morphisms and bimorphisms
    – weights, registers, multiple tapes, alphabets, state covers and partitions, representations
    – locality, constraint propagation, star-free languages, data vs. query complexity
    – logical specification, MSO(SLR,matches), FO(Str,<), LTL, generalized restriction, local grammars

  5. Constraint-based grammars and k-ary regular relations

    With this more focused topic (inside 4) we invite researchers from related fields (computational linguists, mathematicians and computer scientists) into a discussion that is motivated by constraint-based, declarative approaches to morphology/phonology and computational problems related to them. For example, regular relations in general are not closed under intersection, but restricted use of intersection of relations has proven useful in computational phonology and morphology and in implementations such as KIMMO, PC-KIMMO, TWOLC, SEMHE, AMAR, WFSC, etc. In the future, new useful approaches and implementations may come up. The approaches may also propagate to other application areas in natural language processing, including finite-state syntax and query languages for parallel annotations in linguistic corpora.

    The topic includes but is not restricted to:
    – multi-tape automata, same-length relations and partition-based morphology, Semitic morphology
    – autosegmental phonology, shuffle, trajectories, synchronization, segmental anchoring, alignment constraints, syllable structure, partial-order reductions
    – problems related to auto-intersection of multi-tape automata e.g. marked Post Correspondence Problem
    – varieties of regular languages and relations, descriptive complexity of finite-state based grammars
    – automaton-based approaches to declarative constraint grammars, constraints in optimality theory
    – parallel corpus annotations, register automata, acyclic timed automata

  6. Machine learning of finite-state models of natural language

    This topic includes but is not restricted to:
    – learning regular rule systems, learning topologies of finite automata and transducers
    – parameter estimation and smoothing, lexical openness
    – computer-driven grammar writing, user-driven grammar learning, discovery procedures
    – data scarcity, realistic variations of Gold's model, learnability and cognitive science
    – incompletely specified finite-state networks
    – model-theoretic grammars, gradient well/ill-formedness

  7. Finite-state manipulation software (with relevance to the above themes)

    This topic includes but is not restricted to:
    – regular expression pre-compilers such as regexopt, xfst2fsa, standards and interfaces for finite-state based software components, conversion tools
    – tools such as LEXC, Lextools, Intex, XFST, FSM, GRM, WFSC, FIRE Engine, FADD, FSA/UTR, SRILM, OMAC FSM library, FIRE Station and Grail
    – free or almost free software such as MIT FST, Carmel, RWTH FSA, FSA Utilities, Unitex, OpenFIRE, Vaucanson, SFST, PCKIMMO, MONA, Hopskip, ASTL, UCFSM, HaLeX, SML, and WFST
    – results obtainable with such exploration tools as automata, Autographe, Amore, and TESTAS
    – visualization tools such as Graphviz and Vaucanson-G
    – language-specific resources and descriptions, freely available benchmarking resources

The descriptions of the topics above are not meant to be complete, and should extend to cover all traditional FSMNLP topics. Submitted papers or abstracts may fall in several categories.
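Topic 5 above notes that regular relations are not closed under intersection. The following is a minimal Python sketch of a standard textbook illustration of that fact; the relations R1 and R2 and the function names are chosen here for illustration and do not come from any of the tools listed in this call. R1 pairs a^n b^m with c^n and R2 pairs a^n b^m with c^m. Each can be recognized by a finite-state transducer, but their intersection pairs a^n b^n with c^n, and its domain a^n b^n is not a regular language.

```python
from itertools import product


def r1(limit):
    """Finite slice of R1 = {(a^n b^m, c^n)} for n, m < limit."""
    return {("a" * n + "b" * m, "c" * n)
            for n, m in product(range(limit), repeat=2)}


def r2(limit):
    """Finite slice of R2 = {(a^n b^m, c^m)} for n, m < limit."""
    return {("a" * n + "b" * m, "c" * m)
            for n, m in product(range(limit), repeat=2)}


def intersection_domain(limit):
    """Input sides of the pairs that belong to both slices, shortest first."""
    both = r1(limit) & r2(limit)
    return sorted({upper for upper, _ in both}, key=len)


if __name__ == "__main__":
    # Only strings with equally many a's and b's survive the intersection.
    print(intersection_domain(4))
```

The enumeration only inspects a finite slice, so it illustrates the pattern rather than proving non-regularity; the proof relies on the fact that the domain of a regular relation is always a regular language.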


Paper/poster submissions due: 25th April
Notifications sent out: 25th May
Deadline for early registration: 10th June
Abstracts for software demos due: 10th June
Final versions due: 20th June


We expect three kinds of submissions:

  1. full papers,
  2. interactive presentations (posters) and
  3. software demos.

Submissions are electronic and in PDF format via a web-based submission server. Authors are encouraged to use Springer LNCS styles for LaTeX in producing the PDF document. The information about the author(s) should be omitted in the submitted papers.


The papers and abstracts will be included on a CDROM that will be distributed to the participants of the workshop.


Revised versions of the papers will be published by Springer in the FSMNLP 2005 post-proceedings. The FSMNLP 2005 post-proceedings will appear in the series of Lecture Notes in Artificial Intelligence.

After earlier FSMNLP workshops, the following special journal issues have been published:

This time, we might be satisfied with the refereed LNAI proceedings, although the possibility of having a special journal issue (consisting of some extended papers) is considered on sufficient demand.


Steven Bird (University of Melbourne, Australia) — Francisco Casacuberta (Universitat Politècnica de València, Spain) — Jean-Marc Champarnaud (Université de Rouen, France) — Jan Daciuk (Gdansk University of Technology, Poland) — Jason Eisner (Johns Hopkins University, USA) — Tero Harju (University of Turku, Finland) — Arvi Hurskainen (Institute for Asian and African Studies, University of Helsinki, Finland) — Juhani Karhumäki (University of Turku, Finland, co-chair) — Lauri Karttunen (PARC and Stanford University, USA, co-chair) — André Kempe (Xerox Research Centre Europe, France) — George Anton Kiraz (Beth Mardutho: The Syriac Institute, USA) — Andras Kornai (Budapest Institute of Technology, Hungary) — Terence Langendoen (University of Arizona, USA) — Eric Laporte (Université de Marne-la-Vallée, France) — Mike Maxwell (Linguistic Data Consortium, USA) — Mark-Jan Nederhof (University of Groningen, the Netherlands) — Gertjan van Noord (University of Groningen, the Netherlands) — Kemal Oflazer (Sabanci University, Turkey) — Jean-Eric Pin (CNRS/University Paris 7, France) — James Rogers (Earlham College, USA) — Giorgio Satta (University of Padua, Italy) — Jacques Sakarovitch (CNRS/ENST, France) — Richard Sproat (University of Illinois at Urbana-Champaign, USA) — Nathan Vaillette (University of Tübingen, Germany) — Atro Voutilainen (Connexor, Finland) — Bruce W. Watson (University of Pretoria, South Africa) — Shuly Wintner (University of Haifa, Israel) — Sheng Yu (University of Western Ontario, Canada) — Lynette van Zijl (Stellenbosch University, South Africa)


Information about the registration procedure will be provided later. The participant's registration fee is normally 100 EUR, with a reduced fee for students.


The workshop will take place at the University of Helsinki. The organizing institution is the Department of General Linguistics at the University of Helsinki. The chair of the organizing committee is Anssi Yli-Jyrä at CSC Scientific Computing Ltd. Several academic institutions and disciplines are represented in the steering committee.

The workshop is a follow-up to some earlier workshops, but also continues their dynamic, changing tradition. FSMNLP workshops have traditionally had tutorial lessons and/or invited speakers. These workshops and courses have been held under different names and at different intervals:


Information on co-located events (AWL meeting, tutorial day and a special colloquium) is available at



  • 28 June 2005 — Final Versions of Accepted Papers (preferably earlier)
  • 10 June 2005 — Abstracts for Software Demos Due
  • 1 June 2005 — Registration started

  • 31 May 2005 — Extended Notification Deadline (original deadline 25th May 2005)

  • 9 May 2005 — Extended Submission Deadline(original deadline 25th April 2005)

Check Calendar for the year 2005.

Conference Time

Automata, Words and Logic     31 August 2005
A Two-Level Morphology Day     31 August 2005
The FSMNLP workshop   1-2 September 2005



Instructions for preparing the final (Springer LNAI) version

The instructions for preparing the final version of the articles or abstracts for the official proceedings are HERE.

Instructions for preparing the final (preproceedings) version

The instructions for preparing the final version of the articles or abstracts for the workshop preproceedings are HERE.

The submission procedure

We received three kinds of submissions:

  1. full papers,
  2. interactive presentations (posters) and
  3. software demos.

The submission of full papers is now closed. The instructions for submissions can be found HERE.



Program Committee

Bird, Steven (University of Melbourne, Australia)
Casacuberta, Francisco (Universitat Politècnica de València, Spain)
Champarnaud, Jean-Marc (Université de Rouen, France)
Daciuk, Jan (Gdansk University of Technology, Poland)
Eisner, Jason (Johns Hopkins University, USA)
Harju, Tero (University of Turku, Finland)
Hurskainen, Arvi (IAAS, University of Helsinki, Finland)
Karhumäki, Juhani, co-chair (University of Turku, Finland)
Karttunen, Lauri, co-chair (PARC and Stanford University, USA)
Kempe, André (Xerox Research Centre Europe, France)
Kiraz, George Anton (Beth Mardutho: The Syriac Institute, USA)
Kornai, András (Budapest Institute of Technology, Hungary)
Langendoen, D. Terence (University of Arizona, USA)
Laporte, Eric (Université de Marne-la-Vallée, France)
Maxwell, Mike (Linguistic Data Consortium, USA)
Nederhof, Mark-Jan (University of Groningen, the Netherlands)
van Noord, Gertjan (University of Groningen, the Netherlands)
Oflazer, Kemal (Sabanci University, Turkey)
Pin, Jean-Eric (CNRS/University Paris 7, France)
Rogers, James (Earlham College, USA)
Satta, Giorgio (University of Padua, Italy)
Sakarovitch, Jacques (CNRS/ENST, France)
Sproat, Richard (University of Illinois at Urbana-Champaign, USA)
Vaillette, Nathan (University of Tübingen, Germany)
Voutilainen, Atro (Connexor Oy, Finland)
Watson, Bruce W. (University of Pretoria, South Africa)
Wintner, Shuly (University of Haifa, Israel)
Yu, Sheng (University of Western Ontario, Canada)
van Zijl, Lynette (Stellenbosch University, South Africa)

Local Organizers

Anssi Yli-Jyrä, chair (CSC – Scientific Computing Ltd., Finland)
Antti Arppe (University of Helsinki, Finland)
Hanna Westerlund (University of Helsinki, Finland)
Sari Hyvärinen (University of Helsinki, Finland)

Steering Committee I (FSMNLP traditions)

Karttunen, Lauri (PARC and Stanford University, USA)
Koskenniemi, Kimmo (University of Helsinki, Finland)
van Noord, Gertjan (University of Groningen, the Netherlands) (unconfirmed)
Oflazer, Kemal (Sabanci University, Turkey)

Steering Committee II (local aspects)

Carlson, Lauri (Dept. of General Linguistics, University of Helsinki)
Harju, Tero (Dept. of Mathematics, University of Turku)
Hella, Lauri (Dept. of Mathematics, Statistics and Philosophy, University of Tampere)
Hurskainen, Arvi (Dept. of African Studies, IAAS, University of Helsinki)
Karlsson, Fred (Dept. of General Linguistics, University of Helsinki)
Lagus, Krista (Neural Networks Research Centre, Helsinki University of Technology)
Luosto, Kerkko (Dept. of Mathematics and Statistics, University of Helsinki)
Nykänen, Matti (Dept. of Computer Science, Helsinki University of Technology)



Talk Abstracts

Characterizations of Regularity

Tero Harju
Department of Mathematics
University of Turku, Finland

Regular languages have many different characterizations in terms of automata, congruences, semigroups etc. In this talk we look at some more recent results, obtained during the last two decades, namely characterizations using morphic compositions, equality sets and well ordered structures.

Finnish Optimality-Theoretic Prosody

Lauri Karttunen
Palo Alto Research Center
Stanford University

A well-known phenomenon in Finnish prosody is the alternation of binary and ternary feet. In native Finnish words, the primary stress falls on the first syllable. Secondary stress generally falls on every second syllable: (vói.mis).(tè.li).(jòi.ta) 'gymnasts' creating a sequence of trochaic binary feet. However, secondary stress skips a light syllable that is followed by a heavy syllable. In (vói.mis.te).(lè 'we are doing gymnastics', the first foot is ternary, a dactyl.
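The descriptive generalization in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the stated pattern, not the finite-state implementation discussed in the talk. It assumes that each word is given as a string of syllable weights, 'L' for light and 'H' for heavy, and the function name stress_positions is hypothetical. It also adds one assumption that the abstract does not state: the final syllable never receives secondary stress, and stress does not shift onto it.

```python
def stress_positions(weights):
    """Return the indices of stressed syllables.

    `weights` is a string with one character per syllable:
    'L' for a light syllable, 'H' for a heavy one.
    Index 0 carries primary stress; later indices carry secondary stress.
    """
    n = len(weights)
    if n == 0:
        return []
    stressed = [0]          # primary stress always falls on the first syllable
    last = n - 1
    i = 2                   # secondary stress is tried on every second syllable
    while i < n:
        # Skip a light syllable that is followed by a heavy one.
        # Assumption: the shift is blocked if the heavy syllable is final.
        if weights[i] == "L" and i + 1 < last and weights[i + 1] == "H":
            i += 1
        # Assumption: the final syllable is left unstressed.
        if i == last:
            break
        stressed.append(i)
        i += 2
    return stressed


if __name__ == "__main__":
    # voi.mis.te.li.joi.ta -> weights H H L L H L, binary feet throughout
    print(stress_positions("HHLLHL"))   # [0, 2, 4]
    # A hypothetical pattern H H L H L: the light third syllable is skipped
    print(stress_positions("HHLHL"))    # [0, 3]
```

With the weights L H L L L H for ka.las.te.le.mi.nen the sketch returns [0, 2, 4], which agrees with the footing (ká.las).(tè.le).(mì.nen) that the abstract gives as correct, but only because of the added final-syllable assumption.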

Within the context of Optimality Theory (OT, Prince and Smolensky 1993), it has been argued that prosodic phenomena are best explained in terms of universal metric constraints. OT constraints can be violated; no word can satisfy all of them. A language-specific ranking of the constraints makes some violations less important than others. In her 1999 dissertation, A unified account of binary and ternary stress, Nine Elenbaas gives an analysis of Finnish in which the alternation between binary and ternary feet follows as a side effect of the ordering of two particular constraints, *Lapse and *(L'.H). The *Lapse constraint stipulates that an unstressed syllable must be adjacent to a stressed syllable or to a word edge. The *(L'.H) constraint prohibits feet such as (tè.lem) where a light stressed syllable is followed by a heavy unstressed syllable. The latter constraint is of course outranked by the constraint that requires stress on the first syllable in Finnish regardless of its weight. In his 2003 article on Finnish Noun Inflection, Paul Kiparsky gives essentially the same account of the binary/ternary alternation except that he replaces the *(L'.H) rule by a more general StressToWeight constraint.

Although OT constraints themselves can be expressed in finite-state terms, Optimality Theory as a whole is not a finite-state model if it involves unbounded counting of constraint violations (Frank and Satta 1998). With that limitation OT analyses can be modelled with finite-state tools. In this paper we will give a full computational implementation of the Elenbaas and Kiparsky analyses using the extended regular expression calculus from the 2003 Beesley & Karttunen book on Finite State Morphology. Surprisingly, it turns out that Elenbaas and Kiparsky both make some incorrect predictions. For example, according to their accounts a word such as kalasteleminen 'fishing' should begin with a ternary foot: (ká.las.te).(lè.mi).nen. The correct footing is (ká.las).(tè.le).(mì.nen). There may of course be some ranking of OT constraints under which the binary/ternary alternation in Finnish comes "for free". It does not emerge from the Elenbaas and Kiparsky analyses.

This case study illustrates a more general point: Optimality Theory is computationally difficult and OT theorists are much in need of computational help.



The demo session is on Thursday, 1 September, from 18:00 to 20:00. The following demo abstracts have been accepted to the conference:
  • Markus Forsberg and Aarne Ranta:   Tool Demonstration: Functional Morphology
  • Witold Drozdzynski, Hans-Ulrich Krieger, Jakub Piskorski and Ulrich Schaefer:   SProUT - a General-Purpose NLP Framework Integrating Finite-State and Unification-based Grammar Formalisms
  • Helmut Schmid:   A Programming Language For Finite State Transducers
  • Mathias Creutz, Krista Lagus and Sami Virpioja:   Unsupervised Morphology Induction Using Morfessor
  • Børre Gaup, Sjur Moshagen, Thomas Omma, Maaren Palismaa, Tomi Pieski and Trond Trosterud:   From Xerox to Aspell: A first prototype of a North Sámi speller based on TWOL technology
  • Bruce Watson:   FIRE Station


The poster session will be held at the same time as the demo session, at the same location or close to it, in the corridors of the Arppeanum building.

More information for poster presenters is on the "Local Info" page.

  • Saba Amsalu and Dafydd Gibbon:   A complete FS model for Amharic morphographemics
  • Jose Castaño, James Pustejovsky:   Tagging with Delayed Disambiguation
  • Harald Hammarström:   A New Algorithm for Unsupervised Induction of Concatenative Morphology
  • Lotta Harjula:   Morphological Parsing of Tone
  • Iñaki Alegria, Arantza Díaz de Ilarraza, Gorka Labaka, Mikel Lersundi, Aingeru Mayor, Kepa Sarasola:   An FST grammar for verb chain transfer in a Spanish-Basque MT System
  • Arvi Hurskainen, George Poulos and Louis Louwrens:   Describing Verbs in Disjoining Writing Systems
  • Alicia Pérez, Francisco Casacuberta, M. Inés Torres, Víctor Guijarrubia:   Finite state transducers based on K-TSS grammars for speech translation



Collocated events of FSMNLP 2005 in Helsinki

There will be satellite events both before and after FSMNLP.

  • Approaches to Complexity in Language. The Linguistic Association of Finland and the Department of General Linguistics, University of Helsinki, jointly organize the symposium Approaches to Complexity in Language in Helsinki, 24 - 26 August 2005 at Tieteiden Talo.

  • Course on Minimally Supervised Induction of Morphology. Richard Wicentowski, in the KIT Graduate school / Department of General Linguistics, 22 - 26 August 2005.

  • Automata, Words and Logic (AWL) -- a meeting for automata theorists (combinatorics, model checking, finite model theory, special automata etc.).

    AWL is a forum for national mathematicians and computer scientists to present their ongoing projects or new results. It may also be of interest to the FSMNLP audience. The meeting does not have proceedings and the reviewing process is light-weight. Time: 31st August.

  • A Two-Level Morphology Day (TWOL Day) -- a mini-workshop for linguists working on two-level morphology (with PCKIMMO, TWOLC, LEXC, XFST, etc.).

    The TWOL day is a mini-workshop for project notes and discussion on applications of the existing methods for two-level finite-state morphology to less studied languages and phenomena. The purpose of the day is to exchange experiences and solutions between linguists using finite-state methods in morphology. The reviewing process is light-weight. Time: 31st August.

  • The special colloquium day after the FSMNLP workshop has been replaced with a short time slot during FSMNLP.

  • Course on Language Contact and Structural Complexity. Professor John McWhorter. September 1-2, 5-9, 12-14, 2005.

Further information to be posted here later.

Other Events with Similar Orientation

  • INTEX/NooJ 2005, Besançon, May 30-June 1, 2005
  • TALN 2005, annual conference on Natural Language Processing (le Traitement Automatique des Langues Naturelles), Dourdan (France), June 6-10, 2005
  • CIAA 2005, Tenth International Conference on Implementation and Application of Automata, June 27 - 29, 2005, Sophia Antipolis, France
  • Semitic Languages 2005, an ACL workshop on Computational Approaches to Semitic Languages, University of Michigan, Ann Arbor, June 29, 2005.
  • DCFS'05, 7th workshop on Descriptional Complexity of Formal Systems, Como, Italy, June 30 - July 2, 2005
  • SIGPhon 2004, Current Themes in Computational Phonology and Morphology, 7th Meeting of the ACL Special Interest Group in Computational Phonology, Workshop at ACL 2004, Forum Convention Centre, Barcelona, Spain, July 26, 2004
  • Finite-State Methods in NLP, ESSLLI 2005 course, Heriot-Watt University Edinburgh, Scotland, August 15 - 19, 2005
  • Prague Stringology Conference, 10th event of the Prague Stringology Club, Prague, Czech Republic, August 29 - 31, 2005
  • The "Slot" for FSMNLP 2005
  • Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005
  • EFD, FASTAR Days, not scheduled yet

Other Conferences of More General Orientation

  • AFL'05, 11th international conference on Automata and Formal Languages, Dobogókő, Hungary, May 17-20, 2005
  • AKRR'05, international and interdisciplinary conference on Adaptive Knowledge Representation and Reasoning, Espoo, Finland, June 15-17, 2005, Helsinki University of Technology
  • ICALP 2005, the 32nd International Colloquium on Automata, Languages and Programming, Lisboa, Portugal, July 11 - 15, 2005.
  • ACL 2005, 43rd annual meeting of the Association for Computational Linguistics, University of Michigan, Ann Arbor, June 25 - 30, 2005
  • DLT'05, 9th international conference Developments in Language Theory, Hotel La Torre, Mondello, Palermo, Italy, July 4-8, 2005.
  • MFCS 2005, 30th International Symposium on Mathematical Foundations of Computer Science August 29 - September 2, 2005, Gdansk, Poland

Relevant Conferences beyond the Immediate Future

  • EACL 2006, 11th tri-annual conference of the European Chapter of the Association for Computational Linguistics
  • ECAI 2006, European Conference on AI, Riva del Garda, Italy, 2006
  • WATA, Weighted Automata: Theory and Applications
  • COLING, international conference on COmputational LINGuistics



Information for Presenters

Please find information on local facilities and important instructions for presenters on the "Local Info" page.

University Main Building (auditorium IV)

Arppeanum (auditorium)

  • Wednesday 31st August, 12:00 - 18:00, Automata, Words and Logic
  • Thursday 1st September, 08:00 - 20:00, FSMNLP 2005
  • Friday 2nd September, 08:55 - 18:30, FSMNLP 2005

Time | Thu 1 Sep [ARPPEANUM auditorium] | Fri 2 Sep [ARPPEANUM auditorium]
8:00 | Registration |
8:45 | Opening |
9:00 | Han: Klex: A Finite-State Transducer Lexicon of Korean | Petersen: Principles, Implementation Strategies, and Evaluation of a Corpus Query System
9:30 | Geyken & Hanneforth: TAGH: a complete morphology for German based on WFSA | Oflazer & Dincer Erbas & Erdogmus: Using Finite State Technology in a Tool for Linguistic Exploration
10:00 | Uí Dhonnchadha: Scaling an Irish FST morphology engine for use on unrestricted text | Civera, Vilar, Cubel, Lagarda, Barrachina, Casacuberta and Vidal: A Novel Approach to Computer Assisted Translation based on FSTs
10:30 | Biro: Squeezing the Infinite into the Finite | Niemi & Carlson: Modelling the Semantics of Calendar Expressions as Extended Regular Expressions
11:00 | Coffee (Posters) | Coffee (Posters)
11:15 | Karttunen: Invited Talk: Fin(n)ish Optimality-Theoretic Prosody (PowerPoint slides) | Harju: Invited Talk: Characterizations of Regularity
12:00 | Lunch | Lunch
12:30 | Lunch | Lunch
13:00 | Hulden: Finite-state syllabification | Lakshmanan: Further Results on Syntactic Ambiguity of Internal Contextual Grammars
13:30 | Barthelemy: Partitioning Multitape Transducers | Nasr & Rambow: Parsing with Lexicalized Probabilistic Recursive Transition Networks
14:00 | Yli-Jyrä & Niemi: Pivotal Synchronization Expressions: A Formalism for Alignments in Regular String Relations | van Delden & Gomez: Improving Inter-Level Communication in Cascaded Finite-State Partial Parsers
14:30 | Cohen-Sygal & Wintner: Finite State Register Automata and Their Uses in Natural languages | Padró & Padró: Applying a Finite Automata Acquisition Algorithm to Named Entity Recognition
15:00 | Coffee (Posters) | Coffee (Posters)
15:30 | Anne Schiller: German Compound Analysis with WFSC | Volanschi & Nasr: Integrating a POS Tagger and a Chunker Implemented as WFSMs
16:00 | Kempe & Champarnaud & Guingne & Nicart: WFSM Auto-Intersection and Join Algorithms | Martin Jansche: Algorithms for Minimum Risk Chunking (cancelled due to force majeure); Miyata & Hasida: Error-Driven Learning with Bracketing Constraints
16:30 | Hanneforth: Longest-Match Pattern Matching with WFSA | (see note 2)
17:00 | Howard Johnson: Collapsing epsilon-loops in WFSMs | Colloquium: Words, Contexts and Constructs (see note 3)
17:30 | Piskorski: On Compact Storage Models for Gazetteers | Colloquium
18:00 | Demo/poster session | Colloquium
18:30 | Demo/poster session | Colloquium
19:00 | Demo/poster session | Break
19:30 | Demo/poster session | Break
20:00 | Break | Conference Dinner starts

Session Chairs

  • Morphology —Arvi Hurskainen
  • Optimality Theory —Trond Trosterud
  • Some Special FSM Families —Kimmo Koskenniemi
  • Weighted FSM Algorithms —Lauri Karttunen
  • FSM Representations —Pasi Tapanainen
  • Exploration —Lauri Carlson
  • Ordered Structures —Juhani Karhumäki
  • Surface Parsing I —Jussi Piitulainen
  • Surface Parsing II —Atro Voutilainen


(1) The above program is final.

(2) On the on-site invitation of the organizers, we heard an "encore" from Lauri Karttunen who presented an exciting and entertaining experiment with Finnish numerals.

(3) The program of the colloquium was kept secret until the last minute. The day ended with the delivery of "Words, Contexts and Constraints", A. Arppe (2005), a Festschrift in honour of Kimmo Koskenniemi on the occasion of his 60th birthday.



Workshop Venue

Travelling to and from Helsinki

Travelling in Helsinki



List of participants

The list of participants has been removed from this page.

Registration Procedure

Conference registration starts on June 1, 2005. Email or facsimile registration before the event is mandatory. You can register by sending your

  1. name,
  2. affiliation,
  3. contact information,
  4. time of arrival, and
  5. indication of your participation in the related events
    • the AWL / TWOL day (31st August),
    • the conference dinner (2nd September)
  6. any dietary restrictions
to fsmnlp(a) or to the fax number +358-9-19129307.
The subject header "FSMNLP Registration" should be used in both cases.

Early registration deadline --- REMOVED

There is no deadline for early registration, but the places for the conference dinner are limited. Therefore we advise you to register as soon as possible.

Registration fee

The registration fee will be collected in cash (euros) on the spot upon registration. It is € 50 for full-time undergraduate and MA students, and Ph.D. students without salary or a grant, and € 100 for others.

Conference dinner

The conference dinner is included in the registration fee (drinks are not included). There are no fee reductions for those who do not participate in the dinner or for those who register too late to fit in to the dinner.

Extra tickets are available for the dinner. Please inform the local organizers in advance if a companion will take part in the dinner. The fee for the dinner only is € 50, and it will also be collected upon registration.




    Organizer's Cell Phone. If you are lost and need help, you can contact Anssi Yli-Jyrä at the number +358-40-5933923.

    Police (emergency). Call +358-9-10022 in emergency. Generic emergency number (ambulance etc.) is 112. If you need to call hospital or medical first-aid in the Helsinki region, its 24-hour-number is +358-9-10023.

    The tourism kiosks, Tourist & Convention Bureau, and Sightseeing Kiosk, are close to the Senate Square and the conference location, see a map for their location.

    A WikiTravel article about Helsinki is very informative. See also Finland as facts.

Shops, Chemists etc.
    Information to be posted here.
Arppeanum Building

    Venue. See VENUE page for maps.

    The outer doors of the Arppeanum building will be open only between 8:00 and 16:00. At all other times you will need to ring the bell or call the organizer's cell phone to get in.

    Lockers. The building contains medium-size lockers in which you can store your valuables.

Coffee Breaks, Meals and the Conference Dinner

    Coffee/tea will be served during the FSMNLP 2005 coffee breaks at the cafeteria of Arppeanum. During TWOLDAY and AWL, coffee is not served by the conference. Cafeterias are easy to find in the same buildings (both in Arppeanum and in the Main Building).

    Lunch meals will be available in several restaurants close to the conference site.

    • The prices at UniCafe restaurants range from € 5,20 to 7,70. A fast pizzeria (KOTIPIZZA) is around the corner, Mariankatu 20.
    • Café Engel by the Senate Square is a very popular place for those who want to eat a good sandwich or a pie, or drink good coffee/tea.

    • Take-away Indian food: Namaskaar Express, Aleksanterinkatu 36 B.
    Drinking water. Tap water is safe to drink everywhere in Finland.

    Dietary restrictions. Please notify us as soon as possible if you have to observe any dietary restrictions. Please do this by sending an email to fsmnlp(a) with the subject header "FSMNLP Dinner". This is needed to plan the conference dinner in advance.

    Conference Dinner. Kulosaaren Casino is the restaurant of the Conference Dinner.


    Do I need to bring my own laptop? The preproceedings (roughly 300 pages) will be distributed on a CD. If you do not have a computer with you, we can arrange photocopied printouts at a small cost.

    We are arranging temporary accounts that will be given only to the FSMNLP/TWOLDAY/AWL participants. You will have to sign an agreement before getting the password and account.

    Access to PC machines. To write email, you have to use either your own laptop with a given account in the WLAN or a desktop computer in certain buildings (Alexandria, for example).

    Protection. If you bring a Windows laptop and a WLAN card, we require that you have at least SP2 installed, a firewall activated, an up-to-date virus protection database, and anti-virus checking turned on. This is part of the normal protection level required anywhere. More information on the WLAN areas will be posted here.

    WLAN will be available in many places on the campus. Moreover, a wireless network is available at the venue. Setting up the network has been somewhat uncertain, but at the moment everything seems to be in order and the network should be working during the workshop.

    All workshop participants receive a user account, if they wish, which will be usable until Saturday, 3 Sep. With this, it will be possible to use the Helsinki University WLAN (HUPnet).

    The number of ethernet cables is limited in Arppeanum.

Instructions for Speakers

Poster and Software Demos

    Size of the poster: The size of the poster for each presentation is limited to "900mm x 1200mm" (A0 portrait size). You can choose the width and height freely within the A0 size. A1 - A0 is ideal. You can make the poster of smaller pieces, for your convenience and economy, although a one-piece poster looks better.

    Poster session time: On Thursday, 1st September, at the same time as the demo session. The conference site will provide poster walls to which the posters should be fixed before the poster session.

    The walls for presenting posters will stay in Arppeanum, the event venue, for two days, and they are located near the conference room (the auditorium). So, in addition to the poster session, the posters can be explored during the coffee breaks, for example.

Software Demos

    The demo session at FSMNLP 2005 will be on Thursday, 1 September, from 18.00 to 20.00, and is planned to run as follows:

    1. Each demo group or presenter gives a brief introduction (4 minutes per demo presenter/group) to everyone at the beginning of the session. For this, we would like you to prepare a few (1-3) presentation slides, which can be copied beforehand to the main computer in the conference room to make the presentation smooth. If these slides can be sent to us (e-mail to the workshop address) before the event, all the better; otherwise they can be copied during breaks.
    2. After the short presentations, each demo group can take their places at their own presentation point (personal laptop and some desk space; for those without a personal laptop, two laptops are available). The rest of the session is reserved for free discussions and demo presentations.

    Technical equipment available: (i) a laptop (FujitsuSiemens C1020) with XP, Acrobat 7 and at least Office 2000 (incl. PowerPoint), (ii) a beamer with a VGA connection, an overhead projector, a slide projector, and a microphone.

Paper Presentations
    Time slots. Regular papers have a 30-minute slot. You should aim to talk for about 20 minutes, leaving 10 minutes for discussion.

    Some general instructions for presenters. Please make sure that your slides are easily readable from the back of a lecture theatre. Do not try to get too much information onto one slide. Do not use more than one slide per minute (one slide can contain several overlays; that is OK). There will be a chairperson for each session. Follow their signals. If they signal that the time is over, you are expected to stop almost immediately.

    Testing. Please check before the session that your presentation is on the right computer and works fine.

    The BEST strategy is to take your own laptop with you. Then the slides and viewer program will be ready for your presentation and you depend less on our help.

Photocopying Office
    Where can I find the closest photocopier? We are preparing the answer. In any case, there are good photocopying facilities in several buildings of Helsinki University. The machines require a special "KOPIOKORTTI" card that you can buy from house and library officers. The best facilities are in Alexandria Learning Center (Fabianinkatu 26 -OR- Vuorikatu 7). It is open on 31.8. from 8:00 to 18:00 and from 1.9. onwards from 8:00 to 20:00, and closed on Saturday. There you can buy the copying card from a machine with a 10 EUR note, or for 10,80 EUR from an officer.
AV equipment

    Beamer? The lecture halls have a data projector with a VGA input (we checked in Arppeanum that 800 x 600, 1024 x 768, and 1280 x 1024 display geometries are fine; we suppose the same is true in the Main Building).

    Laptop? For the presentations, you can bring your own laptop, but we also provide an XP laptop with AcroRead and MS PowerPoint of Office XP. Please keep a copy of the presentation on a USB memory stick, and consider backing up the presentation by sending it as a PDF file to fsmnlp AT

    AWL and laptop? The FSMNLP organizers do not provide a laptop to AWL, as we have only one; AWL organizers may bring one of their own.

    Furthermore, both rooms are equipped with overhead and slide projectors. Should you need them, let us know in advance.

    The rooms.



The easiest way to arrange your accommodation in Helsinki is to fill in this Hotel Booking Form.

Examples of Hotels Within Walking Distance



The arranged social program for FSMNLP participants will include:

Confirmed events

  • Conference dinner on Friday, 2 September

Preliminary plans

  • city walk on Wednesday 31 August

If you are interested in a certain kind of social program, the organizers would be happy to hear your suggestions.

Helsinki Festival

Helsinki Festival is an arts festival held annually in late August – early September. It takes in music, theatre, dance, the visual arts, cinema and city events featuring both Finnish and non-Finnish artists of international repute.

The Helsinki Festival is the biggest festival in Finland in terms of audience figures. In 2004 its various events drew an audience of approx. 246,000.

The program of Helsinki Festival is available on the Internet (see the above link).

Temppeliaukio Church

One of the must-see tourist attractions in Helsinki is Temppeliaukio church, a unique piece of architecture. It is probably one of the many places where the events of Helsinki Festival will take place.



Versions of the Proceedings

Accepted Papers

  • Takashi Miyata and Koiti Hasida: Error-Driven Learning with Bracketing Constraints
  • Mans Hulden: Finite-state syllabification
  • Kuppusamy Lakshmanan: Further Results on Syntactic Ambiguity of Internal Contextual Grammars
  • Alexis Nasr and Owen Rambow: Parsing with Lexicalized Probabilistic Recursive Transition Networks
  • Sebastian van Delden and Fernando Gomez: Improving Inter-Level Communication in Cascaded Finite-State Partial Parsers
  • Muntsa Padró and Lluís Padró: Applying a Finite Automata Acquisition Algorithm to Named Entity Recognition
  • Jyrki Niemi and Lauri Carlson: Modelling the Semantics of Calendar Expressions as Extended Regular Expressions
  • Kemal Oflazer and Mehmet Dincer Erbas and Muge Erdogmus: Using Finite State Technology in a Tool for Linguistic Exploration
  • Jakub Piskorski: On Compact Storage Models for Gazetteers
  • Anssi Yli-Jyrä and Jyrki Niemi: An Approach to Specification of Regular Relations: Pivotal Synchronization Expressions
  • Ulrik Petersen: Principles, Implementation Strategies, and Evaluation of a Corpus Query System
  • Francois Barthelemy: Partitioning Multitape Transducers
  • Andre Kempe, Jean-Marc Champarnaud, Franck Guingne and Florent Nicart: WFSM Auto-Intersection and Join Algorithms
  • Alexandra Volanschi and Alexis Nasr: Integrating a POS Tagger and a Chunker Implemented as Weighted Finite State Machines
  • Martin Jansche: Algorithms for Minimum Risk Chunking
  • Jorge Civera, Elsa Cubel, Juan Miguel Vilar, Antonio Luis Lagarda and Sergio Barrachina: A Novel Approach to Computer-Assisted Translation based on Finite-State Transducers
  • Anne Schiller: German Compound Analysis with wfsc
  • Thomas Hanneforth: Longest-Match Pattern Matching with Weighted Finite State Automata
  • Howard Johnson: Collapsing epsilon-loops in weighted finite-state machines
  • Yael Cohen-Sygal and Shuly Wintner: Finite State Registered Automata and Their Uses in Natural languages
  • Tamas Biro: Squeezing the Infinite into the Finite
  • Lauri Karttunen: Invited Talk
  • Na-Rae Han: Klex: A Finite-State Transducer Lexicon of Korean
  • Alexander Geyken and Thomas Hanneforth: TAGH: a complete morphology for German based on weighted Finite State Automata
  • Elaine Uí Dhonnchadha: Scaling an Irish FST morphology engine for use on unrestricted text


The papers and abstracts will be included on a CD-ROM that will be distributed only to the workshop participants at the time of the event.

Official Proceedings

We have agreed with Springer on plans to publish the postproceedings of FSMNLP 2005 in Lecture Notes in Artificial Intelligence (LNAI). LNAI is a subseries of LNCS. LNCS is a highly esteemed and stable publication channel.




The FSMNLP 2005 workshop is a follow-up to some earlier workshops, but also continues their dynamic, changing tradition. FSMNLP workshops have traditionally had tutorial lessons and/or invited speakers. These workshops have been held under different names and at varying intervals:

1st  FSMNLP 1996  ECAI Workshop           Extended Finite-State Models of Language              Budapest
2nd  FSMNLP 1998  International Workshop  Finite-State Methods in Natural Language Processing   Ankara
3rd  FSMNLP 2001  ESSLLI Workshop         Finite-State Methods in Natural Language Processing   Helsinki
4th  FSMNLP 2003  EACL Workshop           Finite-State Methods in Natural Language Processing   Budapest
5th  FSMNLP 2005  International Workshop  Finite-State Methods and Natural Language Processing  Helsinki


There is no official organization behind FSMNLP. The success of the FSMNLP workshop series has been based on international co-operation and steadily growing interest in the FSMNLP theme. Earlier organizers willingly share their experience with subsequent organizers.

Planning the FSMNLP 2006 or FSMNLP 2007

Planning of the FSMNLP 2006 or FSMNLP 2007 workshop starts already before FSMNLP 2005. The event would probably be held in association with a conference such as EACL, ACL, COLING, ECAI, ICAP, or CIAA. If you have a good idea, we encourage you to take the initiative and contact some of the earlier FSMNLP organizers.

During FSMNLP 2005, the plans will be discussed informally.

  Website: events/FSMNLP2005 (printable site snapshot)
Mail alias to reach the organizing chair: fsmnlp(write-the-sign-here)
This page modified: Monday, 28-Mar-2005 21:55:08 EEST
Web design: A Y.-J.