The purpose of this page is to collect links and information on our methdo for compilation of replace rules, two-level rules and other finite-state based rule formalism with rewrite-rules.

Some Contributions Related to the Yli-Jyrä & Koskenniemi method:

On correct attribution of the method in the title:

  1. Yli-Jyrä's two methods for compiling context restrictions, the operation of generalized restriction (GR) and formulas related to GR and Generalized TWOL and compilation of replace rules.
  2. Koskenniemi's proposal for layerization of intersection grammars, that turned out to be implementable with GR, and represents authority, expertice and wishes in the classical two-level morphology (TWOL), and was the first to pose a question of the applicability of GR to replacement.
  3. These (1+2) appeared in joint publications by Yli-Jyrä and Koskenniemi in 2004, 2006, 2007.
  4. The prior art includes e.g. MOL and FO logics plus concatenation hierarchies, generalized rational expressions plus Kaplan's double negation, and an approach to compiling partition-based or multi-tape two-level morphology, and all remaining methods for compiling rules into transducers, in particular Karttunen's replace rules.)

Research Narrative

In 2002, I discovered a bug in XFST when compiling compound context restrictions (CCR) and showed that the correct semantics of CCR is expressible with star-free regular operators. The relevant publications:

In 2003, I found a simpler formula that captured the correct semantics of CCR. This formula suggested generalizations and the operation of Generalized Restriction (GR) was introduced. Constraints with additional preconditions were introduced with applications to my co-author's idea about layerization of rules.

I found in 2006 that Generalized Restriction lends itself to compilation of individual Two-Level Rules. Kimmo Koskenniemi provided a lot of valuable critique and helped me to relate the ideas to the main formalisms of Two-Level Morphology. During the research, I found that various combinations of rules can be compiled with a single GR, if they are combined under union inside the operands of GR. This GR based approach to compilation of Two-Level Grammars provided a number of extensions, including more general context conditions, additional rule types, more conflict resolutions, rules with multi-character changes, and longest application principle. I like to call the new family of systems Generalized Two-Level Morphology (GTWOL), although at this point the semantics of GTWOL was not precisely coupled to a particular definition since the semantics of the ideal compilation formula was developed incrementally in the paper.

In December 2005, Kimmo Koskenniemi prompted me to look for a new approach for compilation of replace rules. I found the core ideas of method of already in December 2005. The approach was based on Two-Level Morphology. Kimmo contributed to the approach mostly when planning the FinTAL paper, since the existence of the longest application principle in GTWOL is crucial for the replace rule compilation method.

In 2007, I was an invited scholar in BUTE, Hungary. There I rewrote the paper that was originally planned for CIAA 2007. The paper was to describe the rich possibilities of the new compilation method for replace rules more in detail compared to the poster by Anssi Yli-Jyrä and Koskenniemi, 2007. This paper was still rewritten after its acceptance to FSMNLP 2007, and again for the final proceedings. The final proceedings is still in print. Nevertheless, I recommend the final version that substantially improves the presentation and takes the feedback in the FSMNLP 2007 meeting into account. In particular, the paper describes the GTWOL and Bracketed GTWOL systems that are used inside the compilation method, and discusses the semantics of the obligatory replacement.

In BUTE, I developed a branch of the SFST tool that includes functionalities for the GR operation and BGTWOL. I used my version, called AFST, when testing my compilation method for parallel replace rules. The system remained, however, problematic, since SFST and AFST did not contain full support for open alphabets, the "other" symbol in the transducers and unknown universal languages. Due to these technical problems, the software demo was not given as such, but a running system was still seen during the presentation of Anssi Yli-Jyrä, 2007b. The AFST code is available here. Currently, the HFST team in Helsinki is going to incorporate some of its ideas to HFST and add the missing support for open alphabets.

I promised in 2007 to give an invited talk in FSMNLP 2007. I knew already how important it is to understand the possibilities of the GR operator. Much in my current research stems from this paper and I consider the paper as an important landmark in the history of GR operator. The paper puts forward a number of open research questions but also gives a lot of structure to the landscape of GR applications. In addition, the paper presents dozens of new formulae that illustrate important uses of the operator for the first time. The title of the talk was altered in the final version which is also much more verbose, formally correct and readable. The final version has 24 pages and its second display formula (formula 32) in Section 6.1 contains a visible Greek epsilon symbol. The final FSMNLP 2007 proceedings will contain this final version or the last but one version which is similar with respect to all other sections. The proceedings is still in print as far as I know, but it will be an open-access publication when available.

Other Methods

My plan is that this page would eventually contain information also on the prior and current state of the art and the implementations. Including, for example, the methods of: