Open Source Morphologies (OMor)


The Helsinki Open Source Morphology Project aims to implement full-fledged morphological analysers for a number of languages using the Helsinki Finite-State Transducer Technology (HFST).

The first large-scale lexicon implemented is the Open Source Finnish Morphology (OMorFi), but a number of other analysers and generators based on open source resources for various languages have also been implemented. These works are licensed under the GNU Lesser General Public License v3.0 unless specific restrictions apply to the original lexical resources for a language.


The Finnish lexicon was substantially extended and revised before being compiled into a finite-state transducer, whereas the lexicons for the other languages are more or less mechanically derived from their respective sources. For documentation on the development work on the Finnish lexicon, see the link below.

» Finnish Lexicon documentation in KitWiki


Demos based on the OMor lexicons:

» Analyzers, Generators and Guessers


Download area for transducer versions of the Open Source Lexicons:

» Source code

Related Lexical Data

For the license terms of each data set, see the resource links:

» Nykysuomen sanalista: Research Institute for the Languages of Finland (~94 K lemmas, LGPL)

» DSSO: Den stora svenska ordlistan (~48 K lemmas, GPL)

» English Dictionary: English WSJ-based dictionary (~40 K lemmas, GPL)

» Morphalou 2.0: Lexique morphologique (~95 K lemmas, special license)

» Divvun: Sámi proofing tools project (GPL)