Department of Modern Languages
The Helsinki Open Source Morphology Project aims to implement full-fledged morphological analysers for a number of languages using the Helsinki Finite-State Transducer Technology (HFST).
The first large-scale lexicon implemented is the Open Source Finnish Morphology (OMorFi), but a number of other analysers and generators based on open source resources for various languages have also been implemented. These works are licensed under the GNU Lesser General Public License v3.0 unless specific restrictions apply to the original lexical resources for a language.
The Finnish lexicon was substantially extended and revised before it was compiled into a finite-state transducer, whereas the lexicons for the other languages are more or less mechanically derived from their respective sources. For documentation on the development work on the Finnish lexicon, see the link below.
Demos based on the OMor lexicons:
Download area for transducer versions of the Open Source Lexicons:
Related Lexical Data
For license policies of the data, see the resource links:
» Nykysuomen sanalista: Research Institute for the Languages of Finland (~94 K lemmas, LGPL)
» DSSO: Den stora svenska ordlistan (~48 K lemmas, GPL)
» English Dictionary: English WSJ-based dictionary (~40 K lemmas, GPL)
» Morphalou 2.0: Lexique morphologique (~95 K lemmas, special license)
» Divvun: Sámi proofing tools project (GPL)