FinnTreeBank - A Treebank for Finnish

CC BY 3.0

The FinnTreeBank project is creating a treebank and a parsebank for Finnish. This work is licensed under a Creative Commons Attribution 3.0 license.

The first and second versions of the treebank are annotated by hand and are based on 17,000 model sentences from the Large Grammar of Finnish (VISK - Iso Suomen Kielioppi). Brief samples of text from other sources, e.g. news items and literature, are also available in the second version. A parsebank for Finnish based on the Europarl and JRC-Acquis corpora will be available in June 2012.


General documentation of the annotation scheme and scientific publications related to the project:

» The Annotation Manual

» Publications


Download area for the Finnish Treebank:

» Treebank Files

Related Software

» HFST - Helsinki Finite-State Technology at SourceForge

» OMorFi - Finnish Morphological Analyzer