ENDANGERED FINNO-UGRIC LANGUAGES
I. ORGANIZATION OF THE PROJECT
Goals of the Project:
- Providing a text corpus collection of endangered Finno-Ugric languages
in machine-readable form;
- Basic linguistic research.
- Academy of Finland (SA)
- University of Helsinki: Department of Finno-Ugric Studies
- Department of General Linguistics
- Joint Committee of the Nordic Research Councils for the Humanities (NOS-H)
- The Nordic Research Council
of the Research Period:
- Academy of Finland: 1996-1998
- Joint Committee of the Nordic Research Councils for the
- Seppo Suhonen, University of Helsinki, Department of
Finno-Ugric Studies, project leader.
- University of Helsinki, Department of Finno-Ugric Studies:
- NOS-H: Jelena Adel, Jarmo Alatalo (full-time researcher),
Miikul Pahomov (part-time researcher), and Merja Salo (full-time researcher)
- Academy of Finland: Erja Kujala, Jack Rueter, and
Tapani Salminen (part-time researchers).
Department of General Linguistics:
Academy of Finland: Pirkko Suihkonen (full-time researcher).
- University of Uppsala, Department of Finno-Ugric
- NOS-H: André Hesselbäck and Manja Lehto (full-time researchers)
- University of Umeå, Institute for Saami
Korhonen (part-time researcher).
- Nord-Trøndelag College, Department of Education:
NOS-H: Nora Bransfjell (part-time researcher)
Norwegian Computing Centre for the Humanities
NOS-H: Sjur Moshagen (full-time researcher)
Norwegian University for Science and Technology, Department of
NOS-H: Sagka Renander (short-term research-assistant).
II. COMPUTER CORPORA
Languages from which the computer corpora will be created:
- Finland: Komi and Erzya (Jack Rueter), Khanty (Merja Salo), Nenets
Selkup and Kamassian (Jarmo Alatalo), Livonian (Seppo Suhonen);
- Sweden: Ingrian (Manja Lehto), Hill Mari (André Hesselbäck) and
- Norway: Southern Saami (Nora Bransfjell, Sjur Moshagen,
Samples of the following languages were adapted for use on the University of Helsinki Language Corpus Server (UHLCS) in 1996-1999:
Uralic languages: Livvi, Dvina-Karelian, Ludian, Ingrian,
Kildin Saami, South Saami, Ume Saami,
Erzya, Moksha, East Mari, West Mari,
Komi Zyrian, Komi Permyak, Khanty,
Mansi, Hungarian, Enets, Nenets, Selkup and Kamas;
Indo-European languages: Kurdish, Ossete, Tajik, Armenian,
Latvian, Lithuanian,
Belorussian, Ukrainian, Serbo-Croatian, and Moldavian (Romanian);
Caucasian languages: Avar, Lak, and Tabassaran;
Turkic languages: Altai, Azerbaijani, Balkar, Bashkir,
Crimean Tatar, Gagauz, Khakas,
Kirghiz, Kumyk, Kazakh, Turkmen, Tuvin, Uyghur, Uzbek, and Yakut;
Mongolic languages: Buryat and Kalmyk;
Tungusic languages: Even, Evenki, and Nanay;
Chukotko-Kamchatkan languages: Chukchi and Koryak.
Pirkko Suihkonen, Aug. 9, 1998. Updated in Aug. 2002.