MULTIMODAL LINGUISTIC DATA BANK
AND DATA SERVER

Max Planck Institute for Evolutionary Anthropology,
Department of Linguistics, Leipzig


The Department of Linguistics at the Max Planck Institute for Evolutionary Anthropology in Leipzig is developing a multimodal data bank and data server. The work started in spring 2000. The bank will contain all kinds of electronic linguistic data: dictionaries, grammars, running texts, word lists, audio-visual material, and linguistic data stored in different kinds of databases. In addition to the linguistic data, the server will provide cultural and demographic material and maps showing the distribution of linguistic and cultural information. Special attention will be paid to collecting data on endangered languages. The server will also contain basic research. The multimodal linguistic data server can also serve as a forum for informing the public about the languages and cultures of the world, their diversity, and their universals.

The main part of the data bank is hosted under the UNIX operating system, but other operating systems will also be included in the data bank and data server. At this stage of the work, the main directories of the data bank are as follows:

* /data
* /tools
* /users

The data are located in the directory /data, the tools that can be used to analyze the data are located in the directory /tools, and the directory /users contains the directories of the users of the data bank. The directory /data is organized by data type as follows:

* /demographic-data
* /demonstrations
* /dictionaries
* /endangered-languages
* /grammars
* /historical-linguistics
* /maps
* /metadata-descriptions
* /morphological-typology
* /onomastics
* /phonological-typology
* /sound-data
* /syntactic-typology
* /terminology
* /text-simple
* /video-data
* /word-lists
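
The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name build_layout and the use of a temporary root directory are hypothetical and are not part of the data bank's actual tooling.

```python
from pathlib import Path
import tempfile

# Top-level directories named in the text.
TOP_LEVEL = ["data", "tools", "users"]

# Subdirectories of /data, one per data type, as listed in the text.
DATA_TYPES = [
    "demographic-data", "demonstrations", "dictionaries",
    "endangered-languages", "grammars", "historical-linguistics",
    "maps", "metadata-descriptions", "morphological-typology",
    "onomastics", "phonological-typology", "sound-data",
    "syntactic-typology", "terminology", "text-simple",
    "video-data", "word-lists",
]


def build_layout(root: Path) -> list[Path]:
    """Create the directory skeleton under root (hypothetical helper).

    Returns the list of created /data subdirectories.
    """
    for name in TOP_LEVEL:
        (root / name).mkdir(parents=True, exist_ok=True)
    created = []
    for name in DATA_TYPES:
        path = root / "data" / name
        path.mkdir(exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and print it.
    with tempfile.TemporaryDirectory() as tmp:
        for path in build_layout(Path(tmp)):
            print(path.relative_to(tmp))
```

Keeping the data types in a plain list makes it easy to add a directory later without touching the function itself.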

The data in the linguistic directories are organized by language family and/or specific data type. For instance, the directory /syntactic-typology contains the following directories and files (file names are separated by commas):

/austro-asiatic-lgs/
    /mon-khmer-lgs/ Khasi
    /munda-lgs/ Kharia
    /nicobarese/
/dravidian-lgs/ Kurux
/indo-aryan-lgs/ Bagri, Bangani, Konkani, Konkani-Choraon, Konkani-Shiroda, Nepali, Sambhalpuri
/sino-tibetan-lgs/
    /tibeto-burman-lgs/ Manipuri, Naga
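
As a rough sketch, the listing above can be modelled as a nested mapping and flattened into full file paths. The nesting is an assumption based on the usual family classification (Mon-Khmer, Munda and Nicobarese under Austro-Asiatic; Tibeto-Burman under Sino-Tibetan), and the names TREE and file_paths are hypothetical.

```python
# Directory name -> either a dict of subdirectories or a list of files.
# The nesting is assumed from the language family classification.
TREE = {
    "austro-asiatic-lgs": {
        "mon-khmer-lgs": ["Khasi"],
        "munda-lgs": ["Kharia"],
        "nicobarese": [],
    },
    "dravidian-lgs": ["Kurux"],
    "indo-aryan-lgs": [
        "Bagri", "Bangani", "Konkani", "Konkani-Choraon",
        "Konkani-Shiroda", "Nepali", "Sambhalpuri",
    ],
    "sino-tibetan-lgs": {
        "tibeto-burman-lgs": ["Manipuri", "Naga"],
    },
}


def file_paths(tree, prefix="/syntactic-typology"):
    """Yield the full path of every file in the nested tree."""
    for name, content in tree.items():
        path = f"{prefix}/{name}"
        if isinstance(content, dict):
            # A dict holds subdirectories: descend into it.
            yield from file_paths(content, path)
        else:
            # A list holds the file names of this directory.
            for filename in content:
                yield f"{path}/{filename}"


if __name__ == "__main__":
    for p in file_paths(TREE):
        print(p)
```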

In the directory /metadata-descriptions, the metadata descriptions are arranged by language name. The data have been adapted to the data bank so that they are both portable and platform independent. Data prepared using character sets other than basic Latin-1 have been converted to Unicode.
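
A character set conversion of this kind might look like the following minimal sketch. It assumes that the source encoding of each file is known in advance and that UTF-8 is the chosen Unicode encoding; the text states only that the data were converted to Unicode, so both points, as well as the helper name convert_to_utf8, are assumptions.

```python
from pathlib import Path


def convert_to_utf8(source: Path, target: Path, source_encoding: str) -> None:
    """Re-encode a text file from source_encoding to UTF-8 (hypothetical helper).

    Bytes are read and written directly so that line endings are preserved.
    A UnicodeDecodeError is raised if the file does not match source_encoding.
    """
    text = source.read_bytes().decode(source_encoding)
    target.write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "wordlist-koi8.txt"
        dst = Path(tmp) / "wordlist-utf8.txt"
        # Simulate a legacy file written in the KOI8-R character set.
        src.write_bytes("язык\n".encode("koi8_r"))
        convert_to_utf8(src, dst, "koi8_r")
        print(dst.read_bytes().decode("utf-8"), end="")
```

Decoding strictly, rather than replacing undecodable bytes, means a wrongly guessed source encoding fails loudly instead of silently corrupting the data.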

From its inception, the data bank has contained dictionaries, running texts, syntactic and morphological typological data, and word lists from several languages. The largest dictionary will be the IDS dictionary, which at present (July 2001) contains material from the Uralic languages. The data to be prepared during the DOBES project will also be located in the data bank. Information on the data is given in the README files located in the data directories. Information on the data will also be given in the form of metadata on the web site of the MPI-EVA, Department of Linguistics.
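
Since the README files sit alongside the data, they can be collected with a simple directory walk. The sketch below assumes only that the file names begin with "README" in any letter case; the function name find_readmes is hypothetical, and the /data path is used purely as an example root.

```python
from pathlib import Path


def find_readmes(data_root: Path) -> list[Path]:
    """Return all files under data_root whose name starts with README."""
    return sorted(
        p for p in data_root.rglob("*")
        if p.is_file() and p.name.upper().startswith("README")
    )


if __name__ == "__main__":
    root = Path("/data")  # example root; adjust to the local installation
    for readme in find_readmes(root):
        print(readme)
```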

So far, the directory /tools contains some basic scripts for analyzing linguistic data. Instructions for using the scripts are, or will be, provided with each of them. All the tools available under the UNIX operating system can also be used in the work.

The data located in the multimodal data bank in Leipzig can be used in research and teaching. For more information on the data bank, please contact:

Peter Fröhlich
Max Planck Institute for Evolutionary Anthropology
Department of Linguistics, Inselstrasse 22
D-04103 Leipzig
Germany
Fax: +49 - (0)341 - 99 52 119


* Information on the computer account application

For those who wish to donate data, a contract covering copyright and ownership questions is available at the following web address:

* Agreement form

* The pilot phase of the data bank project


Pirkko Suihkonen, July 2001
Last modified: Jan. 12, 2002.