Implementing Ndonga verbal morphology with finite state tools

Minttu Hurme
Department of African and Asian Studies, University of Helsinki

Ndonga, as is typical of Bantu languages, has a very rich verbal morphology, which also includes some non-concatenative phenomena. Finite verbs agree with their subject and with up to two objects across the 21 noun classes. There are over 40 different combinations of tense, aspect and mood (TAM), each formed from the verb stem with a circumfix consisting of a TAM prefix and an appropriate final vowel. The basic verb stems can themselves be extended with several verbal extensions, ranging from very productive inflectional extensions such as the passive or the applicative to rarer, more derivational extensions such as the frequentative. The most interesting derivative, however, is formed by reduplicating the (possibly extended) verb stem itself.
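The stem-building steps described above (extension followed by reduplication) can be pictured with a minimal Python sketch. It is only an illustration: the morpheme strings `STEM` and `+APPL` are hypothetical placeholders rather than real Ndonga forms, the helper names `extend` and `reduplicate` are invented here, and plain doubling of the stem is an assumption, since the exact shape of the reduplicated form is not specified in this abstract.

```python
def extend(stem, extensions):
    """Attach verbal extensions to a stem, in the order given."""
    return stem + "".join(extensions)


def reduplicate(stem):
    """Full reduplication: copy the whole (possibly extended) stem.

    Assumption: the copy is an exact repetition with no linking element.
    """
    return stem + stem


# Hypothetical placeholder morphemes, not real Ndonga forms.
base = "STEM"
extended = extend(base, ["+APPL"])

print(reduplicate(base))
print(reduplicate(extended))
```

Because the copy must match an arbitrary stem exactly, the pattern cannot be written as simple concatenation of independent morphemes, which is what makes reduplication non-concatenative.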

The Xerox Finite State Tools system (XFST), which was used to implement the Ndonga verbal morphology, has two features specifically designed to deal with non-concatenative morphology: flag diacritics and the compile-replace algorithm. The inflectional circumfixes and the restrictions on combinations of extensions were quite simple to program using flag diacritics. Reduplication proved more problematic. While the compile-replace algorithm worked very well on a limited test set of stems, it caused fatal memory problems in XFST when applied to an unrestricted set of real stems.
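The way flag diacritics tie the two halves of a circumfix together can be imitated in a few lines of Python. This is a rough sketch of the unification idea only, not XFST itself and not the grammar described here: the feature name `TAM`, its values `T1` and `T2`, and the morphemes `pfx1-`, `STEM`, `-a` and `-e` are all hypothetical placeholders, and only unification flags of the form `@U.FEATURE.VALUE@` are modelled.

```python
import re

# Matches unification flags of the form @U.FEATURE.VALUE@.
FLAG = re.compile(r"@U\.([^.@]+)\.([^.@]+)@")


def accepts(path):
    """Return True if all unification flags along the path agree."""
    features = {}
    for symbol in path:
        match = FLAG.fullmatch(symbol)
        if match is None:
            continue  # ordinary symbol, not a flag
        feature, value = match.groups()
        # The first occurrence sets the feature; later ones must agree.
        if features.setdefault(feature, value) != value:
            return False
    return True


def surface(path):
    """Concatenate the non-flag symbols of a path."""
    return "".join(s for s in path if FLAG.fullmatch(s) is None)


# Hypothetical placeholder morphemes, not real Ndonga forms.
matching = ["@U.TAM.T1@", "pfx1-", "STEM", "@U.TAM.T1@", "-a"]
clashing = ["@U.TAM.T1@", "pfx1-", "STEM", "@U.TAM.T2@", "-e"]

print(accepts(matching), surface(matching))
print(accepts(clashing))
```

In this sketch the prefix and the final vowel each carry a flag for the same feature, so a path is kept only when the two ends of the circumfix name the same TAM value; no separate sublexicon per TAM form is needed.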

Last modified: Thu Aug 11 21:50:38 EEST 2005