Automatic analysis of the Old Babylonian finite verb
Aleksi Sahala 2010
University of Helsinki
Abstract
General
The goal of my thesis was to write a regular-expression-based program (in Python 2.5) able to
analyze Akkadian nonlinear morphology. The main focus was on finite verbs, as they carry
all characteristics of Akkadian grammar. The program was written to support the G, D, and N stems
of strong, I-nun, I-waw, I-aleph, I-yod, II-infirmae, III-infirmae and quadriliteral verbs in the
present, perfect, preterite, imperative and stative. All personal suffixes, the dative, accusative,
ventive and consecutive are supported.
The program uses customizable lexicon and affix files (XML), which allow the user to add any
missing variants. The lexicon includes most of the attested Akkadian verbal roots (1307), based on
the Concise Dictionary of Akkadian (CDA).
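As an illustration of how an XML lexicon of this kind might be read, the sketch below parses a small in-memory sample. The actual file schema is not described here, so the element name (root), the attribute names (radicals, present_vowel, lemma) and the function load_lexicon are all hypothetical, and the code targets current Python 3 rather than Python 2.5.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real lexicon schema is an assumption.
SAMPLE_LEXICON = """
<lexicon>
  <root radicals="p r s" present_vowel="a" lemma="parāsu"/>
  <root radicals="ṣ b t" present_vowel="a" lemma="ṣabātu"/>
</lexicon>
"""

def load_lexicon(xml_text):
    """Return a dict mapping a tuple of radicals to its lexicon entry."""
    lexicon = {}
    for node in ET.fromstring(xml_text).iter("root"):
        radicals = tuple(node.get("radicals").split())
        lexicon[radicals] = {
            "present_vowel": node.get("present_vowel"),
            "lemma": node.get("lemma"),
        }
    return lexicon

lexicon = load_lexicon(SAMPLE_LEXICON)
```

Keying the dictionary by the radical tuple keeps the later lookup of an analyzed root a single dictionary access; a user-added variant would simply be one more element in the file.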
Inputs can be given in transliteration or transcription (preferred). Distinctive vowel length may
be ignored to overcome some problems concerning transliteration.
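Ignoring vowel length can be pictured as a normalization step applied before analysis. The following is a minimal sketch, assuming that length is marked with macrons or circumflexes on vowels; the function name strip_vowel_length is hypothetical and is not taken from the program itself.

```python
import unicodedata

def strip_vowel_length(form):
    """Drop length marks from vowels; keep consonant diacritics (š, ṣ, ṭ, ḫ)."""
    out = []
    # Compose first so that each marked vowel is a single character.
    for ch in unicodedata.normalize("NFC", form):
        base = unicodedata.normalize("NFD", ch)[0]
        # Only vowels lose their diacritic; every other character is kept as-is.
        out.append(base if base in "aeiuAEIU" else ch)
    return "".join(out)

normalized = strip_vowel_length("iparrasū")
```

With this normalization, a transliterated form that does not mark length and a transcribed form that does are reduced to the same string before matching.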
Parser
The analysis process is divided into two phases: affix analysis and root analysis. In the affix analysis,
the input is compared with all affixes listed in the grammar files. Any affixes found are cut off and pushed
onto a stack. The deaffixed input (stem) is then identified by comparing it to regular expression
stem patterns; e.g. the pattern R1-a-R2*2-V1-R3 stands for the strong G-present parras,
şabbit etc.
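The two steps described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the thesis code: the affix lists, the consonant inventory and the function names (compile_stem_pattern, strip_affixes, analyze) are hypothetical, vowel length is assumed to be already ignored, and only the pattern notation R1-a-R2*2-V1-R3 comes from the text. The sketch reads "*2" as "repeat the preceding element twice".

```python
import re

# Assumed inventories; the real ones live in the grammar files.
CONSONANTS = "bdgḫklmnpqrsṣštṭwyz'"
VOWELS = "aeiu"
PREFIXES = ["", "i", "ta", "a", "ni"]   # hypothetical subset
SUFFIXES = ["", "u", "a", "i"]          # hypothetical subset, length ignored

def compile_stem_pattern(notation):
    """Translate notation such as 'R1-a-R2*2-V1-R3' into a compiled regex."""
    parts = []
    for token in notation.split("-"):
        name, _, repeat = token.partition("*")
        count = int(repeat) if repeat else 1
        if re.fullmatch(r"R\d", name):
            # A radical: capture one consonant, then back-reference repeats.
            parts.append("(?P<%s>[%s])" % (name, CONSONANTS))
            parts.extend(["(?P=%s)" % name] * (count - 1))
        elif re.fullmatch(r"V\d", name):
            parts.append("(?P<%s>[%s])" % (name, VOWELS))
        else:
            # Anything else is a literal segment of the stem.
            parts.append(re.escape(name) * count)
    return re.compile("^" + "".join(parts) + "$")

def strip_affixes(form):
    """Yield (affix_stack, stem) for each prefix/suffix pair that fits."""
    for prefix in PREFIXES:
        if not form.startswith(prefix):
            continue
        rest = form[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            stack = []
            if prefix:
                stack.append(prefix + "-")
            if suffix:
                stack.append("-" + suffix)
            yield stack, rest[:len(rest) - len(suffix)]

G_PRESENT = compile_stem_pattern("R1-a-R2*2-V1-R3")

def analyze(form):
    """Return every affix stack whose remaining stem fits the G-present pattern."""
    results = []
    for stack, stem in strip_affixes(form):
        match = G_PRESENT.match(stem)
        if match:
            results.append((stack, match.groupdict()))
    return results
```

Using a named back-reference for the doubled radical means the regex itself enforces that both consonants are identical, so a stem such as paras is rejected without any extra checks.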
After the affixes and stem have been identified, the radicals (R1-R3) and the opaque root vowel (V1 or V2)
are picked out and validated against the lexicon. If the analyzed input is a valid finite verb,
the program generates a morphological gloss for it and prints out an analysis.
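The validation step might look roughly like the sketch below. It assumes the lexicon has already been loaded into a dictionary keyed by radicals; the dictionary layout, the field names and the function validate_and_gloss are hypothetical, and the shape of the returned gloss is invented for illustration only.

```python
# Hypothetical in-memory lexicon; the real one is read from XML.
LEXICON = {
    ("p", "r", "s"): {"lemma": "parāsu", "present_vowel": "a"},
}

def validate_and_gloss(radicals, vowel, affixes, stem="G", tense="present"):
    """Return a gloss dict if root and vowel agree with the lexicon, else None."""
    entry = LEXICON.get(tuple(radicals))
    if entry is None:
        return None          # unknown root: not a valid finite verb
    if entry["present_vowel"] != vowel:
        return None          # root exists, but the root vowel does not fit
    return {
        "lemma": entry["lemma"],
        "root": "-".join(radicals),
        "stem": stem,
        "tense": tense,
        "affixes": list(affixes),
    }

gloss = validate_and_gloss(("p", "r", "s"), "a", ["i-"])
```

Rejecting a candidate as soon as either the root or the vowel fails keeps spurious segmentations from the affix phase out of the final output.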
A small-scale validation (100 random samples from Enūma Eliš) indicated that the program has
an error rate of 4-12%, depending on the input form. Transcription with vowel quantity ignored (!ilv = OFF)
gave the best results.