Morphology-based glossing of Estonian and the Leipzig glossing rules: problems and solutions for Uralic languages
Ülle Viks and Anne Tamm
1. Introduction
One of the characteristics of the Uralic languages is their rich morphology. A number of clear functional categories are expressed morphologically; possession, for example, is expressed by a separate set of morphemes, the possessive suffixes.
2. Finno-Ugric/Uralic tradition
Traditional scholars of the Finno-Ugric or Uralic languages presume that the readership of their articles is well educated in Finno-Ugric and Uralic morphology. Providing linguistic examples with translations has therefore been considered an adequate way of presenting linguistic material in a scientific article. Few linguists outside Uralistics seemed to be interested in the rich tradition of Finno-Ugric or Uralic studies.
3. English-based traditions
In theoretical linguistics, linguistic examples have come predominantly from English. Perhaps more than other generative schools, LFG has concentrated on a wide variety of languages other than English, and this trend has spread to other streams of generative linguistics. Researchers in these traditions typically have no previous schooling in the languages concerned; therefore, whenever they discuss a linguistic example, a line of glossing is sandwiched between the example and its translation.
4. A surge of interest in smaller languages brings along a problem
In this century, theoretical linguists, typologists, neurolinguists and others have become interested in data from smaller languages, including the Uralic languages. However, valuable earlier work is practically useless for those who are interested but stand outside the Uralic tradition. The problem with earlier writings is the lack of exact correspondence between the linguistic example and its translation. Researchers with a more general interest need exact word-by-word, morpheme-by-morpheme and category-by-category correspondences.
5. Standardization does not provide a solution
Standardization is needed, and in this century the so-called Leipzig glossing rules have been established as a standard for interlinear glossing, in order to solve several problems of mutual intelligibility. They are applied sporadically in writings on the Uralic languages. However, several categories that are necessary in discussions of the Uralic languages are missing from the Leipzig convention. Some articles have handled glossing by referring generally to the webpage on the Leipzig rules and to an individual author's webpage with additional Uralic glosses.
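The word-by-word and morpheme-by-morpheme correspondence that the Leipzig rules call for can be checked mechanically. The following is a minimal sketch in Python, not part of the proposal itself: the function names `check_alignment` and `format_interlinear` are hypothetical, the segmentation of the Estonian sentence is illustrative, and the label `INE` for the inessive is an assumed, non-standard addition of exactly the kind discussed above.

```python
def check_alignment(words, glosses):
    """Return True if the object line and the gloss line correspond.

    Two conditions are checked: the same number of words on both lines,
    and the same number of hyphens (morpheme boundaries) in each
    word/gloss pair.
    """
    if len(words) != len(glosses):
        return False
    return all(w.count("-") == g.count("-") for w, g in zip(words, glosses))


def format_interlinear(words, glosses, translation):
    """Left-align each word with its gloss and append the free translation."""
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    line1 = "  ".join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    line2 = "  ".join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return "\n".join([line1, line2, f"'{translation}'"])


# Illustrative example; INE (inessive) is an assumed, non-Leipzig label.
words = ["Ma", "ela-n", "maja-s"]
glosses = ["1SG", "live-1SG", "house-INE"]

if check_alignment(words, glosses):
    print(format_interlinear(words, glosses, "I live in a house."))
```

A check of this kind only verifies the form of the gloss line. It says nothing about whether the chosen category labels are appropriate, which is the harder problem addressed in the following sections.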
6. An inventory of problems, both general and specific
This presentation discusses the general and more specific problems of uniting the glossing conventions from the
- syntactic
- constructional
- morph(on)ological
- phonological
- functional
- semantic
- diachronic
perspective.
Problems appear in all modules of language description: semantics, pragmatics, syntax, morphology and phonology. Some of the problems arise precisely because they involve linguistic information that spans several modules or the historical development of the languages in question. Here are some examples:
- cross-linguistic categories (is the abessive always a caritive in the Uralic languages, and is the caritive the same as the privative in the Australian languages?)
- category change (is the partitive-marked present participle an expression of epistemic modality or an infinitive?)
- case syncretism (is there an accusative in Estonian, or only genitive and nominative?)
- morphological versus non-morphological glossing (what is the relationship between form and function, and when is it better to gloss something as VAT rather than EPIST_MOD or PART_EVID?)
- the problem of encoding length (‘välde’) (should we find a uniform way of distinguishing length in Inari Sami, Estonian and Nganasan?)
These and other items need to be discussed.
7. The proposal of an exact procedure
We propose an exact procedure for dealing with these problems, which recur across our languages. We detail the procedure for Estonian:
- start with morphological glossing based on morphological descriptions of the languages in question
- review standard grammars
- review the labels used in automatic text processing for the language, morphological codes, etc.
- review leading articles on the specific languages, in leading journals or other publications, that use interlinear glossing
- find another model that has been applied to a morphologically rich language, comparing the results to other languages
- extend the morphology-based glossing to instances that are either more marginal or more restricted (for example, glossing for historical linguistics, with many examples of category change)
Appendix 1. The codes used in automatic rule-based morphology of Estonian. The explanations are also given in Estonian.
VERB:
Inf | infinitive | infinitiiv e da-infinitiiv e da-tegevusnimi
Ger | gerund | gerundium e des-vorm
Sup | supine | supiin e ma-infinitiiv e ma-tegevusnimi
Pts | participle | partitsiip e kesksõna
Ps | personal voice | personaal e isikuline tegumood
Ips | impersonal voice | impersonaal e umbisikuline tegumood
Pr | present | preesens e olevik
Pt | past | preteeritum e (üld)minevik
Ipf | imperfect | imperfekt e lihtminevik
Pf | perfect | perfekt e täisminevik
Ppf | pluperfect | pluskvamperfekt e enneminevik
Ind | indicative | indikatiiv e kindel kõneviis
Kvt | quotative | kvotatiiv e kaudne kõneviis
Knd | conditional | konditsionaal e tingiv kõneviis
Imp | imperative | imperatiiv e käskiv kõneviis
Sg 1 | 1st person singular | singulari e ainsuse 1. pööre
Sg 2 | 2nd person singular | singulari e ainsuse 2. pööre
Sg 3 | 3rd person singular | singulari e ainsuse 3. pööre
Pl 1 | 1st person plural | pluurali e mitmuse 1. pööre
Pl 2 | 2nd person plural | pluurali e mitmuse 2. pööre
Pl 3 | 3rd person plural | pluurali e mitmuse 3. pööre
Af | affirmative | afirmatiiv e jaatav kõne
Neg | negative | negatiiv e eitav kõne

NOUN:
Sg | singular | singular e ainsus
Pl | plural | pluural e mitmus
Nom | nominative | nominatiiv e nimetav
Gen | genitive | genitiiv e omastav
Part | partitive | partitiiv e osastav
Adt | aditive | aditiiv e suunduv (e lühike sisseütlev)
Ill | illative | illatiiv e sisseütlev
In | inessive | inessiiv e seesütlev
El | elative | elatiiv e seestütlev
All | allative | allatiiv e alaleütlev
Ad | adessive | adessiiv e alalütlev
Abl | ablative | ablatiiv e alaltütlev
Tr | translative | translatiiv e saav
Ter | terminative | terminatiiv e rajav
Es | essive | essiiv e olev
Ab | abessive | abessiiv e ilmaütlev
Kom | comitative | komitatiiv e kaasaütlev
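The codes in this appendix can be compared with the standard abbreviations of the Leipzig glossing rules in order to see which categories lack a standard label. The sketch below is an illustration only: the mapping `CODE_TO_LEIPZIG` and the helper `uncovered` are hypothetical, the Leipzig labels are taken from the published list of standard abbreviations as we recall it and should be verified against the current version, and `None` marks codes for which this sketch assumes that no direct standard label exists.

```python
# Hypothetical mapping from the appendix codes to Leipzig standard
# abbreviations. None = no direct standard label assumed in this sketch
# (an assumption to be verified, not a claim about the Leipzig list).
CODE_TO_LEIPZIG = {
    # verb categories
    "Inf": "INF", "Ger": None, "Sup": None, "Pts": "PTCP",
    "Ps": None, "Ips": None,
    "Pr": "PRS", "Pt": "PST", "Ipf": None, "Pf": "PRF", "Ppf": None,
    "Ind": "IND", "Kvt": "QUOT", "Knd": "COND", "Imp": "IMP",
    "Af": None, "Neg": "NEG",
    # noun categories
    "Sg": "SG", "Pl": "PL",
    "Nom": "NOM", "Gen": "GEN", "Part": None, "Adt": None,
    "Ill": None, "In": None, "El": None,
    "All": "ALL", "Ad": None, "Abl": "ABL",
    "Tr": None, "Ter": None, "Es": None, "Ab": None, "Kom": "COM",
}


def uncovered(codes):
    """Return the codes that have no assumed Leipzig equivalent."""
    return [c for c in codes if CODE_TO_LEIPZIG.get(c) is None]


# The case codes of the NOUN table, in the order given there.
noun_cases = ["Nom", "Gen", "Part", "Adt", "Ill", "In", "El", "All",
              "Ad", "Abl", "Tr", "Ter", "Es", "Ab", "Kom"]
missing = uncovered(noun_cases)
print(f"{len(missing)} of {len(noun_cases)} case codes lack a standard label:")
print(", ".join(missing))
```

Under these assumptions most of the Estonian case codes have no standard counterpart, which illustrates why additional Uralic glosses are needed alongside the Leipzig list.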
Ü. Viks. Eesti keele avatud morfoloogiamudel. In: Arvutuslingvistikalt inimesele (ed. T. Hennoste). Tartu Ülikooli üldkeeleteaduse õppetooli toimetised 1. Tartu 2000, pp. 9-36. (http://www.eki.ee/teemad/avatud_mrf.html)
Ü. Viks. http://www.eki.ee/tarkvara/morf_lisa.html