Theoretical Foundations of Syntactic Annotation

Authors

  • L.T. Alimtayeva Al-Farabi Kazakh National University
  • Zh.B. Satkenova Al-Farabi Kazakh National University
  • S.B. Beissembayeva Khoja Akhmet Yassawi Kazakh-Turkish International University

Keywords:

syntactic annotation, formal grammar, dependency grammar, sentence models, language corpus.

Abstract

The article analyzes the theoretical foundations of syntactic annotation and compares how different grammatical schools approach syntactic structure. In particular, it examines N. Chomsky's transformational-generative theory, the concepts of deep and surface syntax, and their role in the development of syntactic parsers and annotation models. L. Tesnière's dependency grammar is highlighted as an alternative to the traditional subject–predicate dichotomy: it represents the sentence as a graph. His theory of actants, centered on the predicate verb, is analyzed as a model that significantly influenced modern computational language processing. I. Mel'čuk's Meaning–Text Theory (MTT), which emphasizes the formal representation of syntactic valency and semantic actants, has been widely applied in machine translation and computational lexicography.
The main aim of the research is to establish a scientific and theoretical foundation for the syntactic annotation of the Kazakh language. The methodological framework includes comparative-descriptive analysis, formal grammar models, and structural-syntactic analysis. The advantages of constituency and dependency grammars are compared, and the study concludes that the dependency model is better suited to the grammatical system of Kazakh.
The paper also evaluates the possibilities and limitations of Kazakh syntactic annotation within the Universal Dependencies (UD) project. While UD provides an effective platform for cross-linguistic unification, the syntactic annotation of Kazakh should not be restricted to this framework. A national corpus-based annotation scheme is required to adequately capture the language's unique features, such as its agglutinative structure, case multifunctionality, possessive constructions, and specific properties of complex sentences.
The study strengthens the theoretical foundations of syntactic annotation in Kazakh linguistics and has practical implications for the development of national corpora and the further advancement of language technologies.
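The dependency representation discussed in this abstract can be illustrated with a minimal sketch. It is not the authors' annotation scheme: the `Token` class, the helper functions, and the example sentence "Мен кітап оқыдым" ("I read a book") are assumptions introduced here for illustration, and the relation labels follow common UD usage. The predicate verb is the root of the tree, and the subject and object depend on it as actants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface word form
    head: int    # idx of the governing token; 0 marks the root
    deprel: str  # dependency relation label (UD-style)


# Hypothetical example sentence, not taken from the article:
# "Мен кітап оқыдым" ("I read a book").
SENTENCE = [
    Token(1, "Мен", 3, "nsubj"),    # pronoun, subject of the verb
    Token(2, "кітап", 3, "obj"),    # noun, direct object of the verb
    Token(3, "оқыдым", 0, "root"),  # predicate verb, root of the tree
]


def root(tokens):
    """Return the single token whose head is 0."""
    roots = [t for t in tokens if t.head == 0]
    if len(roots) != 1:
        raise ValueError("expected exactly one root token")
    return roots[0]


def dependents(tokens, head_idx):
    """Return the tokens directly governed by the token at head_idx."""
    return [t for t in tokens if t.head == head_idx]


def to_conllu(tokens):
    """Serialize to the ten tab-separated CoNLL-U columns.

    Fields this sketch does not model are written as '_'.
    """
    rows = []
    for t in tokens:
        # ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
        cols = [str(t.idx), t.form, "_", "_", "_", "_",
                str(t.head), t.deprel, "_", "_"]
        rows.append("\t".join(cols))
    return "\n".join(rows)


if __name__ == "__main__":
    verb = root(SENTENCE)
    print(verb.form, [d.form for d in dependents(SENTENCE, verb.idx)])
    print(to_conllu(SENTENCE))
```

Storing only a head index and a relation label per token keeps the structure independent of word order, which is one reason dependency models are often considered convenient for languages with rich morphology.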

Published

2025-12-29

Section

PHILOLOGY