
A Study of Mongolian Roots for Information Processing

Published: 2018-10-29 20:40
[Abstract]: This thesis applies corpus-linguistic and statistical methods to a systematic quantitative study of Mongolian roots, analyzing their syllable structure, word-formation capacity, and grammatical function. The aim is to move the study of Mongolian roots from research oriented toward human readers to research that serves both humans and machines. The main corpus is the more than 70,000 entries of the New Mongolian-Chinese Dictionary (《新蒙汉词典》), with the Mongolian Orthoepy and Orthography Dictionary (《蒙古语正音正字词典》) as an auxiliary corpus. From these, 55,114 words were finally selected, entered, segmented, annotated, and classified. After setting out the relationship among roots, stems, and affixes, the thesis counts and analyzes the segmented roots, groups them by length from one to four syllables, and uses the statistics to determine, for each part of speech and each syllable count, the number of roots and the number of words they form. The main conclusions are as follows. 1. Mongolian roots mostly have one to four syllables; two-syllable and three-syllable roots together account for 83.66% of roots, and their word-formation capacity far exceeds that of one-syllable and four-syllable roots. 2. Classified by part of speech, noun roots make up 52.7% of all roots and the words they form make up 27.88% of the word database; verb roots make up 16.6% of all roots and 13.9% of the word database; adjective roots make up 23.650% of all roots and 19.130% of the word database. By number of roots, nouns therefore rank first, adjectives second, and verbs last. Word-formation capacity follows the same order: noun roots are the most productive, adjective roots second, and verb roots the least. Overall, Mongolian is a language dominated by noun-type roots. 3. Besides adding affixes to roots, Mongolian also enriches its vocabulary through root compounding, agglutination, and compound roots.
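The counting step described in the abstract (number of roots and number of formed words per part of speech and per syllable count) can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual procedure: the record layout, the function name `root_statistics`, and the toy entries are all hypothetical, and the abstract does not define exactly how the "word-formation proportion" is computed, so the sketch simply assumes it is the share of corpus words built on roots of a given class.

```python
from collections import Counter

# Hypothetical annotated records: (word, root, root part of speech, root syllable count).
# The strings are placeholders, not real entries from the thesis corpus.
RECORDS = [
    ("w1", "r1", "noun", 2),
    ("w2", "r1", "noun", 2),
    ("w3", "r2", "verb", 1),
    ("w4", "r3", "adjective", 3),
    ("w5", "r3", "adjective", 3),
    ("w6", "r4", "noun", 2),
]


def root_statistics(records):
    """Compute root shares and word shares by part of speech and by syllable count.

    Assumption: a class's word share is the fraction of all words whose root
    belongs to that class. The abstract may use a different definition.
    """
    roots = {}                 # root -> (part of speech, syllable count)
    words_by_pos = Counter()   # words formed from roots of each part of speech
    words_by_syll = Counter()  # words formed from roots of each syllable count
    for _word, root, pos, syllables in records:
        roots[root] = (pos, syllables)
        words_by_pos[pos] += 1
        words_by_syll[syllables] += 1

    # Count each distinct root once.
    roots_by_pos = Counter(pos for pos, _ in roots.values())
    roots_by_syll = Counter(syll for _, syll in roots.values())

    n_roots = len(roots)
    n_words = len(records)
    return {
        "root_share_by_pos": {k: v / n_roots for k, v in roots_by_pos.items()},
        "word_share_by_pos": {k: v / n_words for k, v in words_by_pos.items()},
        "root_share_by_syllables": {k: v / n_roots for k, v in roots_by_syll.items()},
        "word_share_by_syllables": {k: v / n_words for k, v in words_by_syll.items()},
    }


if __name__ == "__main__":
    for name, table in root_statistics(RECORDS).items():
        print(name, table)
```

Keeping the root counts and the word counts in separate tables mirrors the two kinds of percentages reported in the abstract: a share of all roots and a share of the word database.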
[Degree-granting institution]: Northwest Minzu University (西北民族大学)
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: H212




Article ID: 2298760


Article link: https://www.wllwen.com/wenyilunwen/yuyanyishu/2298760.html

