Constructing a Topic-Classified Word List for Teaching Chinese (Huayu) to Children
Published: 2019-03-05 11:38
[Abstract]: Building on previous research, this paper brings together three strands of work: children's Chinese (Huayu), topics, and word lists. Guided by the encyclopedic character of language education, it uses 12 representative sets of Southeast Asian Chinese textbooks for children as a corpus to construct a hierarchical topic inventory for children's Chinese. Techniques from computational linguistics are used to cluster words by topic; manual screening then retains the words that are closely related to their topic, frequently used, and relatively easy, and these are ranked by relevance and by frequency of use. The resulting topic-classified word list for children's Chinese contains 60 third-level topics and 2,970 entries.
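[Author affiliations]: College of Chinese Language and Culture / Overseas Huayu Research Center, Jinan University (暨南大学); Sanqingshan Scenic Area Administrative Committee (三清山风景名胜区管委会)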
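[Funding]: Supported by the Beijing Advanced Innovation Center for Imaging Technology, project "Research on Automatic Essay Correction for Students of Chinese Descent" (BAICIT-2016008)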
[Classification number]: G623.2
Article ID: 2434866
Article link: https://www.wllwen.com/jiaoyulunwen/xiaoxuejiaoyu/2434866.html