
Research on a Document Ranking Method Based on Two-Layer Semantic Analysis

Published: 2018-03-09 04:13

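  Topic: information retrieval. Focus: semantic analysis. Source: Central China Normal University, master's thesis, 2013. Document type: degree thesis.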


[Abstract]: The rapid growth of the Internet has driven the steady maturation of information retrieval technology. Search engines have become an indispensable tool for everyone, and the current emphasis on user-friendly services also calls for more intelligent retrieval. Traditional retrieval based on mechanical keyword matching can no longer meet the needs of researchers and ordinary users, so semantics-based retrieval has become a focus of current information retrieval research, and retrieval through natural-language queries has become the trend.

Faced with natural-language queries, current retrieval systems often fail to understand the user's request precisely; at the same time, existing techniques often discard the semantic information in documents during retrieval. From an analysis of existing information retrieval models, we find that query processing alone or topic-model retrieval alone cannot satisfy users' ever-higher demands on the accuracy of results.

Building on an analysis of existing techniques and research results, this thesis proposes a document ranking method based on two-layer semantic analysis. It performs semantic analysis at the query-sentence level and at the document (discourse) level to obtain the semantic information needed during retrieval, thereby improving search engine performance. It also presents a framework for a full-text retrieval system based on two-layer semantic analysis: at the query-sentence level, the system semantically processes and paraphrases the query; at the document level, it extracts latent topic semantic information from documents and uses it to optimize retrieval results. By combining the query-level and document-level semantic information, the method gives a document scoring formula based on two-layer semantic analysis, built on the vector space model.

Following the proposed framework, a prototype system was designed and implemented, and the problems encountered during implementation were solved. Analysis of the system's experimental results verifies the effectiveness of this full-text retrieval method based on two-layer semantic analysis.
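
The abstract does not reproduce the thesis's scoring formula, so the following is only a minimal sketch of how a two-layer score of this kind might be assembled. It assumes a linear interpolation between a vector space model (TF-IDF cosine) score, taken as the best match over the query and its paraphrases, and a similarity between topic distributions. The weight `alpha`, the function names, and the toy data are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build TF-IDF vectors (term -> weight dicts) for tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; one common variant, chosen here for illustration only.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_layer_score(query_variants, query_topics, doc_vec, doc_topics, idf, alpha=0.7):
    """Hypothetical score: alpha * best VSM match + (1 - alpha) * topic match."""
    # Query layer: score the original query and each paraphrase, keep the best.
    vsm = max(
        cosine({t: tf * idf.get(t, 0.0) for t, tf in Counter(q).items()}, doc_vec)
        for q in query_variants
    )
    # Document layer: compare topic distributions (assumed to come from a topic model).
    topic = cosine(dict(enumerate(query_topics)), dict(enumerate(doc_topics)))
    return alpha * vsm + (1 - alpha) * topic


# Toy data, invented for illustration.
docs = [
    ["semantic", "search", "engine", "ranking"],
    ["cooking", "recipes", "pasta"],
]
doc_vecs, idf = tf_idf_vectors(docs)
doc_topics = [[0.9, 0.1], [0.1, 0.9]]  # made-up topic distributions per document
query_variants = [["semantic", "retrieval"], ["semantic", "search"]]  # query plus one paraphrase
query_topics = [0.8, 0.2]

scores = [
    two_layer_score(query_variants, query_topics, vec, topics, idf)
    for vec, topics in zip(doc_vecs, doc_topics)
]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

In this sketch the paraphrase lets the query match a term ("search") that the original wording missed, while the topic term separates documents that share no keywords with the query. How the thesis actually weights the two layers cannot be determined from the abstract.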
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.1



