
Research and Application of Weibo User Attribute Authentication

Published: 2018-10-22 19:19
[Abstract]: With the rapid development of the Internet, it has reached into nearly everyone's daily life. Social network platforms such as Weibo have become hugely popular and have quickly turned into an important way for people to chat, interact, and obtain information. Mainstream microblog platforms in China and abroad have each accumulated hundreds of millions of users and hold rich user information of substantial commercial value. Using these data properly to discover potentially important knowledge, understand users better, and thereby realize that value is an important problem. Against this background, this thesis studies attribute authentication for Weibo users, covering authentication algorithms for several important user attributes.
The thesis proposes a new occupation-attribute authentication algorithm for Weibo users based on word-vector distance. It predicts a user's occupation by measuring the distance between the words in the user's posts and occupation-related vocabulary, and it uses Word2vec, a neural-network-based tool for converting words to vectors, to improve prediction accuracy. Experiments on real Weibo user data show that the algorithm reaches an accuracy of nearly 80%.
The thesis also studies the analysis of another social attribute, the user's role. It proposes U-Score, a comprehensive evaluation index for user role analysis. The index is built from several different types of hierarchical indicators and jointly considers five factors: the user's influence, activity, centrality, credibility, and importance. The analytic hierarchy process (AHP) is used to compute the weights of the different features. Experimental results show that this method is feasible for analyzing Weibo user roles and can quantify multiple user indicators.
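The abstract names AHP as the way the U-Score feature weights are obtained but gives no judgment matrix or formula. The following is a minimal sketch of the standard AHP weighting step, not the thesis's actual implementation. The pairwise comparison values, the function names `ahp_weights` and `u_score`, and the choice to combine factors as a weighted sum are all assumptions made for illustration.

```python
# Saaty's random consistency index (RI) for matrices of order 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max, consistency_ratio) for a reciprocal
    pairwise comparison matrix, using power iteration to approximate
    the principal eigenvector."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # keep the weights summing to 1
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0  # consistency index
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri else 0.0                        # consistency ratio
    return w, lambda_max, cr


def u_score(factor_scores, weights):
    """Assumed aggregation: weighted sum of factor scores in [0, 1]."""
    return sum(s * w for s, w in zip(factor_scores, weights))


if __name__ == "__main__":
    # Hypothetical judgments on Saaty's 1-9 scale, in the order:
    # influence, activity, centrality, credibility, importance.
    judgments = [
        [1,     3,     2,     4, 2],
        [1 / 3, 1,     1 / 2, 2, 1 / 2],
        [1 / 2, 2,     1,     3, 1],
        [1 / 4, 1 / 2, 1 / 3, 1, 1 / 3],
        [1 / 2, 2,     1,     3, 1],
    ]
    weights, lambda_max, cr = ahp_weights(judgments)
    print("weights:", [round(x, 3) for x in weights])
    print("lambda_max:", round(lambda_max, 3), "CR:", round(cr, 3))
    # Hypothetical normalized factor scores for one user.
    print("U-Score:", round(u_score([0.8, 0.4, 0.6, 0.9, 0.5], weights), 3))
```

In AHP practice a consistency ratio below 0.1 is usually taken to mean the pairwise judgments are acceptably consistent; otherwise the judgments are revised before the weights are used.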
The thesis also studies authentication of the user's gender attribute. Based on the characteristics of the collected Weibo user data, it combines three different types of user features to classify gender, reaching a classification accuracy above 90%. Finally, building on the attribute authentication algorithms proposed above, the thesis develops a Weibo user attribute authentication system with three main modules: data collection, data storage, and data-mining authentication. The data-mining authentication module implements the three attribute authentication algorithms, so the system can authenticate user attributes.
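The occupation-prediction idea summarized in the abstract, comparing the words a user posts with occupation vocabulary in word-vector space, can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's algorithm: the three-dimensional vectors are made-up stand-ins for vectors that would come from a trained Word2vec model, the occupation word lists are hypothetical, and averaging cosine distance over all word pairs is only one plausible way to aggregate the distances.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def mean_distance(user_words, job_words, vectors):
    """Average distance over all (user word, occupation word) pairs
    for which both words have a vector; inf if no such pair exists."""
    pairs = [(u, j) for u in user_words if u in vectors
             for j in job_words if j in vectors]
    if not pairs:
        return float("inf")
    total = sum(cosine_distance(vectors[u], vectors[j]) for u, j in pairs)
    return total / len(pairs)


def predict_occupation(user_words, occupation_vocab, vectors):
    """Pick the occupation whose vocabulary is closest to the user's words."""
    return min(occupation_vocab,
               key=lambda job: mean_distance(user_words,
                                             occupation_vocab[job], vectors))


# Hypothetical toy embeddings standing in for Word2vec output.
TOY_VECTORS = {
    "code": (0.9, 0.1, 0.0), "bug": (0.8, 0.2, 0.1),
    "server": (0.85, 0.05, 0.1), "deploy": (0.9, 0.0, 0.1),
    "patient": (0.1, 0.9, 0.1), "clinic": (0.05, 0.85, 0.2),
    "surgery": (0.1, 0.95, 0.0),
}

# Hypothetical occupation vocabularies.
OCCUPATION_VOCAB = {
    "programmer": ["code", "bug", "server"],
    "doctor": ["patient", "clinic", "surgery"],
}

if __name__ == "__main__":
    # "lunch" has no vector and is skipped.
    words_from_posts = ["deploy", "bug", "lunch"]
    print(predict_occupation(words_from_posts, OCCUPATION_VOCAB, TOY_VECTORS))
```

With these toy vectors the example call selects "programmer", since "deploy" and "bug" lie much closer to the programmer vocabulary than to the doctor vocabulary. Words without a vector are skipped instead of being penalized, which is a design choice of this sketch.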
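[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP393.092; TP311.13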





Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2288076.html

