景先生毕设 | www.jxszl.com

Research on Keyword Extraction from User Comments on Shopping Websites [Word count: 10216]

Edited 2024-11-03 10:56 by www.jxszl.com 景先生毕设

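Contents
Abstract
Keywords
ABSTRACT
I. Introduction
Related research on keyword extraction
User comment mining
Keyword extraction
Research work of this paper
II. Keyword Extraction Methods for User Comments
(1) Problem definition
(2) Keyword extraction methods
1. TF-IDF algorithm
2. TextRank algorithm
3. Algorithm based on a graph and the LDA model
(3) Evaluation criteria
1. P@K
2. Scoring method based on extraction quality
III. Experimental Setup
(1) Data collection and processing
1. Data collection targets and tools
2. Main data processing steps
(2) Method settings
1. Applying the TF-IDF algorithm to the data
2. Applying the TextRank algorithm to the data
3. Applying the graph- and LDA-based algorithm to the data
IV. Results and Discussion
1. Extraction results
2. Annotation by whether a keyword is valuable
3. Annotation by extraction rank
V. Analysis of Results
(1) Comparison of the extraction performance of the three methods
(2) Effect of the value of K on the extraction results
(3) Application analysis of the extraction results at the shopping-platform and user levels
VI. Conclusion
Acknowledgments
References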
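Figure 2-1 Iteration process of the TextRank algorithm
Figure 5-1 Comparison of P@K values for the three algorithms
Table 3-1 Sample data (before keyword extraction)
Table 4-1 Annotation sample
Table 4-2 Comparison of P@K values for the three algorithms
Table 4-3 Annotation sample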
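Research on Keyword Extraction from User Comments on Shopping Websites
Abstract
There are many keyword extraction algorithms, and their extraction performance differs across text types. This paper therefore examines three keyword extraction algorithms: TF-IDF, which is based on word-frequency statistics; TextRank, which is graph-based; and a more recent algorithm based on a graph and the LDA model. The three algorithms are used to automatically extract effective keywords from a large number of user comments on shopping websites, helping shopping platforms, commodity manufacturers, and users efficiently find the useful information in those comments. Platforms, merchants, and mobile phone manufacturers can then draw on this information for business decisions, and users can draw on it to support their purchasing behavior. The three algorithms are compared on an experimental dataset consisting of a large number of user comments from the mobile phone category of the Jingdong shopping platform, using two evaluation measures: the P@K index and a method that assigns scores to keywords according to their extraction rank. The experimental results show that the traditional TF-IDF and TextRank algorithms perform about the same, and that the algorithm based on a graph and the LDA model performs slightly better than both, although the difference is not pronounced. The practical value of keyword extraction also differs among shopping platforms, commodity manufacturers, and users: it is relatively high for manufacturers and shopping platforms and relatively low from the users' perspective.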
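The P@K index mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition of precision at K (the fraction of the top K extracted keywords that annotators judged valuable); the thesis's exact formula is not given in this section, and the function name `precision_at_k` and the sample words are hypothetical.

```python
def precision_at_k(extracted, relevant, k):
    """Fraction of the top-k extracted keywords that are in the relevant set.

    extracted: keywords ordered from highest to lowest score (assumed input).
    relevant:  set of keywords that annotators marked as valuable (assumed input).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = extracted[:k]
    hits = sum(1 for word in top_k if word in relevant)
    return hits / k


# Hypothetical example: five keywords extracted from phone comments,
# three of which were annotated as valuable.
extracted = ["battery", "screen", "good", "camera", "delivery"]
relevant = {"battery", "screen", "camera"}
print(precision_at_k(extracted, relevant, 3))
print(precision_at_k(extracted, relevant, 5))
```

Dividing by K rather than by the number of keywords actually returned penalizes an algorithm that produces fewer than K candidates, which is one common convention among several.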
RESEARCH ON KEYWORD EXTRACTION FROM USER COMMENTS ON SHOPPING WEBSITES
ABSTRACT
Many keyword extraction algorithms exist in the field of natural language processing, and their effectiveness varies with the type of text. This paper therefore applies three keyword extraction algorithms, namely the frequency-based TF-IDF algorithm, the graph-based TextRank algorithm, and a newer algorithm based on a graph and the LDA model, to automatically extract effective keywords from a large number of user comments on shopping websites. The aim is to help shopping platforms, commodity manufacturers, and users efficiently find useful information in user comments, so that platforms, merchants, and mobile phone manufacturers can use it to make business decisions and users can use it to support their purchasing behavior. Using the P@K index and an evaluation method that assigns scores to keywords according to their extraction rank, the paper compares the results of the three algorithms on an experimental dataset consisting of a large number of user comments from the mobile phone category of the Jingdong shopping platform. The experimental results show that the traditional TF-IDF and TextRank algorithms perform about the same, and that the algorithm based on a graph and the LDA model performs slightly better than both, although the difference is not pronounced. In addition, the practical value of keyword extraction differs among shopping platforms, commodity manufacturers, and users: it is relatively high for manufacturers and shopping platforms and relatively low from the users' perspective.
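As a rough illustration of the frequency-based baseline named in the abstract, the sketch below ranks the words of each comment by a TF-IDF score. It is not the thesis's implementation: it assumes the comments are already tokenized (Chinese text would first need word segmentation), it uses the plain variant idf = log(N / df), and the function name `tfidf_keywords` and the sample comments are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_n=3):
    """Return the top_n highest-scoring words for each tokenized comment.

    docs: list of comments, each a list of word tokens (assumed input format).
    """
    n_docs = len(docs)
    # Document frequency: in how many comments each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        results.append(ranked[:top_n])
    return results


# Hypothetical tokenized comments.
comments = [
    ["battery", "lasts", "long", "battery"],
    ["screen", "is", "clear"],
    ["battery", "is", "fine"],
]
for keywords in tfidf_keywords(comments, top_n=2):
    print(keywords)
```

Words that occur in many comments receive a low inverse document frequency, so on short texts such as product comments a frequent but widely shared word can rank below a word that appears only once.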

Original link: http://www.jxszl.com/jsj/xxaq/607037.html