"景先生毕设|www.jxszl.com

Research on Automatic Disambiguation of Polysemous Terms Based on WOS [Word count: 10238]

2024-11-03 10:51 Editor: www.jxszl.com 景先生毕设

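Contents
Abstract (Chinese)
Keywords
Abstract (English)
Introduction
I. Research on Polysemous Term Disambiguation
(1) Research Background
(2) Current State of Research
(3) Research Significance and Purpose
(4) Main Research Content of This Paper
(5) Research Approach of This Paper
(6) Research Methods and Experimental Plan of This Paper
II. Theory Underlying the Experiments
(1) Obtaining Text Features
1. TF Feature Extraction
2. IDF (Inverse Document Frequency)
3. TF-IDF (Term Frequency-Inverse Document Frequency) Feature Extraction
(2) Naive Bayes Classification Algorithm
III. Data Preprocessing
(1) Data Description
1. Processing WOS Term Data
2. Processing CNKI Term Data
(2) Analysis of Term Polysemy Characteristics
1. Study of Polysemous Terms
2. Disciplines Covered by Polysemous Terms
3. Crawling and Selecting Polysemous Term Data
(3) Text Segmentation
IV. Polysemous Term Disambiguation Experiments
(1) Experimental Model
(2) Examples of Automatic Disambiguation of WOS Polysemous Terms
V. Conclusion
Acknowledgments
References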
RESEARCH ON AUTOMATIC DISAMBIGUATION OF POLYSEMOUS TERMS BASED ON WOS
ABSTRACT
With the development of science and technology and the progress of society, human society has gradually entered an era of information explosion, and a growing, continually expanding body of terms is needed to describe information accurately. Terminology is the collection of designations for concepts in a specific field, and it is characterized by professionalism, scientific rigor, and univocality (one term, one meaning). How to preserve the professionalism of terminology has become a hot topic in machine learning research. Having studied the multidisciplinary polysemy found in CNKI terminology, this paper carries the analysis over to English data, on the premise that English terminology likewise contains interdisciplinary polysemous terms. It therefore uses data from Web of Science (WOS), the world's largest comprehensive academic information resource platform and the one covering the most disciplines, to analyze and study polysemous terms. The WOS and CNKI polysemous term data, obtained with crawler technology, are preprocessed through word segmentation and keyword feature extraction and then split into a training set and a test set; a classification algorithm is fitted on the training set and its effect is evaluated on the test set. In the keyword extraction stage, the paper mainly uses the TF-IDF algorithm to extract keyword features from the text associated with each polysemous term. In the classification stage, it uses the Naive Bayes algorithm to disambiguate polysemous terms automatically and examines the resulting disambiguation effect. The model disambiguated terms correctly in 70% of cases, which the paper considers a good disambiguation effect.
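The pipeline summarized in the abstract (tokenize, extract TF-IDF features, split into training and test sets, classify with Naive Bayes) can be illustrated with a minimal sketch. This is not the thesis's implementation: the sentences, the ambiguous term "cell", and the discipline labels "biology" and "energy" are hypothetical toy data; whitespace splitting stands in for real word segmentation; the smoothed IDF formula and the use of TF-IDF weights as fractional counts in a multinomial Naive Bayes model are assumptions chosen for illustration. Only the Python standard library is used.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real word segmentation)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed IDF per token: log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """Map a tokenized document to {token: tf * idf}; tokens unseen in training are dropped."""
    counts = Counter(tok for tok in doc if tok in idf)
    total = sum(counts.values()) or 1
    return {tok: (c / total) * idf[tok] for tok, c in counts.items()}


class NaiveBayes:
    """Multinomial Naive Bayes that treats TF-IDF weights as fractional counts."""

    def fit(self, vectors, labels, alpha=1.0):
        self.vocab = {tok for vec in vectors for tok in vec}
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(k / len(labels)) for c, k in class_counts.items()}
        # Accumulate the total weight of each token within each class.
        weight = defaultdict(lambda: defaultdict(float))
        for vec, label in zip(vectors, labels):
            for tok, w in vec.items():
                weight[label][tok] += w
        # Laplace-smoothed log-likelihood of each token given the class.
        self.log_lik = {}
        for c in class_counts:
            denom = sum(weight[c].values()) + alpha * len(self.vocab)
            self.log_lik[c] = {
                tok: math.log((weight[c][tok] + alpha) / denom) for tok in self.vocab
            }
        return self

    def predict(self, vec):
        def score(c):
            return self.log_prior[c] + sum(
                w * self.log_lik[c][tok] for tok, w in vec.items() if tok in self.vocab
            )
        return max(self.log_prior, key=score)


def run_demo():
    # Hypothetical contexts for the ambiguous term "cell", labeled by discipline.
    train = [
        ("the cell membrane regulates protein transport", "biology"),
        ("stem cell division and gene expression", "biology"),
        ("cell nucleus contains dna and protein", "biology"),
        ("lithium battery cell voltage and capacity", "energy"),
        ("solar cell efficiency and silicon wafer", "energy"),
        ("fuel cell electrode and voltage output", "energy"),
    ]
    test = [
        ("cell protein and gene regulation", "biology"),
        ("cell voltage of a lithium electrode", "energy"),
    ]
    train_docs = [tokenize(text) for text, _ in train]
    idf = fit_idf(train_docs)
    model = NaiveBayes().fit([tfidf(d, idf) for d in train_docs], [y for _, y in train])
    predictions = [model.predict(tfidf(tokenize(text), idf)) for text, _ in test]
    accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
    return predictions, accuracy


predictions, accuracy = run_demo()
print(predictions, accuracy)
```

The IDF table is fitted on the training documents only, so the held-out contexts do not influence the features; the accuracy computed at the end corresponds to the kind of correct-disambiguation rate the abstract reports, although a toy set this small says nothing about real performance.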

Original link: http://www.jxszl.com/jsj/xxaq/606964.html