A Knowledge-Graph-Based Semantic Filtering Model for Public Opinion Texts in the Education Domain
First published: 2023-03-13
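Abstract: Practical public opinion research often requires filtering all public opinion texts that are semantically related to a given topic, yet semantic filtering between long and short texts tends to be insensitive. To address this, this paper designs and builds a knowledge-graph-based semantic filtering model for public opinion texts in the education domain. The model links the query topic and each candidate document to entities in an education-domain knowledge graph, represents each of them as a set of entity vectors using a knowledge graph embedding method, and then computes their similarity with a late-interaction neural network; candidate texts are ranked by similarity to complete the filtering task. Systematic experiments were carried out on a self-built education-domain public opinion knowledge graph and a semantic filtering task dataset. The results show that the model reaches a mean reciprocal rank (MRR) of 9.35, 0.32 higher than the original ColBERT model, and an average recall of 82.2% over the top ten documents, 2.9% higher than the baseline model. The model overcomes the differences between long and short texts in practical tasks, makes full use of the semantic information in the knowledge graph, and is feasible and accurate for semantic filtering of education-domain public opinion texts, meeting the needs of real-world work.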
Keywords: computer science and technology; knowledge graph; semantic filtering; entity linking; BERT
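The scoring step described in the abstract (entity vector sets compared by late interaction, then ranking) can be illustrated with a minimal sketch. It assumes the ColBERT-style MaxSim operator over cosine similarity: for each query entity vector, take its best match among the document's entity vectors and sum those maxima. The function names, the toy two-dimensional vectors, and the document IDs are hypothetical; the paper's actual network, embedding method, and entity linker are not specified in the abstract and are not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: sum over query vectors of the best cosine match
    among the document vectors (assumed MaxSim form, as in ColBERT)."""
    if not doc_vecs:
        return 0.0
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(query_vecs, docs):
    """Return (doc_id, score) pairs sorted by descending score.

    `docs` maps a document ID to its list of entity vectors (hypothetical layout).
    """
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example: entity embeddings would normally come from a knowledge graph
# embedding model; these hand-written vectors are placeholders.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],
    "doc_b": [[1.0, 0.0]],
    "doc_c": [[-1.0, 0.0]],
}
ranking = rank_documents(query, candidates)
```

Because each query vector is matched independently against the document's vectors, a long document with many entities and a short query with few can still be compared directly, which is the property the abstract relies on for long and short texts.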
A Knowledge-Graph-Based Semantic Filtering Model for Public Opinion Texts in the Education Field
Abstract: Practical public opinion research often requires filtering all public opinion texts that are semantically related to a given topic, and similarity calculation between long and short texts is difficult. To address both problems, we propose a knowledge-graph-based semantic filtering model for public opinion texts in the education field. First, we apply entity linking to the queries on a given topic and to the candidate documents. Second, queries and candidate documents are each represented as sets of entity vectors using a knowledge graph embedding method. Third, the similarity between them is calculated by a late-interaction neural network. Finally, the candidate texts are sorted by similarity score to complete the filtering task. We carried out systematic experiments on a self-developed public opinion knowledge graph and a semantic filtering task dataset in the education field. The experiments show that the mean reciprocal rank (MRR) of this model reaches 37.4, 1.3 higher than the original ColBERT model, and the average recall over the top-10 documents reaches 82.2%, 2.9% higher than the baseline model. The model overcomes the differences between long and short texts in practical tasks, makes full use of the semantic information in the knowledge graph, and is feasible and accurate for semantic filtering of public opinion texts in the education field, meeting the needs of practical work.
Keywords: Computer science and technology; Knowledge graph; Semantic filtering; Entity linking; BERT
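The two evaluation metrics named in the abstract, mean reciprocal rank (MRR) and recall over the top-10 documents, can be computed as in the sketch below. It uses the standard textbook definitions; the abstract does not state the scale or rank cutoff used for the reported MRR figures, so this is an assumption about the general form of the metrics and not a reproduction of the paper's evaluation script. All function names and sample IDs are hypothetical.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


# Toy data: two queries, each with a system ranking and a set of relevant IDs.
runs = [
    (["a", "b", "c"], {"b"}),  # first relevant document at rank 2
    (["x", "y"], {"x"}),       # first relevant document at rank 1
]
mrr = mean_reciprocal_rank(runs)
recall_top2 = recall_at_k(["a", "b", "c"], {"b", "z"}, k=2)
```

Under these definitions MRR rewards placing one relevant document early, while recall at k measures how much of the relevant set survives the cutoff, so the two figures in the abstract describe different aspects of the ranking.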