CN106021424B - A method for detecting duplicate author names in literature - Google Patents
- Publication number
- CN106021424B (application number CN201610320129.XA)
- Authority
- CN
- China
- Prior art keywords
- document
- author
- initial training
- disambiguation
- duplication
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610320129.XA (CN106021424B, zh) | 2016-05-13 | 2016-05-13 | A method for detecting duplicate author names in literature |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610320129.XA (CN106021424B, zh) | 2016-05-13 | 2016-05-13 | A method for detecting duplicate author names in literature |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106021424A (zh) | 2016-10-12 |
CN106021424B (zh) | 2019-05-28 |
Family
ID=57096991
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610320129.XA, Active, CN106021424B (zh) | A method for detecting duplicate author names in literature | 2016-05-13 | 2016-05-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106021424B (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
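CN107590128B (zh) * | 2017-09-21 | 2021-08-17 | Hubei University | A same-name author disambiguation method for papers based on hierarchical clustering of high-confidence feature attributes |
CN108021657A (zh) * | 2017-12-01 | 2018-05-11 | Sichuan University | A similar-author search method based on semantic information of document titles |
CN110941662A (zh) * | 2019-06-24 | 2020-03-31 | Shanghai R&D Public Service Platform Management Center | Method, system, storage medium, and terminal for graphical representation of scientific research cooperation relationships |
CN111191466B (zh) | 2019-12-25 | 2022-04-01 | Computer Network Information Center, Chinese Academy of Sciences | A same-name author disambiguation method based on network representation and semantic representation |
CN112597305B (zh) * | 2020-12-22 | 2023-09-01 | Shanghai Normal University | Deep-learning-based author name disambiguation method for scientific literature and web-based disambiguation device |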
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7444351B1 (en) * | 2007-12-18 | 2008-10-28 | International Business Machines Corporation | Systems, methods and computer products for name disambiguation by using private/global directories, and communication contexts |
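CN102654881A (zh) * | 2011-03-03 | 2012-09-05 | Fujitsu Limited | Apparatus and method for name disambiguation clustering |
CN104111973A (zh) * | 2014-06-17 | 2014-10-22 | Institute of Computing Technology, Chinese Academy of Sciences | A disambiguation method for scholars with duplicate names and system thereof |
CN104199838A (zh) * | 2014-08-04 | 2014-12-10 | Zhejiang Gongshang University | A user model construction method based on tag disambiguation |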
US9305083B2 (en) * | 2012-01-26 | 2016-04-05 | Microsoft Technology Licensing, Llc | Author disambiguation |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7444351B1 (en) * | 2007-12-18 | 2008-10-28 | International Business Machines Corporation | Systems, methods and computer products for name disambiguation by using private/global directories, and communication contexts |
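CN102654881A (zh) * | 2011-03-03 | 2012-09-05 | Fujitsu Limited | Apparatus and method for name disambiguation clustering |
CN102654881B (zh) * | 2011-03-03 | 2014-10-22 | Fujitsu Limited | Apparatus and method for name disambiguation clustering |
US9305083B2 (en) * | 2012-01-26 | 2016-04-05 | Microsoft Technology Licensing, Llc | Author disambiguation |
CN104111973A (zh) * | 2014-06-17 | 2014-10-22 | Institute of Computing Technology, Chinese Academy of Sciences | A disambiguation method for scholars with duplicate names and system thereof |
CN104199838A (zh) * | 2014-08-04 | 2014-12-10 | Zhejiang Gongshang University | A user model construction method based on tag disambiguation |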
Non-Patent Citations (6)
Title |
---|
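Author name disambiguation: What difference does it make in author-based citation analysis; Andreas Strotmann et al.; Journal of American Society for Information Science & Technology; 20141231; pp. 1820-1833 |
Unsupervised author disambiguation using Dempster–Shafer theory; Hao Wu et al.; Scientometrics; 20141231; pp. 1955-1972 |
A MapReduce-based knowledge clustering and statistics mechanism; Xu Xiaolong (徐小龙) et al.; Journal of Electronics & Information Technology (电子与信息学报); 20160131; Vol. 38 (No. 1); pp. 202-208 |
Research on a person name disambiguation algorithm based on two-stage clustering; Zhang Liwei (张立伟); China Masters' Theses Full-text Database, Information Science and Technology; 20150515; Vol. 2015 (No. 05); pp. I138-1335 |
A person name disambiguation algorithm based on stepwise clustering; Yang Yilin (阳怡林) et al.; Journal of Data Acquisition and Processing (数据采集与处理); 20160131; Vol. 31 (No. 1); pp. 213-222 |
Author name disambiguation and entity linking for scientific literature; Song Wenqiang (宋文强); China Masters' Theses Full-text Database, Information Science and Technology; 20140415; Vol. 2014 (No. 04); pp. I138-729 |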
Also Published As
Publication number | Publication date |
---|---|
CN106021424A (zh) | 2016-10-12 |
Similar Documents
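Publication | Title |
---|---|
CN110298033B (zh) | Keyword corpus annotation, training, and extraction system |
CN106649260B (zh) | Method for constructing a product feature structure tree based on review text mining |
CN106021424B (zh) | A method for detecting duplicate author names in literature |
CN107463658B (zh) | Text classification method and device |
CA2423033C (en) | A document categorisation system |
CN102663139B (zh) | A sentiment lexicon construction method and system |
CN110825877A (zh) | A semantic similarity analysis method based on text clustering |
US20150074112A1 (en) | Multimedia Question Answering System and Method |
CN109190117A (zh) | A word-vector-based method for computing semantic similarity of short texts |
CN109670014B (zh) | A paper author name disambiguation method based on rule matching and machine learning |
CN110134792B (zh) | Text recognition method, device, electronic equipment, and storage medium |
CN104516903A (zh) | Keyword expansion method and system, and classified corpus annotation method and system |
CN112559684A (zh) | A keyword extraction and information retrieval method |
CN108804595B (zh) | A word2vec-based short text representation method |
CN112052356A (zh) | Multimedia classification method, device, and computer-readable storage medium |
CN108090223A (zh) | An open scholar profiling method based on Internet information |
Wagh | Knowledge discovery from legal documents dataset using text mining techniques |
CN106484676B (zh) | Protein coreference resolution method for biological text based on syntax trees and domain features |
CN110245234A (zh) | A multi-source data sample association method based on ontology and semantic similarity |
CN113297844B (zh) | A duplicate data detection method based on the doc2vec model and minimum edit distance |
Luo et al. | Research on civic hotline complaint text classification model based on word2vec |
CN103793444A (zh) | User requirement acquisition method |
Li | Automatic Classification of Chinese Long Texts Based on Deep Transfer Learning Algorithm |
CN109145296A (zh) | A generic word recognition method and device based on a supervised model |
Jadhav et al. | A concept based mining model for nlp using text clustering |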
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 210003 new model road 66, Gulou District, Nanjing, Jiangsu; Applicant after: Nanjing Post & Telecommunication Univ.; Address before: 210023 9 Wen Yuan Road, Qixia District, Nanjing, Jiangsu; Applicant before: Nanjing Post & Telecommunication Univ. |
GR01 | Patent grant | |
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20161012; Assignee: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG; Assignor: NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS; Contract record no.: X2020980007071; Denomination of invention: A method for detecting duplicate author names in literature; Granted publication date: 20190528; License type: Common License; Record date: 20201026 |