CN112287682B - A subject word extraction method, apparatus, device, and storage medium - Google Patents
- Publication number
- CN112287682B
- Authority
- CN
- China
- Prior art keywords
- idf
- subject
- scores
- idayf
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011573897.9A CN112287682B (zh) | 2020-12-28 | 2020-12-28 | A subject word extraction method, apparatus, device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011573897.9A CN112287682B (zh) | 2020-12-28 | 2020-12-28 | A subject word extraction method, apparatus, device, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112287682A (zh) | 2021-01-29 |
CN112287682B (zh) | 2021-06-08 |
Family
ID=74426411
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011573897.9A Active CN112287682B (zh) | A subject word extraction method, apparatus, device, and storage medium | 2020-12-28 | 2020-12-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112287682B (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
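CN113051921B (zh) * | 2021-03-17 | 2024-02-20 | 北京智慧星光信息技术有限公司 | Internet text entity recognition method, system, electronic device, and storage medium |
CN114281983B (zh) * | 2021-04-05 | 2024-04-12 | 北京智慧星光信息技术有限公司 | Hierarchical text classification method, system, electronic device, and storage medium |
CN113537691A (zh) * | 2021-05-09 | 2021-10-22 | 武汉兴得科技有限公司 | A big data public health event emergency command method and system |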
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
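JP2008178037A (ja) * | 2007-01-22 | 2008-07-31 | Sony Corp | Information processing device, information processing method, and information processing program |
JP6441203B2 (ja) * | 2015-11-12 | 2018-12-19 | 日本電信電話株式会社 | Speech recognition result compression device, speech recognition result compression method, and program |
CN108446274A (zh) * | 2018-03-15 | 2018-08-24 | 北京科技大学 | A keyword extraction method based on time-sensitive TF-IDF |
CN111159557B (zh) * | 2019-12-31 | 2023-07-25 | 北京奇艺世纪科技有限公司 | A hotspot information acquisition method, apparatus, server, and medium |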
- 2020-12-28: CN202011573897.9A (CN), patent CN112287682B (zh), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN112287682A (zh) | 2021-01-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
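CN112287682B (zh) | | A subject word extraction method, apparatus, device, and storage medium |
WO2022141861A1 (zh) | | Sentiment classification method, apparatus, electronic device, and storage medium |
WO2021218322A1 (zh) | | Paragraph search method, apparatus, electronic device, and storage medium |
CN112541338A (zh) | | Similar text matching method, apparatus, electronic device, and computer storage medium |
CN110309251B (zh) | | Text data processing method, apparatus, and computer-readable storage medium |
WO2012135319A1 (en) | | Processing data in a mapreduce framework |
CN111930962A (zh) | | Document data value evaluation method, apparatus, electronic device, and storage medium |
CN112380244B (zh) | | A word segmentation search method, apparatus, electronic device, and readable storage medium |
WO2022160454A1 (zh) | | Medical literature retrieval method, apparatus, electronic device, and storage medium |
CN112380859A (zh) | | Public opinion information recommendation method, apparatus, electronic device, and computer storage medium |
CN113095076A (zh) | | Sensitive word recognition method, apparatus, electronic device, and storage medium |
CN113449187A (zh) | | Dual-portrait-based product recommendation method, apparatus, device, and storage medium |
CN113886708A (zh) | | User-information-based product recommendation method, apparatus, device, and storage medium |
CN106649308B (zh) | | A word segmentation lexicon update method and system |
CN105653553B (zh) | | Word weight generation method and apparatus |
CN108875050B (zh) | | Text-oriented digital forensic analysis method, apparatus, and computer-readable medium |
CN110019556B (zh) | | A topic news acquisition method, apparatus, and device thereof |
CN108628875B (zh) | | A text label extraction method, apparatus, and server |
CN112633988A (zh) | | User product recommendation method, apparatus, electronic device, and readable storage medium |
CN112579781A (zh) | | Text categorization method, apparatus, electronic device, and medium |
Coenen et al. | | Statistical identification of key phrases for text classification |
CN108763258B (zh) | | Document topic parameter extraction method, product recommendation method, device, and storage medium |
CN115438048A (zh) | | Table search method, apparatus, device, and storage medium |
CN110674283A (zh) | | Intelligent text summary extraction method, apparatus, computer device, and storage medium |
CN112100318B (zh) | | A multi-dimensional information merging method, apparatus, device, and storage medium |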
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: A method, apparatus, device and storage medium for extracting subject words; Effective date of registration: 20221031; Granted publication date: 20210608; Pledgee: China Construction Bank Corp Beijing Zhongguancun branch; Pledgor: BEIJING SMART STARLIGHT INFORMATION TECHNOLOGY CO.,LTD.; Registration number: Y2022110000282 |
PC01 | Cancellation of the registration of the contract for pledge of patent right | Date of cancellation: 20231227; Granted publication date: 20210608; Pledgee: China Construction Bank Corp Beijing Zhongguancun branch; Pledgor: BEIJING SMART STARLIGHT INFORMATION TECHNOLOGY CO.,LTD.; Registration number: Y2022110000282 |