CN110427626A - Keyword extraction method and device - Google Patents

Info

Publication number
CN110427626A
Authority
CN
China
Legal status
Granted
Application number
CN201910703459.0A
Other languages
Chinese (zh)
Other versions
CN110427626B (en)
Inventor
崔峭
Current Assignee
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Application filed by Beijing Mininglamp Software System Co ltd
Priority to CN201910703459.0A
Publication of CN110427626A
Application granted
Publication of CN110427626B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a keyword extraction method and device. Specifically, the method comprises: performing text structure analysis on an input text to determine a first weight value corresponding to each content type in the text; performing semantic analysis on each word in the text to determine a second weight value corresponding to each word; adjusting the term frequency (TF) value of each word according to the first weight value and the second weight value, and calculating a third weight value of each word from the adjusted TF value; and, according to the third weight value, extracting specified words from the words as keywords for retrieval. The invention addresses the problems that TF-IDF calculation depends on multiple content-related documents and cannot compute word weights for a single text, and that the TF-IDF method performs poorly on discrete data with a low degree of association, thereby improving the precision of keyword extraction.

Description

Keyword extraction method and device
Technical field
The present invention relates to the communications field, and in particular to a keyword extraction method and device.
Background art
Most retrieval systems in common use today are keyword-based, and keyword extraction almost always relies on term frequency (TF) and inverse document frequency (IDF). However, TF-IDF calculation depends on multiple content-related documents and cannot compute word weights for a single text, and the TF-IDF method performs poorly on discrete data with a low degree of association.
Summary of the invention
Embodiments of the present invention provide a keyword extraction method and device, to at least solve the problems in the related art that TF-IDF calculation depends on multiple content-related documents and cannot compute word weights for a single text, and that the TF-IDF method performs poorly on discrete data with a low degree of association.
According to one embodiment of the present invention, a keyword extraction method is provided, comprising: performing text structure analysis on an input text to determine a first weight value corresponding to each content type in the text; performing semantic analysis on each word in the text to determine a second weight value corresponding to each word; adjusting the term frequency (TF) value of each word according to the first weight value and the second weight value, and calculating a third weight value of each word from the adjusted TF value; and, according to the third weight value, extracting specified words from the words as keywords for retrieval.
Optionally, before the semantic analysis is performed on the words in the text, the method further comprises: performing word segmentation on the text according to preset rules, and determining the part of speech of each word according to the relationships between the segmented words.
Optionally, performing semantic analysis on each word in the text to determine the second weight value corresponding to each word comprises: ranking the words according to a preset part-of-speech priority rule; and assigning the corresponding second weight value to each word according to its part-of-speech priority ranking.
Optionally, adjusting the TF value of each word according to the first weight value and the second weight value comprises: obtaining the TF value of each word; and multiplying the TF value by the first weight value and the second weight value to determine the adjusted TF value.
Optionally, calculating the third weight value of each word from the adjusted TF value comprises: obtaining the IDF value of each word; and determining the third weight value according to the adjusted TF value of each word and the IDF value of each word.
Optionally, extracting specified words from the words as keywords for retrieval according to the third weight value comprises: removing the words whose third weight value is less than a preset weight threshold, and taking the remaining words as the specified words.
Optionally, the content type includes at least one of the following: a text content type into which the text is divided according to a preset text format, the position type of a paragraph in the text, and the position type of a sentence in the text.
According to another embodiment of the present invention, a keyword extraction device is provided, comprising: a first determining module, configured to perform text structure analysis on an input text to determine a first weight value corresponding to each content type in the text; a second determining module, configured to perform semantic analysis on each word in the text to determine a second weight value corresponding to each word; an adjusting module, configured to adjust the term frequency (TF) value of each word according to the first weight value and the second weight value, and to calculate a third weight value of each word from the adjusted TF value; and an extraction module, configured to extract, according to the third weight value, specified words from the words as keywords for retrieval.
According to still another embodiment of the present invention, a storage medium is also provided. The storage medium stores a computer program, and the computer program is configured to execute, when run, the steps in any of the above method embodiments.
According to still another embodiment of the present invention, an electronic device is also provided, comprising a memory and a processor. The memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any of the above method embodiments.
According to the present invention, the TF of each word is adjusted using the result of the structural analysis of the text and the result of the semantic analysis of the words. This addresses the problems in the related art that TF-IDF calculation depends on multiple content-related documents and cannot compute word weights for a single text, and that the TF-IDF method performs poorly on discrete data with a low degree of association, thereby improving the precision of keyword extraction.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present invention and form a part of this application. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of keyword extraction according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a test text according to an embodiment of the present invention;
Fig. 3 is a diagram of an extraction result according to an embodiment of the present invention;
Fig. 4 is a structural block diagram of a keyword extraction device according to an embodiment of the present invention.
Detailed description
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in them may be combined with one another.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a particular order or sequence.
Embodiment 1
This embodiment provides a keyword extraction method. Fig. 1 is a flowchart of keyword extraction according to an embodiment of the present invention. As shown in Fig. 1, the process includes the following steps:
Step S102: perform text structure analysis on the input text to determine the first weight value corresponding to each content type in the text;
Step S104: perform semantic analysis on each word in the text to determine the second weight value corresponding to each word;
Step S106: adjust the term frequency (TF) value of each word according to the first weight value and the second weight value, and calculate the third weight value of each word from the adjusted TF value;
Step S108: according to the third weight value, extract specified words from the words as keywords for retrieval.
Optionally, the content type includes at least one of the following: a text content type into which the text is divided according to a preset text format, the position type of a paragraph in the text, and the position type of a sentence in the text.
Specifically, the text content types divided according to a preset text format refer to the individual content parts of the text. For a patent application, for example, the abstract, the abstract drawing, the claims, the description, and the drawings of the description can each serve as one content part. For the description itself, the content types can be divided into technical field, background art, summary of the invention, description of the drawings, and detailed description.
Specifically, the position type of a paragraph refers to where the paragraph sits in the text: for example, whether it is in the technical field, the summary of the invention, or the description of the drawings, and whether it is the first paragraph, the last paragraph, or a paragraph somewhere in between.
Specifically, the position type of a sentence is analogous to that of a paragraph and refers to where the sentence sits within a paragraph or within the text: for example, whether it is in the technical field, the summary of the invention, or the description of the drawings, and whether it is the first sentence, the last sentence, or a middle sentence of its paragraph.
The above descriptions are illustrative examples; any representation of content types based on the same idea falls within the protection scope of this embodiment.
Specifically, when determining the first weight value corresponding to each content type, take a patent application as an example. Within the description, the detailed description is clearly more important than the other four parts, so the detailed description and the paragraphs and sentences in it can be given a higher weight than other content types. Within the detailed description, the first paragraph or first few paragraphs are often the most important, so they are given a higher weight than the other paragraphs. Within each paragraph, the first or last sentence usually states the conclusion, so the first and last sentences are given a higher weight than the other sentences in the paragraph.
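The position-based weighting described above can be illustrated with a minimal Python sketch. The patent gives no numeric values, so the weight table, the bonus factors, the choice to combine them by multiplication, and the function name `structure_weight` are all assumptions made for illustration. As a simplification, the leading-paragraph bonus is applied in every section rather than only in the detailed description.

```python
# Hypothetical weight table; the patent does not specify numeric values.
SECTION_WEIGHTS = {
    "technical_field": 1.0,
    "background": 1.0,
    "summary": 1.0,
    "drawings": 1.0,
    "detailed_description": 1.5,  # treated as the most important part
}
LEADING_PARAGRAPH_BONUS = 1.2  # assumed bonus for the first few paragraphs
EDGE_SENTENCE_BONUS = 1.1      # assumed bonus for first/last sentence


def structure_weight(section, paragraph_index, sentence_index,
                     sentence_count, leading_paragraphs=2):
    """Return a first weight value for one sentence from its position."""
    weight = SECTION_WEIGHTS.get(section, 1.0)
    # Paragraphs near the start of a section get a higher weight.
    if paragraph_index < leading_paragraphs:
        weight *= LEADING_PARAGRAPH_BONUS
    # The first and last sentences of a paragraph get a higher weight.
    if sentence_index in (0, sentence_count - 1):
        weight *= EDGE_SENTENCE_BONUS
    return weight
```

A sentence in the middle of a late paragraph of the background section keeps the neutral weight of 1.0, while the opening sentence of the detailed description receives all three factors.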
Optionally, before the semantic analysis is performed on the words in the text, the method further comprises: performing word segmentation on the text according to preset rules, and determining the part of speech of each word according to the relationships between the segmented words.
Optionally, performing semantic analysis on each word in the text to determine the second weight value corresponding to each word comprises: ranking the words according to a preset part-of-speech priority rule; and assigning the corresponding second weight value to each word according to its part-of-speech priority ranking.
Specifically, semantic analysis can, on the one hand, pick out the main entities that the article discusses and, on the other hand, remove the unneeded parts of a sentence. For example, from "Xiao Ming is Chinese", the subject "Xiao Ming" and the predicative "Chinese" can be extracted. Key entities such as core phrases and the subjects and objects of sentences are then given a higher weight, and the weight of all other words is set directly to 1. As another example, quantifiers and adjectives are sometimes important too, so they can be given a weight that is lower than that of key entities such as core phrases, subjects, and objects but higher than that of other words.
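A minimal sketch of this role-based weighting follows. It assumes an upstream segmenter and parser have already labeled each word with a role; the role names, the numeric weights, and the function name `semantic_weights` are hypothetical, since the patent only fixes the ordering (key entities highest, quantifiers and adjectives in between, everything else 1).

```python
# Hypothetical role weights; only their relative order follows the text.
ROLE_WEIGHTS = {
    "subject": 2.0,
    "object": 2.0,
    "quantifier": 1.5,
    "adjective": 1.5,
}


def semantic_weights(tagged_words):
    """Map each (word, role) pair to a second weight value.

    Words with an unlisted role get the default weight 1. If a word
    occurs with several roles, the highest weight is kept.
    """
    weights = {}
    for word, role in tagged_words:
        candidate = ROLE_WEIGHTS.get(role, 1.0)
        weights[word] = max(candidate, weights.get(word, 1.0))
    return weights
```

Keeping the maximum across occurrences is a design choice of this sketch; averaging over occurrences would be an equally plausible reading.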
Optionally, adjusting the TF value of each word according to the first weight value and the second weight value comprises: obtaining the TF value of each word; and multiplying the TF value by the first weight value and the second weight value to determine the adjusted TF value.
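The multiplication step can be sketched as follows. The sketch assumes that the first weight is attached to each sentence, that the second weight is attached to each word, and that TF is the word count divided by the total number of tokens in the text; the patent does not state how TF is normalized, so that choice and the function name `adjusted_tf` are assumptions.

```python
from collections import Counter


def adjusted_tf(sentences, structure_weights, semantic_weights):
    """Return {word: TF x first weight x second weight}.

    sentences: list of token lists, one per sentence.
    structure_weights: one first weight value per sentence.
    semantic_weights: {word: second weight}; missing words default to 1.
    """
    total = sum(len(tokens) for tokens in sentences)
    if total == 0:
        return {}
    scores = Counter()
    for tokens, first_weight in zip(sentences, structure_weights):
        for word, count in Counter(tokens).items():
            tf = count / total  # assumed normalization by text length
            scores[word] += tf * first_weight * semantic_weights.get(word, 1.0)
    return dict(scores)
```

With this formulation, a sentence whose first weight is 0 contributes nothing, which has the same effect as removing it from the text.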
Optionally, calculating the third weight value of each word from the adjusted TF value comprises: obtaining the IDF value of each word; and determining the third weight value according to the adjusted TF value of each word and the IDF value of each word.
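One way to write the two calculations compactly is shown below. The symbols are introduced here for illustration only: W_1 and W_2 are the first and second weight values, W_3 is the third weight value, N is the number of reference documents, and n_w is the number of those documents that contain the word w. The patent says only that the third weight value is determined "according to" the adjusted TF and the IDF, so the product form and the logarithmic IDF definition are assumptions that follow the conventional TF-IDF formulation.

```latex
\mathrm{TF}'(w) = \mathrm{TF}(w)\cdot W_1(w)\cdot W_2(w), \qquad
W_3(w) = \mathrm{TF}'(w)\cdot \mathrm{IDF}(w), \qquad
\mathrm{IDF}(w) = \log\frac{N}{n_w}
```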
Optionally, extracting specified words from the words as keywords for retrieval according to the third weight value comprises: removing the words whose third weight value is less than a preset weight threshold, and taking the remaining words as the specified words.
It should be noted that the purpose of the preset weight threshold is to keep the output from growing so large that it affects the subsequent retrieval. When too many keywords are looked up in a knowledge graph, the query may return too much information because there are too many primary keys, or return nothing because the constraints are too restrictive.
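The threshold filter is simple enough to sketch in a few lines. The function name `select_keywords` and the choice to return the survivors sorted by weight are assumptions of this sketch; the patent only requires dropping words below the threshold.

```python
def select_keywords(third_weights, threshold):
    """Drop words whose third weight value is below `threshold`.

    Returns the remaining (word, weight) pairs, highest weight first.
    """
    kept = [(word, score) for word, score in third_weights.items()
            if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Raising the threshold shortens the keyword list that is passed to the retrieval step, which is the effect the paragraph above describes.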
For a better understanding of the technical solution of this embodiment, the following scenario is also provided.
Fig. 2 is a schematic diagram of a test text according to an embodiment of the present invention. As shown in Fig. 2:
Step 1: Perform text structure analysis on the input test text. Based on the analysis, lines 1 and 31 are given the highest weight value, lines 7 and 9 are given a weight value of 0 (which is equivalent to removing the content of lines 7 and 9), and the sentences in the other lines are given weight values lower than those of lines 1 and 31.
Step 2: Perform semantic analysis on each word in the text, including word segmentation, part-of-speech tagging, stop-word removal, and syntactic analysis. For example, line 2 reads: "Many people really remember artificial intelligence because of the film "Artificial Intelligence" directed by Steven Spielberg in 2001". From this sentence, the subjects "people" and "Steven Spielberg", the predicates "remember" and "directed", and the object "artificial intelligence" can be extracted. The core subjects and objects, "people", "Steven Spielberg", and "artificial intelligence", are given the highest weight value, which is greater than 1. The predicates "remember" and "directed" are given a weight value greater than 1 but lower than that of the core subjects and objects. All other words are set directly to 1.
Step 3: Adjust the TF value of each word according to the first weight value and the second weight value. The TF value of each word in each sentence is calculated and then multiplied by the weight values obtained in Step 1 and Step 2 to obtain the adjusted TF value of each word.
Step 4: Calculate the TF-IDF value of each word (where the IDF value is zero, the word weight is used directly as the TF-IDF value). The TF-IDF value of each word can be used as is, or it can be passed through a sigmoid function to make it nonlinear and the result then normalized. This yields the keyword weight value of each word.
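Step 4 can be sketched as below. The zero-IDF fallback and the optional sigmoid follow the text; treating a word with no IDF entry as having an IDF of zero, normalizing by dividing by the largest value, and the function name `third_weights` are assumptions of this sketch.

```python
import math


def third_weights(adjusted_tf, idf, use_sigmoid=True):
    """Combine adjusted TF with IDF, then optionally squash and normalize.

    adjusted_tf: {word: adjusted TF value}.
    idf: {word: IDF value}; a missing entry is treated as zero (assumption).
    """
    raw = {}
    for word, tf in adjusted_tf.items():
        idf_value = idf.get(word, 0.0)
        # Fallback from the text: with a zero IDF, use the word weight itself.
        raw[word] = tf * idf_value if idf_value != 0 else tf
    if not use_sigmoid or not raw:
        return raw
    # The sigmoid maps each score into (0, 1) nonlinearly.
    squashed = {w: 1.0 / (1.0 + math.exp(-v)) for w, v in raw.items()}
    # Assumed normalization: scale so that the top word has weight 1.
    top = max(squashed.values())
    return {w: v / top for w, v in squashed.items()}
```

The zero-IDF fallback is what lets the method score a single text on its own: when no reference corpus is available, every word falls back to its weighted TF.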
Step 5: Compare the keyword weight value of each word with a preset threshold to filter out the result shown in Fig. 3. Fig. 3 is a diagram of an extraction result according to an embodiment of the present invention. As shown in Fig. 3, the final extraction result is "artificial intelligence", "mankind", "robot", "machine", and "law", where "law" has the highest weight and "robot" has the lowest.
Step 6: Perform a targeted retrieval according to the output shown in Fig. 3.
From the description of the above embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
Embodiment 2
This embodiment also provides a device for implementing the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiment is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a structural block diagram of a keyword extraction device according to an embodiment of the present invention. As shown in Fig. 4, the device includes:
a first determining module 42, configured to perform text structure analysis on the input text to determine the first weight value corresponding to each content type in the text;
a second determining module 44, configured to perform semantic analysis on each word in the text to determine the second weight value corresponding to each word;
an adjusting module 46, configured to adjust the term frequency (TF) value of each word according to the first weight value and the second weight value, and to calculate the third weight value of each word from the adjusted TF value;
an extraction module 48, configured to extract, according to the third weight value, specified words from the words as keywords for retrieval.
It should be noted that the above modules can be implemented in software or hardware. For the latter, this can be done in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
Embodiment 3
An embodiment of the present invention also provides a storage medium. The storage medium stores a computer program, and the computer program is configured to execute, when run, the steps in any of the above method embodiments.
Optionally, in this embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1: perform text structure analysis on the input text to determine the first weight value corresponding to each content type in the text;
S2: perform semantic analysis on each word in the text to determine the second weight value corresponding to each word;
S3: adjust the term frequency (TF) value of each word according to the first weight value and the second weight value, and calculate the third weight value of each word from the adjusted TF value.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store a computer program.
An embodiment of the present invention also provides an electronic device comprising a memory and a processor, where the memory stores a computer program and the processor is configured to run the computer program to execute the steps in any of the above method embodiments.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by means of a computer program:
S1: perform text structure analysis on the input text to determine the first weight value corresponding to each content type in the text;
S2: perform semantic analysis on each word in the text to determine the second weight value corresponding to each word;
S3: adjust the term frequency (TF) value of each word according to the first weight value and the second weight value, and calculate the third weight value of each word from the adjusted TF value.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments and optional implementations; they are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device. In some cases, the steps shown or described can be executed in an order different from the one given here, or they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the principles of the present invention shall fall within its protection scope.

Claims (10)

1. A keyword extraction method, characterized by comprising:
performing text structure analysis on an input text to determine a first weight value corresponding to each content type in the text;
performing semantic analysis on each word in the text to determine a second weight value corresponding to each word;
adjusting the term frequency (TF) value of each word according to the first weight value and the second weight value, and calculating a third weight value of each word from the adjusted TF value;
according to the third weight value, extracting specified words from the words as keywords for retrieval.
2. The method according to claim 1, characterized in that, before the semantic analysis is performed on the words in the text, the method further comprises:
performing word segmentation on the text according to preset rules; and
determining the part of speech of each word according to the relationships between the segmented words.
3. The method according to claim 2, characterized in that performing semantic analysis on each word in the text to determine the second weight value corresponding to each word comprises:
ranking the words according to a preset part-of-speech priority rule;
assigning the corresponding second weight value to each word according to its part-of-speech priority ranking.
4. The method according to claim 1, characterized in that adjusting the term frequency (TF) value of each word according to the first weight value and the second weight value comprises:
obtaining the TF value of each word;
multiplying the TF value by the first weight value and the second weight value to determine the adjusted TF value.
5. The method according to claim 1, characterized in that calculating the third weight value of each word from the adjusted TF value comprises:
obtaining the IDF value of each word;
determining the third weight value according to the adjusted TF value of each word and the IDF value of each word.
6. The method according to claim 1, characterized in that extracting specified words from the words as keywords for retrieval according to the third weight value comprises:
removing the words whose third weight value is less than a preset weight threshold, and taking the remaining words as the specified words.
7. The method according to claim 1, characterized in that the content type includes at least one of the following:
a text content type into which the text is divided according to a preset text format, the position type of a paragraph in the text, and the position type of a sentence in the text.
8. A keyword extraction device, characterized by comprising:
a first determining module, configured to perform text structure analysis on an input text to determine a first weight value corresponding to each content type in the text;
a second determining module, configured to perform semantic analysis on each word in the text to determine a second weight value corresponding to each word;
an adjusting module, configured to adjust the term frequency (TF) value of each word according to the first weight value and the second weight value, and to calculate a third weight value of each word from the adjusted TF value;
an extraction module, configured to extract, according to the third weight value, specified words from the words as keywords for retrieval.
9. A storage medium, characterized in that the storage medium stores a computer program, wherein the computer program is configured to execute, when run, the method according to any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor is configured to run the computer program to execute the method according to any one of claims 1 to 7.
CN201910703459.0A 2019-07-31 2019-07-31 Keyword extraction method and device Active CN110427626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910703459.0A CN110427626B (en) 2019-07-31 2019-07-31 Keyword extraction method and device

Publications (2)

Publication Number Publication Date
CN110427626A true CN110427626A (en) 2019-11-08
CN110427626B CN110427626B (en) 2022-12-09

Family

ID=68413496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910703459.0A Active CN110427626B (en) 2019-07-31 2019-07-31 Keyword extraction method and device

Country Status (1)

Country Link
CN (1) CN110427626B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130185308A1 (en) * 2012-01-13 2013-07-18 International Business Machines Corporation System and method for extraction of off-topic part from conversation
CN103927302A (en) * 2013-01-10 2014-07-16 阿里巴巴集团控股有限公司 Text classification method and system
CN108920488A (en) * 2018-05-14 2018-11-30 平安科技(深圳)有限公司 The natural language processing method and device that multisystem combines

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931480A (en) * 2020-07-03 2020-11-13 北京新联财通咨询有限公司 Method and device for determining main content of text, storage medium and computer equipment
CN112417101A (en) * 2020-11-23 2021-02-26 平安科技(深圳)有限公司 Keyword extraction method and related device
CN112417101B (en) * 2020-11-23 2023-08-18 平安科技(深圳)有限公司 Keyword extraction method and related device

Also Published As

Publication number Publication date
CN110427626B (en) 2022-12-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant