CN110263137A - Method and apparatus for extracting topic keywords, and electronic device - Google Patents

Method and apparatus for extracting topic keywords, and electronic device

Info

Publication number
CN110263137A
Authority
CN
China
Prior art keywords
answer
keyword
question
target
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910468420.5A
Other languages
Chinese (zh)
Other versions
CN110263137B (en)
Inventor
谷银波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910468420.5A
Publication of CN110263137A
Application granted
Publication of CN110263137B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F16/355: Class or cluster creation or modification
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of this specification provide a method and apparatus for extracting topic keywords, and an electronic device. The method comprises: reading a target Q&A entry from a question-and-answer (Q&A) knowledge base, wherein the target Q&A entry includes question data and answer data; extracting keywords from the question data and from the answer data respectively; determining whether the keywords extracted from the question data and the keywords extracted from the answer data share an identical target keyword; and, if an identical target keyword exists, determining the target keyword to be a topic keyword of the target Q&A entry.

Description

Method and apparatus for extracting topic keywords, and electronic device
Technical field
One or more embodiments of this specification relate to the field of computer application technology, and in particular to a method and apparatus for extracting topic keywords, and an electronic device.
Background
In many technical fields, common questions and their corresponding answers (Frequently Asked Questions, FAQ) are recorded so that an answer can be found quickly when the same question comes up again. As more questions and answers are recorded, they are usually entered into a database to form a knowledge base. As the volume of data in the knowledge base grows, the entries usually need to be classified so that the knowledge base can be searched quickly.
Summary of the invention
This specification proposes a method for extracting topic keywords, the method comprising:
reading a target Q&A entry from a question-and-answer (Q&A) knowledge base, wherein the target Q&A entry includes question data and answer data;
extracting keywords from the question data and from the answer data respectively;
determining whether the keywords extracted from the question data and the keywords extracted from the answer data share an identical target keyword;
if an identical target keyword exists, determining the target keyword to be a topic keyword of the target Q&A entry.
Optionally, the method further comprises:
adding a classification label to the target Q&A entry based on the topic keywords of the target Q&A entry.
Optionally, adding a classification label to the target Q&A entry based on the topic keywords of the target Q&A entry comprises:
if the target Q&A entry has multiple topic keywords, determining the target topic keyword that occurs most often in the question data and the answer data, and storing the target topic keyword in the Q&A knowledge base as the classification label of the target Q&A entry;
if the target Q&A entry has exactly one topic keyword, storing that topic keyword in the Q&A knowledge base as the classification label of the target Q&A entry.
Optionally, the method further comprises:
adding the topic keywords to the search keyword set of a search engine connected to the Q&A knowledge base.
Optionally, the keyword extraction algorithm used to extract keywords from the question data and the answer data is the TextRank algorithm or the TF-IDF algorithm.
This specification also proposes an apparatus for extracting topic keywords, the apparatus comprising:
a reading module, configured to read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
an extraction module, configured to extract keywords from the question data and from the answer data respectively;
a first determining module, configured to determine whether the keywords extracted from the question data and the keywords extracted from the answer data share an identical target keyword;
a second determining module, configured to determine the target keyword to be a topic keyword of the target Q&A entry when an identical target keyword exists.
Optionally, the apparatus further comprises:
a first adding module, configured to add a classification label to the target Q&A entry based on the topic keywords of the target Q&A entry.
Optionally, the first adding module is specifically configured to:
if the target Q&A entry has multiple topic keywords, determine the target topic keyword that occurs most often in the question data and the answer data, and store the target topic keyword in the Q&A knowledge base as the classification label of the target Q&A entry;
if the target Q&A entry has exactly one topic keyword, store that topic keyword in the Q&A knowledge base as the classification label of the target Q&A entry.
Optionally, the apparatus further comprises:
a second adding module, configured to add the topic keywords to the search keyword set of a search engine connected to the Q&A knowledge base.
Optionally, the keyword extraction algorithm used to extract keywords from the question data and the answer data is the TextRank algorithm or the TF-IDF algorithm.
This specification also proposes an electronic device, the electronic device comprising:
a processor;
a memory for storing machine-executable instructions;
wherein, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for keyword extraction, the processor is caused to:
read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
extract keywords from the question data and from the answer data respectively;
determine whether the keywords extracted from the question data and the keywords extracted from the answer data share an identical target keyword;
if an identical target keyword exists, determine the target keyword to be a topic keyword of the target Q&A entry.
In the above technical solution, for a Q&A knowledge base, keywords can be extracted separately from the question data and the answer data of a Q&A entry, and any keyword extracted from both the question data and the answer data can be determined to be a topic keyword of that entry. On the one hand, the topic keywords of the individual Q&A entries can be used to classify the entries in the Q&A knowledge base, which makes it easy to search the knowledge base quickly by topic keyword. On the other hand, because a topic keyword is extracted from both the question data and the answer data, it reflects the main content of the Q&A entry more accurately, which can improve retrieval accuracy for the Q&A knowledge base.
Brief description of the drawings
Fig. 1 is a schematic diagram of a topic keyword extraction system according to an exemplary embodiment of this specification;
Fig. 2 is a flowchart of a method for extracting topic keywords according to an exemplary embodiment of this specification;
Fig. 3 is a schematic diagram of a user interface according to an exemplary embodiment of this specification;
Fig. 4 is a schematic diagram of another user interface according to an exemplary embodiment of this specification;
Fig. 5 is a hardware structure diagram of the electronic device in which an apparatus for extracting topic keywords is located, according to an exemplary embodiment of this specification;
Fig. 6 is a block diagram of an apparatus for extracting topic keywords according to an exemplary embodiment of this specification.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of this specification, as detailed in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms "a", "said" and "the" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various pieces of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
This specification aims to provide a technical solution that, for a Q&A knowledge base, determines the keywords shared by the question data and the answer data of a Q&A entry to be the topic keywords of that entry.
In a specific implementation, the Q&A entries in the Q&A knowledge base can be traversed so as to read from it a Q&A entry that has not yet been classified. Keywords can then be extracted from the question data of that entry and from its answer data.
Next, the keywords extracted from the question data can be compared with the keywords extracted from the answer data to determine whether the two sets share an identical target keyword.
If an identical target keyword exists, the target keyword can be determined to be a topic keyword of the Q&A entry.
In this way, the Q&A entries in the Q&A knowledge base can then be classified according to the topic keywords of each entry.
This specification is described below through specific embodiments.
Refer to Fig. 1, which is a schematic diagram of a keyword extraction system according to an exemplary embodiment of this specification.
As shown in Fig. 1, the keyword extraction system may include a Q&A knowledge base and an electronic device connected to the Q&A knowledge base. The electronic device can perform keyword extraction on the Q&A knowledge base. The electronic device may be a server, a computer, a mobile phone, a tablet device, a laptop, a personal digital assistant (PDA), or the like, and this specification places no restriction on this.
In practical applications, the Q&A knowledge base may be a knowledge base for storing Q&A data. The Q&A data may be stored in the Q&A knowledge base in the form of Q&A entries, and one Q&A entry may include one question and one answer to that question. For example, the Q&A data stored in the Q&A knowledge base may be as shown in Table 1 below:
Table 1
Here, answer 1 may be the answer to question 1, and question 1 and answer 1 form Q&A entry 1; answer 2 may be the answer to question 2, and question 2 and answer 2 form Q&A entry 2; and so on.
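For illustration only, the entry structure described above might be modeled as follows. This is a minimal sketch, not the storage format of the specification: the class name `QAEntry`, its field names, and the `unclassified` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAEntry:
    """One Q&A entry: a question, its answer, and an optional label (hypothetical model)."""
    entry_id: int
    question: str
    answer: str
    label: Optional[str] = None  # classification label; None until the entry is classified


# Placeholder rows mirroring the structure of Table 1.
knowledge_base: List[QAEntry] = [
    QAEntry(1, "question 1", "answer 1"),
    QAEntry(2, "question 2", "answer 2"),
]


def unclassified(entries: List[QAEntry]) -> List[QAEntry]:
    """Return the entries that have not been given a classification label yet."""
    return [entry for entry in entries if entry.label is None]
```

Keeping the label on the entry itself is only one possible design; a separate label table keyed by entry identifier would serve equally well.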
Refer to Fig. 2, which is a flowchart of a method for extracting topic keywords according to an exemplary embodiment of this specification. The method can be applied to the electronic device shown in Fig. 1 and includes the following steps:
Step 202: read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
Step 204: extract keywords from the question data and from the answer data respectively;
Step 206: determine whether the keywords extracted from the question data and the keywords extracted from the answer data share an identical target keyword;
Step 208: if an identical target keyword exists, determine the target keyword to be a topic keyword of the target Q&A entry.
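Steps 202 to 208 can be sketched end to end as below. This is an illustrative sketch under stated assumptions: `naive_extractor` is a hypothetical stand-in for a real keyword extraction algorithm such as TextRank or TF-IDF, and the dictionary layout of an entry is assumed for the example.

```python
import re
from typing import Callable, Dict, List


def naive_extractor(text: str) -> List[str]:
    """Hypothetical stand-in extractor: distinct lowercase words longer than 3 letters."""
    keywords: List[str] = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if len(word) > 3 and word not in keywords:
            keywords.append(word)
    return keywords


def extract_topic_keywords(
    entry: Dict[str, str],
    extract: Callable[[str], List[str]] = naive_extractor,
) -> List[str]:
    # Step 204: extract keywords from the question data and the answer data separately.
    question_keywords = extract(entry["question"])
    answer_keywords = set(extract(entry["answer"]))
    # Steps 206 and 208: keywords found on both sides become topic keywords.
    return [kw for kw in question_keywords if kw in answer_keywords]


# Step 202: a target entry as it might be read from the knowledge base.
target_entry = {
    "question": "How do I reset my password?",
    "answer": "Use the reset link to choose a new password.",
}
topic_keywords = extract_topic_keywords(target_entry)
```

Passing the extractor in as a parameter keeps the comparison logic independent of whichever extraction algorithm is chosen.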
In this embodiment, the electronic device can first read a Q&A entry (referred to as the target Q&A entry) from the Q&A knowledge base connected to it. The target Q&A entry may include question data and answer data.
Taking the Q&A knowledge base shown in Table 1 above as an example, the electronic device connected to the Q&A knowledge base can read Q&A entry 1, which includes question 1 (the question data) and answer 1 (the answer data), as the target Q&A entry; it can also read Q&A entry 2, which includes question 2 (the question data) and answer 2 (the answer data), as the target Q&A entry; and so on.
After the target Q&A entry has been read, keywords can be extracted from the question data of the target Q&A entry and from its answer data.
In one illustrated implementation, keywords can be extracted from the question data based on a preset keyword extraction algorithm. The keyword extraction algorithm can be preset by a technician and may be a common keyword extraction algorithm such as the TextRank algorithm or the TF-IDF (term frequency-inverse document frequency, a weighting technique commonly used in information retrieval and data mining) algorithm; this specification does not describe these algorithms further.
Likewise, keywords can be extracted from the answer data based on a preset keyword extraction algorithm.
It should be noted that, to keep keyword extraction consistent, the keyword extraction algorithm used for the question data can be the same as the one used for the answer data. In practical applications, however, the algorithm used for the question data may also differ from the one used for the answer data, and this specification places no restriction on this.
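As one possible illustration of TF-IDF keyword extraction, the sketch below scores each term of a text by its term frequency multiplied by a smoothed inverse document frequency computed over a reference corpus. The tokenizer, the smoothing formula, the `top_k` parameter, and the function names are assumptions made for this example; the specification does not prescribe any particular variant.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_keywords(text: str, corpus: List[str], top_k: int = 3) -> List[str]:
    """Return the top_k terms of `text` ranked by TF-IDF against `corpus`."""
    tokens = tokenize(text)
    if not tokens:
        return []
    term_counts = Counter(tokens)
    doc_term_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_term_sets)
    scores = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for terms in doc_term_sets if term in terms)
        # Smoothed IDF so that terms absent from the corpus do not divide by zero.
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[term] = (count / len(tokens)) * idf
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:top_k]
```

The same function can be applied to the question data and to the answer data, which corresponds to the case where both sides use the same extraction algorithm.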
After keywords have been extracted from the question data and from the answer data, the keywords extracted from the question data can be compared with the keywords extracted from the answer data to determine whether the two sets share an identical keyword (referred to as the target keyword).
If it is determined that an identical target keyword exists, the target keyword can be determined to be a topic keyword of the target Q&A entry. A topic keyword is a keyword that reflects the main content of the target Q&A entry.
For example, suppose that the keywords extracted from the question data of a Q&A entry are keyword 1, keyword 2, and keyword 3, and that the keywords extracted from the answer data of that entry are keyword 2, keyword 3, and keyword 4. In this case, comparing the keywords extracted from the question data with those extracted from the answer data shows that keyword 2 and keyword 3 appear in both, so keyword 2 and keyword 3 can both serve as target keywords. Keyword 2 and keyword 3 can then be determined to be the topic keywords of the Q&A entry.
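The comparison in this example amounts to an intersection of the two keyword lists. The following sketch reproduces it; the function name `shared_keywords` is hypothetical, and preserving the order of the question keywords is a choice made for the example.

```python
from typing import List


def shared_keywords(question_keywords: List[str], answer_keywords: List[str]) -> List[str]:
    """Return keywords present in both lists, in question order, without duplicates."""
    answer_set = set(answer_keywords)
    seen = set()
    shared = []
    for keyword in question_keywords:
        if keyword in answer_set and keyword not in seen:
            shared.append(keyword)
            seen.add(keyword)
    return shared


question_kws = ["keyword 1", "keyword 2", "keyword 3"]
answer_kws = ["keyword 2", "keyword 3", "keyword 4"]
topic_keywords = shared_keywords(question_kws, answer_kws)
```

An empty result corresponds to the case where no identical target keyword exists, so no topic keyword is determined for the entry.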
In one illustrated implementation, after the topic keywords of the target Q&A entry have been determined, it can further be judged whether the target Q&A entry has multiple topic keywords, that is, whether it has exactly one topic keyword.
If the target Q&A entry has exactly one topic keyword, that topic keyword can be stored directly in the Q&A knowledge base as the label of the target Q&A entry; in other words, the topic keyword is used directly to add a classification label to the target Q&A entry in the Q&A knowledge base.
If the target Q&A entry has multiple topic keywords, the number of times each topic keyword occurs in the question data and the answer data of the target Q&A entry can be counted to determine the topic keyword that occurs most often in the question data and the answer data (referred to as the target topic keyword). The target topic keyword can then be stored in the Q&A knowledge base as the label of the target Q&A entry; in other words, the target topic keyword is used to add a classification label to the target Q&A entry in the Q&A knowledge base.
For example, suppose that the topic keywords determined for a Q&A entry are keyword 1 and keyword 2. The number of times keyword 1 and keyword 2 each occur in the question data and the answer data of the entry can then be counted. If keyword 1 occurs fewer times in the question data and the answer data than keyword 2 does, keyword 2 can be taken as the target topic keyword of the entry and stored as its label in the Q&A knowledge base where the entry is located.
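A minimal sketch of this label selection follows. It assumes that occurrences are counted as case-insensitive substring matches and that ties go to the keyword listed first; neither detail is specified in the text, and the function names are hypothetical.

```python
from typing import List, Optional


def count_occurrences(keyword: str, text: str) -> int:
    """Count case-insensitive, non-overlapping substring occurrences of keyword in text."""
    return text.lower().count(keyword.lower())


def choose_label(topic_keywords: List[str], question: str, answer: str) -> Optional[str]:
    """Pick the classification label from the topic keywords of one entry."""
    if not topic_keywords:
        return None  # no shared keyword, so no label can be derived
    if len(topic_keywords) == 1:
        return topic_keywords[0]  # exactly one topic keyword: use it directly
    # A newline separator prevents a match from spanning question and answer.
    combined = question + "\n" + answer
    # max() keeps the first keyword among equally frequent ones.
    return max(topic_keywords, key=lambda kw: count_occurrences(kw, combined))
```

For languages without whitespace word boundaries, counting over segmented tokens rather than raw substrings may be the more appropriate choice.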
Alternatively, the multiple topic keywords of the target Q&A entry can be presented to a user through a user interface. The user can select one topic keyword (referred to as the target topic keyword) from these topic keywords through the user interface. The target topic keyword can then be stored in the Q&A knowledge base as the label of the target Q&A entry; in other words, the target topic keyword is used to add a classification label to the target Q&A entry in the Q&A knowledge base.
Refer to Fig. 3, which is a schematic diagram of a user interface according to an exemplary embodiment of this specification.
As shown in Fig. 3, the user interface may be a user interface provided by a customer service system that offers online customer service to users. The customer service system can be connected to the Q&A knowledge base described above.
The user can enter a keyword for the information they want in the text input box provided by the user interface. After entering the keyword, the user can click the "Send" button in the user interface. When the customer service system detects the user's click on the "Send" button, it can obtain the keyword the user has entered and search the Q&A knowledge base connected to the customer service system for the Q&A entries that the keyword hits, that is, the Q&A entries whose labels include the keyword. The customer service system can then display the Q&A entries it found to the user for viewing.
Take the Q&A knowledge base shown in Table 2 below as an example:
Table 2
Suppose that the keyword the user enters in the user interface provided by the customer service system connected to the Q&A knowledge base is keyword 1. Because the labels of Q&A entry 1 and Q&A entry 2 both include keyword 1, the customer service system can display Q&A entry 1 and Q&A entry 2 to the user for viewing.
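The label lookup described here can be sketched as a simple filter over the stored labels. The contents of Table 2 are not reproduced in this text, so the label data below is entirely hypothetical; it is chosen only so that entries 1 and 2 carry keyword 1, as in the example. Treating "includes" as membership in a set of labels is likewise an assumption.

```python
from typing import Dict, List, Set

# Hypothetical label data; NOT the actual contents of Table 2.
labels_by_entry: Dict[str, Set[str]] = {
    "entry 1": {"keyword 1"},
    "entry 2": {"keyword 1", "keyword 2"},
    "entry 3": {"keyword 3"},
}


def find_entries(keyword: str, labels: Dict[str, Set[str]]) -> List[str]:
    """Return the entries whose labels include the keyword, in stored order."""
    return [entry_id for entry_id, entry_labels in labels.items() if keyword in entry_labels]


hits = find_entries("keyword 1", labels_by_entry)
```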
In one illustrated implementation, after the topic keywords of the target Q&A entry have been determined, the topic keywords can further be added to the search keyword set of a search engine connected to the Q&A knowledge base.
Refer to Fig. 4, which is a schematic diagram of another user interface according to an exemplary embodiment of this specification.
As shown in Fig. 4, the user interface may be a user interface provided by a customer service system that offers online service to users. The customer service system can be connected to the Q&A knowledge base through the search engine described above.
The customer service system can display the search keyword set of the search engine in the user interface, so that the user can click a keyword displayed in the user interface to obtain information related to that keyword.
For example, the user can click "keyword 1" in the user interface. When the customer service system detects the user's click on "keyword 1", it can use the search engine to search the Q&A knowledge base for the Q&A entries that the keyword hits. The search engine can then return the Q&A entries it found to the customer service system, and the customer service system displays them to the user for viewing.
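The interaction between topic keywords and the search keyword set can be illustrated with a small in-memory index. This is only a sketch: the `SearchEngine` class, its methods, and the use of an inverted index are assumptions for the example and do not describe any particular search engine.

```python
from typing import Dict, List, Set


class SearchEngine:
    """Hypothetical in-memory search engine holding a search keyword set."""

    def __init__(self) -> None:
        self.search_keywords: Set[str] = set()   # keywords offered to the user
        self.index: Dict[str, List[int]] = {}    # keyword -> ids of matching entries

    def add_topic_keywords(self, entry_id: int, topic_keywords: List[str]) -> None:
        """Register the topic keywords of one entry in the search keyword set."""
        for keyword in topic_keywords:
            self.search_keywords.add(keyword)
            self.index.setdefault(keyword, []).append(entry_id)

    def lookup(self, keyword: str) -> List[int]:
        """Return ids of the entries hit by the clicked keyword (empty if none)."""
        return list(self.index.get(keyword, []))


engine = SearchEngine()
engine.add_topic_keywords(1, ["keyword 1", "keyword 2"])
engine.add_topic_keywords(2, ["keyword 1"])
clicked_hits = engine.lookup("keyword 1")
```

The customer service system would render `search_keywords` as clickable items and call `lookup` when one of them is clicked.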
In the above technical solution, for a Q&A knowledge base, keywords can be extracted separately from the question data and the answer data of a Q&A entry, and any keyword extracted from both the question data and the answer data can be determined to be a topic keyword of that entry. On the one hand, the topic keywords of the individual Q&A entries can be used to classify the entries in the Q&A knowledge base, which makes it easy to search the knowledge base quickly by topic keyword. On the other hand, because a topic keyword is extracted from both the question data and the answer data, it reflects the main content of the Q&A entry more accurately, which can improve retrieval accuracy for the Q&A knowledge base.
Corresponding to the foregoing embodiments of the method for extracting topic keywords, this specification also provides embodiments of an apparatus for extracting topic keywords.
The embodiments of the apparatus for extracting topic keywords can be applied to an electronic device. The apparatus embodiments may be implemented in software, or in hardware or a combination of software and hardware. Taking a software implementation as an example, the apparatus in the logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from non-volatile storage into memory and running them. At the hardware level, Fig. 5 shows a hardware structure diagram of the electronic device in which the apparatus for extracting topic keywords is located. In addition to the processor, memory, network interface, and non-volatile storage shown in Fig. 5, the electronic device in which the apparatus is located may include other hardware depending on the actual function of the apparatus, which is not described further here.
Refer to Fig. 6, which is a block diagram of an apparatus for extracting topic keywords according to an exemplary embodiment of this specification. The apparatus 60 can be applied to the electronic device shown in Fig. 5 and includes:
a reading module 601, configured to read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
an extraction module 602, configured to extract keywords from the question data and from the answer data respectively;
a first determining module 603, configured to determine whether the keywords extracted from the question data and the keywords extracted from the answer data share an identical target keyword;
a second determining module 604, configured to determine the target keyword to be a topic keyword of the target Q&A entry when an identical target keyword exists.
In this embodiment, the apparatus 60 may further include:
a first adding module 605, configured to add a classification label to the target Q&A entry based on the subject keywords of the target Q&A entry.
In this embodiment, the first adding module 605 may be specifically configured to:
if the target Q&A entry has multiple subject keywords, determine the target subject keyword that occurs most frequently in the question data and the answer data, and store that target subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry;
if the target Q&A entry has a single subject keyword, store that subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry.
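The label selection rule above can be sketched as follows. This is only an illustration under stated assumptions: occurrences are counted by case-insensitive substring matching, ties go to the first keyword, and the `labels_by_entry_id` dictionary is a hypothetical stand-in for storage in the Q&A knowledge base.

```python
from typing import Optional


def count_occurrences(keyword: str, text: str) -> int:
    """Count non-overlapping, case-insensitive occurrences of keyword in text."""
    return text.lower().count(keyword.lower())


def choose_classification_label(subject_keywords, question, answer) -> Optional[str]:
    """Pick the classification label from the subject keywords of one entry."""
    if not subject_keywords:
        return None  # no subject keyword, so no label
    if len(subject_keywords) == 1:
        return subject_keywords[0]  # single subject keyword: use it directly
    combined = question + " " + answer
    # Multiple subject keywords: take the one that occurs most often in the
    # question data and answer data together (first one wins on ties).
    return max(subject_keywords, key=lambda kw: count_occurrences(kw, combined))


# Hypothetical in-memory stand-in for label storage in the knowledge base.
labels_by_entry_id = {}
question = "How do I reset my payment password?"
answer = "Open settings, choose password, then reset the payment password."
label = choose_classification_label(["reset", "password"], question, answer)
labels_by_entry_id["entry-001"] = label
print(label)
```

Substring counting is a simplification; a tokenizer-based count would avoid matching a keyword inside a longer word.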
In this embodiment, the apparatus 60 may further include:
a second adding module 606, configured to add the subject keywords to the search keyword set of a search engine connected to the Q&A knowledge base.
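As a rough illustration of the second adding module, the sketch below keeps the search keyword set as an in-memory Python set. The `SearchEngine` class and its `add_keywords` method are hypothetical names; the specification does not describe the search engine's interface, and lowercasing the keywords is an assumption of this sketch.

```python
class SearchEngine:
    """Hypothetical stand-in for the search engine connected to the Q&A knowledge base."""

    def __init__(self):
        self.search_keywords = set()

    def add_keywords(self, keywords):
        """Add keywords to the search keyword set; return those newly added."""
        normalized = {kw.strip().lower() for kw in keywords if kw.strip()}
        new_keywords = normalized - self.search_keywords
        self.search_keywords |= new_keywords
        return new_keywords


engine = SearchEngine()
first_batch = engine.add_keywords(["Password", "reset"])
second_batch = engine.add_keywords(["password", "refund"])
print(sorted(engine.search_keywords))
```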
In this embodiment, the keyword extraction algorithm used to extract keywords from the question data and the answer data is the TextRank algorithm or the TF-IDF algorithm.
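For readers unfamiliar with TF-IDF, a minimal sketch follows. The specification does not fix a particular TF-IDF variant; this one assumes tf = term count divided by document length and a smoothed idf = ln((1 + N) / (1 + df)) + 1, and the function names and sample corpus are illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_keywords(document, corpus, top_k=3):
    """Rank the terms of `document` by TF-IDF computed against `corpus`."""
    doc_tokens = tokenize(document)
    if not doc_tokens:
        return []
    corpus_token_sets = [set(tokenize(d)) for d in corpus]
    n_docs = len(corpus_token_sets)
    scores = {}
    for term, count in Counter(doc_tokens).items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for tokens in corpus_token_sets if term in tokens)
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = (count / len(doc_tokens)) * idf
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


corpus = [
    "how to reset the password",
    "how to change the phone number",
    "how to close the account",
]
print(tfidf_keywords(corpus[0], corpus, top_k=2))
```

Terms such as "how", "to", and "the" occur in every document of the sample corpus, so their idf is lowest and they rank below the terms specific to the first document.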
For the implementation of the functions and effects of each module in the above apparatus, see the implementation of the corresponding steps in the above method; details are not repeated here.
Because the apparatus embodiments essentially correspond to the method embodiments, refer to the description of the method embodiments for relevant details. The apparatus embodiments described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this specification. A person of ordinary skill in the art can understand and implement it without creative effort.
The systems, apparatuses, or modules described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail sending and receiving device, game console, tablet computer, wearable device, or a combination of any of these devices.
Corresponding to the foregoing embodiments of the subject keyword extraction method, this specification also provides an embodiment of an electronic device. The electronic device includes a processor and a memory for storing machine-executable instructions, wherein the processor and the memory are usually connected to each other through an internal bus. In other possible implementations, the device may also include an external interface so that it can communicate with other devices or components.
In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for keyword extraction, the processor is caused to:
read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
extract keywords from the question data and from the answer data separately;
determine whether an identical target keyword exists among the keywords extracted from the question data and the keywords extracted from the answer data;
if an identical target keyword exists, determine the target keyword as a subject keyword of the target Q&A entry.
In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for keyword extraction, the processor is further caused to:
add a classification label to the target Q&A entry based on the subject keywords of the target Q&A entry.
In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for keyword extraction, the processor is caused to:
if the target Q&A entry has multiple subject keywords, determine the target subject keyword that occurs most frequently in the question data and the answer data, and store that target subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry;
if the target Q&A entry has a single subject keyword, store that subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry.
In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for keyword extraction, the processor is further caused to:
add the subject keywords to the search keyword set of a search engine connected to the Q&A knowledge base.
In this embodiment, the keyword extraction algorithm used to extract keywords from the question data and the answer data is the TextRank algorithm or the TF-IDF algorithm.
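The TextRank alternative can be sketched in a simplified form: build an undirected co-occurrence graph over tokens and run PageRank-style iterations on it. This sketch is an assumption-laden illustration, not the specification's implementation; it uses unweighted edges, omits the stop-word and part-of-speech filtering that TextRank implementations commonly apply, and its parameter names and defaults are chosen for illustration.

```python
import re
from collections import defaultdict


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Rank tokens with a simplified, unweighted TextRank."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Link each token to the tokens that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each node receives an equal share of the score of every neighbor.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


text = "password reset password payment password login"
print(textrank_keywords(text, top_k=1, window=1))
```

In the sample text every other token is adjacent only to "password", so the graph is a star and its center ranks highest.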
Those skilled in the art will readily conceive of other embodiments of this specification after considering the specification and practicing the invention disclosed herein. This specification is intended to cover any variations, uses, or adaptations of this specification that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed in this specification. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of this specification are indicated by the following claims.
It should be understood that this specification is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of this specification is limited only by the appended claims.
The foregoing describes only preferred embodiments of one or more embodiments of this specification and is not intended to limit one or more embodiments of this specification. Any modification, equivalent replacement, or improvement made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.

Claims (11)

1. A method for extracting subject keywords, the method comprising:
reading a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
extracting keywords from the question data and from the answer data separately;
determining whether an identical target keyword exists among the keywords extracted from the question data and the keywords extracted from the answer data;
if an identical target keyword exists, determining the target keyword as a subject keyword of the target Q&A entry.
2. The method according to claim 1, further comprising:
adding a classification label to the target Q&A entry based on the subject keywords of the target Q&A entry.
3. The method according to claim 2, wherein adding a classification label to the target Q&A entry based on the subject keywords of the target Q&A entry comprises:
if the target Q&A entry has multiple subject keywords, determining the target subject keyword that occurs most frequently in the question data and the answer data, and storing that target subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry;
if the target Q&A entry has a single subject keyword, storing that subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry.
4. The method according to claim 1, further comprising:
adding the subject keywords to the search keyword set of a search engine connected to the Q&A knowledge base.
5. The method according to claim 1, wherein the keyword extraction algorithm used to extract keywords from the question data and the answer data is the TextRank algorithm or the TF-IDF algorithm.
6. An apparatus for extracting subject keywords, the apparatus comprising:
a reading module, configured to read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
an extraction module, configured to extract keywords from the question data and from the answer data separately;
a first determining module, configured to determine whether an identical target keyword exists among the keywords extracted from the question data and the keywords extracted from the answer data;
a second determining module, configured to, when an identical target keyword exists, determine the target keyword as a subject keyword of the target Q&A entry.
7. The apparatus according to claim 6, further comprising:
a first adding module, configured to add a classification label to the target Q&A entry based on the subject keywords of the target Q&A entry.
8. The apparatus according to claim 7, wherein the first adding module is specifically configured to:
if the target Q&A entry has multiple subject keywords, determine the target subject keyword that occurs most frequently in the question data and the answer data, and store that target subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry;
if the target Q&A entry has a single subject keyword, store that subject keyword in the Q&A knowledge base as the classification label of the target Q&A entry.
9. The apparatus according to claim 6, further comprising:
a second adding module, configured to add the subject keywords to the search keyword set of a search engine connected to the Q&A knowledge base.
10. The apparatus according to claim 6, wherein the keyword extraction algorithm used to extract keywords from the question data and the answer data is the TextRank algorithm or the TF-IDF algorithm.
11. An electronic device, the electronic device comprising:
a processor;
a memory for storing machine-executable instructions;
wherein, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic for keyword extraction, the processor is caused to:
read a target Q&A entry from a Q&A knowledge base, wherein the target Q&A entry includes question data and answer data;
extract keywords from the question data and from the answer data separately;
determine whether an identical target keyword exists among the keywords extracted from the question data and the keywords extracted from the answer data;
if an identical target keyword exists, determine the target keyword as a subject keyword of the target Q&A entry.
CN201910468420.5A 2019-05-31 2019-05-31 Theme keyword extraction method and device and electronic equipment Active CN110263137B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910468420.5A CN110263137B (en) 2019-05-31 2019-05-31 Theme keyword extraction method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910468420.5A CN110263137B (en) 2019-05-31 2019-05-31 Theme keyword extraction method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110263137A true CN110263137A (en) 2019-09-20
CN110263137B CN110263137B (en) 2023-06-06

Family

ID=67916218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910468420.5A Active CN110263137B (en) 2019-05-31 2019-05-31 Theme keyword extraction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110263137B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114462384A (en) * 2022-04-12 2022-05-10 北京大学 Metadata automatic generation device for digital object modeling

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902652A (en) * 2014-02-27 2014-07-02 深圳市智搜信息技术有限公司 Automatic question-answering system
CN105528437A (en) * 2015-12-17 2016-04-27 浙江大学 Question-answering system construction method based on structured text knowledge extraction
WO2016101727A1 (en) * 2014-12-23 2016-06-30 北京奇虎科技有限公司 Question-and-answer-based search result adjustment method and device
US20170330087A1 (en) * 2016-05-11 2017-11-16 International Business Machines Corporation Automated Distractor Generation by Identifying Relationships Between Reference Keywords and Concepts
CN107993724A (en) * 2017-11-09 2018-05-04 易保互联医疗信息科技(北京)有限公司 A kind of method and device of medicine intelligent answer data processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王飞鸿: "基于自动生成知识库的智能问答***设计", 《中国科技信息》 *

Also Published As

Publication number Publication date
CN110263137B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
CN107851092B (en) Personal entity modeling
US9971847B2 (en) Automating browser tab groupings based on the similarity of facial features in images
US10210243B2 (en) Method and system for enhanced query term suggestion
CN110390054B (en) Interest point recall method, device, server and storage medium
US9720904B2 (en) Generating training data for disambiguation
TWI710917B (en) Data processing method and device
US20100198816A1 (en) System and method for presenting content representative of document search
CN109325108B (en) Query processing method, device, server and storage medium
US11361030B2 (en) Positive/negative facet identification in similar documents to search context
CN108701155A (en) Expert's detection in social networks
WO2015081720A1 (en) Instant messaging (im) based information recommendation method, apparatus, and terminal
US10956470B2 (en) Facet-based query refinement based on multiple query interpretations
US20160378847A1 (en) Distributional alignment of sets
CN105740454A (en) Display method and device of picture folder and electronic equipment
Lavid Ben Lulu et al. Functionality-based clustering using short textual description: Helping users to find apps installed on their mobile device
CN114330329A (en) Service content searching method and device, electronic equipment and storage medium
CN111310065A (en) Social contact recommendation method and device, server and storage medium
CN114357325A (en) Content search method, device, equipment and medium
CN110263137A (en) The extracting method and device of subject key words, electronic equipment
WO2018106550A1 (en) Query disambiguation by means of disambiguating dialog questions
Maiya et al. Exploratory analysis of highly heterogeneous document collections
US9286349B2 (en) Dynamic search system
CN110659353A (en) Searching method and device
KR102113663B1 (en) Hierarchical classification-based incremental class learning method and computing device for digital storytelling
CN109145084B (en) Data processing method, data processing device and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201009

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201009

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right
GR01 Patent grant