CN113704393A - Keyword extraction method, device, equipment and medium - Google Patents

Keyword extraction method, device, equipment and medium

Info

Publication number
CN113704393A
CN113704393A (application CN202110393894.5A)
Authority
CN
China
Prior art keywords
comment text
emotion
comment
feature word
information entropy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110393894.5A
Other languages
Chinese (zh)
Inventor
Lin Yue (林岳)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110393894.5A
Publication of CN113704393A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a keyword extraction method, apparatus, device, and medium, and relates to the field of data processing. The method comprises: obtaining a first comment text from a plurality of comment texts, where the emotion tag of the first comment text is a first emotion tag; segmenting the first comment text to obtain a feature word set of the first comment text; calculating an information entropy set of the first comment text according to the feature word set, where each information entropy in the set is calculated from a feature word in the feature word set; and determining the keywords of the first comment text according to the information entropy set. Because information entropy captures the differences between comment texts, keywords obtained through information entropy have stronger interpretability for the emotion classification results, improving both the modeling effect and the interpretability.

Description

Keyword extraction method, device, equipment and medium
Technical Field
The present application relates to the field of data processing, and in particular, to a keyword extraction method, apparatus, device, and medium.
Background
In an actual business scene, in order to confirm the emotional orientation of users' comment texts, emotion classification is performed on the comment texts, for example, judging whether a user's comment text is optimistic or pessimistic. After emotion classification is completed, in order to improve interpretability, the keywords under a given emotion classification need to be extracted.
In the related art, after the comment texts are input into a text emotion classification model, the emotion labels of the comment texts are output, and the comment texts are classified according to the emotion labels, for example, into positive comment texts and negative comment texts. After classification is completed, technicians manually mark the keywords associated with the emotion label in each comment text.
Keyword extraction in the related art is inefficient, and in some cases accurate keywords cannot be obtained.
Disclosure of Invention
The embodiment of the application provides a keyword extraction method, apparatus, device, and medium. The method acquires an information entropy set of a comment text and determines the keywords of the comment text according to that set, so that the obtained keywords better match the content of the comment text.
According to an aspect of the present application, there is provided a keyword extraction method, including:
acquiring a first comment text from a plurality of comment texts, wherein an emotion tag of the first comment text is a first emotion tag;
segmenting the first comment text to obtain a feature word set of the first comment text;
calculating an information entropy set of the first comment text according to the feature word set, wherein the information entropy in the information entropy set is obtained by calculation according to the feature words in the feature word set;
and determining the key words of the first comment text according to the information entropy set.
According to an aspect of the present application, there is provided a keyword extraction apparatus including:
the obtaining module is used for obtaining a first comment text from a plurality of comment texts, and the emotion tag of the first comment text is a first emotion tag;
the word segmentation module is used for segmenting the first comment text to obtain a feature word set of the first comment text;
the calculation module is used for calculating an information entropy set of the first comment text according to the feature word set, wherein the information entropy in the information entropy set is obtained by calculation according to the feature words in the feature word set;
and the determining module is used for determining the key words of the first comment text according to the information entropy set.
In an optional design of the present application, the computing module is further configured to randomly determine a first feature word from the feature word set; acquiring the emotion probability of the first feature word, wherein the emotion probability is used for representing the probability that the emotion label of the comment text is the first emotion label when the first feature word appears in the comment text; calculating the information entropy of the first feature word based on the emotion probability of the first feature word; and repeating the three steps until the information entropies corresponding to all the feature words in the feature word set are obtained, and generating the information entropy set of the first comment text.
In an optional design of the present application, the calculation module is further configured to perform word segmentation on the plurality of comment texts, and obtain a feature word set of each comment text; determining m target comment texts containing the first characteristic words according to the characteristic word set of each comment text; acquiring m emotion labels corresponding to the m target comment texts; and calculating the proportion of the first emotion label in the m emotion labels to obtain the emotion probability of the first feature word.
In an optional design of the application, the determining module is further configured to use the feature words corresponding to the n smallest information entropies in the information entropy set as the keywords of the first comment text.
In an optional design of the present application, the determining module is further configured to obtain a preset threshold; and taking the feature words corresponding to the information entropies smaller than the preset threshold value in the information entropy set as the key words of the first comment text.
In an optional design of the present application, the word segmentation module is further configured to input the first comment text into a word segmentation model, and extract a word in the first comment text; generating a directed acyclic graph of the first comment text according to the words; calculating the weighted sum of each path in the directed acyclic graph based on the weighted values of the words in the directed acyclic graph; taking the path with the smallest weight sum in all the paths as an optimal path; and segmenting the first comment text according to the optimal path to obtain a feature word set of the first comment text.
In an alternative design of the present application, the apparatus further includes: and a label extraction module.
The label extraction module is used for inputting the comment texts into the emotion label extraction model and acquiring an input vector of each comment text, wherein the input vector comprises at least one of a word embedding vector, a segment embedding vector and a position embedding vector; inputting the input vector into an encoder model, and outputting an emotion score through a self-attention mechanism; and outputting the emotion label of each comment text based on the emotion score.
In an alternative design of the present application, the apparatus further includes: and a training module.
The training module is used for constructing a training data set, and the training data set comprises a target training sample and a corresponding real emotion label; inputting the target training sample into the emotion label extraction model, and outputting an emotion label of the target training sample; and training the emotion label extraction model based on the difference value between the emotion label of the target training sample and the real emotion label.
In an optional design of the application, the obtaining module is further configured to obtain a first video comment text from the plurality of video comment texts, and an emotion tag of the first video comment text is a positive emotion tag.
The word segmentation module is further configured to segment the first video comment text to obtain a feature word set of the first video comment text.
The calculation module is further configured to calculate an information entropy set of the first video comment text according to the feature word set of the first video comment text.
The determining module is further configured to determine a keyword of the first comment text according to the information entropy set of the first video comment text.
In an optional design of the present application, the determining module is further configured to use the keyword as a tag of a comment object corresponding to the plurality of comment texts; classifying the review object based on the tag of the review object.
According to another aspect of the present application, there is provided a computer device including: a processor and a memory, the memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, the at least one instruction, the at least one program, set of codes, or set of instructions being loaded and executed by the processor to implement the keyword extraction method as described above.
According to another aspect of the present application, there is provided a computer storage medium having at least one program code stored therein, the program code being loaded and executed by a processor to implement the keyword extraction method as described above.
According to another aspect of the application, a computer program product or a computer program is provided, comprising computer instructions, which are stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and executes the computer instructions, so that the computer device executes the keyword extraction method as described above.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
the corresponding information entropy is obtained through emotion label extraction, and the keywords of the comment text are determined according to that information entropy. The lower the information entropy of a feature word, the more strongly the word differentiates the comment texts under different emotion labels, so keywords with lower information entropy have stronger interpretability for the emotion classification results. This improves the modeling effect and interpretability, and further avoids reliance on technicians' subjective experience and possible human error.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a block diagram of an encoder model provided in an exemplary embodiment of the present application;
FIG. 2 is a schematic structural diagram of a BERT model provided in an exemplary embodiment of the present application;
FIG. 3 is a block diagram of a computer system provided in an exemplary embodiment of the present application;
FIG. 4 is a flowchart illustrating a keyword extraction method according to an exemplary embodiment of the present application;
FIG. 5 is a flowchart illustrating a keyword extraction method according to an exemplary embodiment of the present application;
FIG. 6 is a flow chart diagram of a word segmentation method provided by an exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of a directed acyclic graph provided by an exemplary embodiment of the present application;
FIG. 8 is a flowchart illustrating a method for training an emotion tag extraction model according to an exemplary embodiment of the present application;
FIG. 9 is a flowchart illustrating a keyword extraction method according to an exemplary embodiment of the present application;
FIG. 10 is a diagram of a keyword extraction apparatus provided in an exemplary embodiment of the present application;
fig. 11 is a schematic structural diagram of a computer device according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
First, terms referred to in the embodiments of the present application are described:
artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence is a comprehensive discipline covering a wide range of fields, spanning both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specifically studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching learning.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, for example, common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical treatment, smart customer service, and the like.
Information entropy: a concept borrowed from thermodynamics, where thermal entropy is a physical quantity representing the degree of disorder of molecular states. Shannon used the concept of information entropy to describe the uncertainty of an information source, clarifying in mathematical language the relation between probability and information redundancy. The calculation formula is as follows:
H(X) = −Σᵢ p(xᵢ) log p(xᵢ)
where H(X) represents the information entropy; xᵢ denotes the i-th possible value (for example, when a signal is transmitted it takes one of two values, 0 or 1, so x₁ denotes the case where the signal takes the value 0 and x₂ the case where it takes the value 1); p(xᵢ) represents the probability that the i-th value occurs; and log() denotes the logarithm.
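As a concrete illustration of this formula, the following Python sketch computes H(X) from a probability distribution; the function name, the choice of a base-2 logarithm, and the example values are illustrative assumptions, not part of the patent:

```python
import math

def information_entropy(probabilities):
    # H(X) = -sum(p(x_i) * log(p(x_i))); a base-2 logarithm is assumed here.
    # Terms with p = 0 contribute nothing, so they are skipped.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# The signal example above: 0 and 1 each occur with probability 0.5.
print(information_entropy([0.5, 0.5]))  # 1.0 (maximal uncertainty)
# A source that always emits the same value carries no information:
print(information_entropy([1.0]))       # 0.0
```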
Emotion label: used to describe the emotional orientation of a comment text. Illustratively, the emotion labels include positive emotion labels and negative emotion labels.
Word segmentation: refers to splitting a text into multiple words according to the semantics of the text. Illustratively, the text "this video is really good" is segmented to obtain the words "this", "video", "true", and "good".
Directed Acyclic Graph (DAG): refers to a directed graph without loops. In graph theory, a directed graph is a directed acyclic graph if one cannot start from a vertex and return to it by following a sequence of directed edges.
Transformer model (a neural network model proposed by Google): comprises a complete encoder-decoder framework built mainly from the attention mechanism. Illustratively, the encoder consists of 6 encoding modules, each divided into two parts: a self-attention layer and a feed-forward network. The self-attention layer is mainly responsible for computing the relations among the inputs, weighting them to obtain a result as output, and then sending the result to the classification module for classification.
Encoder model: in the embodiments of the present application, this refers to the encoder in a Transformer model. Illustratively, as shown in FIG. 1, the encoder model consists essentially of an attention layer 101 and a feedforward network 103.
The input of the attention layer 101 is the feature vector of an input sequence, and the output is an encoded sequence. It should be noted that the attention mechanism is a solution inspired by human attention: it can rapidly screen out high-value information from a large amount of information and is generally used in encoder and decoder models. The attention mechanism helps the model assign different weights to each part of the input, extracting more critical and important information so that the model can make more accurate judgments, without adding significant cost to the model's computation and storage.
As shown in fig. 1, at the output of the attention layer 101 there is also a difference calculation and normalization process 102, which takes the encoded sequence and the feature vector of the input sequence as input and outputs a modified encoded sequence. The difference calculation computes the difference between the feature vector of the input sequence and the encoded sequence and corrects the encoded sequence accordingly. The normalization keeps the gradient stable and maps the encoded sequence to a reasonable interval, generating a modified encoded sequence that the next model can process.
The input of the feedforward network 103 is the modified coding sequence, the output is the mapped coding sequence, and the feedforward network is used for mapping the input coded sequence into a preset interval, so that the universality of the model is improved and the subsequent data processing is facilitated.
As shown in fig. 1, at the output of the feedforward network 103 there is also a difference calculation and normalization process 104, which takes the modified encoded sequence and the mapped encoded sequence as input and outputs a corrected output sequence. The difference calculation computes the difference between the mapped sequence and the modified encoded sequence and corrects the mapped encoded sequence accordingly. The normalization keeps the gradient stable and maps the mapped encoded sequence to a reasonable interval, generating a corrected output sequence that the next model can process.
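To make this attention-plus-normalization flow concrete, here is a minimal NumPy sketch of one such sub-layer, assuming the standard scaled dot-product form of self-attention; the patent does not give explicit equations, so the weight matrices, dimensions, and scaling are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    # Normalization step: keeps values in a reasonable interval for the next model.
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def encoder_sublayer(x, wq, wk, wv):
    """Self-attention followed by the residual ("difference") and normalization step."""
    q, k, v = x @ wq, x @ wk, x @ wv
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # relation of each input to the others
    attended = weights @ v                             # weighted sum as the attention output
    return layer_norm(x + attended)                    # combine with the original input

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 16))                      # 5 input positions, dimension 16
wq, wk, wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(encoder_sublayer(tokens, wq, wk, wv).shape)      # (5, 16)
```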
BERT model (Bidirectional Encoder Representations from Transformers): the BERT model is essentially the encoder portion of a Transformer model and is mainly used for classification tasks, question-answering tasks, recognition tasks, and the like. As shown in fig. 2, the BERT model consists of multiple Transformer encoders.
The input layer 201 includes E_CLS, E1, E2, …, Em. E_CLS is a special vector at the head of the sequence, used when the BERT model performs classification tasks; for non-classification models, E_CLS may be omitted. E1, E2, …, Em denote the feature vectors corresponding to the tokens of the input sequence. For example, if the input sequence is the sentence "this video looks good", E1 denotes the feature vector corresponding to "this", E2 the feature vector corresponding to "video", E3 the feature vector corresponding to "looks", and E4 the feature vector corresponding to "good".
Network structure 202 and network structure 203 each consist of m+1 Trm units. Trm denotes an encoder model; the input of each encoder model is all the outputs of the previous layer's network structure. For example, the input of each Trm in network structure 203 is the output of all Trm units in network structure 202. For the specific structure of Trm, refer to the encoder model diagram shown in fig. 1.
The output layer 204 includes C, T1, T2, …, and Tm. The output C is the output value corresponding to E_CLS and represents the classification result. The output Tn represents the output value corresponding to En (1 ≤ n ≤ m); for example, output T1 is the output value corresponding to E1, and output T2 is the output value corresponding to E2.
Fig. 3 shows a schematic structural diagram of a computer system provided in an exemplary embodiment of the present application. The computer system 300 includes: a terminal 320 and a server 340.
The terminal 320 has an application program related to keyword extraction installed thereon. The application program may be an applet in an app (application), a dedicated application program, or a web client. For example, a user publishes comment texts on videos, articles, or commodities through the terminal 320, and these comment texts serve as the input for keyword extraction. The terminal 320 is at least one of a smartphone, a tablet, an e-book reader, an MP3 player, an MP4 player, a laptop portable computer, and a desktop computer.
The terminal 320 is connected to the server 340 through a wireless network or a wired network.
The server 340 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), and a big data and artificial intelligence platform. The server 340 is configured to provide a background service for the above application program and send the keyword extraction results to the terminal 320. Optionally, server 340 undertakes primary computational work and terminal 320 undertakes secondary computational work; alternatively, the server 340 undertakes the secondary computing work and the terminal 320 undertakes the primary computing work; alternatively, both the server 340 and the terminal 320 employ a distributed computing architecture for collaborative computing.
Fig. 4 is a flowchart illustrating a keyword extraction method according to an exemplary embodiment of the present application. The method may be performed by the terminal 320 or the server 340 shown in fig. 3, and the method includes the steps of:
step 401: the method comprises the steps of obtaining a first comment text from a plurality of comment texts, wherein the emotion tag of the first comment text is a first emotion tag.
The comment text is at least one of a video comment text, an article comment text, a music comment text, a game comment text, a news comment text, and a commodity comment text.
The emotion label of the first comment text is a first emotion label. Optionally, the first comment text refers to one comment text, or the first comment text refers to multiple comment texts, which is not limited in this embodiment.
The emotion labels are used for describing emotion guidance of the text by the user. Optionally, the emotion tags include a positive emotion tag and a negative emotion tag.
The first emotion label is any emotion label. Illustratively, when the comment text is a product comment text, the emotion labels may be classified into a positive emotion label and a negative emotion label, wherein the positive emotion label is used for indicating approval of the product, the negative emotion label is used for indicating non-approval of the product, and the positive emotion label is taken as the first emotion label. Illustratively, when the comment text is a video comment text, the emotion labels may be classified into a positive emotion label, a negative emotion label and a neutral emotion label, wherein the positive emotion label is used for indicating approval of the video, the negative emotion label is used for indicating non-approval of the video, the neutral emotion label is used for indicating that the video is neutral, and the positive emotion label is taken as the first emotion label.
Optionally, the emotion tag of the comment text is obtained through an emotion tag extraction model, or is obtained through manual labeling of a technician.
Step 402: and segmenting the first comment text to obtain a feature word set of the first comment text.
Word segmentation refers to splitting a text into a plurality of word sequences according to the logic of the text. Optionally, the first comment text is segmented to obtain at least one word segmentation result. For example, the text "today weather is really good" is segmented: one segmentation result is "today weather", "true", "good", and another segmentation result is "today", "weather", "true", "good".
Optionally, the first comment text is segmented by a dictionary-based word segmentation method to obtain the feature word set of the first comment text; or by a statistics-based word segmentation method; or by a machine learning word segmentation method; or by an understanding-based word segmentation method.
Optionally, a feature word set of the first comment text is obtained according to a complete word segmentation result of the first comment text. Optionally, a feature word set of the first comment text is obtained according to a partial word segmentation result of the first comment text.
The feature word set refers to a set composed of feature words obtained according to the word segmentation result of the first comment text.
Step 403: and calculating an information entropy set of the first comment text according to the feature word set, wherein the information entropy in the information entropy set is obtained by calculation according to the feature words in the feature word set.
The information entropy in the information entropy set is obtained by calculation according to the feature words in the feature word set.
Optionally, the information entropy set is obtained by calculation according to all the feature words in the feature word set.
Optionally, the information entropy set is obtained by calculation according to part of the feature words in the feature word set.
Step 404: and determining the key words of the first comment text according to the information entropy set.
Optionally, when the information entropy set only comprises one information entropy, determining the feature word corresponding to the information entropy as the keyword of the first comment text.
Optionally, when the information entropy set includes at least two information entropies, the feature words corresponding to the n smallest information entropies in the information entropy set are used as the keywords of the first comment text, where n is a positive integer.
Optionally, when the information entropy set includes at least two information entropies, the feature words corresponding to the information entropies smaller than a preset threshold in the information entropy set are used as the keywords of the first comment text. The preset threshold may be set by the technician at his or her discretion.
In summary, in this embodiment the corresponding information entropy is obtained through emotion tag extraction, and the keywords of the comment text are determined according to that information entropy. The lower the information entropy of a feature word, the more strongly it differentiates the comment texts under different emotion labels, so keywords with lower information entropy have stronger interpretability for the emotion classification results, improving the modeling effect and interpretability.
In the following embodiment, on one hand, the process of obtaining the feature words is refined: a more accurate word segmentation result can be obtained through the directed acyclic graph of the comment text. On the other hand, the keywords of the comment texts are determined through information entropy; the lower the information entropy of a feature word, the more strongly it differentiates the comment texts, so keywords with lower information entropy have stronger interpretability for the emotion classification results, improving the modeling effect and interpretability. In addition, the emotion label is extracted using the attention mechanism, which improves the reliability of the emotion label.
Fig. 5 is a flowchart illustrating a keyword extraction method according to an exemplary embodiment of the present application. The method may be performed by the terminal 320 or the server 340 shown in fig. 3, and the method includes the steps of:
step 501: and inputting the comment texts into an emotion tag extraction model, and acquiring an input vector of each comment text, wherein the input vector comprises at least one of a word embedding vector, a segment embedding vector and a position embedding vector.
The emotion label extraction model is used for extracting the emotion label corresponding to a comment text. Optionally, the emotion tag extraction model is at least one of a convolutional neural network-based model, a recurrent neural network-based model, a support vector machine-based model, and a Transformer-based neural network. In this embodiment, the emotion tag extraction model is exemplified as a BERT model.
The input vector includes at least one of a word embedding vector, a segment embedding vector, and a position embedding vector. The word embedding vector is used for representing the specific content of the word; the segment embedding vector is used for representing a segment where the word is located; the position embedding vector is used to represent the position of a word in a sentence.
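A minimal sketch of how these three embeddings can be combined into an input vector, following the BERT convention of element-wise summation; the vocabulary size, dimensions, and random lookup tables are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_len, num_segments, dim = 1000, 128, 2, 32

word_emb = rng.normal(size=(vocab_size, dim))       # what the token is
segment_emb = rng.normal(size=(num_segments, dim))  # which segment it belongs to
position_emb = rng.normal(size=(max_len, dim))      # where it sits in the sentence

def input_vector(token_ids, segment_ids):
    positions = np.arange(len(token_ids))
    # BERT-style input: the three embeddings are summed element-wise.
    return word_emb[token_ids] + segment_emb[segment_ids] + position_emb[positions]

x = input_vector(np.array([2, 17, 9]), np.array([0, 0, 0]))
print(x.shape)  # (3, 32)
```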
Step 502: and inputting the input vector into the encoder model, and outputting the emotion score through a self-attention mechanism.
The contents of the encoder model may be referred to in particular in the description of fig. 1.
The sentiment score is used to quantify the emotional orientation of the comment text. Optionally, the output sentiment score is one of k preset values. Optionally, the output emotion score belongs to a preset interval.
Step 503: and outputting the emotion label of each comment text based on the emotion score.
Optionally, when the output emotion score is one of k preset numerical values, the emotion tag of the comment text is determined according to the emotion score. Illustratively, there are two possible output emotion scores, 0 and 1: when the emotion score is 0, the emotion label of the corresponding comment text is a negative emotion label; when the emotion score is 1, the emotion label of the corresponding comment text is a positive emotion label.
Optionally, when the output emotion score belongs to the preset interval, comparing the emotion score with a preset value, and determining an emotion tag of the comment text. Illustratively, the output emotion score belongs to an interval [0, 1], the preset value is 0.5, and when the emotion score is greater than the preset value of 0.5, the corresponding emotion label of the comment text is a positive emotion label; and when the emotion score is less than the preset value of 0.5, the corresponding emotion label of the comment text is a negative emotion label.
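The two score-to-label schemes above are simple to express in code. A minimal sketch assuming the [0, 1] interval case with the preset value 0.5 from the example (the function name and printed labels are invented for illustration):

```python
def emotion_label_from_score(score, preset=0.5):
    """Map an emotion score in [0, 1] to an emotion label (threshold case above)."""
    return "positive" if score > preset else "negative"

print(emotion_label_from_score(0.83))  # positive
print(emotion_label_from_score(0.21))  # negative
```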
Step 504: and segmenting the first comment text to obtain a feature word set of the first comment text.
Word segmentation refers to splitting a text into a plurality of word sequences according to the logic of the text. For example, the text "today weather is really good" is segmented, and one segmentation result is "today weather", "true", "good".
Optionally, a feature word set of the first comment text is obtained according to a complete word segmentation result of the first comment text.
Optionally, a feature word set of the first comment text is obtained according to a partial word segmentation result of the first comment text.
The feature word set refers to a set composed of feature words obtained according to the word segmentation result of the first comment text.
Step 505: and randomly determining a first characteristic word from the characteristic word set.
The first characteristic word is any one characteristic word in the characteristic word set.
Step 506: and acquiring the emotion probability of the first characteristic word.
The emotion probability is used for expressing the probability that the emotion label of the comment text is the first emotion label when the first characteristic word appears in the comment text.
Alternatively, the emotion probabilities are obtained by querying a memory. Optionally, the emotion probabilities stored in the memory are obtained by statistical means, for example, after a large number of comment texts are obtained, the emotion probabilities of all feature words in the large number of comment texts are counted, and then the emotion probabilities are stored in the memory. Optionally, the emotion probabilities stored in memory are obtained based on a blockchain technique.
Optionally, the emotion probability is obtained through emotion tags of a plurality of comment texts, and specifically, the method includes the following steps:
1. and segmenting the plurality of comment texts to obtain a feature word set of each comment text.
Optionally, when the multiple comment texts are segmented, the segmentation method of each comment text is the same or different.
2. And determining m target comment texts containing the first characteristic words according to the characteristic word set of each comment text, wherein m is a positive integer.
The target comment text refers to a comment text containing the first characteristic word in the corresponding characteristic word set.
Illustratively, the feature word sets corresponding to comment text 1 and comment text 2 contain the first feature word, while the feature word set corresponding to comment text 3 does not, so comment text 1 and comment text 2 are target comment texts.
3. And acquiring m emotion labels corresponding to the m target comment texts.
Illustratively, the emotion tags of the target comment text 1 and the target comment text 2 are first emotion tags, and the emotion tag of the target comment text 3 is a second emotion tag.
4. And calculating the occupation ratio of the first emotion label in the m emotion labels to obtain the emotion probability of the first feature word.
Illustratively, suppose m is 3. When the 3 emotion labels include two first emotion labels and one second emotion label, the proportion of the first emotion label among the 3 emotion labels is 2/3, so the emotion probability is about 66.7%.
Step 507: and calculating the information entropy of the first characteristic word based on the emotion probability of the first characteristic word.
Optionally, the information entropy is denoted H(U); then:
H(U) = −Σᵢ pᵢ log pᵢ
where pᵢ denotes the emotion probability of the i-th emotion label.
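Putting steps 505 to 507 together, the following sketch computes the emotion probability of a feature word across comment texts and then its information entropy; the toy comments, labels, and the binary (positive/negative) label set are invented for illustration:

```python
import math

# Invented toy data: each comment text is (feature word set, emotion label).
comments = [
    ({"video", "great"}, "positive"),
    ({"video", "boring"}, "negative"),
    ({"great", "plot"}, "positive"),
]

def emotion_probability(word, label="positive"):
    """Share of the comment texts containing `word` whose emotion label is `label`."""
    labels = [lab for words, lab in comments if word in words]
    return labels.count(label) / len(labels)

def word_entropy(word):
    """H(U) = -sum(p_i * log2(p_i)) over the label distribution of comments with `word`."""
    p = emotion_probability(word)
    return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)

print(emotion_probability("video"))  # 0.5: "video" appears under both labels
print(word_entropy("video"))         # 1.0: maximal ambiguity, poor keyword
print(word_entropy("great"))         # 0.0: tied to one label, strong keyword candidate
```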
Step 508: and repeating the three steps until the information entropies corresponding to all the feature words in the feature word set are obtained, and generating the information entropy set of the first comment text.
Step 509: and taking the feature words corresponding to the minimum n information entropies in the information entropy set as the key words of the first comment text.
Optionally, the information entropies in the information entropy set are sorted from small to large, and the feature words corresponding to the first n information entropies are taken as the keywords of the comment text.
Step 510: and acquiring a preset threshold value.
The preset threshold is used for dividing the keywords.
Optionally, the preset threshold is set by the technician at his or her discretion. Illustratively, the preset threshold is set to 0.5.
Step 511: and taking the feature words corresponding to the information entropies smaller than the preset threshold in the information entropy set as the key words of the first comment text.
For example, if the preset threshold is 0.5 and the information entropy set is {0.1, 0.8, 0.3, 0.6}, the feature words corresponding to the information entropy 0.1 and the information entropy 0.3 are used as the keywords of the first comment text.
It should be noted that step 509 and steps 510 to 511 are either parallel (alternative) schemes or are used in combination.
Optionally, when step 509 and steps 510 to 511 are parallel schemes, only step 509 or only steps 510 to 511 are performed.
Optionally, when step 509 and steps 510 to 511 are combined, keyword set 1 is obtained through step 509, keyword set 2 is obtained through steps 510 to 511, and the intersection of keyword set 1 and keyword set 2 is taken as the keywords of the first comment text. Alternatively, keyword set 3 is obtained through step 509, keyword set 4 is obtained through steps 510 to 511, and the union of keyword set 3 and keyword set 4 is taken as the keywords of the first comment text.
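A minimal sketch of these selection rules, covering step 509, steps 510 to 511, and the union/intersection combinations; the example entropy values reuse the figures above and the function signature is an invented illustration:

```python
def keywords_by_entropy(entropy_set, n=2, threshold=0.5, combine="union"):
    """entropy_set: {feature word: information entropy}."""
    # Step 509: feature words with the n smallest information entropies.
    smallest = {w for w, _ in sorted(entropy_set.items(), key=lambda kv: kv[1])[:n]}
    # Steps 510-511: feature words whose information entropy is below the preset threshold.
    below = {w for w, h in entropy_set.items() if h < threshold}
    return smallest | below if combine == "union" else smallest & below

entropies = {"video": 0.1, "really": 0.8, "good": 0.3, "watch": 0.6}
print(keywords_by_entropy(entropies))                          # {'video', 'good'}
print(keywords_by_entropy(entropies, combine="intersection"))  # {'video', 'good'}
```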
Step 512: and taking the keywords as labels of comment objects corresponding to the comment texts.
The comment object refers to an object evaluated by the comment text. Optionally, the commentary object includes at least one of a video, an article, music, a game, news, and merchandise.
For example, if the obtained keyword is "suitable", the "suitable" is taken as a tag of the comment object.
Tags are used to represent characteristics of the comment object.
Step 513: the comment objects are classified based on their tags.
Optionally, the comment object is classified based on all its tags. Illustratively, when the comment object is a commodity and all the obtained tags are "cheap" and "excellent", the commodity is classified as a cost-effective commodity.
Optionally, the comment object is classified based on some of its tags. Illustratively, when the comment object is a commodity and all the obtained tags are "cheap" and "excellent", taking only the "cheap" tag for classification, the commodity is classified as a low-priced commodity.
Optionally, the comment objects are classified based on their tags and their categories.
Illustratively, when the comment object is a commodity and the obtained labels are "cheap" and "excellent", the commodity is classified as a cost-effective commodity.
Illustratively, when the comment object is an article, the obtained tags are "understandable" and "practical", and the article is classified as an excellent article.
In summary, in this embodiment the corresponding information entropy is obtained through emotion tag extraction, and the keywords of the comment text are determined according to that information entropy. The lower the information entropy of a feature word, the more strongly it differentiates the comment texts under different emotion labels, so keywords with lower information entropy have stronger interpretability for the emotion classification results, improving the modeling effect and interpretability.
And moreover, the emotion label is extracted through the attention mechanism, so that the reliability of the emotion label is improved, and the obtained keyword is more attached to the current emotion label.
On the other hand, using the directed acyclic graph for word segmentation brings the segmentation results closer to users' actual usage, making the feature words more representative and able to accurately reflect the structure of the comment text.
In the following embodiments, a word segmentation method is provided, which is convenient for extracting feature words from the comment text, and has a good word segmentation effect and an accurate conclusion.
Fig. 6 is a flowchart illustrating a word segmentation method according to an exemplary embodiment of the present application. The method may be performed by the terminal 320 or the server 340 or other computer device shown in fig. 3, the method comprising the steps of:
step 601: and inputting the first comment text into the word segmentation model, and extracting words in the first comment text.
Optionally, the first comment text is input into the word segmentation model, and words in the first comment text are extracted according to the dictionary. The foregoing words include at least one of words and phrases.
Illustratively, words are extracted from the comment text "today weather is really good", resulting in the words "today", "weather", "true", and "good", and the phrases "today weather" and "true good".
Step 602: and generating a directed acyclic graph of the first comment text according to the words.
Illustratively, as shown in FIG. 7, a directed acyclic graph is generated from the words for the comment text "today weather is really good". From the words "today weather", "true", and "good", one path in the directed acyclic graph is obtained as "today weather → true → good"; similarly, from the words "today", "weather", "true", and "good", another path is obtained as "today → weather → true → good"; and so on, until all paths in the directed acyclic graph are obtained.
Step 603: and calculating the weighted sum of each path in the directed acyclic graph based on the weighted values of the words in the directed acyclic graph.
The weight value represents the probability of occurrence of a word. Optionally, the weight value p = freq/total, where freq represents the frequency of occurrence of the word and total represents the sum of all word frequencies; freq and total may be obtained by querying a database or a related table. For example, if the freq of the word "today" is 0.1 and total is 100, the weight value is 0.1/100 = 0.001, i.e., 0.1%.
Step 604: and taking the path with the smallest weight sum in each path as the optimal path.
Optionally, calculating a weighted sum of each path in the directed acyclic graph; and sequencing the weight sum of each path from small to large, and determining the path with the minimum weight sum as the optimal path.
Optionally, calculating a weighted sum of each path in the directed acyclic graph; and determining the optimal path through a minimum function.
Step 605: and segmenting the first comment text according to the optimal path to obtain a feature word set of the first comment text.
For example, if the optimal path of the comment text "weather today is really good" is "today → weather → really good", the feature word set obtained is { today, weather, really good }.
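A minimal end-to-end sketch of steps 601 to 605, using space-separated English tokens in place of Chinese characters; the toy dictionary, its frequencies, and the exhaustive path enumeration are illustrative assumptions. Production segmenters typically maximize the sum of log word frequencies, which is equivalent to minimizing a sum of negative-log weights, but the literal "smallest weight sum" rule from step 604 is kept here:

```python
# Invented toy dictionary: word -> frequency; the weight value of a word is freq / total.
DICT = {"today": 5, "weather": 4, "today weather": 2, "really": 3, "good": 6}
TOTAL = sum(DICT.values())

def build_dag(tokens):
    """dag[i] lists every end index j such that tokens[i:j] is a dictionary word (step 602)."""
    n = len(tokens)
    return {i: [j for j in range(i + 1, n + 1) if " ".join(tokens[i:j]) in DICT]
            for i in range(n)}

def all_paths(dag, i, n):
    """Enumerate every path through the DAG from position i to the end of the text."""
    if i == n:
        yield []
        return
    for j in dag[i]:
        for rest in all_paths(dag, j, n):
            yield [(i, j)] + rest

def segment(text):
    tokens = text.split()
    dag = build_dag(tokens)
    weight = lambda span: DICT[" ".join(tokens[span[0]:span[1]])] / TOTAL  # step 603
    best = min(all_paths(dag, 0, len(tokens)),
               key=lambda path: sum(weight(s) for s in path))              # step 604
    return [" ".join(tokens[i:j]) for i, j in best]                        # step 605

print(segment("today weather really good"))  # ['today weather', 'really', 'good']
```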
In summary, this embodiment provides a word segmentation method that determines the segmentation of the comment text through a directed acyclic graph, so the obtained word segmentation result is more accurate; moreover, the method requires no manual intervention and is therefore more efficient.
In the following embodiment, the emotion tag extraction model is trained, so that the emotion tag of the comment text can be accurately extracted by the emotion tag extraction model.
FIG. 8 is a flowchart illustrating a method for training an emotion label extraction model according to an exemplary embodiment of the present application. The method may be performed by the terminal 320 or the server 340 or other computer device shown in fig. 3, the method comprising the steps of:
step 801: a training data set is constructed.
The training data set includes target training samples and corresponding real emotion labels.
Alternatively, the real emotion label is manually marked by a technician.
Step 802: and inputting the target training sample into the emotion label extraction model, and outputting the emotion label of the target training sample.
Optionally, when the output emotion score is one of k preset numerical values, the emotion tag of the comment text is determined according to the emotion score. Illustratively, there are two possible output emotion scores, 0 and 1: when the emotion score is 0, the emotion label of the corresponding comment text is a negative emotion label; when the emotion score is 1, the emotion label of the corresponding comment text is a positive emotion label.
Optionally, when the output emotion score belongs to the preset interval, comparing the emotion score with a preset value, and determining an emotion tag of the comment text. Illustratively, the output emotion score belongs to an interval [0, 1], the preset value is 0.5, and when the emotion score is greater than the preset value of 0.5, the corresponding emotion label of the comment text is a positive emotion label; and when the emotion score is less than the preset value of 0.5, the corresponding emotion label of the comment text is a negative emotion label.
Step 803: and training the emotion label extraction model based on the difference value between the emotion label of the target training sample and the real emotion label.
Optionally, the emotion label extraction model is trained through an error back propagation algorithm based on the difference value between the emotion label and the real emotion label.
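A minimal PyTorch-style sketch of steps 801 to 803; the tiny stand-in classifier, random training data, loss function, and hyperparameters are all invented, since the patent only specifies that the model is trained on the label difference by error back-propagation:

```python
import torch
import torch.nn as nn

# Stand-in for the emotion tag extraction model: a small classifier over pooled features.
model = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()  # difference between predicted and real emotion labels

# Step 801: invented training data -- feature vectors with 0/1 emotion labels.
x = torch.randn(64, 32)
y = torch.randint(0, 2, (64, 1)).float()

for epoch in range(10):
    optimizer.zero_grad()
    scores = model(x)          # step 802: output emotion scores for the samples
    loss = loss_fn(scores, y)  # step 803: measure the difference from the real labels
    loss.backward()            # error back-propagation, as described above
    optimizer.step()
```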
In summary, the embodiment provides a method for training an emotion tag extraction model, which can obtain the emotion tag extraction model, requires fewer samples for training, increases the uncertainty of the samples, and is beneficial to training the emotion tag extraction model.
Optionally, the comment text is at least one of a video comment text, an article comment text, a music comment text, a game comment text, a news comment text, and a product comment text. In the following embodiments, the extraction of keywords from video comment text will be briefly described.
Fig. 9 is a flowchart illustrating a keyword extraction method according to an exemplary embodiment of the present application. The method may be performed by the terminal 320 or the server 340 shown in fig. 3, and the method includes the steps of:
step 901: and inputting a plurality of video comment texts into the emotion label extraction model.
The emotion label extraction model is used for extracting the emotion label corresponding to a comment text. Optionally, the emotion tag extraction model is at least one of a convolutional neural network-based model, a recurrent neural network-based model, a support vector machine-based model, and a Transformer-based neural network. In this embodiment, the emotion tag extraction model is exemplified as a BERT model.
Step 902: and obtaining the emotion label of each video comment text.
The emotion labels are used for describing emotion guidance of the text by the user. Optionally, the emotion tags include a positive emotion tag and a negative emotion tag.
Step 903: the method comprises the steps of obtaining a first video comment text from a plurality of video comment texts, wherein the emotion tag of the first video comment text is a positive emotion tag.
The emotion tag of the first video comment text is a positive emotion tag. Optionally, the first video comment text refers to one video comment text, or the first video comment text refers to multiple video comment texts, which is not limited in this embodiment.
Step 904: and segmenting the first video comment text to obtain a feature word set of the first video comment text.
Optionally, the first video comment text is segmented by a dictionary-based word segmentation method to obtain the feature word set of the first video comment text; or by a statistics-based word segmentation method; or by a machine learning word segmentation method; or by an understanding-based word segmentation method.
Optionally, the feature word set of the first video comment text is obtained according to the complete word segmentation result of the first video comment text. For example, if the word segmentation result of the first video comment text is "this video", "true", and "good", then "this video", "true", and "good" are all included in the feature word set, resulting in the feature word set {this video, true, good}.
Optionally, the feature word set of the first video comment text is obtained according to the word segmentation result of the first video comment text together with the relevance of each feature word to the comment object. For example, suppose the word segmentation result of the first video comment text is "I think", "this video", "true", and "good", where the word "I think" has low relevance to the comment object; then "this video", "true", and "good" are included in the feature word set, giving the feature word set {this video, true, good}.
Step 905: calculate an information entropy set of the first video comment text according to the feature word set of the first video comment text.
Each information entropy in the information entropy set is calculated from a feature word in the feature word set of the first video comment text.
Optionally, the information entropy set is calculated from all feature words in the feature word set of the first video comment text. Illustratively, if the feature word set is {this video, really, good} and the information entropy of "this video" is calculated to be 0.4, that of "really" to be 0.3, and that of "good" to be 0.01, then the information entropy set is {0.4, 0.3, 0.01}.
Optionally, the information entropy set is calculated from those feature words in the feature word set that are sufficiently relevant to the comment object. Illustratively, suppose the feature word set is {I think, this video, really, good}, where "I think" has low relevance to the comment object; then "I think" does not participate in the entropy calculation, and only the information entropies of "this video", "really", and "good" are computed.
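One plausible reading of this calculation is sketched below, assuming the binary Shannon entropy over each feature word's emotion probability, i.e. the probability (defined in the apparatus embodiments below) that a comment text containing the word carries the first emotion tag; the probability values are hypothetical.

```python
# A minimal sketch, assuming the binary Shannon entropy
# H(p) = -p*log2(p) - (1-p)*log2(1-p) over each feature word's emotion
# probability; the probability values below are hypothetical.
import math

def binary_entropy(p: float) -> float:
    if p in (0.0, 1.0):
        # A word that always co-occurs with one emotion tag has zero entropy.
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

emotion_probability = {"this video": 0.60, "really": 0.65, "good": 0.999}
entropy_set = {w: round(binary_entropy(p), 3)
               for w, p in emotion_probability.items()}
print(entropy_set)  # {'this video': 0.971, 'really': 0.934, 'good': 0.011}
```

With this formulation, a feature word whose presence almost always coincides with one emotion tag (such as "good" above) has an entropy near zero, matching the low value in the example.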
Step 906: determine the keywords of the first video comment text according to its information entropy set.
Optionally, when the information entropy set of the first video comment text contains only one information entropy, the feature word corresponding to that entropy is determined as the keyword of the first video comment text.
Optionally, when the information entropy set contains at least two information entropies, the feature words corresponding to the n smallest information entropies in the set are taken as the keywords of the first video comment text, where n is a positive integer.
Optionally, when the information entropy set contains at least two information entropies, the feature words whose information entropies are smaller than a preset threshold are taken as the keywords of the first video comment text. The preset threshold may be set by the technician at their discretion.
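Both selection rules of step 906 can be sketched as follows; the value of n and the threshold are illustrative choices rather than values fixed by the embodiment.

```python
# A minimal sketch of both selection rules; n and the threshold are
# illustrative values, not fixed by the embodiment.
def smallest_n(entropy_set: dict[str, float], n: int) -> list[str]:
    # Feature words corresponding to the n smallest information entropies.
    return sorted(entropy_set, key=entropy_set.get)[:n]

def below_threshold(entropy_set: dict[str, float], threshold: float) -> list[str]:
    # Feature words whose information entropy falls under the preset threshold.
    return [w for w, h in entropy_set.items() if h < threshold]

entropy_set = {"this video": 0.4, "really": 0.3, "good": 0.01}
print(smallest_n(entropy_set, n=1))        # -> ['good']
print(below_threshold(entropy_set, 0.05))  # -> ['good']
```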
Optionally, the keywords of the first video comment text are used to search for the video. For example, if one of the keywords is "horror", the video appears in the search results when a user searches with the word "horror".
Optionally, the keywords of the first video comment text are used to classify the video. For example, if one of the keywords is "good", the video is classified as excellent.
Optionally, the keywords of the first video comment text are used to recommend the video. For example, if one of the keywords is "excellent", the video is recommended.
In summary, in this embodiment, emotion tags are extracted, the corresponding information entropies are computed, and the keywords of the comment text are determined from those entropies. The smaller the information entropy of a feature word, the more strongly its occurrence is tied to a single emotion class across the comment texts, so keywords with low information entropy explain the emotion classification result better, which improves both the modeling effect and the interpretability.
The following are apparatus embodiments of the present application. For details not described in the apparatus embodiments, reference may be made to the corresponding descriptions in the method embodiments above, which are not repeated here.
Fig. 10 shows a schematic structural diagram of a keyword extraction apparatus according to an exemplary embodiment of the present application. The apparatus may be implemented as all or part of a computer device by software, hardware, or a combination of both, and the apparatus 1000 includes:
an obtaining module 1001, configured to obtain a first comment text from multiple comment texts, where an emotion tag of the first comment text is a first emotion tag;
a word segmentation module 1002, configured to segment words of the first comment text to obtain a feature word set of the first comment text;
a calculating module 1003, configured to calculate an information entropy set of the first comment text according to the feature word set, where information entropy in the information entropy set is obtained by calculation according to feature words in the feature word set;
a determining module 1004, configured to determine, according to the information entropy set, a keyword of the first comment text.
In an optional design of the present application, the calculation module 1003 is further configured to randomly determine a first feature word from the feature word set; obtain the emotion probability of the first feature word, where the emotion probability represents the probability that the emotion tag of a comment text is the first emotion tag when the first feature word appears in that comment text; calculate the information entropy of the first feature word based on the emotion probability of the first feature word; and repeat the above three steps until the information entropies corresponding to all feature words in the feature word set are obtained, generating the information entropy set of the first comment text.
In an optional design of the present application, the calculation module 1003 is further configured to segment the multiple comment texts to obtain a feature word set of each comment text; determine m target comment texts containing the first feature word according to the feature word set of each comment text; obtain the m emotion tags corresponding to the m target comment texts; and calculate the proportion of the first emotion tag among the m emotion tags to obtain the emotion probability of the first feature word.
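A minimal sketch of this emotion probability computation follows; the three-text corpus, its feature word sets, and the tag names are hypothetical examples.

```python
# A minimal sketch of the emotion probability computation; the corpus,
# its feature word sets, and the tag names are hypothetical examples.
def emotion_probability(word: str,
                        feature_sets: list[set[str]],
                        tags: list[str],
                        first_tag: str = "positive") -> float:
    # The m target comment texts are those whose feature word set
    # contains the first feature word.
    target_tags = [t for fs, t in zip(feature_sets, tags) if word in fs]
    if not target_tags:
        return 0.0
    # Proportion of the first emotion tag among the m emotion tags.
    return target_tags.count(first_tag) / len(target_tags)

feature_sets = [{"this video", "good"}, {"good"}, {"good", "boring"}]
tags = ["positive", "positive", "negative"]
print(emotion_probability("good", feature_sets, tags))  # -> 0.666...
```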
In an optional design of the present application, the determining module 1004 is further configured to use feature words corresponding to the minimum n information entropies in the information entropy set as the keywords of the first comment text.
In an optional design of the present application, the determining module 1004 is further configured to obtain a preset threshold; and taking the feature words corresponding to the information entropies smaller than the preset threshold value in the information entropy set as the key words of the first comment text.
In an optional design of the present application, the word segmentation module 1002 is further configured to input the first comment text into a word segmentation model, and extract a word in the first comment text; generating a directed acyclic graph of the first comment text according to the words; calculating the weighted sum of each path in the directed acyclic graph based on the weighted values of the words in the directed acyclic graph; taking the path with the smallest weight sum in all the paths as an optimal path; and segmenting the first comment text according to the optimal path to obtain a feature word set of the first comment text.
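The segmentation this module describes can be sketched as a shortest-path dynamic program over the directed acyclic graph of candidate words; the dictionary weights below are hypothetical stand-ins for the weight values the word segmentation model would supply.

```python
# A minimal sketch of DAG-based segmentation: candidate dictionary words
# form edges, each edge carries a weight (lower = more likely a real
# word), and dynamic programming finds the path with the smallest
# weight sum. The weights here are hypothetical.
WEIGHTS = {"this": 1.0, "video": 1.0, "this video": 0.5,
           "really": 0.8, "good": 0.6}
UNKNOWN_COST = 5.0  # fallback cost for single tokens not in the dictionary

def best_segmentation(tokens: list[str]) -> list[str]:
    n = len(tokens)
    # best[i] = (cost of the cheapest path covering tokens[:i], backpointer)
    best = [(0.0, 0)] + [(float("inf"), 0)] * n
    for i in range(n):
        for j in range(i + 1, n + 1):
            word = " ".join(tokens[i:j])
            cost = WEIGHTS.get(word, UNKNOWN_COST if j == i + 1 else None)
            if cost is None:
                continue
            if best[i][0] + cost < best[j][0]:
                best[j] = (best[i][0] + cost, i)
    # Walk the backpointers to recover the optimal path.
    words, i = [], n
    while i > 0:
        j = best[i][1]
        words.append(" ".join(tokens[j:i]))
        i = j
    return words[::-1]

print(best_segmentation("this video really good".split()))
# -> ['this video', 'really', 'good']
```

Taking the smallest weight sum mirrors the optimal-path criterion stated above; with costs derived from negative log probabilities, this is equivalent to choosing the most probable segmentation.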
In an alternative design of the present application, the apparatus 1000 further includes: a tag extraction module 1005.
A tag extraction module 1005, configured to input the comment texts into the emotion tag extraction model, and obtain an input vector of each comment text, where the input vector includes at least one of a word embedding vector, a segment embedding vector, and a position embedding vector; inputting the input vector into an encoder model, and outputting an emotion score through a self-attention mechanism; and outputting the emotion label of each comment text based on the emotion score.
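A minimal sketch of this flow, assuming the transformers and torch libraries, is given below; "bert-base-uncased" and the label-index mapping are illustrative assumptions, and a deployed system would use a checkpoint fine-tuned for emotion classification.

```python
# A minimal sketch, assuming the transformers and torch libraries;
# "bert-base-uncased" is an illustrative checkpoint.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # positive / negative emotion tags

# The tokenizer produces input_ids and token_type_ids; together with the
# position indices, these correspond to the word, segment, and position
# embedding vectors named above, which BERT sums before its
# self-attention encoder layers.
inputs = tokenizer("This video is really good", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # the emotion scores
# The index-to-tag mapping (1 == positive) is an assumption for this sketch.
tag = "positive" if logits.argmax(dim=-1).item() == 1 else "negative"
print(tag)
```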
In an alternative design of the present application, the apparatus 1000 further includes: a training module 1006.
A training module 1006, configured to construct a training data set, where the training data set includes a target training sample and a corresponding real emotion label; inputting the target training sample into the emotion label extraction model, and outputting an emotion label of the target training sample; and training the emotion label extraction model based on the difference value between the emotion label of the target training sample and the real emotion label.
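One way this training step might be realized is sketched below, assuming PyTorch; the optimizer, learning rate, epoch count, and loader format are illustrative choices.

```python
# A minimal sketch of one training scheme, assuming PyTorch; the
# optimizer, learning rate, epoch count, and data loader format are
# illustrative choices rather than values fixed by this application.
import torch
from torch import nn

def train(model, loader, epochs: int = 3, lr: float = 2e-5):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        # Each batch pairs tokenized target training samples with their
        # real emotion labels, as in the training data set described above.
        for batch, real_labels in loader:
            optimizer.zero_grad()
            logits = model(**batch).logits       # predicted emotion scores
            loss = loss_fn(logits, real_labels)  # difference from real labels
            loss.backward()
            optimizer.step()
```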
In an optional design of the present application, the obtaining module 1001 is further configured to obtain a first video comment text from the plurality of video comment texts, where an emotion tag of the first video comment text is a positive emotion tag.
The word segmentation module 1002 is further configured to segment the first video comment text to obtain a feature word set of the first video comment text.
The calculating module 1003 is further configured to calculate an information entropy set of the first video comment text according to the feature word set of the first video comment text.
The determining module 1004 is further configured to determine the keywords of the first video comment text according to the information entropy set of the first video comment text.
In an optional design of the present application, the determining module 1004 is further configured to use the keyword as a tag of a comment object corresponding to the plurality of comment texts; classifying the review object based on the tag of the review object.
In summary, in this embodiment, emotion tags are extracted, the corresponding information entropies are computed, and the keywords of the comment text are determined from those entropies. The smaller the information entropy of a feature word, the more strongly its occurrence is tied to a single emotion class across the comment texts, so keywords with low information entropy explain the emotion classification result better, which improves both the modeling effect and the interpretability.
FIG. 11 is a block diagram illustrating a computer device in accordance with an exemplary embodiment. The computer device 1100 includes a Central Processing Unit (CPU) 1101, a system Memory 1104 including a Random Access Memory (RAM) 1102 and a Read-Only Memory (ROM) 1103, and a system bus 1105 connecting the system Memory 1104 and the CPU 1101. The computer device 1100 also includes a basic Input/Output system (I/O system) 1106, which facilitates transfer of information between various devices within the computer device, and a mass storage device 1107 for storing an operating system 1113, application programs 1114, and other program modules 1115.
The basic input/output system 1106 includes a display 1108 for displaying information and an input device 1109 such as a mouse, keyboard, etc. for user input of information. Wherein the display 1108 and input device 1109 are connected to the central processing unit 1101 through an input output controller 1110 connected to the system bus 1105. The basic input/output system 1106 may also include an input/output controller 1110 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, input-output controller 1110 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 1107 is connected to the central processing unit 1101 through a mass storage controller (not shown) that is connected to the system bus 1105. The mass storage device 1107 and its associated computer device-readable media provide non-volatile storage for the computer device 1100. That is, the mass storage device 1107 may include a computer device-readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
Without loss of generality, the computer device readable media may comprise computer device storage media and communication media. Computer device storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer device-readable instructions, data structures, program modules, or other data. Computer device storage media include RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), CD-ROM, Digital Video Disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer device storage media are not limited to the foregoing. The system memory 1104 and the mass storage device 1107 described above may be collectively referred to as memory.
According to various embodiments of the present disclosure, the computer device 1100 may also run by being connected, through a network such as the Internet, to a remote computer device on the network. That is, the computer device 1100 may connect to the network 1111 through a network interface unit 1112 connected to the system bus 1105, or may connect to other types of networks or remote computer device systems (not shown) using the network interface unit 1112.
The memory further includes one or more programs, the one or more programs are stored in the memory, and the central processor 1101 implements all or part of the steps of the keyword extraction method by executing the one or more programs.
In an exemplary embodiment, a computer readable storage medium is further provided, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by a processor to implement the keyword extraction method provided by the above-mentioned various method embodiments.
Optionally, the present application also provides a computer program product containing instructions, which when run on a computer device, causes the computer device to execute the keyword extraction method according to the above aspects.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A keyword extraction method, characterized in that the method comprises:
acquiring a first comment text from a plurality of comment texts, wherein an emotion tag of the first comment text is a first emotion tag;
segmenting the first comment text to obtain a feature word set of the first comment text;
calculating an information entropy set of the first comment text according to the feature word set, wherein the information entropy in the information entropy set is obtained by calculation according to the feature words in the feature word set;
and determining the key words of the first comment text according to the information entropy set.
2. The method of claim 1, wherein the computing the set of information entropies of the first comment text from the set of feature words comprises:
randomly determining a first feature word from the feature word set;
acquiring the emotion probability of the first feature word, wherein the emotion probability is used for representing the probability that the emotion label of the comment text is the first emotion label when the first feature word appears in the comment text;
calculating the information entropy of the first feature word based on the emotion probability of the first feature word;
and repeating the three steps until the information entropies corresponding to all the feature words in the feature word set are obtained, and generating the information entropy set of the first comment text.
3. The method of claim 2, wherein the obtaining the emotional probability of the first feature word comprises:
segmenting the comment texts to obtain a feature word set of each comment text;
determining m target comment texts containing the first feature word according to the feature word set of each comment text;
acquiring m emotion labels corresponding to the m target comment texts;
and calculating the proportion of the first emotion label in the m emotion labels to obtain the emotion probability of the first feature word.
4. The method according to any one of claims 1 to 3, wherein the set of information entropies comprises at least two information entropies;
determining the keywords of the first comment text according to the information entropy set, wherein the determining comprises:
and taking the feature words corresponding to the minimum n information entropies in the information entropy set as the key words of the first comment text.
5. The method according to any one of claims 1 to 3, wherein the determining the keyword of the first comment text according to the information entropy set comprises:
acquiring a preset threshold value;
and taking the feature words corresponding to the information entropies smaller than the preset threshold value in the information entropy set as the key words of the first comment text.
6. The method according to any one of claims 1 to 3, wherein the segmenting the first comment text to obtain a feature word set of the first comment text includes:
inputting the first comment text into a word segmentation model, and extracting words in the first comment text;
generating a directed acyclic graph of the first comment text according to the words;
calculating the weighted sum of each path in the directed acyclic graph based on the weighted values of the words in the directed acyclic graph;
taking the path with the smallest weight sum in all the paths as an optimal path;
and segmenting the first comment text according to the optimal path to obtain a feature word set of the first comment text.
7. The method according to any one of claims 1 to 3, wherein before obtaining the first comment text from the plurality of comment texts, the method further comprises:
inputting the comment texts into the emotion tag extraction model, and acquiring an input vector of each comment text, wherein the input vector comprises at least one of a word embedding vector, a segment embedding vector and a position embedding vector;
inputting the input vector into an encoder model, and outputting an emotion score through a self-attention mechanism;
and outputting the emotion label of each comment text based on the emotion score.
8. The method of claim 7, further comprising:
constructing a training data set, wherein the training data set comprises a target training sample and a corresponding real emotion label;
inputting the target training sample into the emotion label extraction model, and outputting an emotion label of the target training sample;
and training the emotion label extraction model based on the difference value between the emotion label of the target training sample and the real emotion label.
9. The method of any of claims 1 to 3, wherein the comment text comprises video comment text; the emotion labels comprise positive emotion labels;
the method further comprises the following steps:
acquiring a first video comment text from the plurality of video comment texts, wherein the emotion tag of the first video comment text is a positive emotion tag;
segmenting the first video comment text to obtain a feature word set of the first video comment text;
calculating an information entropy set of the first video comment text according to the feature word set of the first video comment text;
and determining keywords of the first comment text according to the information entropy set of the first video comment text.
10. The method according to any one of claims 1 to 3, wherein after determining the keyword of the first comment text according to the information entropy set, the method further comprises:
taking the keywords as labels of comment objects corresponding to the comment texts;
classifying the review object based on the tag of the review object.
11. A keyword extraction apparatus, characterized in that the apparatus comprises:
the obtaining module is used for obtaining a first comment text from a plurality of comment texts, and the emotion tag of the first comment text is a first emotion tag;
the word segmentation module is used for segmenting the first comment text to obtain a feature word set of the first comment text;
the calculation module is used for calculating an information entropy set of the first comment text according to the feature word set, wherein the information entropy in the information entropy set is obtained by calculation according to the feature words in the feature word set;
and the determining module is used for determining the key words of the first comment text according to the information entropy set.
12. The apparatus of claim 11,
the computing module is further used for randomly determining a first feature word from the feature word set; acquiring the emotion probability of the first feature word, wherein the emotion probability is used for representing the probability that the emotion label of the comment text is the first emotion label when the first feature word appears in the comment text; calculating the information entropy of the first feature word based on the emotion probability of the first feature word; and repeating the three steps until the information entropies corresponding to all the feature words in the feature word set are obtained, and generating the information entropy set of the first comment text.
13. The apparatus of claim 12,
the computing module is further used for segmenting the comment texts to obtain a feature word set of each comment text; determining m target comment texts containing the first feature word according to the feature word set of each comment text; acquiring m emotion labels corresponding to the m target comment texts; and calculating the proportion of the first emotion label in the m emotion labels to obtain the emotion probability of the first feature word.
14. A computer device, characterized in that the computer device comprises: a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the keyword extraction method of any of claims 1 to 10.
15. A computer-readable storage medium, having at least one program code stored therein, the program code being loaded and executed by a processor to implement the keyword extraction method according to any one of claims 1 to 10.
CN202110393894.5A 2021-04-13 2021-04-13 Keyword extraction method, device, equipment and medium Pending CN113704393A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110393894.5A CN113704393A (en) 2021-04-13 2021-04-13 Keyword extraction method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113704393A true CN113704393A (en) 2021-11-26

Family

ID=78647978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110393894.5A Pending CN113704393A (en) 2021-04-13 2021-04-13 Keyword extraction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113704393A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358002A (en) * 2021-12-07 2022-04-15 有米科技股份有限公司 Keyword extraction method and device based on multiple dimensions
WO2023134088A1 (en) * 2022-01-11 2023-07-20 平安科技(深圳)有限公司 Video summary generation method and apparatus, electronic device, and storage medium
CN114494980A (en) * 2022-04-06 2022-05-13 中国科学技术大学 Diversified video comment generation method, system, equipment and storage medium
CN114494980B (en) * 2022-04-06 2022-07-15 中国科学技术大学 Diversified video comment generation method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination