CN117436438A - Emotion analysis method, training method and device for large language model - Google Patents


Info

Publication number
CN117436438A
Authority
CN
China
Prior art keywords
text
target
analyzed
sample
prompt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311415479.0A
Other languages
Chinese (zh)
Inventor
李耀松
龚建
卓泽城
张策
李树军
张晓聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202311415479.0A
Publication of CN117436438A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses an emotion analysis method and a training method and device for a large language model, and relates to the technical field of computers, in particular to artificial intelligence fields such as deep learning, natural language processing, and large models. The specific implementation scheme is as follows: acquiring a first target text; extracting an object to be analyzed from the first target text; generating a second target text according to the first target text and the object to be analyzed, where the second target text includes a task prompt text used to prompt the large language model to perform an emotion analysis task on the object to be analyzed based on the first target text; and inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed. In this way, the object requiring emotion analysis is determined by extracting it from the first target text, and the task prompt text guides the large language model to perform emotion analysis on that object, realizing object-level emotion analysis and improving the accuracy of emotion analysis results.

Description

Emotion analysis method, training method and device for large language model
Technical Field
The application relates to the technical field of computers, in particular to artificial intelligence fields such as deep learning, natural language processing, and large models, and specifically provides an emotion analysis method, a training method for a large language model, and a training device for a large language model.
Background
Emotion analysis holds great promise in natural language processing applications. For example, user satisfaction with a product or service can be assessed from comments that users post on an Internet platform. As another example, the public's perception of and emotional tendency toward a company's brand can be learned from comments posted publicly on an Internet platform.
Disclosure of Invention
The application provides an emotion analysis method, a training method of a large language model and a training device of the large language model. The specific scheme is as follows:
according to an aspect of the present application, there is provided an emotion analysis method including:
acquiring a first target text;
extracting an object to be analyzed from the first target text;
generating a second target text according to the first target text and the object to be analyzed; the second target text comprises a task prompt text, and the task prompt text is used for prompting the large language model to execute an emotion analysis task on the object to be analyzed based on the first target text;
and inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
According to another aspect of the present application, there is provided a training method of a large language model, including:
acquiring a first training sample; the first training sample comprises a first sample text and a task prompt text, wherein the task prompt text is used for prompting the initial large language model to execute emotion analysis tasks on objects to be analyzed in the first sample text;
Inputting a first training sample into an initial large language model to obtain a predicted emotion polarity;
and training the initial large language model according to the difference between the predicted emotion polarity and the actual emotion polarity corresponding to the object to be analyzed to obtain the large language model.
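The training step above relies on a loss measuring the difference between the predicted emotion polarity and the actual emotion polarity of the object to be analyzed. A minimal sketch of such a loss follows; the three-way polarity scheme and the use of cross-entropy are assumptions for illustration, since the patent does not fix either:

```python
import math

# Hypothetical three-way polarity label set; the patent does not fix it.
LABELS = ["positive", "negative", "neutral"]

def polarity_loss(pred_probs, true_label):
    """Cross-entropy between the model's predicted polarity distribution
    and the ground-truth polarity of the object to be analyzed."""
    return -math.log(pred_probs[LABELS.index(true_label)])

# A confident, correct prediction incurs a smaller loss than a wrong one;
# this difference is the signal used to update the initial large language model.
loss_good = polarity_loss([0.9, 0.05, 0.05], "positive")
loss_bad = polarity_loss([0.1, 0.8, 0.1], "positive")
```

In an actual training run this loss would be computed over batches of first training samples and minimized by gradient descent.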
According to another aspect of the present application, there is provided an emotion analysis device including:
the first acquisition module is used for acquiring a first target text;
the extraction module is used for extracting an object to be analyzed from the first target text;
the generation module is used for generating a second target text according to the first target text and the object to be analyzed; the second target text comprises a task prompt text, and the task prompt text is used for prompting the large language model to execute an emotion analysis task on the object to be analyzed based on the first target text;
and the second acquisition module is used for inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
According to another aspect of the present application, there is provided a training apparatus of a large language model, including:
the first acquisition module is used for acquiring a first training sample; the first training sample comprises a first sample text and a task prompt text, wherein the task prompt text is used for prompting the initial large language model to execute emotion analysis tasks on objects to be analyzed in the first sample text;
The second acquisition module is used for inputting the first training sample into the initial large language model to obtain the predicted emotion polarity;
the first training module is used for training the initial large language model according to the difference between the predicted emotion polarity and the actual emotion polarity corresponding to the object to be analyzed to obtain the large language model.
According to another aspect of the present application, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the emotion analysis method or the training method described in the embodiments of the above aspects.
According to another aspect of the present application, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the emotion analysis method or the training method described in the embodiments of the above aspects.
According to a further aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the emotion analysis method or the training method described in the embodiments of the above aspects.
It should be understood that the description of this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a schematic flow chart of an emotion analysis method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of an emotion analysis method according to another embodiment of the present application;
FIG. 3 is a schematic flow chart of an emotion analysis method according to another embodiment of the present application;
FIG. 4 is a flowchart of a training method of a large language model according to an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating a training method of an object extraction model according to an embodiment of the present application;
FIG. 6 is a schematic diagram of BIO labeling provided in an embodiment of the present application;
FIG. 7 is a schematic diagram of a model training process according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a model reasoning process according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an emotion analysis device according to an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a training device for large language models according to an embodiment of the present disclosure;
FIG. 11 is a block diagram of an electronic device for implementing the emotion analysis method of an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that, in the technical scheme of the present disclosure, the acquisition, storage, use, and processing of data all comply with the relevant national laws and regulations and do not violate public order and good customs.
The emotion analysis method, the training method of the large language model, the device, the electronic equipment and the storage medium of the embodiment of the application are described below with reference to the accompanying drawings.
Artificial intelligence is the discipline of using computers to simulate certain human thought processes and intelligent behaviors (e.g., learning, reasoning, thinking, and planning), and it spans both hardware and software technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies include computer vision, speech recognition, natural language processing, deep learning, big data processing, knowledge graph technology, and the like.
Deep learning is a new research direction in the field of machine learning. It learns the inherent regularities and representation hierarchies of sample data, and the information obtained during such learning helps in interpreting data such as text, images, and sound. Its ultimate goal is to give machines human-like analytical learning abilities, enabling them to recognize text, image, and sound data.
Natural language processing is an important direction in the fields of computer science and artificial intelligence, and the content of NLP research includes, but is not limited to, the following branch fields: text classification, information extraction, automatic abstracting, intelligent question and answer, topic recommendation, machine translation, topic word recognition, knowledge base construction, deep text representation, named entity recognition, text generation, text analysis (lexical, syntactic, grammatical, etc.), speech recognition and synthesis, and the like.
Large language models (also referred to as large models) are large-scale neural networks trained to process, understand, and generate natural language text.
Some emotion analysis methods focus mainly on the overall emotional tendency of a text while neglecting accurate recognition of the emotions expressed toward different objects, resulting in inaccurate emotion analysis results.
Based on the above, the embodiment of the application provides an emotion analysis method, which guides a large language model to carry out emotion analysis on an object to be analyzed based on a task prompt text, realizes object-level emotion analysis, can process large-scale data, and has wide application range.
Fig. 1 is a schematic flow chart of an emotion analysis method according to an embodiment of the present application.
The emotion analysis method can be executed by the emotion analysis device, the device can be configured in electronic equipment, the large language model is guided to conduct emotion analysis on the object to be analyzed through the task prompt text, emotion polarities of different objects can be analyzed, and object-level emotion analysis is achieved.
The electronic device may be any device with computing capability, for example, may be a personal computer, a mobile terminal, a server, etc., and the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, etc., which have various operating systems, touch screens, and/or display screens.
As shown in fig. 1, the emotion analysis method includes:
step 101, a first target text is acquired.
In this application, the first target text may be collected from an internet platform, or may be extracted from a questionnaire, or may be obtained by other means.
For example, the first target text may be comment text for a certain product, or a certain event, or a certain brand.
And 102, extracting an object to be analyzed from the first target text.
In the present application, an object to be analyzed may be extracted from a first target text by using a pre-trained object extraction model, or an object to be analyzed may be extracted from the first target text by using a part of speech and a dictionary of each word in the first target text, or an object to be analyzed may be extracted from the first target text by using other methods, which is not limited in this application.
For example, a certain first target text is "the shampoo of brand A is still good", and the object to be analyzed extracted from this text is "brand A".
In the present application, the object to be analyzed in the first target text may be one or more, which is not limited. For example, a certain first target text is "brand a shampoo is better than brand B", and the objects to be analyzed extracted from the text are "brand a" and "brand B".
And step 103, generating a second target text according to the first target text and the object to be analyzed.
In the application, the second target text may include a first target text, a task prompt text, and the like, where the task prompt text may be used to prompt the large language model to perform an emotion analysis task on the object to be analyzed based on the first target text, that is, the task prompt text may be used to prompt the large language model to analyze emotion polarity of the first target text with respect to the object to be analyzed. Therefore, the task prompt text clearly indicates the object to be analyzed which needs emotion analysis, and prompts the large language model to predict the emotion polarity of the object to be analyzed.
In this application, emotion polarity may refer to the relative strength of positive emotion and negative emotion in an emotional experience. If the positive emotion intensity is greater than the negative emotion intensity, the emotion polarity is positive; if the positive emotion intensity is smaller than the negative emotion intensity, the emotion polarity is negative; if the positive emotion intensity equals the negative emotion intensity, the emotion polarity is neutral.
In this application, emotion polarity may include positive (i.e., favorable, supported), negative (i.e., negative), neutral (no apparent emotion polarity), and the like.
In the present application, the analysis of emotion polarity may be a process of analyzing, processing, generalizing and reasoning the text with emotion colors.
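The polarity rule described above can be stated directly in code. This is a sketch of the decision rule only; the intensity values themselves would come from the model:

```python
def emotion_polarity(positive_strength, negative_strength):
    """Map relative emotion intensities to a polarity label, following the
    rule above: stronger positive -> positive, stronger negative -> negative,
    equal -> neutral."""
    if positive_strength > negative_strength:
        return "positive"
    if positive_strength < negative_strength:
        return "negative"
    return "neutral"
```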
In the application, the task prompt text can be constructed based on the object to be analyzed, and the second target text can be generated according to the first target text and the task prompt text.
For example, if the first target text is "the shampoo of brand A is still good", the second target text may be "the shampoo of brand A is still good. The emotion polarity of this text toward brand A is", or "the shampoo of brand A is still good. This text's attitude toward brand A is", and so on.
In some embodiments, the terms emotion polarity, emotion tendency, emotion category, and the like may be used interchangeably.
And 104, inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
In the application, the second target text is input into a large language model to perform feature extraction and decoding, so that the emotion polarity of the object to be analyzed is obtained.
For example, a certain second target text may be "the shampoo of brand A is still good. The emotion polarity of this text toward brand A is". Inputting this second target text into the large language model yields "positive", that is, the emotion polarity of the text toward brand A is positive.
Illustratively, the large language model may be a BERT (Bidirectional Encoder Representations from Transformers) model; performing emotion analysis with the BERT model allows the emotional context to be taken into account, improving the accuracy of object-level emotion analysis results.
In the embodiment of the application, the to-be-analyzed object is extracted from the first target text, the second target text containing the task prompt text is generated according to the first target text and the to-be-analyzed object, and the second target text is processed by using the large language model, so that the emotion polarity of the to-be-analyzed object is obtained. Therefore, the object to be subjected to emotion analysis can be determined by extracting the object from the first target text, the large language model is guided to carry out emotion analysis on the object to be analyzed based on the task prompt text, object-level emotion analysis is realized, and accuracy of emotion analysis results is improved.
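The four steps above can be sketched as a small pipeline. The object extraction step and the large language model are stubbed out here with toy callables, since the patent leaves their concrete implementations open, and the prompt wording is a hypothetical example:

```python
def build_second_target_text(first_text, obj):
    # Append a task prompt text that names the object to be analyzed.
    return f"{first_text} The emotion polarity of this text toward {obj} is"

def analyze(first_text, extract_objects, llm):
    """Run object-level emotion analysis on one first target text.
    extract_objects and llm are stand-ins for the object extraction step
    and the large language model, respectively."""
    results = {}
    for obj in extract_objects(first_text):
        second_text = build_second_target_text(first_text, obj)
        results[obj] = llm(second_text)
    return results

# Toy stubs for illustration only; real systems would plug in a trained
# extraction model and a trained large language model here.
extract = lambda text: ["brand A"]
llm = lambda prompt: "positive"
result = analyze("The shampoo of brand A is still good.", extract, llm)
```

With the stubs above, `result` maps each extracted object to its predicted polarity.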
Fig. 2 is a schematic flow chart of an emotion analysis method according to another embodiment of the present application.
As shown in fig. 2, the emotion analysis method includes:
in step 201, a first target text is obtained.
In this application, any implementation manner of the embodiments of the present application may be adopted in step 201, which is not limited and will not be described in detail.
Step 202, inputting the first target text into the object extraction model to obtain a tag sequence of the first target text output by the object extraction model.
In the application, the object extraction model may be a sequence labeling model, and the first target text may be input into the object extraction model to perform feature extraction and decoding, so as to obtain a tag sequence of the first target text output by the object extraction model.
The tag sequence may be a sequence in the BIO (B-begin, I-inside, O-outside) labeling scheme; it may include a tag for each character in the first target text, where the tag indicates the category of the character.
For ease of distinction, in the present application, different types of objects to be analyzed may be distinguished using different labels, such as ORG for organization, bra for branding, peo for people, etc.
For example, a first target text is "brand A shampoo is still good", and the corresponding label sequence is "B-bra I-bra I-bra O O O O O O O O".
The object extraction model may be, for example, a BERT-CRF (Conditional Random Field) model, which can perform feature extraction and sequence labeling on the first target text. BERT performs feature extraction on the first target text to obtain feature vectors and inputs them to the CRF; the CRF decodes the feature vectors to obtain, for each character in the first target text, the probability of belonging to each tag, thereby determining the tag sequence of the first target text.
The text entity extraction method based on the BERT-CRF model can utilize the context information of the BERT and the sequence modeling capability of the CRF, can better capture the context characteristics and semantic association of the entity, and improves the accuracy and robustness of object extraction.
The object extraction model may be, for example, a BiLSTM (Bidirectional Long Short-Term Memory network) -CRF model.
Step 203, determining characters in the first target text whose tags are target tags as target characters, according to the tag sequence.
In this application, characters in the first target text whose tags are target tags are determined as target characters. For example, with BIO labeling, a target tag may be any tag beginning with "B" or "I".
Step 204, determining the object to be analyzed according to the target character.
In the application, the target characters corresponding to the adjacent labels for representing the entities can be combined to obtain the object to be analyzed.
For example, if a certain first target text is "the shampoo of brand A is still good" and the corresponding tag sequence is "B-bra I-bra I-bra O O O O O O O", then "B-bra I-bra I-bra" are the target tags, and the characters corresponding to these three tags are combined in order to obtain the object to be analyzed, "brand A".
For another example, a certain first target text is "the shampoo of brand A is better than the shampoo of brand B", the corresponding label sequence is "B-bra I-bra I-bra O O O O", the characters corresponding to the first three labels in the label sequence are combined to obtain an object to be analyzed of brand A ", and the characters corresponding to the ninth label, tenth label and eleventh label in the label sequence are combined to obtain brand B.
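Merging adjacent entity tags into objects, as in the examples above, is a standard BIO decoding step. A minimal sketch (word-level tokens for readability; a character-level Chinese text would use `sep=""`):

```python
def decode_bio(tokens, tags, sep=" "):
    """Merge tokens tagged B-xxx/I-xxx into entity strings.
    An "O" tag (or a stray I- with no open span) closes the current span."""
    entities, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:                      # close the previous entity
                entities.append(sep.join(current))
            current = [tok]                  # start a new entity span
        elif tag.startswith("I-") and current:
            current.append(tok)              # continue the open span
        else:
            if current:
                entities.append(sep.join(current))
            current = []
    if current:                              # entity at end of sequence
        entities.append(sep.join(current))
    return entities

tokens = ["brand", "A", "shampoo", "is", "still", "good"]
tags   = ["B-bra", "I-bra", "O", "O", "O", "O"]
objects = decode_bio(tokens, tags)  # -> ["brand A"]
```

The same routine recovers multiple objects from one text, e.g. both "brand A" and "brand B" from a comparison sentence.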
Optionally, part-of-speech tagging may be performed on the first target text using a part-of-speech tagging model to obtain the part of speech of each word in the first target text. Words whose part of speech is a noun are determined as candidate words; each candidate word is then matched against the words in a preset dictionary, and if the matching degree between a candidate word and any word in the preset dictionary exceeds a preset matching-degree threshold, that candidate word is determined to be an object to be analyzed. The preset dictionary may include a number of objects on which emotion analysis can be performed.
For example, the preset dictionary may include objects capable of emotion analysis in a plurality of fields, or the preset dictionary may include objects capable of emotion analysis in a field to which the first target text belongs, which is not limited.
Therefore, the object to be analyzed can be extracted based on the part of speech and the dictionary of each word in the first target text, the extraction mode of the object to be analyzed is enriched, and the diversified requirements can be met.
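The part-of-speech-plus-dictionary route can be sketched as follows. The POS tagger is stubbed as a lookup table, and exact phrase matching stands in for the matching-degree threshold; both are simplifications of what the patent describes:

```python
# Stubbed POS tags; a real system would use a part-of-speech tagging model.
POS = {"brand": "noun", "A": "noun", "shampoo": "noun",
       "is": "verb", "good": "adj"}

# Preset dictionary of objects eligible for emotion analysis (hypothetical).
PRESET_DICTIONARY = {"brand A", "brand B"}

def extract_by_dictionary(words):
    """Keep noun candidates, then match candidate phrases against the
    preset dictionary (exact match replaces the matching-degree threshold)."""
    candidates = [w for w in words if POS.get(w) == "noun"]
    found = []
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates) + 1):
            phrase = " ".join(candidates[i:j])
            if phrase in PRESET_DICTIONARY:
                found.append(phrase)
    return found

hits = extract_by_dictionary(["brand", "A", "shampoo", "is", "good"])
```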
Step 205, generating a second target text according to the first target text and the object to be analyzed.
And 206, inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
In this application, any implementation manner of the embodiments of the present application may be adopted in step 205 to step 206, which is not limited and will not be described in detail.
In the embodiment of the application, the object to be analyzed can be extracted from the first target text by using the object extraction model, so that the extraction efficiency and the extraction accuracy of the object to be analyzed are improved, and the accuracy of the emotion analysis result is further improved.
Fig. 3 is a schematic flow chart of an emotion analysis method according to another embodiment of the present application.
As shown in fig. 3, the emotion analysis method includes:
In step 301, a first target text is obtained.
Step 302, extracting an object to be analyzed from the first target text.
In this application, any implementation manner of the embodiments of the present application may be adopted in steps 301 to 302, which is not limited and will not be described in detail.
And 303, adding the object to be analyzed to a preset position in a preset task prompt template to obtain a task prompt text.
In the application, the task prompt template may include position information of an object to be analyzed, emotion analysis task information, and the like, and may be determined according to actual needs.
For example, the task prompt template may be "the emotion polarity of this text toward [object to be analyzed] is", or "the emotion polarity of [object to be analyzed] in this text is", and so on.
In the application, the object to be analyzed can be added to a preset position in the task prompt template to obtain the task prompt text. The preset position may be a position of an object that needs emotion analysis.
It can be understood that if there are a plurality of objects to be analyzed, each object to be analyzed may be added to a preset position in the task prompt template to obtain a task prompt text of each object to be analyzed.
For example, a certain first target text is "the shampoo of brand A is better than the shampoo of brand B", and the objects to be analyzed extracted from the text are "brand A" and "brand B". If the task prompt template is "the emotion polarity of this text toward [object to be analyzed] is", then adding "brand A" and "brand B" respectively to the object position yields two task prompt texts: "the emotion polarity of this text toward brand A is" and "the emotion polarity of this text toward brand B is". Alternatively, both objects may be added to one template to obtain a single task prompt text, "the emotion polarity of this text toward brand A and brand B is".
And step 304, splicing the first target text and the task prompt text to obtain a second target text.
In the application, the first target text and the task prompt text can be spliced according to a preset splicing rule to obtain the second target text. For example, the task prompt text may be appended to the first target text to obtain the second target text.
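Steps 303 and 304 amount to filling a template and concatenating. A sketch with a hypothetical template string, where "[object to be analyzed]" marks the preset position:

```python
# Hypothetical task prompt template; the bracketed span is the preset position.
TEMPLATE = "The emotion polarity of this text toward [object to be analyzed] is"

def build_task_prompt(obj):
    # Fill the object to be analyzed into the template's preset position.
    return TEMPLATE.replace("[object to be analyzed]", obj)

def splice(first_target_text, task_prompt_text):
    # Assumed splicing rule: the task prompt follows the first target text.
    return f"{first_target_text} {task_prompt_text}"

second_target_text = splice("The shampoo of brand A is still good.",
                            build_task_prompt("brand A"))
```

With multiple objects to analyze, `build_task_prompt` would simply be called once per object, producing one second target text each.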
And 305, inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
In this application, any implementation manner of the embodiments of the present application may be adopted in step 305, which is not limited and will not be described in detail.
In the embodiment of the application, the object to be analyzed can be added to the preset position in the preset task prompt template to obtain the task prompt text, and the first target text and the task prompt text are spliced to obtain the second target text. Therefore, the second target text can be obtained based on the object to be analyzed and the task prompt template, the generation efficiency of the second target text is improved, and the task prompt text in the second target text can guide the model to carry out emotion analysis on the object to be analyzed.
In one embodiment of the application, a target domain prompt text matched with the first target text can be obtained, and a second target text is generated according to the first target text, the object to be analyzed and the target domain prompt text, so that a domain definition prompt is added in the second target text.
The second target text may include a first target text, a target field prompt text, a task prompt text, and the like. The target domain prompt text may include a target domain to which the first target text belongs, so that the target domain prompt text may be used to prompt that the domain to which the first target text of the large language model belongs is the target domain.
In the present application, the target field prompt text may be preset, for example, for an emotion analysis task of a specific field, the target field prompt text of the field may be preset.
Optionally, different domain prompt texts can be preset for different domains, and the domain of the first target text can be identified to determine the target domain to which the first target text belongs, and the domain prompt text of the target domain is obtained from the domain prompt texts of multiple domains, that is, the target domain prompt text is obtained.
Therefore, emotion analysis can be carried out on the first target text in different fields based on the field prompt text in different fields.
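As a minimal sketch of the field-prompt selection described above — all field names, prompt wordings, and the keyword-based field recognizer below are illustrative assumptions, not part of the application:

```python
# Hypothetical field prompt texts, one per preset field.
FIELD_PROMPTS = {
    "commodity": "This is a piece of commodity review text: ",
    "finance":   "This is a piece of financial news text: ",
    "catering":  "This is a piece of restaurant review text: ",
}

# Hypothetical keyword lists used to identify the field of the input text;
# a real system might use a trained text classifier instead.
FIELD_KEYWORDS = {
    "commodity": ["shampoo", "clothing", "brand"],
    "finance":   ["stock", "bond", "interest rate"],
    "catering":  ["dish", "restaurant", "taste"],
}

def get_target_field_prompt(first_target_text: str) -> str:
    """Identify the target field of the text, then return that field's prompt text."""
    for field, keywords in FIELD_KEYWORDS.items():
        if any(kw in first_target_text for kw in keywords):
            return FIELD_PROMPTS[field]
    return ""  # no field matched: omit the field-limiting prompt

print(get_target_field_prompt("The shampoo of brand A is also very good."))
```

The lookup simply maps an identified field to its preset prompt text, matching the step of "acquiring the field prompt text of the target field from the field prompt texts of multiple fields".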
Optionally, the object to be analyzed may be added to a preset position in a preset task prompt template to obtain a task prompt text, and the first target text, the target field prompt text and the task prompt text are spliced according to a preset splicing sequence to obtain a second target text. Therefore, the second target text can be spliced according to different splicing sequences, and the form of the second target text is enriched.
Illustratively, the target field prompt text, the first target text and the task prompt text are sequentially spliced in order, or the first target text, the target field prompt text and the task prompt text are sequentially spliced in order.
For example, the first target text is "The shampoo of brand A is also very good.", the task prompt text is "This piece of text is [mask] for brand A", and the target field prompt text is "This is a piece of commodity review text:". Splicing the target field prompt text, the first target text and the task prompt text yields the second target text "This is a piece of commodity review text: The shampoo of brand A is also very good. This piece of text is [mask] for brand A".
In the embodiment of the application, the second target text is generated according to the first target text, the object to be analyzed and the target field prompt text. Therefore, a field limiting prompt can be added in the second target text, and then the task prompt text and the target field prompt text can guide the large language model to analyze the emotion polarity of the object to be analyzed in the target field, so that the understanding of the model to the text in the specific field can be increased by introducing related field information into the second target text through the prompt, and the accuracy of the emotion analysis result of the model is improved.
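The splicing described above can be sketched as follows — the task prompt template wording and the order names are illustrative assumptions, not the application's exact text:

```python
# Hypothetical task prompt template; the object to be analyzed is inserted
# at a preset position, then the three parts are spliced in a preset order.
TASK_PROMPT_TEMPLATE = "This piece of text is [mask] for {obj}."

def build_second_target_text(first_target_text: str, obj: str,
                             field_prompt: str, order: str = "field_first") -> str:
    task_prompt = TASK_PROMPT_TEMPLATE.format(obj=obj)
    if order == "field_first":
        # field prompt, then first target text, then task prompt
        parts = [field_prompt, first_target_text, task_prompt]
    else:
        # first target text, then field prompt, then task prompt
        parts = [first_target_text, field_prompt, task_prompt]
    return " ".join(parts)

second = build_second_target_text(
    "The shampoo of brand A is also very good.",
    "brand A",
    "This is a piece of commodity review text:",
)
print(second)
```

Changing `order` yields the alternative splicing sequence, which is how the form of the second target text is enriched.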
In order to further improve the accuracy of the emotion analysis result of the model, in an embodiment of the present application, generating the second target text according to the first target text, the object to be analyzed, and the target field prompt text may include: matching the first target text with terms in a term library corresponding to the target field to determine target terms associated with the first target text, and generating the second target text according to the first target text, the object to be analyzed, the target field prompt text and the semantics of the target terms.
The term library corresponding to the target field may include a plurality of terms of the target field and semantics of each term.
In the method, the object to be analyzed can be added to the preset position in the preset task prompt template to obtain the task prompt text, and the first target text, the target field prompt text, the task prompt text and the semantics of the target term can be spliced according to a preset splicing format to obtain the second target text.
Illustratively, the format of the second target text may be: "This is a piece of [target field] text: [first target text]. The semantics of the associated [target term] is [semantics of the target term]. This text is [mask] for [object to be analyzed]."
Therefore, the prompt of the domain-specific term semantics is added on the basis that the second target text contains the domain-defined prompt and the task prompt, so that the model can better understand the domain-specific semantics, and the accuracy of the model emotion analysis result is further improved.
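The term-library matching and the spliced format above can be sketched as follows — the term library contents and sentence wording are illustrative assumptions:

```python
# Hypothetical term library for a target field: term -> its semantics.
TERM_LIBRARY = {
    "silicone-free": "contains no silicone oil, gentler on the scalp",
    "bull market": "a period of rising asset prices",
}

def match_terms(text: str) -> dict:
    """Return the library terms that occur in the text, with their semantics."""
    return {t: s for t, s in TERM_LIBRARY.items() if t in text}

def build_with_terms(text: str, obj: str, field: str) -> str:
    """Splice field prompt, text, term semantics, and task prompt in order."""
    parts = [f"This is a piece of {field} text: {text}"]
    for term, meaning in match_terms(text).items():
        parts.append(f"The semantics of the associated term '{term}' is: {meaning}.")
    parts.append(f"This text is [mask] for {obj}.")
    return " ".join(parts)

print(build_with_terms("This silicone-free shampoo of brand A is great.",
                       "brand A", "commodity review"))
```

Substring matching is the simplest way to find associated terms; a real system could use tokenized or fuzzy matching instead.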
The emotion analysis method of the embodiment of the application can be widely applied to a plurality of fields, such as:
(1) Social media analysis: object-level emotion analysis may be performed on posts, comments, replies, etc. on social media. For example, in social media marketing, businesses may analyze the emotional tendency of users to their brands, products, or services to better understand user feedback, adjust marketing strategies, and improve products.
(2) Customer service and user feedback analysis: in the fields of customer service and user feedback, the method can help enterprises analyze the emotional tendencies and attitudes of customers or users. Through emotion analysis aimed at specific objects, enterprises can quickly understand users' satisfaction with and opinions of specific products, services or companies, helping them identify problems, improve product quality, enhance the customer service experience, and respond to user needs and feedback in time.
(3) Emotion-driven recommendation systems: recommendation results that are more personalized and better match users' emotional preferences can be provided. By analyzing a user's emotional expressions toward different objects, the recommendation system can recommend suitable products, movies, music or other content according to the user's emotional tendencies, improving user experience and satisfaction.
In order to achieve the above embodiments, the embodiments of the present application further provide a training method for a large language model. Fig. 4 is a flow chart of a training method of a large language model according to an embodiment of the present application.
As shown in fig. 4, the training method of the large language model includes:
in step 401, a first training sample is obtained.
In the application, the first training sample may include a first text sample and a task prompt text, where the task prompt text may be used to prompt the initial large language model to analyze emotion polarities of objects to be analyzed in the first sample text.
In the application, the object to be analyzed and the emotion polarity of the object to be analyzed in the first sample text can be marked to obtain a candidate sample, and the mask operation is performed on the true emotion polarity of the object to be analyzed marked in the candidate sample to obtain a first training sample.
For example, the candidate sample is "The clothing of brand A is comfortable to wear. This piece of text is positive for brand A". Masking the emotion polarity "positive" in the candidate sample yields the first training sample "The clothing of brand A is comfortable to wear. This piece of text is [mask] for brand A". The first sample text in this first training sample is "The clothing of brand A is comfortable to wear." and the task prompt text is "This piece of text is [mask] for brand A", where the actual value of [mask] is the manually annotated result [positive].
Step 402, inputting the first training sample into the initial large language model to obtain the predicted emotion polarity.
In the method, the first training sample can be input into an initial large language model, and the initial large language model performs feature extraction and decoding on the first training sample to obtain the predicted emotion polarity.
And step 403, training the initial large language model according to the difference between the predicted emotion polarity and the actual emotion polarity corresponding to the object to be analyzed, so as to obtain the large language model.
In this application, the model loss can be determined according to the difference between the predicted emotion polarity and the real emotion polarity corresponding to the object to be analyzed, the parameters of the initial large language model are adjusted based on the model loss, and training continues on the initial large language model with the adjusted parameters until the training end condition is met, so as to obtain the large language model.
For example, the large language model may be constructed with BERT as the base model, and the emotion polarity of the object to be analyzed in the first sample may be masked so that it becomes a target the model needs to learn, e.g., "The clothing of brand A is comfortable to wear. This piece of text is [mask] for brand A", where the actual value of [mask] is the manually annotated result [positive]. This matches the MLM (Masked Language Model) objective in the BERT pre-training task, so training can continue by modeling it as an MLM task.
Model parameters can be adjusted during training by minimizing the difference between predicted emotion polarity and true emotion polarity. Therefore, the large language model can learn the context information and emotion related characteristics of the text, and accuracy of the emotion analysis result of the model is improved.
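The objective of minimizing the difference between predicted and true polarity can be illustrated with the loss at the [mask] position — a pure-Python sketch, where the polarity vocabulary and probabilities are hypothetical; a real implementation would use a BERT MLM head over the full token vocabulary:

```python
import math

# Hypothetical reduced vocabulary of polarity tokens at the [mask] position.
POLARITIES = ["positive", "negative", "neutral"]

def mask_position_loss(predicted_probs, true_polarity):
    """Negative log-likelihood of the true polarity at the [mask] position."""
    return -math.log(predicted_probs[POLARITIES.index(true_polarity)])

# A confident correct prediction yields a small loss; a wrong one, a large loss,
# so gradient descent on this loss pushes probability mass onto the true polarity.
good = mask_position_loss([0.9, 0.05, 0.05], "positive")
bad = mask_position_loss([0.1, 0.8, 0.1], "positive")
print(good < bad)
```

This is the per-sample cross-entropy that training minimizes when the masked polarity is treated as the prediction target.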
In the embodiment of the application, the initial large language model can be trained by using the first training sample containing the task prompt text, so as to obtain the large language model. Therefore, the task prompt text is used for telling the model of the task to be executed, and the model is guided to analyze the emotion polarities of the objects to be analyzed, so that the emotion polarities of different objects can be analyzed by using the large language model, object-level emotion analysis is realized, and the accuracy of the emotion analysis result of the model is improved.
In one embodiment of the present application, obtaining the first training sample may include: acquiring a first sample text, extracting an object to be analyzed from the first sample text, and generating a first training sample according to the first sample text and the object to be analyzed. Therefore, the object to be analyzed is determined by extracting the object from the first sample text, which improves the labeling efficiency of the object to be analyzed; and the first training sample is generated based on the extracted object to be analyzed, which improves the construction efficiency of the first training sample.
In the present application, an object to be analyzed may be extracted from a first sample text by using a pre-trained object extraction model, or an object to be analyzed may be extracted from the first sample text by using a preset dictionary and part of speech of each word in the first sample text, and the detailed process may refer to the above method for extracting an object to be analyzed from a first target text, which is not described herein.
Therefore, the object to be analyzed is extracted from the first sample text by using the object extraction model, and the accuracy of object extraction is improved.
In the method, a task prompt text can be generated according to an object to be analyzed and a preset task prompt template, and the first text sample and the task prompt text are spliced according to a preset splicing rule to obtain a first training sample.
For example, the first sample text is "The clothing of brand A is comfortable to wear.", the object to be analyzed is "brand A", and the task prompt template is "This piece of text is [mask] for [object to be analyzed]". Adding the object to be analyzed to the preset position in the task prompt template yields the task prompt text "This piece of text is [mask] for brand A", and splicing the first sample text with the task prompt text yields the first training sample "The clothing of brand A is comfortable to wear. This piece of text is [mask] for brand A".
Optionally, a target field prompt text may be obtained, and a first training sample may be generated according to the first sample text, the object to be analyzed, and the target field prompt text. Therefore, domain limiting prompts can be added in the first training sample to guide the model to learn in the specific domain, so that understanding of the model to the text in the specific domain can be increased, and accuracy of emotion analysis results of the model is improved.
The method for obtaining the target field prompt text is similar to the method for obtaining the target field prompt text in the above embodiment, and the method for generating the first training sample according to the first sample text, the object to be analyzed and the target field prompt text is similar to the method for generating the second target text according to the first target text, the object to be analyzed and the target field prompt text in the above embodiment, so that details are not repeated here.
Optionally, the first sample text may be matched with terms in a term base corresponding to the target domain to determine target terms associated with the first sample text, and the first training sample may be generated according to the first sample text, the object to be analyzed, the target domain prompt text, and semantics of the target terms.
The method for generating the first training sample according to the first sample text, the object to be analyzed, the target field prompt text and the semantics of the target term is similar to the method for generating the second target text according to the first target text, the object to be analyzed, the target field prompt text and the semantics of the target term in the above embodiment, and therefore will not be repeated herein.
Therefore, the prompt of the domain-specific term semantics is added on the basis that the first training sample comprises the domain-defined prompt and the task prompt, so that the model can better understand the domain-specific semantics, and the accuracy of the model emotion analysis result is further improved.
Fig. 5 is a flowchart of a training method of an object extraction model according to an embodiment of the present application.
As shown in fig. 5, the training method of the object extraction model includes:
step 501, a second training sample is obtained.
In the present application, the second training sample may include the second sample text and the real tag sequence of the second sample text.
The second training sample can be obtained by labeling an object needing emotion analysis in the second sample text.
In the application, the object to be subjected to emotion analysis in the second sample text can be manually marked to obtain the original sample text, and then the object to be analyzed is marked by using a BIO marking mode to obtain the real tag sequence of the second sample text.
For example, the second sample text is "a brand of clothing is comfortable to wear", and the real tag sequence is "B-bra I-bra I-bra O O O O O O O O", as shown in FIG. 6.
Since the model has requirements on the format of the input data, such as its length, the original sample text can be converted into a format suitable for model input; for example, the original sample text can be preprocessed by segmenting it to character level, adding a start character such as "[CLS]" and an end character such as "[SEP]", and performing padding or truncation operations.
For example, if the length of the original sample text is longer, a truncation operation may be performed, for example, the original sample text may be segmented into a plurality of second sample texts according to the model length requirement; if the length of the original sample text is shorter, filling operation is performed, for example, characters can be filled in the original sample text, and a second sample text meeting the length requirement is obtained.
Step 502, inputting the second sample text into the initial object extraction model to obtain a predicted tag sequence.
In the application, the second sample text can be input into an initial object extraction model, the initial object extraction model is utilized to perform feature extraction and context coding on the second sample text to obtain a feature vector, and the feature vector is decoded to obtain a prediction tag sequence.
By way of example, the object extraction model may employ a BERT-CRF model. In implementation, a pre-trained BERT model can be used, and the preprocessed text sequence is input into the BERT model for feature extraction to obtain feature vectors. The BERT model performs word embedding on each character to obtain context-dependent word vector representations, and these word vectors undergo feature extraction and context encoding through a multi-layer Transformer network.
Based on the output of the BERT model, hidden states of different layers can be selected as feature representations, and the feature vectors can be subjected to dimension reduction through simple linear conversion so as to reduce model parameters and calculation amount.
After feature extraction, the feature vectors are input into the CRF layer for sequence labeling. The CRF layer can classify each character by labels and judge whether the characters belong to entity categories or not. Because the CRF layer considers the dependency relationship between sequence marks, the accuracy of object extraction can be improved by performing global optimization on the whole labeling sequence.
Step 503, training the initial object extraction model according to the difference between the predicted tag sequence and the real tag sequence to obtain the object extraction model.
In this application, the probability that the predicted tag sequence is the real tag sequence can be determined according to the probability that each character in the second sample text belongs to each tag; the model loss is determined according to that probability; the parameters of the initial object extraction model are adjusted based on the model loss; and training continues on the initial object extraction model with the adjusted parameters until the training end condition is met, so as to obtain the object extraction model. Therefore, the model loss is determined based on the probability that the predicted tag sequence is the real tag sequence, which can improve the training efficiency of the model.
As a possible implementation manner, the probability that each character belongs to a real label may be determined according to the probability that each character belongs to each label in the second sample text, and the product of the probabilities that each character belongs to the real label is determined as the probability that the predicted label sequence is the real label sequence, so that the model loss is determined based on the probability that the predicted label sequence is the real label sequence.
As another possible implementation manner, the score of the predicted tag sequence as each possible tag sequence may be determined according to the probability that each character in the second sample text belongs to each tag, and the model loss may be determined according to the sum of the score of the real tag sequence and the score of all possible tag sequences. For each possible tag sequence, the score corresponding to the tag sequence may be determined according to the probability that each character in the second sample text belongs to the tag of the character in the tag sequence and the probability that the tag of the preceding character of each character in the tag sequence transitions to the tag of the character.
For example, the model loss can be calculated using the following formula (1):
L=-log(P(Y|X)) (1)
where L represents the model loss, X represents the input sequence (i.e., the second sample text), Y represents the real tag sequence, and P(Y|X) represents the conditional probability of the real tag sequence Y given the input sequence X.
P(Y|X) can be calculated using the following formula (2):
P(Y|X) = exp(score(X, Y)) / Σ_{Y'∈𝒴} exp(score(X, Y'))  (2)
where score(X, Y) represents the score of the model for the given input sequence X and tag sequence Y, and 𝒴 represents the set of all possible tag sequences. score(X, Y) can be decomposed into two parts, emission probabilities and transition probabilities, and can therefore be calculated using the following formula (3):
score(X, Y) = Σ_{i=1}^{n} emission(X, i, Y_i) + Σ_{i=2}^{n} transition(Y_{i-1}, Y_i)  (3)
where n represents the length of the input sequence X, Y_i represents the label of the i-th character, Y_{i-1} represents the label of the (i-1)-th character, emission(X, i, Y_i) represents the probability that the i-th character belongs to label Y_i, and transition(Y_{i-1}, Y_i) represents the probability of transitioning from label Y_{i-1} to label Y_i.
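The CRF loss described by formulas (1)-(3) can be sketched as follows — a brute-force enumeration over all tag sequences for illustration (a real CRF layer computes the partition function with the forward algorithm); the emission and transition scores are made-up numbers:

```python
import itertools
import math

def crf_score(emissions, transitions, tags):
    """score(X, Y): sum of emission scores plus tag-to-tag transition scores."""
    s = sum(emissions[i][t] for i, t in enumerate(tags))
    s += sum(transitions[a][b] for a, b in zip(tags, tags[1:]))
    return s

def crf_loss(emissions, transitions, true_tags):
    """-log P(Y|X): log partition over all tag sequences minus the true score."""
    n, k = len(emissions), len(emissions[0])
    log_z = math.log(sum(
        math.exp(crf_score(emissions, transitions, y))
        for y in itertools.product(range(k), repeat=n)))
    return log_z - crf_score(emissions, transitions, true_tags)

emissions = [[2.0, 0.1], [0.3, 1.5]]      # per-position scores for 2 tags
transitions = [[0.5, -0.5], [-0.5, 0.5]]  # tag-to-tag transition scores
loss = crf_loss(emissions, transitions, (0, 1))
print(loss)
```

Because the partition sum always includes the true sequence's own exponentiated score, the loss is strictly positive, and minimizing it concentrates probability on the real tag sequence.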
In the embodiment of the application, the initial object extraction model is trained with the second training sample to obtain the object extraction model, so that the object extraction model can be used to extract the objects requiring emotion analysis from text, improving the accuracy of object extraction.
In order to facilitate understanding of the foregoing embodiments, a description will be given below with reference to fig. 7 and 8, and fig. 7 is a schematic diagram of a model training process according to an embodiment of the present application. Fig. 8 is a schematic diagram of a model reasoning process according to an embodiment of the present application.
As shown in fig. 7, the model training process is as follows:
step 701, collecting domain related data.
Step 702, manually annotating data.
In the application, the data are marked manually, and the object to be analyzed and the emotion polarity of the object to be analyzed are marked.
At step 703, an object extraction model is constructed.
In the application, an object extraction model can be constructed by using the manually marked object to be analyzed.
The training method of the object extraction model can be referred to the above embodiments, so that the description thereof is omitted herein.
Step 704, a hint paradigm is obtained.
The prompt paradigm may include, among other things, field prompt text, task prompt templates, and the like.
Step 705, constructing emotion analysis model.
In the application, the emotion analysis model corresponds to the large language model in the above embodiment, and the training sample can be constructed based on the prompt paradigm and the extracted object to be analyzed.
The training method of the emotion analysis model can be referred to the training method of the large language model in the above embodiment, so that the description thereof is omitted here.
As shown in fig. 8, the model reasoning process is as follows:
step 801, a target text is entered.
Step 802, extracting an object to be analyzed from the target text.
In the application, the object extraction model can be utilized to extract the object to be analyzed from the target text.
Step 803, splicing the object to be analyzed into the prompt paradigm.
In the method, the target text, the object to be analyzed and the prompt paradigm are spliced, and the spliced text can be obtained.
Step 804, determining the emotion polarity of the object to be analyzed.
In the application, the spliced text can be input into an emotion analysis model to obtain emotion polarity of an object to be analyzed.
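The reasoning flow of steps 801-804 can be sketched end to end — the two trained models are stubbed out as plain functions here; the stub logic and prompt wording are illustrative assumptions only:

```python
def extract_object(text: str) -> str:
    """Stand-in for the object extraction model (step 802)."""
    return "brand A" if "brand A" in text else ""

def sentiment_model(prompted_text: str) -> str:
    """Stand-in for the emotion analysis model (step 804)."""
    return "positive" if "good" in prompted_text else "negative"

def analyze(target_text: str):
    obj = extract_object(target_text)                        # step 802
    prompted = (f"{target_text} "
                f"This piece of text is [mask] for {obj}.")  # step 803
    return obj, sentiment_model(prompted)                    # step 804

print(analyze("The shampoo of brand A is also very good."))
```

In a deployed system, `extract_object` would be the trained BERT-CRF extractor and `sentiment_model` the trained large language model; the glue between them is exactly this splice-then-predict pipeline.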
In the present application, the object extraction model and the emotion analysis model may be trained separately, for example, the object extraction model is trained first, and then the emotion analysis model is trained, or the emotion analysis model is trained first, and then the object extraction model is trained, or the object extraction model and the emotion analysis model are trained simultaneously.
Or the object extraction model and the emotion analysis model can be trained jointly, and parameters of the two models can be adjusted according to model loss of the object extraction model and model loss of the emotion analysis model, so that model training efficiency can be improved, and the emotion analysis model can learn semantic information better.
In order to achieve the above embodiment, the embodiment of the present application further provides an emotion analysis device. Fig. 9 is a schematic structural diagram of an emotion analysis device according to an embodiment of the present application.
As shown in fig. 9, the emotion analysis device 900 includes:
a first obtaining module 910, configured to obtain a first target text;
the extracting module 920 is configured to extract an object to be analyzed from the first target text;
a generating module 930, configured to generate a second target text according to the first target text and the object to be analyzed; the second target text comprises a task prompt text, and the task prompt text is used for prompting the large language model to execute an emotion analysis task on the object to be analyzed based on the first target text;
and a second obtaining module 940, configured to input the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
Optionally, the extracting module 920 is configured to:
inputting the first target text into an object extraction model to obtain a tag sequence of the first target text output by the object extraction model;
determining target characters in the first target text whose labels are the target label, according to the tag sequence;
and determining an object to be analyzed according to the target character.
Optionally, the extracting module 920 is configured to:
performing part-of-speech tagging on the first target text to obtain the part of speech of each word segment in the first target text;
according to the part of speech of each word segment, determining candidate word segments from each word segment;
Matching the candidate word with the word in the preset dictionary to determine the object to be analyzed from the candidate word.
Optionally, the generating module 930 is configured to:
adding an object to be analyzed to a preset position in a preset task prompt template to obtain a task prompt text;
and splicing the first target text and the task prompt text to obtain a second target text.
Optionally, the generating module 930 is configured to:
acquiring a prompt text of the target field; the target domain prompt text is used for prompting that the domain to which the first target text of the large language model belongs is the target domain;
and generating a second target text according to the first target text, the object to be analyzed and the target field prompt text.
Optionally, the generating module 930 is configured to:
adding an object to be analyzed to a preset position in a preset task prompt template to obtain a task prompt text;
and splicing the first target text, the target field prompt text and the task prompt text according to a preset splicing sequence to obtain a second target text.
Optionally, the generating module 930 is configured to:
matching the first target text with terms in a term base corresponding to the target field to determine target terms associated with the first target text;
And generating a second target text according to the first target text, the object to be analyzed, the target field prompt text and the semantics of the target term.
Optionally, the generating module 930 is configured to:
identifying the field of the first target text to determine the target field to which the first target text belongs;
and acquiring target domain prompt texts of the target domain from the domain prompt texts of the multiple domains.
Note that, the explanation of the foregoing emotion analysis method embodiment is also applicable to the emotion analysis device of this embodiment, and therefore will not be described in detail here.
In the embodiment of the application, the to-be-analyzed object is extracted from the first target text, the second target text containing the task prompt text is generated according to the first target text and the to-be-analyzed object, and the second target text is processed by using the large language model, so that the emotion polarity of the to-be-analyzed object is obtained. Therefore, the object to be subjected to emotion analysis can be determined by extracting the object from the first target text, the emotion analysis is performed on the object to be analyzed based on the task prompt text guide model, object-level emotion analysis is realized, and accuracy of emotion analysis results is improved.
In order to achieve the above embodiments, the embodiments of the present application further provide a training apparatus for a large language model. Fig. 10 is a schematic structural diagram of a training apparatus for a large language model according to an embodiment of the present application.
As shown in fig. 10, the training apparatus 1000 of the large language model includes:
a first obtaining module 1010, configured to obtain a first training sample; the first training sample comprises a first sample text and a task prompt text, wherein the task prompt text is used for prompting the initial large language model to execute emotion analysis tasks on objects to be analyzed in the first sample text;
a second obtaining module 1020, configured to input the first training sample into the initial large language model to obtain a predicted emotion polarity;
the first training module 1030 is configured to train the initial large language model according to a difference between the predicted emotion polarity and an actual emotion polarity corresponding to the object to be analyzed, so as to obtain the large language model.
Optionally, the first obtaining module 1010 is configured to:
acquiring a first sample text;
extracting an object to be analyzed from the first sample text;
and generating a first training sample according to the first sample text and the object to be analyzed.
Optionally, the first obtaining module 1010 is configured to:
Acquiring a prompt text of the target field; the target domain prompt text is used for prompting that the domain to which the first text sample of the initial large language model belongs is a target domain;
and generating a first training sample according to the first sample text, the object to be analyzed and the target field prompt text.
Optionally, the first obtaining module 1010 is configured to:
inputting the first sample text into an object extraction model to obtain a tag sequence of the first sample text output by the object extraction model;
determining target characters in the first sample text whose labels are the target label, according to the tag sequence;
and determining an object to be analyzed according to the target character.
Optionally, the apparatus may further include:
the third acquisition module is used for acquiring a second training sample; the second training sample comprises a second sample text and a real label sequence of the second sample text;
the fourth acquisition module is used for inputting the second sample text into the initial object extraction model to obtain a predicted tag sequence;
and the second training module is used for training the initial object extraction model according to the difference between the predicted tag sequence and the real tag sequence to obtain the object extraction model.
Optionally, the second training module is configured to:
Determining the probability that the predicted tag sequence is a real tag sequence according to the probability that each character in the second sample text belongs to each tag;
determining model loss according to the probability that the predicted tag sequence is a real tag sequence;
and training the initial object extraction model according to the model loss to obtain an object extraction model.
It should be noted that, the explanation of the foregoing embodiment of the training method of the large language model is also applicable to the training device of the large language model of this embodiment, so that the explanation is omitted here.
In the embodiment of the application, the initial large language model can be trained by using the first training sample containing the task prompt text, so as to obtain the large language model. Therefore, the task prompt text is used for telling the model of the task to be executed, and the large language model is guided to analyze the emotion polarities of the objects to be analyzed, so that the emotion polarities of different objects can be analyzed by using the large language model, object-level emotion analysis is realized, and the accuracy of the emotion analysis result of the model is improved.
According to embodiments of the present application, there is also provided an electronic device, a readable storage medium and a computer program product.
Fig. 11 illustrates a schematic block diagram of an example electronic device 1100 that can be used to implement embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 11, the apparatus 1100 includes a computing unit 1101 that can perform various appropriate actions and processes according to a computer program stored in a ROM (Read-Only Memory) 1102 or a computer program loaded from a storage unit 1108 into a RAM (Random Access Memory) 1103. In the RAM 1103, various programs and data required for the operation of the device 1100 can also be stored. The computing unit 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An I/O (Input/Output) interface 1105 is also connected to the bus 1104.
Various components in device 1100 are connected to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, etc.; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108, such as a magnetic disk, optical disk, etc.; and a communication unit 1109 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 1109 allows the device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 1101 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, a DSP (Digital Signal Processor), and any suitable processor, controller, microcontroller, etc. The computing unit 1101 performs the respective methods and processes described above, such as the emotion analysis method. For example, in some embodiments, the emotion analysis method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1108. In some embodiments, some or all of the computer program may be loaded and/or installed onto the device 1100 via the ROM 1102 and/or the communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the emotion analysis method described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the emotion analysis method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SOCs (Systems on Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that, when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this application, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, RAM, ROM, an EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (Cathode-Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be noted that the example electronic device for implementing the embodiment of the training method of the large language model of the present application is similar in structure to the above-mentioned electronic device, and its description is omitted here.
According to an embodiment of the present application, there is further provided a computer program product which, when instructions in the computer program product are executed by a processor, performs the emotion analysis method set forth in the foregoing embodiments of the present application, or performs the training method of the large language model set forth in the foregoing embodiments of the present application.
It should be appreciated that the various forms of flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (31)

1. A method of emotion analysis, comprising:
acquiring a first target text;
extracting an object to be analyzed from the first target text;
generating a second target text according to the first target text and the object to be analyzed; the second target text comprises a task prompt text, and the task prompt text is used for prompting a large language model to execute emotion analysis tasks on the object to be analyzed based on the first target text;
and inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
2. The method of claim 1, wherein the extracting the object to be analyzed from the first target text comprises:
inputting the first target text into an object extraction model to obtain a tag sequence of the first target text output by the object extraction model;
determining, according to the tag sequence, target characters in the first target text whose tags are a target tag;
and determining the object to be analyzed according to the target characters.
3. The method of claim 1, wherein the extracting the object to be analyzed from the first target text comprises:
performing part-of-speech tagging on the first target text to obtain the part of speech of each word segment in the first target text;
determining candidate word segments from the word segments according to the part of speech of each word segment;
and matching the candidate word segments with words in a preset dictionary to determine the object to be analyzed from the candidate word segments.
4. The method of claim 1, wherein the generating a second target text from the first target text and the object to be analyzed comprises:
adding the object to be analyzed to a preset position in a preset task prompt template to obtain the task prompt text;
and splicing the first target text and the task prompt text to obtain the second target text.
5. The method of claim 1, wherein the generating a second target text from the first target text and the object to be analyzed comprises:
acquiring a target domain prompt text; the target domain prompt text is used for prompting the large language model that the domain to which the first target text belongs is a target domain;
and generating the second target text according to the first target text, the object to be analyzed and the target domain prompt text.
6. The method of claim 5, wherein the generating the second target text from the first target text, the object to be analyzed, and the target domain prompt text comprises:
adding the object to be analyzed to a preset position in a preset task prompt template to obtain the task prompt text;
and splicing the first target text, the target domain prompt text and the task prompt text according to a preset splicing sequence to obtain the second target text.
7. The method of claim 5, wherein the generating the second target text from the first target text, the object to be analyzed, and the target domain prompt text comprises:
matching the first target text with terms in a term base corresponding to the target domain to determine a target term associated with the first target text;
and generating the second target text according to the first target text, the object to be analyzed, the target domain prompt text and the semantics of the target term.
8. The method of claim 5, wherein the acquiring the target domain prompt text comprises:
identifying the domain of the first target text to determine the target domain to which the first target text belongs;
and acquiring the target domain prompt text of the target domain from domain prompt texts of multiple domains.
9. A method of training a large language model, comprising:
acquiring a first training sample; the first training sample comprises a first sample text and a task prompt text, wherein the task prompt text is used for prompting an initial large language model to execute an emotion analysis task on an object to be analyzed in the first sample text;
inputting the first training sample into the initial large language model to obtain predicted emotion polarity;
and training the initial large language model according to the difference between the predicted emotion polarity and the actual emotion polarity corresponding to the object to be analyzed to obtain a large language model.
10. The method of claim 9, wherein the acquiring a first training sample comprises:
acquiring the first sample text;
extracting an object to be analyzed from the first sample text;
and generating the first training sample according to the first sample text and the object to be analyzed.
11. The method of claim 10, wherein the generating the first training sample from the first sample text and the object to be analyzed comprises:
acquiring a target domain prompt text; the target domain prompt text is used for prompting the initial large language model that the domain to which the first sample text belongs is a target domain;
and generating the first training sample according to the first sample text, the object to be analyzed and the target domain prompt text.
12. The method of claim 10, wherein the extracting the object to be analyzed from the first sample text comprises:
inputting the first sample text into an object extraction model to obtain a tag sequence of the first sample text output by the object extraction model;
determining, according to the tag sequence, target characters in the first sample text whose tags are a target tag;
and determining the object to be analyzed according to the target characters.
13. The method of claim 12, wherein the object extraction model is trained by:
acquiring a second training sample; the second training sample comprises a second sample text and a real tag sequence of the second sample text;
inputting the second sample text into an initial object extraction model to obtain a predicted tag sequence;
and training the initial object extraction model according to the difference between the predicted tag sequence and the real tag sequence to obtain the object extraction model.
14. The method of claim 13, wherein the training the initial object extraction model based on the difference between the predicted tag sequence and the real tag sequence to obtain the object extraction model comprises:
determining the probability that the predicted tag sequence is the real tag sequence according to the probability that each character in the second sample text belongs to each tag;
determining model loss according to the probability that the predicted tag sequence is the real tag sequence;
and training the initial object extraction model according to the model loss to obtain the object extraction model.
15. An emotion analysis device comprising:
the first acquisition module is used for acquiring a first target text;
the extraction module is used for extracting an object to be analyzed from the first target text;
the generation module is used for generating a second target text according to the first target text and the object to be analyzed; the second target text comprises a task prompt text, and the task prompt text is used for prompting a large language model to execute emotion analysis tasks on the object to be analyzed based on the first target text;
and the second acquisition module is used for inputting the second target text into the large language model to obtain the emotion polarity of the object to be analyzed.
16. The apparatus of claim 15, wherein the extraction module is configured to:
inputting the first target text into an object extraction model to obtain a tag sequence of the first target text output by the object extraction model;
determining, according to the tag sequence, target characters in the first target text whose tags are a target tag;
and determining the object to be analyzed according to the target characters.
17. The apparatus of claim 15, wherein the extraction module is configured to:
performing part-of-speech tagging on the first target text to obtain the part of speech of each word segment in the first target text;
determining candidate word segments from the word segments according to the part of speech of each word segment;
and matching the candidate word segments with words in a preset dictionary to determine the object to be analyzed from the candidate word segments.
18. The apparatus of claim 15, wherein the generation module is configured to:
adding the object to be analyzed to a preset position in a preset task prompt template to obtain the task prompt text;
and splicing the first target text and the task prompt text to obtain the second target text.
19. The apparatus of claim 15, wherein the generation module is configured to:
acquiring a target domain prompt text; the target domain prompt text is used for prompting the large language model that the domain to which the first target text belongs is a target domain;
and generating the second target text according to the first target text, the object to be analyzed and the target domain prompt text.
20. The apparatus of claim 19, wherein the generation module is configured to:
adding the object to be analyzed to a preset position in a preset task prompt template to obtain the task prompt text;
and splicing the first target text, the target domain prompt text and the task prompt text according to a preset splicing sequence to obtain the second target text.
21. The apparatus of claim 19, wherein the generation module is configured to:
matching the first target text with terms in a term base corresponding to the target domain to determine a target term associated with the first target text;
and generating the second target text according to the first target text, the object to be analyzed, the target domain prompt text and the semantics of the target term.
22. The apparatus of claim 19, wherein the generation module is configured to:
identifying the domain of the first target text to determine the target domain to which the first target text belongs;
and acquiring the target domain prompt text of the target domain from domain prompt texts of multiple domains.
23. A training apparatus for a large language model, comprising:
the first acquisition module is used for acquiring a first training sample; the first training sample comprises a first sample text and a task prompt text, wherein the task prompt text is used for prompting an initial large language model to execute an emotion analysis task on an object to be analyzed in the first sample text;
the second acquisition module is used for inputting the first training sample into the initial large language model to obtain predicted emotion polarity;
the first training module is used for training the initial large language model according to the difference between the predicted emotion polarity and the actual emotion polarity corresponding to the object to be analyzed to obtain a large language model.
24. The apparatus of claim 23, wherein the first acquisition module is configured to:
acquiring the first sample text;
Extracting an object to be analyzed from the first sample text;
and generating the first training sample according to the first sample text and the object to be analyzed.
25. The apparatus of claim 24, wherein the first acquisition module is configured to:
acquiring a target domain prompt text; the target domain prompt text is used for prompting the initial large language model that the domain to which the first sample text belongs is a target domain;
and generating the first training sample according to the first sample text, the object to be analyzed and the target domain prompt text.
26. The apparatus of claim 24, wherein the first acquisition module is configured to:
inputting the first sample text into an object extraction model to obtain a tag sequence of the first sample text output by the object extraction model;
determining, according to the tag sequence, target characters in the first sample text whose tags are a target tag;
and determining the object to be analyzed according to the target characters.
27. The apparatus of claim 26, further comprising:
the third acquisition module is used for acquiring a second training sample; the second training sample comprises a second sample text and a real tag sequence of the second sample text;
the fourth acquisition module is used for inputting the second sample text into an initial object extraction model to obtain a predicted tag sequence;
and the second training module is used for training the initial object extraction model according to the difference between the predicted tag sequence and the real tag sequence to obtain the object extraction model.
28. The apparatus of claim 27, wherein the second training module is configured to:
determining the probability that the predicted tag sequence is the real tag sequence according to the probability that each character in the second sample text belongs to each tag;
determining model loss according to the probability that the predicted tag sequence is the real tag sequence;
and training the initial object extraction model according to the model loss to obtain the object extraction model.
29. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8 or to perform the method of any one of claims 9-14.
30. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-8 or to perform the method of any one of claims 9-14.
31. A computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of any one of claims 1-8 or implements the steps of the method of any one of claims 9-14.
CN202311415479.0A 2023-10-27 2023-10-27 Emotion analysis method, training method and device for large language model Pending CN117436438A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311415479.0A CN117436438A (en) 2023-10-27 2023-10-27 Emotion analysis method, training method and device for large language model


Publications (1)

Publication Number Publication Date
CN117436438A true CN117436438A (en) 2024-01-23

Family

ID=89551106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311415479.0A Pending CN117436438A (en) 2023-10-27 2023-10-27 Emotion analysis method, training method and device for large language model

Country Status (1)

Country Link
CN (1) CN117436438A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117951314A (en) * 2024-03-26 2024-04-30 南京众智维信息科技有限公司 Scenario generation decision method integrating knowledge graph and large language generation model
CN117951314B (en) * 2024-03-26 2024-06-07 南京众智维信息科技有限公司 Scenario generation decision method integrating knowledge graph and large language generation model

Similar Documents

Publication Publication Date Title
CN111191428B (en) Comment information processing method and device, computer equipment and medium
CN111428514A (en) Semantic matching method, device, equipment and storage medium
CN110825867B (en) Similar text recommendation method and device, electronic equipment and storage medium
CN111651974A (en) Implicit discourse relation analysis method and system
CN115357719B (en) Power audit text classification method and device based on improved BERT model
CN111414561B (en) Method and device for presenting information
CN115688920B (en) Knowledge extraction method, training device, training equipment and training medium for model
CN109582788A (en) Comment spam training, recognition methods, device, equipment and readable storage medium storing program for executing
CN112818698B (en) Fine-grained user comment sentiment analysis method based on dual-channel model
CN113761377B (en) False information detection method and device based on attention mechanism multi-feature fusion, electronic equipment and storage medium
US20230073602A1 (en) System of and method for automatically detecting sarcasm of a batch of text
CN117436438A (en) Emotion analysis method, training method and device for large language model
CN112836053A (en) Man-machine conversation emotion analysis method and system for industrial field
CN113486174B (en) Model training, reading understanding method and device, electronic equipment and storage medium
CN111311364A (en) Commodity recommendation method and system based on multi-mode commodity comment analysis
CN112036186A (en) Corpus labeling method and device, computer storage medium and electronic equipment
CN112349294B (en) Voice processing method and device, computer readable medium and electronic equipment
CN113918710A (en) Text data processing method and device, electronic equipment and readable storage medium
CN111339760A (en) Method and device for training lexical analysis model, electronic equipment and storage medium
CN110851572A (en) Session labeling method and device, storage medium and electronic equipment
CN114676699A (en) Entity emotion analysis method and device, computer equipment and storage medium
CN115510860A (en) Text sentiment analysis method and device, electronic equipment and storage medium
CN115470790A (en) Method and device for identifying named entities in file
CN112528674B (en) Text processing method, training device, training equipment and training equipment for model and storage medium
CN115359323A (en) Image text information generation method and deep learning model training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination