CN113722487A - User emotion analysis method, device and equipment and storage medium - Google Patents

Info

Publication number
CN113722487A
Authority
CN
China
Prior art keywords
user
data
emotion
user data
commodity attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111015068.3A
Other languages
Chinese (zh)
Inventor
刘锴靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202111015068.3A
Publication of CN113722487A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of natural language processing in artificial intelligence, and in particular discloses a user emotion analysis method, device, equipment and storage medium. The user emotion analysis method comprises the following steps: acquiring user data by means of a web crawler; extracting a user type from the user data, classifying the user data by user type with a classifier to obtain all user data of the same user type, and splicing all the user data of the same user type one by one to obtain user data to be analyzed; inputting the user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain the user comment for each commodity attribute feature, and predicting the user emotion category from the commodity attribute features and the user comments. In this way, the emotional intention of the user can be acquired accurately, which helps an enterprise carry out precision marketing.

Description

User emotion analysis method, device and equipment and storage medium
Technical Field
The invention relates to the technical field of natural language processing in artificial intelligence, and in particular to a user emotion analysis method, device, equipment and storage medium.
Background
With the rapid development of e-commerce, demand for online shopping keeps growing. This gives e-commerce platforms great development opportunities, but competition among platforms is also becoming increasingly fierce. Beyond preferential pricing strategies and quality control, every e-commerce enterprise must learn how to identify customers' consumption preferences, carry out precision marketing, and reduce competition costs.
In existing customer emotion analysis, an enterprise usually analyzes a customer's consumption preferences from customer comment data. Such data often carries a strong emotional tendency and has become an important source of information for learning customers' consumption preferences and carrying out precision marketing. However, in special application scenarios such as financial e-commerce, analyzing customer emotion from only a single type of customer comment data does not give a comprehensive picture of the customer, so marketing strategies are not accurate enough and sales performance suffers.
Disclosure of Invention
The invention provides a user emotion analysis method, device, equipment and storage medium that can accurately acquire the emotional intention of a user and help an enterprise carry out precision marketing.
In order to solve the technical problems, the invention adopts a technical scheme that: a user emotion analysis method is provided, and comprises the following steps:
obtaining user data by means of a web crawler, wherein the types of the user data comprise at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data;
extracting a user type from the user data, classifying the user data according to the user type based on a classifier to obtain all user data of the same user type, and splicing all the user data of the same user type one by one to obtain user data to be analyzed;
inputting the user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain user comments of each commodity attribute feature, and predicting user emotion categories according to the commodity attribute features and the user comments.
According to an embodiment of the present invention, the step of obtaining the user data by means of a web crawler, wherein the types of the user data comprise at least two of the user purchase area comment data, the mall forum posting data and the mall discussion area comment data, further includes:
capturing URLs in the user purchase area, the mall forum area and the mall discussion area by means of a web crawler, deduplicating the URLs with a Bloom filter, and forming a URL queue from the deduplicated URLs;
crawling the webpages corresponding to the URL queue, and storing the webpages in a database;
and preprocessing the webpages, extracting text data, and cleaning the text data with regular expressions to obtain the user data.
According to an embodiment of the present invention, the step of extracting the user type from the user data, classifying the user data according to the user type based on a classifier, acquiring all the user data of the same user type, performing one-to-one splicing processing on all the user data of the same user type, and acquiring the user data to be analyzed further includes:
performing word segmentation on the splicing processing result to obtain a first vocabulary set;
performing part-of-speech tagging on the first vocabulary set, removing stop words, and obtaining a second vocabulary set;
and carrying out word frequency statistics on the words in the second word set, and carrying out duplicate removal processing on the words with the word frequency statistical result larger than a preset threshold value to obtain the user data to be analyzed.
According to an embodiment of the present invention, the step of predicting the user emotion category according to the commodity attribute feature and the user comment further comprises:
comparing the user comments corresponding to each commodity attribute feature, and obtaining a preference value corresponding to each commodity attribute feature according to a comparison result;
performing product calculation on the preference value and the corresponding commodity attribute weight to obtain an emotion value of each commodity attribute;
summing the emotion values of all the commodity attributes to obtain a user emotion value;
and predicting the emotion category of the user according to the emotion value of the user.
According to an embodiment of the present invention, the step of predicting the user emotion category according to the commodity attribute feature and the user comment further comprises:
calculating the prediction probability of the emotion category of the user according to the commodity attribute features and the user comments based on a naive Bayes classification algorithm;
and taking the classification corresponding to the maximum user emotion classification prediction probability as a user emotion classification.
According to an embodiment of the present invention, before the step of inputting the user data to be analyzed into the pre-trained user emotion analysis model, the method further includes:
acquiring historical user data and forming a data set, and dividing the data set into a training set and a test set;
constructing a user emotion analysis model and training the user emotion analysis model by adopting the training set to obtain a trained user emotion analysis model;
and verifying the prediction result of the trained user emotion analysis model by adopting the test set.
According to an embodiment of the present invention, after the step of predicting the user emotion category according to the commodity attribute characteristics and the user comments, the method further includes:
visually displaying the user emotion category prediction results, comparing the user emotion category prediction results under a plurality of commodities, and obtaining user emotion trend distribution under each commodity through summary statistics;
and formulating a corresponding marketing strategy according to the user emotion trend distribution.
In order to solve the technical problem, the invention adopts another technical scheme that: provided is a user emotion analysis device including:
a first acquisition module, configured to obtain user data by means of a web crawler, wherein the types of the user data comprise at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data;
a second acquisition module, configured to extract the user type from the user data, classify the user data by user type with a classifier to obtain all the user data of the same user type, and splice all the user data of the same user type one by one to obtain the user data to be analyzed;
and an emotion analysis module, configured to input the user data to be analyzed into a pre-trained user emotion analysis model, extract commodity attribute features from the user data to be analyzed based on a self-attention mechanism, perform context semantic learning on the commodity attribute features to obtain the user comment for each commodity attribute feature, and predict the user emotion category from the commodity attribute features and the user comments.
In order to solve the technical problems, the invention adopts another technical scheme that: there is provided a computer device comprising: the emotion analysis system comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor executes the computer program to realize the user emotion analysis method.
In order to solve the technical problems, the invention adopts another technical scheme that: there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the user emotion analysis method described above.
The invention has the following beneficial effects: by jointly considering the user purchase area comment data, the mall forum posting data and the mall discussion area comment data, and analyzing user emotion comprehensively with the user emotion analysis model, the emotional intention of the user toward each commodity can be acquired accurately and consumers' real opinions can be understood in depth. This helps the enterprise formulate more effective marketing strategies, carry out precision marketing, and reduce competition costs.
Drawings
FIG. 1 is a flowchart illustrating a user emotion analyzing method according to a first embodiment of the present invention;
FIG. 2 is a schematic flowchart of step S101 in the user emotion analysis method according to the embodiment of the present invention;
FIG. 3 is a flowchart illustrating step S102 in the user emotion analysis method according to the embodiment of the present invention;
FIG. 4 is a flowchart illustrating a user emotion analyzing method according to a second embodiment of the present invention;
FIG. 5 is a flowchart illustrating a user emotion analyzing method according to a third embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a user emotion analyzing apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computer storage medium according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second" and "third" in the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first", "second" or "third" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise. All directional indicators (such as up, down, left, right, front, rear and so on) in the embodiments of the present invention are only used to explain the relative positional relationship, movement and the like of the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indicator changes accordingly. Furthermore, the terms "include" and "have", as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the steps or elements listed, but may optionally include other steps or elements that are not listed or that are inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
FIG. 1 is a flowchart illustrating a user emotion analysis method according to a first embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in FIG. 1, provided that substantially the same result is obtained. As shown in FIG. 1, the method comprises the following steps:
Step S101: obtain user data by means of a web crawler, wherein the types of the user data comprise at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data.
In step S101, web crawling refers to a technique for extracting data from websites; a web crawler can convert unstructured data into structured data. In this embodiment, the structured data is JSON, XML and the like, and the unstructured data is HTML. Web crawler tools include, but are not limited to, Crawler4j, Scrapy, Python, BaseSpider, sitemap and the like. The application scenario of this embodiment is a financial mall, where a user can leave messages in one or more of the purchase area, the mall forum area and the mall discussion area. This embodiment mainly considers the case where the user leaves messages in at least two of these areas, thereby generating the user data. Traditional user emotion analysis relies only on user purchase comment data; with a single data source it cannot fully reflect user emotion. This embodiment jointly considers at least two of the user purchase area comment data, the mall forum posting data and the mall discussion area comment data, so that user emotion can be analyzed accurately and consumers' real opinions understood in depth, which supports an effective marketing strategy.
The user data records commodity attributes and the user's comments on them. The commodity attributes include the unit price of the product, whether the product is regular (that is, fixed-term), the return rate, whether interest is added, and the risk; the user comments reflect the user's desire to purchase the commodity.
Further, referring to FIG. 2, step S101 further includes the following steps:
Step S201: capture URLs in the user purchase area, the mall forum area and the mall discussion area by means of a web crawler, deduplicate the URLs with a Bloom filter, and form a URL queue from the deduplicated URLs.
In step S201, a set of seed URLs is first selected and placed in the to-be-captured URL queue. A URL is then taken from that queue, its DNS is resolved to obtain the host IP, the corresponding webpage is downloaded and stored in the downloaded-webpage library, and the URL is placed in the captured URL queue. The URLs in the captured URL queue are then analyzed, the sub-URLs they contain are extracted and placed in the to-be-captured URL queue, and the next cycle begins. In other embodiments, incomplete HTML code can also be processed using the selectors interface, and the crawl can be customized to target the relevant shopping reviews.
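The crawl cycle described in step S201 can be sketched as a breadth-first traversal over two queues. This is a minimal illustration, not the patent's implementation: the function `crawl`, the `fetch_links` callback and the `max_pages` limit are hypothetical names introduced here, DNS resolution and page download are hidden behind the callback, and a plain Python set stands in for the duplicate check.

```python
from collections import deque


def crawl(seed_urls, fetch_links, max_pages=100):
    """Breadth-first crawl: returns the URLs in the order they were captured.

    fetch_links(url) is a caller-supplied (hypothetical) function that downloads
    the page at `url` and returns the sub-URLs found in it.
    """
    to_crawl = deque(seed_urls)   # the to-be-captured URL queue
    crawled = []                  # the captured URL queue
    seen = set(seed_urls)         # simple duplicate check for this sketch

    while to_crawl and len(crawled) < max_pages:
        url = to_crawl.popleft()
        crawled.append(url)
        # Extract sub-URLs from the captured page and enqueue the new ones.
        for child in fetch_links(url):
            if child not in seen:
                seen.add(child)
                to_crawl.append(child)
    return crawled
```

Passing a function that reads from an in-memory link graph instead of the network makes the loop easy to exercise without any real website.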
The Bloom filter uses the Bloom algorithm to judge whether a URL has already been visited. In this embodiment the URL queue is stored in a hash table. When a new URL is to be added to the URL queue, it is compared directly against the elements in the hash table: if it is not present, it is added to the URL queue; if it is present, the URL is a duplicate and is discarded.
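As an illustration of URL deduplication with a Bloom filter, the sketch below uses a fixed-size bit array probed by several SHA-256-derived hash positions. The class `BloomFilter`, the function `build_url_queue`, and the sizes `num_bits` and `num_hashes` are assumptions made for this example; the patent does not specify them. A Bloom filter can report false positives, so a small fraction of new URLs may be dropped as if they were duplicates.

```python
import hashlib
from collections import deque


class BloomFilter:
    """Fixed-size bit array probed by k hash positions derived from SHA-256."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def build_url_queue(urls):
    """Return a queue of URLs with (probable) duplicates dropped."""
    seen = BloomFilter()
    queue = deque()
    for url in urls:
        if url in seen:   # probably seen before: treat as a duplicate
            continue
        seen.add(url)
        queue.append(url)
    return queue
```

Compared with storing every URL in a hash table, the bit array uses constant memory at the cost of occasional false positives.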
Step S202: crawl the webpages corresponding to the URL queue, and store the webpages in a database.
Step S203: preprocess the webpages, extract text data, and clean the text data with regular expressions to obtain the user data.
In step S203, the preprocessing includes the following operations: text extraction, Chinese word segmentation, noise elimination (for example, copyright notices, navigation bars, advertisements and the like), index processing, link relation calculation and special file processing. Besides HTML files, a search engine can generally also crawl and index a variety of text-based file types, such as PDF, Word, WPS, XLS, PPT and TXT files. The regular expressions of this embodiment screen out the user data that matches the expected patterns from the text data, so that the resulting user data is cleaner and redundant interference is removed. This safeguards the accuracy and reliability of the user data and makes the subsequent analysis of user emotion more accurate.
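A regular-expression cleaning pass of the kind described in step S203 might look like the following sketch. The patterns and the noise keywords (`copyright`, `all rights reserved`, `advertisement`) are illustrative assumptions, as is the function name `clean_page`; a production crawler would more likely use an HTML parser for tag removal.

```python
import html
import re

# Remove <script>/<style> blocks together with their contents.
SCRIPT_STYLE_RE = re.compile(r"<(script|style)\b.*?</\1>", re.IGNORECASE | re.DOTALL)
# Any remaining tag becomes a line break so that noise removal stays line-bounded.
TAG_RE = re.compile(r"<[^>]+>")
# Drop whole lines that contain a (hypothetical) noise keyword.
NOISE_RE = re.compile(r"(?im)^.*\b(copyright|all rights reserved|advertisement)\b.*$")
WHITESPACE_RE = re.compile(r"\s+")


def clean_page(raw_html):
    """Extract plain comment text from a crawled page."""
    text = SCRIPT_STYLE_RE.sub(" ", raw_html)
    text = TAG_RE.sub("\n", text)
    text = html.unescape(text)        # decode entities such as &amp;
    text = NOISE_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()
```

Replacing tags with line breaks before the noise pass keeps a copyright footer from swallowing the comment text that precedes it on the same source line.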
Step S102: the user types are extracted from the user data, the user data are classified according to the user types based on the classifier, all the user data of the same user type are obtained, all the user data of the same user type are spliced one by one, and the user data to be analyzed are obtained.
In step S102, each user type is associated with its corresponding user data, and a classifier is trained with the associated data as training data, where the user type is the input of the classifier and the user data associated with that user type is the output. In this embodiment the same user can leave messages in different areas, and the messages left by the same user in different areas can be the same or different.
Table 1 shows the commodity attributes and the associated user comments in the user data.
[Table 1: user data, published as images BDA0003239549940000071 and BDA0003239549940000081; the table contents are not reproduced in the text]
However, the commodity attributes that the same user comments on in different comment areas may not be completely consistent. For example, user A may comment on the unit price of the product, whether it is regular, and the return rate in the user purchase area; on whether it is regular, the return rate, whether interest is added, and the risk in the mall forum area; and on the return rate, whether interest is added, and the risk in the mall discussion area. The comments of user A in different comment areas are therefore partially repeated. In this embodiment, the comments of the same user in different areas are merged and spliced to obtain complete user data, and repeated commodity attributes are deduplicated in subsequent processing. This embodiment inserts a [CLS] token at the beginning of the spliced comments and a [SEP] token at the end of each comment. For example, user A's comment in the user purchase area is: "The regular period is 6 months and the return rate is 10%, which is relatively high among peers; purchase is recommended." User A's comment in the mall forum is: "The regular period is 6 months and the return rate is 10%, with medium-low risk; worth considering." The user data splicing result for user A is: "[CLS] The regular period is 6 months and the return rate is 10%, which is relatively high among peers; purchase is recommended. [SEP] The regular period is 6 months and the return rate is 10%, with medium-low risk; worth considering. [SEP]"
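The splicing described above can be sketched as grouping comments by user and joining them with the [CLS] and [SEP] markers. The record layout `(user_id, area, comment)` and the function name `splice_comments` are assumptions made for this example; the grouping key here is the user, following the user A example, rather than a classifier-assigned user type.

```python
from collections import defaultdict


def splice_comments(records):
    """Group comments by user and splice them into one marked-up string.

    records: iterable of (user_id, area, comment) tuples (hypothetical layout).
    Returns {user_id: "[CLS] comment1 [SEP] comment2 [SEP] ..."}.
    """
    by_user = defaultdict(list)
    for user_id, _area, comment in records:
        by_user[user_id].append(comment.strip())
    return {
        user_id: "[CLS] " + " ".join(f"{c} [SEP]" for c in comments)
        for user_id, comments in by_user.items()
    }
```

Comments are kept in arrival order, so the purchase-area comment precedes the forum comment whenever the records are supplied in that order.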
Further, referring to FIG. 3, step S102 further includes the following steps:
Step S301: perform word segmentation on the splicing result to obtain a first vocabulary set.
In step S301, a natural language processing (NLP) method such as TF-IDF, TextRank or Word2Vec word clustering is used to perform word segmentation on the splicing result, so as to obtain the first vocabulary set.
Step S302: perform part-of-speech tagging on the first vocabulary set and remove stop words to obtain a second vocabulary set.
Step S303: perform word frequency statistics on the words in the second vocabulary set, and deduplicate the words whose word frequency is greater than a preset threshold to obtain the user data to be analyzed.
In step S303, suppose that user A comments on the unit price of the product, whether it is regular, and the return rate in the user purchase area; on whether it is regular, the return rate, whether interest is added, and the risk in the mall forum area; and on the return rate and whether interest is added in the mall discussion area. User A has thus commented repeatedly on whether the product is regular, the return rate, and whether interest is added. If the comments on the same commodity attribute in different areas do not conflict, the vocabulary corresponding to these three attributes is deduplicated, and the resulting user data to be analyzed includes: unit price of the product, whether regular, return rate, whether interest is added, and risk.
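The word frequency statistics and deduplication of step S303 can be sketched with a counter over already-segmented tokens. The token names, the stop-word list and the threshold of 1 are hypothetical; word segmentation and part-of-speech tagging (steps S301 and S302) are assumed to have been done upstream.

```python
from collections import Counter

# Hypothetical stop words; the [CLS]/[SEP] markers are also dropped here.
STOP_WORDS = {"the", "is", "a", "and", "of", "[CLS]", "[SEP]"}


def dedupe_frequent(tokens, threshold=1):
    """Keep one copy of every token whose frequency exceeds `threshold`.

    Tokens at or below the threshold are kept as they are; first-occurrence
    order is preserved.
    """
    filtered = [t for t in tokens if t not in STOP_WORDS]
    counts = Counter(filtered)
    seen = set()
    result = []
    for tok in filtered:
        if counts[tok] > threshold:
            if tok in seen:
                continue          # repeated frequent token: drop it
            seen.add(tok)
        result.append(tok)
    return result
```

Feeding in the attribute tokens from the user A example (purchase area, forum area, discussion area, in that order) yields the five attributes listed at the end of step S303.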
Step S103: inputting user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain user comments of each commodity attribute feature, and predicting user emotion categories according to the commodity attribute features and the user comments.
In step S103, the present embodiment acquires and processes user data to be analyzed based on an artificial intelligence technique. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like. The user emotion analysis model of the embodiment is an artificial intelligent model and relates to a natural language processing technology.
Further, the commodity attributes of the embodiment include unit price of the product, whether to be regular, return rate, whether to add interest, risk, and the like; the user comment includes that the user evaluates the commodity attribute, and specifically includes: unit price of product (high/low), whether regular (yes/no), rate of return (high/low), whether interest is added (yes/no), risk (high/low). The user emotion categories of the present embodiment include "buy" and "not buy".
In one embodiment, the user comments corresponding to each commodity attribute feature are compared, and a preference value for each commodity attribute feature is obtained from the comparison result; each preference value is multiplied by the corresponding commodity attribute weight to obtain the emotion value of that commodity attribute; the emotion values of all the commodity attributes are summed to obtain the user emotion value; and the user emotion category is predicted from the user emotion value. In this embodiment, the higher the user emotion value, the more the user emotion category tends toward "buy", and the user emotion analysis model outputs "1"; the lower the user emotion value, the more the user emotion category tends toward "not buy", and the user emotion analysis model outputs "0".
In another embodiment, the prediction probability of each user emotion category is calculated from the commodity attribute features and the user comments with a naive Bayes classification algorithm, and the category with the maximum prediction probability is taken as the user emotion category. In this embodiment, the user emotion category probability lies in the interval [0, 1]. The closer the predicted probability is to 1, the stronger the user's desire to purchase: the predicted user emotion category is "buy" and the user emotion analysis model outputs "1". The closer the predicted probability is to 0, the weaker the user's desire to purchase: the predicted user emotion category is "not buy" and the user emotion analysis model outputs "0".
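The weighted-sum embodiment can be illustrated as follows. The preference values (+1 or -1), the attribute weights and the decision threshold of 0 are all hypothetical numbers chosen for this sketch; the source does not give concrete values.

```python
# Hypothetical preference values: +1 for a comment that favors buying, -1 otherwise.
PREFERENCE = {
    ("unit_price", "low"): 1.0, ("unit_price", "high"): -1.0,
    ("regular", "yes"): 1.0, ("regular", "no"): -1.0,
    ("return_rate", "high"): 1.0, ("return_rate", "low"): -1.0,
    ("interest_added", "yes"): 1.0, ("interest_added", "no"): -1.0,
    ("risk", "low"): 1.0, ("risk", "high"): -1.0,
}

# Hypothetical commodity attribute weights (they sum to 1).
WEIGHTS = {
    "unit_price": 0.2,
    "regular": 0.1,
    "return_rate": 0.3,
    "interest_added": 0.1,
    "risk": 0.3,
}


def user_emotion_value(comments):
    """Sum of preference value times attribute weight over all commented attributes."""
    return sum(PREFERENCE[(attr, comment)] * WEIGHTS[attr]
               for attr, comment in comments.items())


def predict_category(comments, threshold=0.0):
    """Return 1 ("buy") if the user emotion value exceeds the threshold, else 0."""
    return 1 if user_emotion_value(comments) > threshold else 0
```

With these numbers, a user who rates everything favorably except "interest added" scores 0.2 + 0.1 + 0.3 - 0.1 + 0.3, which is above the threshold, so the sketch outputs 1.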
Further, the naive Bayes classification algorithm proceeds according to the following formula:
$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$
where A is a commodity attribute, B is a user emotion category, P(B|A) is the probability of the user emotion category given the commodity attribute, P(B) is the probability of each category in the training samples, P(A) is the probability of each commodity attribute in the training samples, and P(A|B) is the probability of each commodity attribute under each category.
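A minimal sketch of naive Bayes classification over categorical commodity attributes is given below, assuming conditional independence of the attributes given the category. The training samples are invented for illustration (Table 1 is not available in the text), and the Laplace smoothing parameter `alpha` is an addition that the source does not mention; it avoids zero probabilities for unseen attribute values.

```python
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (features, label), where features maps attribute -> value."""
    class_counts = Counter(label for _, label in samples)
    value_counts = defaultdict(Counter)   # (label, attribute) -> Counter of values
    values = defaultdict(set)             # attribute -> set of observed values
    for feats, label in samples:
        for attr, val in feats.items():
            value_counts[(label, attr)][val] += 1
            values[attr].add(val)
    return class_counts, value_counts, values


def posterior(feats, model, alpha=1.0):
    """Return {label: P(B|A)} for the given attribute values."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    joint = {}
    for label, n in class_counts.items():
        p = n / total                                    # P(B)
        for attr, val in feats.items():
            count = value_counts[(label, attr)][val]
            p *= (count + alpha) / (n + alpha * len(values[attr]))   # smoothed P(A|B)
        joint[label] = p
    evidence = sum(joint.values())                       # P(A)
    return {label: p / evidence for label, p in joint.items()}


def predict(feats, model):
    """Pick the category with the maximum posterior probability."""
    probs = posterior(feats, model)
    return max(probs, key=probs.get)
```

Dividing by the sum of the joint probabilities plays the role of P(A) in the formula, so the returned posteriors sum to 1 across the categories.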
For example, referring to Table 1, the probability of a user buying or not buying in each case is calculated based on the naive Bayes classification algorithm as follows:
The user emotion category corresponding to the maximum probability is then selected as the user emotion analysis result for the user data to be analyzed.
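The naive Bayes calculation and the selection of the maximum-probability category may be sketched as follows. The training samples below are hypothetical and are not the contents of Table 1; the add-one smoothing is likewise an assumption of this sketch and is not stated in the specification.

```python
from collections import Counter, defaultdict


def train(samples):
    """samples: list of (attributes, label), attributes being {attribute: comment value}."""
    class_counts = Counter(label for _, label in samples)
    value_counts = defaultdict(Counter)  # (attribute, label) -> counts of comment values
    values_seen = defaultdict(set)       # attribute -> distinct comment values
    for attrs, label in samples:
        for attr, val in attrs.items():
            value_counts[(attr, label)][val] += 1
            values_seen[attr].add(val)
    return class_counts, value_counts, values_seen


def predict_proba(model, attrs):
    """Return P(B|A) for every category B, using P(A|B) * P(B) / P(A)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    joint = {}
    for label, n in class_counts.items():
        p = n / total  # P(B)
        for attr, val in attrs.items():
            k = max(len(values_seen[attr]), 1)
            # P(A|B) with add-one smoothing (assumption of this sketch)
            p *= (value_counts[(attr, label)][val] + 1) / (n + k)
        joint[label] = p
    evidence = sum(joint.values())  # P(A)
    return {label: p / evidence for label, p in joint.items()}


# Hypothetical training samples: label 1 = buy, label 0 = do not buy.
samples = [
    ({"return_rate": "high", "risk": "low"}, 1),
    ({"return_rate": "high", "risk": "medium"}, 1),
    ({"return_rate": "low", "risk": "high"}, 0),
    ({"return_rate": "low", "risk": "medium"}, 0),
]
model = train(samples)
proba = predict_proba(model, {"return_rate": "high", "risk": "low"})
predicted = max(proba, key=proba.get)  # category with the maximum probability
print(proba, predicted)
```

The attributes are treated as conditionally independent given the category, which is the "naive" assumption that lets P(A|B) be computed as a product over attributes.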
The user emotion analysis method takes into account the comment data of the user purchase area, the posting data of the mall forum area and the comment data of the mall discussion area, and processes the acquired data together to obtain user data to be analyzed that fully reflects user emotion. The user emotion in the user data to be analyzed is then analyzed based on the user emotion analysis model, so that the user's emotion toward each commodity can be obtained accurately and the consumer's views can be understood in depth, which helps an enterprise formulate more effective marketing strategies, carry out precision marketing, and reduce the cost of competition.
FIG. 4 is a flowchart illustrating a user emotion analyzing method according to a second embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 4 if the results are substantially the same. As shown in fig. 4, the method includes the steps of:
Step S401: User data is obtained by means of a web crawler, wherein the types of the user data include at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data.
In this embodiment, step S401 in fig. 4 is similar to step S101 in fig. 1, and for brevity, is not described herein again.
Step S402: the user types are extracted from the user data, the user data are classified according to the user types based on the classifier, all the user data of the same user type are obtained, all the user data of the same user type are spliced one by one, and the user data to be analyzed are obtained.
In this embodiment, step S402 in fig. 4 is similar to step S102 in fig. 1, and for brevity, is not described herein again.
Step S403: historical user data is acquired, a data set is formed, and the data set is divided into a training set and a testing set.
In step S403, historical user data is obtained by means of a web crawler, commodity attributes are determined from the historical user data, each commodity attribute is marked with a corresponding comment label, and each item of historical user data is marked with a user emotion category label according to the actual purchase outcome. The training set is divided into positive and negative samples according to the user emotion category label: samples whose user emotion category is "buy" are positive samples, samples whose category is "do not buy" are negative samples, and the ratio of positive to negative samples is 1:1.
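One possible way to form a 1:1 balanced data set and divide it into a training set and a test set is sketched below. The down-sampling of the larger class, the 20% test fraction and the sample contents are assumptions of this sketch; the specification only states the 1:1 ratio of positive to negative samples.

```python
import random


def balance_and_split(samples, test_fraction=0.2, seed=0):
    """samples: list of (text, label) with label 1 = buy, 0 = do not buy.

    Down-samples the larger class so positives and negatives are 1:1,
    then splits each class into training and test portions.
    """
    rng = random.Random(seed)
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    n = min(len(positives), len(negatives))
    positives = rng.sample(positives, n)
    negatives = rng.sample(negatives, n)
    n_test = int(n * test_fraction)
    test_set = positives[:n_test] + negatives[:n_test]
    train_set = positives[n_test:] + negatives[n_test:]
    rng.shuffle(train_set)
    rng.shuffle(test_set)
    return train_set, test_set


# Hypothetical labelled history: 10 "buy" samples and 6 "do not buy" samples.
history = [(f"positive comment {i}", 1) for i in range(10)]
history += [(f"negative comment {i}", 0) for i in range(6)]

train_set, test_set = balance_and_split(history)
print(len(train_set), len(test_set))
```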
Step S404: A user emotion analysis model is constructed and trained with the training set to obtain a trained user emotion analysis model.
In step S404, the user emotion analysis model based on the naive Bayes classification algorithm is trained using the training set together with a train function and a classification function, wherein the train function is used to train the classification part of the classification prediction layer, and the classification function is used to predict the classification result.
Further, the prediction results of the trained user emotion analysis model are verified using the test set. When the prediction results are consistent with the actual emotion categories of the test set, the trained user emotion analysis model is stored; when they are inconsistent, the test set is added to the training set to update it, and the user emotion analysis model is optimized using the updated training set.
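The train, verify and update cycle described above may be sketched as follows. The names train_fn and classify_fn stand in for the train function and the classification function mentioned in the specification, whose actual signatures are not given; the majority-label model is only a placeholder used to make the sketch runnable and is not the naive Bayes model itself.

```python
from collections import Counter


def train_and_verify(train_fn, classify_fn, train_set, test_set):
    """Train, verify on the test set, and retrain on the merged set on any mismatch."""
    model = train_fn(train_set)
    consistent = all(classify_fn(model, x) == y for x, y in test_set)
    if consistent:
        return model, train_set, True  # model can be stored as-is
    updated_train_set = train_set + test_set  # add the test set to the training set
    return train_fn(updated_train_set), updated_train_set, False


# Placeholder model (assumption): always predicts the majority training label.
def majority_train(samples):
    return Counter(label for _, label in samples).most_common(1)[0][0]


def majority_classify(model, _text):
    return model


train_set = [("comment a", 1), ("comment b", 1), ("comment c", 0)]
test_set = [("comment d", 0), ("comment e", 0)]

model, used_train_set, passed = train_and_verify(
    majority_train, majority_classify, train_set, test_set
)
print(model, len(used_train_set), passed)
```

Because the merged set contains the former test samples, a fresh test set would be needed to verify the retrained model again.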
For example, taking the user emotion analysis model based on the naive Bayes classification algorithm as an example, the accuracy of the model is tested using the following test set, where Q1, Q2 and Q3 are test-set samples.
Q1 = [CLS] 6-month fixed term, 10% return rate, high compared with peers, recommended to buy [SEP] 6-month fixed term, 10% return rate, low-to-medium risk, can get started [SEP]
q.sentiments
0.999786745764231
Output: 1
Q2 = [CLS] The return rate is not high and the fixed term is a little long [SEP] The return rate is rather low, I personally do not like it [SEP]
q.sentiments
0.314159278698761
Output: 0 (prediction result is accurate)
Q3 = [CLS] The product price is high, the risk is high, and the return rate is low; not recommended to purchase [SEP] I personally do not like it [SEP]
q.sentiments
0.8743937414987555
Output: 1
The test results show that the predictions for Q1 and Q2 are correct, indicating that the user emotion analysis model is accurate for these samples and needs no adjustment, while the prediction for Q3 is wrong, indicating that the model has errors and needs to be adjusted.
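The check performed on Q1, Q2 and Q3 can be reproduced with the short sketch below. The sentiment scores are those listed above; the 0.5 decision threshold is an assumption, and the actual categories are inferred from the statement that the Q1 and Q2 predictions are correct and the Q3 prediction is wrong.

```python
# Sentiment scores reported for the three test samples.
scores = {
    "Q1": 0.999786745764231,
    "Q2": 0.314159278698761,
    "Q3": 0.8743937414987555,
}
# Actual categories inferred from the reported correctness (1 = buy, 0 = do not buy).
actual = {"Q1": 1, "Q2": 0, "Q3": 0}


def to_output(score, threshold=0.5):
    """Map a probability in [0, 1] to the model output; the threshold is assumed."""
    return 1 if score >= threshold else 0


outputs = {q: to_output(s) for q, s in scores.items()}
wrong = [q for q in scores if outputs[q] != actual[q]]
accuracy = 1 - len(wrong) / len(scores)
print(outputs, wrong, accuracy)
```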
Step S405: inputting user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain user comments of each commodity attribute feature, and predicting user emotion categories according to the commodity attribute features and the user comments.
In this embodiment, step S405 in fig. 4 is similar to step S103 in fig. 1, and for brevity, is not described herein again.
The user emotion analysis method of the second embodiment of the invention trains the user emotion analysis model on historical user data and iteratively verifies its prediction results, thereby effectively improving the accuracy of the model so that user emotion can be analyzed accurately.
FIG. 5 is a flowchart illustrating a user emotion analyzing method according to a third embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 5 if the results are substantially the same. As shown in fig. 5, the method includes the steps of:
Step S501: User data is obtained by means of a web crawler, wherein the types of the user data include at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data.
In this embodiment, step S501 in fig. 5 is similar to step S101 in fig. 1, and for brevity, is not described herein again.
Step S502: the user types are extracted from the user data, the user data are classified according to the user types based on the classifier, all the user data of the same user type are obtained, all the user data of the same user type are spliced one by one, and the user data to be analyzed are obtained.
In this embodiment, step S502 in fig. 5 is similar to step S102 in fig. 1, and for brevity, is not described herein again.
Step S503: inputting user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain user comments of each commodity attribute feature, and predicting user emotion categories according to the commodity attribute features and the user comments.
In this embodiment, step S503 in fig. 5 is similar to step S103 in fig. 1, and for brevity, is not described herein again.
Step S504: The user emotion category prediction results are displayed visually, the prediction results for a plurality of commodities are compared, and the user emotion tendency distribution for each commodity is obtained through summary statistics.
In step S504, for the same user, the user's consumption intention toward each commodity is analyzed; for the same commodity, the consumption intention of customers toward that commodity is analyzed.
Step S505: A corresponding marketing strategy is formulated according to the user emotion tendency distribution.
In step S505, preferred commodities are selected according to the user emotion tendency distribution and are marketed precisely to the user.
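The summary statistics of step S504 and the selection of preferred commodities in step S505 may be sketched as follows. The record layout (user, commodity, output), the user and commodity names, and the use of the "buy" share as the tendency measure are assumptions of this sketch; visual display is omitted.

```python
from collections import defaultdict


def tendency_by_commodity(predictions):
    """predictions: iterable of (user_id, commodity, output), output being 0 or 1.

    Returns, for each commodity, the share of users predicted to buy.
    """
    counts = defaultdict(lambda: [0, 0])  # commodity -> [do-not-buy count, buy count]
    for _, commodity, output in predictions:
        counts[commodity][output] += 1
    return {c: buy / (buy + not_buy) for c, (not_buy, buy) in counts.items()}


def preferred_commodities(predictions, user_id):
    """Commodities that the given user is predicted to buy."""
    return sorted({c for u, c, out in predictions if u == user_id and out == 1})


# Hypothetical prediction results from the user emotion analysis model.
predictions = [
    ("u1", "fund_a", 1),
    ("u2", "fund_a", 1),
    ("u3", "fund_a", 0),
    ("u1", "fund_b", 0),
    ("u2", "fund_b", 0),
]

distribution = tendency_by_commodity(predictions)
targets = preferred_commodities(predictions, "u1")
print(distribution, targets)
```

A commodity with a low buy share across many users points to a possible problem with the commodity or its marketing, while the per-user list identifies what to market to that user.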
On the basis of the first embodiment, this embodiment uses the user emotion analysis results, on the one hand, to analyze the same user's consumption intention toward each commodity and gain a deeper understanding of that user's consumption preferences; on the other hand, the emotion analysis results of multiple users for the same commodity can be aggregated to indirectly reveal problems with the commodity or the marketing strategy, which helps enterprises proactively improve commodity quality and marketing strategy, carry out precision marketing, and further reduce the cost of competition.
Fig. 6 is a schematic structural diagram of a user emotion analysis apparatus according to an embodiment of the present invention. As shown in fig. 6, the apparatus 60 includes a first obtaining module 61, a second obtaining module 62 and an emotion analyzing module 63.
The first obtaining module 61 is configured to obtain user data by means of a web crawler, where the types of the user data include at least two of user purchase area comment data, mall forum posting data, and mall discussion area comment data.
The second obtaining module 62 is configured to extract a user type from the user data, classify the user data according to the user type based on the classifier, obtain all user data of the same user type, and perform one-to-one splicing processing on all user data of the same user type to obtain user data to be analyzed.
The emotion analysis module 63 is configured to input user data to be analyzed into a pre-trained user emotion analysis model, extract product attribute features from the user data to be analyzed based on a self-attention mechanism, perform context semantic learning on the product attribute features to obtain user comments of each product attribute feature, and predict user emotion categories according to the product attribute features and the user comments.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in fig. 7, the computer device 70 includes a processor 71 and a memory 72 coupled to the processor 71.
The memory 72 stores program instructions for implementing the user emotion analysis method according to any of the above embodiments.
The processor 71 is configured to execute the program instructions stored in the memory 72 to analyze user emotion.
The processor 71 may also be referred to as a CPU (Central Processing Unit). The processor 71 may be an integrated circuit chip having signal processing capabilities. The processor 71 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a computer storage medium according to an embodiment of the present invention. The computer storage medium of the embodiment of the present invention stores a program file 81 capable of implementing all the methods described above, wherein the program file 81 may be stored in the computer storage medium in the form of a software product, and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned computer storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices such as a computer, a server, a mobile phone, or a tablet.
In the embodiments provided by the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a logical functional division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A user emotion analysis method is characterized by comprising the following steps:
obtaining user data in a web crawler manner, wherein the types of the user data comprise at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data;
extracting a user type from the user data, classifying the user data according to the user type based on a classifier to obtain all user data of the same user type, and splicing all the user data of the same user type one by one to obtain user data to be analyzed;
inputting the user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain user comments of each commodity attribute feature, and predicting user emotion categories according to the commodity attribute features and the user comments.
2. The method for analyzing user emotion according to claim 1, wherein the step of obtaining user data in a web crawler manner, the types of the user data including at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data, further comprises:
capturing URLs in a user purchase area, a mall forum area and a mall discussion area in a web crawler manner, performing deduplication on the URLs using a Bloom filter, and forming a URL queue from the deduplicated URLs;
crawling a webpage corresponding to the URL queue, and storing the webpage in a database;
and preprocessing the webpage, extracting text data, and cleaning the text data by using a regular expression to obtain user data.
3. The method for analyzing user emotion according to claim 1, wherein the step of extracting a user type from the user data, classifying the user data according to the user type based on a classifier to obtain all user data of the same user type, and performing one-to-one concatenation processing on all user data of the same user type to obtain user data to be analyzed further comprises:
performing word segmentation on the splicing processing result to obtain a first vocabulary set;
performing part-of-speech tagging on the first vocabulary set, removing stop words, and obtaining a second vocabulary set;
and carrying out word frequency statistics on the words in the second word set, and carrying out duplicate removal processing on the words with the word frequency statistical result larger than a preset threshold value to obtain the user data to be analyzed.
4. The method for analyzing user emotion according to claim 1, wherein the step of predicting the user emotion classification based on the commodity attribute feature and the user comment further comprises:
comparing the user comments corresponding to each commodity attribute feature, and obtaining a preference value corresponding to each commodity attribute feature according to a comparison result;
performing product calculation on the preference value and the corresponding commodity attribute weight to obtain an emotion value of each commodity attribute;
summing the emotion values of all the commodity attributes to obtain a user emotion value;
and predicting the emotion category of the user according to the emotion value of the user.
5. The method for analyzing user emotion according to claim 1, wherein the step of predicting the user emotion classification based on the commodity attribute feature and the user comment further comprises:
calculating the prediction probability of the emotion category of the user according to the commodity attribute features and the user comments based on a naive Bayes classification algorithm;
and taking the classification corresponding to the maximum user emotion classification prediction probability as a user emotion classification.
6. The method for analyzing user emotion according to claim 1, wherein before the step of inputting the user data to be analyzed into a pre-trained user emotion analysis model, the method further comprises:
acquiring historical user data and forming a data set, and dividing the data set into a training set and a test set;
constructing a user emotion analysis model and training the user emotion analysis model by adopting the training set to obtain a trained user emotion analysis model;
and verifying the prediction result of the trained user emotion analysis model by adopting the test set.
7. The method for analyzing user emotion according to claim 1, further comprising, after the step of predicting the user emotion classification from the commodity attribute feature and the user comment:
visually displaying the user emotion category prediction results, comparing the user emotion category prediction results under a plurality of commodities, and obtaining user emotion trend distribution under each commodity through summary statistics;
and formulating a corresponding marketing strategy according to the user emotion trend distribution.
8. A user emotion analysis device, comprising:
a first acquisition module, configured to obtain user data in a web crawler manner, wherein the types of the user data comprise at least two of user purchase area comment data, mall forum posting data and mall discussion area comment data;
the second acquisition module is used for extracting the user type from the user data, classifying the user data according to the user type based on the classifier, acquiring all the user data of the same user type, and splicing all the user data of the same user type one by one to acquire the user data to be analyzed;
and the emotion analysis module is used for inputting the user data to be analyzed into a pre-trained user emotion analysis model, extracting commodity attribute features from the user data to be analyzed based on a self-attention mechanism, performing context semantic learning on the commodity attribute features to obtain user comments of each commodity attribute feature, and predicting user emotion categories according to the commodity attribute features and the user comments.
9. A computer device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the user emotion analyzing method as claimed in any one of claims 1 to 7 when executing the computer program.
10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the user emotion analysis method as recited in any of claims 1-7.
CN202111015068.3A 2021-08-31 2021-08-31 User emotion analysis method, device and equipment and storage medium Pending CN113722487A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111015068.3A CN113722487A (en) 2021-08-31 2021-08-31 User emotion analysis method, device and equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111015068.3A CN113722487A (en) 2021-08-31 2021-08-31 User emotion analysis method, device and equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113722487A true CN113722487A (en) 2021-11-30

Family

ID=78680105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111015068.3A Pending CN113722487A (en) 2021-08-31 2021-08-31 User emotion analysis method, device and equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113722487A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GR1010537B (en) * 2022-11-10 2023-09-05 Παναγιωτης Τσαντιλας Sentiment analysis of website content



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination