CN114861027A - Multi-dimensional public opinion recommendation method based on big data and natural language processing - Google Patents

Publication number
CN114861027A
Authority
CN
China
Prior art keywords
public opinion
keyword
classification
data
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210483561.6A
Other languages
Chinese (zh)
Other versions
CN114861027B (en
Inventor
夏超
贺鹏
周嘉宜
张�杰
黄友汉
倪安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Dongsheng Data Co ltd
Original Assignee
Shenzhen Dongsheng Data Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Dongsheng Data Co ltd
Priority to CN202210483561.6A
Publication of CN114861027A
Application granted
Publication of CN114861027B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multidimensional public opinion recommendation method based on big data and natural language processing, which comprises data acquisition, data access, data cleaning, public opinion scoring, public opinion recommendation and public opinion display.

Description

Multi-dimensional public opinion recommendation method based on big data and natural language processing
Technical Field
The invention relates to the technical field of data processing, in particular to a multidimensional public opinion recommendation method based on big data and natural language processing.
Background
In the big data era, massive amounts of data exist on the internet, and how to recommend relevant public opinion information from this data is an important research topic for enterprises. Common public opinion recommendation techniques include collection and filtering with simple regular-expression rules, text pattern matching, sentiment analysis, and text similarity, but conventional recommendation techniques based on rule matching or pure keyword matching have low accuracy.
Accordingly, the prior art is deficient and needs improvement.
Disclosure of Invention
The invention mainly aims to provide a multidimensional public opinion recommendation method based on big data and natural language processing, aiming at enabling a user to quickly obtain high-quality public opinion information meeting requirements and improving the efficiency of public opinion analysis.
In order to achieve the above object, the invention provides a multidimensional public opinion recommendation method based on big data and natural language processing, which comprises the following steps:
S1: crawling internet public opinion data using web crawler technology, and storing the crawled data in a MySQL database;
S2: using the real-time big data acquisition technology Flink CDC, reading full and incremental data from MySQL in real time, extracting the topic, content, and release date of each webpage from the webpage content, and storing them in the Hive database of a big data cluster;
S3: reading the keyword matching rules set by the user from a keyword table, parsing each rule according to the pattern matching method, and matching it against the content of each record in the Hive data; if the content satisfies one of the keyword rules, the content is considered a keyword match, and the matched data is stored in a cleaned result database;
S4: scoring the public opinion, including public opinion classification scoring, public opinion keyword scoring, and public opinion media scoring, and combining the classification score, keyword score, and media score through an algorithm formula to obtain the total public opinion score; the formula is $S = (\lambda_1 S_c + \lambda_2 S_{ky}) \cdot S_m$,
where $S$ is the total public opinion score, $S_c$ is the public opinion classification score, $S_{ky}$ is the public opinion keyword score, $S_m$ is the public opinion media score, $\lambda_1$ is the weight coefficient of the public opinion classification, and $\lambda_2$ is the weight coefficient of the public opinion keywords; the total score is divided into step intervals, with $S_1$ and $S_2$ as the public opinion importance thresholds;
S5: screening and sorting the results for recommendation along dimensions such as total public opinion score and public opinion classification category; the recommended data is filtered by public opinion classification category, sorted by total score, and sent to the front end for display.
Preferably, in step S1, the stored data structure includes date, URL of the web page, and web page content.
Preferably, the public opinion classification scoring specifically comprises:
performing multi-class classification of the text content using deep learning;
labeling the data to be classified with data labeling software, annotating each record to obtain classification training data;
selecting a classification model, and training it on the classification training data under different parameter settings;
deploying the classification model as an inference interface that predicts on public opinion texts and returns a classification category and the probability of that category;
from the predicted public opinion text categories and category probabilities, selecting the texts whose probability is higher than a set threshold, together with their labels, as training data for later optimization of the classification model;
calculating the public opinion classification score using $S_c = S_{ci} \cdot P_c$,
where $S_c$ is the public opinion classification score, $S_{ci}$ is the score of a given category, and $P_c$ is the probability predicted by the classification model for that category.
Preferably, the public opinion keyword scoring specifically comprises:
providing a keyword list, and performing keyword matching on the public opinion texts;
after all matched keywords of a public opinion text are obtained, calculating the keyword score.
Preferably, calculating the keyword score specifically comprises: defining a keyword inverse density as text length divided by keyword score:
$\mu = \dfrac{\mathrm{len}(text)}{\sum_i S_{k_i}}$
where $\mu$ is the keyword inverse density, $\mathrm{len}(text)$ is the text length, and $\sum_i S_{k_i}$ is the sum of the scores of the keywords;
setting two thresholds $(\mu_{min}, \mu_{max})$ by analyzing the keyword inverse density, with normal public opinion texts lying between the two;
for the analysis, a scatter plot can be drawn, with the keyword inverse density on the ordinate and the rank after sorting on the abscissa; after outliers are removed, the segment corresponding to normal text is taken, and its two boundaries are used as the thresholds:
[Equation image BDA0003624618960000031: definition of $\mu_t$ in terms of $\mu_{min}$ and $\mu_{max}$]
where $\mu_t$ is the factor that determines whether the text is normal, and $\mu_{min}$, $\mu_{max}$ are the boundary thresholds for normal text;
setting step thresholds to apply a coefficient penalty to the keyword score: the distribution of the keyword inverse density is analyzed and the thresholds $\mu_1$, $\mu_2$ are set;
for the analysis, a histogram can be drawn, with the keyword inverse density on the abscissa and the count obtained after bucketing the inverse density values on the ordinate;
the score is multiplied by a penalty coefficient $\mu_n$ according to the inverse density, so that texts with higher keyword density are recommended more strongly;
[Equation image BDA0003624618960000032: step definition of $\mu_n$ in terms of $\mu_{min}$, $\mu_1$, $\mu_2$, $\mu_{max}$]
where $\mu_n$ is the inverse density penalty coefficient, $\mu_{min}$, $\mu_{max}$ are the boundary thresholds for normal text, and $\mu_1$, $\mu_2$ are the step penalty thresholds;
the total keyword score is calculated as:
[Equation image BDA0003624618960000033: $S_{ky}$ in terms of $\sum_i S_{k_i}$, $\mu_t$, and $\mu_n$]
where $S_{ky}$ is the public opinion keyword score, $\sum_i S_{k_i}$ is the sum of the scores of the keywords, $\mu_t$ is the factor that determines whether the text is normal, and $\mu_n$ is the inverse density penalty coefficient.
Preferably, the public opinion media scoring specifically comprises: $S_m = S_{mi}$,
where $S_m$ is the public opinion media score and $S_{mi}$ is the media confidence value.
Compared with the prior art, the invention has the following beneficial effects: using big data and natural language processing technology, and building on a keyword-based scoring algorithm, the method comprehensively evaluates and recommends public opinion information across multiple dimensions, including text classification. It enables users to quickly obtain high-quality public opinion information that meets their requirements, with higher recommendation accuracy than the prior art, thereby improving the efficiency of public opinion analysis.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the structures shown in the drawings without creative efforts.
FIG. 1 is a schematic flow diagram of the overall process of the present invention;
FIG. 2 is a schematic diagram of a public opinion classification scoring process according to the present invention;
FIG. 3 is a schematic diagram of a public opinion keyword scoring process according to the present invention;
FIG. 4 is a schematic diagram of a general public opinion scoring process according to the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The multidimensional public opinion recommendation method based on big data and natural language processing provided by the embodiment comprises the following steps:
S1: crawling internet public opinion data using web crawler technology, and storing the crawled data in a MySQL database; the stored data structure includes the date, the URL of the web page, the web page content, etc.
S2: using the real-time big data acquisition technology Flink CDC, reading full and incremental data from MySQL in real time, extracting from the webpage content the topic (from the <h> and <title> elements of the HTML), the content (from the <body>, <text>, <textarea> elements of the HTML, and the like), and the release date of each webpage, and storing them in the Hive database of a big data cluster;
S3: reading the keyword matching rules set by the user from a keyword table, parsing each rule according to the pattern matching method (a rule may require that the content contain several terms at once, or only one of them; for example, a+b-c matches content that contains both a and b but does not contain c), and matching it against the content of each record in the Hive data (the matching checks whether the content contains one or more keywords as required by the pattern); if the content satisfies one of the keyword rules, the content is considered a keyword match, and the matched data is stored in a cleaned result database;
S4: scoring the public opinion, including public opinion classification scoring, public opinion keyword scoring, and public opinion media scoring, and combining the classification score, keyword score, and media score through an algorithm formula to obtain the total public opinion score; the formula is $S = (\lambda_1 S_c + \lambda_2 S_{ky}) \cdot S_m$,
where $S$ is the total public opinion score, $S_c$ is the public opinion classification score, $S_{ky}$ is the public opinion keyword score, $S_m$ is the public opinion media score, $\lambda_1$ is the weight coefficient of the public opinion classification, and $\lambda_2$ is the weight coefficient of the public opinion keywords; the total score is divided into step intervals, with $S_1$ and $S_2$ as the public opinion importance thresholds;
S5: screening and sorting the results for recommendation along dimensions such as total public opinion score and public opinion classification category; the recommended data is filtered by public opinion classification category, sorted by total score, and sent to the front end for display.
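The rule syntax of step S3 is only illustrated by the example a+b-c. A minimal Python sketch of such a matcher follows, assuming that "+" joins terms that must all appear and "-" introduces terms that must not appear. The function names parse_rule, rule_matches, and record_matches are hypothetical and do not come from the source.

```python
import re


def parse_rule(rule: str) -> tuple[list[str], list[str]]:
    """Split a rule such as 'a+b-c' into required and excluded terms.

    Assumption: the terms themselves contain no '+' or '-' characters.
    """
    required, excluded = [], []
    # Each match is (sign, term); a term without a sign counts as required.
    for sign, term in re.findall(r"([+-]?)([^+-]+)", rule):
        term = term.strip()
        if not term:
            continue
        (excluded if sign == "-" else required).append(term)
    return required, excluded


def rule_matches(rule: str, content: str) -> bool:
    """True if content holds every required term and no excluded term."""
    required, excluded = parse_rule(rule)
    if not required:  # a rule with nothing to require matches nothing
        return False
    has_all = all(term in content for term in required)
    has_none = not any(term in content for term in excluded)
    return has_all and has_none


def record_matches(rules: list[str], content: str) -> bool:
    """A record is kept if it satisfies at least one of the user's rules."""
    return any(rule_matches(rule, content) for rule in rules)
```

In a pipeline like the one described, record_matches would be applied to the content field of each Hive record, and only records returning True would be written to the cleaned result database.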
Specifically, the purpose of public opinion classification scoring is to classify text content into categories with different topics. Different categories are given different scores, and the two fields are stored in the database. The category and the probability of being identified as that category are used as the basis for scoring.
Multi-class classification of the text content is performed using deep learning, as shown in fig. 2. Deep learning text classification is supervised learning, so a large amount of labeled data must be prepared in advance, with each record corresponding to one label. For a multi-class text classification task, each labeled record is a piece of text content, and its label is one of the categories of the classification task.
And labeling the classified data by using data labeling software, and labeling each piece of data to obtain classified training data.
Many classification models can be chosen as the text classifier, such as TextCNN, TextRNN, TextRCNN, FastText, BERT, and ALBERT. The classifiers are trained on the classification training data under different parameter settings, and the model and parameters are selected according to the F1 evaluation metric, the model size, and the inference time; the TextCNN model is finally selected for multi-class text classification. The deep learning model is built with the PyTorch framework: the text is encoded, a torch.nn.Embedding layer performs word embedding on the text, a torch.nn.Conv1d layer performs the convolution operation of the CNN layer, torch.cat concatenates the resulting tensors, a torch.nn.Linear layer performs the fully connected operation, and torch.nn.Dropout randomly drops elements of the tensor. The model can be trained by gradient descent on a GPU.
The TextCNN model can be deployed as an inference interface for predicting the public opinion text, and the interface returns a classified category and the probability of the category.
From the predicted public opinion text categories and category probabilities, the texts whose probability is higher than a threshold (which can be set to 80%) are selected, together with their labels, as training data for later optimization of the classification model. This data can be manually reviewed to improve accuracy. This positive feedback loop can improve the accuracy of the classification model.
Finally, the public opinion classification score is calculated using $S_c = S_{ci} \cdot P_c$,
where $S_c$ is the public opinion classification score, $S_{ci}$ is the score of a given category, and $P_c$ is the probability predicted by the classification model for that category.
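The classification score and the selection of high-confidence predictions for later retraining can be sketched in a few lines of Python. This is an illustration only: the category names and base scores in CATEGORY_SCORES, the Prediction type, and the function names are hypothetical, and the 0.8 default mirrors the 80% threshold mentioned in the description.

```python
from dataclasses import dataclass

# Hypothetical per-category base scores S_ci; in the described system
# these would be read from the database rather than hard-coded.
CATEGORY_SCORES = {"finance": 80.0, "policy": 60.0, "other": 10.0}


@dataclass
class Prediction:
    text: str
    category: str       # category returned by the inference interface
    probability: float  # P_c, the probability of that category


def classification_score(pred: Prediction, category_scores=CATEGORY_SCORES) -> float:
    """Compute S_c = S_ci * P_c for one prediction."""
    return category_scores[pred.category] * pred.probability


def select_for_retraining(preds, threshold: float = 0.8):
    """Keep (text, label) pairs whose probability is above the threshold."""
    return [(p.text, p.category) for p in preds if p.probability > threshold]
```

The pairs returned by select_for_retraining correspond to the candidate training data that the description suggests reviewing manually before it is fed back into the model.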
For public opinion keyword scoring, a keyword list is provided first, containing the fields "keywords" and "scores". All keywords occurring in the text are found, and the found keywords and their scores serve as the basis for scoring; the keyword scoring flow is shown in fig. 3.
The keyword matching algorithm uses an Aho-Corasick (AC) automaton, which combines a Trie (also called a dictionary tree) with the idea of the KMP algorithm. Its core idea is to trade space for time: the common prefixes of the strings are used to reduce query time and achieve high efficiency. All keywords are inserted into a Trie. During matching, the text is fed to the AC automaton character by character from the start of the target string; each match is counted, and on a mismatch the automaton jumps to the position given by its failure pointer and tries to continue matching from there, until the whole text has been processed.
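A compact Python sketch of an Aho-Corasick automaton of the kind described above is given below. It is a generic textbook construction, not the patent's own code; the class name AhoCorasick and the method count_matches are hypothetical.

```python
from collections import deque


class AhoCorasick:
    """Multi-keyword matcher: a Trie plus failure links (KMP-style fallback)."""

    def __init__(self, keywords):
        # Node i is described by children[i], fail[i], and output[i].
        self.children = [{}]
        self.fail = [0]
        self.output = [[]]
        for word in dict.fromkeys(keywords):  # drop duplicates, keep order
            if word:
                self._insert(word)
        self._build_failure_links()

    def _insert(self, word):
        node = 0
        for ch in word:
            if ch not in self.children[node]:
                self.children.append({})
                self.fail.append(0)
                self.output.append([])
                self.children[node][ch] = len(self.children) - 1
            node = self.children[node][ch]
        self.output[node].append(word)

    def _build_failure_links(self):
        # Breadth-first walk; depth-1 nodes keep the root as failure target.
        queue = deque(self.children[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.children[node].items():
                f = self.fail[node]
                while f and ch not in self.children[f]:
                    f = self.fail[f]
                self.fail[child] = self.children[f].get(ch, 0)
                # Inherit keywords that end at the failure target.
                self.output[child] = self.output[child] + self.output[self.fail[child]]
                queue.append(child)

    def count_matches(self, text):
        """Return {keyword: number of occurrences in text}."""
        counts = {}
        node = 0
        for ch in text:
            # On a mismatch, follow failure links instead of restarting.
            while node and ch not in self.children[node]:
                node = self.fail[node]
            node = self.children[node].get(ch, 0)
            for word in self.output[node]:
                counts[word] = counts.get(word, 0) + 1
        return counts
```

Because the automaton works on individual characters, the same sketch applies to Chinese text without word segmentation. The keyword scores would then be looked up from the keyword list for each key of the returned dictionary.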
After all matched keywords of a public opinion text are obtained, the keyword score must be calculated. A long text may match more keywords, so if the keyword score were a linear or monotonically increasing function of the matches, scoring would be biased toward long texts. An algorithm is therefore needed to balance the relation between text length and keyword score.
To this end, a keyword inverse density is defined as text length divided by keyword score:
$\mu = \dfrac{\mathrm{len}(text)}{\sum_i S_{k_i}}$
where $\mu$ is the keyword inverse density, $\mathrm{len}(text)$ is the text length, and $\sum_i S_{k_i}$ is the sum of the scores of the keywords.
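As a small worked illustration of the inverse density, the Python sketch below divides the text length by the summed keyword scores. The function name keyword_inverse_density is hypothetical, and returning infinity for a text without any scored keyword is an assumption made here so that such texts fall outside any finite normal range.

```python
def keyword_inverse_density(text: str, matched_scores: list[float]) -> float:
    """Text length divided by the sum of the matched keyword scores.

    matched_scores holds one score per matched keyword; whether repeated
    occurrences count once or several times is not specified in the
    source and is left to the caller.
    """
    total = sum(matched_scores)
    if total <= 0:
        # Assumption: no scored keywords means "infinitely sparse" text.
        return float("inf")
    return len(text) / total
```

For example, a 100-character text with two keywords scored 5 each has an inverse density of 10, while a 1000-character text with the same two keywords has an inverse density of 100, which shows how the measure separates dense short texts from sparse long ones.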
The keyword inverse density can reflect whether a text is normal text. Two thresholds $(\mu_{min}, \mu_{max})$ are set by analyzing the keyword inverse density, with normal public opinion texts lying between the two.
For the analysis, a scatter plot can be drawn, with the keyword inverse density on the ordinate and the rank after sorting on the abscissa; after outliers are removed, the segment corresponding to normal text is taken, and its two boundaries are used as the thresholds.
[Equation image BDA0003624618960000063: definition of $\mu_t$ in terms of $\mu_{min}$ and $\mu_{max}$]
where $\mu_t$ is the factor that determines whether the text is normal, and $\mu_{min}$, $\mu_{max}$ are the boundary thresholds for normal text.
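The description reads the two boundaries off a sorted scatter plot by eye. A rough programmatic stand-in is sketched below in Python: it trims a fixed share of the sorted values at each end and takes the remaining extremes as the thresholds. The quantile cut-offs (5% and 95%), the function names, and the 0/1 form of the normal-text factor are all assumptions made for illustration, since the original formula for the factor is only available as an image.

```python
def normal_range(densities, lower_q: float = 0.05, upper_q: float = 0.95):
    """Estimate (mu_min, mu_max) by trimming the tails of the sorted values.

    This replaces the manual scatter-plot inspection with fixed quantiles,
    which is a simplification and not the method stated in the source.
    """
    ordered = sorted(densities)
    if not ordered:
        raise ValueError("at least one inverse density value is required")
    last = len(ordered) - 1
    mu_min = ordered[round(lower_q * last)]
    mu_max = ordered[round(upper_q * last)]
    return mu_min, mu_max


def normal_text_factor(mu: float, mu_min: float, mu_max: float) -> float:
    """Assumed form of mu_t: 1.0 inside [mu_min, mu_max], otherwise 0.0."""
    return 1.0 if mu_min <= mu <= mu_max else 0.0
```

With this assumed form, a text whose inverse density falls outside the normal range contributes nothing to the keyword score, which matches the stated purpose of the factor but may differ in detail from the original definition.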
Step thresholds are set to apply a coefficient penalty to the keyword score: the distribution of the keyword inverse density is analyzed and the thresholds $\mu_1$, $\mu_2$ are set.
For the analysis, a histogram can be drawn, with the keyword inverse density on the abscissa and the count obtained after bucketing the inverse density values on the ordinate.
The score is multiplied by a penalty coefficient $\mu_n$ according to the inverse density, with the effect that the higher the keyword density of a text, the more strongly it is recommended.
[Equation image BDA0003624618960000071: step definition of $\mu_n$ in terms of $\mu_{min}$, $\mu_1$, $\mu_2$, $\mu_{max}$]
where $\mu_n$ is the inverse density penalty coefficient, $\mu_{min}$, $\mu_{max}$ are the boundary thresholds for normal text, and $\mu_1$, $\mu_2$ are the step penalty thresholds.
The total keyword score is calculated as:
[Equation image BDA0003624618960000072: $S_{ky}$ in terms of $\sum_i S_{k_i}$, $\mu_t$, and $\mu_n$]
where $S_{ky}$ is the public opinion keyword score, $\sum_i S_{k_i}$ is the sum of the scores of the keywords, $\mu_t$ is the factor that determines whether the text is normal, and $\mu_n$ is the inverse density penalty coefficient.
For public opinion texts, if the inverse density of the keywords is in a normal range, the more keywords are matched, the larger the score is.
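Putting the pieces of the keyword score together, the Python sketch below multiplies the summed keyword scores by a normal-text factor and a step penalty. The multiplicative combination is inferred from the variables listed in the description; the penalty values (1.0, 0.8, 0.5), the ordering mu_min <= mu_1 <= mu_2 <= mu_max, and all function names are hypothetical placeholders rather than values from the source.

```python
def density_penalty(mu: float, mu_1: float, mu_2: float,
                    coefficients=(1.0, 0.8, 0.5)) -> float:
    """Assumed step penalty mu_n: denser text (smaller mu) keeps more score."""
    if mu <= mu_1:
        return coefficients[0]
    if mu <= mu_2:
        return coefficients[1]
    return coefficients[2]


def keyword_score(text: str, matched_scores: list[float],
                  mu_min: float, mu_1: float, mu_2: float, mu_max: float) -> float:
    """Assumed form: S_ky = (sum of keyword scores) * mu_t * mu_n."""
    total = sum(matched_scores)
    if total <= 0:
        return 0.0
    mu = len(text) / total  # keyword inverse density
    # mu_t: 1.0 for text inside the normal range, 0.0 otherwise (assumed).
    mu_t = 1.0 if mu_min <= mu <= mu_max else 0.0
    mu_n = density_penalty(mu, mu_1, mu_2)
    return total * mu_t * mu_n
```

Under these assumptions, a 100-character text with keyword scores 10 and 10 has an inverse density of 5; with thresholds 2, 4, 8, and 20 it is normal text in the middle step, so its score is 20 × 0.8 = 16.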
Public opinion media scoring uses the media information obtained during data acquisition as the basis for scoring. Different media have different confidence levels: sources from serious media have higher confidence, and sources from non-serious media have lower confidence. Based on this dimension, a media database is designed, with fields such as media source and media confidence value.
$S_m = S_{mi}$
where $S_m$ is the public opinion media score and $S_{mi}$ is the media confidence value.
The overall public opinion scoring combines multiple dimensions, such as the public opinion classification score, the public opinion keyword score, and the public opinion media score, into a final score through an algorithm formula.
As shown in fig. 4, the overall scoring process first analyzes the value range of each dimension; for example, the expected value of the public opinion classification dimension is the average of the scores of all classification categories, and the expected value of the public opinion keyword dimension is the expected keyword score of a text of median length. The weight coefficients are then used to balance the expected values of the dimensions against each other.
Overall public opinion scoring algorithm:
$S = (\lambda_1 S_c + \lambda_2 S_{ky}) \cdot S_m$
where $S$ is the total public opinion score, $S_c$ is the public opinion classification score, $S_{ky}$ is the public opinion keyword score, $S_m$ is the public opinion media score, $\lambda_1$ is the weight coefficient of the public opinion classification, and $\lambda_2$ is the weight coefficient of the public opinion keywords.
The total score is divided into step intervals, with $S_1$ and $S_2$ as the public opinion importance thresholds.
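The final combination and the recommendation step can be sketched in Python as follows, taking the total score to be the weighted sum of the classification and keyword scores multiplied by the media score. The level names "high", "medium", and "low", the record field names, and the function names are hypothetical; the source only states that two thresholds split the score into step intervals.

```python
def total_score(s_c: float, s_ky: float, s_m: float,
                lambda_1: float = 1.0, lambda_2: float = 1.0) -> float:
    """S = (lambda_1 * S_c + lambda_2 * S_ky) * S_m."""
    return (lambda_1 * s_c + lambda_2 * s_ky) * s_m


def importance_level(score: float, s_1: float, s_2: float) -> str:
    """Map a total score to a step interval, assuming s_1 <= s_2."""
    if score >= s_2:
        return "high"
    if score >= s_1:
        return "medium"
    return "low"


def recommend(records: list[dict], category: str, top_n: int = 10) -> list[dict]:
    """Filter by classification category, then sort by total score (descending).

    Each record is assumed to be a dict with 'category' and 'score' keys.
    """
    chosen = [r for r in records if r["category"] == category]
    return sorted(chosen, key=lambda r: r["score"], reverse=True)[:top_n]
```

For instance, with a classification score of 40, a keyword score of 16, a media confidence of 0.5, and weights 1.0 and 2.0, the total score is (40 + 32) × 0.5 = 36, which falls in the middle interval for thresholds 20 and 50.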
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (6)

1. A multidimensional public opinion recommendation method based on big data and natural language processing is characterized by comprising the following steps:
S1: crawling internet public opinion data using web crawler technology, and storing the crawled data in a MySQL database;
S2: using the real-time big data acquisition technology Flink CDC, reading full and incremental data from MySQL in real time, extracting the topic, content, and release date of each webpage from the webpage content, and storing them in the Hive database of a big data cluster;
S3: reading the keyword matching rules set by the user from a keyword table, parsing each rule according to the pattern matching method, and matching it against the content of each record in the Hive data; if the content satisfies one of the keyword rules, the content is considered a keyword match, and the matched data is stored in a cleaned result database;
S4: scoring the public opinion, including public opinion classification scoring, public opinion keyword scoring, and public opinion media scoring, and combining the classification score, keyword score, and media score through an algorithm formula to obtain the total public opinion score; the formula is $S = (\lambda_1 S_c + \lambda_2 S_{ky}) \cdot S_m$,
where $S$ is the total public opinion score, $S_c$ is the public opinion classification score, $S_{ky}$ is the public opinion keyword score, $S_m$ is the public opinion media score, $\lambda_1$ is the weight coefficient of the public opinion classification, and $\lambda_2$ is the weight coefficient of the public opinion keywords; the total score is divided into step intervals, with $S_1$ and $S_2$ as the public opinion importance thresholds;
S5: screening and sorting the results for recommendation along dimensions such as total public opinion score and public opinion classification category; the recommended data is filtered by public opinion classification category, sorted by total score, and sent to the front end for display.
2. The method for multi-dimensional public opinion recommendation based on big data and natural language processing as claimed in claim 1, wherein in step S1, the stored data structure includes date, URL of web page, web page content.
3. The method as claimed in claim 1, wherein the public opinion classification scoring specifically comprises:
performing multi-class classification of the text content using deep learning;
labeling the data to be classified with data labeling software, annotating each record to obtain classification training data;
selecting a classification model, and training it on the classification training data under different parameter settings;
deploying the classification model as an inference interface that predicts on public opinion texts and returns a classification category and the probability of that category;
from the predicted public opinion text categories and category probabilities, selecting the texts whose probability is higher than a set threshold, together with their labels, as training data for later optimization of the classification model;
calculating the public opinion classification score using $S_c = S_{ci} \cdot P_c$,
where $S_c$ is the public opinion classification score, $S_{ci}$ is the score of a given category, and $P_c$ is the probability predicted by the classification model for that category.
4. The method as claimed in claim 1, wherein the public opinion keyword scoring specifically comprises:
providing a keyword list, and performing keyword matching on the public opinion texts;
after all matched keywords of a public opinion text are obtained, calculating the keyword score.
5. The method as claimed in claim 4, wherein calculating the keyword score specifically comprises: defining a keyword inverse density as text length divided by keyword score:
$\mu = \dfrac{\mathrm{len}(text)}{\sum_i S_{k_i}}$
where $\mu$ is the keyword inverse density, $\mathrm{len}(text)$ is the text length, and $\sum_i S_{k_i}$ is the sum of the scores of the keywords;
setting two thresholds $(\mu_{min}, \mu_{max})$ by analyzing the keyword inverse density, with normal public opinion texts lying between the two;
for the analysis, a scatter plot can be drawn, with the keyword inverse density on the ordinate and the rank after sorting on the abscissa; after outliers are removed, the segment corresponding to normal text is taken, and its two boundaries are used as the thresholds:
[Equation image FDA0003624618950000023: definition of $\mu_t$ in terms of $\mu_{min}$ and $\mu_{max}$]
where $\mu_t$ is the factor that determines whether the text is normal, and $\mu_{min}$, $\mu_{max}$ are the boundary thresholds for normal text;
setting step thresholds to apply a coefficient penalty to the keyword score: the distribution of the keyword inverse density is analyzed and the thresholds $\mu_1$, $\mu_2$ are set;
for the analysis, a histogram can be drawn, with the keyword inverse density on the abscissa and the count obtained after bucketing the inverse density values on the ordinate;
the score is multiplied by a penalty coefficient $\mu_n$ according to the inverse density, so that texts with higher keyword density are recommended more strongly;
[Equation image FDA0003624618950000031: step definition of $\mu_n$ in terms of $\mu_{min}$, $\mu_1$, $\mu_2$, $\mu_{max}$]
where $\mu_n$ is the inverse density penalty coefficient, $\mu_{min}$, $\mu_{max}$ are the boundary thresholds for normal text, and $\mu_1$, $\mu_2$ are the step penalty thresholds;
the total keyword score is calculated as:
[Equation image FDA0003624618950000032: $S_{ky}$ in terms of $\sum_i S_{k_i}$, $\mu_t$, and $\mu_n$]
where $S_{ky}$ is the public opinion keyword score, $\sum_i S_{k_i}$ is the sum of the scores of the keywords, $\mu_t$ is the factor that determines whether the text is normal, and $\mu_n$ is the inverse density penalty coefficient.
6. The method as claimed in claim 1, wherein the public opinion media scoring specifically comprises: $S_m = S_{mi}$,
where $S_m$ is the public opinion media score and $S_{mi}$ is the media confidence value.
CN202210483561.6A 2022-04-29 2022-04-29 Multi-dimensional public opinion recommendation method based on big data and natural language processing Active CN114861027B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210483561.6A CN114861027B (en) 2022-04-29 2022-04-29 Multi-dimensional public opinion recommendation method based on big data and natural language processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210483561.6A CN114861027B (en) 2022-04-29 2022-04-29 Multi-dimensional public opinion recommendation method based on big data and natural language processing

Publications (2)

Publication Number    Publication Date
CN114861027A (en)     2022-08-05
CN114861027B (en)     2024-06-18

Family

ID=82635162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210483561.6A Active CN114861027B (en) 2022-04-29 2022-04-29 Multi-dimensional public opinion recommendation method based on big data and natural language processing

Country Status (1)

Country Link
CN (1) CN114861027B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017013667A1 (en) * 2015-07-17 2017-01-26 Giridhari Devanathan Method for product search using the user-weighted, attribute-based, sort-ordering and system thereof
CN109145215A (en) * 2018-08-29 2019-01-04 中国平安保险(集团)股份有限公司 Internet public opinion analysis method, apparatus and storage medium
WO2019227710A1 (en) * 2018-05-31 2019-12-05 平安科技(深圳)有限公司 Network public opinion analysis method and apparatus, and computer-readable storage medium
WO2020186627A1 (en) * 2019-03-15 2020-09-24 深圳市赛为智能股份有限公司 Public opinion polarity prediction method and apparatus, computer device, and storage medium
CN112650848A (en) * 2020-12-30 2021-04-13 交控科技股份有限公司 Urban railway public opinion information analysis method based on text semantic related passenger evaluation
CN113569118A (en) * 2021-06-30 2021-10-29 深圳市东信时代信息技术有限公司 Self-media pushing method and device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Shailendra Kumar Singh et al.: "SentiVerb system: classification of social media text using sentiment analysis", 30 July 2019 (2019-07-30) *
Wang Hao (王浩): "Construction of a multi-dimensional public security public opinion analysis model", Information Research (情报探索), no. 03, 15 March 2020 (2020-03-15) *

Also Published As

Publication number Publication date
CN114861027B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
KR101203345B1 (en) Method and system for classifying display pages using summaries
CN102929937B (en) Based on the data processing method of the commodity classification of text subject model
CN107315738B (en) A kind of innovation degree appraisal procedure of text information
CN104199833B (en) The clustering method and clustering apparatus of a kind of network search words
CN110516074B (en) Website theme classification method and device based on deep learning
CN112256939B (en) Text entity relation extraction method for chemical field
CN111950273A (en) Network public opinion emergency automatic identification method based on emotion information extraction analysis
CN110543564B (en) Domain label acquisition method based on topic model
CN110134799B (en) BM25 algorithm-based text corpus construction and optimization method
CN107506472B (en) Method for classifying browsed webpages of students
CN107895303B (en) Personalized recommendation method based on OCEAN model
CN107577671A (en) A kind of key phrases extraction method based on multi-feature fusion
CN108009135A (en) The method and apparatus for generating documentation summary
CN112035658A (en) Enterprise public opinion monitoring method based on deep learning
CN109446423B (en) System and method for judging sentiment of news and texts
CN114048305A (en) Plan recommendation method for administrative penalty documents based on graph convolution neural network
CN111221968A (en) Author disambiguation method and device based on subject tree clustering
CN111339777A (en) Medical related intention identification method and system based on neural network
CN114265935A (en) Science and technology project establishment management auxiliary decision-making method and system based on text mining
CN114003726B (en) Subspace embedding-based academic thesis difference analysis method
CN112395862A (en) Environmental risk perception evaluation method based on data mining
CN117235253A (en) Truck user implicit demand mining method based on natural language processing technology
Widoyono et al. Sentiment analysis of learning from home during pandemic covid-19 in indonesia
CN114861027B (en) Multi-dimensional public opinion recommendation method based on big data and natural language processing
CN115238709A (en) Method, system and equipment for analyzing sentiment of policy announcement network comments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant