CN113569118A - Self-media pushing method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113569118A
Authority
CN
China
Prior art keywords
media
self
target
score
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110741715.2A
Other languages
Chinese (zh)
Other versions
CN113569118B (en)
Inventor
刘杨
熊焕卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donson Times Information Technology Co ltd
Original Assignee
Donson Times Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donson Times Information Technology Co ltd
Priority to CN202110741715.2A
Publication of CN113569118A
Application granted
Publication of CN113569118B
Current legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/951 - Indexing; Web crawling techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/958 - Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a self-media pushing method and device, computer equipment, and a storage medium. The method comprises: collecting public opinion data of the works of a target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform; analyzing each piece of public opinion data to obtain an analysis result; determining, based on the analysis result, a target classified crowd corresponding to the target self-media; determining a comprehensive score of the target self-media based on a preset scoring weight corresponding to each basic platform; determining a marketing index of the target self-media according to the comprehensive score; determining, based on the marketing index, a target platform to be recommended from among the basic platforms; and recommending the target self-media to the target classified crowd in the target platform.

Description

Self-media pushing method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of data processing, and in particular, to a method and an apparatus for self-media push, a computer device, and a medium.
Background
The internet provides users with rich information resources. With the rapid development of internet technology, more and more self-media (individuals or brands) draw on their own resources to share information with users through various platforms. The existing approach usually evaluates and positions a self-media according to its performance on a single platform, and pushes the self-media to the correspondingly classified crowd of that platform according to the result. In the course of realizing the invention, the inventors recognized that the existing approach has at least the following problem: when a self-media shares information on a plurality of platforms, how to position and recommend the self-media accurately is an urgent problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a self-media pushing method and device, computer equipment and a storage medium, and aims to improve the self-media pushing accuracy.
In order to solve the foregoing technical problem, an embodiment of the present application provides a self-media pushing method, including:
collecting public opinion data of the works of a target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform;
analyzing each public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
determining a marketing index of the target self-media according to the comprehensive score;
and determining a target platform to be recommended from each basic platform based on the marketing index, and recommending the target self-media to target classified crowds in the target platform.
Optionally, the composite score is expressed in the form of a multidimensional score map.
Optionally, the collecting public opinion data of a work of a target self-media on each base platform, and obtaining the public opinion data of the target self-media on each base platform includes:
acquiring a uniform resource locator corresponding to each basic platform;
for each basic platform, crawling and parsing, by means of a web crawler, the page files corresponding to the uniform resource locator to obtain the page file of the work corresponding to the target self-media as a target page;
and for each basic platform, extracting, by fuzzy matching, public opinion information related to the work of the target self-media from the content of the target page, as the public opinion data of the target self-media on that basic platform.
Optionally, the public opinion data includes at least one of interaction data, work content, and comment data.
Optionally, the analyzing each public opinion data to obtain an analysis result includes:
performing statistical weighting on the interaction data according to the preset weight of each type of interaction data to obtain a first score, wherein the interaction data comprises at least one of likes, favorites, views and forwards;
analyzing the content of the work, and grading the quality of the work according to an analysis result to obtain a second score;
carrying out semantic recognition on the comment data, and grading according to the obtained semantic recognition result to obtain a third score;
and determining assessment information of the public opinion data as the analysis result based on the first score, the second score and the third score.
Optionally, the performing semantic recognition on the comment data, and scoring according to the obtained semantic recognition result to obtain a third score includes:
for the same user name, if the number of user evaluations corresponding to the user name exceeds a preset threshold, selecting a number of user evaluations equal to the preset threshold as the effective evaluations of the user name, and if the number of user evaluations corresponding to the user name does not exceed the preset threshold, taking each user evaluation corresponding to the user name as an effective evaluation;
performing sentiment analysis on each effective evaluation by semantic analysis to obtain an approval degree corresponding to each effective evaluation;
and comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain the third score.
Optionally, the performing, by using a semantic analysis mode, evaluation emotion analysis on each effective evaluation to obtain an approval degree corresponding to each effective evaluation includes:
extracting keywords contained in the effective evaluation by a preset word segmentation mode;
training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
performing clustering analysis on the space word vectors based on the K-Means clustering algorithm to obtain a clustering analysis result;
and calculating the Euclidean distance between the clustering analysis result and each preset approval degree in a preset approval degree set, and taking the preset approval degree with the minimum Euclidean distance as the approval degree corresponding to the effective evaluation.
In order to solve the foregoing technical problem, an embodiment of the present application further provides a self-media pushing apparatus, including:
the data acquisition module is used for collecting public opinion data of the works of the target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform;
the data evaluation module is used for analyzing each public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
the index determining module is used for determining the marketing index of the target self-media according to the comprehensive score;
and the target recommending module is used for determining a target platform to be recommended from each basic platform based on the marketing index and recommending the target self-media to the target classified crowd in the target platform.
Optionally, the data acquisition module comprises:
the resource positioning unit is used for acquiring a uniform resource locator corresponding to each basic platform;
the page determining unit is used for crawling and parsing, for each basic platform and by means of a web crawler, the page files corresponding to the uniform resource locator to obtain the page file of the work corresponding to the target self-media as a target page;
and the data crawling unit is used for extracting, for each basic platform and by fuzzy matching, public opinion information related to the work of the target self-media from the content of the target page, as the public opinion data of the target self-media on that basic platform.
Optionally, the data evaluation module comprises:
the interaction score evaluation unit is used for performing statistical weighting on the interaction data according to the preset weight of each type of interaction data to obtain a first score, wherein the interaction data comprises at least one of likes, favorites, views and forwards;
the quality score evaluation unit is used for analyzing the content of the work and scoring the quality of the work according to the analysis result to obtain a second score;
the comment score evaluation unit is used for carrying out semantic recognition on the comment data and scoring according to the obtained semantic recognition result to obtain a third score;
a result generation unit configured to determine evaluation information of the public opinion data as the analysis result based on the first score, the second score, and the third score.
Optionally, the comment score evaluating unit includes:
the effective evaluation screening subunit is used for, for the same user name, selecting a number of user evaluations equal to the preset threshold as the effective evaluations of the user name if the number of user evaluations corresponding to the user name exceeds the preset threshold, and taking each user evaluation corresponding to the user name as an effective evaluation if the number of user evaluations corresponding to the user name does not exceed the preset threshold;
the semantic analysis subunit is used for performing sentiment analysis on each effective evaluation by semantic analysis to obtain an approval degree corresponding to each effective evaluation;
and the score evaluation subunit is used for comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain the third score.
Optionally, the semantic analysis subunit includes:
the word segmentation extraction component is used for extracting the keywords contained in the effective evaluation by a preset word segmentation mode;
the word vector generating component is used for training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
the word segmentation and clustering component is used for performing clustering analysis on the space word vectors based on the K-Means clustering algorithm to obtain a clustering analysis result;
and the approval degree calculation component is used for calculating the Euclidean distance between the clustering analysis result and each preset approval degree in a preset approval degree set, and taking the preset approval degree with the minimum Euclidean distance as the approval degree corresponding to the effective evaluation.
In order to solve the technical problem, an embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the self-media pushing method when executing the computer program.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the self-media pushing method described above.
The self-media pushing method and device, computer equipment and storage medium provided by the embodiments of the invention collect public opinion data of the works of the target self-media on each basic platform, analyze each piece of public opinion data to obtain an analysis result, determine the target classified crowd corresponding to the target self-media based on the analysis result, and determine the comprehensive score of the target self-media based on the preset score weight corresponding to each basic platform. The marketing index of the target self-media is then determined according to the comprehensive score, a target platform to be recommended is determined from the basic platforms based on the marketing index, and the target self-media is recommended to the target classified crowd in the target platform. The target self-media is thereby positioned quickly and recommended to the target classified crowd in the platform matching that positioning, which helps improve the accuracy of pushing the target self-media.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a self-media push method of the present application;
FIG. 3 is a schematic diagram illustrating an embodiment of a self-media push device according to the present application;
FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the self-media pushing method provided in the embodiment of the present application is executed by a server, and accordingly, the self-media pushing apparatus is disposed in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs, and the terminal devices 101, 102 and 103 in this embodiment may specifically correspond to an application system in actual production.
Referring to fig. 2, fig. 2 shows a self-media pushing method according to an embodiment of the present invention, which is described by taking the application of the method to the server in fig. 1 as an example, and is detailed as follows:
S201: Collecting public opinion data of the works of the target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform.
Optionally, the public opinion data includes at least one of interaction data, work content, and comment data.
The target self-media specifically includes but is not limited to an individual author, a brand party and the like, and the released works include but are not limited to: text, pictures, short videos, active documents, and the like.
For a specific way of collecting public opinion data, reference may be made to the description of the following embodiments, and details are not repeated here to avoid repetition.
S202: analyzing each public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform.
Optionally, the composite score is expressed in the form of a multidimensional score graph.
The multidimensional score graph is a graph generated from score data of multiple dimensions; the number of dimensions can be set according to actual needs and is not limited here. For example, one specific implementation uses a six-dimensional score chart: evaluation scores for the six dimensions are obtained by analyzing the public opinion data, and a hexagonal score chart is then generated and stored, in visual form, with the data corresponding to the target self-media.
S203: Determining the marketing index of the target self-media according to the comprehensive score.
The marketing index refers to the market positioning index of the target self-media on each basic platform.
S204: Determining a target platform to be recommended from the basic platforms based on the marketing index, and recommending the target self-media to the target classified crowd in the target platform.
In this embodiment, public opinion data of the works of the target self-media on each basic platform is collected to obtain the public opinion data of the target self-media on each basic platform. Each piece of public opinion data is analyzed to obtain an analysis result, the target classified crowd corresponding to the target self-media is determined based on the analysis result, and the comprehensive score of the target self-media is determined based on the preset score weight corresponding to each basic platform. The marketing index of the target self-media is determined according to the comprehensive score, a target platform to be recommended is determined from the basic platforms based on the marketing index, and the target self-media is recommended to the target classified crowd in the target platform. The target self-media is thereby positioned quickly and recommended to the target classified crowd in the platform matching that positioning, which improves the accuracy of pushing the target self-media.
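The scoring and selection in steps S202 to S204 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not give the formula for the comprehensive score or the marketing index, so the weighted average, the threshold rule, and the names `composite_score`, `select_target_platforms`, `platform_a` and so on are all assumptions made for the example.

```python
def composite_score(platform_scores, platform_weights):
    """Weighted average of per-platform scores using preset platform weights.

    Assumption: the comprehensive score is a normalized weighted sum; the
    source only says it is based on preset per-platform score weights.
    """
    total_weight = sum(platform_weights[name] for name in platform_scores)
    if total_weight <= 0:
        raise ValueError("platform weights must sum to a positive value")
    weighted = sum(score * platform_weights[name]
                   for name, score in platform_scores.items())
    return weighted / total_weight


def select_target_platforms(marketing_index, threshold):
    """Return platforms whose (hypothetical) marketing index meets the
    threshold, ordered from highest index to lowest."""
    chosen = [name for name, value in marketing_index.items() if value >= threshold]
    return sorted(chosen, key=lambda name: marketing_index[name], reverse=True)
```

For instance, per-platform scores of 80 and 60 with weights 0.75 and 0.25 give a comprehensive score of 75 under this assumed formula.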
In an embodiment, in step S201, collecting public opinion data of the works of the target self-media on each basic platform to obtain the public opinion data of the target self-media on each basic platform includes:
acquiring a uniform resource locator corresponding to each basic platform;
for each basic platform, crawling and parsing, by means of a web crawler, the page files corresponding to the uniform resource locator to obtain the page file of the work corresponding to the target self-media as a target page;
and for each basic platform, extracting, by fuzzy matching, public opinion information related to the work of the target self-media from the content of the target page, as the public opinion data of the target self-media on that basic platform.
Specifically, before the works of the target self-media on each basic platform are crawled, the uniform resource locator corresponding to each basic platform needs to be acquired. The uniform resource locator of a basic platform corresponds to a plurality of page files, each page file corresponds to one work, and the public opinion data of a work can be obtained through the page file corresponding to the uniform resource locator.
A Uniform Resource Locator (URL) is a compact representation of the location and access method of a Resource available from the internet, and is an address of a standard Resource on the internet.
A general-purpose web crawler covers a large crawling range and a large number of pages, so it places high demands on crawling speed and storage space and relatively low demands on the order in which pages are crawled. Because the number of pages to be refreshed is very large, such crawlers generally work in parallel. Structurally, a web crawler can be roughly divided into a page crawling module, a page analysis module, a link filtering module, a page database, a URL queue and an initial URL set. To improve efficiency, a general-purpose web crawler adopts a crawling strategy; common strategies are the depth-first strategy and the breadth-first strategy.
The depth-first strategy visits web page links in order of increasing depth, following links to the next level until no deeper link can be followed. After completing one crawling branch, the crawler returns to the previous link node and searches its other links. The crawling task ends when all links have been traversed.
The breadth-first strategy crawls pages according to the depth of the content directory hierarchy of the web pages, so pages at shallower levels are crawled first. After the pages at one level have been crawled, the crawler moves on to the next level. This strategy can effectively control the crawling depth, avoids the problem that crawling cannot finish when an infinitely deep branch is encountered, is convenient to implement, and does not need to store a large number of intermediate nodes.
Preferably, the crawling strategy adopted in the embodiment of the present invention is the breadth-first strategy. The uniform resource locator corresponding to each basic platform is crawled first to obtain the page files corresponding to the preset uniform resource locator, and each page file is then crawled to obtain the public opinion information of the work it contains. This avoids the extra time overhead of crawling large amounts of useless information and improves crawling efficiency.
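The breadth-first strategy described above can be sketched as a level-by-level traversal with a depth limit. This is an illustrative sketch only: `breadth_first_crawl`, `fetch_links` and `max_depth` are hypothetical names, and `fetch_links` stands in for whatever page fetching and link extraction a real crawler would perform, so the example runs without network access.

```python
from collections import deque


def breadth_first_crawl(seed_urls, fetch_links, max_depth):
    """Visit pages level by level, starting from the seed URLs.

    fetch_links(url) is a caller-supplied (hypothetical) function that
    returns the links found on a page. Pages deeper than max_depth are
    never visited, which bounds the crawl even on endless link chains.
    """
    visited = []            # pages in the order they were crawled
    seen = set()            # URLs already queued, to avoid revisiting
    queue = deque()
    for url in seed_urls:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))
    while queue:
        url, depth = queue.popleft()   # FIFO order gives breadth-first traversal
        visited.append(url)
        if depth >= max_depth:
            continue                   # do not expand links below the depth limit
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Replacing the FIFO queue with a stack (popping from the same end that is pushed) would turn the same loop into a depth-first crawl, which is the main structural difference between the two strategies.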
The fuzzy matching methods include but are not limited to: fuzzy matching based on a string pattern matching (Horspool) algorithm, fuzzy matching of search terms based on a Trie tree, fuzzy matching based on a jQuery selector, and the like.
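As an illustration of the first option, the following is a minimal sketch of Horspool string search. Note that Horspool itself locates exact occurrences of a pattern; any tolerance for spelling or formatting variation would have to be layered on top of it, and the source does not say how. The function name `horspool_find` and the sample text are assumptions made for the example.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Uses the Horspool bad-character rule: after each comparison window,
    shift by a distance determined by the last character of the window.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For every pattern character except the last one, record the distance
    # from its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters that do not occur in the pattern allow a full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

On a target page fragment such as `"likes: 120 comments: 45"`, searching for `"comments"` returns the position where the comment count label begins, from which the adjacent value could then be read.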
In this embodiment, the uniform resource locator corresponding to each basic platform is acquired. For each basic platform, the page files corresponding to the uniform resource locator are crawled and parsed by a web crawler to obtain the page file of the work corresponding to the target self-media, which is used as the target page. Public opinion information related to the work of the target self-media is then extracted from the content of the target page by fuzzy matching and used as the public opinion data of the target self-media on that basic platform. The public opinion data of the target self-media is thus collected from the network automatically, which saves collection time and improves collection efficiency.
In a specific optional embodiment, in step S202, analyzing each public opinion data to obtain an analysis result includes:
performing statistical weighting on the interaction data according to the preset weight of each type of interaction data to obtain a first score, wherein the interaction data comprises at least one of likes, favorites, views and forwards;
analyzing the content of the work, and grading the quality of the work according to an analysis result to obtain a second score;
carrying out semantic recognition on the comment data, and grading according to the obtained semantic recognition result to obtain a third score;
and determining evaluation information of the public opinion data as an analysis result based on the first score, the second score and the third score.
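The first score and the resulting evaluation information can be sketched as follows. This is a hedged illustration: the source does not give the weights or the way the three scores are combined, so the weighted sum, the dictionary layout, and the names `first_score` and `evaluation_info` are assumptions.

```python
def first_score(interaction_counts, interaction_weights):
    """Weighted sum of interaction counts (likes, favorites, views, forwards).

    Interaction types without a preset weight contribute nothing.
    """
    return sum(interaction_weights.get(kind, 0.0) * count
               for kind, count in interaction_counts.items())


def evaluation_info(first, second, third):
    """Bundle the three scores as the analysis result for one platform.

    Assumption: the evaluation information simply records each score; the
    source leaves the exact form of the analysis result open.
    """
    return {"interaction": first, "quality": second, "comment": third}
```

With 10 likes, 2 favorites, 100 views and 1 forward, and hypothetical weights of 1.0, 2.0, 0.25 and 4.0, the first score under this sketch is 43.0.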
In a specific optional implementation manner, performing semantic recognition on the comment data, and scoring according to an obtained semantic recognition result to obtain a third score includes:
for the same user name, if the number of user evaluations corresponding to the user name exceeds a preset threshold, selecting a number of user evaluations equal to the preset threshold as the effective evaluations of the user name, and if the number of user evaluations corresponding to the user name does not exceed the preset threshold, taking each user evaluation corresponding to the user name as an effective evaluation;
performing sentiment analysis on each effective evaluation by means of semantic analysis to obtain an approval degree corresponding to each effective evaluation;
and comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain a third score.
Specifically, a user may comment on the same work repeatedly. To keep such repetition from interfering with the analysis, this embodiment sets a preset threshold on the number of user evaluations that one user may contribute for the same work. When the number of a user's evaluations of the same work exceeds the preset threshold, a number of user evaluations equal to the preset threshold is selected as that user's effective evaluations of the work; when the number does not exceed the preset threshold, each user evaluation corresponding to the user name is taken as an effective evaluation. After the effective evaluations are determined, the semantics of each effective evaluation are analyzed by means of semantic analysis to obtain the user's approval degree of the application as expressed in that effective evaluation. A corresponding evaluation mode is preset for each approval degree, the approval degrees corresponding to the effective evaluations of the same work are evaluated, and the work is then evaluated comprehensively to obtain the third score of the work corresponding to the target self-media.
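The screening of effective evaluations can be sketched as follows. The embodiment does not say which evaluations are kept when a user exceeds the threshold; this sketch assumes the earliest ones are kept, and the function name is illustrative.

```python
from collections import defaultdict


def select_effective_evaluations(evaluations: list[tuple[str, str]],
                                 threshold: int) -> list[tuple[str, str]]:
    """Keep at most `threshold` evaluations per user name.

    `evaluations` is a list of (user_name, text) pairs in arrival order.
    Assumption: when a user exceeds the threshold, the earliest ones are kept.
    """
    kept_per_user: dict[str, int] = defaultdict(int)
    effective = []
    for user, text in evaluations:
        if kept_per_user[user] < threshold:
            kept_per_user[user] += 1
            effective.append((user, text))
    return effective
```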
The implementation methods of semantic analysis include, but are not limited to: Natural Language Processing (NLP), natural language semantic analysis based on an N-Gram model, clustering based on word vectors, and the like.
Preferably, the embodiment adopts a clustering algorithm based on word vectors to realize semantic analysis on effective evaluation.
The preset evaluation mode may be set according to actual requirements, for example, different scores may be set for different recognition degrees, and is not limited herein.
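One possible preset evaluation mode assigns a score to each approval degree and averages over the effective evaluations. The labels, score values, and the choice of an arithmetic mean below are all assumptions made for illustration.

```python
# Hypothetical mapping from approval degree to score.
APPROVAL_SCORES = {"strongly approve": 100, "approve": 75, "neutral": 50, "disapprove": 25}


def third_score(approval_degrees: list[str],
                scores: dict[str, int] = APPROVAL_SCORES) -> float:
    """Average the preset scores of the approval degrees of all effective evaluations."""
    if not approval_degrees:
        return 0.0
    return sum(scores[degree] for degree in approval_degrees) / len(approval_degrees)
```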
In a specific optional implementation manner, performing evaluation emotion analysis on each effective evaluation by using a semantic analysis manner, and obtaining an approval degree corresponding to each effective evaluation includes:
extracting keywords contained in the effective comments by adopting a preset word segmentation mode;
training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
performing clustering analysis on the space word vectors based on a K-Means clustering algorithm to obtain a clustering analysis result;
and calculating the Euclidean distance between the cluster analysis result and each preset approval degree in the preset approval degree set, and taking the obtained preset approval degree with the minimum Euclidean distance value as the approval degree corresponding to the effective comment.
Specifically, the effective comments are subjected to word segmentation processing through a third-party word segmentation tool or a word segmentation algorithm to obtain at least one keyword, wherein the specific number of the keyword is determined according to a word segmentation result.
Common third-party word segmentation tools include, but are not limited to: the Stanford NLP word segmenter, the ICTCLAS word segmentation system, the ansj word segmentation tool, the HanLP Chinese word segmentation tool, and the like.
Word segmentation algorithms include, but are not limited to: the forward Maximum Matching (MM) algorithm, the Reverse Maximum Matching (RMM) algorithm, the Bi-directional Maximum Matching (BM) algorithm, the Hidden Markov Model (HMM), the N-gram model, and the like.
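As an illustration of one of the listed algorithms, the following is a minimal sketch of forward maximum matching. The dictionary contents are made up for the example, and falling back to a single character when no dictionary word matches is a common convention assumed here.

```python
def forward_max_match(text: str, dictionary: set[str]) -> list[str]:
    """Greedy left-to-right segmentation that always takes the longest dictionary word."""
    max_len = max(map(len, dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens
```

Because the match is greedy, the result depends on scan direction, which is why reverse and bi-directional variants are listed alongside it.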
It is easy to see that extracting keywords by word segmentation filters out meaningless words in the effective comments on the one hand and, on the other hand, facilitates the subsequent generation of space word vectors from the keywords.
In artificial intelligence, word vector representation refers primarily to a formal or mathematical description of a language, so that the language can be represented in a computer and processed automatically by a computer program. In this embodiment, each keyword is represented in the form of a vector.
Specifically, each keyword is mapped into vectors according to a preset corpus, the vectors are connected together to form a word vector space, each vector corresponds to a point in the space, and each vector is used as a space word vector.
For example, suppose a product name contains two segmented words to be matched, "BMW" and "Gallop", and all possible classifications of the two words are obtained from the preset corpus: "car", "luxury", "animal", "action", and "food". A vector representation is therefore introduced for the two words:
<car, luxury, animal, action, food>
The probability that each of the two words belongs to each classification is calculated by a statistical learning method. The probabilities learned by the computer are:
BMW = <0.5, 0.2, 0.2, 0.0, 0.1>
Gallop = <0.7, 0.2, 0.0, 0.1, 0.0>
It will be appreciated that the values of each dimension of the space word vector represent a feature that has some semantic and grammatical interpretation.
The space word vector of each keyword is constructed from the preset corpus, so that text a machine cannot interpret directly is converted into word vectors that it can readily identify and operate on. The approval degree of the application program expressed in an effective evaluation can then be obtained by analyzing the keywords in that effective evaluation.
Further, after the space word vectors are constructed, the space distance between the space word vector and other space vectors is calculated for each space word vector corresponding to the effective evaluation, the space word vectors with the space distance exceeding a preset space distance threshold value with the other space word vectors are confirmed as invalid word vectors, and the invalid word vectors are removed, so that each space word vector can represent the semantics of the keywords corresponding to the space word vector in the effective evaluation as correctly as possible.
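The removal of invalid word vectors can be sketched as below. The text does not state how the distance to the "other" vectors is aggregated; this sketch assumes the mean Euclidean distance to all other vectors of the same effective evaluation, and the function name is illustrative.

```python
import math


def remove_invalid_vectors(vectors: list[tuple[float, ...]],
                           max_distance: float) -> list[tuple[float, ...]]:
    """Drop vectors whose mean distance to the other vectors exceeds max_distance.

    Assumption: "spatial distance to the other vectors" is read as the mean
    Euclidean distance; the source does not fix the aggregation.
    """
    if len(vectors) < 2:
        return list(vectors)
    kept = []
    for i, v in enumerate(vectors):
        distances = [math.dist(v, w) for j, w in enumerate(vectors) if j != i]
        if sum(distances) / len(distances) <= max_distance:
            kept.append(v)
    return kept
```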
Further, the space word vectors corresponding to the same effective evaluation are clustered to obtain the clustering result for that effective evaluation. Preferably, this embodiment uses the K-Means clustering algorithm to cluster the space word vectors.
The K-Means algorithm is a distance-based clustering algorithm that uses distance as the measure of similarity: the closer two objects are, the more similar they are. The algorithm treats a cluster as a set of objects that lie close together, so its final goal is to obtain compact and well-separated clusters.
In this embodiment, the cluster analysis of the space word vectors with the K-Means clustering algorithm proceeds as follows:
taking the word vectors corresponding to the preset parts of speech as clustering centers;
aiming at each space word vector in effective evaluation, calculating a first distance between the space word vector and each current clustering center, and putting the space word vector into a cluster where the clustering center corresponding to the minimum first distance is located to obtain m temporary clusters;
aiming at each temporary cluster, calculating the mean value of the temporary cluster and a second distance between each space word vector in the temporary cluster and the mean value, and selecting the space word vector corresponding to the minimum second distance as a new cluster center of the temporary cluster to obtain m updated temporary clusters;
the standard deviation of each updated temporal cluster is calculated as follows:
Figure BDA0003141613850000161
where σ is the standard deviation, A_i is the i-th space word vector in the updated temporary cluster, n is the number of space word vectors in the updated temporary cluster, μ is the mean of the updated temporary cluster containing the space word vector A_i, i ∈ [1, n], and i and n are positive integers;
if at least one of the standard deviations of the m updated temporary clusters is greater than or equal to a preset standard deviation threshold, returning to the step of calculating, for each space word vector in the effective evaluation, a first distance between the space word vector and each current cluster center and placing the space word vector into the cluster whose cluster center corresponds to the minimum first distance to obtain m temporary clusters;
and if the standard deviations of all m updated temporary clusters are smaller than the standard deviation threshold, taking the cluster centers of the m updated temporary clusters as the cluster analysis result.
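The clustering steps above can be sketched as follows. Several points are assumptions rather than part of the described method: the standard deviation is taken as the population form, the square root of the mean squared Euclidean distance to the cluster mean (the formula image is not reproduced in this text); an empty cluster keeps its previous center; and the loop also stops when the centers no longer change or after `max_iter` rounds, to guarantee termination.

```python
import math

Vector = tuple[float, ...]


def cluster_word_vectors(vectors: list[Vector], initial_centers: list[Vector],
                         std_threshold: float, max_iter: int = 100) -> list[Vector]:
    """Return the cluster centers after the assign/update loop described above."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):  # max_iter is a safety guard, not from the source
        # Step 1: put each vector into the cluster of its nearest current center.
        clusters: list[list[Vector]] = [[] for _ in centers]
        for v in vectors:
            nearest = min(range(len(centers)), key=lambda k: math.dist(v, centers[k]))
            clusters[nearest].append(tuple(v))
        # Step 2: the new center is the member closest to the cluster mean.
        new_centers, stds = [], []
        for center, members in zip(centers, clusters):
            if not members:  # assumption: an empty cluster keeps its center
                new_centers.append(center)
                stds.append(0.0)
                continue
            mean = tuple(sum(col) / len(members) for col in zip(*members))
            new_centers.append(min(members, key=lambda m: math.dist(m, mean)))
            # Step 3: population standard deviation around the mean (assumed form).
            stds.append(math.sqrt(
                sum(math.dist(m, mean) ** 2 for m in members) / len(members)))
        # Step 4: stop when every cluster is compact enough or nothing changes.
        done = all(s < std_threshold for s in stds) or new_centers == centers
        centers = new_centers
        if done:
            break
    return centers
```

Choosing an actual member as the new center, rather than the mean itself, follows the step description and makes this closer to a medoid-style update than to textbook K-Means.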
Further, each preset approval degree in the preset approval degree set is converted into a word vector, the Euclidean distance between the cluster analysis result and the word vector corresponding to each preset approval degree is calculated, and the preset approval degree whose word vector has the minimum Euclidean distance is taken as the approval degree corresponding to the effective comment.
The approval degree refers to the user's preference for and approval of the application program as expressed in the user evaluation; it may be set according to actual requirements and is not specifically limited here.
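The nearest-approval lookup can be sketched as below. The text does not say how a cluster analysis result made of several cluster centers is compared with a single word vector; this sketch assumes the centers are averaged into one summary vector first. The approval labels and vectors in the usage are invented for illustration.

```python
import math


def nearest_approval_degree(cluster_centers: list[tuple[float, ...]],
                            approval_vectors: dict[str, tuple[float, ...]]) -> str:
    """Return the preset approval degree whose word vector is closest to the result.

    Assumption: the cluster centers are averaged into a single summary vector.
    """
    summary = tuple(sum(col) / len(cluster_centers) for col in zip(*cluster_centers))
    return min(approval_vectors,
               key=lambda name: math.dist(summary, approval_vectors[name]))


# Hypothetical two-dimensional approval vectors.
presets = {"approve": (1.0, 0.0), "neutral": (0.5, 0.5), "disapprove": (0.0, 1.0)}
degree = nearest_approval_degree([(0.9, 0.1), (0.7, 0.1)], presets)
```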
In this embodiment, the keywords contained in the effective comment are extracted by a preset word segmentation mode, and the keywords are trained in a word vector mode to obtain the space word vectors corresponding to the keywords. The space word vectors are clustered with the K-Means clustering algorithm to obtain a cluster analysis result, the Euclidean distance between the cluster analysis result and each preset approval degree in the preset approval degree set is calculated, and the preset approval degree with the minimum Euclidean distance is taken as the approval degree corresponding to the effective comment. Converting the user comment into space word vectors and clustering them thus yields the user sentiment contained in the effective comment, that is, the user's approval degree of the work. This realizes intelligent analysis of effective comments, speeds up their analysis, and improves the efficiency of evaluating works.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 3 shows a schematic block diagram of a self-media pushing device that corresponds one-to-one to the self-media pushing method of the above embodiment. As shown in fig. 3, the self-media pushing device includes a data acquisition module 31, a data evaluation module 32, an index determination module 33, and a target recommendation module 34. The functional modules are explained in detail as follows:
the data acquisition module 31 is used for acquiring the public sentiment data of the works of the target self-media on each basic platform to obtain the public sentiment data of the target self-media on each basic platform;
the data evaluation module 32 is used for analyzing each public sentiment data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
the index determining module 33 is used for determining the marketing index of the target self-media according to the comprehensive score;
and the target recommending module 34 is used for determining a target platform to be recommended from each basic platform based on the marketing index and recommending the target self-media to the target classified crowd in the target platform.
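The comprehensive score computed by the data evaluation module 32 can be illustrated with a small sketch. It assumes each base platform contributes one numeric score and that the preset score weights are normalized by their sum; the platform names and the normalization are assumptions, since this passage does not spell out the formula.

```python
def composite_score(platform_scores: dict[str, float],
                    platform_weights: dict[str, float]) -> float:
    """Weighted average of per-platform scores using preset platform weights.

    Assumption: weights are normalized by their sum over the scored platforms.
    """
    total_weight = sum(platform_weights[p] for p in platform_scores)
    if total_weight == 0:
        return 0.0
    weighted = sum(platform_scores[p] * platform_weights[p] for p in platform_scores)
    return weighted / total_weight
```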
Optionally, the data acquisition module 31 includes:
the resource positioning unit is used for acquiring a uniform resource locator corresponding to each basic platform;
the page determining module is used for crawling and analyzing the page file corresponding to the uniform resource locator in a web crawler mode aiming at each basic platform to obtain the page file of the work corresponding to the target self-media as a target page;
and the data crawling unit is used for extracting public sentiment information related to the corresponding works of the target self-media from the content contained in the target page in a fuzzy matching mode aiming at each basic platform to serve as the public sentiment data of the target self-media on the basic platform.
Optionally, the data evaluation module 32 comprises:
the interactive score evaluation unit is used for carrying out statistical weighting on the interactive data according to the preset weight of each type of interactive data to obtain a first score, wherein the interactive data comprises at least one of praise, collection, browsing and forwarding;
the quality score evaluation unit is used for analyzing the content of the work and scoring the quality of the work according to the analysis result to obtain a second score;
the comment score evaluation unit is used for carrying out semantic recognition on the comment data and scoring according to the obtained semantic recognition result to obtain a third score;
and a result generation unit for determining evaluation information of the public opinion data as an analysis result based on the first score, the second score and the third score.
Optionally, the comment score evaluating unit includes:
the effective comment screening subunit is used for, for the same user name, selecting a number of user evaluations equal to the preset threshold as the effective evaluations of the user name if the number of user evaluations corresponding to the user name exceeds the preset threshold, and taking each user evaluation corresponding to the user name as an effective evaluation if the number does not exceed the preset threshold;
the semantic analysis subunit is used for performing sentiment analysis on each effective evaluation by means of semantic analysis to obtain an approval degree corresponding to each effective evaluation;
and the score evaluation subunit is used for comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain a third score.
Optionally, the semantic analysis subunit includes:
the word segmentation extraction component is used for extracting the keywords contained in the effective comments by adopting a preset word segmentation mode;
the word vector generating component is used for training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
the participle clustering component is used for carrying out clustering analysis on the space word vectors based on a K-Means clustering algorithm to obtain a clustering analysis result;
and the approval degree calculating component is used for calculating the Euclidean distance between the clustering analysis result and each preset approval degree in the preset approval degree set, and taking the preset approval degree with the minimum Euclidean distance value as the approval degree corresponding to the effective comment.
For specific limitations of the self-media pushing device, reference may be made to the above limitations of the self-media pushing method, which is not described herein again. The modules in the self-media pushing device can be wholly or partially implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43 that are communicatively connected to each other via a system bus. It is noted that only the computer device 4 having the components memory 41, processor 42, and network interface 43 is shown, but it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the computer device 4. Of course, the memory 41 may also include both internal and external storage devices of the computer device 4. In this embodiment, the memory 41 is generally used for storing an operating system installed in the computer device 4 and various types of application software, such as program codes for controlling electronic files. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, such as program code for executing control of an electronic file.
The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing communication connection between the computer device 4 and other electronic devices.
The present application further provides another embodiment, which is to provide a computer-readable storage medium storing an interface display program, which is executable by at least one processor to cause the at least one processor to execute the steps of the self-media pushing method as described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It is to be understood that the above-described embodiments are only some, not all, of the embodiments of the present application, and that the appended drawings illustrate preferred embodiments without limiting the scope of the application. This application can be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be understood thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or replace some of their features with equivalents. All equivalent structures made by using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, are within the protection scope of the present application.

Claims (10)

1. A self-media push method, the self-media push method comprising:
collecting public sentiment data of a work of a target self-media on each basic platform to obtain the public sentiment data of the target self-media on each basic platform;
analyzing each public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
determining a marketing index of the target self-media according to the comprehensive score;
and determining a target platform to be recommended from each basic platform based on the marketing index, and recommending the target self-media to target classified crowds in the target platform.
2. The self-media push method of claim 1, wherein the composite score is expressed in the form of a multidimensional score graph.
3. The self-media pushing method as claimed in claim 1, wherein the collecting public opinion data of the work of the target self-media on each base platform, and obtaining the public opinion data of the target self-media on each base platform comprises:
acquiring a uniform resource locator corresponding to each basic platform;
for each basic platform, crawling analysis is carried out on the page file corresponding to the uniform resource locator in a web crawler mode, and the page file of the work corresponding to the target self-media is obtained and used as a target page;
and aiming at each basic platform, extracting public sentiment information related to the corresponding works of the target self-media from the content contained in the target page in a fuzzy matching mode to serve as the public sentiment data of the target self-media on the basic platform.
4. The self-media pushing method as claimed in any one of claims 1 to 3, wherein the public opinion data includes at least one of interaction data, work content and comment data.
5. The self-media pushing method as claimed in claim 4, wherein the analyzing each of the public opinion data to obtain an analysis result comprises:
according to the preset weight of each kind of interaction data, carrying out statistical weighting on the interaction data to obtain a first score, wherein the interaction data comprises at least one of praise, collection, browsing and forwarding;
analyzing the content of the work, and grading the quality of the work according to an analysis result to obtain a second score;
carrying out semantic recognition on the comment data, and grading according to the obtained semantic recognition result to obtain a third score;
and determining assessment information of the public opinion data as the analysis result based on the first score, the second score and the third score.
6. The self-media pushing method of claim 5, wherein the semantic recognition is performed on the comment data, and the scoring is performed according to the obtained semantic recognition result, and the obtaining of the third score comprises:
for the same user name, if the number of user evaluations corresponding to the user name exceeds a preset threshold, selecting a number of user evaluations equal to the preset threshold as the effective evaluations of the user name, and if the number of user evaluations corresponding to the user name does not exceed the preset threshold, taking each user evaluation corresponding to the user name as an effective evaluation;
performing sentiment analysis on each effective evaluation by means of semantic analysis to obtain an approval degree corresponding to each effective evaluation;
and comprehensively evaluating the approval degree corresponding to each effective evaluation according to a preset evaluation mode to obtain the third score.
7. The self-media pushing method according to claim 6, wherein the performing sentiment analysis on each effective evaluation by means of semantic analysis to obtain the approval degree corresponding to each effective evaluation comprises:
extracting keywords contained in the effective comments by adopting a preset word segmentation mode;
training the keywords in a word vector mode to obtain space word vectors corresponding to the keywords;
performing clustering analysis on the space word vectors based on a K-Means clustering algorithm to obtain a clustering analysis result;
and calculating the Euclidean distance between the cluster analysis result and each preset approval degree in a preset approval degree set, and taking the preset approval degree with the minimum Euclidean distance value as the approval degree corresponding to the effective comment.
8. A self-media push apparatus, characterized in that the self-media push apparatus comprises:
the data acquisition module is used for acquiring the public sentiment data of the works of the target self-media on each basic platform to obtain the public sentiment data of the target self-media on each basic platform;
the data evaluation module is used for analyzing each public opinion data to obtain an analysis result, determining a target classification crowd corresponding to the target self-media based on the analysis result, and determining a comprehensive score of the target self-media based on a preset score weight corresponding to each basic platform;
the index determining module is used for determining the marketing index of the target self-media according to the comprehensive score;
and the target recommending module is used for determining a target platform to be recommended from each basic platform based on the marketing index and recommending the target self-media to the target classified crowd in the target platform.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the self-media push method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the self-media push method according to any one of claims 1 to 7.
CN202110741715.2A 2021-06-30 2021-06-30 Self-media pushing method, device, computer equipment and storage medium Active CN113569118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110741715.2A CN113569118B (en) 2021-06-30 2021-06-30 Self-media pushing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113569118A true CN113569118A (en) 2021-10-29
CN113569118B CN113569118B (en) 2023-12-22

Family

ID=78163287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110741715.2A Active CN113569118B (en) 2021-06-30 2021-06-30 Self-media pushing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113569118B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114861027A (en) * 2022-04-29 2022-08-05 深圳市东晟数据有限公司 Multi-dimensional public opinion recommendation method based on big data and natural language processing
CN115936514A (en) * 2022-12-14 2023-04-07 湖南工业大学 Rural food creative system based on big data linkage management

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106960063A (en) * 2017-04-20 2017-07-18 广州优亚信息技术有限公司 A kind of internet information crawl and commending system for field of inviting outside investment
CN110532461A (en) * 2019-07-05 2019-12-03 中国平安财产保险股份有限公司 Information platform method for pushing, device, computer equipment and storage medium
WO2019227710A1 (en) * 2018-05-31 2019-12-05 平安科技(深圳)有限公司 Network public opinion analysis method and apparatus, and computer-readable storage medium
CN112116391A (en) * 2020-09-18 2020-12-22 北京达佳互联信息技术有限公司 Multimedia resource delivery method and device, computer equipment and storage medium
CN112749341A (en) * 2021-01-22 2021-05-04 南京莱斯网信技术研究院有限公司 Key public opinion recommendation method, readable storage medium and data processing device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114861027A (en) * 2022-04-29 2022-08-05 深圳市东晟数据有限公司 Multi-dimensional public opinion recommendation method based on big data and natural language processing
CN115936514A (en) * 2022-12-14 2023-04-07 湖南工业大学 Rural food creative system based on big data linkage management
CN115936514B (en) * 2022-12-14 2023-08-08 湖南工业大学 Country food creative system based on big data linkage management

Also Published As

Publication number Publication date
CN113569118B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN112632385B (en) Course recommendation method, course recommendation device, computer equipment and medium
CN108629043B (en) Webpage target information extraction method, device and storage medium
CN113822067A (en) Key information extraction method and device, computer equipment and storage medium
CN111797214A (en) FAQ database-based problem screening method and device, computer equipment and medium
CN109145216A (en) Network public-opinion monitoring method, device and storage medium
CN104899322A (en) Search engine and implementation method thereof
CN111813905B (en) Corpus generation method, corpus generation device, computer equipment and storage medium
CN114780727A (en) Text classification method and device based on reinforcement learning, computer equipment and medium
CN106708929B (en) Video program searching method and device
CN111538931A (en) Big data-based public opinion monitoring method and device, computer equipment and medium
CN112287069B (en) Information retrieval method and device based on voice semantics and computer equipment
CN112395421B (en) Course label generation method and device, computer equipment and medium
CN110795527A (en) Candidate entity ordering method, training method and related device
CN113569118B (en) Self-media pushing method, device, computer equipment and storage medium
CN104915399A (en) Recommended data processing method based on news headline and recommended data processing method system based on news headline
CN113626704A (en) Method, device and equipment for recommending information based on word2vec model
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN115438149A (en) End-to-end model training method and device, computer equipment and storage medium
CN114817478A (en) Text-based question and answer method and device, computer equipment and storage medium
CN108595466B (en) Internet information filtering and internet user information and network card structure analysis method
CN110019763B (en) Text filtering method, system, equipment and computer readable storage medium
CN112199954A (en) Disease entity matching method and device based on voice semantics and computer equipment
CN114547257B (en) Class matching method and device, computer equipment and storage medium
CN112364649B (en) Named entity identification method and device, computer equipment and storage medium
CN114780724A (en) Case classification method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant