CN116320621A - NLP-based streaming media content analysis method and system - Google Patents

NLP-based streaming media content analysis method and system

Info

Publication number
CN116320621A
Authority
CN
China
Prior art keywords
information
streaming media
text
nouns
evaluation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310554226.5A
Other languages
Chinese (zh)
Other versions
CN116320621B (en)
Inventor
潘春霞
姜凤龙
朱亚辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Jiyi Technology Co ltd
Original Assignee
Suzhou Jiyi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Jiyi Technology Co ltd
Priority to CN202310554226.5A (patent CN116320621B)
Publication of CN116320621A
Application granted
Publication of CN116320621B
Current legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440236 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44204 Monitoring of content usage, e.g. the number of times a movie has been viewed, copied or the amount which has been watched
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/454 Content or additional data filtering, e.g. blocking advertisements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4662 Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms
    • H04N21/4666 Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms using neural networks, e.g. processing the feedback provided by the user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/8405 Generation or processing of descriptive data, e.g. content descriptors represented by keywords
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of information processing and provides an NLP-based streaming media content analysis method and system. The method comprises the following steps: receiving a search keyword input by a user, and determining matched streaming media videos according to the search keyword; screening the streaming media videos according to a heat value, processing the screened streaming media videos, and determining the text information corresponding to each streaming media video; receiving a function keyword input by the user, and generalizing the function keyword and the search keyword into nouns; extracting adjectives and nouns from each piece of text information based on NLP, binding a noun to each adjective, and determining content evaluation information of the text information; and analyzing and integrating all the content evaluation information to obtain streaming media evaluation information, and specially marking the evaluation content of the function keyword in the streaming media evaluation information. The invention obtains the streaming media evaluation information automatically, and the streaming media evaluation information accurately reflects the overall direction of public opinion.

Description

NLP-based streaming media content analysis method and system
Technical Field
The invention relates to the technical field of information processing, in particular to a streaming media content analysis method and system based on NLP.
Background
When a new product is released or brought to market, understanding the direction of streaming media content is important for adjusting the product's strategic layout. With the rise of short videos, the content of streaming media videos needs to be analyzed accurately so that manufacturers can learn public opinion about a new product in time. At present, however, it is difficult to automatically perform accurate public opinion analysis on a large amount of streaming media video content. An NLP-based streaming media content analysis method and system is therefore needed to solve the above problem.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide an NLP-based streaming media content analysis method and system, so as to solve the problems described in the Background.
The invention is implemented as follows: an NLP-based streaming media content analysis method comprises the following steps:
receiving a search keyword input by a user, and determining matched streaming media videos according to the search keyword;
screening the streaming media videos according to a heat value, processing the screened streaming media videos, and determining the text information corresponding to each streaming media video;
receiving a function keyword input by the user, and generalizing the function keyword and the search keyword into nouns;
extracting adjectives and nouns from each piece of text information based on NLP, binding a noun to each adjective, and determining content evaluation information of the text information;
and analyzing and integrating all the content evaluation information to obtain streaming media evaluation information, and specially marking the evaluation content of the function keyword in the streaming media evaluation information.
As a further scheme of the invention: the step of processing the screened streaming media videos and determining the text information corresponding to each streaming media video specifically comprises the following steps:
judging whether the screened streaming media video has subtitle information;
when subtitle information exists, performing text recognition on the subtitle information in the streaming media video to obtain the text information;
when subtitle information does not exist, acquiring the audio information of the streaming media video, and performing speech-to-text conversion on the audio information to obtain the text information.
As a further scheme of the invention: the step of extracting adjectives and nouns from each piece of text information based on NLP specifically comprises the following steps:
determining the influence degree of the streaming media video author corresponding to the text information;
when the influence degree is less than or equal to a set influence value, extracting the adjectives and nouns in the text information by using a word segmentation tool, and position-marking the extracted adjectives and nouns;
when the influence degree is greater than the set influence value, receiving training corpus information, performing feature learning on the training corpus information based on a CNN-LSTM model to obtain an exclusive neural network model, processing the text information through the exclusive neural network model to obtain adjectives and nouns, and position-marking the obtained adjectives and nouns.
As a further scheme of the invention: the step of binding a noun to each adjective and determining the content evaluation information of the text information specifically comprises the following steps:
binding a noun to each adjective according to the position marks, and determining the part of speech of each adjective, wherein the part of speech is one of a positive word, a negative word and a neutral word;
classifying all adjectives according to nouns to obtain a plurality of categories, wherein all adjectives in a category correspond to the same noun;
and determining a text evaluation value of the text information, wherein text evaluation value = a × number of positive words + b × number of negative words + c × number of neutral words, and the categories and the text evaluation value form the content evaluation information.
As a further scheme of the invention: the step of analyzing and integrating all the content evaluation information to obtain the streaming media evaluation information specifically comprises the following steps:
integrating the categories in all the content evaluation information, and merging the categories corresponding to the same noun;
retrieving the influence degree of the streaming media video author corresponding to each text evaluation value;
and determining an overall evaluation value, wherein overall evaluation value = Σ (text evaluation value × influence degree), and the integrated categories and the overall evaluation value form the streaming media evaluation information.
Another object of the present invention is to provide an NLP-based streaming content analysis system, the system comprising:
the streaming media video determining module is used for receiving a search keyword input by a user and determining matched streaming media videos according to the search keyword;
the text information acquisition module is used for screening the streaming media videos according to the heat value, processing the screened streaming media videos and determining text information corresponding to each streaming media video;
a function keyword input module, used for receiving the function keyword input by the user and generalizing the function keyword and the search keyword into nouns;
the adjective noun determining module is used for extracting adjectives and nouns in each piece of text information based on NLP, binding a noun for each adjective, and determining content evaluation information of the text information;
and the streaming media evaluation information module is used for analyzing and integrating all the content evaluation information to obtain streaming media evaluation information, and specially marking the evaluation content of the functional keywords in the streaming media evaluation information.
As a further scheme of the invention: the text information acquisition module comprises:
the subtitle information judging unit is used for judging whether the screened streaming media video has subtitle information or not;
the first text information unit is used for carrying out text recognition on the subtitle information in the streaming media video to obtain text information when the subtitle information exists;
and the second text information unit is used for acquiring the audio information of the streaming media video when the subtitle information does not exist, and performing speech-to-text conversion on the audio information to obtain text information.
As a further scheme of the invention: the adjective noun determination module includes:
the influence degree determining unit is used for determining the influence degree of the streaming media video author corresponding to the text information;
a first adjective noun unit, used for extracting the adjectives and nouns in the text information by using a word segmentation tool when the influence degree is less than or equal to a set influence value, and position-marking the extracted adjectives and nouns;
and a second adjective noun unit, used for receiving training corpus information when the influence degree is greater than the set influence value, performing feature learning on the training corpus information based on a CNN-LSTM model to obtain an exclusive neural network model, processing the text information through the exclusive neural network model to obtain adjectives and nouns, and position-marking the obtained adjectives and nouns.
As a further scheme of the invention: the adjective noun determination module further includes:
an adjective noun binding unit, configured to bind a noun to each adjective according to the position marks, and determine the part of speech of each adjective, where the part of speech is one of a positive word, a negative word and a neutral word;
the adjective classification unit is used for classifying all adjectives according to nouns to obtain a plurality of categories, wherein all adjectives in a category correspond to the same noun;
and a text evaluation value unit for determining a text evaluation value of the text information, wherein text evaluation value = a × number of positive words + b × number of negative words + c × number of neutral words, and the categories and the text evaluation value form the content evaluation information.
As a further scheme of the invention: the streaming media evaluation information module comprises:
the category integrating unit is used for integrating the categories in all the content evaluation information and combining the categories corresponding to the same noun;
the influence degree retrieving unit is used for retrieving the influence degree of the streaming media video author corresponding to each text evaluation value;
and the overall evaluation value unit is used for determining an overall evaluation value, wherein overall evaluation value = Σ (text evaluation value × influence degree), and the integrated categories and the overall evaluation value form the streaming media evaluation information.
Compared with the prior art, the invention has the beneficial effects that:
The invention processes the screened streaming media videos to determine the text information corresponding to each streaming media video; generalizes the function keyword and the search keyword input by the user into nouns; extracts adjectives and nouns from each piece of text information based on NLP, binds a noun to each adjective, and determines the content evaluation information of the text information; and analyzes and integrates all the content evaluation information to obtain the streaming media evaluation information. The streaming media evaluation information is thus obtained automatically, and it accurately reflects the overall direction of public opinion.
Drawings
Fig. 1 is a flowchart of an NLP-based streaming media content analysis method.
Fig. 2 is a flowchart of determining text information of a streaming video in an NLP-based streaming content analysis method.
Fig. 3 is a flowchart of extracting adjectives and nouns in each text message in an NLP-based streaming media content analysis method.
Fig. 4 is a flowchart of binding a noun to each adjective in an NLP-based streaming media content analysis method.
Fig. 5 is a flowchart of obtaining streaming media evaluation information in an NLP-based streaming media content analysis method.
Fig. 6 is a schematic structural diagram of an NLP-based streaming media content analysis system.
Fig. 7 is a schematic structural diagram of a text information acquisition module in an NLP-based streaming media content analysis system.
Fig. 8 is a schematic structural diagram of an adjective noun determining module in an NLP-based streaming media content analysis system.
Fig. 9 is a schematic structural diagram of a streaming media evaluation information module in an NLP-based streaming media content analysis system.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clear, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Specific implementations of the invention are described in detail below in connection with specific embodiments.
As shown in Fig. 1, an embodiment of the present invention provides an NLP-based streaming media content analysis method, which includes the following steps:
S100, receiving a search keyword input by a user, and determining matched streaming media videos according to the search keyword;
S200, screening the streaming media videos according to a heat value, processing the screened streaming media videos, and determining the text information corresponding to each streaming media video;
S300, receiving a function keyword input by the user, and generalizing the function keyword and the search keyword into nouns;
S400, extracting adjectives and nouns from each piece of text information based on NLP, binding a noun to each adjective, and determining content evaluation information of the text information;
S500, analyzing and integrating all the content evaluation information to obtain streaming media evaluation information, and specially marking the evaluation content of the function keyword in the streaming media evaluation information.
In this embodiment, when a manufacturer needs to understand public opinion about a new product, the user inputs a search keyword, which may be the name of the new product, and the streaming media video platform determines a plurality of matched streaming media videos according to the search keyword. The streaming media videos are then screened according to a heat value, which is related to the like count, comment count and forwarding count of each streaming media video; the videos with higher heat values are retained, and the screened videos are processed to determine the text information corresponding to each of them. Next, the user inputs a function keyword, which names a newly introduced function of the new product, that is, a product highlight the manufacturer particularly cares about, and the function keyword and the search keyword are generalized into nouns. Adjectives and nouns in each piece of text information are then extracted based on natural language processing (NLP), and a noun is bound to each adjective, indicating that the adjective describes that noun, which yields the content evaluation information of the text information. Finally, all the content evaluation information is analyzed and integrated to obtain the streaming media evaluation information, which reflects the overall direction of public opinion. The evaluation content of the function keyword in the streaming media evaluation information is specially marked, for example in bold, so that the manufacturer's staff can see the market response to the new function at a glance; the evaluation content of a function keyword is simply the set of adjectives bound to the corresponding noun.
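The screening by heat value described above can be illustrated with a minimal sketch. The description only states that the heat value is related to the like count, comment count and forwarding count; the weighted sum, the weights, the threshold `min_heat` and the names `Video`, `heat_value` and `screen_by_heat` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """Hypothetical record for one matched streaming media video."""
    video_id: str
    likes: int
    comments: int
    forwards: int


def heat_value(video, w_like=1.0, w_comment=1.0, w_forward=1.0):
    # Assumption: a weighted sum; the source only says the heat value
    # is "related to" these three counts.
    return (w_like * video.likes
            + w_comment * video.comments
            + w_forward * video.forwards)


def screen_by_heat(videos, min_heat):
    # Keep only the videos whose heat value reaches the (assumed) threshold.
    return [v for v in videos if heat_value(v) >= min_heat]
```

A top-k selection by heat value would fit the description equally well; the threshold form is used here only because it is the shortest to state.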
As shown in Fig. 2, as a preferred embodiment of the present invention, the step of processing the screened streaming media videos to determine the text information corresponding to each streaming media video specifically includes:
S201, judging whether subtitle information exists in the screened streaming media video;
S202, when subtitle information exists, performing text recognition on the subtitle information in the streaming media video to obtain the text information;
S203, when subtitle information does not exist, acquiring the audio information of the streaming media video, and performing speech-to-text conversion on the audio information to obtain the text information.
In this embodiment, to obtain the text information, it is first judged whether the screened streaming media video contains subtitle information. If it does, the text information is obtained directly by performing text recognition on the subtitle information in the streaming media video; if it does not, the audio information of the streaming media video is retrieved, noise reduction is applied to it, and speech-to-text conversion is then performed to obtain the text information.
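The branch between subtitle recognition and speech-to-text conversion can be sketched as follows. The embodiment does not name any recognition or transcription engine, so every processing stage is passed in as a callable; the function name `acquire_text` and all parameter names are hypothetical.

```python
from typing import Any, Callable


def acquire_text(
    video: Any,
    has_subtitles: Callable[[Any], bool],
    recognize_subtitles: Callable[[Any], str],
    extract_audio: Callable[[Any], Any],
    denoise: Callable[[Any], Any],
    speech_to_text: Callable[[Any], str],
) -> str:
    """Return the text information for one screened video (S201 to S203)."""
    if has_subtitles(video):
        # S202: subtitles exist, so read the text from them directly.
        return recognize_subtitles(video)
    # S203: no subtitles, so fall back to audio: denoise, then transcribe.
    audio = extract_audio(video)
    return speech_to_text(denoise(audio))
```

Injecting the stages keeps the control flow testable with stubs and leaves the choice of text recognition and speech recognition engines open, as the description does.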
As shown in Fig. 3, as a preferred embodiment of the present invention, the step of extracting adjectives and nouns from each piece of text information based on NLP specifically includes:
S401, determining the influence degree of the streaming media video author corresponding to the text information;
S402, when the influence degree is less than or equal to a set influence value, extracting the adjectives and nouns in the text information by using a word segmentation tool, and position-marking the extracted adjectives and nouns;
S403, when the influence degree is greater than the set influence value, receiving training corpus information, performing feature learning on the training corpus information based on a CNN-LSTM model to obtain an exclusive neural network model, processing the text information through the exclusive neural network model to obtain adjectives and nouns, and position-marking the obtained adjectives and nouns.
In this embodiment, the influence degree of the streaming media video author corresponding to each piece of text information needs to be determined. The influence degree is determined from the author's like count and follower count: influence degree = M × like count + N × follower count, where M and N are fixed values. When the influence degree is less than or equal to the set influence value, the adjectives and nouns in the text information are extracted directly with a word segmentation tool, and the extracted adjectives and nouns are position-marked; a position mark indicates where the word occurs in the text information. The word segmentation tool may be jieba, hanlp, ansj or standby. When the influence degree is greater than the set influence value, an exclusive neural network model is built for that streaming media video author so that the analysis is more accurate. Because the number of highly influential video authors in any field is limited, only a limited number of exclusive neural network models need to be built, and once built, a model can be reused for that author. To build the model, the user uploads training corpus information obtained from the author's previous videos, and feature learning is performed on the training corpus information based on a CNN-LSTM model to obtain the exclusive neural network model, which can then perform better semantic analysis of that author's video content.
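The influence-degree formula and the resulting choice of extraction path can be sketched as below, together with one possible form of position marking. The default values of M and N, the route labels, the tag letters `"a"` and `"n"`, and the use of the token index as the position mark are assumptions; the input to `mark_positions` is taken to be the already tagged output of whichever word segmentation tool or model is used.

```python
def influence_degree(likes, followers, m=1.0, n=1.0):
    # influence degree = M x like count + N x follower count
    # (M and N are fixed values in the source; 1.0 is a placeholder).
    return m * likes + n * followers


def choose_extractor(influence, set_influence_value):
    # S402 when influence <= set value, S403 otherwise.
    if influence <= set_influence_value:
        return "word_segmentation_tool"
    return "exclusive_model"


def mark_positions(tagged_tokens):
    """Keep adjectives and nouns and attach a position mark.

    tagged_tokens: list of (word, tag) pairs, with the assumed tags
    "a" for adjectives and "n" for nouns. The position mark used here
    is the token index within the text.
    """
    return [
        (word, tag, index)
        for index, (word, tag) in enumerate(tagged_tokens)
        if tag in ("a", "n")
    ]
```

For example, `mark_positions([("screen", "n"), ("is", "v"), ("bright", "a")])` keeps the noun at position 0 and the adjective at position 2 and drops the verb.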
As shown in Fig. 4, as a preferred embodiment of the present invention, the step of binding a noun to each adjective and determining the content evaluation information of the text information specifically includes:
S404, binding a noun to each adjective according to the position marks, and determining the part of speech of each adjective, wherein the part of speech is one of a positive word, a negative word and a neutral word;
S405, classifying all adjectives according to nouns to obtain a plurality of categories, wherein all adjectives in a category correspond to the same noun;
S406, determining a text evaluation value of the text information, wherein text evaluation value = a × number of positive words + b × number of negative words + c × number of neutral words, and the categories and the text evaluation value form the content evaluation information.
In this embodiment, a noun is bound to each adjective according to the position marks; the bound noun is the noun nearest to the adjective within the same sentence. The part of speech of each adjective, which here denotes whether it is a positive, negative or neutral word, is then determined, for example by looking the adjective up in an electronic dictionary. All adjectives are classified by noun so that every category corresponds to a single noun, forming a table whose first column holds the noun and whose second column holds the adjectives bound to that noun. Finally, the text evaluation value of the text information is determined: text evaluation value = a × number of positive words + b × number of negative words + c × number of neutral words, where a, b and c are fixed values.
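Steps S404 to S406 can be sketched as follows. Each sentence is assumed to be a list of position-marked `(word, tag, position)` tuples with the hypothetical tags `"a"` and `"n"`; the `lexicon` dictionary stands in for the electronic dictionary; the defaults a = 1, b = -1, c = 0, the treatment of unknown adjectives as neutral, and the tie-break toward the earlier noun are all assumptions.

```python
from collections import defaultdict


def bind_adjectives(sentence):
    """Bind each adjective to the nearest noun in the same sentence."""
    nouns = [(word, pos) for word, tag, pos in sentence if tag == "n"]
    pairs = []
    for word, tag, pos in sentence:
        if tag == "a" and nouns:
            # min() returns the first minimum, so on a tie the earlier
            # noun in the sentence wins (an assumed tie-break rule).
            noun, _ = min(nouns, key=lambda item: abs(item[1] - pos))
            pairs.append((word, noun))
    return pairs


def content_evaluation(sentences, lexicon, a=1.0, b=-1.0, c=0.0):
    """Return (categories, text evaluation value) for one text."""
    categories = defaultdict(list)  # noun -> adjectives bound to it
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for sentence in sentences:
        for adjective, noun in bind_adjectives(sentence):
            categories[noun].append(adjective)
            counts[lexicon.get(adjective, "neutral")] += 1
    value = (a * counts["positive"]
             + b * counts["negative"]
             + c * counts["neutral"])
    return dict(categories), value
```

The returned dictionary corresponds to the two-column table in the description: each key is a noun and each value is the list of adjectives bound to it.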
As shown in Fig. 5, as a preferred embodiment of the present invention, the step of analyzing and integrating all the content evaluation information to obtain the streaming media evaluation information specifically includes:
S501, integrating the categories in all the content evaluation information, and merging the categories corresponding to the same noun;
S502, retrieving the influence degree of the streaming media video author corresponding to each text evaluation value;
S503, determining an overall evaluation value, wherein overall evaluation value = Σ (text evaluation value × influence degree), and the integrated categories and the overall evaluation value form the streaming media evaluation information.
In this embodiment, the content evaluation information corresponding to the screened streaming media videos is integrated and the overall evaluation value is determined: each text evaluation value is multiplied by the corresponding influence degree, and the products are summed. The overall evaluation value reflects whether overall public opinion is favorable or unfavorable.
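The integration in S501 to S503 amounts to merging the per-video categories by noun and computing an influence-weighted sum. A minimal sketch, assuming each video contributes a hypothetical `(categories, text_value, influence)` tuple:

```python
def integrate(evaluations):
    """Merge categories by noun and compute the overall evaluation value.

    evaluations: iterable of (categories, text_value, influence), where
    categories maps a noun to the adjectives bound to it in one video.
    """
    merged = {}
    overall = 0.0
    for categories, text_value, influence in evaluations:
        for noun, adjectives in categories.items():
            # S501: categories for the same noun are merged.
            merged.setdefault(noun, []).extend(adjectives)
        # S503: overall value = sum of (text evaluation value x influence degree).
        overall += text_value * influence
    return merged, overall
```

With two videos whose text evaluation values are 1.0 and -2.0 and whose authors have influence degrees 150.0 and 10.0, the overall evaluation value is 1.0 × 150.0 + (-2.0) × 10.0 = 130.0.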
As shown in Fig. 6, the embodiment of the present invention further provides an NLP-based streaming media content analysis system, where the system includes:
the streaming media video determining module 100 is configured to receive a search keyword input by a user, and determine matched streaming media videos according to the search keyword;
the text information acquisition module 200 is configured to screen the streaming media videos according to a heat value, process the screened streaming media videos, and determine the text information corresponding to each streaming media video;
the function keyword input module 300 is configured to receive a function keyword input by the user, and generalize the function keyword and the search keyword into nouns;
the adjective noun determination module 400 is configured to extract adjectives and nouns from each piece of text information based on NLP, bind a noun to each adjective, and determine content evaluation information of the text information;
the streaming media evaluation information module 500 is configured to analyze and integrate all the content evaluation information to obtain streaming media evaluation information, and specially mark the evaluation content of the function keyword in the streaming media evaluation information.
In the embodiment of the invention, when a manufacturer wants to understand public opinion about a new product, the manufacturer inputs a search keyword, which may be the name of the new product, and the streaming media video platform determines a plurality of matching streaming media videos according to the search keyword. The streaming media videos are then screened according to a heat value, which is related to the number of likes, comments, and forwards of each video; videos with higher heat values are retained, and the screened videos are processed to determine the text information corresponding to each of them. A function keyword input by the user is also received, and the function keyword and the search keyword are generalized into nouns. Adjectives and nouns in each piece of text information are then extracted based on natural language processing (NLP), and a noun is bound to each adjective, indicating that the adjective describes that noun, which yields the content evaluation information of the text information. Finally, all content evaluation information is analyzed and integrated to obtain the streaming media evaluation information, which reflects the overall direction of public opinion. The evaluation content of the function keyword, that is, the adjectives bound to the corresponding noun, is specially marked in the streaming media evaluation information, so that the manufacturer can see the market response to the new function at a glance.
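The heat-value screening step might look like the sketch below. The text only states that the heat value is related to likes, comments, and forwards and that videos with higher heat values are retained; the linear weighting, the weights themselves, the fixed threshold, and the `Video` record are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Video:
    title: str
    likes: int
    comments: int
    forwards: int

def heat_value(video, w_like=1.0, w_comment=2.0, w_forward=3.0):
    """Assumed heat formula: a weighted sum of the three engagement counts."""
    return (w_like * video.likes
            + w_comment * video.comments
            + w_forward * video.forwards)

def screen_by_heat(videos, threshold):
    """Retain only videos whose heat value reaches the threshold."""
    return [v for v in videos if heat_value(v) >= threshold]

videos = [Video("review A", likes=100, comments=10, forwards=5),
          Video("review B", likes=10, comments=1, forwards=0)]
kept = screen_by_heat(videos, threshold=50)
```

A top-k cut instead of a fixed threshold would serve equally well under the description; the choice here is only for brevity.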
As shown in fig. 7, as a preferred embodiment of the present invention, the text information acquiring module 200 includes:
a subtitle information judging unit 201, configured to judge whether the screened streaming media video has subtitle information;
a first text information unit 202, configured to perform text recognition on the subtitle information in the streaming media video to obtain text information when the subtitle information exists;
and the second text information unit 203 is configured to obtain audio information of the streaming video when the subtitle information does not exist, and perform voice-to-text conversion on the audio information to obtain text information.
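The subtitle-first, audio-fallback logic of units 201 to 203 can be sketched as a small dispatcher. The four callables are hypothetical placeholders for whatever subtitle detector, text recognizer, audio extractor, and speech-to-text engine an implementation would plug in; none of them is named in the source.

```python
def acquire_text(video, get_subtitles, recognize_subtitles,
                 extract_audio, speech_to_text):
    """Return text for one video.

    get_subtitles(video) returns subtitle data, or None when the video
    has no subtitle information (assumed convention).
    """
    subtitles = get_subtitles(video)
    if subtitles is not None:
        # Unit 202: text recognition on the subtitle information.
        return recognize_subtitles(subtitles)
    # Unit 203: fall back to the audio track and convert speech to text.
    return speech_to_text(extract_audio(video))

# Stub callables standing in for real recognizers.
get_subs = lambda v: v["subtitles"]
ocr = lambda s: "ocr:" + s
get_audio = lambda v: v["audio"]
asr = lambda a: "asr:" + a

with_subs = acquire_text({"subtitles": "SUB", "audio": "AUD"},
                         get_subs, ocr, get_audio, asr)
without_subs = acquire_text({"subtitles": None, "audio": "AUD"},
                            get_subs, ocr, get_audio, asr)
```

Passing the recognizers in as arguments keeps the sketch independent of any particular OCR or speech recognition library.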
As shown in fig. 8, as a preferred embodiment of the present invention, the adjective noun determining module 400 includes:
an influence degree determining unit 401, configured to determine an influence degree of a streaming media video author corresponding to the text information;
a first adjective noun unit 402, configured to, when the influence degree is less than or equal to a set influence value, extract adjectives and nouns in the text information using a word segmentation tool and position-mark the extracted adjectives and nouns;
the second adjective noun unit 403 is configured to receive training corpus information when the influence degree is greater than the set influence value, perform feature learning on the training corpus information based on the CNN-LSTM model to obtain an exclusive neural network model, process text information through the exclusive neural network model to obtain adjectives and nouns, and perform position marking on the obtained adjectives and nouns.
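The routing between the two extraction paths can be illustrated as follows. The two taggers are passed in as hypothetical callables that return `(word, tag)` pairs; the tag labels `"a"` and `"n"` and the use of the token index as the position mark are assumptions. Training the CNN-LSTM model is outside the scope of this sketch.

```python
def extract_adjectives_and_nouns(text, influence, threshold,
                                 segment_tagger, model_tagger):
    """Pick the extraction path by author influence, then position-mark.

    influence <= threshold: word segmentation tool (unit 402).
    influence >  threshold: trained neural network model (unit 403).
    Returns a list of (position, word, tag) for adjectives and nouns only.
    """
    tagger = segment_tagger if influence <= threshold else model_tagger
    tagged = tagger(text)
    return [(pos, word, tag)
            for pos, (word, tag) in enumerate(tagged)
            if tag in ("a", "n")]

# Stub taggers standing in for a segmentation tool and a trained model.
segment_stub = lambda t: [("screen", "n"), ("is", "v"), ("clear", "a")]
model_stub = lambda t: [("battery", "n"), ("weak", "a")]

low = extract_adjectives_and_nouns("...", 10, 100, segment_stub, model_stub)
high = extract_adjectives_and_nouns("...", 500, 100, segment_stub, model_stub)
```

The more expensive model path is reserved for high-influence authors, whose text has greater weight in the overall evaluation value.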
As shown in fig. 8, as a preferred embodiment of the present invention, the adjective noun determining module 400 further includes:
an adjective noun binding unit 404, configured to bind a noun to each adjective according to the position marks, and determine a part of speech of each adjective, where the part of speech is one of a positive word, a negative word, and a neutral word;
an adjective classification unit 405, configured to classify all adjectives according to nouns, so as to obtain a plurality of categories, where nouns corresponding to each category are the same;
a text evaluation value unit 406, configured to determine a text evaluation value of the text information, where the text evaluation value=a×the number of positive words+b×the number of negative words+c×the number of neutral words, and the category and the text evaluation value form content evaluation information.
As shown in fig. 9, as a preferred embodiment of the present invention, the streaming media evaluation information module 500 includes:
a category integrating unit 501, configured to integrate categories in all content evaluation information, and combine categories corresponding to the same noun;
the influence degree retrieving unit 502 is configured to retrieve the influence degree of the streaming media video author corresponding to each text evaluation value;
the overall evaluation value unit 503 is configured to determine an overall evaluation value, where the overall evaluation value = Σ (text evaluation value × influence degree), and the integrated categories and the overall evaluation value form the streaming media evaluation information.
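Category integration and the special marking of function keywords could be sketched like this. The dictionary layout and the `marked` flag are illustrative assumptions; the source does not say how the marking is represented.

```python
from collections import defaultdict

def merge_categories(category_tables):
    """Merge per-video noun -> adjectives tables; same noun, same category."""
    merged = defaultdict(list)
    for table in category_tables:
        for noun, adjectives in table.items():
            merged[noun].extend(adjectives)
    return dict(merged)

def mark_function_keywords(merged, function_nouns):
    """Flag categories whose noun is one of the function keywords."""
    return {noun: {"adjectives": adjectives, "marked": noun in function_nouns}
            for noun, adjectives in merged.items()}

per_video = [{"screen": ["clear"], "battery": ["weak"]},
             {"screen": ["bright"], "price": ["high"]}]
merged = merge_categories(per_video)
report = mark_function_keywords(merged, function_nouns={"battery"})
```

In this example only the "battery" category is flagged, so its adjectives can be highlighted when the streaming media evaluation information is presented.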
The foregoing describes only preferred embodiments of the present invention and should not be taken as limiting the invention; all modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be covered.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to that order of execution and may be executed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; nor are these sub-steps or stages necessarily performed in sequence, and they may instead be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware, where the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
Other embodiments of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.

Claims (10)

1. A method for analyzing streaming media content based on NLP, the method comprising the steps of:
receiving a search keyword input by a user, and determining a matched streaming media video according to the search keyword;
screening the streaming media videos according to the heat value, processing the screened streaming media videos, and determining text information corresponding to each streaming media video;
receiving a function keyword input by the user, and generalizing the function keyword and the search keyword into nouns;
extracting adjectives and nouns in each text message based on NLP, binding a noun for each adjective, and determining content evaluation information of the text message;
and analyzing and integrating all the content evaluation information to obtain streaming media evaluation information, and specially marking the evaluation content of the function keyword in the streaming media evaluation information.
2. The NLP-based streaming media content analysis method according to claim 1, wherein the step of processing the screened streaming media videos to determine text information corresponding to each streaming media video specifically comprises:
judging whether the screened streaming media video has subtitle information or not;
when subtitle information exists, performing text recognition on the subtitle information in the streaming media video to obtain text information;
when the subtitle information does not exist, acquiring the audio information of the streaming media video, and performing speech-to-text conversion on the audio information to obtain text information.
3. The NLP-based streaming media content analysis method according to claim 1, wherein the step of extracting adjectives and nouns in each piece of text information based on NLP specifically comprises:
determining the influence degree of a streaming media video author corresponding to the text information;
when the influence degree is smaller than or equal to the set influence value, extracting adjectives and nouns in the text information by using a word segmentation tool, and carrying out position marking on the extracted adjectives and nouns;
when the influence degree is larger than a set influence value, training corpus information is received, feature learning is conducted on the training corpus information based on the CNN-LSTM model to obtain an exclusive neural network model, text information is processed through the exclusive neural network model to obtain adjectives and nouns, and position marking is conducted on the obtained adjectives and nouns.
4. The NLP-based streaming media content analysis method of claim 3, wherein the step of binding a noun for each adjective and determining content rating information of the text information comprises:
binding a noun to each adjective according to the position marks, and determining the part of speech of each adjective, wherein the part of speech is one of a positive word, a negative word, and a neutral word;
classifying all adjectives according to nouns to obtain a plurality of categories, wherein nouns corresponding to each category are identical;
and determining a text evaluation value of the text information, wherein the text evaluation value = a × the number of positive words + b × the number of negative words + c × the number of neutral words, the categories and the text evaluation value form content evaluation information, and a, b, and c are all fixed values.
5. The NLP-based streaming media content analysis method according to claim 4, wherein the step of analyzing and integrating all the content evaluation information to obtain streaming media evaluation information specifically comprises:
integrating the categories in all the content evaluation information, and merging the categories corresponding to the same noun;
retrieving the influence degree of the streaming media video author corresponding to each text evaluation value;
and determining an overall evaluation value, wherein the overall evaluation value = Σ (text evaluation value × influence degree), and the integrated categories and the overall evaluation value form the streaming media evaluation information.
6. An NLP-based streaming media content analysis system, the system comprising:
the streaming media video determining module is used for receiving a search keyword input by a user and determining matched streaming media videos according to the search keyword;
the text information acquisition module is used for screening the streaming media videos according to the heat value, processing the screened streaming media videos and determining text information corresponding to each streaming media video;
a function keyword input module, used for receiving a function keyword input by the user and generalizing the function keyword and the search keyword into nouns;
the adjective noun determining module is used for extracting adjectives and nouns in each piece of text information based on NLP, binding a noun for each adjective, and determining content evaluation information of the text information;
and the streaming media evaluation information module is used for analyzing and integrating all the content evaluation information to obtain streaming media evaluation information, and specially marking the evaluation content of the functional keywords in the streaming media evaluation information.
7. The NLP-based streaming media content analysis system of claim 6, wherein the text information acquisition module comprises:
the subtitle information judging unit is used for judging whether the screened streaming media video has subtitle information or not;
the first text information unit is used for carrying out text recognition on the subtitle information in the streaming media video to obtain text information when the subtitle information exists;
and the second text information unit is used for acquiring the audio information of the streaming media video when the subtitle information does not exist, and performing speech-to-text conversion on the audio information to obtain text information.
8. The NLP-based streaming media content analysis system of claim 6, wherein the adjective noun determination module comprises:
the influence degree determining unit is used for determining the influence degree of the streaming media video author corresponding to the text information;
a first adjective noun unit, used for, when the influence degree is smaller than or equal to a set influence value, extracting adjectives and nouns in the text information by using a word segmentation tool, and carrying out position marking on the extracted adjectives and nouns;
and the second adjective noun unit is used for receiving training corpus information when the influence degree is larger than the set influence value, performing feature learning on the training corpus information based on the CNN-LSTM model to obtain an exclusive neural network model, processing text information through the exclusive neural network model to obtain adjectives and nouns, and performing position marking on the obtained adjectives and nouns.
9. The NLP-based streaming media content analysis system of claim 8, wherein the adjective noun determination module further comprises:
an adjective noun binding unit, configured to bind a noun to each adjective according to the position marks, and determine a part of speech of each adjective, where the part of speech is one of a positive word, a negative word, and a neutral word;
the adjective classification unit is used for classifying all adjectives according to nouns to obtain a plurality of categories, and nouns corresponding to each category are the same;
and the text evaluation value unit is used for determining a text evaluation value of the text information, wherein the text evaluation value=a×the number of the positive words+b×the number of the negative words+c×the number of the neutral words, the category and the text evaluation value form content evaluation information, and a, b and c are all constant values.
10. The NLP-based streaming media content analysis system of claim 9, wherein the streaming media evaluation information module comprises:
the category integrating unit is used for integrating the categories in all the content evaluation information and combining the categories corresponding to the same noun;
the influence degree calling unit is used for calling the influence degree of the streaming media video author corresponding to each text evaluation value;
and the overall evaluation value unit is used for determining an overall evaluation value, wherein the overall evaluation value = Σ (text evaluation value × influence degree), and the integrated categories and the overall evaluation value form the streaming media evaluation information.
CN202310554226.5A 2023-05-17 2023-05-17 NLP-based streaming media content analysis method and system Active CN116320621B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310554226.5A CN116320621B (en) 2023-05-17 2023-05-17 NLP-based streaming media content analysis method and system

Publications (2)

Publication Number Publication Date
CN116320621A true CN116320621A (en) 2023-06-23
CN116320621B CN116320621B (en) 2023-08-04

Family

ID=86794504

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310554226.5A Active CN116320621B (en) 2023-05-17 2023-05-17 NLP-based streaming media content analysis method and system

Country Status (1)

Country Link
CN (1) CN116320621B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103455562A (en) * 2013-08-13 2013-12-18 西安建筑科技大学 Text orientation analysis method and product review orientation discriminator on basis of same
US20150186790A1 (en) * 2013-12-31 2015-07-02 Soshoma Inc. Systems and Methods for Automatic Understanding of Consumer Evaluations of Product Attributes from Consumer-Generated Reviews
CN112991017A (en) * 2021-03-26 2021-06-18 刘秀萍 Accurate recommendation method for label system based on user comment analysis
CN114970494A (en) * 2021-02-25 2022-08-30 腾讯科技(北京)有限公司 Comment generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN116320621B (en) 2023-08-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant