CN115065872A

CN115065872A - Intelligent recommendation method and system for video and audio

Info

Publication number: CN115065872A
Application number: CN202210721978.1A
Authority: CN
Inventors: 王建方; 刘峰; 邓正豪
Original assignee: China Unicom Online Information Technology Co Ltd; China Unicom WO Music and Culture Co Ltd
Current assignee: China Unicom Online Information Technology Co Ltd; China Unicom WO Music and Culture Co Ltd
Priority date: 2022-06-17
Filing date: 2022-06-17
Publication date: 2022-09-16

Abstract

The invention discloses an intelligent audio-video recommendation method and system, belonging to the technical field of audio-video recommendation. The system comprises an analysis cloud processing center, which acquires the audio-video content of a user side through a cloud platform and establishes analysis tasks. After features are extracted from the audio-video content, they are fused with the user portrait information, and the established neural network model performs weight analysis and feature-analysis recombination fitting on the extracted user features, so that audio-video preferences can be located quickly and the recommendation effect is improved. After incremental judgment of the extracted features, content matching the user's preferences can be recommended. In addition, after a recommendation is made, the learning increment module and the information database unit cyclically update the feature weights, and the stay time judgment module allows the effect of each recommendation to be judged after feature identification.

Description

Intelligent recommendation method and system for video and audio

Technical Field

The invention belongs to the technical field of audio-video recommendation, and particularly relates to an intelligent audio-video recommendation method and system.

Background

With the development of communication technology, short videos have become an everyday form of entertainment. Because the number of available videos is very large, short-video platforms recommend different videos to different users in order to improve viewing efficiency and the viewing experience. In the face of this information explosion, solving the problem of information overload is one of the main research challenges for data researchers and engineers in the industry, and recommending audio-video content requires a recommendation algorithm. An algorithm is an accurate and complete description of a solution to a problem: a clear, finite sequence of instructions that represents a systematic strategy for describing and solving the problem. In other words, for input that meets a given specification, an algorithm produces the required output within a finite time. Different algorithms may accomplish the same task with different time, space, or efficiency. An algorithm is not good or bad in itself; it is usually evaluated by the space complexity and time complexity it requires to solve the problem. Algorithms are roughly classified into basic algorithms, data structure algorithms, number theory and algebraic algorithms, computational geometry algorithms, graph theory algorithms, and so on, and they have been widely developed and applied along with the development of computers.

Chinese patent CN108134950B discloses an intelligent video recommendation method intended to solve the problem that, in the prior art, related videos are recommended without properly judging whether the user likes or dislikes the videos already viewed. The method comprises the steps of: S1: judging, according to the user's video playing record, whether the playing time of each of a plurality of videos is within a preset time range; S2: marking the preset labels corresponding to the plurality of videos; S3: establishing a preset video recommendation model; S4: in the video library, calculating the user preference corresponding to each video according to the established preset video recommendation model, ranking the videos in the video library by preference, and displaying a preset number of top-ranked videos in a video recommendation column. This method can establish a more intelligent and personalized video recommendation model from the user's viewing record and recommend videos the user is more likely to enjoy. In actual use, however, it lacks the ability to recognize and process video images during user feature recognition: judgment relies only on the recorded playing time, which limits the accuracy of the judgment, so the recognition and processing requirements cannot be well met.

Disclosure of Invention

The purpose of the invention is to provide an intelligent audio-video recommendation method and system that solve the following problems: the ability to recognize and process video images during user feature recognition is lacking, and judgment is made only from the recorded playing time, which limits the accuracy of the judgment, so that the recognition and processing requirements cannot be well met.

In order to achieve the purpose, the invention adopts the following technical scheme:

An intelligent audio-video recommendation system comprises an analysis cloud processing center, which is used for acquiring the audio-video content of a user end through a cloud platform and establishing an analysis task. The input end of the analysis cloud processing center is electrically connected with the output end of an audio analysis module, which is used for analyzing the audio frequency coefficients of the audio-video content to assist matching. The output end of the analysis cloud processing center is electrically connected with the input end of a multi-stage feature extraction unit, which is used for extracting multi-stage features from the video. The output end of the analysis cloud processing center is also electrically connected with the input end of an intelligent recommendation matching unit, which is used for performing intelligent recommendation matching on the analyzed features. The output end of the intelligent recommendation matching unit is electrically connected with the input end of a user preference establishing module, and the input end of the user preference establishing module is also electrically connected with the output end of the multi-stage feature extraction unit.

As a further description of the above technical solution:

The output end of the user preference establishing module is electrically connected with the input end of an information database unit, which is used for recording user preference information and the user portrait. The output end of the information database unit is electrically connected with the input end of a learning increment module, which is used for incrementally learning extended preference features for recommendation, and the output end of the learning increment module is electrically connected with the input end of the multi-stage feature extraction unit.
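The feedback from the information database unit through the learning increment module can be pictured as a small incremental update of a stored preference profile. The following is only a sketch under stated assumptions: the function name `update_preferences`, the dictionary representation of the profile, the `learning_rate` parameter, and the moving-average update rule are all hypothetical and are not specified by the source.

```python
def update_preferences(profile: dict[str, float],
                       watched_features: dict[str, float],
                       stay_weight: float,
                       learning_rate: float = 0.2) -> dict[str, float]:
    """Move each stored preference toward the features of the watched video.

    stay_weight (0..1) scales the step, so a video the user stayed on longer
    changes the profile more. The update rule is an illustrative assumption.
    """
    rate = learning_rate * max(0.0, min(1.0, stay_weight))
    updated = dict(profile)  # leave the stored profile untouched
    for name, value in watched_features.items():
        old = updated.get(name, 0.0)  # unseen features start at zero
        updated[name] = old + rate * (value - old)
    return updated
```

Returning a new dictionary instead of mutating the stored one keeps the database record unchanged until the caller decides to write the update back.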

As a further description of the above technical solution:

The input end of the analysis cloud processing center is electrically connected with the output end of a stay time judgment module, which is used for providing video stay weighting information and a trigger judgment threshold.
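As an illustration of how stay time might be turned into a weight and a trigger decision, consider the sketch below. The source does not say how the weighting or the threshold is computed; the watched-fraction weighting, the default threshold of 0.3, and both function names are assumptions made for this example.

```python
def stay_weight(stay_seconds: float, video_seconds: float) -> float:
    """Return the watched fraction of the video, clamped to the range 0..1."""
    if video_seconds <= 0:
        raise ValueError("video_seconds must be positive")
    return max(0.0, min(1.0, stay_seconds / video_seconds))


def triggers_analysis(stay_seconds: float, video_seconds: float,
                      threshold: float = 0.3) -> bool:
    """Report whether the stay time is long enough to start feature extraction."""
    return stay_weight(stay_seconds, video_seconds) >= threshold
```

For example, `triggers_analysis(30, 60)` is true under the assumed threshold, while `triggers_analysis(5, 60)` is not.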

As a further description of the above technical solution:

the output end of the audio analysis module is electrically connected with the input end of the audio library module, and the audio library module is used for storing an audio library.

As a further description of the above technical solution:

The multi-stage feature extraction unit comprises a motion feature comparison and extraction module, which is used for extracting motion features. The output end of the motion feature comparison and extraction module is connected with a shape feature extraction module, a color feature extraction module, and a texture feature extraction module. The shape feature extraction module is used for extracting the overall shape features in key frames, the color feature extraction module is used for extracting color changes in images, and the texture feature extraction module is used for extracting texture features in images.
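The motion stage that feeds the other extractors can be sketched with simple frame differencing. This technique is swapped in for illustration only: the source does not state how motion between frames is compared. The `Frame` representation, the function names, and the default threshold are likewise assumptions.

```python
Frame = list[list[int]]  # grayscale image as rows of 0-255 intensities


def motion_score(prev: Frame, curr: Frame) -> float:
    """Mean absolute intensity difference between two equally sized frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    return total / count


def moving_frames(frames: list[Frame], threshold: float = 10.0) -> list[int]:
    """Indices of frames that differ from their predecessor by more than threshold."""
    return [i for i in range(1, len(frames))
            if motion_score(frames[i - 1], frames[i]) > threshold]
```

Frames selected this way would then be passed to the shape, color, and texture extractors.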

As a further description of the above technical solution:

The output end of the motion feature comparison and extraction module is electrically connected with the input end of a face recognition module, which is used for face recognition and positioning according to the shape features.

As a further description of the above technical solution:

The intelligent recommendation matching unit comprises a neural network model, which is used for intelligent identification through a neural network. The output end of the neural network model is electrically connected with the input end of a weight analysis module, which is used for analyzing the weights of the corresponding features. The output end of the weight analysis module is electrically connected with the input end of a feature analysis module, which is used for performing weighted analysis of the features according to those weights. The output end of the feature analysis module is electrically connected with the input end of a preference matching module, which is used for preference matching on the extracted features, and with the input end of a cluster fitting module, which is used for clustering the features.
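The weighting and preference-matching stages can be sketched as follows. This is not the neural network model of the source: cosine similarity over weighted feature dictionaries is used here purely as a simple stand-in, and the names `apply_weights`, `cosine_similarity`, and `rank_candidates` are hypothetical.

```python
import math

Features = dict[str, float]


def apply_weights(features: Features, weights: Features) -> Features:
    """Scale each feature by its weight; features without a weight keep weight 1.0."""
    return {name: value * weights.get(name, 1.0) for name, value in features.items()}


def cosine_similarity(a: Features, b: Features) -> float:
    """Cosine of the angle between two sparse feature vectors (0.0 if either is zero)."""
    dot = sum(value * b.get(name, 0.0) for name, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(profile: Features, candidates: dict[str, Features],
                    weights: Features) -> list[str]:
    """Return candidate video IDs ordered from best to worst preference match."""
    weighted_profile = apply_weights(profile, weights)
    scores = {
        video_id: cosine_similarity(weighted_profile, apply_weights(features, weights))
        for video_id, features in candidates.items()
    }
    return sorted(scores, key=lambda video_id: scores[video_id], reverse=True)
```

Raising the weight of a feature makes it dominate the similarity score, which is one plausible reading of weighted analysis followed by preference matching.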

As a further description of the above technical solution:

An intelligent audio-video recommendation method specifically comprises the following steps:

S1, establishing an information database: an information database is established using the data ID of the audio-video user to whom recommendations are to be made, basic preference information is entered from the data filled in by the client, and images for feature extraction are selected and acquired according to the user's stay time and preferences in the video system;

S2, multi-stage extraction of image features: the video is partitioned to extract feature frames, and image features are extracted from those feature frames;

S3, extracting sound features: fuzzy matching is performed against the audio library module according to the audio analysis result, and a sound weight is output after the sound features are matched;

S4, intelligent matching: the extracted image features and sound features are input into the intelligent recommendation matching unit, which identifies the corresponding features; after weighting and incrementing according to the matched features, the recommended data in the database is output.
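The fuzzy matching of step S3 might look like the nearest-neighbor lookup sketched below. The spectral-coefficient lists, the Euclidean distance, the `tolerance` parameter, and the linear weight formula are all assumptions chosen for illustration; the source does not describe the matching rule.

```python
import math
from typing import Optional


def fuzzy_match(spectrum: list[float], library: dict[str, list[float]],
                tolerance: float = 0.5) -> tuple[Optional[str], float]:
    """Return (label, weight) of the closest library entry within tolerance.

    The weight is 1.0 for an exact match and falls linearly to 0.0 at the
    tolerance; (None, 0.0) is returned when nothing is close enough.
    """
    best_label: Optional[str] = None
    best_distance = math.inf
    for label, reference in library.items():
        if len(reference) != len(spectrum):
            continue  # skip entries with a different number of coefficients
        distance = math.dist(spectrum, reference)
        if distance < best_distance:
            best_label, best_distance = label, distance
    if best_label is None or best_distance > tolerance:
        return None, 0.0
    return best_label, 1.0 - best_distance / tolerance
```

The returned weight could serve as the sound weight that step S4 combines with the image features.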

As a further description of the above technical solution:

In S2, the key frame features are extracted at an interval of 2-4 s.
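Reading the 2-4 s figure as the sampling interval between key frames (an interpretation, since the text does not spell this out), the frame indices to extract could be computed as in the sketch below. The function name, its parameters, and the 3 s default are hypothetical.

```python
def keyframe_indices(duration_s: float, fps: float,
                     interval_s: float = 3.0) -> list[int]:
    """Frame indices sampled once every interval_s seconds, starting at frame 0."""
    if not 2.0 <= interval_s <= 4.0:
        raise ValueError("interval_s is expected to lie in the 2-4 s range")
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration_s must be non-negative and fps positive")
    total_frames = int(duration_s * fps)
    step = max(1, round(interval_s * fps))  # frames between two key frames
    return list(range(0, total_frames, step))
```

For a 10 s clip at 25 frames per second with a 2 s interval, this selects frames 0, 50, 100, 150, and 200.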

As a further description of the above technical solution:

The user data ID in S1 further includes a login key extracted for the corresponding user ID.

In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:

1. In the invention, after features are extracted from the audio-video content, they are fused with the user portrait information, and the established neural network model performs weight analysis and feature-analysis recombination fitting on the extracted user features, so that audio-video preferences can be located quickly and the recommendation effect is improved. After incremental judgment of the extracted features, content matching the user's preferences can be recommended. In addition, after a recommendation is made, the learning increment module and the information database unit cyclically update the feature weights, and the stay time judgment module allows the effect of each recommendation to be judged after feature identification.

2. In the invention, through the designed audio analysis module and audio library module, the audio analysis module extracts and acquires audio spectrum data, and the audio data in the audio-video content is quickly extracted and judged by matching it against the large volume of audio data held in the audio library module, which improves recommendation accuracy.

3. In the invention, through the designed multi-stage feature extraction unit, moving-image features can be rapidly extracted and identified by distinguishing the moving images between key frames. Matching the shape features among the image features helps determine the image feature vectors, and the face recognition module can perform face recognition on the image features obtained from motion. The corresponding threshold features of the images in the audio-video content can thus be obtained quickly, which helps the intelligent matching unit identify and locate the features the user prefers and improves the processing effect.

Drawings

Fig. 1 is a system block diagram of an intelligent audio-video recommendation system according to the present invention;

Fig. 2 is a logic diagram of the multi-stage feature extraction unit of an intelligent audio-video recommendation system according to the present invention;

Fig. 3 is a logic diagram of the intelligent matching unit of an intelligent audio-video recommendation system according to the present invention;

Fig. 4 is a flowchart of an intelligent audio-video recommendation method according to the present invention.

Illustration of the drawings:

1. analysis cloud processing center; 2. audio analysis module; 3. multi-stage feature extraction unit; 301. motion feature comparison and extraction module; 302. shape feature extraction module; 303. color feature extraction module; 304. texture feature extraction module; 305. face recognition module; 4. intelligent recommendation matching unit; 401. weight analysis module; 402. neural network model; 403. feature analysis module; 404. preference matching module; 405. cluster fitting module; 5. user preference establishing module; 6. information database unit; 7. learning increment module; 8. stay time judgment module; 9. audio library module.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to Figs. 1-4, the present invention provides the following technical solution. An intelligent audio-video recommendation system comprises an analysis cloud processing center, which is used for acquiring the audio-video content of a user end through a cloud platform and establishing an analysis task. The input end of the analysis cloud processing center is electrically connected with the output end of an audio analysis module, which is used for analyzing the audio frequency coefficients of the audio-video content to assist matching. The output end of the analysis cloud processing center is electrically connected with the input end of a multi-stage feature extraction unit, which is used for extracting multi-stage features from the video. The output end of the analysis cloud processing center is also electrically connected with the input end of an intelligent recommendation matching unit, which is used for performing intelligent recommendation matching on the analyzed features. The output end of the intelligent recommendation matching unit is electrically connected with the input end of a user preference establishing module, and the input end of the user preference establishing module is also electrically connected with the output end of the multi-stage feature extraction unit.

The output end of the user preference establishing module is electrically connected with the input end of an information database unit, which is used for recording user preference information and the user portrait. The output end of the information database unit is electrically connected with the input end of a learning increment module, which is used for incrementally learning extended preference features for recommendation, and the output end of the learning increment module is electrically connected with the input end of the multi-stage feature extraction unit. The input end of the analysis cloud processing center is electrically connected with the output end of a stay time judgment module, which is used for providing video stay weighting information and a trigger judgment threshold. The output end of the audio analysis module is electrically connected with the input end of an audio library module, which is used for storing an audio library.
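The connections listed above can be written down as a directed graph, which makes the feedback loop through the learning increment module easy to see. The edges follow the description; the snake_case identifiers and the `reachable` helper are illustrative names introduced here, not part of the source.

```python
# Directed connections (output end -> input end) as listed in the description.
# The snake_case identifiers are illustrative names, not taken from the patent.
CONNECTIONS: dict[str, list[str]] = {
    "audio_analysis": ["analysis_cloud_center", "audio_library"],
    "stay_time_judgment": ["analysis_cloud_center"],
    "analysis_cloud_center": ["multi_stage_feature_extraction",
                              "recommendation_matching"],
    "multi_stage_feature_extraction": ["user_preference"],
    "recommendation_matching": ["user_preference"],
    "user_preference": ["information_database"],
    "information_database": ["learning_increment"],
    "learning_increment": ["multi_stage_feature_extraction"],
    "audio_library": [],
}


def reachable(graph: dict[str, list[str]], start: str) -> set[str]:
    """Return every module that can be reached by following connections from start."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

Because the multi-stage feature extraction unit can reach itself through the user preference establishing module, the information database unit, and the learning increment module, the graph contains the cyclic update path that the description relies on.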

The multi-stage feature extraction unit comprises a motion feature comparison and extraction module, which is used for extracting motion features. The output end of the motion feature comparison and extraction module is connected with a shape feature extraction module, a color feature extraction module, and a texture feature extraction module. The shape feature extraction module is used for extracting the overall shape features in the key frame. Shape descriptions fall into two types: contour-based and region-based. The former uses only the outer contour information of the shape, while the latter uses the region information of the whole shape. Specifically, the shape features include area, principal axis direction, moments, eccentricity, circularity, tangent angle, and the like. In the embodiment of the application, the judgment is made through a Fourier descriptor transformation.
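Assuming that the Fourier description mentioned above refers to Fourier descriptors of a closed contour (the source gives no formula), a contour-based shape signature could be computed as in this sketch. The function name, the point-list input, and the normalization by the first harmonic are assumptions of the example.

```python
import cmath


def fourier_descriptors(contour: list[tuple[float, float]], count: int) -> list[float]:
    """Return `count` normalized Fourier descriptor magnitudes of a closed contour.

    Points are treated as complex numbers x + iy. The DC term is dropped
    (translation invariance), magnitudes are divided by the first harmonic
    (scale invariance), and phases are discarded (rotation and start-point
    invariance).
    """
    n = len(contour)
    if n < 3:
        raise ValueError("a contour needs at least three points")
    z = [complex(x, y) for x, y in contour]
    # Plain discrete Fourier transform of the contour points.
    coeffs = [
        sum(z[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n)) / n
        for u in range(n)
    ]
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")
    return [abs(c) / scale for c in coeffs[2:2 + count]]
```

Two contours with the same shape but different position or size yield the same descriptors, which is what makes them usable for comparing shapes across key frames.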

Here, the goal is to extract, from an input time-series signal, the changing information that has invariant properties. Given an l-dimensional input signal:

$x(t) = (x_1(t), x_2(t), \ldots, x_l(t))^T$;

and a j-dimensional transformation function:

$g(x) = (g_1(x), g_2(x), \ldots, g_j(x))^T$;

if the transformation is linear, i.e.

$g_j(x) = w_j^T x$,

where $x$ is the input and $w_j$ is the weight vector, then, assuming the input has zero mean and unit variance, a suitable choice of weights satisfies the constraints, and the objective function can be written in terms of the weights.

Linear algebra then gives the weight vector $w$ that minimizes the objective function; with the constraints satisfied and the objective function minimized, the optimal extracted vector threshold feature is obtained.
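The objective function itself is not written out in the text. Assuming the passage follows the standard linear slow feature analysis formulation (an assumption: the source does not name the technique, and the symbols $y_j$, $\dot{x}$, and $\langle\cdot\rangle_t$ for time averaging are introduced here for illustration), the omitted steps would read roughly as follows:

```latex
\begin{aligned}
y_j(t) &= g_j(x(t)) = w_j^T x(t), \\
\min_{w_j}\ \Delta(y_j) &= \langle \dot{y}_j^2 \rangle_t
    = w_j^T \langle \dot{x}\,\dot{x}^T \rangle_t\, w_j, \\
\text{subject to}\quad \langle y_j \rangle_t &= 0, \qquad
    \langle y_j^2 \rangle_t = 1, \qquad
    \langle y_i y_j \rangle_t = 0 \ \ (i < j).
\end{aligned}
```

Under this reading, with zero-mean, unit-variance (whitened) input the constraints reduce to the $w_j$ being orthonormal, so the minimizing weight vectors are the eigenvectors of $\langle \dot{x}\,\dot{x}^T \rangle_t$ with the smallest eigenvalues.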

The color feature extraction module is used for extracting color changes in an image. During color extraction, the image can be indexed by its global color distribution: the number of pixels of each color is counted and a color (gray-level) histogram is constructed, so that images with similar overall color content can be retrieved. Alternatively, the image can be indexed by local color information together with some elementary geometric features: a color set, obtained by extracting spatially local color information, provides an effective index of color regions with which the color image is judged and identified. The texture feature extraction module is used for extracting texture features in the image. The output end of the motion feature comparison and extraction module is electrically connected with the input end of a face recognition module, which is used for face recognition and positioning according to the shape features.

The intelligent recommendation matching unit comprises a neural network model, which is used for intelligent identification through a neural network. The output end of the neural network model is electrically connected with the input end of a weight analysis module, and the neural network model recommends preferred content by means of a SIFT algorithm. The weight analysis module is used for analyzing the weights of the corresponding features. The output end of the weight analysis module is electrically connected with the input end of a feature analysis module, which is used for performing weighted analysis of the features according to those weights. The output end of the feature analysis module is electrically connected with the input end of a preference matching module, which is used for preference matching on the extracted features, and with the input end of a cluster fitting module, which is used for clustering the features.
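The global color indexing described above, which counts pixels per color and compares histograms, can be sketched as follows. The coarse RGB binning, the histogram intersection measure, and both function names are assumptions made for illustration; the source only says that a histogram is built and similar images are retrieved.

```python
def color_histogram(pixels: list[tuple[int, int, int]],
                    bins_per_channel: int = 4) -> list[float]:
    """Normalized joint RGB histogram with bins_per_channel ** 3 bins."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    bin_width = 256 / bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each 0-255 channel value to its bin index.
        ri, gi, bi = (int(c // bin_width) for c in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1: list[float], h2: list[float]) -> float:
    """Similarity in the range 0..1; 1.0 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

An image index could store one such histogram per key frame and retrieve the frames whose intersection with a query histogram is highest.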

S2, multi-stage extraction of image features: the video is partitioned to extract feature frames, and image features are extracted from those feature frames;

S4, intelligent matching: the extracted image features and sound features are input into the intelligent recommendation matching unit, which identifies the corresponding features; after weighting and incrementing according to the matched features, the recommended data in the database is output. The key frame features in S2 are extracted at an interval of 2-4 s, and the user data ID in S1 further includes a login key extracted for the corresponding user ID.

The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims

1. An intelligent recommendation system for audio-video is characterized by comprising an analysis cloud processing center, wherein the analysis cloud processing center is used for acquiring an audio-video of a user end through a cloud platform and establishing an analysis task, the input end of the analysis cloud processing center is electrically connected with the output end of an audio analysis module, the audio analysis module is used for analyzing audio frequency coefficients of the audio-video for assisting matching, the output end of the analysis cloud processing center is electrically connected with the input end of a multistage feature extraction unit, the multistage feature extraction unit is used for extracting multistage features in the video, the output end of the analysis cloud processing center is electrically connected with the input end of an intelligent recommendation matching unit, the intelligent recommendation matching unit is used for performing intelligent recommendation matching through the analyzed features, and the output end of the intelligent recommendation matching unit is electrically connected with the input end of a user preference establishment module, the input end of the user preference establishing module is electrically connected with the output end of the multistage feature extraction unit.

2. The system according to claim 1, wherein an output of the user preference creation module is electrically connected to an input of an information database unit, the information database unit is configured to record user preference information and record a user information representation, an output of the information database unit is electrically connected to an input of a learning increment module, the learning increment module is configured to increment a learning preference extension feature for recommendation, and an output of the learning increment module is electrically connected to an input of the multi-stage feature extraction unit.

3. The system according to claim 1, wherein an input end of the analysis cloud processing center is electrically connected to an output end of the stay time determination module, and the stay time determination module is configured to provide video stay weighting information and a trigger determination threshold.

4. The intelligent audio-visual video recommendation system according to claim 1, wherein the output end of the audio analysis module is electrically connected to the input end of an audio library module, and the audio library module is used for storing an audio library.

5. The system according to claim 1, wherein the multi-stage feature extraction unit comprises a motion feature comparison and extraction module, the motion feature comparison and extraction module is used for extracting motion features, an output end of the motion feature comparison and extraction module is connected with a shape feature extraction module, a color feature extraction module and a texture feature extraction module, the shape feature extraction module is used for extracting overall shape features in the key frame, the color feature extraction module is used for extracting color changes in the image, and the texture feature extraction module is used for extracting texture features in the image.

6. The system according to claim 1, wherein the output end of the motion feature comparison and extraction module is electrically connected to the input end of a face recognition module, and the face recognition module is configured to perform face recognition and positioning according to the shape features.

7. The system according to claim 1, wherein the intelligent recommendation matching unit comprises a neural network model used for intelligent recognition through a neural network, the output end of the neural network model is electrically connected with the input end of a weight analysis module, the weight analysis module is used for analyzing the weight of the corresponding feature, the output end of the weight analysis module is electrically connected with the input end of a feature analysis module, the feature analysis module is used for performing weighted analysis of the features according to the weight, the output end of the feature analysis module is electrically connected with the input end of a preference matching module, the preference matching module is used for performing preference matching through the extracted features, the output end of the feature analysis module is electrically connected with the input end of a cluster fitting module, and the cluster fitting module is used for clustering the features.

8. An intelligent audio-video recommendation method, applied to the intelligent audio-video recommendation system of any one of claims 1 to 7, characterized by comprising the following steps:

S1, establishing an information database: an information database is established using the data ID of the audio-video user to whom recommendations are to be made, basic preference information is entered from the data filled in by the client, and images for feature extraction are selected and acquired according to the user's stay time and preferences in the video system;

9. The method according to claim 8, wherein the key frame features in S2 are extracted at an interval of 2-4 s.

10. The method according to claim 8, wherein the user data ID in S1 further includes a login key extracted for the corresponding user ID.