CN107665261B - Video duplicate checking method and device - Google Patents

Video duplicate checking method and device Download PDF

Info

Publication number
CN107665261B
Authority
CN
China
Prior art keywords
video
feature
processing
module
key frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711008924.6A
Other languages
Chinese (zh)
Other versions
CN107665261A (en)
Inventor
黄君实
林敏
李东亮
陈强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201711008924.6A priority Critical patent/CN107665261B/en
Publication of CN107665261A publication Critical patent/CN107665261A/en
Application granted granted Critical
Publication of CN107665261B publication Critical patent/CN107665261B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a video duplicate checking method, which comprises the following steps: extracting key frames of the video to be checked for duplicates according to a preset rule; inputting each key frame into a preset feature extraction model to obtain the depth features respectively corresponding to the key frames; performing image feature pooling on the depth features respectively corresponding to the key frames to obtain the pooled depth features respectively corresponding to the key frames; integrating and encoding the pooled depth features respectively corresponding to the key frames to obtain the feature information of the video to be checked; post-processing the feature information of the video to be checked by at least one of the following processing modes to obtain the post-processed feature information of the video to be checked, wherein the processing modes comprise: feature dimension reduction processing and decorrelation processing; and performing video duplicate checking according to the post-processed feature information of the video to be checked. The method and the device provided by the embodiment of the invention are suitable for video duplicate checking.

Description

Video duplicate checking method and device
Technical Field
The invention relates to the technical field of multimedia, in particular to a method and a device for video duplicate checking.
Background
With the development of information technology and multimedia technology, various types of video websites have been developed, and some users or website managers will frequently upload some videos to the website for other users to download and view.
Therefore, a website receives a large number of uploaded videos, many of which are duplicates or highly similar. When the website ranks videos by view count in order to recommend them to users, the large number of duplicate or highly similar videos lowers the accuracy of the ranking, and therefore the accuracy of the videos recommended to users. The duplicates also hinder users in searching for and watching videos, so the user experience is poor.
Disclosure of Invention
In order to overcome the above technical problems or at least partially solve the above technical problems, the following technical solutions are proposed:
according to one aspect, an embodiment of the present invention provides a method for video duplicate checking, including:
extracting key frames of the video to be checked for duplicates according to a preset rule;
inputting each key frame into a preset feature extraction model to obtain depth features corresponding to each key frame;
performing image feature pooling on the depth features respectively corresponding to the key frames to obtain the pooled depth features respectively corresponding to the key frames;
integrating and encoding the pooled depth features respectively corresponding to the key frames to obtain the feature information of the video to be checked;
post-processing the feature information of the video to be checked by at least one of the following processing modes to obtain the post-processed feature information of the video to be checked, wherein the processing modes comprise: feature dimension reduction processing; and decorrelation processing; and
performing video duplicate checking according to the post-processed feature information of the video to be checked.
The preset feature extraction model is obtained by training a deep convolution neural network.
Further, before the step of inputting each key frame into a preset feature extraction model to obtain the depth features corresponding to each key frame, the method further includes:
respectively carrying out image preprocessing on each key frame, wherein the image preprocessing comprises at least one of the following items: performing size normalization processing and picture whitening processing;
the method comprises the following steps of inputting each key frame into a preset feature extraction model to obtain depth features corresponding to each key frame, wherein the steps comprise:
and inputting each key frame after image preprocessing into a preset feature extraction model to obtain the depth features corresponding to each key frame.
Specifically, the step of performing video duplicate checking according to the post-processed feature information of the video to be checked includes:
determining a video feature index of the video to be checked through Product Quantization according to the post-processed feature information of the video to be checked; and
performing video duplicate checking according to the video feature index of the video to be checked.
Specifically, the step of performing video duplicate checking includes:
judging whether the video feature indexes respectively corresponding to the videos are the same; and
if the same video feature index exists, determining that the videos corresponding to the same video feature index are duplicates of one another.
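The duplicate-determination step above amounts to grouping videos by identical feature index. A minimal sketch follows; the function name and the dict-based representation are illustrative assumptions, not part of the patent:

```python
from collections import defaultdict

def find_duplicate_groups(video_indexes):
    """Group video ids that share an identical video feature index.

    video_indexes: dict mapping video id -> feature index (any hashable value).
    Returns the groups (lists of video ids) containing more than one video,
    i.e. the sets of videos judged to be duplicates of one another.
    """
    groups = defaultdict(list)
    for vid, idx in video_indexes.items():
        groups[idx].append(vid)
    return [vids for vids in groups.values() if len(vids) > 1]
```

For example, `find_duplicate_groups({"a": 1, "b": 2, "c": 1})` reports `a` and `c` as duplicates because they share feature index 1.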
Further, still include: and determining the video to be deleted from the repeated videos, and deleting the video to be deleted.
According to another aspect, an embodiment of the present invention further provides an apparatus for video duplicate checking, including:
the extraction module is used for extracting key frames of the video to be checked for duplicates according to a preset rule;
the input module is used for inputting each key frame extracted by the extraction module into a preset feature extraction model to obtain depth features corresponding to each key frame;
the image feature pooling module is used for performing image feature pooling on the depth features respectively corresponding to the key frames to obtain the depth features respectively corresponding to the key frames after the pooling processing;
an integration coding module, for integrating and encoding the depth features respectively corresponding to the key frames pooled by the image feature pooling module to obtain the feature information of the video to be checked;
the post-processing module is used for post-processing the feature information of the video to be checked by at least one of the following processing modes to obtain the post-processed feature information of the video to be checked, wherein the processing modes comprise: feature dimension reduction processing; and decorrelation processing; and
the video duplicate checking module is used for performing video duplicate checking according to the feature information of the video to be checked post-processed by the post-processing module.
The preset feature extraction model is obtained by training a deep convolution neural network.
Further, the apparatus further comprises: an image preprocessing module;
the image preprocessing module is used for respectively preprocessing the images of the key frames, and the image preprocessing comprises at least one of the following items: performing size normalization processing and picture whitening processing;
and the input module is specifically used for inputting each key frame after image preprocessing into a preset feature extraction model to obtain the depth features corresponding to each key frame.
Specifically, the video duplication checking module comprises: the device comprises a determining unit and a video duplicate checking unit;
the determining unit is used for determining a video feature index of the video to be checked through Product Quantization according to the post-processed feature information of the video to be checked;
and the video duplicate checking unit is used for performing video duplicate checking according to the video feature index of the video to be checked determined by the determining unit.
Specifically, the video duplicate checking module is specifically configured to judge whether the video feature indexes respectively corresponding to the videos are the same;
the video duplicate checking module is specifically further configured to, when the same video feature index exists, determine that the videos corresponding to the same video feature index are duplicates of one another.
Further, the apparatus further comprises: a determining module and a deleting module;
the determining module is used for determining videos to be deleted from the repeated videos;
and the deleting module is used for deleting the video to be deleted.
According to yet another aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the above-mentioned video duplicate checking method.
Embodiments of the present invention also provide, according to yet another aspect, a computing device, comprising: a processor, a memory, a communication interface, and a communication bus, through which the processor, the memory, and the communication interface communicate with one another;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the video duplicate checking method.
The invention provides a method and a device for video duplicate checking, which extract key frames of the video to be checked for duplicates according to a preset rule, input each key frame into a preset feature extraction model to obtain the depth features respectively corresponding to the key frames, and perform image feature pooling on those depth features to obtain the pooled depth features respectively corresponding to the key frames; the pooled depth features are then integrated and encoded to obtain the feature information of the video to be checked, which is post-processed by at least one of the following processing modes to obtain the post-processed feature information, the processing modes comprising: feature dimension reduction processing and decorrelation processing; video duplicate checking is then performed according to the post-processed feature information of the video to be checked. By checking uploaded videos for duplicates, the method and the device can identify duplicate or highly similar videos among the uploaded videos, so that the accuracy with which websites rank videos can be improved.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a flowchart of a method for video duplicate checking according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of three pooling modes of an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus for video duplicate checking according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of another video duplicate checking apparatus in the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, a "terminal" as used herein includes both devices having a wireless signal receiver, which are devices having only a wireless signal receiver without transmit capability, and devices having receive and transmit hardware, which have devices having receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device having a single line display or a multi-line display or a cellular or other communication device without a multi-line display; PCS (Personal Communications Service), which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "terminal" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. As used herein, a "terminal Device" may also be a communication terminal, a web terminal, a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a Mobile phone with music/video playing function, or a smart tv, a set-top box, etc.
Example one
The invention provides a video duplicate checking method, as shown in fig. 1, comprising:
Step 101, extracting key frames of the video to be checked for duplicates according to a preset rule.
For the embodiment of the invention, the video to be checked is cut into a plurality of image frames, and one image frame is extracted at preset time intervals as a key frame.
For example, one image frame is extracted from the video to be checked every 2 seconds as a key frame of the video.
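The fixed-interval sampling above reduces to computing which frame indices to keep. A minimal sketch, with an illustrative function name; the 2-second default mirrors the example and is not mandated by the patent:

```python
def keyframe_indices(total_frames, fps, interval_s=2.0):
    """Return the indices of frames sampled every `interval_s` seconds.

    total_frames: number of frames in the video.
    fps: frame rate of the video.
    """
    # One key frame every round(fps * interval_s) frames, at least every frame.
    step = max(1, int(round(fps * interval_s)))
    return list(range(0, total_frames, step))
```

For a 4-second clip at 25 fps this yields frames 0 and 50, i.e. one key frame every 2 seconds.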
For the embodiment of the invention, the method for extracting key frames from the video to be checked according to the preset rule may alternatively be as follows: obtain the video frame images; extract multi-dimensional features from each image; perform an initial fuzzy C clustering of the images, taking the sample with the largest membership degree in each class as the cluster center; for each cluster center, select neighboring samples of the same class and of different classes; calculate the weight of each feature dimension through the ReliefF algorithm; weight each feature dimension and perform a second clustering to obtain the final clustering result; and select the frame closest to each cluster center as a key frame.
Further, if N video frame images are obtained in total, the sample set to be clustered is denoted as S = {X_1, X_2, …, X_N}, and for each image X_j (j = 1, 2, …, N) the M-dimensional features (x_1, x_2, …, x_M) are extracted. The initial fuzzy C clustering of the images then proceeds as follows:
Step 1, initialize the number of classes K according to the number of video frames at a ratio of 20:1, set the weighting exponent m, and randomly initialize the membership matrix, where μ_ij denotes the membership degree of the j-th image sample to the i-th class and the membership degrees of each sample over all classes sum to 1; L denotes the iteration count;
Step 2, calculate the cluster center of each class:
Z_i = Σ_{j=1}^{N} (μ_ij)^m X_j / Σ_{j=1}^{N} (μ_ij)^m (i = 1, 2, …, K);
Step 3, calculating a new membership matrix muij(L+1),
Figure BDA0001445064300000072
Wherein d (X)j,Zi) Indicating that image sample X was last completedjAnd the clustering center ZiThe Euclidean distance of;
Step 4, if |μ_ij(L+1) − μ_ij(L)| ≤ ε, the algorithm has converged and the criterion function
J = Σ_{i=1}^{K} Σ_{j=1}^{N} (μ_ij)^m d²(X_j, Z_i)
has reached its minimum, so clustering stops; otherwise, return to Step 2. Here ε is a specified threshold parameter.
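Steps 1 through 4 can be sketched as follows. This is a minimal fuzzy C-means implementation under stated assumptions: the function name, the default weighting exponent m = 2, the threshold ε = 1e-4, and the iteration cap are illustrative choices, not values fixed by the patent:

```python
import numpy as np

def fuzzy_c_means(X, K, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """Fuzzy C-means clustering of N samples X (N x d). Returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Step 1: random membership matrix; each sample's memberships sum to 1.
    U = rng.random((N, K))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Step 2: cluster centers Z_i = sum_j u_ij^m X_j / sum_j u_ij^m.
        Z = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distances d(X_j, Z_i); small epsilon avoids division by zero.
        d = np.linalg.norm(X[:, None, :] - Z[None, :, :], axis=2) + 1e-12
        # Step 3: new memberships, proportional to d^(-2/(m-1)), then normalized.
        U_new = 1.0 / (d ** (2.0 / (m - 1.0)))
        U_new /= U_new.sum(axis=1, keepdims=True)
        # Step 4: stop when the membership matrix changes by at most eps.
        converged = np.abs(U_new - U).max() <= eps
        U = U_new
        if converged:
            break
    return Z, U
```

In the key-frame pipeline, the frame with the largest membership in each class would then be taken as that class's center.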
Further, after the initial fuzzy C clustering of the images, a cluster partition is obtained, and the sample X_r with the maximum membership degree is selected as the center of each class. For the sample at the center of any class, q neighboring samples of the same class and q neighboring samples of each different class are found, and for any one-dimensional feature A the weight is updated according to the ReliefF weight update formula:
W(A) = W(A) − Σ_{j=1}^{q} diff(A, X_r, H_j) / (l·q) + Σ_{C ≠ class(X_r)} [ P(C) / (1 − P(class(X_r))) · Σ_{j=1}^{q} diff(A, X_r, M_j(C)) ] / (l·q),
where diff(A, R1, R2) denotes the difference between the two samples R1 and R2 on feature A, H_j denotes the j-th neighboring sample of the same class, M_j(C) denotes the j-th neighboring sample in a different class C (class C being a class different from that of X_r), P(C) denotes the probability of class C occurring, computed as the ratio of the number of samples in class C to the total number of samples, and similarly P(class(X_r)) is the ratio of the number of samples of the same class as X_r to the total number of samples; the difference on feature A can be computed as
diff(A, R1, R2) = |R1[A] − R2[A]| / (max(A) − min(A)),
and l denotes that the update is repeated l times, which yields the weight of each one-dimensional feature in the feature set.
Further, each feature dimension is weighted with the obtained weights and a second clustering is performed according to the initial fuzzy C clustering method above, the distance between samples being computed with the weighted Euclidean distance formula
d(X_j, Z_i) = sqrt( Σ_{k=1}^{M} λ_k (x_k − z_k)² ),
where x_k denotes the k-th attribute feature value of sample X_j, z_k denotes the k-th attribute feature value of center Z_i, and λ_k denotes the weight of the k-th feature attribute.
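The weighted Euclidean distance above can be sketched in a few lines; the function name is illustrative:

```python
import numpy as np

def weighted_euclidean(x, z, w):
    """d(x, z) = sqrt( sum_k w_k * (x_k - z_k)^2 ) for feature weights w."""
    x, z, w = np.asarray(x, float), np.asarray(z, float), np.asarray(w, float)
    return float(np.sqrt(np.sum(w * (x - z) ** 2)))
```

With unit weights this reduces to the ordinary Euclidean distance; larger λ_k values make the k-th feature dominate the distance.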
Further, according to the result of the second clustering, the sample with the maximum membership degree in each class is selected as the cluster-center image, i.e., the key frame.
Step 102, inputting each key frame into a preset feature extraction model to obtain the depth features corresponding to each key frame.
The preset feature extraction model is obtained by training a deep convolution neural network.
For example, the deep convolutional neural network is trained on 2 million material images in 21,000 categories to obtain the preset feature extraction model.
For the embodiment of the invention, each key frame is input into the trained deep convolutional neural network to obtain the probability that the key frame belongs to each of the 21,000 classes; alternatively, a representation of a preset dimension corresponding to the key frame is output, where the representation can characterize the application scene of the frame image, such as indoor, outdoor, sun, sky, and the like.
Step 103, performing image feature pooling on the depth features respectively corresponding to the key frames to obtain the pooled depth features respectively corresponding to the key frames.
For the embodiment of the invention, pooling aggregates the convolutional features on the basis of convolutional feature extraction, continually reducing the dimensionality of the convolutional features corresponding to the hidden nodes.
For the embodiment of the invention, features that are useful in one image region are highly likely to be equally applicable in another region. Thus, to describe a large image, one natural idea is to aggregate statistics of the features at different locations; for example, one can calculate the average (or maximum) of a particular feature over a region of the image. These summary statistics not only have much lower dimensionality (compared to using all extracted features) but also improve the results (less prone to overfitting). This aggregation operation is called pooling.
For embodiments of the invention, pooling may comprise: 1) mean-pooling, which averages the feature points in the neighborhood and better preserves the background; 2) max-pooling, which takes the maximum feature point in the neighborhood and better extracts texture; 3) stochastic pooling, which lies between the two: pixel points are assigned probabilities according to their values, and sub-sampling is then performed according to those probabilities.
The error of feature extraction mainly comes from two sources: (1) the limited size of the neighborhood increases the variance of the estimate; (2) errors in the convolutional layer parameters shift the estimated mean. In general, mean-pooling reduces the first error and preserves more of the image's background information, while max-pooling reduces the second error and preserves more texture information. Stochastic pooling behaves like mean-pooling on average while obeying the max-pooling criterion locally. The three pooling modes are shown in FIG. 2.
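The three pooling modes can be sketched over non-overlapping 2×2 windows as follows. The function name, window size, and fixed random seed for the stochastic mode are illustrative assumptions; stochastic pooling as written assumes non-negative feature values:

```python
import numpy as np

def pool2x2(fmap, mode="max", rng=None):
    """Pool a 2-D feature map over non-overlapping 2x2 windows."""
    h, w = fmap.shape
    # Group the map into (h//2, w//2) windows of 4 values each.
    blocks = fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    flat = blocks.transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)
    if mode == "mean":
        return flat.mean(axis=2)   # averages the neighborhood: keeps background
    if mode == "max":
        return flat.max(axis=2)    # takes the neighborhood maximum: keeps texture
    # Stochastic pooling: sample one value per window, with probability
    # proportional to its magnitude (assumes non-negative activations).
    rng = rng if rng is not None else np.random.default_rng(0)
    probs = flat / flat.sum(axis=2, keepdims=True)
    idx = np.array([[rng.choice(4, p=p) for p in row] for row in probs])
    return np.take_along_axis(flat, idx[..., None], axis=2)[..., 0]
```

On the window [[1, 2], [3, 4]], mean-pooling yields 2.5 and max-pooling yields 4, matching the descriptions above.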
Step 104, integrating and encoding the pooled depth features respectively corresponding to the key frames to obtain the feature information of the video to be checked.
For example, in step 101 three key frames, namely key frame 1, key frame 2 and key frame 3, are extracted from the video to be checked, and the feature information of the video to be checked is determined from the feature information respectively corresponding to key frame 1, key frame 2 and key frame 3.
Step 105, post-processing the feature information of the video to be checked by at least one of the following processing modes to obtain the post-processed feature information of the video to be checked.
Wherein, the processing mode comprises: performing feature dimension reduction processing; and (5) performing decorrelation processing.
For the embodiment of the invention, the feature dimension reduction processing is performed on the feature information of the video to be checked through a preset dimension reduction algorithm. Such dimension reduction algorithms include: Principal Component Analysis (PCA), Factor Analysis, and Independent Component Analysis (ICA). The embodiment of the invention takes PCA as the example for reducing the dimensionality of the feature information of the video to be checked. The idea of PCA is to map n-dimensional features onto k dimensions (k < n); the k new features, called principal components, are linear combinations of the old features, chosen so as to maximize the sample variance and to make the new k features as uncorrelated as possible.
For example, a 10000-dimensional feature information matrix may be reduced to 400 dimensions through the feature dimension reduction processing.
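The PCA mapping described above can be sketched via the singular value decomposition of the centered feature matrix. This is a generic sketch, not the patent's specific implementation, and the function name is illustrative:

```python
import numpy as np

def pca_reduce(X, k):
    """Project an N x n feature matrix X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)   # center each feature dimension
    # Right singular vectors of the centered data are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T      # N x k reduced features, mutually uncorrelated
```

The resulting k columns are uncorrelated with each other, which is exactly the property the text attributes to the principal components.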
For the embodiment of the invention, correlation exists between the image features of adjacent dimensions; when this correlation between the image features of adjacent dimensions is not wanted, decorrelation processing is performed on the feature information of the video to be checked. In the embodiment of the invention, applying both the feature dimension reduction processing and the decorrelation processing to the feature information of the video to be checked yields feature information of lower dimensionality and with less interference.
Step 106, performing video duplicate checking according to the post-processed feature information of the video to be checked.
For the embodiment of the invention, the feature information of the video to be checked is used to determine whether any online video has a high degree of association with that feature information, thereby realizing video duplicate checking.
The embodiment of the invention provides a video duplicate checking method, which extracts key frames of the video to be checked for duplicates according to a preset rule, inputs each key frame into a preset feature extraction model to obtain the depth features respectively corresponding to the key frames, and performs image feature pooling on those depth features to obtain the pooled depth features respectively corresponding to the key frames; the pooled depth features are then integrated and encoded to obtain the feature information of the video to be checked, which is post-processed by at least one of the following processing modes to obtain the post-processed feature information, the processing modes comprising: feature dimension reduction processing and decorrelation processing; video duplicate checking is then performed according to the post-processed feature information of the video to be checked. By checking uploaded videos for duplicates, the embodiment of the invention can identify duplicate or highly similar videos among the uploaded videos, so that the accuracy with which websites rank videos can be improved.
Example two
Another possible implementation manner of the embodiment of the present invention further includes, on the basis of the operation shown in the first embodiment, the operation shown in the second embodiment, wherein,
Before step 102, the method further includes: respectively performing image preprocessing on each key frame, wherein the image preprocessing includes at least one of the following: size normalization processing and picture whitening processing.
For the embodiment of the invention, performing the size normalization processing and the picture whitening processing on each key frame improves the robustness of each key frame image.
For the embodiment of the present invention, resizing the image is performed by sampling; for example, five crops are taken from the image, one from the center and one from each of the four corners.
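The five-crop sampling above can be sketched as follows; the function name and crop-size parameters are illustrative:

```python
import numpy as np

def five_crops(img, ch, cw):
    """Return the four corner crops and the center crop of an H x W image.

    img: array of shape (H, W) or (H, W, C); ch, cw: crop height and width.
    """
    h, w = img.shape[:2]
    tops = [0, 0, h - ch, h - ch, (h - ch) // 2]
    lefts = [0, w - cw, 0, w - cw, (w - cw) // 2]
    return [img[t:t + ch, l:l + cw] for t, l in zip(tops, lefts)]
```

Each of the five crops has the same fixed size, so frames of differing resolutions all yield inputs of a uniform shape for the feature extraction model.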
For the embodiment of the invention, the final appearance of an image is affected by multiple factors such as the ambient illumination intensity, object reflectance, and the camera used for shooting. So that the image carries invariant information unaffected by these external factors, image whitening processing is applied; that is, to remove the effect of these factors, the pixel values are converted to zero mean and unit variance. Therefore, the pixel mean μ and the variance δ² of the original grayscale image P are first calculated by formula one and formula two.
Wherein formula one is:
μ = (1/N) Σ_{i=1}^{N} P_i,
and formula two is:
δ² = (1/N) Σ_{i=1}^{N} (P_i − μ)²,
where P_i denotes the i-th pixel value of the grayscale image and N is the number of pixels.
Then, each pixel value of the original grayscale image is transformed using μ and δ. For a color image, μ and δ² are calculated separately for each of the three channels, and the pixels of each channel are then converted separately according to formula three.
Wherein formula three is:

$$P_i' = \frac{P_i - \mu}{\delta}$$
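The whitening described by formulas one to three can be sketched as follows (a minimal illustration; the per-channel handling of color images follows the text, while the small epsilon guarding against zero variance is our own addition):

```python
import numpy as np

def whiten(img: np.ndarray) -> np.ndarray:
    """Whitening per formulas one to three: zero mean, unit variance.

    `img` is H x W (grayscale) or H x W x C (color); for a color image the
    mean and variance are computed separately per channel, as described in
    the text. The 1e-8 floor avoids division by zero on constant images.
    """
    img = img.astype(np.float64)
    if img.ndim == 2:                       # grayscale image
        mu, delta = img.mean(), img.std()
        return (img - mu) / max(delta, 1e-8)
    # color image: whiten each channel independently
    mu = img.mean(axis=(0, 1), keepdims=True)
    delta = img.std(axis=(0, 1), keepdims=True)
    return (img - mu) / np.maximum(delta, 1e-8)
```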
step 102 specifically includes: and inputting each key frame after image preprocessing into a preset feature extraction model to obtain the depth features corresponding to each key frame.
EXAMPLE III
In another possible implementation manner of the embodiment of the present invention, the operations shown in the third embodiment are performed on the basis of the operations shown in the first embodiment, wherein:
step 106 comprises: determining a video feature index of the video to be checked through Product Quantization according to the post-processed feature information of the video to be checked; and performing video duplicate checking according to the video feature index of the video to be checked.
For the embodiment of the invention, Product Quantization comprises two processes: grouped quantization of the features and a Cartesian product over the resulting categories. Given a data set and a class number K, the K-means algorithm takes as its objective function the minimized sum of distances from all samples to their class centers, and iteratively optimizes this objective to obtain K class centers and the class to which each sample belongs. With the objective function unchanged, the method of product quantization is as follows:
(1) The data set is to be quantized into K categories; each sample is represented as a vector of dimension d, and the components of each vector are divided into m groups.
(2) Taking one group of components of all the vectors as a sub-data-set, the k-means algorithm is used to obtain $\sqrt[m]{K}$ class centers for that group. The k-means algorithm is thus run m times, once per group, so that each group has $\sqrt[m]{K}$ class centers; the $\sqrt[m]{K}$ class centers of each group form a set.
(3) Performing the Cartesian product of the m sets obtained above gives the class centers of the whole data set.
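The three steps above can be sketched as follows (an illustrative implementation under our own naming, not the patent's; `k_sub` stands for the m-th root of K, so the implicit full codebook, the Cartesian product of the m per-group center sets, has `k_sub**m` entries):

```python
import numpy as np

def product_quantize(X: np.ndarray, m: int, k_sub: int, iters: int = 20, seed: int = 0):
    """Split d-dimensional vectors into m groups of components, run plain
    k-means with k_sub centers on each group, and encode every vector by
    its m sub-center indices. Returns the codes and the m codebooks.
    """
    X = np.asarray(X, dtype=np.float64)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assert d % m == 0, "dimension must split evenly into m groups"
    sub = d // m
    codes = np.empty((n, m), dtype=np.int64)
    codebooks = []
    for g in range(m):
        S = X[:, g * sub:(g + 1) * sub]                   # one group of components
        C = S[rng.choice(n, size=k_sub, replace=False)]   # initial centers
        for _ in range(iters):                            # Lloyd iterations
            dist = ((S[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
            assign = dist.argmin(axis=1)
            for j in range(k_sub):
                members = S[assign == j]
                if len(members):
                    C[j] = members.mean(axis=0)
        # final assignment against the updated centers
        dist = ((S[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
        codes[:, g] = dist.argmin(axis=1)
        codebooks.append(C)
    return codes, codebooks
```

Two vectors that receive identical codes across all m groups fall into the same cell of the product codebook, which is what the duplicate check below exploits.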
For the embodiment of the invention, the post-processed feature information of the video to be checked is subjected to Product Quantization to obtain the video feature index of the video to be checked, wherein the video feature index of the video to be checked is the correspondence between the video to be checked and the feature index.
For example, the videos to be checked include video 1, video 2, and video 3, whose video IDs are 001, 002, and 003, respectively, and whose video feature index values are 1, 2, and 1, respectively.
Further, performing video duplicate checking comprises: judging whether the video feature indexes respectively corresponding to the videos are the same; and if the same video feature index exists, determining that the videos corresponding to the same video feature index are duplicates.
For the embodiment of the invention, if the video feature indexes respectively corresponding to two videos are the same, the two videos are duplicate videos.
For example, the videos to be checked include video 1, video 2, and video 3, whose video IDs are 001, 002, and 003, and whose video feature index values are 1, 2, and 1, respectively. The video feature index values corresponding to video 1 and video 3 are both 1 (the video feature index values corresponding to two different videos are the same), so video 1 and video 3 are duplicate videos.
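The duplicate determination in this example can be sketched as follows (the mapping from video IDs to feature index values is taken from the example; the function name is illustrative):

```python
from collections import defaultdict

def find_duplicates(feature_index: dict) -> list:
    """Group video IDs by their video feature index value; any group that
    contains more than one video is a set of duplicate videos."""
    groups = defaultdict(list)
    for video_id, idx in feature_index.items():
        groups[idx].append(video_id)
    return [vids for vids in groups.values() if len(vids) > 1]

# the example from the text: videos 001 and 003 share feature index value 1
dups = find_duplicates({"001": 1, "002": 2, "003": 1})  # [["001", "003"]]
```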
For the embodiment of the invention, the video to be deleted is determined from the repeated videos, and the video to be deleted is deleted.
For the embodiment of the invention, if a plurality of duplicate videos exist among the online videos, the videos to be deleted are selected from the duplicate videos and deleted.
For the embodiment of the present invention, a video to be deleted is determined from the duplicate videos according to a preset principle, where the preset principle includes at least one of the following: the definition of the video, the release time of the video, the view count of the video, the click count of the video, and the download count of the video.
For example, the online videos include two duplicate videos, video 1 and video 3, where the download count of video 1 is 100 and the download count of video 3 is 1200; the video to be deleted is therefore video 1.
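A minimal sketch of this selection under the download-count principle (the function name and mapping are illustrative, not from the patent):

```python
def pick_video_to_delete(download_counts: dict) -> str:
    """Among a set of duplicate videos, keep the most-downloaded copy and
    mark the least-downloaded one for deletion. Download count is one of
    the preset principles named in the text."""
    return min(download_counts, key=download_counts.get)

# the example from the text: video 1 (100 downloads) is the one to delete
to_delete = pick_video_to_delete({"video 1": 100, "video 3": 1200})  # "video 1"
```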
For the embodiment of the invention, the video to be deleted is determined from the duplicate videos and deleted, so that when a user downloads a video from the online videos, the video to be downloaded can be determined and downloaded accurately. This reduces the repetition rate among the online videos, improves the accuracy of searching for the video to be downloaded, and improves the user experience.
For the embodiment of the invention, determining the video to be deleted from the duplicate videos and deleting it can also improve the accuracy of ranking the online videos; moreover, when videos are recommended to the user, they can be recommended more accurately, further improving the user experience.
For the embodiment of the invention, when videos are not checked for duplicates and a user searches for a desired video by keyword, the website may recommend duplicate videos or videos with high similarity to the user, or rank the unchecked videos before recommending them; the accuracy with which the website recommends and ranks videos is then lower, and the user experience is poorer. In the embodiment of the invention, the videos to be deleted are determined from the duplicate videos and deleted, so that when the user searches for a desired video by keyword, the desired video, or a ranking of related videos, can be recommended to the user more accurately, thereby improving the user experience.
An embodiment of the present invention provides a video duplicate checking device. As shown in fig. 3, the video duplicate checking device includes: an extraction module 31, an input module 32, an image feature pooling module 33, an integration coding module 34, a post-processing module 35, and a video duplicate checking module 36, wherein,
and the extraction module 31 is configured to extract the key frames of the video to be checked for duplicates according to a preset rule.
And an input module 32, configured to input each key frame extracted by the extraction module 31 into a preset feature extraction model, so as to obtain a depth feature corresponding to each key frame.
The preset feature extraction model is obtained by training a deep convolution neural network.
And an image feature pooling module 33, configured to perform image feature pooling on the depth features corresponding to the key frames, respectively, to obtain depth features corresponding to the key frames after the pooling.
And the integration coding module 34 is configured to integrate and code the depth features respectively corresponding to the key frames after the pooling by the image feature pooling module 33, so as to obtain the feature information of the video to be checked.
And the post-processing module 35 is configured to post-process the feature information of the video to be checked through at least one of the following processing manners to obtain the post-processed feature information of the video to be checked.
Wherein, the processing mode comprises: performing feature dimension reduction processing; and (5) performing decorrelation processing.
And the video duplicate checking module 36 is configured to perform video duplicate checking according to the feature information of the video to be checked after the post-processing by the post-processing module 35.
Further, as shown in fig. 4, the apparatus further includes: an image pre-processing module 41.
And an image preprocessing module 41, configured to perform image preprocessing on each key frame.
Wherein the image preprocessing comprises at least one of the following: size regularization processing and image whitening processing.
The input module 32 is specifically configured to input each of the preprocessed key frames into a preset feature extraction model, so as to obtain depth features corresponding to each of the key frames.
Further, as shown in fig. 4, the video duplication checking module 36 includes: a determining unit 361 and a video duplication checking unit 362.
The determining unit 361 is configured to determine the video feature index of the video to be checked through Product Quantization according to the post-processed feature information of the video to be checked.
The video duplicate checking unit 362 is configured to perform video duplicate checking according to the video feature index of the video to be checked, which is determined by the determining unit 361.
The video duplication checking module 36 is specifically configured to determine whether the video feature indexes respectively corresponding to the videos are the same.
The video duplication checking module 36 is further configured to determine, when the same video feature index exists, each video duplicate corresponding to the same video feature index.
Further, as shown in fig. 4, the apparatus further includes: a determination module 42 and a deletion module 43.
And a determining module 42, configured to determine, from the repeated videos, a video to be deleted.
And a deleting module 43, configured to delete the video to be deleted.
The embodiment of the invention provides a video duplicate checking device, which extracts key frames of a video to be checked for duplicates according to a preset rule, inputs each key frame into a preset feature extraction model to obtain a depth feature corresponding to each key frame, and performs image feature pooling on the depth features respectively corresponding to the key frames to obtain pooled depth features respectively corresponding to the key frames; the pooled depth features respectively corresponding to the key frames are then integrated and coded to obtain the feature information of the video to be checked, which is then post-processed through at least one of the following processing modes to obtain post-processed feature information of the video to be checked, wherein the processing modes comprise: performing feature dimension reduction processing; and performing decorrelation processing; and video duplicate checking is then performed according to the post-processed feature information of the video to be checked. By checking videos, for example uploaded videos, for duplicates, the embodiment of the invention can determine the repeated videos or the videos with high similarity among the uploaded videos, so that the accuracy with which a website ranks the videos can be improved.
The device for video duplicate checking provided by the embodiment of the present invention can implement the method embodiment provided above, and for specific function implementation, reference is made to the description in the method embodiment, which is not repeated herein.
The embodiment of the invention provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the video duplicate checking method described above is implemented.
The embodiment of the invention provides a computer-readable storage medium, wherein key frames of a video to be checked for duplicates are extracted according to a preset rule; each key frame is then input into a preset feature extraction model to obtain depth features respectively corresponding to the key frames; image feature pooling is performed on the depth features respectively corresponding to the key frames to obtain pooled depth features respectively corresponding to the key frames; the pooled depth features respectively corresponding to the key frames are then integrated and coded to obtain the feature information of the video to be checked, which is then post-processed through at least one of the following processing modes to obtain post-processed feature information of the video to be checked, wherein the processing modes comprise: performing feature dimension reduction processing; and performing decorrelation processing; and video duplicate checking is then performed according to the post-processed feature information of the video to be checked. By checking videos, for example uploaded videos, for duplicates, the embodiment of the invention can determine the repeated videos or the videos with high similarity among the uploaded videos, so that the accuracy with which a website ranks the videos can be improved.
The computer-readable storage medium provided in the embodiments of the present invention can implement the method embodiments provided above, and for specific function implementation, reference is made to the description in the method embodiments, which is not repeated herein.
An embodiment of the present invention provides a computing device, including: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the video duplicate checking method.
The embodiment of the invention provides computing equipment, wherein key frames of a video to be checked for duplicates are extracted according to a preset rule; each key frame is then input into a preset feature extraction model to obtain depth features respectively corresponding to the key frames; image feature pooling is performed on the depth features respectively corresponding to the key frames to obtain pooled depth features respectively corresponding to the key frames; the pooled depth features respectively corresponding to the key frames are then integrated and coded to obtain the feature information of the video to be checked, which is then post-processed through at least one of the following processing modes to obtain post-processed feature information of the video to be checked, wherein the processing modes comprise: performing feature dimension reduction processing; and performing decorrelation processing; and video duplicate checking is then performed according to the post-processed feature information of the video to be checked. By checking videos, for example uploaded videos, for duplicates, the embodiment of the invention can determine the repeated videos or the videos with high similarity among the uploaded videos, so that the accuracy with which a website ranks the videos can be improved.
The computing device provided in the embodiment of the present invention may implement the method embodiment provided above, and for specific function implementation, reference is made to the description in the method embodiment, which is not described herein again.
Those skilled in the art will appreciate that the present invention includes apparatus directed to performing one or more of the operations described in the present application. These devices may be specially designed and manufactured for the required purposes, or they may comprise known devices in general-purpose computers. These devices have stored therein computer programs that are selectively activated or reconfigured. Such a computer program may be stored in a device (e.g., computer) readable medium, including, but not limited to, any type of disk including floppy disks, hard disks, optical disks, CD-ROMs, and magnetic-optical disks, ROMs (Read-Only memories), RAMs (Random Access memories), EPROMs (Erasable Programmable Read-Only memories), EEPROMs (Electrically Erasable Programmable Read-Only memories), flash memories, magnetic cards, or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a bus. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. Those skilled in the art will appreciate that the computer program instructions may be implemented by a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the features specified in the block or blocks of the block diagrams and/or flowchart illustrations of the present disclosure.
Those of skill in the art will appreciate that various operations, methods, steps in the processes, acts, or solutions discussed in the present application may be alternated, modified, combined, or deleted. Further, various operations, methods, steps in the flows, which have been discussed in the present application, may be interchanged, modified, rearranged, decomposed, combined, or eliminated. Further, steps, measures, schemes in the various operations, methods, procedures disclosed in the prior art and the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also fall within the protection scope of the present invention.

Claims (14)

1. A method for video duplicate checking, comprising:
extracting key frames of a video to be checked for duplicates according to a preset rule;
respectively carrying out image preprocessing on each key frame, wherein the image preprocessing comprises at least one of the following steps: performing size normalization processing and picture whitening processing;
inputting each key frame into a preset feature extraction model to obtain depth features corresponding to each key frame;
performing image feature pooling on the depth features respectively corresponding to the key frames to obtain the depth features respectively corresponding to the key frames after the pooling treatment;
integrating and coding the depth features respectively corresponding to the key frames after the pooling processing to obtain feature information of the video to be checked for duplicates;
performing post-processing on the feature information of the video to be checked through at least one of the following processing modes to obtain post-processed feature information of the video to be checked, wherein the processing modes comprise: performing feature dimension reduction processing; performing decorrelation processing;
and performing video duplicate checking according to the post-processed feature information of the video to be checked.
2. The method of claim 1, wherein the predetermined feature extraction model is obtained by training a deep convolutional neural network.
3. The method according to claim 1 or 2, wherein the step of inputting each key frame into a preset feature extraction model to obtain the depth features corresponding to each key frame comprises:
and inputting each key frame after image preprocessing into a preset feature extraction model to obtain the depth features corresponding to each key frame.
4. The method according to claim 1, wherein the step of performing video duplicate checking according to the post-processed feature information of the video to be checked comprises:
determining the video feature index of the video to be checked through Product Quantization according to the post-processed feature information of the video to be checked;
and performing video duplicate checking according to the video feature index of the video to be checked.
5. The method of claim 4, wherein the video duplication checking comprises:
judging whether video feature indexes respectively corresponding to the videos are the same or not;
and if the same video feature index exists, determining each video repetition corresponding to the same video feature index.
6. The method of claim 5, further comprising:
and determining the video to be deleted from the repeated videos, and deleting the video to be deleted.
7. An apparatus for video duplicate checking, comprising:
the extraction module is used for extracting the key frames of the video to be checked for duplicates according to a preset rule;
the input module is used for inputting each key frame extracted by the extraction module into a preset feature extraction model to obtain the depth feature corresponding to each key frame;
the image feature pooling module is used for performing image feature pooling on the depth features respectively corresponding to the key frames to obtain the depth features respectively corresponding to the key frames after the pooling processing;
the integration coding module is used for integrating and coding the depth features respectively corresponding to the key frames after the pooling processing by the image feature pooling module, so as to obtain feature information of the video to be checked for duplicates;
the post-processing module is configured to perform post-processing on the feature information of the video to be checked through at least one of the following processing manners to obtain post-processed feature information of the video to be checked, where the processing manners include: performing feature dimension reduction processing; performing decorrelation processing;
the video duplicate checking module is used for performing video duplicate checking according to the feature information of the video to be checked after the post-processing by the post-processing module;
the device further comprises: an image preprocessing module;
the image preprocessing module is configured to perform image preprocessing on each of the key frames, where the image preprocessing includes at least one of: a regularization size process and a picture whitening process.
8. The apparatus of claim 7, wherein the predetermined feature extraction model is obtained by training a deep convolutional neural network.
9. The apparatus according to claim 7 or 8,
the input module is specifically configured to input each preprocessed key frame into a preset feature extraction model, so as to obtain depth features corresponding to each key frame.
10. The apparatus of claim 7, wherein the video review module comprises: the device comprises a determining unit and a video duplicate checking unit;
the determining unit is used for determining the video feature index of the video to be checked for duplicates through Product Quantization according to the post-processed feature information of the video to be checked;
and the video duplicate checking unit is used for performing video duplicate checking according to the video feature index of the video to be checked, which is determined by the determining unit.
11. The apparatus of claim 10,
the video duplicate checking module is specifically used for judging whether video feature indexes respectively corresponding to the videos are the same or not;
the video duplication checking module is specifically configured to determine, when the same video feature index exists, each video duplicate corresponding to the same video feature index.
12. The apparatus of claim 11, further comprising: a determining module and a deleting module;
the determining module is used for determining the video to be deleted from the repeated videos;
and the deleting module is used for deleting the video to be deleted.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the method of any one of claims 1-6.
14. A computing device, comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the video duplicate checking method according to any one of claims 1-6.
CN201711008924.6A 2017-10-25 2017-10-25 Video duplicate checking method and device Active CN107665261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711008924.6A CN107665261B (en) 2017-10-25 2017-10-25 Video duplicate checking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711008924.6A CN107665261B (en) 2017-10-25 2017-10-25 Video duplicate checking method and device

Publications (2)

Publication Number Publication Date
CN107665261A CN107665261A (en) 2018-02-06
CN107665261B true CN107665261B (en) 2021-06-18

Family

ID=61098062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711008924.6A Active CN107665261B (en) 2017-10-25 2017-10-25 Video duplicate checking method and device

Country Status (1)

Country Link
CN (1) CN107665261B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110324660B (en) * 2018-03-29 2021-01-19 北京字节跳动网络技术有限公司 Method and device for judging repeated video
CN110727769B (en) * 2018-06-29 2024-04-19 阿里巴巴(中国)有限公司 Corpus generation method and device and man-machine interaction processing method and device
CN109325294B (en) * 2018-09-25 2023-08-11 云南电网有限责任公司电力科学研究院 Evidence characterization construction method for performance state of air preheater of thermal power generating unit
CN110163061B (en) * 2018-11-14 2023-04-07 腾讯科技(深圳)有限公司 Method, apparatus, device and computer readable medium for extracting video fingerprint
CN109684506B (en) * 2018-11-22 2023-10-20 三六零科技集团有限公司 Video tagging processing method and device and computing equipment
CN110020093A (en) * 2019-04-08 2019-07-16 深圳市网心科技有限公司 Video retrieval method, edge device, video frequency searching device and storage medium
CN110046279B (en) * 2019-04-18 2022-02-25 网易传媒科技(北京)有限公司 Video file feature prediction method, medium, device and computing equipment
CN110796088B (en) * 2019-10-30 2023-07-04 行吟信息科技(上海)有限公司 Video similarity judging method and device
CN111241344B (en) * 2020-01-14 2023-09-05 新华智云科技有限公司 Video duplicate checking method, system, server and storage medium
CN113065025A (en) * 2021-03-31 2021-07-02 厦门美图之家科技有限公司 Video duplicate checking method, device, equipment and storage medium
CN113761282B (en) * 2021-05-11 2023-07-25 腾讯科技(深圳)有限公司 Video duplicate checking method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530657A (en) * 2013-09-26 2014-01-22 华南理工大学 Deep learning human face identification method based on weighting L2 extraction
CN104408469A (en) * 2014-11-28 2015-03-11 武汉大学 Firework identification method and firework identification system based on deep learning of image
CN105095902A (en) * 2014-05-23 2015-11-25 华为技术有限公司 Method and apparatus for extracting image features
CN105138993A (en) * 2015-08-31 2015-12-09 小米科技有限责任公司 Method and device for building face recognition model
CN106021575A (en) * 2016-05-31 2016-10-12 北京奇艺世纪科技有限公司 Retrieval method and device for same commodities in video
CN106649663A (en) * 2016-12-14 2017-05-10 大连理工大学 Video copy detection method based on compact video representation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9116994B2 (en) * 2012-01-09 2015-08-25 Brightedge Technologies, Inc. Search engine optimization for category specific search results
TWI505113B (en) * 2014-03-18 2015-10-21 Vivotek Inc Monitoring system and related method of searching an image
CN107025267A (en) * 2017-03-01 2017-08-08 国政通科技股份有限公司 Based on the method and system for extracting Video Key logical message retrieval video

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103530657A (en) * 2013-09-26 2014-01-22 华南理工大学 Deep learning human face identification method based on weighting L2 extraction
CN105095902A (en) * 2014-05-23 2015-11-25 华为技术有限公司 Method and apparatus for extracting image features
CN104408469A (en) * 2014-11-28 2015-03-11 武汉大学 Firework identification method and firework identification system based on deep learning of image
CN105138993A (en) * 2015-08-31 2015-12-09 小米科技有限责任公司 Method and device for building face recognition model
CN106021575A (en) * 2016-05-31 2016-10-12 北京奇艺世纪科技有限公司 Retrieval method and device for same commodities in video
CN106649663A (en) * 2016-12-14 2017-05-10 大连理工大学 Video copy detection method based on compact video representation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Self-embedded video authentication watermarking based on ICA moving object detection; Zhang Ting et al.; Computer Engineering and Design (《计算机工程与设计》); 2011-04-16; pp. 1370-1373, 1390 *

Also Published As

Publication number Publication date
CN107665261A (en) 2018-02-06

Similar Documents

Publication Publication Date Title
CN107665261B (en) Video duplicate checking method and device
Hao et al. Two-stream deep architecture for hyperspectral image classification
US20190130212A1 (en) Deep Network Embedding with Adversarial Regularization
Lin et al. Hyperspectral image denoising via matrix factorization and deep prior regularization
CN107705805B (en) Audio duplicate checking method and device
CN113378632A (en) Unsupervised domain pedestrian re-identification algorithm based on pseudo label optimization
CN111680176A (en) Remote sensing image retrieval method and system based on attention and bidirectional feature fusion
CN110929080B (en) Optical remote sensing image retrieval method based on attention and generation countermeasure network
CN113544659A (en) Efficient hash-based user modeling
CN112163637B (en) Image classification model training method and device based on unbalanced data
CN116580257A (en) Feature fusion model training and sample retrieval method and device and computer equipment
CN114358188A (en) Feature extraction model processing method, feature extraction model processing device, sample retrieval method, sample retrieval device and computer equipment
CN107844541A (en) Image duplicate checking method and device
CN111935487A (en) Image compression method and system based on video stream detection
CN108073936B (en) Target tracking method, device and equipment
CN111597401B (en) Data processing method, device, equipment and medium based on graph relation network
Sumbul et al. Towards simultaneous image compression and indexing for scalable content-based retrieval in remote sensing
CN109460772B (en) Spectral band selection method based on information entropy and improved determinant point process
Wei et al. Auto-generating neural networks with reinforcement learning for multi-purpose image forensics
CN113761262B (en) Image retrieval category determining method, system and image retrieval method
CN114202694A (en) Small sample remote sensing scene image classification method based on manifold mixed interpolation and contrast learning
CN115115910A (en) Training method, using method, device, equipment and medium of image processing model
CN114528918A (en) Hyperspectral image classification method and system based on two-dimensional convolution sum LSTM
Baheti et al. Information-theoretic database building and querying for mobile Augmented Reality applications
CN113360697B (en) Commodity image retrieval processing method and corresponding device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant