CA3172605C - Video jitter detection method and device - Google Patents


Publication number
CA3172605C
Authority
CA
Canada
Prior art keywords
frame
feature point
computer
motion vector
feature
Legal status
Active
Application number
CA3172605A
Other languages
French (fr)
Other versions
CA3172605A1 (en)
Inventor
Chong MU
Xuyang Zhou
Erlong LIU
Wenzhe GUO
Current Assignee
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Application filed by 10353744 Canada Ltd
Publication of CA3172605A1
Application granted
Publication of CA3172605C

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/144 Movement detection
    • H04N5/147 Scene change detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed in the present invention are a video jitter detection method and device. The method comprises: framing a video requiring detection to obtain a frame sequence; performing feature point detection on the frame sequence frame by frame to obtain a feature point of each frame, and generating a frame feature point sequence matrix; performing operation on the frame feature point sequence matrix on the basis of an optical flow tracking algorithm to obtain a motion vector of each frame; obtaining the feature value of the video requiring detection according to the motion vector of each frame; and obtaining an output signal by means of operation by using the feature value of the video requiring detection as an input signal of a detection model, and determining whether jitter occurs to the video requiring detection according to the output signal. According to the present invention, the use of feature point detection and the use of an optical flow tracking algorithm for feature points effectively solve the problem of tracking failure caused by excessive changes between two adjacent frames, and the present invention has good sensitivity and robustness when performing detection on videos captured in cases such as sudden large displacement, strong shaking, and large rotation of a camera lens.

Description

VIDEO JITTER DETECTION METHOD AND DEVICE
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the field of computer vision technology, and more particularly to a method of and a device for detecting video jitter.
Description of Related Art
[0002] The wave of science and technology has profoundly changed everyday life. Handheld video capturing devices such as smart mobile phones, digital cameras, ILDC cameras and SLR cameras, which keep shrinking in size and price, have become everyday necessities for most people, and an age in which everyone shoots photos and video has quietly arrived. Yet while people enjoy the interesting and exhilarating moments recorded by handheld video capturing devices, irregular jitter in the video, caused by unstable lens movement when the photographer moves or shakes unintentionally, makes the recorded highlights fall far short of expectation and severely hampers subsequent processing of the video.
Accordingly, video jitter detection has become an indispensable component of video processing technology.
[0003] Video jitter detection is the basis for subsequent readjustment and processing of videos, and researchers have carried out a great deal of video-analysis work in the fields of video processing, video image stabilization and computer vision. Although several methods of detecting video jitter have been proposed, the currently available detection algorithms are not highly precise: some are not sensitive to videos captured under large displacement and strong jitter of the lens within a short time, some cannot handle rotational movements, and some are not applicable to scenarios in which the lens moves slowly. For example, the following commonly employed methods of detecting video jitter are defective in the ways described below.
1. Block matching method: at present, this is the most commonly used algorithm in video image stabilization systems. The method divides the current frame into blocks, assigns every pixel in a block the same single motion vector, and searches for the optimal match of each block within a specific range of the reference frame, thereby estimating the global motion vector of the video sequence.
Because the block matching method estimates the global motion vector from the per-block motion vectors, its detection of video jitter is poor in certain scenarios, for instance when a picture is divided into four grids of which three are motionless while objects in the remaining grid are moving. Besides, the block matching method usually requires Kalman filtering to process the calculated motion vectors, so its computational expense is large, its real-time performance is poor, and it cannot accommodate large displacement and strong jitter of the lens within a short time.
2. Grayscale projection method: based on the principle that grayscales are consistently distributed in overlapping, similar regions of an image, this algorithm uses the regional grayscale information of adjacent video frames to derive the motion-vector relation, and mainly consists of correlation calculations on grayscale projections of different regions along the row and column directions. The grayscale projection method is effective only for scenarios with purely translational jitter and cannot estimate rotational motion vectors.

SUMMARY OF THE INVENTION
[0004] In order to solve the problems remaining in the prior art, embodiments of the present invention provide a method of and a device for detecting video jitter, so as to overcome such prior-art problems as low precision of detecting algorithms and insensitivity to videos captured under conditions of large displacement and strong jitter of lenses within a short time.
[0005] In order to solve one or more of the aforementioned technical problems, the present invention employs the following technical solutions.
[0006] According to one aspect, a method of detecting video jitter is provided, and the method comprises the following steps:
[0007] performing a framing process on a to-be-detected video to obtain a frame sequence;
[0008] performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame, and generating a frame feature point sequence matrix;
[0009] basing on an optical flow tracking algorithm to perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame;
[0010] obtaining a feature value of the to-be-detected video according to the motion vector of each frame; and
[0011] taking the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judging whether jitter occurs to the to-be-detected video according to the output signal.
[0012] Further, prior to performing feature point detection, the method further comprises the following steps of preprocessing the frame sequence:
[0013] grayscale-processing the frame sequence, and obtaining a grayscale frame sequence; and
[0014] denoising the grayscale frame sequence; wherein
[0015] the step of performing feature point detection on the frame sequence frame by frame is performing feature point detection on the preprocessed frame sequence frame by frame.
[0016] Further, the step of performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame includes:
[0017] employing a feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point detection on the frame sequence frame by frame, and obtain feature points of each frame.
[0018] Further, the step of basing on an optical flow tracking algorithm to perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame includes:
[0019] performing optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtaining an initial motion vector of each frame;
[0020] obtaining a corresponding accumulative motion vector according to the initial motion vector;
[0021] smoothening the accumulative motion vector, and obtaining a smoothened motion vector;
and
[0022] employing the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtaining the motion vector of each frame.
[0023] Further, the step of obtaining a feature value of the to-be-detected video according to the motion vector of each frame includes:
[0024] merging and converting the motion vectors of all frames into a matrix, and calculating unbiased standard deviations of various elements in the matrix;
[0025] weighting and fusing the unbiased standard deviations of the various elements, and obtaining a weighted value; and
[0026] taking the unbiased standard deviations of the various elements and the weighted value as the feature value of the to-be-detected video.
[0027] According to another aspect, a device for detecting video jitter is provided, and the device comprises:
[0028] a framing processing module, for performing a framing process on a to-be-detected video to obtain a frame sequence;
[0029] a feature point detecting module, for performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame, and generating a frame feature point sequence matrix;
[0030] a vector calculating module, for basing on an optical flow tracking algorithm to perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame;
[0031] a feature value extracting module, for obtaining a feature value of the to-be-detected video according to the motion vector of each frame; and
[0032] a jitter detecting module, for taking the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judging whether jitter occurs to the to-be-detected video according to the output signal.
[0033] Further, the device further comprises:
[0034] a data preprocessing module, for preprocessing the frame sequence;
[0035] the data preprocessing module includes:
[0036] a grayscale-processing unit, for grayscale-processing the frame sequence, and obtaining a grayscale frame sequence; and
[0037] a denoising processing unit, for denoising the grayscale frame sequence;
[0038] the feature point detecting module is employed for performing feature point detection on the preprocessed frame sequence frame by frame.
[0039] Further, the feature point detecting module is further employed for:
[0040] employing a feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point detection on the frame sequence frame by frame, and obtain feature points of each frame.
[0041] Further, the vector calculating module includes:
[0042] an optical flow tracking unit, for performing optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtaining an initial motion vector of each frame;
[0043] an accumulation calculating unit, for obtaining a corresponding accumulative motion vector according to the initial motion vector;
[0044] a smoothening processing unit, for smoothening the accumulative motion vector, and obtaining a smoothened motion vector; and
[0045] a vector readjusting unit, for employing the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtaining the motion vector of each frame.
[0046] Further, the feature value extracting module includes:
[0047] a matrix converting unit, for merging and converting the motion vectors of all frames into a matrix;
[0048] a standard deviation calculating unit, for calculating unbiased standard deviations of various elements in the matrix; and
[0049] a weighting and fusing unit, for weighting and fusing the unbiased standard deviations of the various elements, and obtaining a weighted value.
[0050] The technical solutions provided by the embodiments of the present invention bring about the following advantageous effects.
1. In the method of and device for detecting video jitter provided by the embodiments of the present invention, the motion vector of each frame is obtained from the frame feature point sequence matrix on the basis of the optical flow tracking algorithm. This effectively solves the problem of failed tracking due to unduly large change between two adjacent frames, exhibits excellent tolerance and adaptability in detecting jitter of videos captured while the lens moves slowly, and achieves excellent sensitivity and robustness when detecting videos captured under circumstances of abrupt large displacement, strong jitter, and excessive rotation of the lens.
2. In the method of and device for detecting video jitter provided by the embodiments of the present invention, a feature point detecting algorithm in which FAST features and SURF features are fused is employed. The feature point extracting algorithm is thus optimized so that not only the global features of the image are considered but its local features are also retained; moreover, computational expense is small, robustness against image blurs and faint illumination is strong, and the real-time property and precision of the detection are further enhanced.
3. In the method of and device for detecting video jitter provided by the embodiments of the present invention, features of at least four dimensions are extracted from the to-be-detected video, and an SVM model is used as the detection model, so that the method of detecting video jitter as provided by the embodiments of the present invention is more advantageous in terms of generality, and precision of detection is further enhanced.
[0051] Of course, implementation of any one solution according to the present application does not necessarily achieve all of the aforementioned advantages simultaneously.
BRIEF DESCRIPTION OF THE DRAWINGS
[0052] To explain the technical solutions in the embodiments of the present invention more clearly, the drawings required in the following description of the embodiments are briefly introduced below. Apparently, the drawings described below relate merely to some embodiments of the present invention, and persons ordinarily skilled in the art may obtain other drawings from them without creative effort.
[0053] Fig. 1 is a flowchart illustrating a method of detecting video jitter according to an exemplary embodiment;
[0054] Fig. 2 is a flowchart illustrating preprocessing of the frame sequence according to an exemplary embodiment;
[0055] Fig. 3 is a flowchart illustrating performing an operation on the frame feature point sequence matrix based on an optical flow tracking algorithm to obtain a motion vector of each frame according to an exemplary embodiment;
[0056] Fig. 4 is a flowchart illustrating obtaining a feature value of the to-be-detected video according to the motion vector of each frame according to an exemplary embodiment;
and
[0057] Fig. 5 is a view schematically illustrating the structure of a device for detecting video jitter according to an exemplary embodiment.
DETAILED DESCRIPTION OF THE INVENTION
[0058] To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described completely below with reference to the accompanying drawings. Apparently, the embodiments described below are merely some, rather than all, of the embodiments of the present invention. All other embodiments obtainable by persons ordinarily skilled in the art on the basis of the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
[0059] Fig. 1 is a flowchart illustrating a method of detecting video jitter according to an exemplary embodiment. With reference to Fig. 1, the method comprises the following steps.
[0060] S1 - performing a framing process on a to-be-detected video to obtain a frame sequence.
[0061] Specifically, in order to facilitate subsequent calculation so as to detect the to-be-detected video, after the to-be-detected video (indicated as S) has been obtained, a framing extraction process should be firstly performed on the to-be-detected video S
to obtain a frame sequence corresponding to the to-be-detected video, and the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
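The following minimal sketch illustrates this framing step, assuming OpenCV is available; the function name is illustrative and not from the patent.

```python
# A minimal sketch of step S1, assuming OpenCV; reads a video file and returns
# its frame sequence L_1..L_n as a Python list.
import cv2

def frame_video(video_path):
    """Split the to-be-detected video S into a frame sequence."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = capture.read()
        if not ok:            # no more frames to decode
            break
        frames.append(frame)  # frame L_i
    capture.release()
    return frames
```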
[0062] S2 - performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame, and generating a frame feature point sequence matrix.
[0063] Specifically, detecting video jitter requires selecting the current frame and the adjacent next frame (or the next frame at an interval of N frames) from the video, obtaining the corresponding feature points from the two frames of images, and then matching the feature points of the two frames so as to judge whether an offset (jitter) occurs between them.
[0064] During specific implementation, a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame, feature points of each frame are obtained (i.e., feature points of each frame of image are extracted), and a frame feature point sequence matrix is generated, which is expressed as Zi (i=1, 2, ..., n) and can specifically be written as follows:

$$Z_i = \begin{bmatrix} a^i_{1,1} & a^i_{1,2} & \cdots & a^i_{1,q} \\ a^i_{2,1} & a^i_{2,2} & \cdots & a^i_{2,q} \\ \vdots & \vdots & \ddots & \vdots \\ a^i_{p,1} & a^i_{p,2} & \cdots & a^i_{p,q} \end{bmatrix}$$

[0065] where $a^i_{p,q}$ represents the feature point detection result at row p, column q of the ith frame matrix (1 denotes a feature point, 0 a non-feature point), p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
[0066] S3 - basing on an optical flow tracking algorithm to perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame.
[0067] Specifically, the optical flow tracking algorithm is employed to perform optical flow tracking calculation on the frame feature point sequence matrix, namely to track the change of the feature points from the current frame to the next frame. For instance, the change of the feature point sequence matrix Zi of the ith frame is tracked to the next, (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:

$$\alpha_i = \begin{bmatrix} dx_i \\ dy_i \\ dr_i \end{bmatrix}, \quad i = 1, 2, \ldots, n$$

[0068] where dxi represents a Euclidean column offset from the ith frame to the (i+1)th frame, dyi represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dri represents an angle offset from the ith frame to the (i+1)th frame.
[0069] S4 - obtaining a feature value of the to-be-detected video according to the motion vector of each frame.
[0070] Specifically, the feature value of three dimensions is usually used in the state of the art, whereas in the embodiments of the present invention the extracted feature value at least includes the feature value of four dimensions. The addition of one dimension to the feature value as compared with prior-art technology makes the method of detecting video jitter as provided by the embodiments of the present invention more advantageous in terms of generality, and precision of detection is further enhanced.
[0071] S5 - taking the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judging whether jitter occurs to the to-be-detected video according to the output signal.
[0072] Specifically, the feature value of the to-be-detected video obtained in the previous step is taken as an input signal and input into the detection model, an output signal is obtained through the operation of the model, and whether jitter occurs in the to-be-detected video is judged according to the output signal. As should be noted here, the detection model in the embodiments of the present invention is trained in advance. During the training, sample video data in a selected training data set can be processed by the method in the embodiments of the present invention to obtain a feature value of each sample video. A detection model is then trained according to the feature values of the sample video data and the corresponding annotation results of the sample video data, until model training is completed and the final detection model is obtained.
[0073] For instance, suppose that the mth video sample in a set of annotated jittering video data has undergone the processing specified in the above steps to extract its feature value. That is, the mth video sample is first framing-processed to obtain a frame sequence, feature point detection is then performed frame by frame on the frame sequence, feature points of each frame are obtained and a frame feature point sequence matrix is generated, an operation is thereafter performed on the frame feature point sequence matrix on the basis of the optical flow tracking algorithm to obtain the motion vector of each frame, and the feature value of the mth video sample is finally obtained according to the motion vector of each frame. After the motion vectors have been dimensionally converted, the unbiased standard deviations of the various elements and their weighted and fused value are calculated, respectively expressed as σ[λ(dx)], σ[λ(dy)], σ[λ(dr)] and Km, and the annotation result ym of the mth video sample (ym=0 indicates that no jitter occurs in the video sample, ym=1 indicates that jitter occurs) is extracted to obtain the training sample of the mth video sample, which can be expressed as follows:

$$\{\sigma[\lambda(dx)]_m,\ \sigma[\lambda(dy)]_m,\ \sigma[\lambda(dr)]_m,\ K_m,\ y_m\}_{(m)}$$
[0074] The video sample thus makes use of features of at least five dimensions; compared with the prior art, in which features of only three dimensions are usually used (typically the average values, variances and included angles of the translational vectors between adjacent frames), generality is better and precision of detection is further enhanced. In addition, as a preferred embodiment, the detection model can be an SVM model; that is, the feature value of the to-be-detected video obtained through the previous steps is input to a trained SVM model to obtain an output result. If the output result is 0, no jitter occurs in the to-be-detected video; if the output result is 1, jitter occurs in the to-be-detected video. The use of a trainable SVM model as the video jitter decider enables jitter detection of videos captured in different scenarios, improves generality, and raises the precision rate of detection.
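The training and decision stage can be pictured with the following hedged sketch, which assumes scikit-learn and treats the kernel choice and the helper names as illustrative assumptions rather than the patent's prescribed implementation.

```python
# A hedged sketch of the SVM decider described in [0072]-[0074], assuming scikit-learn.
# Each feature row is [sigma_dx, sigma_dy, sigma_dr, K]; labels are 0 (no jitter) / 1 (jitter).
import numpy as np
from sklearn.svm import SVC

def train_jitter_model(feature_rows, labels):
    """Train the detection model from annotated sample videos."""
    model = SVC(kernel="rbf")               # kernel choice is an assumption
    model.fit(np.asarray(feature_rows), np.asarray(labels))
    return model

def detect_jitter(model, feature_row):
    """Return True if the model judges that the to-be-detected video jitters."""
    return int(model.predict(np.asarray(feature_row).reshape(1, -1))[0]) == 1
```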
[0075] Fig. 2 is a flowchart illustrating preprocessing of the frame sequence according to an exemplary embodiment. With reference to Fig. 2, as a preferred embodiment of the present invention, prior to performing feature point detection, the method further comprises the following steps of preprocessing the frame sequence.
[0076] S101 - grayscale-processing the frame sequence, and obtaining a grayscale frame sequence.
[0077] Specifically, since the gray space only contains luminance information and does not contain color information, the amount of information of the image is greatly reduced after grayscale-processing; accordingly, in order to reduce subsequent amount of information participating in the calculation to facilitate subsequent calculation, the frame sequence Li (i=1, 2, 3, ..., n) obtained in the previous step is further grayscale-processed in the embodiments of the present invention, and a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n) , in which the grayscale conversion expression is as follows:
G = R × 0.299 + G × 0.587 + B × 0.114
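In practice this conversion can be delegated to a library call; the sketch below assumes OpenCV, whose BGR-to-gray conversion applies the same 0.299/0.587/0.114 weighting.

```python
# A minimal sketch of step S101, assuming OpenCV frames in BGR channel order.
import cv2

def to_grayscale(frames):
    """Convert the frame sequence L_i into the grayscale frame sequence G_i."""
    return [cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) for frame in frames]
```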
[0078] S102 - denoising the grayscale frame sequence.
[0079] Specifically, in order to effectively prevent noise points (namely non-feature points) from affecting subsequent steps and to enhance precision of detection, it is further required to denoise the grayscale frame sequence. During specific implementation, a TV denoising method based on a total variation model can be employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n), namely the preprocessed frame sequence corresponding to the to-be-detected video. As should be noted here, the denoising method can be chosen freely in the embodiments of the present invention, and no restriction is made thereto in this context.
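As an illustration only, a total-variation denoiser from scikit-image could be applied per frame as sketched below; the weight value is an assumption, not a parameter given in the patent.

```python
# A hedged sketch of step S102 using a total-variation (TV) denoiser; assumes scikit-image.
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def denoise_frames(gray_frames, weight=0.1):
    """Denoise the grayscale sequence G_i to obtain the preprocessed sequence T_i."""
    denoised = []
    for g in gray_frames:
        t = denoise_tv_chambolle(g.astype(np.float64) / 255.0, weight=weight)
        denoised.append((t * 255.0).astype(np.uint8))   # back to 8-bit frames
    return denoised
```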
[0080] The step of performing feature point detection on the frame sequence frame by frame is performing feature point detection on the preprocessed frame sequence frame by frame.
[0081] As a preferred embodiment in the embodiments of the present invention, the step of performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame includes:
[0082] employing a feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point detection on the frame sequence frame by frame, and obtain feature points of each frame.
[0083] Specifically, since the precision of the video jitter detecting algorithm is affected by feature point extraction and the matching technique, the performance of the feature point extracting algorithm directly affects the precision of the video jitter detecting algorithm, so the feature point extracting algorithm is optimized in the embodiments of the present invention. As a preferred embodiment, a feature point detecting algorithm in which FAST features and SURF features are fused is employed. The SURF algorithm is an improvement over the SIFT algorithm: SIFT is a feature describing method with excellent robustness and scale invariance, while SURF, in addition to maintaining the advantages of SIFT, improves on SIFT's large amount of calculation, high time complexity and long computation time. Moreover, SURF performs better with respect to invariance under illumination and perspective changes, is particularly good at handling severe blurs and rotations of images, and describes local features of images well. FAST feature detection is a corner detection method whose most prominent advantages are its computational efficiency and its ability to describe global features of images well. Therefore, using the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction not only takes the global features of images into account, but also fully retains their local features; moreover, computational expense is small, robustness against image blurs and faint illumination is strong, and the real-time property and precision of the detection are further enhanced.
[0084] Fig. 3 is a flowchart illustrating performing an operation on the frame feature point sequence matrix based on an optical flow tracking algorithm to obtain a motion vector of each frame according to an exemplary embodiment. With reference to Fig. 3, as a preferred embodiment of the present invention, the step of performing an operation on the frame feature point sequence matrix based on an optical flow tracking algorithm to obtain a motion vector of each frame includes the following steps.
[0085] S301 - performing optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtaining an initial motion vector of each frame.
[0086] Specifically, when the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm can be employed. For instance, the change of the feature point sequence matrix Zi of the ith frame is tracked to the next, (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:

$$\alpha_i = \begin{bmatrix} dx_i \\ dy_i \\ dr_i \end{bmatrix}, \quad i = 1, 2, \ldots, n$$

[0087] where dxi represents a Euclidean column offset from the ith frame to the (i+1)th frame, dyi represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dri represents an angle offset from the ith frame to the (i+1)th frame.
[0088] Use of the pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure can effectively solve the problem of failed tracking due to unduly large change from feature points of frame A (which is supposed to be the current frame) to feature points of frame B (which is supposed to be the next frame), and lays the foundation for the method of detecting video jitter provided by the embodiments of the present invention to enhance its jitter detecting sensitivity and robustness when videos are processed as captured under circumstances of abrupt large displacement, strong jitter, and excessive rotation of lenses.
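A hedged sketch of this tracking step with OpenCV's pyramidal Lucas-Kanade implementation follows; recovering (dxi, dyi, dri) from a partial affine fit of the tracked points is an illustrative assumption, since the patent does not spell out that sub-step.

```python
# A hedged sketch of step S301, assuming OpenCV: pyramidal LK tracking from frame i
# to frame i+1, followed by a rigid-motion fit to obtain alpha_i = (dx_i, dy_i, dr_i).
import numpy as np
import cv2

def initial_motion_vector(prev_gray, curr_gray, prev_points):
    """prev_points: (N, 1, 2) float32 feature point coordinates of frame i."""
    curr_points, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_points, None)
    good = status.reshape(-1) == 1                 # keep successfully tracked points
    matrix, _inliers = cv2.estimateAffinePartial2D(prev_points[good], curr_points[good])
    dx, dy = matrix[0, 2], matrix[1, 2]            # column / row offsets
    dr = np.arctan2(matrix[1, 0], matrix[0, 0])    # angle offset
    return np.array([dx, dy, dr])
```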
[0089] S302 - obtaining a corresponding accumulative motion vector according to the initial motion vector.
[0090] Specifically, accumulative integral transformation is performed on the initial motion vector αi of each frame as obtained in step S301 to obtain an accumulative motion vector βi of each frame, whose expression is as follows:

$$\beta_i = \sum_{j=1}^{i} \alpha_j = \begin{bmatrix} \sum_{j=1}^{i} dx_j \\ \sum_{j=1}^{i} dy_j \\ \sum_{j=1}^{i} dr_j \end{bmatrix}, \quad i = 1, 2, \ldots, n$$
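In code this accumulation is a running sum over the frame index, as in the short sketch below (NumPy assumed).

```python
# A minimal sketch of step S302: beta_i is the cumulative sum of alpha_1..alpha_i.
import numpy as np

def accumulate(alphas):
    """alphas: (n, 3) array of (dx_i, dy_i, dr_i); returns the (n, 3) array of beta_i."""
    return np.cumsum(np.asarray(alphas), axis=0)
```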
[0091] S303 - smoothening the accumulative motion vector, and obtaining a smoothened motion vector.
[0092] Specifically, a sliding average window is used to smoothen the accumulative motion vector βi obtained in step S302 to obtain a smoothened motion vector γi, whose expression is:

$$\gamma_i = \frac{1}{2r+1} \sum_{j=i-r}^{i+r} \beta_j, \quad i = 1, 2, \ldots, n$$
[0093] where n represents the total number of frames of the video, and the radius r of the smoothing window is given by:

$$r = \begin{cases} \left\lfloor \dfrac{n}{10} \right\rfloor, & n \le 20 \\[1ex] \left\lfloor \dfrac{\ln(1+\mu n)}{\ln(1+\mu)} \right\rfloor, & n > 20 \end{cases}$$
[0094] where μ indicates a parameter of the sliding window and is a positive number; the specific numerical value of μ can be dynamically adjusted according to practical requirements, and as a preferred embodiment it can be set as μ=30.
[0095] In the embodiments of the present invention, a sliding average window with extremely small computational expense is used to smoothen the motion vector, instead of computationally complicated Kalman filtering, whereby computational expense is further reduced and the real-time property is further enhanced without any loss of precision.
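A sketch of the sliding-average smoothing follows; clamping the window at the ends of the sequence is an illustrative assumption (NumPy assumed).

```python
# A hedged sketch of step S303: centred moving average of radius r over the
# accumulated trajectory beta, yielding the smoothened trajectory gamma.
import numpy as np

def smooth(betas, radius):
    """betas: (n, 3) array of beta_i; returns the (n, 3) array of gamma_i."""
    betas = np.asarray(betas, dtype=np.float64)
    n = len(betas)
    gammas = np.empty_like(betas)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)  # window clamped to [0, n)
        gammas[i] = betas[lo:hi].mean(axis=0)
    return gammas
```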
[0096] S304 - employing the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtaining the motion vector of each frame.
[0097] Specifically, βi and γi obtained in the previous steps S302 and S303 are used to readjust αi obtained in step S301, to obtain a readjusted motion vector λi, whose expression is:

$$\lambda_i = \alpha_i + (\gamma_i - \beta_i) = \begin{bmatrix} dx_i + \gamma_i(dx) - \sum_{j=1}^{i} dx_j \\ dy_i + \gamma_i(dy) - \sum_{j=1}^{i} dy_j \\ dr_i + \gamma_i(dr) - \sum_{j=1}^{i} dr_j \end{bmatrix}, \quad i = 1, 2, \ldots, n$$
[0098] The readjusted motion vector as obtained is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise, i.e., the result of video jitter detection is made more precise.
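Expressed in code, the readjustment is a single element-wise operation, as in the minimal sketch below (NumPy assumed).

```python
# A minimal sketch of step S304: lambda_i = alpha_i + (gamma_i - beta_i), the initial
# motion corrected by the gap between the smoothed and the raw accumulated trajectories.
import numpy as np

def readjust(alphas, betas, gammas):
    """Returns the (n, 3) array of readjusted motion vectors lambda_i."""
    return np.asarray(alphas) + (np.asarray(gammas) - np.asarray(betas))
```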
[0099] Fig. 4 is a flowchart illustrating obtaining a feature value of the to-be-detected video according to the motion vector of each frame according to an exemplary embodiment. With reference to Fig. 4, as a preferred embodiment of the present invention, the step of obtaining a feature value of the to-be-detected video according to the motion vector of each frame includes the following steps.
[0100] S401 - merging and converting the motion vectors of all frames into a matrix, and calculating unbiased standard deviations of various elements in the matrix.
[0101] Specifically, the motion vectors of all frames obtained through the previous steps are first merged and converted into a matrix; for instance, the motion vectors λ1, λ2, ..., λn are converted into the matrix [λ1 λ2 ... λn], and the unbiased standard deviations of its elements are calculated by rows, the specific calculation expression being as follows:

$$\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (\lambda_i - \bar{\lambda})^2}$$

[0102] The unbiased standard deviations of the various elements in the matrix can be obtained through the above expression and are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which λ̄ represents the average value of the samples.
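In NumPy terms, the unbiased standard deviation corresponds to ddof=1, as in the short sketch below.

```python
# A minimal sketch of step S401: unbiased (ddof=1) standard deviations of the dx, dy
# and dr components over all readjusted motion vectors.
import numpy as np

def unbiased_stds(lambdas):
    """lambdas: (n, 3) array of lambda_i; returns (sigma_dx, sigma_dy, sigma_dr)."""
    return np.std(np.asarray(lambdas), axis=0, ddof=1)
```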
[0103] S402 - weighting and fusing the unbiased standard deviations of the various elements, and obtaining a weighted value.
[0104] Specifically, weights are assigned to the unbiased standard deviations of the various elements according to practical requirements, and the unbiased standard deviations are weighted and fused according to these weights, wherein the weights can be dynamically readjusted according to practical requirements. For instance, with the weight of σ[λ(dx)] set as 3, the weight of σ[λ(dy)] set as 3, and the weight of σ[λ(dr)] set as 10, the fusing expression is as follows:

$$K = 3\,\sigma[\lambda(dx)] + 3\,\sigma[\lambda(dy)] + 10\,\sigma[\lambda(dr)]$$
[0105] S403 - taking the unbiased standard deviations of the various elements and the weighted value as the feature value of the to-be-detected video.
[0106] Specifically, in the embodiments of the present invention, the feature value of the to-be-detected video S is the unbiased standard deviations of the various elements and their weighted value as obtained in the previous steps, which are expressed as:
$$\{\sigma[\lambda(dx)],\ \sigma[\lambda(dy)],\ \sigma[\lambda(dr)],\ K\}_{(S)}$$
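Putting S402 and S403 together, the feature value of the video can be assembled as sketched below; the weights default to the example values 3, 3 and 10 and are adjustable in practice.

```python
# A minimal sketch of steps S402-S403: weighted fusion of the unbiased standard
# deviations and assembly of the four-dimensional feature value of the video.
import numpy as np

def feature_value(sigma_dx, sigma_dy, sigma_dr, weights=(3.0, 3.0, 10.0)):
    """Returns [sigma_dx, sigma_dy, sigma_dr, K] for input to the detection model."""
    k = weights[0] * sigma_dx + weights[1] * sigma_dy + weights[2] * sigma_dr
    return np.array([sigma_dx, sigma_dy, sigma_dr, k])
```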
[0107] Fig. 5 is a view schematically illustrating the structure of a device for detecting video jitter according to an exemplary embodiment. With reference to Fig. 5, the device comprises:
[0108] a framing processing module, for performing a framing process on a to-be-detected video to obtain a frame sequence;
[0109] a feature point detecting module, for performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame, and generating a frame feature point sequence matrix;
[0110] a vector calculating module, for basing on an optical flow tracking algorithm to perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame;
[0111] a feature value extracting module, for obtaining a feature value of the to-be-detected video according to the motion vector of each frame; and
[0112] a jitter detecting module, for taking the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judging whether jitter occurs to the to-be-detected video according to the output signal.
[0113] As a preferred embodiment in the embodiments of the present invention, the device further comprises:
[0114] a data preprocessing module, for preprocessing the frame sequence;
[0115] the data preprocessing module includes:
[0116] a grayscale-processing unit, for grayscale-processing the frame sequence, and obtaining a grayscale frame sequence; and
[0117] a denoising processing unit, for denoising the grayscale frame sequence;
[0118] the feature point detecting module is employed for performing feature point detection on the preprocessed frame sequence frame by frame.
[0119] As a preferred embodiment in the embodiments of the present invention, the feature point detecting module is further employed for:
[0120] employing a feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point detection on the frame sequence frame by frame, and obtain feature points of each frame.
[0121] As a preferred embodiment in the embodiments of the present invention, the vector calculating module includes:
[0122] an optical flow tracking unit, for performing optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtaining an initial motion vector of each frame;
[0123] an accumulation calculating unit, for obtaining a corresponding accumulative motion vector according to the initial motion vector;
[0124] a smoothening processing unit, for smoothening the accumulative motion vector, and obtaining a smoothened motion vector; and
[0125] a vector readjusting unit, for employing the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtaining the motion vector of each frame.
[0126] As a preferred embodiment in the embodiments of the present invention, the feature value extracting module includes:
[0127] a matrix converting unit, for merging and converting the motion vectors of all frames into a matrix;
[0128] a standard deviation calculating unit, for calculating unbiased standard deviations of various elements in the matrix; and
[0129] a weighting and fusing unit, for weighting and fusing the unbiased standard deviations of the various elements, and obtaining a weighted value.
[0130] In summary, the technical solutions provided by the embodiments of the present invention bring about the following advantageous effects.
1. In the method of and device for detecting video jitter provided by the embodiments of the present invention, the motion vector of each frame is obtained from the frame feature point sequence matrix on the basis of the optical flow tracking algorithm. This effectively solves the problem of failed tracking due to unduly large change between two adjacent frames, exhibits excellent tolerance and adaptability in detecting jitter of videos captured while the lens moves slowly, and achieves excellent sensitivity and robustness when detecting videos captured under circumstances of abrupt large displacement, strong jitter, and excessive rotation of the lens.
2. In the method of and device for detecting video jitter provided by the embodiments of the present invention, a feature point detecting algorithm in which FAST features and SURF features are fused is employed. The feature point extracting algorithm is thus optimized so that not only the global features of the image are considered but its local features are also retained; moreover, computational expense is small, robustness against image blurs and faint illumination is strong, and the real-time property and precision of the detection are further enhanced.
3. In the method of and device for detecting video jitter provided by the embodiments of the present invention, features of at least four dimensions are extracted from the to-be-detected video, and an SVM model is used as the detection model, so that the method of detecting video jitter as provided by the embodiments of the present invention is more advantageous in terms of generality, and precision of detection is further enhanced.
[0131] Of course, implementation of any one solution according to the present application does not necessarily achieve all of the aforementioned advantages simultaneously.
As should be noted, when the device for detecting video jitter provided by the aforementioned embodiments performs jitter detection, the division into the various functional modules described above is merely exemplary; in actual application, the aforementioned functions can be assigned to different functional modules as required, that is to say, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the device for detecting video jitter provided by the aforementioned embodiments pertains to the same inventive conception as the method of detecting video jitter, in other words, the device is based on the method of detecting video jitter; see the method embodiments for its specific implementation process, which will not be repeated in this context.
[0132] As comprehensible to persons ordinarily skilled in the art, the entire or partial steps in the aforementioned embodiments can be completed via hardware, or via a program instructing relevant hardware, the program can be stored in a computer-readable storage medium, and the storage medium can be a read-only memory, a magnetic disk or an optical disk, etc.
[0133] The foregoing embodiments are merely preferred embodiments of the present invention, and they are not to be construed as restrictive to the present invention. Any amendment, equivalent substitution, and improvement makeable within the spirit and principle of the present invention shall all fall within the protection scope of the present invention.

Claims (362)

Claims:
1. A device for detecting video jitter, comprising:
a framing processing module configured to perform a framing process on a to-be-detected video to obtain a frame sequence;
a feature point detecting module configured to perform feature point detection on the frame sequence frame by frame, obtain feature points of each frame by employing a feature point detecting algorithm including fused FAST (Features from Accelerated Segment Test) and SURF (Speeded-Up Robust Features), and generate a frame feature point sequence matrix;
a vector calculating module configured to perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame based on an optical flow tracking algorithm;
a feature value extracting module configured to obtain a feature value of the to-be-detected video according to the motion vector of each frame; and a jitter detecting module configured to take the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judge whether jitter occurs to the to-be-detected video according to the output signal.
2. The device of claim 1, wherein the device further comprises a data preprocessing module configured to preprocess the frame sequence.
3. The device of any one of claims 1 to 2, wherein the/a data preprocessing module further comprises:
a grayscale-processing unit configured to grayscale-process the frame sequence, and obtain a grayscale frame sequence; and a denoising processing unit configured to denoise the grayscale frame sequence;
wherein the feature point detecting module is configured to perform feature point detection on the preprocessed frame sequence frame by frame.
4. The device of any one of claims 1 to 3, wherein the feature point detecting module is further configured to:
employ the feature point detecting algorithm in which are the fused FAST
(Features from Accelerated Segment Test) and the SURF (Speeded-Up Robust Features) to perform feature point detection on the frame sequence frame by frame, and to obtain the feature points of each frame.
5. The device of any one of claims 1 to 4, wherein the vector calculating module further comprises:
an optical flow tracking unit configured to perform optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtain an initial motion vector of each frame;
an accumulation calculating unit configured to obtain a corresponding accumulative motion vector according to the initial motion vector;
a smoothening processing unit configured to smoothen the accumulative motion vector, and obtain a smoothened motion vector; and a vector readjusting unit configured to employ the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtain the motion vector of each frame.
6. The device of any one of claims 1 to 5, wherein the feature value extracting module further comprises:
a matrix converting unit configured to merge and convert the motion vectors of all frames into a matrix;
a standard deviation calculating unit configured to calculate unbiased standard deviations of various elements in the matrix; and a weighting and fusing unit configured to weight and fuse the unbiased standard deviations of the various elements, and obtain a weighted value.
7. The device of any one of claims 1 to 6, wherein a to-be-detected video is obtained.
8. The device of any one of claims 1 to 7, wherein a framing extraction process is performed on the to-be-detected video to obtain a frame sequence corresponding to the to-be-detected video.
9. The device of any one of claims 1 to 8, wherein the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
10. The device of any one of claims 1 to 9, wherein a current frame and an adjacent next frame is selected from the to-be-detected video.
11. The device of any one of claims 1 to 10, wherein a current frame and a next frame by an interval of N frames is selected from the to-be-detected video.
12. The device of any one of claims 1 to 11, wherein corresponding feature points are obtained from the current frame and the adjacent next frame.
13. The device of any one of claims 1 to 12, wherein corresponding feature points are obtained from the current frame and the next frame by an interval of N frames.
14. The device of any one of claims 1 to 13, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the adjacent next frame.
15. The device of any one of claims 1 to 14, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the next frame by an interval of N frames.
16. The device of any one of claims 1 to 15, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame.
17. The device of any one of claims 1 to 16, wherein a feature point detecting algorithm is employed to obtain feature points of each frame, that is, feature points of each frame of image are extracted.
18. The device of any one of claims 1 to 17, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Z (i=1, 2, ..., n).
19. The device of any one of claims 1 to 18, wherein Z (i=1, 2, ..., n) is expressed as:
Wherein apq represents a feature point detection result at row p , column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
20. The device of any one of claims 1 to 19, wherein the optical flow tracking algorithm is employed to perform optical flow tracking calculation on the frame feature point sequence matrix.
21. The device of any one of claims 1 to 20, wherein the optical flow tracking algorithm is employed to track the change of feature points in the current frame to the next frame.
22. The device of any one of claims 1 to 21, wherein the change of a feature point sequence matrix in the ith frame is tracked to the next, (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:

where dxi represents a Euclidean column offset from the ith frame to the (i+1)th frame, dyi represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dri represents an angle offset from the ith frame to the (i+1)th frame.
23. The device of any one of claims 1 to 22, wherein an extracted feature value at least includes the feature value of four dimensions.
24. The device of any one of claims 1 to 23, wherein the detection model is well trained in advance.
25. The device of any one of claims 1 to 24, wherein sample video data in a set of selected training data is correspondingly processed to obtain a feature value of the sample video data.
26. The device of any one of claims 1 to 25, wherein a detection model is trained according to the feature value of the/a sample video data and a corresponding annotation result of the sample video data to obtain the final detection model.
27. The device of any one of claims 1 to 26, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and Km, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
28. The device of any one of claims 1 to 27, wherein the annotation result ym indicates that no jitter occurs to the video sample if ym=0.
29. The device of any one of claims 1 to 28, wherein the annotation result ym indicates that jitter occurs to the video sample if ym=1.
30. The device of any one of claims 1 to 29, wherein a video sample makes use of features of at least five dimensions.
31. The device of any one of claims 1 to 30, wherein the detection model is selected from an SVM
(Support Vector Machine) model.
32. The device of any one of claims 1 to 31, wherein the feature value of the to-be-detected video is input to a well-trained SVM model to obtain an output result.
33. The device of any one of claims 1 to 32, wherein if the/a SVM model output result is 0, this indicates that no jitter occurs to the to-be-detected video.
34. The device of any one of claims 1 to 33, wherein if the/a SVM model output result is 1, this indicates that jitter occurs to the to-be-detected video.
35. The device of any one of claims 1 to 34, wherein the use of a trainable SVM model as a video jitter decider enables jitter detection of videos captured in different scenarios.
36. The device of any one of claims 1 to 35, wherein the amount of information of the image is greatly reduced after grayscale-processing.
37. The device of any one of claims 1 to 36, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
38. The device of any one of claims 1 to 37, wherein a grayscale frame sequence is obtained and expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R × 0.299 + G × 0.587 + B × 0.114
39. The device of any one of claims 1 to 38, wherein a TV denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
40. The device of any one of claims 1 to 39, wherein the/a denoised frame sequence is a preprocessed frame sequence to which the to-be-detected video corresponds.
41. The device of any one of claims 1 to 40, wherein the denoising method is randomly selectable.
42. The device of any one of claims 4 to 41, wherein the SURF algorithm is an improvement over SIFT (Scale-Invariant Feature Transform) algorithm.
43. The device of claim 42, wherein SIFT is a feature describing method with excellent robustness and invariant scale.
44. The device of any one of claims 42 to 43, wherein the SURF algorithm improves over the problems of large amount of data to be calculated, high time complexity and long duration of calculation inherent in the SIFT algorithm at the same time of maintaining the advantages of the SIFT algorithm.
45. The device of any one of claims 4 to 44, wherein SURF is more excellent in performance in the aspect of invariance of illumination change and perspective change, particularly excellent in processing severe blurs and rotations of images.
46. The device of any one of claims 4 to 45, wherein SURF is excellent in describing local features of images.
47. The device of any one of claims 4 to 46, wherein FAST feature detection is a corner detection method.
48. The device of any one of claims 4 to 47, wherein FAST feature detection has the most prominent advantage of its algorithm in its calculation efficiency, and the capability to excellently describe global features of images.
49. The device of any one of claims 4 to 48, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction gives consideration to global features of images.
50. The device of any one of claims 4 to 49, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction fully retains local features of images.
51. The device of any one of claims 4 to 50, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has small computational expense.
52. The device of any one of claims 4 to 51, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has strong robustness against image blurs and faint illumination.
53. The device of any one of claims 4 to 52, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction enhances real-time property and precision of the detection.
54. The device of any one of claims 1 to 53, wherein while the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK
(Lucas-Kanade) algorithm is employed.
55. The device of any one of claims 1 to 54, wherein the change of a feature point sequence matrix Z in the ith frame is tracked to the next i+lth frame, and a motion vector ai is obtained, whose expression is as follows:
where dxi represents a Euclidean column offset from the ith frame to the (i+1)th frame, dyi represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dri represents an angle offset from the ith frame to the (i+1)th frame.
56. The device of any one of claims 1 to 55, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of a next adjacent frame.
57. The device of any one of claims 1 to 56, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of a next frame by an interval of N frames.
58. The device of any one of claims 1 to 57, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
59. The device of any one of claims 1 to 58, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:

where n represents the total number of frames of the video.
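Claims 58 and 59 accumulate the per-frame vectors and then smooth them with a sliding average window; since the expressions themselves are not reproduced in this text, the sketch below assumes a plain cumulative sum for βi and a centred moving average of radius r for γi.

    import numpy as np

    def accumulate_and_smooth(alpha, r):
        # alpha: (n, 3) array of per-frame motion vectors (dx, dy, dr).
        beta = np.cumsum(alpha, axis=0)                 # assumed accumulative motion vector
        n = len(beta)
        gamma = np.empty_like(beta)
        for i in range(n):
            lo, hi = max(0, i - r), min(n, i + r + 1)   # clip the window at both ends
            gamma[i] = beta[lo:hi].mean(axis=0)         # assumed smoothened motion vector
        return beta, gamma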
60. The device of any one of claims 1 to 59, wherein the radius of the smooth window is r, and its expression is:
where μ indicates a parameter of the sliding window, and μ is a positive number.
61. The device of any one of claims 1 to 60, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirements.
62. The device of any one of claims 1 to 61, wherein the specific numerical value of μ is set as μ=30.
63. The device of any one of claims 1 to 62, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:

64. The device of any one of claims 1 to 63, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
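Claims 63 and 64 readjust αi using βi and γi; the claimed expression is omitted from this text, so the line below uses the correction commonly applied in trajectory-smoothing pipelines, λi = αi + (γi − βi), purely as an assumed placeholder.

    import numpy as np

    def readjust(alpha, beta, gamma):
        # Assumed readjustment: shift each per-frame vector by the gap between
        # the smoothed trajectory and the raw accumulated trajectory.
        return alpha + (gamma - beta)   # taken as the per-frame motion vector for later steps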
65. The device of any one of claims 1 to 64, wherein the motion vectors of all frames obtained are merged and converted into a matrix.
66. The device of any one of claims 1 to 65, wherein the motion vector λi is converted into the form of a matrix λ, and unbiased standard deviations of its elements are calculated by rows; the specific calculation expression is as follows:
67. The device of any one of claims 1 to 66, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which A represents the average value of the samples.
68. The device of any one of claims 1 to 67, wherein weights are assigned to the unbiased standard deviations of the various elements according to practical requirements.
69. The device of any one of claims 1 to 68, wherein the unbiased standard deviations of the various elements are weighted and fused according to the weights.
70. The device of any one of claims 1 to 69, wherein the weights of the unbiased standard deviations of the various elements can be dynamically readjusted according to practical requirements.
71. The device of any one of claims 1 to 70, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
72. The device of any one of claims 1 to 71, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], κ}(s).
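Claims 65 to 72 reduce the readjusted vectors to three unbiased standard deviations plus one weighted value κ; a sketch under the stated weights 3, 3 and 10 (their assignment to dx, dy and dr is an assumption) is:

    import numpy as np

    def video_feature(lam):
        # lam: (n, 3) matrix of readjusted motion vectors, columns (dx, dy, dr).
        sigma = lam.std(axis=0, ddof=1)        # unbiased standard deviations per component
        weights = np.array([3.0, 3.0, 10.0])   # weight assignment is an assumption
        kappa = float(weights @ sigma)         # weighted and fused value
        return sigma[0], sigma[1], sigma[2], kappa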
73. A computer-readable storage medium for detecting video jitter, storing therein a computer program, wherein the computer program is configured to:
perform a framing process on a to-be-detected video to obtain a frame sequence;
perform feature point detection on the frame sequence frame by frame, obtain feature points of each frame by employing a feature point detecting algorithm including fused FAST (Features from Accelerated Segment Test) and SURF (Speeded-Up Robust Features), and generate a frame feature point sequence matrix;
perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame based on an optical flow tracking algorithm;

obtain a feature value of the to-be-detected video according to the motion vector of each frame; and take the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judge whether jitter occurs to the to-be-detected video according to the output signal.
74. The computer-readable storage medium of claim 73, wherein the computer program is further configured to:
grayscale-process the frame sequence, and obtain a grayscale frame sequence;
and denoise the grayscale frame sequence;
wherein the step of feature point detection on the frame sequence frame by frame is feature point detection on the preprocessed frame sequence frame by frame.
75. The computer-readable storage medium of any one of claims 73 to 74, wherein the computer program is further configured to:
employ the feature point detecting algorithm in which the FAST (Features from Accelerated Segment Test) features and the SURF (Speeded-Up Robust Features) features are fused to perform feature point detection on the frame sequence frame by frame, and to obtain the feature points of each frame.
76. The computer-readable storage medium of any one of claims 73 to 75, wherein the computer program is further configured to:
perform optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtain an initial motion vector of each frame;
obtain a corresponding accumulative motion vector according to the initial motion vector;
smoothen the accumulative motion vector, and obtain a smoothened motion vector; and employ the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtain the motion vector of each frame.
77. The computer-readable storage medium of any one of claims 73 to 76, wherein the computer program is further configured to:
merge and convert the motion vectors of all frames into a matrix, and calculate unbiased standard deviations of various elements in the matrix;
weight and fuse the unbiased standard deviations of the various elements, and obtain a weighted value; and take the unbiased standard deviations of the various elements and the weighted value as the feature value of the to-be-detected video.
78. The computer-readable storage medium of any one of claims 73 to 77, wherein a to-be-detected video is obtained.
79. The computer-readable storage medium of any one of claims 73 to 78, wherein a framing extraction process is performed on the to-be-detected video to obtain a frame sequence corresponding to the to-be-detected video.
80. The computer-readable storage medium of any one of claims 73 to 79, wherein the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
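For the framing of claims 78 to 80, one straightforward way to obtain the frame sequence Li (i=1, ..., n) from a video file is sketched below with OpenCV; the reading loop is an illustration only.

    import cv2

    def frame_sequence(path):
        # Read the to-be-detected video into a list of frames L_1 ... L_n.
        cap = cv2.VideoCapture(path)
        frames = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
        cap.release()
        return frames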
81. The computer-readable storage medium of any one of claims 73 to 80, wherein a current frame and an adjacent next frame are selected from the to-be-detected video.
82. The computer-readable storage medium of any one of claims 73 to 81, wherein a current frame and a next frame by an interval of N frames are selected from the to-be-detected video.
83. The computer-readable storage medium of any one of claims 73 to 82, wherein corresponding feature points are obtained from a current frame and the adjacent next frame.
84. The computer-readable storage medium of any one of claims 73 to 83, wherein corresponding feature points are obtained from a current frame and the next frame by an interval of N frames.
85. The computer-readable storage medium of any one of claims 73 to 84, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the adjacent next frame.
86. The computer-readable storage medium of any one of claims 73 to 85, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the next frame by an interval of N frames.
87. The computer-readable storage medium of any one of claims 73 to 86, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame.
88. The computer-readable storage medium of any one of claims 73 to 87, wherein a feature point detecting algorithm is employed to obtain feature points of each frame, that is, feature points of each frame of image are extracted.
89. The computer-readable storage medium of any one of claims 73 to 88, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Zi (i=1, 2, ..., n).
90. The computer-readable storage medium of any one of claims 73 to 89, wherein Zi (i=1, 2, ..., n) is expressed as:
wherein a^i_{p,q} represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
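Claims 89 and 90 describe the frame feature point sequence matrix Zi, whose entry at row p, column q is 1 at a feature point and 0 otherwise; a direct construction from a detected keypoint list (keypoint source assumed) is:

    import numpy as np

    def feature_point_matrix(keypoints, rows, cols):
        # Build Z_i: a rows x cols 0/1 matrix with 1 at every detected feature point.
        z = np.zeros((rows, cols), dtype=np.uint8)
        for kp in keypoints:
            col, row = int(round(kp.pt[0])), int(round(kp.pt[1]))
            if 0 <= row < rows and 0 <= col < cols:
                z[row, col] = 1
        return z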
91. The computer-readable storage medium of any one of claims 73 to 90, wherein the optical flow tracking algorithm is employed to perform optical flow tracking calculation on the frame feature point sequence matrix.
92. The computer-readable storage medium of any one of claims 73 to 91, wherein the optical flow tracking algorithm is employed to track the change of feature points in the current frame to the next frame.
93. The computer-readable storage medium of any one of claims 73 to 92, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:
where dx^i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy^i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr^i represents an angle offset from the ith frame to the (i+1)th frame.
94. The computer-readable storage medium of any one of claims 73 to 93, wherein an extracted feature value at least includes the feature value of four dimensions.
95. The computer-readable storage medium of any one of claims 73 to 94, wherein the detection model is well trained in advance.
96. The computer-readable storage medium of any one of claims 73 to 95, wherein sample video data in a set of selected training data is correspondingly processed to obtain a feature value of the sample video data.
97. The computer-readable storage medium of any one of claims 73 to 96, wherein a detection model is trained according to the feature value of the sample video data and a corresponding annotation result of the sample video data to obtain the final detection model.
98. The computer-readable storage medium of any one of claims 73 to 97, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and κm, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, κm, ym}(m).
99. The computer-readable storage medium of any one of claims 73 to 98, wherein the annotation result ym indicates that no jitter occurs to the video sample if ym=0.
100. The computer-readable storage medium of any one of claims 73 to 99, wherein the annotation result ym indicates that jitter occurs to the video sample if ym=1.
101. The computer-readable storage medium of any one of claims 73 to 100, wherein a video sample makes use of features of at least five dimensions.
102. The computer-readable storage medium of any one of claims 73 to 101, wherein the detection model is selected from an SVM (Support Vector Machine) model.
103. The computer-readable storage medium of any one of claims 73 to 102, wherein the feature value of the to-be-detected video is input to a well-trained SVM model to obtain an output result.
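Claims 94 to 105 use a pre-trained SVM as the detection model over training samples made of the four feature values plus the annotation ym; a scikit-learn sketch is given below, where the kernel choice and data layout are assumptions.

    import numpy as np
    from sklearn.svm import SVC

    def train_jitter_model(features, labels):
        # features: (m, 4) array of (sigma_dx, sigma_dy, sigma_dr, kappa) per sample video;
        # labels: (m,) array with 0 = no jitter, 1 = jitter.
        model = SVC(kernel="rbf")
        model.fit(np.asarray(features), np.asarray(labels))
        return model

    def detect_jitter(model, feature_value):
        # Returns 1 if jitter is predicted for the to-be-detected video, else 0.
        return int(model.predict(np.asarray(feature_value).reshape(1, -1))[0])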
104. The computer-readable storage medium of any one of claims 73 to 103, wherein if the/a SVM
model output result is 0, this indicates that no jitter occurs to the to-be-detected video.
105. The computer-readable storage medium of any one of claims 73 to 104, wherein if the/a SVM
model output result is 1, this indicates that jitter occurs to the to-be-detected video.
106. The computer-readable storage medium of any one of claims 73 to 105, wherein the use of a trainable SVM model as a video jitter decider enables jitter detection of videos captured in different scenarios.
107. The computer-readable storage medium of any one of claims 73 to 106, wherein the amount of information of the image is greatly reduced after grayscale-processing.
108. The computer-readable storage medium of any one of claims 73 to 107, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
109. The computer-readable storage medium of any one of claims 73 to 108, wherein a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R × 0.299 + G × 0.587 + B × 0.114
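The conversion of claim 109 is the familiar luma weighting; applied to a BGR frame as delivered by OpenCV, it could be written as:

    import numpy as np

    def to_gray(frame_bgr):
        # G = R*0.299 + G*0.587 + B*0.114 on a uint8 BGR frame.
        b, g, r = frame_bgr[..., 0], frame_bgr[..., 1], frame_bgr[..., 2]
        return (0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)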
110. The computer-readable storage medium of any one of claims 73 to 109, wherein a TV
denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
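Claim 110 denoises each grayscale frame with a total-variation model; scikit-image's Chambolle TV denoiser is one readily available implementation and is used below purely for illustration (the weight value is an assumption).

    import numpy as np
    from skimage.restoration import denoise_tv_chambolle

    def tv_denoise_sequence(gray_frames, weight=0.1):
        # Denoise the grayscale frame sequence G_i, returning the preprocessed sequence T_i.
        out = []
        for f in gray_frames:
            d = denoise_tv_chambolle(f.astype(np.float64) / 255.0, weight=weight)
            out.append((d * 255.0).astype(np.uint8))
        return out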
111. The computer-readable storage medium of any one of claims 73 to 110, wherein the/a denoised frame sequence is a preprocessed frame sequence to which the to-be-detected video corresponds.
112. The computer-readable storage medium of any one of claims 73 to 111, wherein the/a TV
denoising method is randomly selectable.
113. The computer-readable storage medium of any one of claims 75 to 112, wherein the SURF
algorithm is an improvement over SIFT (Scale-Invariant Feature Transform) algorithm.
114. The computer-readable storage medium of claim 113, wherein SIFT is a feature description method with excellent robustness and scale invariance.
115. The computer-readable storage medium of any one of claims 113 to 114, wherein the SURF algorithm alleviates the problems of large calculation load, high time complexity and long calculation time inherent in the SIFT algorithm while maintaining the advantages of the SIFT algorithm.
116. The computer-readable storage medium of any one of claims 75 to 115, wherein SURF performs better in terms of invariance to illumination change and perspective change, and is particularly good at processing severe blurs and rotations of images.
117. The computer-readable storage medium of any one of claims 75 to 116, wherein SURF is excellent in describing local features of images.
118. The computer-readable storage medium of any one of claims 75 to 117, wherein FAST feature detection is a corner detection method.
119. The computer-readable storage medium of any one of claims 75 to 118, wherein the most prominent advantage of FAST feature detection is its calculation efficiency, together with its capability to describe global features of images well.
120. The computer-readable storage medium of any one of claims 75 to 119, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction gives consideration to global features of images.
121. The computer-readable storage medium of any one of claims 75 to 120, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction fully retains local features of images.
122. The computer-readable storage medium of any one of claims 75 to 121, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has small computational expense.
123. The computer-readable storage medium of any one of claims 75 to 122, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has strong robustness against image blurs and faint illumination.
124. The computer-readable storage medium of any one of claims 75 to 123, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction enhances real-time property and precision of the detection.
125. The computer-readable storage medium of any one of claims 73 to 124, wherein while the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm is employed.
126. The computer-readable storage medium of any one of claims 73 to 125, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:
where dx^i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy^i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr^i represents an angle offset from the ith frame to the (i+1)th frame.
127. The computer-readable storage medium of any one of claims 73 to 126, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of an adjacent next frame.
128. The computer-readable storage medium of any one of claims 73 to 127, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of a next frame by an interval of N
frames.
129. The computer-readable storage medium of any one of claims 73 to 128, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
130. The computer-readable storage medium of any one of claims 73 to 129, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:

where n represents the total number of frames of the video.
131. The computer-readable storage medium of any one of claims 73 to 130, wherein the radius of the smooth window is r, and its expression is:
where μ indicates a parameter of the sliding window, and μ is a positive number.
132. The computer-readable storage medium of any one of claims 73 to 131, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirements.
133. The computer-readable storage medium of any one of claims 73 to 132, wherein the specific numerical value of μ is set as μ=30.
134. The computer-readable storage medium of any one of claims 73 to 133, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:

135. The computer-readable storage medium of any one of claims 73 to 134, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
136. The computer-readable storage medium of any one of claims 73 to 135, wherein the motion vectors of all frames obtained are merged and converted into a matrix.
137. The computer-readable storage medium of any one of claims 73 to 136, wherein the motion vector λi is converted into the form of a matrix λ, and unbiased standard deviations of its elements are calculated by rows; the specific calculation expression is as follows:
138. The computer-readable storage medium of any one of claims 73 to 137, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which A represents the average value of the samples.
139. The computer-readable storage medium of any one of claims 73 to 138, wherein weights are assigned to the unbiased standard deviations of the various elements according to practical requirements.
140. The computer-readable storage medium of any one of claims 73 to 139, wherein the unbiased standard deviations of the various elements are weighted and fused according to the weights.
141. The computer-readable storage medium of any one of claims 73 to 140, wherein the weights of the unbiased standard deviations of the various elements can be dynamically readjusted according to practical requirements.
142. The computer-readable storage medium of any one of claims 73 to 141, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
143. The computer-readable storage medium of any one of claims 73 to 142, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], κ}(s).
144. The computer-readable storage medium of any one of claims 73 to 143, wherein the computer-readable storage medium is a read-only memory.
145. The computer-readable storage medium of any one of claims 73 to 144, wherein the computer-readable storage medium is a magnetic disk.
146. The computer-readable storage medium of any one of claims 73 to 145, wherein the computer-readable storage medium is an optical disk.
147.A computer device for detecting video jitter, comprising:
one or more processors, configured to:
perform a framing process on a to-be-detected video to obtain a frame sequence;
perform feature point detection on the frame sequence frame by frame, obtain feature points of each frame by employing a feature point detecting algorithm including fused FAST (Features from Accelerated Segment Test) and SURF
(Speeded-Up Robust Features), and generate a frame feature point sequence matrix;
perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame based on an optical flow tracking algorithm;
obtain a feature value of the to-be-detected video according to the motion vector of each frame; and take the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judge whether jitter occurs to the to-be-detected video according to the output signal.
148. The computer device of claim 147, wherein the computer device is further configured to:
grayscale-process the frame sequence, and obtain a grayscale frame sequence;
and denoise the grayscale frame sequence;

wherein the step of feature point detection on the frame sequence frame by frame is feature point detection on the preprocessed frame sequence frame by frame.
149. The computer device of any one of claims 147 to 148, wherein the computer device is further configured to:
employ the feature point detecting algorithm in which the FAST (Features from Accelerated Segment Test) features and the SURF (Speeded-Up Robust Features) features are fused to perform feature point detection on the frame sequence frame by frame, and to obtain the feature points of each frame.
150. The computer device of any one of claims 147 to 149, wherein the computer device is further configured to:
perform optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtain an initial motion vector of each frame;
obtain a corresponding accumulative motion vector according to the initial motion vector;
smoothen the accumulative motion vector, and obtain a smoothened motion vector; and employ the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtain the motion vector of each frame.
151. The computer device of any one of claims 147 to 150, wherein the computer device is further configured to:
merge and convert the motion vectors of all frames into a matrix, and calculate unbiased standard deviations of various elements in the matrix;
weight and fuse the unbiased standard deviations of the various elements, and obtain a weighted value; and take the unbiased standard deviations of the various elements and the weighted value as the feature value of the to-be-detected video.
152. The computer device of any one of claims 147 to 151, wherein a to-be-detected video is obtained.
153. The computer device of any one of claims 147 to 152, wherein a framing extraction process is performed on the to-be-detected video to obtain a frame sequence corresponding to the to-be-detected video.
154. The computer device of any one of claims 147 to 153, wherein the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
155. The computer device of any one of claims 147 to 154, wherein a current frame and an adjacent next frame are selected from the to-be-detected video.
156. The computer device of any one of claims 147 to 155, wherein a current frame and a next frame by an interval of N frames are selected from the to-be-detected video.
157. The computer device of any one of claims 147 to 156, wherein corresponding feature points are obtained from the current frame and the adjacent next frame.
158. The computer device of any one of claims 147 to 157, wherein corresponding feature points are obtained from the current frame and the next frame by an interval of N
frames.
159. The computer device of any one of claims 147 to 158, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the adjacent next frame.
160. The computer device of any one of claims 147 to 159, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the next frame by an interval of N
frames.
161. The computer device of any one of claims 147 to 160, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame.
162. The computer device of any one of claims 147 to 161, wherein a feature point detecting algorithm is employed to obtain feature points of each frame, that is, feature points of each frame of image are extracted.
163. The computer device of any one of claims 147 to 162, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Zi (i=1, 2, ..., n).
164. The computer device of any one of claims 147 to 163, wherein Zi (i=1, 2, ..., n) is expressed as:
wherein a^i_{p,q} represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
165. The computer device of any one of claims 147 to 164, wherein the optical flow tracking algorithm is employed to perform optical flow tracking calculation on the frame feature point sequence matrix.
166. The computer device of any one of claims 147 to 165, wherein the optical flow tracking algorithm is employed to track the change of feature points in the current frame to the next frame.
167. The computer device of any one of claims 147 to 166, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:

where dx^i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy^i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr^i represents an angle offset from the ith frame to the (i+1)th frame.
168. The computer device of any one of claims 147 to 167, wherein an extracted feature value at least includes the feature value of four dimensions.
169. The computer device of any one of claims 147 to 168, wherein the detection model is well trained in advance.
170. The computer device of any one of claims 147 to 169, wherein sample video data in a set of selected training data is correspondingly processed to obtain a feature value of the sample video data.
171. The computer device of any one of claims 147 to 170, wherein a detection model is trained according to the feature value of the sample video data and a corresponding annotation result of the sample video data to obtain the final detection model.
172. The computer device of any one of claims 147 to 171, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and κm, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, κm, ym}(m).
173. The computer device of any one of claims 147 to 172, wherein the annotation result ym indicates that no jitter occurs to the video sample if ym=0.
174. The computer device of any one of claims 147 to 173, wherein the annotation result ym indicates that jitter occurs to the video sample if ym=1.
175. The computer device of any one of claims 147 to 174, wherein a video sample makes use of features of at least five dimensions.
176. The computer device of any one of claims 147 to 175, wherein the detection model is selected from an SVM (Support Vector Machine) model.
177. The computer device of any one of claims 147 to 176, wherein the feature value of the to-be-detected video is input to a well-trained SVM model to obtain an output result.
178. The computer device of any one of claims 147 to 177, wherein if the/a SVM
model output result is 0, this indicates that no jitter occurs to the to-be-detected video.
179. The computer device of any one of claims 147 to 178, wherein if the/a SVM
model output result is 1, this indicates that jitter occurs to the to-be-detected video.
180. The computer device of any one of claims 147 to 179, wherein the use of a trainable SVM
model as a video jitter decider enables jitter detection of videos captured in different scenarios.
181. The computer device of any one of claims 147 to 180, wherein the amount of information of the image is greatly reduced after grayscale-processing.
182. The computer device of any one of claims 147 to 181, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
183. The computer device of any one of claims 147 to 182, wherein a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R × 0.299 + G × 0.587 + B × 0.114
184. The computer device of any one of claims 147 to 183, wherein a TV
denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
185. The computer device of any one of claims 147 to 184, wherein the/a denoised frame sequence is a preprocessed frame sequence to which the to-be-detected video corresponds.
186. The computer device of any one of claims 147 to 185, wherein the denoising method is randomly selectable.
187. The computer device of any one of claims 149 to 186, wherein the SURF
algorithm is an improvement over SIFT (Scale-Invariant Feature Transform) algorithm.
188. The computer device of claim 187, wherein SIFT is a feature description method with excellent robustness and scale invariance.
189. The computer device of any one of claims 187 to 188, wherein the SURF algorithm alleviates the problems of large calculation load, high time complexity and long calculation time inherent in the SIFT algorithm while maintaining the advantages of the SIFT algorithm.
190. The computer device of any one of claims 149 to 189, wherein SURF performs better in terms of invariance to illumination change and perspective change, and is particularly good at processing severe blurs and rotations of images.
191. The computer device of any one of claims 149 to 190, wherein SURF is excellent in describing local features of images.
192. The computer device of any one of claims 149 to 191, wherein FAST feature detection is a corner detection method.
193. The computer device of any one of claims 149 to 192, wherein the most prominent advantage of FAST feature detection is its calculation efficiency, together with its capability to describe global features of images well.
194. The computer device of any one of claims 149 to 193, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction gives consideration to global features of images.
195. The computer device of any one of claims 149 to 194, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction fully retains local features of images.
196. The computer device of any one of claims 149 to 195, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has small computational expense.
197. The computer device of any one of claims 149 to 196, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has strong robustness against image blurs and faint illumination.
198. The computer device of any one of claims 149 to 197, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction enhances real-time property and precision of the detection.
199. The computer device of any one of claims 147 to 198, wherein while the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm is employed.
200. The computer device of any one of claims 147 to 199, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:
where dx^i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy^i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr^i represents an angle offset from the ith frame to the (i+1)th frame.
201. The computer device of any one of claims 147 to 200, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of an adjacent next frame.
202. The computer device of any one of claims 147 to 201, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of a next frame by an interval of N frames.
203. The computer device of any one of claims 147 to 202, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
204. The computer device of any one of claims 147 to 203, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:

where n represents the total number of frames of the video.
205. The computer device of any one of claims 147 to 204, wherein the radius of the smooth window is r, and its expression is:
where μ indicates a parameter of the sliding window, and μ is a positive number.
206. The computer device of any one of claims 147 to 205, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirements.
207. The computer device of any one of claims 147 to 206, wherein the specific numerical value of μ is set as μ=30.
208. The computer device of any one of claims 147 to 207, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:

209. The computer device of any one of claims 147 to 208, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
210. The computer device of any one of claims 147 to 209, wherein the motion vectors of all frames obtained are merged and converted into a matrix.
211. The computer device of any one of claims 147 to 210, wherein the motion vector λi is converted into the form of a matrix λ, and unbiased standard deviations of its elements are calculated by rows; the specific calculation expression is as follows:
212. The computer device of any one of claims 147 to 211, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which A represents the average value of the samples.
213. The computer device of any one of claims 147 to 212, wherein weights are assigned to the unbiased standard deviations of the various elements according to practical requirements.
214. The computer device of any one of claims 147 to 213, wherein the unbiased standard deviations of the various elements are weighted and fused according to the weights.
215. The computer device of any one of claims 147 to 214, wherein the weights of the unbiased standard deviations of the various elements can be dynamically readjusted according to practical requirements.
216. The computer device of any one of claims 147 to 215, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
217. The computer device of any one of claims 147 to 216, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], κ}(s).
218.A computer system for detecting video jitter, comprising:
a processor, configured to:

perform a framing process on a to-be-detected video to obtain a frame sequence;
perform feature point detection on the frame sequence frame by frame, obtain feature points of each frame by employing a feature point detecting algorithm including fused FAST (Features from Accelerated Segment Test) and SURF
(Speeded-Up Robust Features), and generate a frame feature point sequence matrix;
perform an operation on the frame feature point sequence matrix to obtain a motion vector of each frame based on an optical flow tracking algorithm;
obtain a feature value of the to-be-detected video according to the motion vector of each frame; and take the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judge whether jitter occurs to the to-be-detected video according to the output signal;
a computer-readable storage medium.
219. The computer system of claim 218, wherein the computer system is further configured to:
grayscale-process the frame sequence, and obtain a grayscale frame sequence;
and denoise the grayscale frame sequence;
wherein the step of feature point detection on the frame sequence frame by frame is feature point detection on the preprocessed frame sequence frame by frame.
220. The computer system of any one of claims 218 to 219, wherein the computer system is further configured to:

employ the feature point detecting algorithm in which the FAST (Features from Accelerated Segment Test) features and the SURF (Speeded-Up Robust Features) features are fused to perform feature point detection on the frame sequence frame by frame, and to obtain the feature points of each frame.
221. The computer system of any one of claims 218 to 220, wherein the computer system is further configured to:
perform optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtain an initial motion vector of each frame;
obtain a corresponding accumulative motion vector according to the initial motion vector;
smoothen the accumulative motion vector, and obtain a smoothened motion vector; and employ the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtain the motion vector of each frame.
222. The computer system of any one of claims 218 to 221, wherein the computer system is further configured to:
merge and convert the motion vectors of all frames into a matrix, and calculate unbiased standard deviations of various elements in the matrix;
weight and fuse the unbiased standard deviations of the various elements, and obtain a weighted value; and take the unbiased standard deviations of the various elements and the weighted value as the feature value of the to-be-detected video.
223. The computer system of any one of claims 218 to 222, wherein a to-be-detected video is obtained.
224. The computer system of any one of claims 218 to 223, wherein a framing extraction process is performed on the to-be-detected video to obtain a frame sequence corresponding to the to-be-detected video.
225. The computer system of any one of claims 218 to 224, wherein the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
226. The computer system of any one of claims 218 to 225, wherein a current frame and an adjacent next frame are selected from the to-be-detected video.
227. The computer system of any one of claims 218 to 226, wherein a current frame and a next frame by an interval of N frames are selected from the to-be-detected video.
228. The computer system of any one of claims 218 to 227, wherein corresponding feature points are obtained from the current frame and the adjacent next frame.
229. The computer system of any one of claims 218 to 228, wherein corresponding feature points are obtained from the current frame and the next frame by an interval of N
frames.
230. The computer system of any one of claims 218 to 229, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the adjacent next frame.
231. The computer system of any one of claims 218 to 230, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the next frame by an interval of N
frames.
232. The computer system of any one of claims 218 to 231, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame.
233. The computer system of any one of claims 218 to 232, wherein a feature point detecting algorithm is employed to obtain feature points of each frame, that is, feature points of each frame of image are extracted.
234. The computer system of any one of claims 218 to 233, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Zi (i=1, 2, ..., n).
235. The computer system of any one of claims 218 to 234, wherein Zi (i=1, 2, ..., n) is expressed as:
wherein a^i_{p,q} represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
236. The computer system of any one of claims 218 to 235, wherein the optical flow tracking algorithm is employed to perform optical flow tracking calculation on the frame feature point sequence matrix.
237. The computer system of any one of claims 218 to 236, wherein the optical flow tracking algorithm is employed to track the change of feature points in the current frame to the next frame.
238. The computer system of any one of claims 218 to 237, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:
where dx^i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy^i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr^i represents an angle offset from the ith frame to the (i+1)th frame.
239. The computer system of any one of claims 218 to 238, wherein an extracted feature value at least includes the feature value of four dimensions.
240. The computer system of any one of claims 218 to 239, wherein the detection model is well trained in advance.
241. The computer system of any one of claims 218 to 240, wherein sample video data in a set of selected training data is correspondingly processed to obtain a feature value of the sample video data.
242. The computer system of any one of claims 218 to 241, wherein a detection model is trained according to the feature value of the sample video data and a corresponding annotation result of the sample video data to obtain the final detection model.
243. The computer system of any one of claims 218 to 242, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and κm, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, κm, ym}(m).
244. The computer system of any one of claims 218 to 243, wherein the annotation result ym indicates that no jitter occurs to the video sample if ym=0.
245. The computer system of any one of claims 218 to 244, wherein the annotation result ym indicates that jitter occurs to the video sample if ym=1.
246. The computer system of any one of claims 218 to 245, wherein a video sample makes use of features of at least five dimensions.
247. The computer system of any one of claims 218 to 246, wherein the detection model is selected from an SVM (Support Vector Machine) model.
248. The computer system of any one of claims 218 to 247, wherein the feature value of the to-be-detected video is input to a well-trained SVM model to obtain an output result.
249. The computer system of any one of claims 218 to 248, wherein if the SVM
model output result is 0, this indicates that no jitter occurs to the to-be-detected video.
250. The computer system of any one of claims 218 to 249, wherein if the SVM
model output result is 1, this indicates that jitter occurs to the to-be-detected video.
251. The computer system of any one of claims 218 to 250, wherein the use of a trainable SVM
model as a video jitter decider enables jitter detection of videos captured in different scenarios.
252. The computer system of any one of claims 218 to 251, wherein the amount of information of the image is greatly reduced after grayscale-processing.
253. The computer system of any one of claims 218 to 252, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
254. The computer system of any one of claims 218 to 253, wherein a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R × 0.299 + G × 0.587 + B × 0.114
255. The computer system of any one of claims 218 to 254, wherein a TV
denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
256. The computer system of any one of claims 218 to 255, wherein the denoised frame sequence is a preprocessed frame sequence to which the to-be-detected video corresponds.
257. The computer system of any one of claims 218 to 256, wherein the denoising method is randomly selectable.
258. The computer system of any one of claims 220 to 257, wherein the SURF
algorithm is an improvement over SIFT (Scale-Invariant Feature Transform) algorithm.
259. The computer system of claim 258, wherein SIFT is a feature description method with excellent robustness and scale invariance.
260. The computer system of any one of claims 258 to 259, wherein the SURF algorithm alleviates the problems of large calculation load, high time complexity and long calculation time inherent in the SIFT algorithm while maintaining the advantages of the SIFT algorithm.
261. The computer system of any one of claims 220 to 260, wherein SURF performs better in terms of invariance to illumination change and perspective change, and is particularly good at processing severe blurs and rotations of images.
262. The computer system of any one of claims 220 to 261, wherein SURF is excellent in describing local features of images.
263. The computer system of any one of claims 220 to 262, wherein FAST feature detection is a corner detection method.
264. The computer system of any one of claims 220 to 263, wherein the most prominent advantage of FAST feature detection is its calculation efficiency, together with its capability to describe global features of images well.
265. The computer system of any one of claims 220 to 264, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction gives consideration to global features of images.
266. The computer system of any one of claims 220 to 265, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction fully retains local features of images.
267. The computer system of any one of claims 220 to 266, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has small computational expense.
268. The computer system of any one of claims 220 to 267, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has strong robustness against image blurs and faint illumination.
269. The computer system of any one of claims 220 to 268, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction enhances real-time property and precision of the detection.
270. The computer system of any one of claims 218 to 269, wherein while the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm is employed.
271. The computer system of any one of claims 218 to 270, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next (i+1)th frame, and a motion vector αi is obtained, whose expression is as follows:
where dx^i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy^i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr^i represents an angle offset from the ith frame to the (i+1)th frame.
272. The computer system of any one of claims 218 to 271, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of an adjacent next frame.
273. The computer system of any one of claims 218 to 272, wherein use of the/a pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of a next frame by an interval of N frames.
274. The computer system of any one of claims 218 to 273, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
275. The computer system of any one of claims 218 to 274, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:

where n represents the total number of frames of the video.
276. The computer system of any one of claims 218 to 275, wherein the radius of the smooth window is r, and its expression is:
where μ indicates a parameter of the sliding window, and μ is a positive number.
277. The computer system of any one of claims 218 to 276, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirements.
278. The computer system of any one of claims 218 to 277, wherein the specific numerical value of μ is set as μ=30.
279. The computer system of any one of claims 218 to 278, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:

280. The computer system of any one of claims 218 to 279, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
281. The computer system of any one of claims 218 to 280, wherein the motion vectors of all frames obtained are merged and converted into a matrix.
282. The computer system of any one of claims 218 to 281, wherein the motion vector λi is converted into the form of a matrix λ, and unbiased standard deviations of its elements are calculated by rows; the specific calculation expression is as follows:
283. The computer system of any one of claims 218 to 282, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which A represents the average value of the samples.
284. The computer system of any one of claims 218 to 283, wherein weights are assigned to the unbiased standard deviations of the various elements according to practical requirements.
285. The computer system of any one of claims 218 to 284, wherein the unbiased standard deviations of the various elements are weighted and fused according to the weights.
286. The computer system of any one of claims 218 to 285, wherein the weights of the unbiased standard deviations of the various elements can be dynamically readjusted according to practical requirements.
287. The computer system of any one of claims 218 to 286, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
288. The computer system of any one of claims 218 to 287, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], κ}(s).
289. The computer system of any one of claims 218 to 288, wherein the computer-readable storage medium is a read-only memory.
290. The computer system of any one of claims 218 to 289, wherein the computer-readable storage medium is a magnetic disk.
291. The computer system of any one of claims 218 to 290, wherein the computer-readable storage medium is an optical disk.
292.A method of detecting video jitter, comprising:
performing a framing process on a to-be-detected video to obtain a frame sequence;
performing feature point detection on the frame sequence frame by frame, obtaining feature points of each frame by employing a feature point detecting algorithm including fused FAST (Features from Accelerated Segment Test) and SURF (Speeded-Up Robust Features), and generating a frame feature point sequence matrix;
performing an operation on the frame feature point sequence matrix based on an optical flow tracking algorithm to obtain a motion vector of each frame;
obtaining a feature value of the to-be-detected video according to the motion vector of each frame; and taking the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judging whether jitter occurs to the to-be-detected video according to the output signal.
293. The method of claim 292, wherein the method further includes:
grayscale-processing the frame sequence, and obtaining a grayscale frame sequence; and denoising the grayscale frame sequence;
wherein the step of performing feature point detection on the frame sequence frame by frame comprises performing feature point detection on the preprocessed frame sequence frame by frame.
294. The method of any one of claims 292 to 293, wherein the method further includes:
employing the feature point detecting algorithm in which the FAST (Features from Accelerated Segment Test) features and the SURF (Speeded-Up Robust Features) features are fused to perform feature point detection on the frame sequence frame by frame, and to obtain the feature points of each frame.
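By way of illustration only, a minimal Python/OpenCV sketch of fused FAST and SURF feature point detection might look as follows; the function name and thresholds are hypothetical, and SURF is only available in opencv-contrib builds with the nonfree modules enabled.

    import cv2

    def detect_fused_keypoints(gray_frame):
        # FAST: efficient corner detector, good at covering global image structure.
        fast = cv2.FastFeatureDetector_create(threshold=20)
        fast_kps = fast.detect(gray_frame, None)

        # SURF: robust local feature detector (requires opencv-contrib with nonfree enabled).
        surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
        surf_kps = surf.detect(gray_frame, None)

        # Fuse the two keypoint sets, dropping near-duplicate pixel locations.
        seen, fused = set(), []
        for kp in list(fast_kps) + list(surf_kps):
            key = (int(round(kp.pt[0])), int(round(kp.pt[1])))
            if key not in seen:
                seen.add(key)
                fused.append(kp)
        return fused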
295. The method of any one of claims 292 to 294, wherein the method further includes:
performing optical flow tracking calculation on the frame feature point sequence matrix of each frame, and obtaining an initial motion vector of each frame;
obtaining a corresponding accumulative motion vector according to the initial motion vector;
smoothening the accumulative motion vector, and obtaining a smoothened motion vector;
and employing the accumulative motion vector and the smoothened motion vector to readjust the initial motion vector of each frame, and obtaining the motion vector of each frame.
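A hedged NumPy sketch of this motion-vector readjustment chain follows; the assumption that the readjusted vector equals the initial vector plus the difference between the smoothed and accumulated trajectories is an interpretation, not a quotation of the claims, and the window radius is illustrative.

    import numpy as np

    def readjust_motion_vectors(alpha, radius=15):
        # alpha: (n, 3) array of per-frame initial motion vectors (dx, dy, dr).
        # Accumulative motion vector beta_i: running sum of the initial vectors.
        beta = np.cumsum(alpha, axis=0)

        # Smoothened motion vector gamma_i: sliding average window of the given
        # radius over the accumulated trajectory, clipped at the sequence ends.
        n = len(beta)
        gamma = np.empty_like(beta, dtype=float)
        for i in range(n):
            lo, hi = max(0, i - radius), min(n, i + radius + 1)
            gamma[i] = beta[lo:hi].mean(axis=0)

        # Readjusted motion vector lambda_i (assumed form): initial vector plus the
        # correction that maps the accumulated trajectory onto the smoothed one.
        return alpha + (gamma - beta)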
296. The method of any one of claims 292 to 295, wherein the method further includes:
merging and converting the motion vectors of all frames into a matrix, and calculating unbiased standard deviations of various elements in the matrix;
weighting and fusing the unbiased standard deviations of the various elements, and obtaining a weighted value; and taking the unbiased standard deviations of the various elements and the weighted value as the feature value of the to-be-detected video.
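The following NumPy fragment illustrates one plausible reading of this step: unbiased (ddof=1) standard deviations per element plus a weighted fusion; the weights shown mirror those given later in the claims but are otherwise assumptions.

    import numpy as np

    def video_feature_value(lam, weights=(3.0, 3.0, 10.0)):
        # lam: (n, 3) matrix of readjusted motion vectors (dx, dy, dr) for n frames.
        # Unbiased standard deviation (ddof=1) of each element, column by column.
        stds = lam.std(axis=0, ddof=1)            # sigma[dx], sigma[dy], sigma[dr]
        # Weighted fusion of the three standard deviations into one scalar K.
        k = float(np.dot(weights, stds))
        # Feature value: the three standard deviations plus their weighted value K.
        return np.concatenate([stds, [k]])        # 4-dimensional feature value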
297. The method of any one of claims 292 to 296, wherein a to-be-detected video is obtained.
298. The method of any one of claims 292 to 297, wherein a framing extraction process is performed on the to-be-detected video to obtain a frame sequence corresponding to the to-be-detected video.
299. The method of any one of claims 292 to 298, wherein the frame sequence is expressed as L_i (i=1, 2, 3, ..., n), where L_i represents the ith frame of the video, and n represents the total number of frames of the video.
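A minimal OpenCV sketch of the framing step, assuming the to-be-detected video is read from a file path; frame_sequence is a hypothetical helper name.

    import cv2

    def frame_sequence(video_path):
        # Read the to-be-detected video and return its frames L_1 ... L_n as a list.
        cap = cv2.VideoCapture(video_path)
        frames = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
        cap.release()
        return frames                              # n == len(frames)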
300. The method of any one of claims 292 to 299, wherein a current frame and an adjacent next frame are selected from the to-be-detected video.
301. The method of any one of claims 292 to 300, wherein a current frame and a next frame by an interval of N frames are selected from the to-be-detected video.
302. The method of any one of claims 292 to 301, wherein corresponding feature points are obtained from the current frame and the adjacent next frame.
303. The method of any one of claims 292 to 302, wherein corresponding feature points are obtained from the current frame and the next frame by an interval of N frames.
304. The method of any one of claims 292 to 303, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the adjacent next frame.
305. The method of any one of claims 292 to 304, wherein corresponding matching is performed according to the feature points of the two frames to judge whether offset (jitter) occurs between the current frame and the next frame by an interval of N frames.
306. The method of any one of claims 292 to 305, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence L_i (i=1, 2, 3, ..., n) frame by frame.
307. The method of any one of claims 292 to 306, wherein a feature point detecting algorithm is employed to obtain feature points of each frame, that is, feature points of each frame of image are extracted.
308. The method of any one of claims 292 to 307, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Z_i (i=1, 2, ..., n).
309. The method of any one of claims 292 to 308, wherein Z_i (i=1, 2, ..., n) is expressed as:
Z_i = ( a_pq^i )
wherein a_pq^i represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
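For illustration, one way to realize the 0/1 frame feature point sequence matrix Z_i from a set of detected keypoints; the helper name and the use of OpenCV keypoint objects are assumptions of this sketch.

    import numpy as np

    def feature_point_matrix(keypoints, frame_shape):
        # Build Z_i: a p x q matrix of 0/1 values, with 1 at detected feature point pixels.
        p, q = frame_shape[:2]                     # rows and columns of the frame
        z = np.zeros((p, q), dtype=np.uint8)
        for kp in keypoints:
            col, row = int(round(kp.pt[0])), int(round(kp.pt[1]))
            if 0 <= row < p and 0 <= col < q:
                z[row, col] = 1                    # 1 = feature point, 0 = non-feature point
        return z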
310. The method of any one of claims 292 to 309, wherein the optical flow tracking algorithm is employed to perform optical flow tracking calculation on the frame feature point sequence matrix.
311. The method of any one of claims 292 to 310, wherein the optical flow tracking algorithm is employed to track the change of feature points in the current frame to the next frame.
312. The method of any one of claims 292 to 311, wherein the change of a feature point sequence matrix Z_i in the ith frame is tracked to the next (i+1)th frame, and a motion vector α_i is obtained, whose expression is as follows:
α_i = ( dx_i, dy_i, dr_i )
where dx_i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy_i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr_i represents an angle offset from the ith frame to the (i+1)th frame.
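A hedged sketch of extracting (dx, dy, dr) between two frames with OpenCV's pyramidal Lucas-Kanade tracker; estimating the rotation via a partial affine (similarity) fit is an assumption of this sketch, and prev_pts is expected to be an N x 1 x 2 float32 array of feature point coordinates.

    import cv2
    import numpy as np

    def frame_motion_vector(prev_gray, next_gray, prev_pts):
        # Track feature points from frame i into frame i+1 (pyramidal LK optical flow).
        next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, prev_pts, None)
        good_prev = prev_pts[status.flatten() == 1]
        good_next = next_pts[status.flatten() == 1]

        # Fit a similarity (partial affine) transform to obtain translation and rotation.
        m, _inliers = cv2.estimateAffinePartial2D(good_prev, good_next)
        dx, dy = m[0, 2], m[1, 2]                  # column and row offsets
        dr = np.arctan2(m[1, 0], m[0, 0])          # rotation angle offset
        return np.array([dx, dy, dr])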
313. The method of any one of claims 292 to 312, wherein the extracted feature value includes features of at least four dimensions.
314. The method of any one of claims 292 to 313, wherein the detection model is well trained in advance.
315. The method of any one of claims 292 to 314, wherein sample video data in a set of selected training data is correspondingly processed to obtain a feature value of the sample video data.
316. The method of any one of claims 292 to 315, wherein a detection model is trained according to the feature value of the sample video data and a corresponding annotation result of the sample video data to obtain the final detection model.
317. The method of any one of claims 292 to 316, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of the various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as σ[λ(dx)]_m, σ[λ(dy)]_m, σ[λ(dr)]_m and K_m, and the annotation result y_m of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
{ σ[λ(dx)]_m, σ[λ(dy)]_m, σ[λ(dr)]_m, K_m, y_m }
318. The method of any one of claims 292 to 317, wherein the annotation result y_m indicates that no jitter occurs to the video sample if y_m = 0.
319. The method of any one of claims 292 to 318, wherein the annotation result y_m indicates that jitter occurs to the video sample if y_m = 1.
320. The method of any one of claims 292 to 319, wherein a video sample makes use of features of at least five dimensions.
321. The method of any one of claims 292 to 320, wherein an SVM (Support Vector Machine) model is selected as the detection model.
322. The method of any one of claims 292 to 321, wherein the feature value of the to-be-detected video is input to a well-trained SVM model to obtain an output result.
323. The method of any one of claims 292 to 322, wherein if the SVM model output result is 0, this indicates that no jitter occurs to the to-be-detected video.
324. The method of any one of claims 292 to 323, wherein if the SVM model output result is 1, this indicates that jitter occurs to the to-be-detected video.
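For illustration, the SVM decider could be realized with scikit-learn roughly as below; the kernel choice and function names are assumptions, not part of the claims.

    import numpy as np
    from sklearn.svm import SVC

    def train_jitter_model(features, labels):
        # features: one feature value (e.g., 4-dimensional) per sample video;
        # labels: annotation results, 0 (no jitter) or 1 (jitter).
        model = SVC(kernel="rbf")
        model.fit(features, labels)
        return model

    def detect_jitter(model, feature_value):
        # Returns 1 if jitter occurs to the to-be-detected video, 0 otherwise.
        return int(model.predict(np.asarray(feature_value).reshape(1, -1))[0])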
325. The method of any one of claims 292 to 324, wherein the use of a trainable SVM model as a video jitter decider enables jitter detection of videos captured in different scenarios.
326. The method of any one of claims 292 to 325, wherein the amount of information of the image is greatly reduced after grayscale-processing.
327. The method of any one of claims 292 to 326, wherein the frame sequence L_i (i=1, 2, 3, ..., n) is further grayscale-processed.
328. The method of any one of claims 292 to 327, wherein a grayscale frame sequence is obtained to be expressed as G_i (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R × 0.299 + G × 0.587 + B × 0.114
329. The method of any one of claims 292 to 328, wherein a TV denoising method based on a total variation model is employed to denoise the grayscale frame sequence G_i (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as T_i (i=1, 2, 3, ..., n).
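A minimal preprocessing sketch (grayscale conversion followed by total-variation denoising), under the assumption that scikit-image's Chambolle TV filter is an acceptable stand-in for the TV denoising step; the weight parameter is illustrative.

    import cv2
    import numpy as np
    from skimage.restoration import denoise_tv_chambolle

    def preprocess(frame):
        # Grayscale conversion; cv2.cvtColor applies the same 0.299/0.587/0.114 weights.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Total-variation denoising (Chambolle variant used here as an illustrative stand-in).
        denoised = denoise_tv_chambolle(gray.astype(np.float64) / 255.0, weight=0.1)
        return (denoised * 255.0).astype(np.uint8)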
330. The method of any one of claims 292 to 329, wherein the denoised frame sequence is a preprocessed frame sequence to which the to-be-detected video corresponds.
331. The method of any one of claims 292 to 330, wherein the denoising method may be selected as desired.
332. The method of any one of claims 294 to 331, wherein the SURF algorithm is an improvement over the SIFT (Scale-Invariant Feature Transform) algorithm.
333. The method of claim 332, wherein SIFT is a feature describing method with excellent robustness and scale invariance.
334. The method of any one of claims 332 to 333, wherein the SURF algorithm addresses the problems of a large amount of calculation, high time complexity and long calculation time inherent in the SIFT algorithm, while maintaining the advantages of the SIFT algorithm.
335. The method of any one of claims 294 to 334, wherein SURF performs better with respect to invariance to illumination change and perspective change, and is particularly good at handling severe blurs and rotations of images.
336. The method of any one of claims 294 to 335, wherein SURF is excellent in describing local features of images.
337. The method of any one of claims 294 to 336, wherein FAST feature detection is a corner detection method.
338. The method of any one of claims 294 to 337, wherein the most prominent advantages of the FAST feature detection algorithm are its calculation efficiency and its capability to describe global features of images well.
339. The method of any one of claims 294 to 338, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction gives consideration to global features of images.
340. The method of any one of claims 294 to 339, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction fully retains local features of images.
341. The method of any one of claims 294 to 340, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has small computational expense.
342. The method of any one of claims 294 to 341, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction has strong robustness against image blurs and faint illumination.
343. The method of any one of claims 294 to 342, wherein use of the feature point detecting algorithm in which FAST features and SURF features are fused to perform feature point extraction enhances real-time property and precision of the detection.
344. The method of any one of claims 292 to 343, wherein while the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm is employed.
345. The method of any one of claims 292 to 344, wherein the change of a feature point sequence matrix Z_i in the ith frame is tracked to the next (i+1)th frame, and a motion vector α_i is obtained, whose expression is as follows:
α_i = ( dx_i, dy_i, dr_i )
where dx_i represents a Euclidean column offset from the ith frame to the (i+1)th frame, dy_i represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dr_i represents an angle offset from the ith frame to the (i+1)th frame.
346. The method of any one of claims 292 to 345, wherein use of the pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of an adjacent next frame.
347. The method of any one of claims 292 to 346, wherein use of the pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves the problem of failed tracking due to unduly large change from feature points of a current frame to feature points of a next frame by an interval of N frames.
348. The method of any one of claims 292 to 347, wherein accumulative integral transformation is performed on the initial motion vector α_i of each frame to obtain an accumulative motion vector, expressed as β_i, of each frame, in which the expression of the accumulative motion vector is as follows:
β_i = α_1 + α_2 + ... + α_i
349. The method of any one of claims 292 to 348, wherein a sliding average window is used to smoothen the motion vector β_i to obtain a smoothened motion vector γ_i, whose expression is:

where n represents the total number of frames of the video.
350. The method of any one of claims 292 to 349, wherein the radius of the smooth window is r, and its expression is:
where μ indicates a parameter of the sliding window, and μ is a positive number.
351. The method of any one of claims 292 to 350, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirements.
352. The method of any one of claims 292 to 351, wherein the specific numerical value of μ is set as μ=30.
353. The method of any one of claims 292 to 352, wherein β_i and γ_i are used to readjust α_i to obtain a readjusted motion vector λ_i, whose expression is:

354. The method of any one of claims 292 to 353, wherein the readjusted motion vector λ_i is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
355. The method of any one of claims 292 to 354, wherein the motion vectors of all frames obtained are merged and converted into a matrix.
356. The method of any one of claims 292 to 355, wherein the motion vector λ_i is converted into the form of a matrix λ, and unbiased standard deviations of its elements are calculated by rows, the specific calculation expression being as follows:
σ[λ(dx)] = sqrt( Σ_{i=1..n} ( λ_i(dx) - λ̄(dx) )² / (n - 1) ), and analogously for λ(dy) and λ(dr)
357. The method of any one of claims 292 to 356, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which λ̄ represents the average value of the samples.
358. The method of any one of claims 292 to 357, wherein weights are assigned to the unbiased standard deviations of the various elements according to practical requirements.
359. The method of any one of claims 292 to 358, wherein the unbiased standard deviations of the various elements are weighted and fused according to the weights.
360. The method of any one of claims 292 to 359, wherein the weights of the unbiased standard deviations of the various elements can be dynamically readjusted according to practical requirements.
361. The method of any one of claims 292 to 360, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
K = 3 × σ[λ(dx)] + 3 × σ[λ(dy)] + 10 × σ[λ(dr)]
362. The method of any one of claims 292 to 361, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{ σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], K }
CA3172605A 2019-06-21 2020-06-11 Video jitter detection method and device Active CA3172605C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910546465.X 2019-06-21
CN201910546465.XA CN110248048B (en) 2019-06-21 2019-06-21 Video jitter detection method and device
PCT/CN2020/095667 WO2020253618A1 (en) 2019-06-21 2020-06-11 Video jitter detection method and device

Publications (2)

Publication Number Publication Date
CA3172605A1 CA3172605A1 (en) 2020-12-24
CA3172605C true CA3172605C (en) 2024-01-02

Family

ID=67888794

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3172605A Active CA3172605C (en) 2019-06-21 2020-06-11 Video jitter detection method and device

Country Status (3)

Country Link
CN (1) CN110248048B (en)
CA (1) CA3172605C (en)
WO (1) WO2020253618A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110248048B (en) * 2019-06-21 2021-11-09 苏宁云计算有限公司 Video jitter detection method and device
CN110971895B (en) * 2019-12-18 2022-07-08 北京百度网讯科技有限公司 Video jitter detection method and device
CN111614895B (en) * 2020-04-30 2021-10-29 惠州华阳通用电子有限公司 Image imaging jitter compensation method, system and equipment
CN112887708A (en) * 2021-01-22 2021-06-01 北京锐马视讯科技有限公司 Video jitter detection method and device, video jitter detection equipment and storage medium
CN113115109B (en) * 2021-04-16 2023-07-28 深圳市帧彩影视科技有限公司 Video processing method, device, electronic equipment and storage medium
CN114155254B (en) * 2021-12-09 2022-11-08 成都智元汇信息技术股份有限公司 Image cutting method based on image correction, electronic device and medium
CN116193257B (en) * 2023-04-21 2023-09-22 成都华域天府数字科技有限公司 Method for eliminating image jitter of surgical video image
CN117152214A (en) * 2023-08-30 2023-12-01 哈尔滨工业大学 Defect identification method based on improved optical flow detection
CN117576692B (en) * 2024-01-17 2024-03-29 大连云智信科技发展有限公司 Method for detecting water source pollution of animal husbandry based on image recognition

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101511024A (en) * 2009-04-01 2009-08-19 北京航空航天大学 Movement compensation method of real time electronic steady image based on motion state recognition
US9055223B2 (en) * 2013-03-15 2015-06-09 Samsung Electronics Co., Ltd. Digital image stabilization method and imaging device using the same
CN103826032B (en) * 2013-11-05 2017-03-15 四川长虹电器股份有限公司 Depth map post-processing method
CN104135597B (en) * 2014-07-04 2017-12-15 上海交通大学 A kind of video jitter automatic testing method
CN104144283B (en) * 2014-08-10 2017-07-21 大连理工大学 A kind of real-time digital video digital image stabilization method based on improved Kalman filtering
CN104301712B (en) * 2014-08-25 2016-05-18 浙江工业大学 Monitoring camera shake detection method based on video analysis
US10254845B2 (en) * 2016-01-05 2019-04-09 Intel Corporation Hand gesture recognition for cursor control
CN105681663B (en) * 2016-02-26 2018-06-22 北京理工大学 A kind of video jitter detection method based on interframe movement geometry flatness
JP6823469B2 (en) * 2017-01-20 2021-02-03 キヤノン株式会社 Image blur correction device and its control method, image pickup device, program, storage medium
JP2019020839A (en) * 2017-07-12 2019-02-07 キヤノン株式会社 Image processing apparatus, image processing method and program
US10491832B2 (en) * 2017-08-16 2019-11-26 Qualcomm Incorporated Image capture device with stabilized exposure or white balance
CN108366201B (en) * 2018-02-12 2020-11-06 天津天地伟业信息***集成有限公司 Electronic anti-shake method based on gyroscope
CN110248048B (en) * 2019-06-21 2021-11-09 苏宁云计算有限公司 Video jitter detection method and device

Also Published As

Publication number Publication date
CN110248048A (en) 2019-09-17
WO2020253618A1 (en) 2020-12-24
CA3172605A1 (en) 2020-12-24
CN110248048B (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CA3172605C (en) Video jitter detection method and device
WO2020192483A1 (en) Image display method and device
WO2020259118A1 (en) Method and device for image processing, method and device for training object detection model
CN113286194A (en) Video processing method and device, electronic equipment and readable storage medium
US20190355147A1 (en) Method and apparatus for determining object posture in image, device, and storage medium
CN109284738B (en) Irregular face correction method and system
CN111402130B (en) Data processing method and data processing device
US11741581B2 (en) Training method for image processing model, image processing method, network device, and storage medium
WO2021236296A1 (en) Maintaining fixed sizes for target objects in frames
US9615039B2 (en) Systems and methods for reducing noise in video streams
US20220222776A1 (en) Multi-Stage Multi-Reference Bootstrapping for Video Super-Resolution
CN107566688B (en) Convolutional neural network-based video anti-shake method and device and image alignment device
US9773192B2 (en) Fast template-based tracking
CN107909026B (en) Small-scale convolutional neural network based age and/or gender assessment method and system
EP3798975B1 (en) Method and apparatus for detecting subject, electronic device, and computer readable storage medium
US11704563B2 (en) Classifying time series image data
KR20180054808A (en) Motion detection within images
CN115633262B (en) Image processing method and electronic device
Duan et al. Guided event filtering: Synergy between intensity images and neuromorphic events for high performance imaging
CN116486250A (en) Multi-path image acquisition and processing method and system based on embedded type
CN108229281B (en) Neural network generation method, face detection device and electronic equipment
US10198842B2 (en) Method of generating a synthetic image
CN109784215B (en) In-vivo detection method and system based on improved optical flow method
US11605220B2 (en) Systems and methods for video surveillance
Lamba et al. Fast and efficient restoration of extremely dark light fields

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220921
