CN110234011A - Video compression method and system - Google Patents

Video compression method and system

Info

Publication number
CN110234011A
CN110234011A CN201910318187.2A
Authority
CN
China
Prior art keywords
data
residual error
frame
dimension
encoded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910318187.2A
Other languages
Chinese (zh)
Other versions
CN110234011B (en)
Inventor
林鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wangsu Science and Technology Co Ltd
Original Assignee
Wangsu Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wangsu Science and Technology Co Ltd filed Critical Wangsu Science and Technology Co Ltd
Priority to CN201910318187.2A priority Critical patent/CN110234011B/en
Publication of CN110234011A publication Critical patent/CN110234011A/en
Application granted granted Critical
Publication of CN110234011B publication Critical patent/CN110234011B/en
Expired - Fee Related (current legal status)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 - Selection of coding mode or of prediction mode
    • H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/625 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Discrete Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video compression method and system. The method includes: determining a frame to be encoded and a reference frame in a target video, and computing residual data of the frame to be encoded relative to the reference frame; extracting a mean vector and a variance vector of the residual data respectively; and performing normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, where the dimension of the compressed data is lower than the dimension of the residual data. The technical solution provided by the present application can compress video files effectively.

Description

Video compression method and system
Technical field
The present invention relates to the technical field of video processing, and in particular to a video compression method and system.
Background technique
With the continuous improvement of video definition, the data volume of video files keeps growing. In order to save the bandwidth needed to transmit video files, an efficient and stable video compression scheme is required.
In current mainstream video compression schemes, the video data is first quantized and the quantized result is then scanned, which completes the encoding of the video file. Specifically, the video data can be quantized with a quantization table, and the quantized result is then scanned in a zigzag order. Some zero values in the video data are discarded in this way, reducing the data volume of the video file.
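As an illustration of the quantize-then-scan step just described, the sketch below quantizes a block of transform coefficients with a quantization table and reads the result out in zigzag order. The 4x4 block size, the flat quantization table and the coefficient values are made up for illustration; real codecs use standardized 8x8 or larger tables.

```python
import numpy as np

def zigzag_scan(block):
    """Scan an n x n block of quantized coefficients in zigzag order."""
    n = block.shape[0]
    out = []
    for d in range(2 * n - 1):                      # walk the anti-diagonals
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        if d % 2 == 0:
            cells.reverse()                         # even diagonals run bottom-left to top-right
        out.extend(block[r, c] for r, c in cells)
    return np.array(out)

# Illustrative 4x4 example with made-up DCT coefficients and quantization table.
coeffs = np.array([[52., 12., 3., 1.],
                   [ 9.,  4., 1., 0.],
                   [ 2.,  1., 0., 0.],
                   [ 1.,  0., 0., 0.]])
qtable = np.full((4, 4), 2.0)
quantized = np.round(coeffs / qtable)               # quantization step
scanned = zigzag_scan(quantized)                    # zeros cluster at the tail and can be dropped
```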
However, with this prior-art video compression scheme, video files that contain many zero values compress well, while video files with few zero values compress poorly because little data can be discarded. And if the data is compressed by increasing the quantization step size, the distortion of the compressed video file becomes higher. A more effective video compression scheme is therefore needed.
Summary of the invention
The purpose of the present application is to provide a video compression method and system that can compress video files effectively.
To achieve the above object, in one aspect the present application provides a video compression method. The method includes: determining a frame to be encoded and a reference frame in a target video, and computing residual data of the frame to be encoded relative to the reference frame; extracting a mean vector and a variance vector of the residual data respectively; and performing normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, where the dimension of the compressed data is lower than the dimension of the residual data.
To achieve the above object, in another aspect the present application also provides a video compression system. The system includes: a residual data computation unit, configured to determine a frame to be encoded and a reference frame in a target video, and to compute residual data of the frame to be encoded relative to the reference frame; a vector extraction unit, configured to extract a mean vector and a variance vector of the residual data respectively; and a data compression unit, configured to perform normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, where the dimension of the compressed data is lower than the dimension of the residual data.
To achieve the above object, in another aspect the present application also provides a video compression apparatus. The video compression apparatus includes a memory and a processor, the memory storing a computer program that, when executed by the processor, implements the above video compression method.
It can be seen from the above that in the technical solution provided by the present application, a reference frame is determined in advance for each frame to be encoded in the target video. The reference frame retains the content of the whole picture during compression. For the frame to be encoded, its residual data relative to the reference frame is computed, so that the subsequent encoding of the frame to be encoded only has to operate on the residual data, which considerably reduces the amount of data needed for encoding. To reduce that amount further, characteristic parameters that describe the residual data are extracted from the residual data. In this application, these characteristic parameters are the mean vector and the variance vector of the residual data. The dimensions of the extracted mean vector and variance vector can be lower than the dimension of the original residual data, which achieves dimensionality reduction. Normal-distribution sampling is then performed on the mean vector and the variance vector. This serves two purposes. On the one hand, the sampling removes noise from the mean vector and the variance vector, improving the accuracy of the data compression. On the other hand, the sampled data follows the natural distribution of the data: the sampling amounts to a preliminary restoration of the original residual data from the mean vector and the variance vector, except that the dimension of the sampled data is lower than the dimension of the original residual data. In this way the sampled data keeps a high fidelity while having a lower dimension, so the efficiency of data compression is improved while fidelity is preserved. The data obtained from the normal-distribution sampling can therefore serve as the compressed data of the frame to be encoded for subsequent transmission or decoding. In summary, the technical solution provided by the present application reduces the amount of data needed for video compression through the residual data, and compresses video effectively by extracting the mean vector and the variance vector and performing normal-distribution sampling on them.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the steps of the video compression method in an embodiment of the present invention;
Fig. 2 is a schematic diagram of image processing performed in units of macroblocks in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the structure of the compression model in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the neural network in an embodiment of the present invention;
Fig. 5 is a functional block diagram of the video compression system in an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
The present application provides a video compression method that can be applied to a device with data processing capability. Referring to Fig. 1, the method may include the following steps.
S1: determine a frame to be encoded and a reference frame in a target video, and compute residual data of the frame to be encoded relative to the reference frame.
In this embodiment, the target video can be the video to be encoded (compressed). In the target video, a frame to be encoded and the reference frame corresponding to it can be determined. Specifically, the similarity between the frame to be encoded and a candidate reference frame can be computed with an algorithm such as SATD (Sum of Absolute Transformed Differences) or SAD (Sum of Absolute Differences); when the computed similarity reaches a specified threshold, that candidate frame is taken as the reference frame corresponding to the frame to be encoded. Of course, in practice the selection of the reference frame and the frame to be encoded is not limited to the above scheme and may follow other criteria depending on the scenario; this application places no limitation on how the reference frame and the frame to be encoded are determined.
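A minimal sketch of the SAD-based reference selection described above is given below; the threshold value and the first-match policy are assumptions, since the text only names SAD/SATD and a similarity threshold.

```python
import numpy as np

def sad(frame_a, frame_b):
    """Sum of Absolute Differences between two equal-sized grayscale frames."""
    return int(np.abs(frame_a.astype(np.int64) - frame_b.astype(np.int64)).sum())

def pick_reference(frame_to_encode, candidate_frames, sad_threshold):
    """Return the first candidate whose SAD against the frame to be encoded
    falls below the threshold, i.e. whose similarity reaches the cutoff."""
    for candidate in candidate_frames:
        if sad(frame_to_encode, candidate) <= sad_threshold:
            return candidate
    return None  # no sufficiently similar frame found
```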
In this embodiment, after the frame to be encoded and the reference frame are determined, in order to reduce the data needed to encode the frame to be encoded, the difference of the frame to be encoded relative to the reference frame can be determined and only this difference is encoded, which greatly reduces the data volume required by the encoding process.
Specifically, the residual data of the frame to be encoded relative to the reference frame can be computed. To do so, the residual between the frame to be encoded and the reference frame is computed first. The residual can be the pixel-value difference between corresponding pixels of the frame to be encoded and the reference frame. For example, if the reference frame and the frame to be encoded are 28x28 video frames, the computed residual can be a vector of 28*28 = 784 elements. Because the reference frame and the frame to be encoded are highly similar, the vector representing the residual contains many zero values, and these zero values greatly relieve the encoding burden in the subsequent encoding step.
In one embodiment, the reference frame and the frame to be encoded are usually each divided into a preset number of macroblocks (MacroBlock), and the residual computation above is then carried out in units of macroblocks. Specifically, the frame to be encoded can be divided into a preset number of target macroblocks, and for each target macroblock the corresponding reference macroblock in the reference frame is determined. Referring to Fig. 2, the reference macroblock and the target macroblock pointed to by the two ends of the dashed line cover the same number of pixels and the same area size. In this way, the frame to be encoded and the reference frame are divided into pairs of target macroblocks and reference macroblocks. The local residual between each target macroblock and its corresponding reference macroblock is then computed separately: the pixel values of corresponding pixels in the target macroblock and the reference macroblock are subtracted, yielding the pixel-value difference at each pixel position. The combination of the pixel-value differences within one target macroblock serves as the local residual between that target macroblock and its corresponding reference macroblock. After the local residual of every target macroblock is computed, the combination of all the local residuals serves as the residual between the frame to be encoded and the reference frame.
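The per-macroblock residual computation can be sketched as follows. The 16x16 macroblock size and the use of co-located reference macroblocks are assumptions (the text only speaks of a preset number of macroblocks and their corresponding reference macroblocks), and frame sizes are assumed to be multiples of the macroblock size.

```python
import numpy as np

def macroblock_residual(frame, ref, mb=16):
    """Pixel-wise difference between each target macroblock of the frame to be
    encoded and its (assumed co-located) reference macroblock in the reference
    frame; the combination of all local residuals forms the frame residual."""
    h, w = frame.shape
    residual = np.empty((h, w), dtype=np.int32)
    for y in range(0, h, mb):
        for x in range(0, w, mb):
            target = frame[y:y+mb, x:x+mb].astype(np.int32)
            reference = ref[y:y+mb, x:x+mb].astype(np.int32)
            residual[y:y+mb, x:x+mb] = target - reference   # local residual of one macroblock
    return residual
```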
In this embodiment, to further increase the number of zero values and thereby reduce the data needed for encoding, the computed residual can be converted from the time domain to the frequency domain. Specifically, in practice the Discrete Cosine Transform (DCT) can be applied to the computed residual. After the DCT, the high-frequency part is separated from the low-frequency part, so the data volume is smaller and zero values are more numerous. The frequency-domain residual obtained by the conversion is then taken as the residual data of the frame to be encoded relative to the reference frame.
Of course, in practice the DCT can also be applied in units of macroblocks. Specifically, after the local residual of each target macroblock is computed, each local residual is converted from the time domain to the frequency domain, and the combination of the frequency-domain local residuals is taken as the residual data of the frame to be encoded relative to the reference frame.
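Applying the DCT per local residual block can be sketched as follows, reusing the 16x16 macroblock assumption from the previous sketch; the orthonormal 2-D DCT from SciPy stands in for whatever DCT variant an actual implementation uses.

```python
import numpy as np
from scipy.fft import dctn

def residual_to_frequency(residual, mb=16):
    """Apply a 2-D DCT to each local residual block, concentrating energy in
    the low-frequency coefficients so more near-zero values appear."""
    h, w = residual.shape
    freq = np.empty((h, w), dtype=np.float64)
    for y in range(0, h, mb):
        for x in range(0, w, mb):
            freq[y:y+mb, x:x+mb] = dctn(residual[y:y+mb, x:x+mb], norm='ortho')
    return freq
```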
S3: extract the mean vector and the variance vector of the residual data respectively.
In this embodiment, after the residual data of the frame to be encoded is computed, the residual data can be encoded. Specifically, the encoding of the residual data can be carried out in a trained compression model. Referring to Fig. 3, the compression model may contain two units, an encoder and a decoder. Upon receiving the input residual data, the encoder extracts characteristic parameters of the residual data and uses these parameters to characterize the residual data. Specifically, the characteristic parameters can be the mean vector and the variance vector of the residual data.
As shown in Fig. 3, in practice the encoder may contain a trained deep neural network (DNN). The DNN fits a data model to a large amount of input sample data, and the fitted data model extracts the corresponding mean vector and variance vector for the input residual data. Specifically, in the training stage, a large number of residual data samples and the actual mean vectors and actual variance vectors corresponding to these samples are prepared in advance. These residual data samples are then fed into the DNN to be trained in batches. For example, each batch can consist of the residual data of 100 video frames; if each residual data contains 784 elements, each batch fed into the DNN is a 100x784 residual data matrix. The DNN to be trained processes the input residual data matrix with its initial neurons and produces a predicted mean vector and a predicted variance vector. In the training stage the error between the predicted mean and variance vectors and the actual mean and variance vectors may be large, so this error is fed back to the DNN, which adjusts the weight coefficients of its internal neurons until it can correctly predict the actual mean vector and variance vector when residual data samples are input again. A DNN trained on a large number of residual data samples in this way can predict, with reasonable accuracy, the mean vector and variance vector corresponding to the residual data of the current frame to be encoded.
With reference to Fig. 3 and Fig. 4, the trained DNN may contain multiple fully connected layers (Fully Connected Layer, FCL), which extract the mean vector and the variance vector and perform dimensionality reduction. Specifically, in Fig. 4, assume the input is the residual data of 100 frames to be encoded, which takes the form of a 100x784 matrix. The first fully connected layer of the deep neural network then reduces the residual data from the first dimension (100x784) to the second dimension (100x256). The dimensionality reduction can be realized with the help of a convolution kernel: the kernel computes a weighted average of the pixel values within a region of the frame to be encoded and replaces the pixel values in that region with the weighted-average value, achieving the reduction. Subsequently, the second and third fully connected layers of the deep neural network extract the mean vector and the variance vector of the residual data of the second dimension. The extracted mean vector and variance vector reduce the dimension further compared with the data of the second dimension; their dimension (100x128) is therefore lower than the second dimension (100x256).
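A sketch of such an encoder in PyTorch is shown below. The 784/256/128 layer sizes follow the 100x784 -> 100x256 -> 100x128 example above; representing the variance as a log-variance, and placing a ReLU after the first layer, are assumptions borrowed from common practice (the text mentions ReLU only later, when discussing activation functions).

```python
import torch
import torch.nn as nn

class ResidualEncoder(nn.Module):
    """Encoder sketch: one fully connected layer reduces the flattened residual
    from 784 to 256 dimensions; two parallel fully connected layers then
    produce the 128-dimensional mean and (log-)variance vectors."""
    def __init__(self, in_dim=784, hidden_dim=256, latent_dim=128):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)            # first FC layer (dimensionality reduction)
        self.fc_mean = nn.Linear(hidden_dim, latent_dim)     # second FC layer -> mean vector
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)   # third FC layer -> variance vector (as log-variance)

    def forward(self, residual):                             # residual: (batch, 784), e.g. (100, 784)
        h = torch.relu(self.fc1(residual))                   # ReLU activation (assumed)
        return self.fc_mean(h), self.fc_logvar(h)            # each (batch, 128)
```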
S5: perform normal-distribution sampling on the mean vector and the variance vector to obtain the compressed data of the frame to be encoded, where the dimension of the compressed data is lower than the dimension of the residual data.
In this embodiment, so that the extracted mean vector and variance vector conform to the natural distribution of the data, normal-distribution sampling can be performed on the mean vector and the variance vector, preliminarily restoring the original residual data. The data obtained by the normal-distribution sampling, however, has a lower dimension than the original residual data.
Specifically, referring to Fig. 4, the deep neural network may further contain a normal sampling layer for performing the normal-distribution sampling. The mean vector and variance vector of the residual data of the second dimension (100x256), output by the preceding fully connected layers, enter the normal sampling layer for normal-distribution sampling, producing compressed data of the third dimension (100x128). The dimension of the compressed data is thus lower than the second dimension, and because normal-distribution sampling has been performed, noise in the mean vector and the variance vector is also removed. Finally, the compressed data output by the normal sampling layer serves as the compressed data of the frame to be encoded.
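The normal sampling layer can be sketched with the usual reparameterization form (mean plus standard deviation times standard-normal noise); the text only states that a normal sampling layer follows the fully connected layers, so this particular formulation is an assumption.

```python
import torch

def normal_sample(mean, logvar):
    """Normal-distribution sampling: draw the compressed data from N(mean, var).
    Uses the reparameterization trick with the log-variance convention assumed
    in the encoder sketch above."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)          # eps ~ N(0, I)
    return mean + eps * std              # compressed data, e.g. shape (100, 128)
```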
In one embodiment, to measure the degree of distortion introduced by the normal-distribution sampling, the relative entropy (KL divergence) of the residual data can be computed from the mean vector and the variance vector, and the relative entropy characterizes the distortion after the normal-distribution sampling. In an application example, the relative entropy can be expressed by the following formula:
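The formula itself appears as an image in the original publication and does not survive in this text. Assuming, as is common for this kind of mean/variance encoder, that the sampled distribution N(μ_i, ε_i) (with ε_i read as the variance) is measured against a standard normal prior, the closed form would be:

$$\mathrm{KL} = \frac{1}{2}\sum_{i}\left(\mu_i^{2} + \varepsilon_i - \log \varepsilon_i - 1\right)$$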
where KL denotes the relative entropy, ε_i denotes the variance vector corresponding to the i-th frame to be encoded, and μ_i denotes the mean vector corresponding to the i-th frame to be encoded.
From the computed relative entropy, the normal-distribution sampling process can be adjusted so that the distortion after sampling stays within a reasonable range.
In this embodiment, the compressed data produced by the encoder can go through the subsequent transmission process. After the compressed data is received, it can be decoded with the decoder shown in Fig. 3. Specifically, referring to Fig. 3 and Fig. 4, a decoding neural network can be built on the basis of the trained DNN described above, and the compressed data of the frame to be encoded is reversely reconstructed through the decoding neural network, restoring the compressed data to decoded data whose dimension matches that of the residual data.
Specifically, the decoding neural network can be the reverse of the trained DNN described above and may contain two fully connected layers. As shown in Fig. 4, the compressed data of the frame to be encoded is first fed into the first fully connected layer of the decoding neural network, which restores the compressed data from the low dimension of 100x128 to the second dimension of 100x256. The data restored to the second dimension is then fed into the second fully connected layer of the decoding neural network, which restores it from the second dimension (100x256) to the first dimension (100x784). In this way, decoded data with the same dimension as the original residual data is obtained, and the data restored to the first dimension serves as the decoded data matching the dimension of the residual data.
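A sketch of the decoding network in PyTorch is shown below, mirroring the 128 -> 256 -> 784 restoration described above and the ReLU/Sigmoid activations mentioned later in the text. The Sigmoid output implicitly assumes the residual data were normalized to [0, 1], which the text does not state.

```python
import torch
import torch.nn as nn

class ResidualDecoder(nn.Module):
    """Decoder sketch: two fully connected layers restore the 128-dimensional
    compressed data to 256 and then to 784 dimensions."""
    def __init__(self, latent_dim=128, hidden_dim=256, out_dim=784):
        super().__init__()
        self.fc1 = nn.Linear(latent_dim, hidden_dim)  # restore to the second dimension (256)
        self.fc2 = nn.Linear(hidden_dim, out_dim)     # restore to the first dimension (784)

    def forward(self, z):                             # z: (batch, 128) compressed data
        h = torch.relu(self.fc1(z))                   # ReLU on the first layer, per the text
        return torch.sigmoid(self.fc2(h))             # Sigmoid output; decoded data matches the residual's dimension
```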
In one embodiment, to evaluate the encoding and decoding performance of the whole encoder and decoder, after the compressed data of the frame to be encoded is reversely reconstructed, the error between the restored decoded data and the residual data can be computed, and the cross entropy of this error and the relative entropy can then be computed, so that the cross entropy characterizes the distortion of the decoded data relative to the residual data. In an application example, the simplified formula of the cross entropy is as follows:
C = -log P(X'|X) + KL
where C denotes the cross entropy, X' denotes the decoded data, X denotes the residual data, -log P(X'|X) denotes the error between the decoded data and the residual data, and KL denotes the relative entropy.
In this way, the neural networks of the encoder and the decoder can be corrected with the relative entropy and the cross entropy above, so that the distortion after encoding and decoding stays within the allowed range, or is minimized.
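Taken together, the cross-entropy formula suggests a training objective of the familiar reconstruction-plus-KL form. The sketch below uses binary cross entropy as the reconstruction term and the closed-form KL from the earlier assumption; both choices are borrowed from common practice rather than fixed by the text, and the residual data is again assumed to be normalized to [0, 1].

```python
import torch
import torch.nn.functional as F

def codec_loss(decoded, residual, mean, logvar):
    """C = -logP(X'|X) + KL, with the reconstruction term taken as binary
    cross entropy between decoded data and (normalized) residual data, and
    the KL term computed from the encoder's mean and log-variance."""
    recon = F.binary_cross_entropy(decoded, residual, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mean.pow(2) - logvar.exp())
    return recon + kl
```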
In this embodiment, the decoded data obtained by decoding can subsequently be further encoded with an existing coding scheme (such as CABAC or VLC); this application places no limitation on this.
In practice, the fully connected layers need a suitable activation function when processing data. For example, the first fully connected layer described above can use the ReLU (Rectified Linear Unit) activation function. As another example, the first and second fully connected layers of the decoding neural network can use the ReLU activation function and the Sigmoid activation function respectively. Of course, other activation functions, such as Tanh, can be chosen flexibly in practice according to the fitting effect and actual requirements.
Referring to Fig. 5, the present application also provides a video compression system. The system includes:
a residual data computation unit, configured to determine a frame to be encoded and a reference frame in a target video, and to compute residual data of the frame to be encoded relative to the reference frame;
a vector extraction unit, configured to extract a mean vector and a variance vector of the residual data respectively;
a data compression unit, configured to perform normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, where the dimension of the compressed data is lower than the dimension of the residual data.
In one embodiment, the residual data computation unit includes:
a frequency-domain conversion module, configured to compute the residual between the frame to be encoded and the reference frame, convert the residual from the time domain to the frequency domain, and take the frequency-domain residual obtained by the conversion as the residual data of the frame to be encoded relative to the reference frame.
In one embodiment, the system further includes:
a neural network input unit, configured to feed the residual data into a trained deep neural network, the deep neural network containing multiple fully connected layers;
a dimensionality reduction unit, configured to reduce the residual data from the first dimension to the second dimension through the first fully connected layer of the deep neural network;
correspondingly, the vector extraction unit is further configured to extract the mean vector and the variance vector of the residual data of the second dimension through the second and third fully connected layers of the deep neural network respectively, where the dimension of the mean vector and the variance vector is lower than the second dimension.
In one embodiment, the system further includes:
a decoding unit, configured to reversely reconstruct the compressed data of the frame to be encoded and restore the compressed data to decoded data whose dimension matches that of the residual data.
In one embodiment, the decoding unit includes:
a decoding network input module, configured to feed the compressed data of the frame to be encoded into the first fully connected layer of the decoding neural network, which restores the compressed data to the second dimension;
a data restoration module, configured to feed the data restored to the second dimension into the second fully connected layer of the decoding neural network, which restores it from the second dimension to the first dimension, and to take the data restored to the first dimension as the decoded data matching the dimension of the residual data.
In one embodiment, the system further includes:
a relative entropy computation unit, configured to compute the relative entropy of the residual data from the mean vector and the variance vector, the relative entropy characterizing the distortion after the normal-distribution sampling;
a cross entropy computation unit, configured to compute the error between the restored decoded data and the residual data, and to compute the cross entropy of the error and the relative entropy, the cross entropy characterizing the distortion of the decoded data relative to the residual data.
It can be seen from the above that in the technical solution provided by the present application, a reference frame is determined in advance for each frame to be encoded in the target video. The reference frame retains the content of the whole picture during compression. For the frame to be encoded, its residual data relative to the reference frame is computed, so that the subsequent encoding of the frame to be encoded only has to operate on the residual data, which considerably reduces the amount of data needed for encoding. To reduce that amount further, characteristic parameters that describe the residual data are extracted from the residual data. In this application, these characteristic parameters are the mean vector and the variance vector of the residual data. The dimensions of the extracted mean vector and variance vector can be lower than the dimension of the original residual data, which achieves dimensionality reduction. Normal-distribution sampling is then performed on the mean vector and the variance vector. This serves two purposes. On the one hand, the sampling removes noise from the mean vector and the variance vector, improving the accuracy of the data compression. On the other hand, the sampled data follows the natural distribution of the data: the sampling amounts to a preliminary restoration of the original residual data from the mean vector and the variance vector, except that the dimension of the sampled data is lower than the dimension of the original residual data. In this way the sampled data keeps a high fidelity while having a lower dimension, so the efficiency of data compression is improved while fidelity is preserved. The data obtained from the normal-distribution sampling can therefore serve as the compressed data of the frame to be encoded for subsequent transmission or decoding. In summary, the technical solution provided by the present application reduces the amount of data needed for video compression through the residual data, and compresses video effectively by extracting the mean vector and the variance vector and performing normal-distribution sampling on them.
Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the part of the above technical solution that in essence contributes to the prior art can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (15)

1. A video compression method, characterized in that the method comprises:
determining a frame to be encoded and a reference frame in a target video, and computing residual data of the frame to be encoded relative to the reference frame;
extracting a mean vector and a variance vector of the residual data respectively;
performing normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein a dimension of the compressed data is lower than a dimension of the residual data.
2. The method according to claim 1, characterized in that computing the residual data of the frame to be encoded relative to the reference frame comprises:
computing a residual between the frame to be encoded and the reference frame, converting the residual from a time domain to a frequency domain, and taking the frequency-domain residual obtained by the conversion as the residual data of the frame to be encoded relative to the reference frame.
3. The method according to claim 2, characterized in that computing the residual between the frame to be encoded and the reference frame comprises:
dividing the frame to be encoded into a preset number of target macroblocks, and determining, for each target macroblock, a corresponding reference macroblock in the reference frame;
computing a local residual between each target macroblock and the corresponding reference macroblock respectively, and taking the combination of the local residuals corresponding to the target macroblocks as the residual between the frame to be encoded and the reference frame;
correspondingly, converting each local residual from the time domain to the frequency domain, and taking the combination of the frequency-domain local residuals as the residual data of the frame to be encoded relative to the reference frame.
4. The method according to claim 1, characterized in that after computing the residual data of the frame to be encoded relative to the reference frame, the method further comprises:
feeding the residual data into a trained deep neural network, the deep neural network comprising a plurality of fully connected layers;
reducing the residual data from a first dimension to a second dimension through a first fully connected layer of the deep neural network;
correspondingly, extracting the mean vector and the variance vector of the residual data of the second dimension through a second fully connected layer and a third fully connected layer of the deep neural network respectively, wherein the dimension of the mean vector and the variance vector is lower than the second dimension.
5. The method according to claim 4, characterized in that the deep neural network further comprises a normal sampling layer for performing the normal-distribution sampling;
correspondingly, the mean vector and the variance vector of the residual data of the second dimension enter the normal sampling layer for normal-distribution sampling, producing compressed data of a third dimension, wherein the third dimension is lower than the second dimension.
6. The method according to claim 4, characterized in that after obtaining the compressed data of the frame to be encoded, the method further comprises:
reversely reconstructing the compressed data of the frame to be encoded, and restoring the compressed data to decoded data matching the dimension of the residual data.
7. The method according to claim 6, characterized in that reversely reconstructing the compressed data of the frame to be encoded comprises:
feeding the compressed data of the frame to be encoded into a first fully connected layer of a decoding neural network, and restoring the compressed data to the second dimension;
feeding the data restored to the second dimension into a second fully connected layer of the decoding neural network, restoring the data of the second dimension to the first dimension, and taking the data restored to the first dimension as the decoded data matching the dimension of the residual data.
8. The method according to claim 6, characterized in that after performing normal-distribution sampling on the mean vector and the variance vector, the method further comprises:
computing a relative entropy of the residual data according to the mean vector and the variance vector, the relative entropy characterizing the distortion after the normal-distribution sampling;
correspondingly, after reversely reconstructing the compressed data of the frame to be encoded, the method further comprises:
computing an error between the restored decoded data and the residual data, and computing a cross entropy of the error and the relative entropy, the cross entropy characterizing the distortion of the decoded data relative to the residual data.
9. A video compression system, characterized in that the system comprises:
a residual data computation unit, configured to determine a frame to be encoded and a reference frame in a target video, and to compute residual data of the frame to be encoded relative to the reference frame;
a vector extraction unit, configured to extract a mean vector and a variance vector of the residual data respectively;
a data compression unit, configured to perform normal-distribution sampling on the mean vector and the variance vector to obtain compressed data of the frame to be encoded, wherein a dimension of the compressed data is lower than a dimension of the residual data.
10. The system according to claim 9, characterized in that the residual data computation unit comprises:
a frequency-domain conversion module, configured to compute a residual between the frame to be encoded and the reference frame, convert the residual from a time domain to a frequency domain, and take the frequency-domain residual obtained by the conversion as the residual data of the frame to be encoded relative to the reference frame.
11. The system according to claim 9, characterized in that the system further comprises:
a neural network input unit, configured to feed the residual data into a trained deep neural network, the deep neural network comprising a plurality of fully connected layers;
a dimensionality reduction unit, configured to reduce the residual data from a first dimension to a second dimension through a first fully connected layer of the deep neural network;
correspondingly, the vector extraction unit is further configured to extract the mean vector and the variance vector of the residual data of the second dimension through a second fully connected layer and a third fully connected layer of the deep neural network respectively, wherein the dimension of the mean vector and the variance vector is lower than the second dimension.
12. The system according to claim 11, characterized in that the system further comprises:
a decoding unit, configured to reversely reconstruct the compressed data of the frame to be encoded and restore the compressed data to decoded data matching the dimension of the residual data.
13. The system according to claim 12, characterized in that the decoding unit comprises:
a decoding network input module, configured to feed the compressed data of the frame to be encoded into a first fully connected layer of a decoding neural network, which restores the compressed data to the second dimension;
a data restoration module, configured to feed the data restored to the second dimension into a second fully connected layer of the decoding neural network, which restores the data of the second dimension to the first dimension, and to take the data restored to the first dimension as the decoded data matching the dimension of the residual data.
14. The system according to claim 12, characterized in that the system further comprises:
a relative entropy computation unit, configured to compute a relative entropy of the residual data according to the mean vector and the variance vector, the relative entropy characterizing the distortion after the normal-distribution sampling;
a cross entropy computation unit, configured to compute an error between the restored decoded data and the residual data, and to compute a cross entropy of the error and the relative entropy, the cross entropy characterizing the distortion of the decoded data relative to the residual data.
15. A video compression apparatus, characterized in that the video compression apparatus comprises a memory and a processor, the memory being configured to store a computer program that, when executed by the processor, implements the method according to any one of claims 1 to 8.
CN201910318187.2A 2019-04-19 2019-04-19 Video compression method and system Expired - Fee Related CN110234011B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910318187.2A CN110234011B (en) 2019-04-19 2019-04-19 Video compression method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910318187.2A CN110234011B (en) 2019-04-19 2019-04-19 Video compression method and system

Publications (2)

Publication Number Publication Date
CN110234011A true CN110234011A (en) 2019-09-13
CN110234011B CN110234011B (en) 2021-09-24

Family

ID=67860744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910318187.2A Expired - Fee Related CN110234011B (en) 2019-04-19 2019-04-19 Video compression method and system

Country Status (1)

Country Link
CN (1) CN110234011B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021093393A1 (en) * 2019-11-13 2021-05-20 南京邮电大学 Video compressed sensing and reconstruction method and apparatus based on deep neural network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060193527A1 (en) * 2005-01-11 2006-08-31 Florida Atlantic University System and methods of mode determination for video compression
US20080084929A1 (en) * 2006-10-05 2008-04-10 Xiang Li Method for video coding a sequence of digitized images
CN102158703A (en) * 2011-05-04 2011-08-17 西安电子科技大学 Distributed video coding-based adaptive correlation noise model construction system and method
CN103546749A (en) * 2013-10-14 2014-01-29 上海大学 Method for optimizing HEVC (high efficiency video coding) residual coding by using residual coefficient distribution features and bayes theorem
CN104299201A (en) * 2014-10-23 2015-01-21 西安电子科技大学 Image reconstruction method based on heredity sparse optimization and Bayes estimation model
CN104702961A (en) * 2015-02-17 2015-06-10 南京邮电大学 Code rate control method for distributed video coding
CN109587487A (en) * 2017-09-28 2019-04-05 上海富瀚微电子股份有限公司 The appraisal procedure and system of the structural distortion factor of a kind of pair of RDO strategy

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060193527A1 (en) * 2005-01-11 2006-08-31 Florida Atlantic University System and methods of mode determination for video compression
US20080084929A1 (en) * 2006-10-05 2008-04-10 Xiang Li Method for video coding a sequence of digitized images
CN102158703A (en) * 2011-05-04 2011-08-17 西安电子科技大学 Distributed video coding-based adaptive correlation noise model construction system and method
CN103546749A (en) * 2013-10-14 2014-01-29 上海大学 Method for optimizing HEVC (high efficiency video coding) residual coding by using residual coefficient distribution features and bayes theorem
CN104299201A (en) * 2014-10-23 2015-01-21 西安电子科技大学 Image reconstruction method based on heredity sparse optimization and Bayes estimation model
CN104702961A (en) * 2015-02-17 2015-06-10 南京邮电大学 Code rate control method for distributed video coding
CN109587487A (en) * 2017-09-28 2019-04-05 上海富瀚微电子股份有限公司 The appraisal procedure and system of the structural distortion factor of a kind of pair of RDO strategy

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王建福 (Wang Jianfu): "Research on H.265/HEVC Encoding Acceleration Algorithms", Electronic Journal of Outstanding Doctoral Dissertations *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021093393A1 (en) * 2019-11-13 2021-05-20 南京邮电大学 Video compressed sensing and reconstruction method and apparatus based on deep neural network

Also Published As

Publication number Publication date
CN110234011B (en) 2021-09-24

Similar Documents

Publication Publication Date Title
Cheng et al. Energy compaction-based image compression using convolutional autoencoder
US5699121A (en) Method and apparatus for compression of low bit rate video signals
US20080008246A1 (en) Optimizing video coding
CN110677651A (en) Video compression method
CN107211133B (en) Method and device for inverse quantization of transform coefficients and decoding device
WO2002096118A2 (en) Decoding compressed image data
US8594189B1 (en) Apparatus and method for coding video using consistent regions and resolution scaling
CN109903351B (en) Image compression method based on combination of convolutional neural network and traditional coding
CN101272489B (en) Encoding and decoding device and method for video image quality enhancement
Zhou et al. DCT-based color image compression algorithm using an efficient lossless encoder
Akbari et al. Learned variable-rate image compression with residual divisive normalization
Song et al. Compressed image restoration via artifacts-free PCA basis learning and adaptive sparse modeling
CN111741300A (en) Video processing method
Al-Khafaji Image compression based on quadtree and polynomial
Cai et al. A novel deep progressive image compression framework
CN111163314A (en) Image compression method and system
Akbari et al. Learned bi-resolution image coding using generalized octave convolutions
CN110234011A (en) A kind of video-frequency compression method and system
Lee et al. CNN-based approach for visual quality improvement on HEVC
Akbari et al. Image compression using adaptive sparse representations over trained dictionaries
CN110730347A (en) Image compression method and device and electronic equipment
CN111161363A (en) Image coding model training method and device
Selim et al. A simplified fractal image compression algorithm
Putra et al. Intra-frame based video compression using deep convolutional neural network (dcnn)
CN114501034B (en) Image compression method and medium based on discrete Gaussian mixture super prior and Mask

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210924