CN114827630B - CU depth division method, system, device and medium based on frequency domain distribution learning - Google Patents

CU depth division method, system, device and medium based on frequency domain distribution learning

Info

Publication number
CN114827630B
Authority
CN
China
Prior art keywords
frequency domain
blocks
dividing
block
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210241583.1A
Other languages
Chinese (zh)
Other versions
CN114827630A (en)
Inventor
许皓淇
曹英烈
周智恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Guangzhou City University of Technology
Original Assignee
South China University of Technology SCUT
Guangzhou City University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT, Guangzhou City University of Technology filed Critical South China University of Technology SCUT
Priority to CN202210241583.1A
Publication of CN114827630A
Application granted
Publication of CN114827630B
Active (current legal status)
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N 19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/124 Quantisation
    • H04N 19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N 19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a CU depth division method, system, device and medium based on frequency domain distribution learning. The method divides an image into a number of 64x64 blocks and applies a DCT to each block to obtain its frequency domain coefficient distribution matrix F_64; a probability score p_64 is then computed from F_64 and the corresponding weight matrix W_64. If p_64 is smaller than the division threshold α_64, downward division of the block ends; if it is greater than α_64, the block continues to be divided downward into four 32x32 sub-CU blocks according to the quadtree principle. In the same way, the 32x32 frequency domain coefficient matrix F_32 and weight matrix W_32 yield a probability score p_32, which is compared with the division threshold α_32 to decide whether to continue dividing, and so on until every CU block either stops dividing early or reaches the minimum 8x8 CU size. By deciding whether to continue dividing from the probability score and the division threshold, the method avoids traversing and recursing over all cases, reduces the complexity of CU depth division, saves a large amount of encoding time, and can be widely applied in the technical field of video encoding.

Description

CU depth division method, system, device and medium based on frequency domain distribution learning
Technical Field
The invention relates to the technical field of artificial intelligence and video coding, and in particular to a CU depth division method, system, device and medium based on frequency domain distribution learning.
Background
With the development of internet and communication technologies in recent years, the rapid growth of video traffic has presented a great challenge to video coding technology.
In a conventional coding framework (HEVC, for example), each frame to be coded is typically divided into a sequence of CTUs (Coding Tree Units) before the subsequent prediction, transform and quantization operations can be performed. A CTU can be divided downward into CUs (Coding Units) of different sizes according to the quadtree principle, with a maximum size of 64x64 and a minimum size of 8x8. The way a CTU is divided determines the efficiency of the subsequent encoding.
In order to obtain the optimal CTU partitioning, the encoder uses a recursive traversal scheme that partitions from the 64x64 CU down to 8x8 CUs and evaluates every candidate partition with the built-in rate-distortion cost function until the optimal one is selected.
Such exhaustive division wastes a great deal of encoding time and computing resources, and the waste grows more pronounced as the resolution of video images increases. How to reduce the complexity of CU depth partitioning has therefore become a pressing problem for the industry.
Disclosure of Invention
In order to solve, at least to some extent, one of the technical problems existing in the prior art, the invention aims to provide a CU depth division method, system, device and medium based on frequency domain distribution learning.
The technical scheme adopted by the invention is as follows:
a CU depth division method based on frequency domain distribution learning comprises the following steps:
acquiring a video image, dividing the video image into a plurality of first CU blocks with the size of 64x64, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) of the first CU blocks
Figure GDA0004182369950000011
According to the frequency domain coefficient distribution matrix->
Figure GDA0004182369950000012
Acquiring probability score->
Figure GDA0004182369950000013
If it is
Figure GDA0004182369950000014
Dividing the first CU block into 4 second CU blocks with the size of 32x32 downwards to obtain frequency domain coefficient partitions of DCT (discrete cosine transform) of the second CU blocksCloth matrix->
Figure GDA0004182369950000015
According to the frequency domain coefficient distribution matrix->
Figure GDA0004182369950000016
Acquiring probability score->
Figure GDA0004182369950000017
Otherwise, ending the division of the first CU block;
if it is
Figure GDA0004182369950000018
Dividing the second CU block into 4 third CU blocks with the size of 16x16 downwards to obtain a DCT transformed frequency domain coefficient distribution matrix of the third CU blocks>
Figure GDA0004182369950000019
According to the frequency domain coefficient distribution matrix->
Figure GDA00041823699500000110
Acquiring probability score->
Figure GDA00041823699500000111
Otherwise, ending the division of the second CU block;
if it is
Figure GDA0004182369950000021
Dividing the third CU block down into 4 fourth CU blocks of 8x8 size; otherwise, ending the division of the third CU block;
wherein ,αN To divide the threshold, n=64, 32, 16.
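As an illustration of the flow just described (not the patented implementation itself), the following minimal sketch organizes the early-termination decision recursively. It assumes the weight matrices W_N and thresholds α_N have already been learned and are supplied as plain NumPy arrays and floats, and it uses scipy.fft.dctn with orthonormal normalization as the 2-D DCT; that normalization and the helper name partition_cu are assumptions made for this sketch.

import numpy as np
from scipy.fft import dctn

def partition_cu(block, weights, thresholds, size=64, min_size=8):
    # block: size x size luminance samples (2-D NumPy array)
    # weights: dict {64: W_64, 32: W_32, 16: W_16} of learned weight matrices
    # thresholds: dict {64: a_64, 32: a_32, 16: a_16} of division thresholds
    # Returns a nested dict describing the quadtree partition.
    if size == min_size:
        return {"size": size, "split": False}
    f = dctn(block.astype(np.float64), type=2, norm="ortho")  # frequency domain matrix F_N
    p = float(np.sum(weights[size] * f))                      # score p_N = sum_ij W_N(i,j) * F_N(i,j)
    if p < thresholds[size]:                                  # early termination: stop dividing
        return {"size": size, "split": False, "score": p}
    half = size // 2                                          # quadtree split into four sub-CUs
    quads = [block[:half, :half], block[:half, half:],
             block[half:, :half], block[half:, half:]]
    return {"size": size, "split": True, "score": p,
            "children": [partition_cu(q, weights, thresholds, half, min_size)
                         for q in quads]}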
Further, the probability score p_64^k is calculated by the following formula:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)

wherein W_64 is a preset frequency domain distribution weight matrix, i represents the row coordinate of the matrix, j represents the column coordinate of the matrix, and the superscript k indexes the first CU blocks.
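As a concrete illustration of this weighted-sum score, here is a small NumPy example with made-up numbers (not values from the patent):

import numpy as np

# Toy 2x2 illustration of p = sum_ij W(i,j) * F(i,j); all numbers are made up.
W = np.array([[0.1, 0.4],
              [0.4, 0.9]])   # learned frequency domain distribution weights
F = np.array([[50.0, 3.0],
              [2.0, 8.0]])   # DCT coefficient values of a block
p = float(np.sum(W * F))     # 0.1*50 + 0.4*3 + 0.4*2 + 0.9*8 = 14.2
print(p)                     # this score is compared against the division threshold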
Further, the frequency domain distribution weight matrix W_64 is obtained by:

acquiring the training set and samples required by the network, {(F_64^k, L_k)}, where F_64^k represents the DCT frequency domain coefficient distribution matrix corresponding to the kth CU block of 64x64 size, and L_k = 0, 1 indicates whether the CU block continues to divide downward, 0 indicating no and 1 indicating yes;

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W_64.

The expression of the preset loss function is given in the original as an image formula defined over the training samples (F_64^k, L_k) and the weight matrix W_64; it is not reproduced here.
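Because the exact loss function above is available only as an image, the following sketch substitutes an ordinary mean-squared-error loss on a sigmoid of the score, fitted by gradient descent, purely to illustrate how a weight matrix of this kind could be learned from labelled DCT matrices. The function name, the learning rate and the loss itself are assumptions, not the patented formulation.

import numpy as np

def learn_weight_matrix(F_samples, labels, size=64, lr=1e-4, epochs=200):
    # F_samples: array (n, size, size) of DCT matrices F_64^k
    # labels: array (n,) with L_k in {0, 1} (1 = block was further divided)
    # Loss used here: mean squared error between L_k and sigmoid(score), an
    # assumption for illustration; the patent defines its own loss as an image formula.
    n = len(labels)
    W = np.zeros((size, size))
    for _ in range(epochs):
        scores = np.einsum("kij,ij->k", F_samples, W)   # p_k = sum_ij W(i,j) * F_k(i,j)
        preds = 1.0 / (1.0 + np.exp(-scores))           # squash scores to (0, 1)
        grad_scores = 2.0 * (preds - labels) * preds * (1.0 - preds) / n
        grad_W = np.einsum("k,kij->ij", grad_scores, F_samples)
        W -= lr * grad_W
    return W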
further, the division threshold value alpha 64 Obtained by:
selecting a label sample of training set l=0
Figure GDA0004182369950000027
Calculating probability scores from selected samples
Figure GDA0004182369950000028
From calculated probability scores
Figure GDA0004182369950000029
Acquiring the division threshold alpha 64
Further, the threshold of divisionValue of
Figure GDA00041823699500000210
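The rule that maps the L = 0 scores to α_64 is likewise shown only as an image. One plausible reading, used in the sketch below purely for illustration, is to take a high percentile of those scores so that almost all non-divided blocks fall below the threshold; the function name and the percentile are assumptions.

import numpy as np

def pick_threshold(F_nondiv, W, percentile=95.0):
    # F_nondiv: array (m, 64, 64) of DCT matrices whose label is L = 0
    # W: learned 64x64 frequency domain distribution weight matrix
    # Taking a high percentile of the L = 0 scores is an assumption for illustration;
    # the patent specifies its own rule as an image formula.
    scores = np.einsum("kij,ij->k", F_nondiv, W)     # p_64^k for the non-divided samples
    return float(np.percentile(scores, percentile))  # blocks scoring above this keep dividing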
Further, the dividing the video image into a number of first CU blocks of 64x64 size includes:
the video image is divided into first CU blocks of 64x64 size according to the luminance component.
Further, the dividing the video image into a number of first CU blocks of 64x64 size includes:
after dividing the video image into a number of first CU blocks of 64x64 size, performing pixel interpolation on any remaining pixel areas smaller than 64x64.
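A minimal sketch of this preprocessing step, assuming the pixel interpolation is realized as edge replication (the exact interpolation method is not fixed here, so this choice and the function name are assumptions):

import numpy as np

def pad_luma_to_multiple_of_64(luma):
    # Pad a luminance plane so both dimensions are multiples of 64.
    # Edge replication stands in for the pixel interpolation mentioned above;
    # the exact interpolation method is an assumption for this sketch.
    h, w = luma.shape
    pad_h = (-h) % 64
    pad_w = (-w) % 64
    return np.pad(luma, ((0, pad_h), (0, pad_w)), mode="edge")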
The invention adopts another technical scheme that:

a CU depth division system based on frequency domain distribution learning, comprising:

a first dividing module, configured to acquire a video image, divide the video image into a number of first CU blocks of 64x64 size, obtain the DCT transformed frequency domain coefficient distribution matrix F_64 of each first CU block, and acquire a probability score p_64 from the frequency domain coefficient distribution matrix F_64;

a second dividing module, configured to: if p_64 ≥ α_64, divide the first CU block downward into 4 second CU blocks of 32x32 size, obtain the DCT transformed frequency domain coefficient distribution matrix F_32 of each second CU block, and acquire a probability score p_32 from F_32; otherwise, end the division of the first CU block;

a third dividing module, configured to: if p_32 ≥ α_32, divide the second CU block downward into 4 third CU blocks of 16x16 size, obtain the DCT transformed frequency domain coefficient distribution matrix F_16 of each third CU block, and acquire a probability score p_16 from F_16; otherwise, end the division of the second CU block;

a fourth dividing module, configured to: if p_16 ≥ α_16, divide the third CU block downward into 4 fourth CU blocks of 8x8 size; otherwise, end the division of the third CU block;

wherein α_N is the division threshold, N = 64, 32, 16.
The invention adopts another technical scheme that:

a CU depth division apparatus based on frequency domain distribution learning, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The invention adopts another technical scheme that:
a computer readable storage medium, in which a processor executable program is stored, which when executed by a processor is adapted to carry out the method as described above.
The beneficial effects of the invention are as follows: the method decides whether to continue dividing from the probability score and the division threshold, which makes it possible to terminate the division early, so it does not need to traverse and recurse over all cases; this reduces the complexity of CU depth division and saves a large amount of encoding time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description refers to the accompanying drawings of the embodiments of the present invention or of the related prior art. It should be understood that the drawings described below cover only some embodiments of the technical solutions of the present invention, for convenience and clarity of description, and that other drawings can be obtained from them by those skilled in the art without inventive effort.
FIG. 1 is a flowchart of the steps of a CU depth division method based on frequency domain distribution learning in an embodiment of the present invention;

FIG. 2 is a schematic illustration of a video image in an embodiment of the invention;

FIG. 3 is a visualization of the DCT frequency domain coefficient distributions of the 64x64 CU blocks of the video image in FIG. 2;

FIG. 4 is a flowchart of a fast CU depth division method based on frequency domain distribution learning in an embodiment of the present invention;

FIG. 5 is a network training flowchart for learning the frequency domain distribution weight matrix W_64 and the division threshold α_64 in an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention. The step numbers in the following embodiments are set for convenience of illustration only, and the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that references to orientation descriptions such as upper, lower, front, rear, left, right, etc. are based on the orientation or positional relationship shown in the drawings, are merely for convenience of description of the present invention and to simplify the description, and do not indicate or imply that the apparatus or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the present invention.
In the description of the present invention, "a number of" means one or more, "a plurality of" means two or more, and "greater than", "less than", "exceeding", etc. are understood to exclude the stated number, while "above", "below", "within", etc. are understood to include it. Descriptions of "first" and "second" are only for distinguishing technical features and should not be construed as indicating or implying relative importance, implicitly indicating the number of technical features indicated, or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present invention can be reasonably determined by a person skilled in the art in combination with the specific contents of the technical scheme.
As shown in fig. 1, the present embodiment provides a CU depth division method based on frequency domain distribution learning, including the following steps:

S101, acquiring a video image, dividing the video image into a number of first CU blocks of 64x64 size, obtaining the DCT transformed frequency domain coefficient distribution matrix F_64 of each first CU block, and acquiring the probability score p_64 from the frequency domain coefficient distribution matrix F_64.

The luminance component of the video frame image is selected and divided into N CU blocks of 64x64 size. If there are remaining regions smaller than 64x64 pixels, pixel interpolation is performed on those regions.

A DCT is applied to the kth 64x64 CU area to obtain its frequency domain coefficient matrix F_64^k, k = 1, 2, ..., N, and the division probability score is calculated:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)
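A minimal sketch of this step, dividing an already padded luminance plane into 64x64 blocks and applying a 2-D DCT to each; scipy.fft.dctn is used as the transform and the orthonormal normalization and function name are assumptions for this sketch.

import numpy as np
from scipy.fft import dctn

def dct_blocks_64(luma):
    # Yield (row, col, F_64) for each 64x64 block of a padded luminance plane.
    h, w = luma.shape
    for y in range(0, h, 64):
        for x in range(0, w, 64):
            block = luma[y:y + 64, x:x + 64].astype(np.float64)
            yield y, x, dctn(block, type=2, norm="ortho")  # frequency domain matrix F_64^k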
s102, if
p_64 ≥ α_64, dividing the first CU block downward into 4 second CU blocks of 32x32 size, obtaining the DCT transformed frequency domain coefficient distribution matrix F_32 of each second CU block, and acquiring the probability score p_32 from F_32; otherwise, ending the division of the first CU block.

If p_64^k < α_64, division of this CU area is stopped early; if p_64^k ≥ α_64, this CU area (called LCU_64) continues to be divided downward into 4 CU areas of 32x32 size according to the quadtree principle.

For an LCU_64 that continues to divide downward, the frequency domain coefficient matrix F_32^m of its mth 32x32 CU area is likewise obtained, m = 1, 2, 3, 4, and the division probability score is calculated:

p_32^m = Σ_i Σ_j W_32(i,j) · F_32^m(i,j)
s103, if
p_32 ≥ α_32, dividing the second CU block downward into 4 third CU blocks of 16x16 size, obtaining the DCT transformed frequency domain coefficient distribution matrix F_16 of each third CU block, and acquiring the probability score p_16 from F_16; otherwise, ending the division of the second CU block.

If p_32^m < α_32, division of this CU area is stopped early; if p_32^m ≥ α_32, this CU area (called LCU_32) continues to be divided downward into 4 CU areas of 16x16 size according to the quadtree principle.

Similarly, for an LCU_32 that continues to divide downward, the frequency domain coefficient matrix F_16^n of its nth 16x16 CU area is obtained, n = 1, 2, 3, 4, and the division probability score is calculated:

p_16^n = Σ_i Σ_j W_16(i,j) · F_16^n(i,j)
s104, if
p_16 ≥ α_16, dividing the third CU block downward into 4 fourth CU blocks of 8x8 size; otherwise, ending the division of the third CU block.

If p_16^n < α_16, division of this CU area is stopped early; if p_16^n ≥ α_16, this CU area continues to be divided downward into 4 CU areas of 8x8 size according to the quadtree principle, and the division ends.
As can be seen from the above, the method provides a way to terminate the division early and does not need to traverse and recurse over all cases, which reduces the complexity of CU depth division and saves a large amount of encoding time. Moreover, computing the division probability score from the learned frequency domain distribution essentially amounts to an operation on just two matrices, which is simple for the processor to handle and saves considerable computing resources and time.
The above method is explained in detail below with reference to the drawings and specific examples.
As shown in fig. 4, fig. 4 is a flowchart of a fast CU depth division method based on frequency domain distribution learning according to an embodiment of the present invention. The method comprises a frequency domain distribution learning network module and a CU depth division judgment module. Before CU depth division judgment can be carried out, two important parameters, the frequency domain distribution weight matrix W_N and the division threshold α_N (N = 64, 32, 16), need to be learned by the frequency domain distribution learning network module.
FIG. 5 shows the network training flowchart for learning the frequency domain distribution weight matrix W_64 and the division threshold α_64 provided by an embodiment of the present invention. The remaining frequency domain distribution weight matrices W_32, W_16 and division thresholds α_32, α_16 are obtained by a similar training process, so no additional drawing is needed.

Step A1: acquire the training set and samples required by the network, {(F_64^k, L_k)}, where F_64^k represents the DCT frequency domain coefficient distribution matrix corresponding to the kth CU block of 64x64 size, and L_k = 0, 1 indicates whether that CU block continues to divide downward, 0 indicating no and 1 indicating yes.

Step A2: set the loss function (given in the original as an image formula) and train the network to obtain the frequency domain distribution weight matrix W_64.

Step A3: using the currently learned frequency domain distribution weight matrix W_64, select the label samples with L = 0 in the data set, {(F_64^k, 0)}, and calculate their division probability scores p_64^k.

Step A4: observe the distribution of the probability scores p_64^k and set the division threshold α_64 in a suitable manner. The specific choice adopted in this embodiment (given in the original as an image formula) was found to give good experimental results.
Referring to fig. 2 and 3, the more high frequency components a region contains, the more it tends to be divided down into smaller CU blocks. The frequency domain distribution learning network module learns a representation of the association between the richness of the high frequency components and the CU partition depth; this association is parameterized by the frequency domain distribution weight matrix W_N and the division threshold α_N.

Step S1: divide the luminance component of any video image into a number of CU blocks of 64x64 size, compute the DCT frequency domain coefficient distribution matrix F_64 of each block, and calculate the corresponding probability score p_64 from F_64;

Step S2: determine the relationship between the probability score p_64 and the division threshold α_64;

Step S3: if p_64 < α_64, end the division of the CU block early;

Step S4: if p_64 ≥ α_64, divide the CU block downward into 4 32x32 sub-CU blocks according to the quadtree principle;

Step S5: apply the DCT to the 4 32x32 sub-CU blocks to obtain their frequency domain coefficient distribution matrices F_32, and calculate the probability scores p_32;

Step S6: determine the relationship between the probability score p_32 and the division threshold α_32;

Step S7: if p_32 < α_32, end the division of the CU block early;

Step S8: if p_32 ≥ α_32, divide the CU block downward into 4 16x16 sub-CU blocks according to the quadtree principle;

Step S9: apply the DCT to the 4 16x16 sub-CU blocks to obtain their frequency domain coefficient distribution matrices F_16, and calculate the probability scores p_16;

Step S10: determine the relationship between the probability score p_16 and the division threshold α_16;

Step S11: if p_16 < α_16, end the division of the CU block early;

Step S12: if p_16 ≥ α_16, divide the CU block downward into 4 8x8 sub-CU blocks according to the quadtree principle, ending the division.
It can be seen that steps S3, S7 and S11 each provide an opportunity to end the partitioning early, so in many cases the decision no longer has to wait until all partitioning modes of a CU block have been traversed, avoiding the waste of encoding time and computing resources. The divisions of different sub-CU blocks are independent and do not affect one another, so the program can process them in parallel and make decisions on several sub-CU blocks at the same time, which greatly reduces encoding time. In addition, only one of S3 and S4, of S7 and S8, and of S11 and S12 is ever executed, so even when a 64x64 CU block has to be divided all the way down to the minimum 8x8 CU blocks, only 9 steps are executed along any one path: 3 DCT transforms, 3 matrix operations and 3 comparisons. This greatly reduces the demand on the processor's computing resources.
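Because the four sub-CU decisions are independent, they can be evaluated concurrently. The sketch below dispatches the four 32x32 quadrants of an LCU_64 to a thread pool; the helper names and the use of concurrent.futures are illustrative assumptions, not part of the patent.

import numpy as np
from concurrent.futures import ThreadPoolExecutor
from scipy.fft import dctn

def score_quadrant(quad, W_32):
    # Score one 32x32 quadrant: 2-D DCT followed by the weighted sum against W_32.
    f = dctn(quad.astype(np.float64), type=2, norm="ortho")
    return float(np.sum(W_32 * f))

def decide_sub_cus(lcu_64, W_32, alpha_32):
    # Decide, in parallel, which 32x32 sub-CUs of a 64x64 block keep dividing.
    quads = [lcu_64[:32, :32], lcu_64[:32, 32:],
             lcu_64[32:, :32], lcu_64[32:, 32:]]
    with ThreadPoolExecutor(max_workers=4) as pool:
        scores = list(pool.map(lambda q: score_quadrant(q, W_32), quads))
    return [s >= alpha_32 for s in scores]  # True means that sub-CU continues dividing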
In the test experiments of this embodiment, CU partition depth results similar to those obtained by the conventional HEVC coding framework can be obtained, while the encoding time is greatly reduced without affecting video quality or bit rate.
The embodiment also provides a CU depth division system based on frequency domain distribution learning, including:

a first dividing module, configured to acquire a video image, divide the video image into a number of first CU blocks of 64x64 size, obtain the DCT transformed frequency domain coefficient distribution matrix F_64 of each first CU block, and acquire a probability score p_64 from the frequency domain coefficient distribution matrix F_64;

a second dividing module, configured to: if p_64 ≥ α_64, divide the first CU block downward into 4 second CU blocks of 32x32 size, obtain the DCT transformed frequency domain coefficient distribution matrix F_32 of each second CU block, and acquire a probability score p_32 from F_32; otherwise, end the division of the first CU block;

a third dividing module, configured to: if p_32 ≥ α_32, divide the second CU block downward into 4 third CU blocks of 16x16 size, obtain the DCT transformed frequency domain coefficient distribution matrix F_16 of each third CU block, and acquire a probability score p_16 from F_16; otherwise, end the division of the second CU block;

a fourth dividing module, configured to: if p_16 ≥ α_16, divide the third CU block downward into 4 fourth CU blocks of 8x8 size; otherwise, end the division of the third CU block;

wherein α_N is the division threshold, N = 64, 32, 16.
The CU depth division system based on frequency domain distribution learning can execute any combination of the implementation steps of the CU depth division method provided by the method embodiments of the invention, and has the corresponding functions and beneficial effects.
The embodiment also provides a CU depth division apparatus based on frequency domain distribution learning, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method illustrated in fig. 1.
The CU depth division apparatus based on frequency domain distribution learning can execute any combination of the implementation steps of the CU depth division method provided by the method embodiments of the invention, and has the corresponding functions and beneficial effects.
The present application also discloses a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the method shown in fig. 1.
The embodiment also provides a storage medium that stores instructions or programs for executing the CU depth division method based on frequency domain distribution learning provided by the method embodiments of the invention; when the instructions or programs are run, any combination of the implementation steps of the method embodiments can be executed, with the corresponding functions and beneficial effects.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., a ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
In the foregoing description of the present specification, reference has been made to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like, means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present invention, and these equivalent modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims (8)

1. A CU depth division method based on frequency domain distribution learning, characterized by comprising the following steps:

acquiring a video image, dividing the video image into a number of first CU blocks of 64x64 size, obtaining the DCT (discrete cosine transform) frequency domain coefficient distribution matrix F_64^k of each first CU block, and acquiring a probability score p_64^k from the frequency domain coefficient distribution matrix F_64^k, wherein the superscript k denotes the kth first CU block among the number of first CU blocks;

if p_64^k ≥ α_64, dividing the first CU block downward into 4 second CU blocks of 32x32 size, obtaining the DCT frequency domain coefficient distribution matrix F_32^m of each second CU block, and acquiring a probability score p_32^m from F_32^m; otherwise, ending the division of the first CU block; wherein the superscript m denotes the mth of the second CU blocks divided downward from the first CU block;

if p_32^m ≥ α_32, dividing the second CU block downward into 4 third CU blocks of 16x16 size, obtaining the DCT frequency domain coefficient distribution matrix F_16^n of each third CU block, and acquiring a probability score p_16^n from F_16^n; otherwise, ending the division of the second CU block; wherein the superscript n denotes the nth of the third CU blocks divided downward from the second CU block;

if p_16^n ≥ α_16, dividing the third CU block downward into 4 fourth CU blocks of 8x8 size; otherwise, ending the division of the third CU block;

wherein α_64, α_32 and α_16 are all division thresholds;

the probability score p_64^k is calculated by the following formula:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)

the probability score p_32^m is calculated by the following formula:

p_32^m = Σ_i Σ_j W_32(i,j) · F_32^m(i,j)

the probability score p_16^n is calculated by the following formula:

p_16^n = Σ_i Σ_j W_16(i,j) · F_16^n(i,j)

in the formulas, W_64, W_32 and W_16 are preset frequency domain distribution weight matrices, i represents the row coordinate of the matrix, and j represents the column coordinate of the matrix.
2. The CU depth division method based on frequency domain distribution learning according to claim 1, characterized in that the frequency domain distribution weight matrix W_64 is obtained by:

acquiring the training set and samples required by the network, {(F_64^k, L_k)}, where F_64^k represents the DCT frequency domain coefficient distribution matrix corresponding to the kth CU block of 64x64 size, and L_k indicates whether the kth 64x64 CU block continues to divide downward, a value of 0 indicating no and a value of 1 indicating yes;

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W_64;

the expression of the preset loss function (shown as an image formula in the original) involves n, the number of data items in the training set, and N, the size of the CU block.
3. The CU depth division method based on frequency domain distribution learning according to claim 2, characterized in that the division threshold α_64 is obtained by:

selecting the label samples of the training set with L_k = 0, {(F_64^k, 0)};

calculating the probability scores p_64^k of the selected samples;

acquiring the division threshold α_64 from the calculated probability scores p_64^k;

the division threshold α_64 takes a value determined from these scores (shown as an image formula in the original).
4. The CU depth division method based on frequency domain distribution learning according to claim 1, wherein the dividing the video image into a number of first CU blocks of 64x64 size comprises:

dividing the video image into first CU blocks of 64x64 size according to the luminance component.

5. The CU depth division method based on frequency domain distribution learning according to claim 1, wherein the dividing the video image into a number of first CU blocks of 64x64 size comprises:

after dividing the video image into a number of first CU blocks of 64x64 size, performing pixel interpolation on any remaining pixel areas smaller than 64x64.
6. A CU depth division system based on frequency domain distribution learning, characterized by comprising:

a first dividing module, configured to acquire a video image, divide the video image into a number of first CU blocks of 64x64 size, obtain the DCT frequency domain coefficient distribution matrix F_64^k of each first CU block, and acquire a probability score p_64^k from F_64^k, wherein the superscript k denotes the kth first CU block among the number of first CU blocks;

a second dividing module, configured to: if p_64^k ≥ α_64, divide the first CU block downward into 4 second CU blocks of 32x32 size, obtain the DCT frequency domain coefficient distribution matrix F_32^m of each second CU block, and acquire a probability score p_32^m from F_32^m; otherwise, end the division of the first CU block; wherein the superscript m denotes the mth of the second CU blocks divided downward from the first CU block;

a third dividing module, configured to: if p_32^m ≥ α_32, divide the second CU block downward into 4 third CU blocks of 16x16 size, obtain the DCT frequency domain coefficient distribution matrix F_16^n of each third CU block, and acquire a probability score p_16^n from F_16^n; otherwise, end the division of the second CU block; wherein the superscript n denotes the nth of the third CU blocks divided downward from the second CU block;

a fourth dividing module, configured to: if p_16^n ≥ α_16, divide the third CU block downward into 4 fourth CU blocks of 8x8 size; otherwise, end the division of the third CU block;

wherein α_64, α_32 and α_16 are all division thresholds;

the probability score p_64^k is calculated by the following formula:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)

the probability score p_32^m is calculated by the following formula:

p_32^m = Σ_i Σ_j W_32(i,j) · F_32^m(i,j)

the probability score p_16^n is calculated by the following formula:

p_16^n = Σ_i Σ_j W_16(i,j) · F_16^n(i,j)

in the formulas, W_64, W_32 and W_16 are preset frequency domain distribution weight matrices, i represents the row coordinate of the matrix, and j represents the column coordinate of the matrix.
7. A CU depth division apparatus based on frequency domain distribution learning, characterized by comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-5.
8. A computer readable storage medium in which a processor-executable program is stored, characterized in that the processor-executable program, when executed by a processor, is configured to perform the method according to any one of claims 1-5.
CN202210241583.1A 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning Active CN114827630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210241583.1A CN114827630B (en) 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210241583.1A CN114827630B (en) 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning

Publications (2)

Publication Number Publication Date
CN114827630A (en) 2022-07-29
CN114827630B (en) 2023-06-06

Family

ID=82529378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210241583.1A Active CN114827630B (en) 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning

Country Status (1)

Country Link
CN (1) CN114827630B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018199459A1 (en) * 2017-04-26 2018-11-01 강현인 Image restoration machine learning algorithm using compression parameter, and image restoration method using same

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9756327B2 (en) * 2012-04-03 2017-09-05 Qualcomm Incorporated Quantization matrix and deblocking filter adjustments for video coding
US9432696B2 (en) * 2014-03-17 2016-08-30 Qualcomm Incorporated Systems and methods for low complexity forward transforms using zeroed-out coefficients
CN108370441B (en) * 2015-11-12 2022-07-12 Lg 电子株式会社 Method and apparatus for coefficient-induced intra prediction in image coding system
US10977802B2 (en) * 2018-08-29 2021-04-13 Qualcomm Incorporated Motion assisted image segmentation
US11575896B2 (en) * 2019-12-16 2023-02-07 Panasonic Intellectual Property Corporation Of America Encoder, decoder, encoding method, and decoding method
CN112927202B (en) * 2021-02-25 2022-06-03 华南理工大学 Method and system for detecting Deepfake video with combination of multiple time domains and multiple characteristics
CN113411582A (en) * 2021-05-10 2021-09-17 华南理工大学 Video coding method, system, device and medium based on active contour

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018199459A1 (en) * 2017-04-26 2018-11-01 강현인 Image restoration machine learning algorithm using compression parameter, and image restoration method using same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Error concealment for video sequences based on texture detection; 周智恒 (Zhou Zhiheng), 谢胜利 (Xie Shengli); Computer Engineering and Applications, Issue 05; full text *

Also Published As

Publication number Publication date
CN114827630A (en) 2022-07-29

Similar Documents

Publication Publication Date Title
Li et al. A deep learning approach for multi-frame in-loop filter of HEVC
US20200120340A1 (en) Method and device for encoding or decoding image
CN110036637B (en) Method and device for denoising and vocalizing reconstructed image
CN112399176B (en) Video coding method and device, computer equipment and storage medium
EP1610563A2 (en) Selecting encoding types and predictive modes for encoding video data
CN112738511B (en) Fast mode decision method and device combined with video analysis
US11259029B2 (en) Method, device, apparatus for predicting video coding complexity and storage medium
CN111445424B (en) Image processing method, device, equipment and medium for processing mobile terminal video
CN104378636B (en) A kind of video encoding method and device
CN113727106B (en) Video encoding and decoding methods, devices, electronic equipment and storage medium
CN111988628B (en) VVC rapid intra-frame coding method based on reinforcement learning
US20220284632A1 (en) Analysis device and computer-readable recording medium storing analysis program
CN106791850A (en) Method for video coding and device
CN105898565A (en) Video processing method and device
CN110378860A (en) Method, apparatus, computer equipment and the storage medium of restored video
CN113747177B (en) Intra-frame coding speed optimization method, device and medium based on historical information
US9020283B2 (en) Electronic device and method for splitting image
CN114827630B (en) CU depth division method, system, device and medium based on frequency domain distribution learning
CN112669328B (en) Medical image segmentation method
US20140133768A1 (en) Electronic device and method for splitting image
CN105049853A (en) SAO coding method and system based on fragment source analysis
CN111669602B (en) Method and device for dividing coding unit, coder and storage medium
CN111372079B (en) VVC inter-frame CU deep rapid dividing method
CN115643403A (en) AV1 filtering method and device
CN108668166A (en) A kind of coding method, device and terminal device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant