WO2020248618A1 - Method for implementing loop filtering by a dual-core computing unit - Google Patents

Method for implementing loop filtering by a dual-core computing unit

Info

Publication number
WO2020248618A1
Authority
WO
WIPO (PCT)
Prior art keywords
register
dual
pixel block
computing unit
pixel
Prior art date
Application number
PCT/CN2020/075925
Other languages
English (en)
French (fr)
Inventor
刘行
唐印
林洪周
张磊
刘易华
Original Assignee
上海富瀚微电子股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海富瀚微电子股份有限公司
Publication of WO2020248618A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements

Definitions

  • the present invention relates to the technical field of coding, in particular to a method for implementing loop filtering by a dual-core computing unit.
  • the entire image is divided into small blocks for processing, and a Fourier transform is performed in each block unit; this causes data loss and ultimately makes block boundaries visible in the image, which is called the blocking effect.
  • loop filtering is used to reduce the influence of the blocking effect and improve the subjective quality of images.
  • the H.264 standard proposes deblocking (DeBlocking, DB for short) to perform loop filtering (LoopFilter, LF for short).
  • DB filtering needs to process the pixels around both the horizontal and the vertical boundaries, that is, pixels are DB-filtered in both the horizontal and vertical directions, which means the data needs to be transposed.
  • the purpose of the present invention is to provide a method for implementing loop filtering by a dual-core computing unit, which improves the speed of loop filtering operations and reduces the difficulty of maintaining data.
  • the present invention provides a method for implementing loop filtering by a dual-core computing unit, including:
  • S11: dividing video information into a plurality of macroblocks, wherein each of the macroblocks contains PxP pixels, each of the macroblocks is divided into a plurality of pixel blocks, the plurality of pixel blocks form an array of multiple rows and multiple columns, each said pixel block includes 4x4 pixels, and each said pixel block has 4 boundaries;
  • S12: reading one of the macroblocks in the memory, with two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows;
  • S13: storing the filtered signal in a register by transposition.
  • the method for implementing loop filtering by the dual-core computing unit further includes: S14: repeating steps S12 and S13 until all macroblocks have been filtered.
  • the value of P is 16 or 8.
  • the value of P is 16, each of the macroblocks is divided into 16 pixel blocks, and the 16 pixel blocks form an array of 4 rows and 4 columns.
  • each pixel block has two vertical boundaries and two horizontal boundaries.
  • the two processing units are respectively a first processing unit and a second processing unit, and each processing unit corresponds to one register: the first processing unit corresponds to a first register, and the second processing unit corresponds to a second register.
  • the method for two processing units to simultaneously filter boundary signals of pixel blocks in two adjacent rows includes: the first processing unit filters the boundary signal of the first row of pixel blocks, and at the same time, the second processing unit filters the boundary signal of the second row of pixel blocks.
  • the method for the first processing unit to filter the boundary signal of the first row of pixel blocks includes:
  • the first processing unit reads the left 4x4 pixel block and the right 4x4 pixel block of the first vertical boundary and stores them in the first register, and the left 4x4 pixel block and the right 4x4 pixel block form an 8x4 pixel block;
  • block filtering is performed on the 8x4 pixel block;
  • the filtered 8x4 pixel block is transposed into a 4x8 pixel block and stored in the memory.
  • the method for performing block filtering on the 8x4 pixel block includes:
  • the 8x4 pixel block is written into the register;
  • 8x1 pixels are read sequentially, block filtering is performed, and the result is transposed and stored in the memory after filtering.
  • the register is a 12x4 register, which is divided into three register areas, namely the first register area, the second register area and the third register area.
  • the first register area is used to store unfiltered data
  • the second register area is used to store data that has been filtered once
  • the third register area is used to store data that has been filtered twice.
  • after block filtering has been performed on the vertical boundaries of the pixel blocks in turn, block filtering is performed on the horizontal boundaries.
  • a processing unit is added, and the parallel processing improves the processing speed of the entire macro block and improves the throughput.
  • FIG. 1 is a flowchart of a method for implementing loop filtering by a dual-core computing unit according to an embodiment of the present invention
  • FIGS. 2 and 3 are schematic diagrams of divided pixel blocks according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of the structure of the register of the embodiment of the present invention.
  • In the figures: f0 - first register area; f1 - second register area; f2 - third register area.
  • the present invention provides a method for implementing loop filtering by a dual-core computing unit, which includes:
  • S11: dividing video information into a plurality of macroblocks, wherein each of the macroblocks contains PxP pixels, each of the macroblocks is divided into a plurality of pixel blocks, the plurality of pixel blocks form an array of multiple rows and multiple columns, each said pixel block includes 4x4 pixels, and each said pixel block has 4 boundaries;
  • S12: reading one of the macroblocks in the memory, with two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows;
  • S13: storing the filtered signal in a register by transposition.
  • the method for implementing loop filtering by the dual-core computing unit further includes: S14: repeating steps S12 and S13 until all macroblocks have been filtered.
  • a video contains a large amount of information.
  • the video information can be divided into multiple macroblocks. After one macroblock is processed, the next macroblock is processed, and so on until all the macroblocks are processed, at which point the video information has been processed.
  • the value of P is 16 or 8.
  • video information includes image and chrominance. If the image is processed, the value of P can be 16, and if the chrominance is processed, the value of P can be 8.
  • the value of P is 16, each macroblock is divided into 16 pixel blocks, and the 16 pixel blocks form an array of 4 rows and 4 columns.
  • the 4 rows are the first row, the second row, the third row and the fourth row, the 4 columns are the first column, the second column, the third column and the fourth column, and each pixel block has two vertical boundaries and two horizontal boundaries.
  • the 16 pixel blocks are named 0,0; 1,0; 2,0; 3,0; 0,1; 1,1; 2,1; 3,1; 0,2; 1,2; 2,2; 3,2; 0,3; 1,3; 2,3 and 3,3.
  • the two processing units are a first processing unit and a second processing unit, each of the processing units corresponds to one of the registers, the first processing unit corresponds to the first register, and the second processing unit corresponds to the second register.
  • the method for two processing units to simultaneously filter boundary signals of pixel blocks in two adjacent rows includes: the first processing unit filters the boundary signals of pixel blocks in the first row, and at the same time, the second processing unit filters the boundary signals of pixel blocks in the second row.
  • the method for the first processing unit to filter the boundary signal of the first row of pixel blocks includes:
  • the first processing unit reads the left 4x4 pixel block and the right 4x4 pixel block of the first vertical boundary and stores them in the first register, and the left 4x4 pixel block and the right 4x4 pixel block form an 8x4 pixel block;
  • block filtering is performed on the 8x4 pixel block;
  • the filtered 8x4 pixel block is transposed into a 4x8 pixel block and stored in the memory.
  • the method for performing block filtering on the 8x4 pixel block includes:
  • the 8x4 pixel block is written into the register;
  • 8x1 pixels are read sequentially, block filtering is performed, and the result is transposed and stored in the storage area after filtering.
  • the register is a 12x4 register, divided into 3 register areas, namely the first register area, the second register area and the third register area.
  • the first register area is used to store unfiltered data.
  • the second register area is used to store data filtered once, and the third register area is used to store data filtered twice.
  • a macroblock includes 16 pixel blocks, and each pixel block has two vertical boundaries and two horizontal boundaries. Because the pixel blocks are packed together, the vertical boundaries of adjacent pixel blocks coincide and therefore need to be calculated only once.
  • the entire macroblock has multiple vertical boundaries. For convenience, the vertical boundaries are named column by column as H0, H1, H2, H3, H4, H5, H6, H7, H8, H9, H10, H11, H12, H13, H14 and H15.
  • the horizontal boundaries are named row by row as V0, V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11, V12, V13, V14 and V15.
  • the two processing units can process H0 and H1 at the same time, then process H4 and H5 at the same time, and so on. After all the vertical boundaries are processed, the horizontal boundaries are processed.
  • the processing method and speed of the first processing unit and the second processing unit are the same. Taking the first processing unit processing H0 as an example, the 4x4 pixel values on each side of H0 are used as the reference for the block filtering of H0. Therefore, the resulting 8x4 pixel block must be processed, and it is first stored in the register; refer to FIG. 4.
  • because the 4x4 pixel block on the left of H0 was processed with the previous macroblock, it has already been filtered once and is stored in the second register area f1 of the register.
  • the 4x4 pixel block on the right has not been filtered and is stored in the first register area f0.
  • when boundary H1 is processed, the 4x4 pixel block on the left of H1 is processed once more, so it has been processed twice, and the twice-processed pixel block is stored in the third register area f2.
  • finally, the twice-processed pixel block in the register is transposed into a 4x8 pixel block and stored in the memory.
  • after the vertical boundaries are processed, the horizontal boundaries are processed; because the data has already been transposed, the horizontal boundaries have become vertical boundaries, and the same method can be used to continue processing. No further transposition is needed, which reduces data processing time and increases throughput.
  • a processing unit is added, and parallel processing increases the processing speed of the entire macroblock and increases the throughput.
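The pairing of vertical boundaries described above (H0 with H1, then H4 with H5, and so on) can be sketched as a small schedule generator. This is only an illustration: the function name `vertical_boundary_schedule` is hypothetical, the index mapping (boundary Hk lies in column k // 4 and row k % 4) is an assumed reading of "named column by column", and the order after the first two pairs is not stated in the text.

```python
def vertical_boundary_schedule(rows=4, cols=4):
    """List the boundary pairs that the two processing units handle together.

    Assumption: boundary H(k) sits in column k // rows and row k % rows.
    Only the first two pairs, (H0, H1) and (H4, H5), are given in the text;
    the continuation shown here is one plausible ordering.
    """
    schedule = []
    for first_row in range(0, rows, 2):      # two adjacent rows per pass
        for col in range(cols):              # walk the columns left to right
            unit1 = f"H{col * rows + first_row}"       # first processing unit
            unit2 = f"H{col * rows + first_row + 1}"   # second processing unit
            schedule.append((unit1, unit2))
    return schedule


schedule = vertical_boundary_schedule()
for step, (unit1, unit2) in enumerate(schedule):
    print(step, unit1, unit2)
```

With two units, the 16 vertical boundaries are covered in 8 steps instead of 16, which is the throughput gain the text attributes to the added processing unit.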

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

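The present invention provides a method for implementing loop filtering by a dual-core computing unit, comprising: S11: dividing video information into a plurality of macroblocks, wherein each macroblock contains PxP pixels, each macroblock is divided into a plurality of pixel blocks, the pixel blocks form an array of multiple rows and multiple columns, each pixel block contains 4x4 pixels, and each pixel block has 4 boundaries; S12: reading one macroblock from a memory, with two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows; S13: storing the filtered signal in a register by transposition. In the method provided by the present invention, a processing unit is added, and the parallel processing increases the processing speed of the whole macroblock and improves throughput.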

Description

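Method for implementing loop filtering by a dual-core computing unit
Technical Field
The present invention relates to the technical field of coding, and in particular to a method for implementing loop filtering by a dual-core computing unit.
Background
With the continuing development of the audio and video industry, international requirements for audio and video codec technology have grown ever higher, and international video coding standards have emerged, dominated by the MPEG-x and H.26x families. These standards compress video signals carrying very large amounts of information efficiently while preserving subjective video quality, greatly reducing storage space and network bandwidth requirements.
In the H.26x video codec standards, the whole image is divided into small blocks for processing, and a Fourier transform is performed within each block unit. This causes data loss and ultimately makes block boundaries visible in the image, which is known as the blocking effect. The H.26x standards use loop filtering to reduce the impact of the blocking effect and improve subjective image quality.
The H.264 standard introduces deblocking (DeBlocking, hereinafter DB) to perform loop filtering (LoopFilter, hereinafter LF). H.264 divides the image into 16x16 macroblocks (Macro Block), and the 4x4 sub-blocks inside each macroblock form 4 horizontal and 4 vertical boundaries. DB filtering must process the pixels around both the horizontal and the vertical boundaries; that is, pixels are DB-filtered in both the horizontal and vertical directions, which means the data must be transposed. Handling the transposition takes time and lowers the throughput of the loop-filter path.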
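Summary of the Invention
The purpose of the present invention is to provide a method for implementing loop filtering by a dual-core computing unit that increases the speed of the loop-filtering operation and reduces the difficulty of maintaining data.
To achieve the above purpose, the present invention provides a method for implementing loop filtering by a dual-core computing unit, comprising:
S11: dividing video information into a plurality of macroblocks, wherein each macroblock contains PxP pixels, each macroblock is divided into a plurality of pixel blocks, the pixel blocks form an array of multiple rows and multiple columns, each pixel block contains 4x4 pixels, and each pixel block has 4 boundaries;
S12: reading one macroblock from a memory, with two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows;
S13: storing the filtered signal in a register by transposition.
Optionally, the method further comprises: S14: repeating steps S12 and S13 until all macroblocks have been filtered.
Optionally, in the method, the value of P is 16 or 8.
Optionally, in the method, the value of P is 16, each macroblock is divided into 16 pixel blocks, the 16 pixel blocks form an array of 4 rows and 4 columns, the 4 rows are the first row, the second row, the third row and the fourth row, the 4 columns are the first column, the second column, the third column and the fourth column, and each pixel block has two vertical boundaries and two horizontal boundaries.
Optionally, in the method, the two processing units are a first processing unit and a second processing unit, each processing unit corresponds to one register, the first processing unit corresponds to a first register, and the second processing unit corresponds to a second register.
Optionally, in the method, the two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows comprises: the first processing unit filtering the boundary signals of the first row of pixel blocks while the second processing unit filters the boundary signals of the second row of pixel blocks.
Optionally, in the method, the first processing unit filtering the boundary signals of the first row of pixel blocks comprises:
the first processing unit reading the 4x4 pixel block to the left and the 4x4 pixel block to the right of the first vertical boundary and storing them in the first register, the left 4x4 pixel block and the right 4x4 pixel block forming an 8x4 pixel block;
performing block filtering on the 8x4 pixel block;
transposing the filtered 8x4 pixel block into a 4x8 pixel block and storing it in the memory.
Optionally, in the method, performing block filtering on the 8x4 pixel block comprises:
writing the 8x4 pixel block into the register;
reading 8x1 pixels sequentially, performing block filtering, and transposing and storing the result in the memory after filtering.
Optionally, in the method, the register is a 12x4 register divided into 3 register areas, namely a first register area, a second register area and a third register area; the first register area stores unfiltered data, the second register area stores data that has been filtered once, and the third register area stores data that has been filtered twice.
Optionally, in the method, block filtering is performed on the vertical boundaries of the pixel blocks in turn, and then on the horizontal boundaries.
In the method provided by the present invention, a processing unit is added, and the parallel processing increases the processing speed of the whole macroblock and improves throughput.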
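Brief Description of the Drawings
FIG. 1 is a flowchart of the method for implementing loop filtering by a dual-core computing unit according to an embodiment of the present invention;
FIG. 2 and FIG. 3 are schematic diagrams of the divided pixel blocks according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the structure of the register according to an embodiment of the present invention;
In the figures: f0 - first register area; f1 - second register area; f2 - third register area.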
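Detailed Description
Specific embodiments of the present invention are described in more detail below with reference to the schematic drawings. The advantages and features of the present invention will become clearer from the following description and the claims. Note that the drawings are in a highly simplified form and not to precise scale, and serve only to illustrate the embodiments of the present invention conveniently and clearly.
In the following, the terms "first", "second" and the like are used to distinguish between similar elements and do not necessarily describe a particular order or chronological sequence. It should be understood that these terms are interchangeable where appropriate. Similarly, where a method described herein includes a series of steps, the order of the steps presented herein is not necessarily the only order in which they can be performed, some of the described steps may be omitted, and other steps not described herein may be added to the method.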
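The present invention provides a method for implementing loop filtering by a dual-core computing unit, comprising:
S11: dividing video information into a plurality of macroblocks, wherein each macroblock contains PxP pixels, each macroblock is divided into a plurality of pixel blocks, the pixel blocks form an array of multiple rows and multiple columns, each pixel block contains 4x4 pixels, and each pixel block has 4 boundaries;
S12: reading one macroblock from a memory, with two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows;
S13: storing the filtered signal in a register by transposition.
Further, the method comprises: S14: repeating steps S12 and S13 until all macroblocks have been filtered. A video contains a large amount of information, so the video information can be divided into multiple macroblocks. After one macroblock has been processed, the next macroblock is processed, and so on until all macroblocks have been processed, at which point the video information has been processed.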
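The outer loop of S11 and S14 (cut the picture into PxP macroblocks, then handle them one after another) might look like the following minimal sketch. The name `iter_macroblocks`, the raster visiting order, and the requirement that the frame dimensions be multiples of P are assumptions made for illustration; the text does not specify them.

```python
def iter_macroblocks(frame, p=16):
    """Yield ((mb_x, mb_y), macroblock) for every PxP tile of a frame.

    `frame` is a list of pixel rows. Raster order (left to right, top to
    bottom) is an assumption; the source only says macroblocks are
    processed one after another.
    """
    height, width = len(frame), len(frame[0])
    if height % p or width % p:
        raise ValueError("frame dimensions must be multiples of P in this sketch")
    for top in range(0, height, p):
        for left in range(0, width, p):
            tile = [row[left:left + p] for row in frame[top:top + p]]
            yield (left // p, top // p), tile


# A 48x32 frame (width x height) holds 3 x 2 macroblocks of 16x16 pixels.
frame = [[0] * 48 for _ in range(32)]
tiles = list(iter_macroblocks(frame))
print(len(tiles))
```

In a full pipeline, steps S12 and S13 would run inside this loop once per yielded macroblock.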
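Preferably, the value of P is 16 or 8. Video information includes image and chrominance; when the image is processed, P may be 16, and when chrominance is processed, P may be 8.
In this embodiment, the value of P is 16, each macroblock is divided into 16 pixel blocks, and the 16 pixel blocks form an array of 4 rows and 4 columns. The 4 rows are the first row, the second row, the third row and the fourth row, the 4 columns are the first column, the second column, the third column and the fourth column, and each pixel block has two vertical boundaries and two horizontal boundaries. As shown in FIG. 3, the 16 pixel blocks are named 0,0; 1,0; 2,0; 3,0; 0,1; 1,1; 2,1; 3,1; 0,2; 1,2; 2,2; 3,2; 0,3; 1,3; 2,3 and 3,3.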
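The division of a macroblock into named 4x4 pixel blocks can be illustrated as follows. This is a sketch only: `split_macroblock` is a hypothetical helper, and reading the first index of each name as the column and the second as the row is an assumption, since FIG. 3 is not reproduced here.

```python
def split_macroblock(mb, block=4):
    """Split a PxP macroblock (a list of P rows) into block x block pixel blocks.

    Returns a dict keyed by (first_index, second_index). Treating the first
    index as the column is an assumption about the naming in FIG. 3.
    """
    p = len(mb)
    if p % block or any(len(row) != p for row in mb):
        raise ValueError("macroblock must be square with a side divisible by the block size")
    blocks = {}
    for by in range(p // block):
        for bx in range(p // block):
            rows = mb[by * block:(by + 1) * block]
            blocks[(bx, by)] = [row[bx * block:(bx + 1) * block] for row in rows]
    return blocks


# A 16x16 macroblock whose pixel value encodes its position (y * 16 + x).
mb = [[y * 16 + x for x in range(16)] for y in range(16)]
blocks = split_macroblock(mb)
print(len(blocks), blocks[(1, 0)][0])
```

The same helper yields 4 pixel blocks for P = 8, the value the text gives for chrominance.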
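Preferably, the two processing units are a first processing unit and a second processing unit, each processing unit corresponds to one register, the first processing unit corresponds to a first register, and the second processing unit corresponds to a second register.
In this embodiment, the two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows comprises: the first processing unit filtering the boundary signals of the first row of pixel blocks while the second processing unit filters the boundary signals of the second row of pixel blocks.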
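As a software analogy for two processing units working on adjacent rows of pixel blocks, the sketch below hands each block row to its own worker thread. Several things here are assumptions: `soften` is a hypothetical placeholder filter (not the H.264 deblocking filter), only the interior boundaries of the macroblock are touched, and Python threads illustrate the division of work rather than the speedup of dedicated hardware units.

```python
from concurrent.futures import ThreadPoolExecutor


def soften(line, edge):
    """Placeholder edge filter: pull the two pixels touching `edge` toward their mean."""
    out = list(line)
    p, q = line[edge - 1], line[edge]
    mean = (p + q + 1) // 2
    out[edge - 1] = (p + mean + 1) // 2
    out[edge] = (q + mean + 1) // 2
    return out


def filter_block_row(rows, edges=(4, 8, 12)):
    """One processing unit: filter the vertical boundaries inside one row of pixel blocks."""
    result = []
    for line in rows:
        for edge in edges:
            line = soften(line, edge)
        result.append(line)
    return result


def filter_two_rows(macroblock, first_row=0):
    """Run two units at once on block rows `first_row` and `first_row + 1`."""
    top = macroblock[first_row * 4:(first_row + 1) * 4]
    bottom = macroblock[(first_row + 1) * 4:(first_row + 2) * 4]
    with ThreadPoolExecutor(max_workers=2) as pool:
        unit1 = pool.submit(filter_block_row, top)      # first processing unit
        unit2 = pool.submit(filter_block_row, bottom)   # second processing unit
        return unit1.result(), unit2.result()


# A step from 0 to 8 at the first interior vertical boundary.
macroblock = [[0] * 4 + [8] * 12 for _ in range(16)]
top, bottom = filter_two_rows(macroblock)
print(top[0][:6])
```

The two workers never touch the same pixels, so no locking is needed; that independence between block rows is what makes the two-unit arrangement possible.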
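Further, the first processing unit filtering the boundary signals of the first row of pixel blocks comprises:
the first processing unit reading the 4x4 pixel block to the left and the 4x4 pixel block to the right of the first vertical boundary and storing them in the first register, the left 4x4 pixel block and the right 4x4 pixel block forming an 8x4 pixel block;
performing block filtering on the 8x4 pixel block;
transposing the filtered 8x4 pixel block into a 4x8 pixel block and storing it in the memory.
Further, performing block filtering on the 8x4 pixel block comprises:
writing the 8x4 pixel block into the register;
reading 8x1 pixels sequentially, performing block filtering, and transposing and storing the result in the storage area after filtering.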
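The steps above (join the left and right 4x4 blocks into an 8x4 block, filter it one 8x1 line at a time, then transpose to 4x8) can be sketched as below. The filter `smooth_edge` is a hypothetical stand-in, because the text does not give the filter arithmetic, and "8x4" is read here as 8 pixels wide by 4 lines tall.

```python
def smooth_edge(line):
    """Placeholder filter for one 8x1 line with the boundary between index 3 and 4.

    Hypothetical arithmetic for illustration; this is not the H.264 filter.
    """
    out = list(line)
    p0, q0 = line[3], line[4]
    mean = (p0 + q0 + 1) // 2
    out[3] = (p0 + mean + 1) // 2
    out[4] = (q0 + mean + 1) // 2
    return out


def filter_and_transpose(left, right, edge_filter=smooth_edge):
    """Filter the boundary between two 4x4 blocks and return the 4x8 transpose."""
    # Join the two blocks line by line: 4 lines of 8 pixels.
    joined = [l_row + r_row for l_row, r_row in zip(left, right)]
    # Read the 8x1 lines in sequence and filter each across the boundary.
    filtered = [edge_filter(line) for line in joined]
    # Transpose: 4 lines of 8 pixels become 8 lines of 4 pixels.
    return [list(col) for col in zip(*filtered)]


left = [[10] * 4 for _ in range(4)]
right = [[20] * 4 for _ in range(4)]
result = filter_and_transpose(left, right)
for line in result:
    print(line)
```

Passing the filter as a parameter keeps the data movement (join, line-by-line read, transpose) separate from whatever filter arithmetic a real implementation uses.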
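In this embodiment, the register is a 12x4 register divided into 3 register areas, namely a first register area, a second register area and a third register area. The first register area stores data that has not been filtered, the second register area stores data that has been filtered once, and the third register area stores data that has been filtered twice.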
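One way to picture the 12x4 register is as three 4x4 areas that data moves through as it accumulates filter passes. The sketch below models only that data movement. The class `FilterRegister`, the `advance` method, the 4x4 size of each area, and the f2 | f1 | f0 layout are all assumptions, since FIG. 4 is not reproduced here.

```python
from dataclasses import dataclass, field


def _blank_area():
    """A 4x4 area filled with zeros."""
    return [[0] * 4 for _ in range(4)]


@dataclass
class FilterRegister:
    f0: list = field(default_factory=_blank_area)  # first area: unfiltered data
    f1: list = field(default_factory=_blank_area)  # second area: filtered once
    f2: list = field(default_factory=_blank_area)  # third area: filtered twice

    def advance(self, next_unfiltered):
        """Shift the areas after one boundary between f1 and f0 has been filtered.

        The block in f1 has now received its second pass and moves to f2,
        the block in f0 has received its first pass and moves to f1, and a
        new unfiltered block is loaded into f0.
        """
        self.f2 = self.f1
        self.f1 = self.f0
        self.f0 = next_unfiltered

    def as_rows(self):
        """Return the contents as 4 rows of 12 values, ordered f2 | f1 | f0."""
        return [a + b + c for a, b, c in zip(self.f2, self.f1, self.f0)]


reg = FilterRegister(f0=[[1] * 4 for _ in range(4)], f1=[[2] * 4 for _ in range(4)])
reg.advance([[3] * 4 for _ in range(4)])
print(reg.as_rows()[0])
```

Keeping the pass count implicit in the area a block occupies means no per-block bookkeeping is needed, which may be part of what the text means by reducing the difficulty of maintaining data.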
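Preferably, block filtering is performed on the vertical boundaries of the pixel blocks in turn, and then on the horizontal boundaries. As shown in FIG. 2 and FIG. 3, a macroblock includes 16 pixel blocks, and each pixel block has two vertical boundaries and two horizontal boundaries. Because the pixel blocks are packed together, the vertical boundaries of adjacent pixel blocks coincide and therefore need to be computed only once. The whole macroblock has multiple vertical boundaries; for convenience, they are named column by column as H0, H1, H2, H3, H4, H5, H6, H7, H8, H9, H10, H11, H12, H13, H14 and H15. The horizontal boundaries are named row by row as V0, V1, V2, V3, V4, V5, V6, V7, V8, V9, V10, V11, V12, V13, V14 and V15. In this embodiment, the two processing units can process H0 and H1 simultaneously, then H4 and H5 simultaneously, and so on; after all vertical boundaries have been processed, the horizontal boundaries are processed. The first processing unit and the second processing unit use the same processing method and run at the same speed. Taking the first processing unit processing H0 as an example, the 4x4 pixel values on each side of boundary H0 serve as the reference for the block filtering of H0, so the resulting 8x4 pixel block must be processed. It is first stored in the register. Referring to FIG. 4, because the 4x4 pixel block to the left of H0 was processed with the previous macroblock, it has already been filtered once and is stored in the second register area f1 of the register; the 4x4 pixel block on the right has not been filtered and is stored in the first register area f0. When boundary H1 is processed, the 4x4 pixel block to the left of H1 is processed once more, so it has been processed twice, and the twice-processed pixel block is stored in the third register area f2. Finally, the twice-processed pixel block in the register is transposed into a 4x8 pixel block and stored in the memory. After the vertical boundaries have been processed, processing of the horizontal boundaries begins; because the data has already been transposed, the horizontal boundaries have become vertical boundaries, and the same method can be used to continue processing. No further transposition is needed, which reduces data processing time and increases throughput.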
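The claim that transposed data lets the vertical-boundary routine handle horizontal boundaries can be checked with a small experiment. The sketch compares a direct column-wise filter with "transpose, filter vertical boundaries, transpose back". The filter `soften` is a hypothetical placeholder, and the final transpose back exists only to make the comparison; the described method instead keeps working on the transposed layout.

```python
def transpose(m):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*m)]


def soften(line, edge):
    """Placeholder edge filter: pull the two pixels touching `edge` toward their mean."""
    out = list(line)
    p, q = line[edge - 1], line[edge]
    mean = (p + q + 1) // 2
    out[edge - 1] = (p + mean + 1) // 2
    out[edge] = (q + mean + 1) // 2
    return out


def filter_vertical(img, edges):
    """Filter across vertical boundaries by working along each row."""
    out = []
    for row in img:
        for edge in edges:
            row = soften(row, edge)
        out.append(row)
    return out


def filter_horizontal(img, edges):
    """Reference version: filter across horizontal boundaries column by column."""
    out = [list(row) for row in img]
    for x in range(len(img[0])):
        col = [img[y][x] for y in range(len(img))]
        for edge in edges:
            col = soften(col, edge)
        for y, value in enumerate(col):
            out[y][x] = value
    return out


img = [[(7 * y + 3 * x) % 50 for x in range(8)] for y in range(8)]
via_transpose = transpose(filter_vertical(transpose(img), edges=(4,)))
direct = filter_horizontal(img, edges=(4,))
print(via_transpose == direct)
```

Because the filtered data is already written out transposed in S13, the second pass can reuse the row-oriented routine without a separate transposition step.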
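In summary, in the method for implementing loop filtering by a dual-core computing unit provided by the embodiments of the present invention, a processing unit is added, and the parallel processing increases the processing speed of the whole macroblock and improves throughput.
The above are merely preferred embodiments of the present invention and do not limit the present invention in any way. Any equivalent replacement or modification of the technical solutions and technical content disclosed by the present invention, made by a person skilled in the art without departing from the scope of the technical solutions of the present invention, does not depart from the content of the technical solutions of the present invention and remains within the protection scope of the present invention.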

Claims (10)

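  1. A method for implementing loop filtering by a dual-core computing unit, characterized by comprising:
    S11: dividing video information into a plurality of macroblocks, wherein each macroblock contains PxP pixels, each macroblock is divided into a plurality of pixel blocks, the pixel blocks form an array of multiple rows and multiple columns, each pixel block contains 4x4 pixels, and each pixel block has 4 boundaries;
    S12: reading one macroblock from a memory, with two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows;
    S13: storing the filtered signal in a register by transposition.
  2. The method for implementing loop filtering by a dual-core computing unit according to claim 1, characterized in that the method further comprises: S14: repeating steps S12 and S13 until all macroblocks have been filtered.
  3. The method for implementing loop filtering by a dual-core computing unit according to claim 1, characterized in that the value of P is 16 or 8.
  4. The method for implementing loop filtering by a dual-core computing unit according to claim 3, characterized in that the value of P is 16, each macroblock is divided into 16 pixel blocks, the 16 pixel blocks form an array of 4 rows and 4 columns, the 4 rows are the first row, the second row, the third row and the fourth row, the 4 columns are the first column, the second column, the third column and the fourth column, and each pixel block has two vertical boundaries and two horizontal boundaries.
  5. The method for implementing loop filtering by a dual-core computing unit according to claim 4, characterized in that the two processing units are a first processing unit and a second processing unit, each processing unit corresponds to one register, the first processing unit corresponds to a first register, and the second processing unit corresponds to a second register.
  6. The method for implementing loop filtering by a dual-core computing unit according to claim 1, characterized in that the two processing units simultaneously filtering the boundary signals of the pixel blocks in two adjacent rows comprises: the first processing unit filtering the boundary signals of the first row of pixel blocks while the second processing unit filters the boundary signals of the second row of pixel blocks.
  7. The method for implementing loop filtering by a dual-core computing unit according to claim 6, characterized in that the first processing unit filtering the boundary signals of the first row of pixel blocks comprises:
    the first processing unit reading the 4x4 pixel block to the left and the 4x4 pixel block to the right of the first vertical boundary and storing them in the first register, the left 4x4 pixel block and the right 4x4 pixel block forming an 8x4 pixel block;
    performing block filtering on the 8x4 pixel block;
    transposing the filtered 8x4 pixel block into a 4x8 pixel block and storing it in the memory.
  8. The method for implementing loop filtering by a dual-core computing unit according to claim 7, characterized in that performing block filtering on the 8x4 pixel block comprises:
    writing the 8x4 pixel block into the register;
    reading 8x1 pixels sequentially, performing block filtering, and transposing and storing the result in the memory after filtering.
  9. The method for implementing loop filtering by a dual-core computing unit according to claim 8, characterized in that the register is a 12x4 register divided into 3 register areas, namely a first register area, a second register area and a third register area; the first register area stores unfiltered data, the second register area stores data that has been filtered once, and the third register area stores data that has been filtered twice.
  10. The method for implementing loop filtering by a dual-core computing unit according to claim 1, characterized in that block filtering is performed on the vertical boundaries of the pixel blocks in turn, and then on the horizontal boundaries.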
PCT/CN2020/075925 2019-06-11 2020-02-20 Method for implementing loop filtering by a dual-core computing unit WO2020248618A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910503172.3A CN110213579A (zh) 2019-06-11 2019-06-11 Method for implementing loop filtering by a dual-core computing unit
CN201910503172.3 2019-06-11

Publications (1)

Publication Number Publication Date
WO2020248618A1 (zh) 2020-12-17

Family

ID=67792054

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/075925 WO2020248618A1 (zh) 2019-06-11 2020-02-20 Method for implementing loop filtering by a dual-core computing unit

Country Status (2)

Country Link
CN (1) CN110213579A (zh)
WO (1) WO2020248618A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110213579A (zh) * 2019-06-11 2019-09-06 上海富瀚微电子股份有限公司 Method for implementing loop filtering by a dual-core computing unit

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080123750A1 (en) * 2006-11-29 2008-05-29 Michael Bronstein Parallel deblocking filter for H.264 video codec
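CN102801973A (zh) * 2012-07-09 2012-11-28 珠海全志科技股份有限公司 Video image deblocking filtering method and device
CN106454359A (zh) * 2010-12-07 2017-02-22 索尼公司 Image processing device and image processing method
CN110213579A (zh) * 2019-06-11 2019-09-06 上海富瀚微电子股份有限公司 Method for implementing loop filtering by a dual-core computing unit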

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8576924B2 (en) * 2005-01-25 2013-11-05 Advanced Micro Devices, Inc. Piecewise processing of overlap smoothing and in-loop deblocking
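CN101170690A (zh) * 2007-11-21 2008-04-30 上海广电(集团)有限公司中央研究院 Hardware device and hardware implementation method for an AVS-based loop filter
CN101459839A (zh) * 2007-12-10 2009-06-17 三星电子株式会社 Deblocking filtering method and device for implementing the method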
PH12015501384A1 (en) * 2010-12-07 2015-09-28 Sony Corp Image processing device and image processing method
CN104754363B (zh) * 2013-12-31 2017-08-08 展讯通信(上海)有限公司 Loop filtering method and device for HEVC, encoder and decoder
CN107135398B (zh) * 2017-06-05 2019-07-19 珠海市杰理科技股份有限公司 Deblocking filtering method, device and ***

Also Published As

Publication number Publication date
CN110213579A (zh) 2019-09-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20821850

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20821850

Country of ref document: EP

Kind code of ref document: A1