CN116527908B - Motion field estimation method, motion field estimation device, computer device and storage medium - Google Patents


Info

Publication number
CN116527908B
CN116527908B (application CN202310390201.6A)
Authority
CN
China
Prior art keywords
motion vector
pixel block
reference frame
mapping
current
Prior art date
Legal status
Active
Application number
CN202310390201.6A
Other languages
Chinese (zh)
Other versions
CN116527908A (en)
Inventor
梅奧
H.布伊
Current Assignee
Granfei Intelligent Technology Co.,Ltd.
Original Assignee
Glenfly Tech Co Ltd
Priority date
Filing date
Publication date
Application filed by Glenfly Tech Co Ltd filed Critical Glenfly Tech Co Ltd
Priority to CN202310390201.6A priority Critical patent/CN116527908B/en
Publication of CN116527908A publication Critical patent/CN116527908A/en
Application granted granted Critical
Publication of CN116527908B publication Critical patent/CN116527908B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application relates to a motion field estimation method and apparatus, a computer device, and a storage medium. The method comprises: storing the second reference motion vector set in a first memory; acquiring the position information of a pixel unit to be decoded in the current frame; and reading from the first memory the second reference motion vector set corresponding to the pixel unit mapped by that position information, as the time domain reference motion vector set of the pixel unit to be decoded. With this method, when a pixel unit to be decoded in the current frame is decoded, the second reference motion vector set corresponding to the pixel unit mapped by its position information is simply read from the first memory according to the position information of the pixel unit to be decoded and used as its time domain reference motion vector set. The second reference motion vector set mapped to the pixel unit to be decoded need not be repeatedly calculated, which reduces the amount of calculation, and decoding can proceed without waiting for the motion mapping of the pixel unit to be decoded to complete.

Description

Motion field estimation method, motion field estimation device, computer device and storage medium
Technical Field
The present application relates to the field of video encoding and decoding technologies, and in particular, to a motion field estimation method, apparatus, computer device, and storage medium.
Background
Motion field estimation is a new technology of the new-generation video codec standard AV1, formulated by the AOM (Alliance for Open Media), and offers a good compression effect on sequences with high-speed motion scenes: through linear mapping, a compressed data block can obtain a more accurate time domain reference motion vector. Typically, several already encoded time domain reference frames are selected in sequence for motion field estimation; AV1 limits the number of selected time domain reference frames to at most 3. The result of a later mapping may overlap the result of an earlier mapping, and the final result can only be determined after all time domain reference frames have been mapped.
The conventional motion field estimation process is as follows: divide the time domain reference frame into a number of 8x8 pixel blocks, each of which has an original motion vector, and linearly map the original motion vector of each 8x8 pixel block in the time domain reference frame in raster scan order; if an original motion vector is mapped onto a certain 8x8 pixel block, that 8x8 pixel block is said to be hit.
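As a rough illustration of the linear mapping just described, the sketch below projects an 8x8 block's original motion vector by a temporal-distance ratio to find which 8x8 block position it hits. The scaling formula, units, and function name are simplifying assumptions for illustration, not the exact AV1 specification formulas.

```python
# Hypothetical sketch of linearly mapping an 8x8 block's motion vector.
# The temporal-distance scaling below is an illustrative assumption.

def project_block(block_pos, mv, d_mv, d_cur):
    """block_pos: (row, col) of the 8x8 block, in block units.
    mv:    (dy, dx) original motion vector, in whole pixels.
    d_mv:  temporal distance spanned by the original motion vector.
    d_cur: temporal distance between the current frame and the reference frame.
    Returns the hit block position and the scaled motion vector."""
    scaled = (mv[0] * d_cur // d_mv, mv[1] * d_cur // d_mv)
    hit = (block_pos[0] + scaled[0] // 8, block_pos[1] + scaled[1] // 8)
    return hit, scaled
```

For example, a block at (2, 3) with motion vector (16, -16) over distance 4, projected over distance 2, is scaled to (8, -8) and hits block position (3, 2).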
However, because of the particular way the hardware is implemented, decoding is performed in units of 64x64 or 128x128 pixel blocks. The motion field estimation process therefore requires more hardware storage resources and involves a large number of repeated calculations; moreover, a pixel block to be decoded can only be decoded on the basis of its mapped motion vector after its motion mapping has been completed, which results in low decoding efficiency.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a motion field estimation method, apparatus, computer device, and computer-readable storage medium capable of saving storage resources, reducing repetitive computation processes, and improving decoding efficiency.
In a first aspect, the present application provides a motion field estimation method. The method comprises the following steps:
determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frame leads or lags the current frame in the time domain;
Determining a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located;
And acquiring the position information of the pixel unit to be decoded in the current frame, and reading a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
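The decode-time behavior of the last step above can be sketched as a plain position-keyed lookup; the dictionary layout, key format, and function names are assumptions for illustration only.

```python
# Hypothetical sketch: the first memory holds, per pixel-unit position, the
# second reference motion vector set produced by the mapping stage. Decoding a
# unit is then a direct read, with no re-mapping and no waiting on other units.

first_memory = {}

def store_second_mv_set(unit_pos, mv_set):
    """Mapping stage: store a pixel unit's second reference MV set."""
    first_memory[unit_pos] = mv_set

def temporal_ref_mv_set(unit_pos):
    """Decode stage: read the set for the pixel unit mapped by this position."""
    return first_memory[unit_pos]
```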
In one embodiment, the reference frame corresponds to flag information for characterizing an arrangement order of the reference frame in the video sequence, and the pixel unit corresponds to flag information for characterizing an arrangement order of the pixel unit in the reference frame; determining a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on the first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed, including:
determining a first mapping sequence of each reference frame based on the mark information of each reference frame, and determining a second mapping sequence of each pixel unit in the same reference frame based on the mark information of each pixel unit in the same reference frame;
According to the first mapping sequence and the second mapping sequence, performing motion vector mapping on the pixel units with the same mapping sequence in each reference frame to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame;
And superposing the first reference motion vector sets corresponding to the pixel units with the same mapping sequence in each reference frame to obtain a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame.
In one embodiment, performing motion vector mapping on the pixel units with the same mapping sequence in each reference frame according to the first mapping sequence and the second mapping sequence, to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame, includes:
Taking the first pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, and respectively mapping the motion vectors of the current pixel units in each reference frame according to the first mapping sequence to obtain a first reference motion vector set corresponding to the current pixel unit in each reference frame; the first reference motion vector set corresponding to the current pixel unit is stored in the first memory, and the original position information of each pixel block in the current pixel unit is stored in the second memory;
Taking the next pixel unit of the current pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, returning to execute the step of mapping the motion vector of the current pixel unit in each reference frame according to the first mapping sequence, and continuing to execute until the current pixel unit of each reference frame is the last pixel unit of each reference frame, so as to obtain a first reference motion vector set corresponding to the pixel unit in the same mapping sequence in each reference frame.
In one embodiment, the pixel blocks in the pixel unit correspond to flag information for representing the arrangement order of the pixel blocks in the pixel unit; mapping the motion vectors of the current pixel units in each reference frame respectively according to the first mapping sequence, to obtain a first reference motion vector set corresponding to the current pixel unit in each reference frame, includes:
For the current pixel unit in each reference frame, determining a third mapping sequence of each pixel block in the current pixel unit based on the mark information corresponding to each pixel block in the current pixel unit;
Taking the first pixel block of the third mapping sequence in the current pixel unit as the current pixel block; calculating a first reference motion vector and a mapping position after mapping the current pixel block based on an original motion vector corresponding to the current pixel block; the mapping position is the storage position of the first reference motion vector in the first memory and the original position information of the current pixel block in the reference frame in the storage position of the second memory;
storing a first reference motion vector corresponding to the current pixel block into a first memory according to the mapping position, and storing original position information corresponding to the current pixel block into a second memory;
Taking the next pixel block of the current pixel block in the third mapping sequence as the current pixel block, returning to execute the step of calculating the first reference motion vector and the mapping position after the mapping of the current pixel block based on the original motion vector corresponding to the current pixel block, and continuing to execute until the current pixel block is the last pixel block in the current pixel unit, and obtaining the first reference motion vector set after the mapping of the current pixel unit.
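The per-block loop described in the embodiment above can be sketched as follows; `compute_mapping` is a hypothetical stand-in for the linear mapping of a block's original motion vector, and the data layout is an assumption for illustration.

```python
# Hypothetical sketch of mapping every pixel block of the current pixel unit in
# its (third) mapping order, writing the first reference MV to the first memory
# and the block's original position info to the second memory at the same address.

def map_current_unit(blocks, compute_mapping):
    """blocks: list of (flag, original_mv, origin_info), flag = mapping order.
    compute_mapping(mv) -> (mapped_address, first_reference_mv)."""
    first_mem, second_mem = {}, {}
    for flag, mv, origin in sorted(blocks, key=lambda b: b[0]):
        addr, ref_mv = compute_mapping(mv)
        first_mem[addr] = ref_mv      # first reference MV at its mapped address
        second_mem[addr] = origin     # (reference-frame mark, row) of the block
    return first_mem, second_mem
```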
In one embodiment, storing a first reference motion vector corresponding to a current pixel block in a first memory according to a mapping position, and storing original position information corresponding to the current pixel block in a second memory, including:
determining a first target storage address of a first reference motion vector mapped by the current pixel block in a first memory and a second target storage address of original position information corresponding to the current pixel block in a second memory according to the mapping position;
If the first reference motion vector mapped by the previous pixel block is not stored in the first target storage address, the first reference motion vector mapped by the current pixel block is stored in the first target storage address, and the original position information corresponding to the current pixel block is stored in the second target storage address.
In one embodiment, the method further comprises:
If the first reference motion vector mapped by the previous pixel block is stored in the first target storage address, original position information corresponding to the previous pixel block is read from the second memory;
If it is determined that the mapping sequence of the reference frame where the current pixel block is located lags behind the previous pixel block according to the original position information corresponding to the previous pixel block, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, and replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address.
In one embodiment, the method further comprises: if the current pixel block and the previous pixel block are determined to be in the same reference frame according to the original position information corresponding to the previous pixel block, and the row in which the current pixel block is located below the previous pixel block or is in the same row, the first reference motion vector of the current pixel block is replaced with the first reference motion vector mapped by the previous pixel block in the first target storage address, and the original position information corresponding to the current pixel block is replaced with the original position information of the previous pixel block in the second target storage address.
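The write/replace rule of the last three embodiments can be sketched as below. Keeping the original position information as a `(frame_order, row)` pair, and the function name, are assumptions for illustration.

```python
# Hypothetical sketch of the storage rule: write when the target address is free;
# when occupied, the new block wins only if its reference frame is mapped later,
# or it is in the same reference frame on the same or a lower row.

def write_block(addr, ref_mv, origin, first_mem, second_mem):
    """origin: (frame_order, row) of the pixel block in its reference frame."""
    if addr not in first_mem:                      # target address still empty
        first_mem[addr], second_mem[addr] = ref_mv, origin
        return True
    prev_frame, prev_row = second_mem[addr]
    frame, row = origin
    later_frame = frame > prev_frame               # later reference frame replaces
    same_frame_not_above = frame == prev_frame and row >= prev_row
    if later_frame or same_frame_not_above:
        first_mem[addr], second_mem[addr] = ref_mv, origin
        return True
    return False                                   # earlier mapping is kept
```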
In one embodiment, overlapping first reference motion vector sets corresponding to pixel units with the same mapping sequence in each reference frame to obtain second reference motion vector sets corresponding to pixel units with the same mapping sequence in the current frame, including:
For the current pixel unit of a current reference frame that is not the first reference frame, determining, in the current pixel unit, a target pixel block whose mapping position is the same as that of a pixel block in an already mapped reference frame;
Replacing the first reference motion vector mapped by the target pixel block in the already mapped reference frame with the first reference motion vector mapped by the target pixel block in the current pixel unit, and storing the remaining mapped pixel blocks of the current pixel unit and of that reference frame into the first memory;
And taking the pixel unit at the same position in the next reference frame as the current pixel unit, executing the step of determining the target pixel block with the same mapping position corresponding to the pixel block in the pixel unit at the same position in the previous reference frame, and continuing to execute until the current reference frame is the last reference frame, so as to obtain the second reference motion vector set corresponding to the pixel unit at the same position in the current frame.
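The superposition across reference frames can be sketched as an ordered overlay: each frame's first reference motion vector set is applied in the first mapping order, and a later frame's entry replaces an earlier frame's entry at the same mapped position. The dictionary representation is an assumption for illustration.

```python
# Hypothetical sketch: superimpose per-frame first reference MV sets, in the
# first mapping order, into the current frame's second reference MV set.

def superpose(per_frame_sets):
    """per_frame_sets: list of {mapped_position: first_reference_mv} dicts,
    ordered by the frames' first mapping order."""
    second_set = {}
    for frame_set in per_frame_sets:
        second_set.update(frame_set)   # later frame overwrites same positions
    return second_set
```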
In a second aspect, the present application also provides a motion field estimation apparatus, the apparatus comprising:
The acquisition module is used for determining a current frame and a plurality of reference frames from the video sequence and dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frame leads or lags the current frame in the time domain;
The motion estimation mapping module is used for determining a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located;
The decoding module is used for acquiring the position information of the pixel unit to be decoded in the current frame, reading the second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as the time domain reference motion vector set of the pixel unit to be decoded.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:
determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frame leads or lags the current frame in the time domain;
Determining a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located;
And acquiring the position information of the pixel unit to be decoded in the current frame, and reading a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frame leads or lags the current frame in the time domain;
Determining a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located;
And acquiring the position information of the pixel unit to be decoded in the current frame, and reading a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frame leads or lags the current frame in the time domain;
Determining a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located;
And acquiring the position information of the pixel unit to be decoded in the current frame, and reading a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
According to the motion field estimation method and apparatus, the computer device, and the storage medium described above, the second reference motion vector set corresponding to the current frame after motion vector mapping is completed is stored in the first memory in advance. When a pixel unit to be decoded in the current frame is decoded, the second reference motion vector set corresponding to the pixel unit mapped by its position information is read from the first memory, according only to the position information of the pixel unit to be decoded, and used as the time domain reference motion vector set of the pixel unit to be decoded. The second reference motion vector set mapped to the pixel unit to be decoded need not be repeatedly calculated, which reduces the amount of calculation, and decoding can proceed without waiting for the motion mapping of the pixel unit to be decoded to complete.
Drawings
FIG. 1 is a diagram of an application environment of a motion field estimation method in one embodiment;
FIG. 2 is a schematic diagram of 3 time domain reference frames in one embodiment;
FIG. 3 is a block partitioning diagram of pixels in a reference frame in one embodiment;
FIG. 4 is a diagram of an original raster scan order of pixel blocks in a reference frame in one embodiment;
FIG. 5 is a schematic diagram of pixel unit divisions of a reference frame and a current frame in one embodiment;
FIG. 6 is a flow chart of a motion field estimation method according to another embodiment;
FIG. 7 is a schematic diagram of a first memory storing a second set of reference motion vectors mapped for each pixel unit in a current frame according to one embodiment;
FIG. 8 is a diagram illustrating motion estimation mapping of pixel units according to a conventional technique in one embodiment;
FIG. 9 is a schematic diagram of a first reference motion vector set mapped by pixel units with the same mapping order in each reference frame according to one embodiment;
FIG. 10 is a flow chart of determining a second set of reference motion vectors corresponding to pixel units of the same mapping order for a current frame in one embodiment;
FIG. 11 is a flow diagram of a hardware implemented motion field estimation in one embodiment;
FIG. 12 is a flow chart of a first reference motion vector set after mapping of a current pixel unit in one embodiment;
FIG. 13 is a flowchart of obtaining a set of corresponding first reference motion vectors for a current pixel unit in each reference frame according to one embodiment;
FIG. 14 is a flow diagram of 64x64 pixel block motion field estimation in one embodiment;
FIG. 15 is a diagram showing the first reference motion vector distribution corresponding to pixel unit 0 of each reference frame in one embodiment;
FIG. 16 is a block diagram showing the construction of a motion field estimation apparatus in one embodiment;
Fig. 17 is an internal structural view of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The motion field estimation method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. The terminal 102 determines a current frame and a plurality of reference frames from the video sequence, and divides the current frame and each reference frame into a plurality of pixel units, wherein the reference frames lead or lag the current frame in the time domain; it determines a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame after the motion vector mapping is completed, wherein the second reference motion vector set is stored in the first memory and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises the mark information of the reference frame where the pixel block is located and the row where the pixel block is located; and it acquires the position information of the pixel unit to be decoded in the current frame, reads from the first memory the second reference motion vector set corresponding to the pixel unit mapped by the position information, and takes it as the time domain reference motion vector set of the pixel unit to be decoded.
The terminal 102 communicates with the server 104 via a network. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, portable wearable devices, and the internet of things devices may be smart televisions, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
The conventional motion field estimation procedure is: selecting 3 time domain reference frames as shown in fig. 2 for motion field estimation, dividing the time domain reference frames into a plurality of 8x8 pixel blocks as shown in fig. 3, wherein each 8x8 pixel block has an original motion vector, and performing motion estimation according to the following rules:
1) The terminal sequentially maps the motion vectors of the pixel blocks in time domain reference frame 0 according to the raster scan order shown in fig. 4 to obtain the first reference motion vector and mapping position corresponding to each pixel block of time domain reference frame 0; it determines the target storage address of each pixel block in time domain reference frame 0 in an array A[m][n] according to the mapping position corresponding to each pixel block in time domain reference frame 0, and stores the first reference motion vector of each pixel block in time domain reference frame 0 at its target storage address in array A; where m denotes the row and n the column of an 8x8 pixel block.
2) After every pixel block in temporal reference frame 0 has been mapped, the terminal sequentially performs motion vector mapping on temporal reference frame 1 according to the raster scan order shown in fig. 4, obtaining the first reference motion vector and mapping position corresponding to each pixel block of temporal reference frame 1; it determines the target storage address of each pixel block of temporal reference frame 1 in array A according to its mapping position, and stores the corresponding first reference motion vector at that address. If, while a pixel block a of temporal reference frame 1 is being stored into array A, its target storage address is found to already hold a first reference motion vector from temporal reference frame 0, that vector is replaced by the first reference motion vector of pixel block a from the later-processed temporal reference frame 1.
3) After every pixel block in temporal reference frame 1 has been mapped, the terminal sequentially performs motion vector mapping on temporal reference frame 2 according to the raster scan order shown in fig. 4, obtaining the first reference motion vector and mapping position corresponding to each pixel block of temporal reference frame 2; it determines the target storage address of each pixel block of temporal reference frame 2 in array A according to its mapping position, and stores the corresponding first reference motion vector at that address. If, while a pixel block b of temporal reference frame 2 is being stored into array A, its target storage address is found to already hold a first reference motion vector from temporal reference frame 0 or temporal reference frame 1, that vector is replaced by the first reference motion vector of pixel block b from the later-processed temporal reference frame 2.
4) After all temporal reference frames have undergone motion vector mapping, the final first reference motion vector stored in each memory unit of array A is taken as the second reference motion vector corresponding to the pixel block with the same mapping order in the current frame. For example, the first reference motion vector stored in the memory unit in the first row and first column of array A is taken as the second reference motion vector corresponding to the pixel block in the first row and first column of the current frame.
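The replacement rule in steps 1)-4) can be illustrated with a short sketch. This is a simplification, not the patent's hardware design: the frame representation, block indexing, and helper names are assumptions for illustration only.

```python
def conventional_motion_field(ref_frames, rows, cols):
    """ref_frames: list of dicts, in temporal order, each mapping a source
    block position (row, col) to (target_row, target_col, motion_vector)."""
    A = [[None] * cols for _ in range(rows)]        # array A[m][n]
    for frame in ref_frames:                        # frame 0, then 1, then 2, ...
        for src in sorted(frame):                   # raster scan order
            tr, tc, mv = frame[src]
            if 0 <= tr < rows and 0 <= tc < cols:   # keep mapping in range
                A[tr][tc] = mv                      # later frame replaces earlier
    return A

# Two blocks from different frames map to the same address (0, 0);
# the vector from the later-processed frame survives.
frame0 = {(0, 0): (0, 0, (1, 0))}
frame1 = {(0, 1): (0, 0, (2, 0))}
A = conventional_motion_field([frame0, frame1], rows=2, cols=2)
```

The key point is that the storage order alone implements the replacement rule: no comparison between vectors is needed, the last write to an address wins.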
It should be noted that: in order to reduce the complexity of hardware implementation, the AV1 standard has the following limitations:
1) The motion vector of each 8x8 pixel block cannot be mapped across a 64x64 pixel unit row. That is, in fig. 4 the motion vector of each 8x8 pixel block in a row-0 64x64 pixel unit cannot be mapped to a row-1 or row-2 64x64 pixel unit, the motion vector of each 8x8 pixel block in a row-1 64x64 pixel unit cannot be mapped to a row-0 or row-2 64x64 pixel unit, and the motion vector of each 8x8 pixel block in a row-2 64x64 pixel unit cannot be mapped to a row-0 or row-1 64x64 pixel unit.
2) Meanwhile, within the same row of 64x64 pixel units, the motion vector of each 8x8 pixel block in the current 64x64 pixel unit can only be mapped to the left-neighbouring 64x64 pixel unit, the current 64x64 pixel unit itself, or the right-neighbouring 64x64 pixel unit. For example, for the No. 1 64x64 pixel unit in the row-1 64x64 pixel unit row, the motion vector of each 8x8 pixel block can only be mapped to the No. 0, No. 1, or No. 2 64x64 pixel unit.
3) It is possible that no motion vector is mapped to a given 8x8 pixel block, in which case the temporal reference motion vector corresponding to that 8x8 pixel block is unavailable. It is also possible that several motion vectors are mapped to the same 8x8 pixel block, in which case a motion vector mapped later covers one mapped earlier.
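Restrictions 1) and 2) above amount to a simple admissibility test on 64x64-unit coordinates. The following sketch makes this concrete; the unit indexing scheme is an assumption for illustration, not part of the AV1 standard text.

```python
def mapping_allowed(src_unit_row, src_unit_col, dst_unit_row, dst_unit_col):
    # Restriction 1): a motion vector never crosses a 64x64 pixel unit row
    if dst_unit_row != src_unit_row:
        return False
    # Restriction 2): within a row, only the left neighbour, the unit itself,
    # or the right neighbour may receive the mapping
    return abs(dst_unit_col - src_unit_col) <= 1

allowed = mapping_allowed(1, 1, 1, 2)    # right neighbour in the same row
rejected = mapping_allowed(1, 1, 0, 1)   # different 64x64 pixel unit row
```

A mapped position that fails this test is simply discarded, which is what bounds the number of 64x64 units a decoder must hold for any one target unit.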
Because of the particular hardware implementation, the hardware decodes in units of 64x64 or 128x128 pixel blocks. In the motion field estimation process, therefore, only the motion vectors of the 8x8 pixel blocks in a few 64x64 pixel units of a temporal reference frame need to be mapped, rather than those of all 8x8 pixel blocks in the whole frame; specifically, these 64x64 pixel units are the left-neighbouring 64x64 pixel unit of the current 64x64 pixel unit, the current 64x64 pixel unit itself, and the right-neighbouring 64x64 pixel unit.
As shown in fig. 5, if the final temporal reference motion vector of the No. 0 64x64 pixel unit in the current frame is needed after motion field estimation, it can be obtained after motion vector mapping is performed on the No. 0 and No. 1 64x64 pixel units at the same position in the current temporal reference frame.

If the final temporal reference motion vector of the No. 1 64x64 pixel unit in the current frame is needed, it can be obtained after motion vector mapping is performed on the No. 0, No. 1, and No. 2 64x64 pixel units at the same position in the current temporal reference frame.

If the final temporal reference motion vector of the No. 2 64x64 pixel unit in the current frame is needed, it can be obtained after motion vector mapping is performed on the No. 1, No. 2, and No. 3 64x64 pixel units at the same position in the current temporal reference frame.
From the above procedure it can be seen that, to obtain the second reference motion vector of each 8x8 pixel block in every 64x64 pixel unit of the current frame, the same 64x64 pixel unit in the same temporal reference frame must be mapped repeatedly: for example, the No. 0 64x64 pixel unit undergoes 2 motion vector mappings, the No. 1 64x64 pixel unit undergoes 3, and the No. 2 64x64 pixel unit undergoes 2. Meanwhile, to obtain the second reference motion vector of each 8x8 pixel block in a given 64x64 pixel unit of the current frame, as shown in fig. 3, the original motion vectors of at least 3 64x64 pixel units in each temporal reference frame must be saved, i.e., the original temporal reference motion vectors of 9 64x64 pixel units in total. This design not only requires more hardware resources to store the original motion vectors but also performs repeated calculations; moreover, when a pixel block is to be decoded, decoding can begin only after motion mapping has been completed for that pixel block, which leads to low decoding efficiency.
Accordingly, in order to solve the above-mentioned problems, in one embodiment, as shown in fig. 6, there is provided a motion field estimation method, which is described by taking a terminal in fig. 1 as an example, comprising the steps of:
step 602, determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units.
The general motion estimation method is as follows: let the image frame at time t be the current frame f(x, y) and the image frame at time t′ be the reference frame f′(x, y); the reference frame leads or lags the current frame in the time domain. When t′ < t, this is referred to as backward motion estimation; when t′ > t, forward motion estimation. When the best matching block of the current block in the current frame t is searched for in the reference frame t′, a corresponding motion field d(x; t, t+Δt) is obtained, where Δt denotes the time difference between the reference frame and the current frame, and the temporal reference motion vector of the current frame can then be obtained. In this embodiment, the reference frame lags the current frame.
Because of the special implementation of the hardware, the hardware performs decoding in units of 64×64 or 128×128 pixel blocks, so the present embodiment divides the current frame and each reference frame into a plurality of pixel units, and the subsequent processes of motion estimation and decoding are performed in units of pixel units.
It should be noted that after the pixel units are divided, the mapping order of the pixel blocks in the reference frames is no longer the left-to-right, top-to-bottom order over the whole reference frame shown in fig. 4. Instead, the co-located pixel units in the reference frames are mapped according to the arrangement order of the reference frames in the video sequence, and the pixel blocks within a pixel unit are mapped from left to right and top to bottom. Taking 3 reference frames as an example, mapping proceeds in the order shown in fig. 7: pixel unit No. 0 in reference frame 0, pixel unit No. 0 in reference frame 1, pixel unit No. 0 in reference frame 2, pixel unit No. 1 in reference frame 0, pixel unit No. 1 in reference frame 1, and so on.
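The order just described (unit-major rather than frame-major) can be generated as follows. A minimal sketch; the function name and the frame/unit counts are assumptions for illustration.

```python
def interleaved_mapping_order(num_ref_frames, num_pixel_units):
    """Return (reference frame index, pixel unit index) pairs so that every
    reference frame's co-located unit k is visited before any frame's
    unit k+1, matching the order of fig. 7."""
    return [(frame, unit)
            for unit in range(num_pixel_units)
            for frame in range(num_ref_frames)]

# 3 reference frames, 2 pixel units per frame
order = interleaved_mapping_order(3, 2)
```

Contrast with the conventional frame-major order, which would exhaust all units of frame 0 before touching frame 1; the unit-major order is what lets a unit's final vector set be finalized early.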
In order to facilitate distinguishing pixel units, each pixel unit in the present embodiment corresponds to flag information for characterizing the arrangement order of the pixel units in the current reference frame, for example, pixel unit 0, pixel unit 1, pixel unit 2. The size of the pixel unit may be 64×64, that is, a unit composed of 8 rows and 8 columns of pixel blocks shown in fig. 3.
Optionally, the terminal determines a current frame to be decoded and a plurality of encoded reference frames from the video sequence, divides the current frame and the reference frames into a plurality of 8x8 pixel blocks, further divides them in units of 8 rows by 8 columns of pixel blocks, and thereby obtains a plurality of pixel units in the current frame and in each reference frame.
Step 604, determining a second reference motion vector set corresponding to the pixel units with the same mapping order in the current frame, based on the first reference motion vector sets obtained after motion vector mapping is performed on the pixel units with the same mapping order in each reference frame; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises flag information of the reference frame where the pixel block is located and the row where the pixel block is located.
The pixel units with the same mapping order in the reference frames are those whose flag information is the same in each reference frame. For example, in fig. 7, pixel unit No. 0 in each reference frame has the same mapping order, and pixel unit No. 1 in each reference frame has the same mapping order.
The first reference motion vector set comprises the first reference motion vector corresponding to each pixel block of the current pixel unit in the current reference frame after motion vector mapping is completed. The second reference motion vector set comprises the second reference motion vector corresponding to each pixel block of the current pixel unit in the current frame after motion vector mapping is completed. After the first reference motion vector sets corresponding to the pixel units with the same mapping order in each reference frame are superimposed, the second reference motion vector set corresponding to the pixel units with the same mapping order in the current frame is obtained and stored in the first memory.
It should be noted that: the first memory and the second memory are different in property, the first memory may be an SRAM memory, and the second memory may be a register, and the reference motion vector and the original position information are stored in different memories due to different data attributes of the reference motion vector and the original position information.
As shown in fig. 8, the first memory includes a plurality of memory units, each containing 64 memory addresses for storing the mapped first reference motion vectors of the pixel blocks in a 64x64 pixel unit. Here R_64x64_0, R_64x64_1, and R_64x64_2 denote 64x64 pixel units in the temporal reference frame, and C_64x64_0, C_64x64_1, and C_64x64_2 denote 64x64 pixel units in the current frame. The second memory may be a plurality of register sets, each register set being configured to store the control information of the first reference motion vector mapped by each 8x8 pixel block in the corresponding 64x64 pixel unit of the current reference frame (i.e., the original position corresponding to the second reference motion vector mapped by each 8x8 pixel block in the current reference frame). The number of memory units in the first memory may equal the number of 64x64 pixel units in the current frame, or may be set to at least 3 according to the memory-unit overflow rule (at least 3 memory units are needed because, to obtain the second reference motion vector of each 8x8 pixel block in a given 64x64 pixel unit of the current frame, at least the first reference motion vectors of the 8x8 pixel blocks in 3 64x64 pixel units of each temporal reference frame must be saved).
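The storage layout just described might be modeled as below. This is a hedged sketch: the class and field names are assumptions, and the real design uses SRAM and register files rather than Python lists.

```python
class MotionFieldMemory:
    """Illustrative model of the two stores, not the patent's hardware."""

    def __init__(self, num_units):
        # First memory (SRAM-like): 64 addresses per memory unit, one per
        # 8x8 pixel block of a 64x64 pixel unit, holding mapped motion vectors.
        self.first = [[None] * 64 for _ in range(num_units)]
        # Second memory (register sets): original position info per block,
        # i.e. (reference frame flag, row) of the block the vector came from.
        self.second = [[None] * 64 for _ in range(num_units)]

    def store(self, unit, addr, motion_vector, ref_flag, row):
        self.first[unit][addr] = motion_vector    # overwrite = superposition
        self.second[unit][addr] = (ref_flag, row)

mem = MotionFieldMemory(num_units=3)  # overflow rule: at least 3 memory units
mem.store(unit=0, addr=5, motion_vector=(4, -2), ref_flag=1, row=0)
```

Keeping the vector and its provenance at the same address in the two stores is what makes the later read-back by position trivial.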
Optionally, after the terminal performs motion vector mapping on the pixel units with the same mapping order in each temporal reference frame, it stores the first reference motion vector set corresponding to each pixel unit into the first memory according to a preset standard, and the final first reference motion vectors stored in all memory units of the first memory are taken as the second reference motion vector set corresponding to the current frame after motion vector mapping is completed.
Step 606, obtain the position information of the pixel unit to be decoded in the current frame, read the second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and use the second reference motion vector set as the time domain reference motion vector set of the pixel unit to be decoded.
When a certain pixel block to be decoded is decoded in the conventional motion field estimation process, only after the pixel block to be decoded is subjected to motion mapping, decoding can be performed based on the mapped motion vector, so that the problem of low decoding efficiency is caused. In order to prevent the motion field estimation from blocking the subsequent decoding process of each 64x64 pixel unit, as shown in fig. 9, the embodiment stores the second reference motion vector mapped by each 8x8 pixel block in the current frame into the first memory, and only needs to take out the second reference motion vector set mapped by the pixel unit to be decoded in the current frame from the first memory for use when decoding a certain pixel unit to be decoded in the current frame, and does not need to repeatedly calculate the second reference motion vector set mapped by the pixel unit to be decoded, thereby reducing the calculation amount and not needing to wait for the pixel unit to be decoded to complete the motion mapping.
Optionally, the terminal reads a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory according to the position information of the pixel unit to be decoded in the current frame, and takes a second reference motion vector corresponding to each 8x8 pixel block in the second reference motion vector set as a time domain reference motion vector set after motion mapping is performed on each pixel block in the pixel unit to be decoded.
In the motion field estimation method, the second reference motion vector set corresponding to the current frame after the motion vector mapping is finished is stored in the first memory in advance, when a certain pixel unit to be decoded in the current frame is decoded, the second reference motion vector set corresponding to the pixel unit mapped by the position information is read in the first memory and used as the time domain reference motion vector set of the pixel unit to be decoded, the second reference motion vector set mapped by the pixel unit to be decoded does not need to be repeatedly calculated, the calculated amount is reduced, and the pixel unit to be decoded can be decoded after the motion mapping is finished.
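The decode-time lookup of step 606 then reduces to a single indexed read. A minimal sketch, assuming the first memory is addressed by the pixel unit's position in the current frame; the function name and list representation are assumptions.

```python
def temporal_reference_mvs(first_memory, unit_position):
    """Return the second reference motion vector set for the pixel unit to
    be decoded, by position, without recomputing any mapping."""
    return first_memory[unit_position]

# One pre-computed 64-entry vector set per 64x64 pixel unit of the current frame
first_memory = [[(1, 1)] * 64, [(2, 2)] * 64]
mvs = temporal_reference_mvs(first_memory, unit_position=1)
```

Because the sets were filled during the earlier unit-major mapping pass, decoding never blocks waiting for motion mapping of the unit it is about to decode.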
In one embodiment, as shown in fig. 10, determining, based on a first reference motion vector set corresponding to a pixel unit with the same mapping order in each reference frame after the pixel unit with the same mapping order performs motion vector mapping, a second reference motion vector set corresponding to a pixel unit with the same mapping order in the current frame includes:
Step 1002, determining a first mapping order of the reference frames based on the flag information of each reference frame, and determining a second mapping order of the pixel units in the same reference frame based on the flag information of each pixel unit in that reference frame.
The flag information of a reference frame is used to characterize the arrangement order of the reference frames in the video sequence; the flag information of a pixel unit is used to characterize the arrangement order of the pixel units in the reference frame. Taking fig. 7 as an example, the first mapping order is reference frame 0, reference frame 1, reference frame 2; the second mapping order is pixel unit No. 0, pixel unit No. 1.
Optionally, the terminal uses the arrangement order of the reference frames in the video sequence as a first mapping order of the reference frames, and uses the arrangement order of the pixel units in the current reference frame as a second mapping order of the pixel units in the current reference frame.
Step 1004, performing motion vector mapping on the pixel units with the same mapping sequence in each reference frame according to the first mapping sequence and the second mapping sequence to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame.
The second mapping order determines the order in which the pixel units within each reference frame are mapped. Taking fig. 7 as an example, the first mapping order is reference frame 0, reference frame 1, reference frame 2; the second mapping order is pixel unit No. 0, pixel unit No. 1.
In this embodiment, mapping is not performed in the order shown in fig. 4. Compared with that order, the second reference motion vector set of the current frame can be obtained without waiting for every pixel unit in every reference frame to complete motion mapping: the second reference motion vector set corresponding to a pixel unit with a given mapping order in the current frame is obtained as soon as motion vector mapping has been performed on the pixel units with that mapping order in each reference frame.
For example, motion vector mapping is performed on pixel unit No. 0 in each reference frame according to the first and second mapping orders to obtain the first reference motion vector set corresponding to that pixel unit in each reference frame, and these first reference motion vector sets are superimposed according to the first mapping order to obtain the second reference motion vector set corresponding to pixel unit No. 0 in the current frame. In this process there is no need to consider whether the first reference motion vector of another pixel block in a reference frame covers a first reference motion vector corresponding to pixel unit No. 0 in a reference frame, because wherever the first reference motion vectors of the pixel blocks in reference frame 0 and reference frame 1 are mapped, they are covered by the first reference motion vectors of the pixel blocks in the last reference frame. For example, after the second reference motion vector set corresponding to pixel unit No. 0 in the current frame is obtained, even if the first reference motion vector of a pixel block b in pixel unit No. 1 of reference frame 0 is mapped into that set, then by the first-reference-motion-vector replacement principle, the first reference motion vector corresponding to pixel unit No. 0 of reference frame 2, mapped later, covers the earlier first reference motion vector of pixel block b.
In one embodiment, step 1004 specifically includes the steps of:
step 1, taking the first pixel unit in the second mapping order in each reference frame as the current pixel unit of that reference frame, and performing motion vector mapping on the current pixel units in the reference frames according to the first mapping order, to obtain the first reference motion vector set corresponding to the current pixel unit in each reference frame; the first reference motion vector set corresponding to the current pixel unit is stored in the first memory, and the original position information of each pixel block in the current pixel unit is stored in the second memory.
The target storage position of the first reference motion vector set of the current pixel unit in the first memory corresponds to the target storage position of the original position information of each pixel block of the current pixel unit in the second memory. For example, if the target storage position of the first reference motion vector set of the current pixel unit in the first memory is (0, 0), the target storage position of the original position information of each pixel block of the current pixel unit in the second memory is likewise (0, 0).
And 2, taking the next pixel unit of the current pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, returning to execute the step of respectively mapping the motion vectors of the current pixel units in each reference frame according to the first mapping sequence, and continuing to execute until the current pixel unit of each reference frame is the last pixel unit of each reference frame, so as to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame.
The reference motion vector of a pixel block can be mapped to the current pixel unit, the pixel unit adjacent to the left of the current pixel unit, or the pixel unit adjacent to the right of the current pixel unit, and a later motion vector mapping can cover an earlier one. Therefore, the first reference motion vector set corresponding to the pixel units with the same mapping order in each reference frame can be determined once motion mapping has been completed for the current pixel unit and its left- and right-adjacent pixel units in each reference frame.
Take 3 reference frames, each containing N pixel units, with the first mapping order being reference frame 0, reference frame 1, reference frame 2, and the second mapping order of each reference frame being pixel unit No. 0, pixel unit No. 1. The process of mapping the motion vectors of the pixel units with the same mapping order in each reference frame is shown in fig. 11. The terminal maps pixel unit No. 0 of reference frame 0 to obtain its first reference motion vector set, stores that set in the first memory, and stores the original position information of each pixel block of pixel unit No. 0 of reference frame 0 in the second memory. The terminal then maps pixel unit No. 0 of reference frame 1 to obtain its first reference motion vector set, stores that set in the first memory, and stores the original position information of each pixel block of pixel unit No. 0 of reference frame 1 in the second memory. The terminal then maps pixel unit No. 0 of reference frame 2 to obtain its first reference motion vector set, stores that set in the first memory, and stores the original position information of each pixel block of pixel unit No. 0 of reference frame 2 in the second memory; and so on, until the current pixel unit of each reference frame is the last pixel unit of that frame, yielding the first reference motion vector sets corresponding to the pixel units with the same mapping order in each reference frame.
Step 1006, superimposing the first reference motion vector sets corresponding to the pixel units with the same mapping order in each reference frame to obtain the second reference motion vector set corresponding to the pixel units with the same mapping order in the current frame.
Here, "superposition processing" means: if the mapping position of the first reference motion vector corresponding to a pixel block in a reference frame whose motion field is estimated later is the same as the mapping position of the first reference motion vector corresponding to a pixel block in at least one reference frame whose motion field is estimated earlier, the original position information corresponding to the pixel block in the earlier reference frame and the original position information corresponding to the pixel block in the later reference frame are stored at the same storage position of the second memory. At the same mapping position of the first memory, the first reference motion vector corresponding to the pixel block in the earlier reference frame is replaced by the first reference motion vector corresponding to the pixel block in the later reference frame; at the same storage position of the second memory, the original position information corresponding to the pixel block in the earlier reference frame is replaced by the original position information corresponding to the pixel block in the later reference frame.
For example, as shown in fig. 12, for temporal reference frame 0, temporal reference frame 1, and temporal reference frame 2 arranged in motion-estimation order, suppose the first reference motion vector of pixel block a000 in temporal reference frame 0 is A000, that of pixel block a101 in temporal reference frame 1 is A101, and that of pixel block a202 in temporal reference frame 2 is A202; the mapping positions of A000, A101, and A202 in the first memory are all R, and the storage positions of the original position information of pixel block a000, pixel block a101, and pixel block a202 in the second memory are all R′. The superposition process at R in the first memory is then: A101 replaces A000, A202 replaces A101, and the final mapping result at R is A202. The superposition process at R′ in the second memory is: the original position information of pixel block a101 replaces that of pixel block a000, the original position information of pixel block a202 replaces that of pixel block a101, and the final mapping result at R′ is the original position information of pixel block a202.
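The fig. 12 example can be replayed directly. The labels are taken from the example above; the dictionary representation of the two memories is an assumption for illustration.

```python
first_memory = {}
second_memory = {}

# (mapping position, first reference motion vector, original position info),
# listed in motion-estimation order: frame 0, then frame 1, then frame 2
mappings = [
    ("R", "A000", ("frame0", "a000")),
    ("R", "A101", ("frame1", "a101")),
    ("R", "A202", ("frame2", "a202")),
]
for pos, mv, orig in mappings:
    first_memory[pos] = mv            # later vector replaces the earlier one
    second_memory[pos + "'"] = orig   # position info replaced in lockstep
```

Because all three vectors share mapping position R, only the last-processed frame's entry survives in both memories, matching the "A202 wins" outcome of the example.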
In this embodiment, compared with the approach in which the pixel units of the next reference frame are mapped only after all pixel units of the current reference frame have been mapped, motion vector mapping is performed on the pixel units with the same mapping order in each reference frame according to the first and second mapping orders. The second reference motion vector set of the current frame can thus be obtained without waiting for every pixel unit in every reference frame to complete motion mapping: the second reference motion vector set corresponding to the pixel units with a given mapping order in the current frame is obtained as soon as motion vector mapping has been performed on the current pixel units with that mapping order in each reference frame.
In one embodiment, as shown in fig. 13, motion vector mapping is performed on a current pixel unit in each reference frame according to a first mapping sequence, so as to obtain a first reference motion vector set corresponding to the current pixel unit in each reference frame, where the method includes the following steps:
step 1302, for the current pixel unit in each reference frame, determining a third mapping order of each pixel block in the current pixel unit based on the flag information corresponding to each pixel block in the current pixel unit.
The flag information corresponding to a pixel block is used to characterize the arrangement order of the pixel blocks in the pixel unit. As shown in fig. 12, the letter a in pixel block a000 denotes a pixel block; the first digit 0 denotes reference frame 0; the second digit 0 denotes pixel unit No. 0; and the digits after the second digit denote the order of pixel block a within pixel unit No. 0, i.e., the first pixel block. The third mapping order of the pixel blocks within the current pixel unit is shown by the arrows in fig. 7.
Step 1304, taking the first pixel block of the third mapping sequence in the current pixel unit as the current pixel block; calculating a first reference motion vector and a mapping position after mapping the current pixel block based on an original motion vector corresponding to the current pixel block; the mapping position is the storage position of the first reference motion vector in the first memory and the original position information of the current pixel block in the reference frame in the second memory.
As shown in fig. 12, if the current pixel unit is the pixel unit No. 0, the first pixel block of the third mapping order in the current pixel unit is a000.
If the mapping position obtained after mapping the original motion vector of the current 8x8 pixel block is the same as the original position of some 8x8 pixel block in the current reference frame, the motion vector of the current 8x8 pixel block is said to be mapped to that 8x8 pixel block. The motion vector of each 8x8 pixel block is mapped according to the following formulas:

ProMv=OriMv×ref_to_cur/ref_to_ref (1)

ProPosX=ProMvX+CurPosX (2)

ProPosY=ProMvY+CurPosY (3)
Formula (1) is used to calculate the mapped first reference motion vector: ProMv denotes the first reference motion vector after mapping of the target 8x8 pixel block in the reference frame, OriMv denotes the original temporal reference motion vector of the target 8x8 pixel block in the reference frame, ref_to_cur denotes the distance between the current frame and the current reference frame, and ref_to_ref denotes the distance between the current reference frame and the reference frame of the adjacent reference frame.
Formulas (2) and (3) are used to calculate the mapped 8x8 pixel block to which the first reference motion vector of the target 8x8 pixel block in the reference frame is mapped: ProPosX and ProPosY denote the coordinates of the mapped 8x8 pixel block, ProMvX and ProMvY denote the components of the first reference motion vector in two different directions, and CurPosX and CurPosY denote the coordinates of the 8x8 pixel block in the current frame whose position information is identical to that of the target 8x8 pixel block in the reference frame.
Optionally, the terminal obtains the original motion vector of the current pixel block in the current pixel unit and calculates, according to formulas (1)-(3) above, the first reference motion vector mapped by the current pixel block and the mapping position of the current pixel block, i.e., which 64x64 pixel unit of the current reference frame the first reference motion vector falls in and which specific 8x8 pixel block within that 64x64 pixel unit it occupies.
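The computation of formulas (1)-(3) and of the derived 64x64 unit/8x8 block indices can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name is hypothetical, positions and motion vectors are assumed to be in pixel units, and the use of floor division for the frame-distance scaling is an assumption not fixed by the text.

```python
def map_motion_vector(ori_mv, cur_pos, ref_to_cur, ref_to_ref, block=8):
    """Scale an original temporal MV (formula 1) and locate the 8x8 block
    it maps to (formulas 2 and 3), all in assumed pixel coordinates."""
    # Formula (1): scale the original MV by the frame-distance ratio.
    pro_mv_x = ori_mv[0] * ref_to_cur // ref_to_ref
    pro_mv_y = ori_mv[1] * ref_to_cur // ref_to_ref
    # Formulas (2) and (3): offset the co-located position by the scaled MV.
    pro_pos_x = pro_mv_x + cur_pos[0]
    pro_pos_y = pro_mv_y + cur_pos[1]
    # Derived mapping position: which 64x64 unit, and which 8x8 block inside it.
    unit_col, unit_row = pro_pos_x // 64, pro_pos_y // 64
    blk_col = (pro_pos_x % 64) // block
    blk_row = (pro_pos_y % 64) // block
    return (pro_mv_x, pro_mv_y), (unit_col, unit_row, blk_col, blk_row)
```

For example, an original MV of (16, -8) at pixel position (128, 64), with ref_to_cur=1 and ref_to_ref=2, scales to (8, -4) and lands in 64x64 unit (2, 0).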
Step 1306, storing the first reference motion vector corresponding to the current pixel block into the first memory according to the mapping position, and storing the original position information corresponding to the current pixel block into the second memory.
In this embodiment, since the motion vector of each 8x8 pixel block in the temporal reference frame is mapped according to the raster scan order, the mapping result of a right 8x8 pixel block in the current reference frame may replace that of a left 8x8 pixel block, a lower 8x8 pixel block may replace an upper one, and a temporal reference frame later in the motion field estimation order may replace one earlier in the order. However, the hardware does not scan the 8x8 pixel blocks of the whole reference frame to perform mapping, but performs mapping in units of 64x64 pixel units; therefore, when performing motion mapping on the current pixel unit, it is necessary to determine, according to the original position information, whether the first reference motion vector of a pixel block mapped later can replace the first reference motion vector of a pixel block mapped earlier.
In some embodiments, the original position information corresponding to the current pixel block includes the reference frame in which the current pixel block is located and the row number of the current pixel block in that reference frame. Taking the second memory as a register set ProMvReg[8][8] as an example, each register stores only the original position information corresponding to the pixel blocks of one pixel unit; the original position information corresponding to the current pixel block stored in the second memory is defined as follows:

Each mapped first reference motion vector requires 5 bits of original position information; one 64x64 pixel unit (64 8x8 pixel blocks) therefore requires 320 bits of storage.
In this embodiment, whether a newly mapped first reference motion vector can overwrite the old first reference motion vector at the same mapping position is determined according to the BlkRowOffset and SrcColRefIdx stored in the second memory.
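One plausible way to pack the 5-bit per-block entry is sketched below. The 3-bit/2-bit split is an assumption on my part, consistent with 8 block rows per 64x64 unit and up to four reference frames; the patent only states the 5-bit total, and the field names merely mirror BlkRowOffset and SrcColRefIdx.

```python
def pack_pos_info(blk_row_offset, src_ref_idx):
    """Assumed layout: 3 bits of row offset (0-7) + 2 bits of ref index (0-3)."""
    assert 0 <= blk_row_offset < 8 and 0 <= src_ref_idx < 4
    return (src_ref_idx << 3) | blk_row_offset  # 5 bits total

def unpack_pos_info(entry):
    """Inverse of pack_pos_info: returns (BlkRowOffset, SrcColRefIdx)."""
    return entry & 0x7, entry >> 3

# One 64x64 unit holds 64 8x8 blocks -> 64 * 5 = 320 bits, matching the text.
```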
In some embodiments, storing a first reference motion vector corresponding to a current pixel block in a first memory according to a mapping position, and storing original position information corresponding to the current pixel block in a second memory, specifically including the following steps:
Step 1, determining a first target storage address of a first reference motion vector mapped by a current pixel block in a first memory and a second target storage address of original position information corresponding to the current pixel block in a second memory according to the mapping position.
And step 2, if the first reference motion vector mapped by the previous pixel block is not stored in the first target storage address, storing the first reference motion vector mapped by the current pixel block in the first target storage address, and storing the original position information corresponding to the current pixel block in the second target storage address.
In some embodiments, if the first reference motion vector mapped by the previous pixel block is stored in the first target storage address, original position information corresponding to the previous pixel block is read from the second memory; if it is determined that the mapping sequence of the reference frame where the current pixel block is located lags behind the previous pixel block according to the original position information corresponding to the previous pixel block, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, and replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address.
Optionally, as shown in fig. 14, in the process of storing the first reference motion vector mapped by the current pixel block into the first memory, control information R0 of the current pixel block is generated according to the original position information of the current pixel block; control information R1 of the previous pixel block is generated according to the original position information of the previous pixel block at the target storage address in the second memory; the mapping sequences of the reference frames characterized by control information R0 and control information R1 are compared; and if the mapping sequence of the reference frame of the current pixel block lags behind that of the reference frame of the previous pixel block, the first reference motion vector of the current pixel block replaces the first reference motion vector mapped by the previous pixel block at the first target storage address, and the original position information and control information of the pixel block stored at the second target storage address in the second memory are updated.
In some embodiments, if it is determined that the current pixel block and the previous pixel block are in the same reference frame according to the original position information corresponding to the previous pixel block, and the line in which the current pixel block is located is below the previous pixel block or is in the same line, the first reference motion vector of the current pixel block is replaced with the first reference motion vector mapped by the previous pixel block in the first target storage address, and the original position information corresponding to the current pixel block is replaced with the original position information of the previous pixel block in the second target storage address.
Optionally, as shown in fig. 14, in the process of storing the first reference motion vector mapped by the current pixel block into the first memory, control information R0 of the current pixel block is generated according to the original position information of the current pixel block; control information R1 of the previous pixel block is generated according to the original position information of the previous pixel block at the target storage address in the second memory; the row numbers of the pixel blocks represented by control information R0 and control information R1 are compared; and if the current pixel block is below the previous pixel block or in the same row, the first reference motion vector of the current pixel block replaces the first reference motion vector mapped by the previous pixel block at the first target storage address, and the original position information and control information of the pixel block stored at the second target storage address in the second memory are updated.
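The two overwrite conditions above (later reference frame wins; within the same frame, a row at or below the stored one wins) can be combined in a single predicate. This is an illustrative sketch with hypothetical names, not the hardware comparison logic of fig. 14.

```python
def should_overwrite(new_ref_order, new_row, old_ref_order, old_row):
    """Return True if the newly mapped vector may replace the stored one
    at the same mapping position, per the rules described above."""
    if new_ref_order > old_ref_order:
        return True   # the new block's reference frame is mapped later
    if new_ref_order == old_ref_order and new_row >= old_row:
        return True   # same frame: same row or a row below wins
    return False
```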
Step 1308, taking the next pixel block of the current pixel block in the third mapping sequence as the current pixel block, returning to execute the step of calculating the first reference motion vector and the mapping position after mapping the current pixel block based on the original motion vector corresponding to the current pixel block, and continuing to execute until the current pixel block is the last pixel block in the current pixel unit, so as to obtain the first reference motion vector set after mapping the current pixel unit.
Optionally, after obtaining the first reference motion vectors of all the pixel blocks in the current pixel unit according to steps 1304-1308, the terminal sorts the first reference motion vectors corresponding to the pixel blocks in the current pixel unit according to the sequence numbers of the pixel blocks, and obtains a first reference motion vector set after mapping the current pixel unit.
In this embodiment, after the pixel units are divided, the pixel blocks within a pixel unit are mapped according to their arrangement order in the pixel unit, so when the current pixel unit undergoes motion mapping, whether the first reference motion vector of a pixel block mapped later can replace that of a pixel block mapped earlier must be determined according to the original position information. For this scenario, this embodiment provides the second memory, which stores the original position information corresponding to the current pixel block; when judging whether the first reference motion vector of a later-mapped pixel block can replace that of an earlier-mapped pixel block, this provides a reference standard, ensuring that the superposition of the pixel-block mapping results does not deviate from the standard result.
In one embodiment, superposing the first reference motion vector sets corresponding to the pixel units with the same mapping sequence in each reference frame to obtain the second reference motion vector sets corresponding to the pixel units with the same mapping sequence in the current frame specifically includes the following steps:
Step 1, for the current pixel unit of a current reference frame that is not the first reference frame, determining the target pixel block in the current pixel unit whose mapping position is the same as that of a pixel block in the pixel unit with the same mapping sequence in an already-mapped reference frame.
The first reference frame is the frame mapped first; since no mapped reference frame precedes it, the first reference motion vector subset mapped by each pixel unit in the first reference frame is stored directly in the first memory, and the original position information of each pixel unit in the first reference frame is stored directly in the second memory, without superposition processing.
According to the first reference motion vector set of each reference frame obtained in step 1004, for the current pixel unit of a current reference frame that is not the first reference frame, the terminal determines the first reference motion vector sets mapped by the pixel units with the same mapping sequence in each reference frame and the mapping positions corresponding to their mapped pixel blocks, and then determines the target pixel block in the current pixel unit whose mapping position is the same as that of a pixel block in the pixel unit with the same mapping sequence in an already-mapped reference frame.
For example, as shown in fig. 12, assume the current reference frame is temporal reference frame 2, the current pixel unit is pixel unit No. 0 in temporal reference frame 2, the mapping position of pixel block a200 in pixel unit No. 0 of temporal reference frame 2 is P, and the mapping position of pixel block a201 is M. The mapping position of pixel block a102 in temporal reference frame 1 is also P; since temporal reference frame 2 is mapped later, the first reference motion vector mapped by pixel block a200 in temporal reference frame 2 can replace the first reference motion vector mapped by pixel block a102 in temporal reference frame 1, so the final mapping result at position P in the first memory is a200, and the original position information of pixel block a200 is stored at the storage position in the second memory corresponding to position P. Likewise, the mapping position of pixel block a001 in temporal reference frame 0 is also M; since temporal reference frame 2 is mapped later, the first reference motion vector mapped by pixel block a201 in temporal reference frame 2 can replace the first reference motion vector mapped by pixel block a001 in temporal reference frame 0, so the final mapping result at position M in the first memory is a201, and the original position information of pixel block a201 is stored at the storage position in the second memory corresponding to position M.
Optionally, the terminal obtains the first reference motion vector sets of the pixel units in each reference frame whose mapping is completed, determines the current reference frame (not the first reference frame) and the current pixel unit, determines the pixel units at the same position as the current pixel unit in the already-mapped reference frames together with the mapped positions of the pixel blocks in each such pixel unit, compares these mapped positions with those of the pixel blocks in the current pixel unit, and determines the target pixel blocks whose mapped positions are the same.
Step 2, replacing the first reference motion vector mapped by the target pixel block in the already-mapped reference frame with the first reference motion vector mapped by the target pixel block in the current pixel unit, and storing into the first memory the first reference motion vectors mapped by the remaining pixel blocks in the current pixel unit and the remaining pixel blocks in the already-mapped reference frame.
Each memory unit in the first memory corresponds to a memory unit number; the mapping results of pixel units with the same number are stored in the memory unit with that number, i.e., pixel units with the same flag information in each reference frame are stored in the same target memory unit in the first memory. For example, the mapping result of pixel unit No. 0 in each reference frame is stored in memory unit No. 0.
It should be noted that, as can be seen from the above embodiments, the mapping result of the current 64x64 pixel unit can only fall in the left-adjacent 64x64 pixel unit, the current 64x64 pixel unit itself, or the right-adjacent 64x64 pixel unit in the current reference frame. Therefore, the mapping result of a pixel block in the current 64x64 pixel unit may be stored in an adjacent memory unit, and the final mapping result of each memory unit is used as the second reference motion vector set of the pixel unit with the same mapping sequence in the current frame. For example, part of the mapping result of pixel unit No. 0 in a reference frame may be stored in memory unit No. 1, but the mapping result of memory unit No. 0 is still used as the second reference motion vector set of pixel unit No. 0 in the current frame.
For example, for temporal reference frame 0, temporal reference frame 1, and temporal reference frame 2 in the motion estimation order shown in fig. 12, the first reference motion vector sets obtained after motion field estimation of each No. 0 64x64 pixel unit in temporal reference frame 0, temporal reference frame 1, and temporal reference frame 2 are respectively denoted as set n, set m, and set p; the sets n, m, and p are stored into memory unit No. 0 in the first memory in sequence according to the motion estimation order of the reference frames.
Step 3, taking the pixel unit at the same position in the next reference frame as the current pixel unit, executing the step of determining the target pixel block whose mapping position is the same as that of a pixel block in the pixel unit at the same position in the previous reference frame, and continuing until the current reference frame is the last reference frame, thereby obtaining the second reference motion vector set corresponding to the pixel unit at the same position in the current frame.
For example, for temporal reference frame 0, temporal reference frame 1, and temporal reference frame 2 arranged in the motion estimation order shown in fig. 12, assume the current reference frame is temporal reference frame 1, the current pixel unit is pixel unit No. 0 in temporal reference frame 1, and the pixel unit with the same mapping sequence in the already-mapped reference frame is pixel unit No. 0 in temporal reference frame 0. The distributions in the first memory of the first reference motion vectors corresponding to pixel unit No. 0 in temporal reference frame 0 and to the current pixel unit are shown in graphs (a) and (b) of fig. 15. As can be seen from fig. 15, the target pixel blocks with the same mapping positions as those in pixel unit No. 0 of temporal reference frame 0 are marked S1 and S2; the first reference motion vectors corresponding to S1 and S2 in the current pixel unit can replace the first reference motion vectors at the same positions in pixel unit No. 0 of temporal reference frame 0. The first reference motion vectors mapped by the remaining pixel blocks in the current pixel unit and in pixel unit No. 0 of temporal reference frame 0 are superimposed, and the superimposed vector distribution shown in graph (d) of fig. 15 is stored in the first memory.
The pixel unit at the same position in the next reference frame is pixel unit No. 0 in temporal reference frame 2; the distribution in the first memory of its corresponding first reference motion vectors is shown in graph (c) of fig. 15. As can be seen from fig. 15, the target pixel blocks in pixel unit No. 0 of temporal reference frame 2 with the same mapping positions as those in pixel unit No. 0 of temporal reference frame 0 and pixel unit No. 0 of temporal reference frame 1 are marked S3, S4, and S5, so the first reference motion vectors corresponding to S3, S4, and S5 in pixel unit No. 0 of temporal reference frame 2 can replace the first reference motion vectors at the same positions in pixel unit No. 0 of temporal reference frame 0 and pixel unit No. 0 of temporal reference frame 1. The first reference motion vectors mapped by the remaining pixel blocks in pixel unit No. 0 of temporal reference frame 2 and by the remaining pixel blocks in pixel unit No. 0 of temporal reference frames 0 and 1 are superimposed; the superimposed vector distribution is shown in graph (e) of fig. 15. The first reference motion vectors shown in graph (e) replace those of graph (d) in the first memory; each first reference motion vector in graph (e) serves as the second reference motion vector corresponding to the pixel block at the same position in pixel unit No. 0 of the current frame, and the first reference motion vector set shown in graph (e) serves as the second reference motion vector set corresponding to pixel unit No. 0 of the current frame.
Optionally, the terminal maps according to the reference frame arrangement sequence shown in fig. 12, and performs superposition processing on the first reference motion vector set mapped by the nth pixel unit in each reference frame according to step 1006, and uses the superposition result as the second reference motion vector set mapped by the nth pixel unit in the current frame.
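The superposition step described above reduces to a later-overwrites-earlier merge of the per-frame mapping results for one pixel unit. The sketch below is illustrative only: dictionary keys stand in for the 8x8 block slots of one 64x64 memory unit, and the set names echo the set n / set m / set p example above.

```python
def superpose(unit_sets):
    """Merge per-frame mapping results {block_pos: motion_vector}, given in
    motion estimation order; a later frame's vector replaces an earlier
    frame's vector at the same block position."""
    merged = {}
    for frame_result in unit_sets:  # e.g. set n, set m, set p for frames 0..2
        merged.update(frame_result)
    return merged

# Frame 2's vector at position P wins over frames 0 and 1, as in fig. 12.
sets = [{'P': 'a001'}, {'P': 'a102', 'Q': 'a1xx'}, {'P': 'a200'}]
second_set = superpose(sets)
```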
In this embodiment, the target pixel block in the current pixel unit whose mapping position is the same as that of a pixel block in an already-mapped reference frame is determined, and the first reference motion vector corresponding to the later-mapped pixel block at the target pixel block's position may replace that of the earlier-mapped pixel block, thereby meeting the requirement of motion vector mapping.
In some embodiments, a motion field estimation method is provided, specifically including the following steps:
Step 1, determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frames lead or lag the current frame in the time domain.
And 2, determining a first mapping sequence of each reference frame based on the mark information of each reference frame, and determining a second mapping sequence of each pixel unit in the same reference frame based on the mark information of each pixel unit in the same reference frame.
And 3, taking the first pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, and determining the third mapping sequence of each pixel block in the current pixel unit according to the mark information corresponding to each pixel block in the current pixel unit aiming at the current pixel unit in each reference frame.
Step 4, taking the first pixel block of the third mapping sequence in the current pixel unit as the current pixel block; calculating a first reference motion vector and a mapping position after mapping the current pixel block based on an original motion vector corresponding to the current pixel block; the mapping position is the storage position of the first reference motion vector in the first memory and the original position information of the current pixel block in the reference frame in the second memory.
And 5, determining a first target storage address of the first reference motion vector mapped by the current pixel block in a first memory and a second target storage address of original position information corresponding to the current pixel block in a second memory according to the mapping position.
And step 6, if the first reference motion vector mapped by the previous pixel block is not stored in the first target storage address, storing the first reference motion vector mapped by the current pixel block in the first target storage address, storing the original position information corresponding to the current pixel block in the second target storage address, and executing the step 9.
Step 7, if the first reference motion vector mapped by the previous pixel block is stored in the first target storage address, the original position information corresponding to the previous pixel block is read from the second memory; if it is determined that the mapping sequence of the reference frame where the current pixel block is located lags behind the previous pixel block according to the original position information corresponding to the previous pixel block, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address, and executing step 9.
And 8, if the current pixel block and the previous pixel block are determined to be in the same reference frame according to the original position information corresponding to the previous pixel block, and the line where the current pixel block is located below the previous pixel block or is in the same line, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address, and executing the step 9.
And 9, taking the next pixel block of the current pixel block in the third mapping sequence as the current pixel block, returning to execute the step of calculating the first reference motion vector and the mapping position after the mapping of the current pixel block based on the original motion vector corresponding to the current pixel block, and continuing to execute until the current pixel block is the last pixel block in the current pixel unit, so as to obtain the first reference motion vector set after the mapping of the current pixel unit.
Step 10, taking the next pixel unit of the current pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, returning to execute the step of respectively mapping the motion vectors of the current pixel units in each reference frame according to the first mapping sequence, and continuing to execute until the current pixel unit of each reference frame is the last pixel unit of each reference frame, so as to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame.
Step 11, determining the target pixel block with the same mapping position corresponding to the pixel block in the reference frame after mapping in the current pixel unit aiming at the current pixel unit of the current reference frame which is not the first reference frame.
Step 12, replacing the first reference motion vector mapped by the target pixel block in the already-mapped reference frame with the first reference motion vector mapped by the target pixel block in the current pixel unit, and storing into the first memory the first reference motion vectors mapped by the remaining pixel blocks in the current pixel unit and the remaining pixel blocks in the already-mapped reference frame.
Step 13, taking the pixel unit at the same position in the next reference frame as the current pixel unit, executing the step of determining the target pixel block whose mapping position is the same as that of a pixel block in the pixel unit at the same position in the previous reference frame, and continuing until the current reference frame is the last reference frame, thereby obtaining the second reference motion vector set corresponding to the pixel unit at the same position in the current frame.
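Steps 1-13 above can be condensed into a single end-to-end sketch on toy data. All names are hypothetical; each frame is represented as a list of (mapping position, row, motion vector) tuples already produced by the per-block mapping of formulas (1)-(3), and the overwrite test combines the frame-order and row rules of steps 6-8.

```python
def estimate_motion_field(frames):
    """frames: list of per-frame mapping results in the first mapping order;
    each entry is an iterable of (pos, row, mv) in the third mapping order.
    Returns the merged first memory: {pos: mv}."""
    first_memory, second_memory = {}, {}
    for ref_order, frame in enumerate(frames):
        for pos, row, mv in frame:
            old = second_memory.get(pos)
            # Overwrite if the slot is empty, the new block's frame is mapped
            # later, or same frame with a row at or below the stored one.
            if (old is None or ref_order > old[0]
                    or (ref_order == old[0] and row >= old[1])):
                first_memory[pos] = mv
                second_memory[pos] = (ref_order, row)
    return first_memory

# Frame 1 is mapped after frame 0, so its vector at position 'P' wins.
result = estimate_motion_field([[('P', 0, 'a001')], [('P', 2, 'a102')]])
```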
The embodiment can save the storage resources on the hardware chip, reduce unnecessary computation and improve the decoding performance of the hardware.
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in these flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; their order of execution is not necessarily sequential, and they may be performed in rotation or alternately with at least some of the other steps or stages.
Based on the same inventive concept, an embodiment of the application also provides a motion field estimation apparatus for implementing the motion field estimation method. The implementation of the solution provided by the apparatus is similar to that described in the method above, so for the specific limitations in the motion field estimation apparatus embodiments below, reference may be made to the limitations of the motion field estimation method above, which are not repeated here.
In one embodiment, as shown in fig. 16, there is provided a motion field estimation apparatus including: an acquisition module 100, a motion estimation mapping module 200 and a decoding module 300, wherein:
An acquisition module 100 for determining a current frame and a plurality of reference frames from a video sequence, dividing the current frame and each reference frame into a plurality of pixel units; wherein the reference frame leads or lags the current frame in the time domain;
The motion estimation mapping module 200 is configured to determine a second reference motion vector set corresponding to a pixel unit with the same mapping sequence in the current frame based on a first reference motion vector set corresponding to a pixel unit with the same mapping sequence in each reference frame after performing motion vector mapping; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located;
the decoding module 300 is configured to obtain location information of a pixel unit to be decoded in the current frame, read, in the first memory, a second reference motion vector set corresponding to the pixel unit mapped by the location information, and use the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
In one embodiment, the reference frame corresponds to flag information for characterizing an arrangement order of the reference frame in the video sequence, and the pixel unit corresponds to flag information for characterizing an arrangement order of the pixel unit in the reference frame; the motion estimation mapping module 200 is further configured to: determining a first mapping sequence of each reference frame based on the mark information of each reference frame, and determining a second mapping sequence of each pixel unit in the same reference frame based on the mark information of each pixel unit in the same reference frame;
According to the first mapping sequence and the second mapping sequence, performing motion vector mapping on the pixel units with the same mapping sequence in each reference frame to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame;
And superposing the first reference motion vector sets corresponding to the pixel units with the same mapping sequence in each reference frame to obtain a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame.
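The superposition of per-reference-frame mapped sets can be illustrated with a small sketch. This is a hedged approximation, not the claimed implementation: both memories are modeled as dictionaries, and superposition is modeled as later reference frames in the mapping order overwriting entries at the same mapped position; all names are illustrative.

```python
# Hypothetical sketch: superposing per-reference-frame mapped
# motion-vector sets into one set for the current frame. Later
# reference frames overwrite entries at the same mapped position.

def superpose(mapped_sets):
    """mapped_sets: list (in first mapping order) of dicts
    {mapped_position: (motion_vector, origin_info)}, where origin_info
    records the source reference frame and row of the pixel block."""
    first_memory = {}   # mapped position -> reference motion vector
    second_memory = {}  # mapped position -> origin info of that vector
    for mv_set in mapped_sets:              # earlier frames first
        for pos, (mv, origin) in mv_set.items():
            first_memory[pos] = mv          # later frames replace earlier ones
            second_memory[pos] = origin
    return first_memory, second_memory

# Two reference frames whose blocks collide at mapped position (1, 1):
set_a = {(0, 0): ((2, -1), ("ref0", 0)), (1, 1): ((4, 0), ("ref0", 1))}
set_b = {(1, 1): ((3, 3), ("ref1", 2))}
first, second = superpose([set_a, set_b])
```

Positions touched by only one frame keep that frame's vector; colliding positions end up holding the vector from the frame mapped last, matching the replacement behavior the embodiments describe.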
In one embodiment, the motion estimation mapping module 200 is further configured to: take the first pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, and perform motion vector mapping on the current pixel unit in each reference frame according to the first mapping sequence to obtain a first reference motion vector set corresponding to the current pixel unit in each reference frame; the first reference motion vector set corresponding to the current pixel unit is stored in a first memory, and the original position information of each pixel block in the current pixel unit is stored in a second memory;
Taking the next pixel unit of the current pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, returning to execute the step of mapping the motion vector of the current pixel unit in each reference frame according to the first mapping sequence, and continuing to execute until the current pixel unit of each reference frame is the last pixel unit of each reference frame, so as to obtain a first reference motion vector set corresponding to the pixel unit in the same mapping sequence in each reference frame.
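The "take the first pixel unit, map it, then advance to the next until the last" recursion above is equivalent to an ordinary double loop over the two mapping sequences. A minimal sketch, assuming every frame holds the same number of pixel units and that the per-unit mapping is supplied as a callable; all names are illustrative.

```python
# Sketch of the iteration order described above: the outer loop walks
# the second mapping sequence (pixel units), the inner loop walks the
# first mapping sequence (reference frames).

def map_units_in_order(reference_frames, map_unit):
    """reference_frames: list of frames in the first mapping sequence,
    each frame a list of pixel units in the second mapping sequence.
    Returns per_order[i] = the mapped sets, one per reference frame,
    for the pixel units sharing mapping order i."""
    num_units = len(reference_frames[0])
    per_order = []
    for unit_idx in range(num_units):                  # second mapping sequence
        per_order.append([map_unit(frame[unit_idx])
                          for frame in reference_frames])  # first mapping sequence
    return per_order

# Two reference frames, two pixel units each; a toy per-unit mapping:
frames = [["u00", "u01"], ["u10", "u11"]]
result = map_units_in_order(frames, lambda unit: unit.upper())
```

The grouping makes the later superposition step straightforward: each `per_order[i]` collects exactly the first reference motion vector sets that are superposed into one second reference motion vector set.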
In one embodiment, the pixel blocks within the pixel unit correspond to flag information for characterizing the arrangement order of the pixel blocks in the pixel unit; the motion estimation mapping module 200 is further configured to: determining a third mapping sequence of each pixel block in the current pixel unit based on the mark information corresponding to each pixel block in the current pixel unit aiming at the current pixel unit in each reference frame;
Taking the first pixel block of the third mapping sequence in the current pixel unit as the current pixel block; calculating a first reference motion vector and a mapping position after mapping of the current pixel block based on an original motion vector corresponding to the current pixel block; wherein the mapping position is the storage position of the first reference motion vector in the first memory and the storage position, in the second memory, of the original position information of the current pixel block in the reference frame;
storing a first reference motion vector corresponding to the current pixel block into a first memory according to the mapping position, and storing original position information corresponding to the current pixel block into a second memory;
Taking the next pixel block of the current pixel block in the third mapping sequence as the current pixel block, returning to execute the step of calculating the first reference motion vector and the mapping position after the mapping of the current pixel block based on the original motion vector corresponding to the current pixel block, and continuing to execute until the current pixel block is the last pixel block in the current pixel unit, and obtaining the first reference motion vector set after the mapping of the current pixel unit.
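The per-block step above computes, from a block's original motion vector, both the first reference motion vector and the mapping position. The sketch below assumes a standard linear temporal projection (scaling the vector by the ratio of temporal distances and offsetting the block position by the scaled vector); the patent text does not fix this formula, so treat the arithmetic as an assumption.

```python
# Hedged sketch of the per-block mapping step: derive the first
# reference motion vector and the mapping position (the storage slot
# in the first memory) from the block's original motion vector.

def map_block(block_pos, orig_mv, d_ref, d_cur):
    """block_pos: (col, row) of the block in its reference frame;
    orig_mv: the block's original motion vector (dx, dy);
    d_ref: temporal distance spanned by the original vector;
    d_cur: temporal distance from the current frame to this reference frame."""
    scale = d_cur / d_ref                               # assumed linear scaling
    ref_mv = (orig_mv[0] * scale, orig_mv[1] * scale)   # first reference motion vector
    mapped = (round(block_pos[0] + ref_mv[0]),          # mapping position: where the
              round(block_pos[1] + ref_mv[1]))          # motion lands in the current frame
    return ref_mv, mapped

# A block at (4, 4) whose vector (2, -2) spans two frames, projected
# one frame forward:
ref_mv, mapped = map_block(block_pos=(4, 4), orig_mv=(2, -2), d_ref=2, d_cur=1)
```

The mapped position then doubles as the address into both memories: the vector goes to the first memory and the block's origin information (frame flag, row) to the second.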
In one embodiment, the motion estimation mapping module 200 is further configured to: determining a first target storage address of a first reference motion vector mapped by the current pixel block in a first memory and a second target storage address of original position information corresponding to the current pixel block in a second memory according to the mapping position;
If the first reference motion vector mapped by the previous pixel block is not stored in the first target storage address, the first reference motion vector mapped by the current pixel block is stored in the first target storage address, and the original position information corresponding to the current pixel block is stored in the second target storage address.
In one embodiment, the motion estimation mapping module 200 is further configured to: if the first reference motion vector mapped by the previous pixel block is stored in the first target storage address, original position information corresponding to the previous pixel block is read from the second memory;
If it is determined that the mapping sequence of the reference frame where the current pixel block is located lags behind the previous pixel block according to the original position information corresponding to the previous pixel block, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, and replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address.
In one embodiment, the motion estimation mapping module 200 is further configured to: if the current pixel block and the previous pixel block are determined to be in the same reference frame according to the original position information corresponding to the previous pixel block, and the row in which the current pixel block is located below the previous pixel block or is in the same row, the first reference motion vector of the current pixel block is replaced with the first reference motion vector mapped by the previous pixel block in the first target storage address, and the original position information corresponding to the current pixel block is replaced with the original position information of the previous pixel block in the second target storage address.
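The write-and-replace rules of the last few paragraphs (write when the slot is empty; replace when the new block's reference frame lags in mapping order, or when both blocks share a frame and the new block's row is the same or below) can be collected into one policy function. A hedged sketch with dictionaries standing in for the two memories; all names are illustrative.

```python
# Sketch of the collision policy described above for writes into the
# first memory (motion vectors) and second memory (origin info).

def store_mapped(first_mem, second_mem, addr, ref_mv, origin):
    """addr: target storage address; origin: (frame_mapping_order, row).
    Returns True when the new entry was written or replaced the old one."""
    if addr not in first_mem:                 # empty slot: plain write
        first_mem[addr] = ref_mv
        second_mem[addr] = origin
        return True
    prev_frame, prev_row = second_mem[addr]   # previous block's origin info
    cur_frame, cur_row = origin
    # a later reference frame wins; within one frame, same-or-lower row wins
    if cur_frame > prev_frame or (cur_frame == prev_frame and cur_row >= prev_row):
        first_mem[addr] = ref_mv
        second_mem[addr] = origin
        return True
    return False

# Empty slot, then a later frame colliding at the same address:
fm, sm = {}, {}
store_mapped(fm, sm, 7, (1, 1), (0, 3))   # writes (frame 0, row 3)
store_mapped(fm, sm, 7, (2, 2), (1, 0))   # frame 1 lags frame 0 -> replaces
```

Because both memories are updated together, a later read of the first memory can always be cross-checked against the origin info in the second memory, which is what makes the replacement decision possible in the first place.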
In one embodiment, the motion estimation mapping module 200 is further configured to: determining a target pixel block with the same mapping position corresponding to the pixel block in the reference frame after mapping in the current pixel unit aiming at the current pixel unit of the current reference frame which is not the first reference frame;
Replacing the first reference motion vector mapped by the target pixel block in the mapped reference frame with the first reference motion vector mapped by the target pixel block in the current pixel unit, and storing the first reference motion vectors mapped by the residual pixel blocks in the current pixel unit and in the mapped reference frame into the first memory;
And taking the pixel unit at the same position in the next reference frame as the current pixel unit, executing the step of determining the target pixel block with the same mapping position corresponding to the pixel block in the pixel unit at the same position in the previous reference frame, and continuing to execute until the current reference frame is the last reference frame, so as to obtain a second reference motion vector set corresponding to the pixel unit at the same position in the current frame.
The respective modules in the motion field estimation apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a terminal whose internal structure may be as shown in fig. 17. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory, and the input/output interface are connected through a system bus, and the communication interface, the display unit, and the input device are connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless mode may be realized through WIFI, a mobile cellular network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements a motion field estimation method. The display unit of the computer device is used to form a visual picture and may be a display screen, a projection device, or a virtual reality imaging device; the display screen may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, a track ball, or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in FIG. 17 is merely a block diagram of part of the structure related to the present application and does not limit the computer device to which the present application is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided that includes a memory having a computer program stored therein and a processor that when executing the computer program performs the steps of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the above embodiments.
In an embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, implements the steps of the above embodiments.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM), external cache memory, and the like. By way of illustration, and not limitation, RAM can take various forms such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of the technical features, they should be considered to be within the scope of this specification.
The foregoing examples represent only a few embodiments of the application and are described in detail, but they are not to be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the protection scope of the application. Accordingly, the protection scope of the application shall be subject to the appended claims.

Claims (10)

1. A method of motion field estimation, the method comprising:
determining a current frame and a plurality of reference frames from a video sequence, and dividing the current frame and each reference frame into a plurality of pixel units; wherein a reference frame leads or lags the current frame in the time domain; the reference frames are correspondingly provided with mark information for representing the arrangement sequence of the reference frames in the video sequence; the pixel units are corresponding to mark information for representing the arrangement sequence of the pixel units in the reference frame;
determining a first mapping sequence of each reference frame based on the mark information of each reference frame, and determining a second mapping sequence of each pixel unit in the same reference frame based on the mark information of each pixel unit in the same reference frame;
Taking the first pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, and respectively performing motion vector mapping on the current pixel unit in each reference frame according to the first mapping sequence to obtain a first reference motion vector set corresponding to the current pixel unit in each reference frame; the first reference motion vector set corresponding to the current pixel unit is stored in a first memory, and the original position information of each pixel block in the current pixel unit is stored in a second memory;
Taking the next pixel unit of the current pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, returning to execute the step of respectively mapping the motion vectors of the current pixel units in each reference frame according to the first mapping sequence, and continuing to execute until the current pixel unit of each reference frame is the last pixel unit of each reference frame, so as to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame;
Superposing the first reference motion vector sets corresponding to the pixel units with the same mapping sequence in each reference frame to obtain a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located; the superposition means that if the mapping position of the first reference motion vector corresponding to a first pixel block in a reference frame of a later motion field estimation is the same as the mapping position of the first reference motion vector corresponding to a second pixel block in a reference frame of at least one earlier motion field estimation, the original position information corresponding to the second pixel block in the reference frame of the earlier motion field estimation and the original position information corresponding to the first pixel block in the reference frame of the later motion field estimation are stored in the same storage position of the second memory; at the same mapping position of the first memory, the first reference motion vector corresponding to the second pixel block in the reference frame of the earlier motion field estimation is replaced with the first reference motion vector corresponding to the first pixel block in the reference frame of the later motion field estimation; and at the same storage position of the second memory, the original position information corresponding to the second pixel block in the reference frame of the earlier motion field estimation is replaced with the original position information corresponding to the first pixel block in the reference frame of the later motion field estimation;
And acquiring the position information of the pixel unit to be decoded in the current frame, and reading a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
2. The method according to claim 1, wherein the pixel blocks within the pixel unit correspond to flag information for characterizing an arrangement order of the pixel blocks in the pixel unit; respectively mapping motion vectors of current pixel units in each reference frame according to the first mapping sequence, and obtaining a first reference motion vector set corresponding to the current pixel units in each reference frame comprises:
Determining a third mapping sequence of each pixel block in the current pixel unit based on the mark information corresponding to each pixel block in the current pixel unit aiming at the current pixel unit in each reference frame;
Taking the first pixel block of the third mapping sequence in the current pixel unit as the current pixel block; calculating a first reference motion vector and a mapping position after mapping the current pixel block based on an original motion vector corresponding to the current pixel block; the mapping position is the storage position of the first reference motion vector in the first memory, and the original position information of the current pixel block in the reference frame is stored in the second memory;
Storing a first reference motion vector corresponding to the current pixel block into the first memory according to the mapping position, and storing original position information corresponding to the current pixel block into a second memory;
And taking the next pixel block of the current pixel block in the third mapping sequence as the current pixel block, returning to execute the step of calculating the first reference motion vector and the mapping position after the mapping of the current pixel block based on the original motion vector corresponding to the current pixel block, and continuing to execute until the current pixel block is the last pixel block in the current pixel unit, so as to obtain the first reference motion vector set after the mapping of the current pixel unit.
3. The method according to claim 2, wherein storing the first reference motion vector corresponding to the current pixel block in the first memory according to the mapping position, and storing the original position information corresponding to the current pixel block in the second memory includes:
Determining a first target storage address of the first reference motion vector mapped by the current pixel block in the first memory and a second target storage address of original position information corresponding to the current pixel block in the second memory according to the mapping position;
And if the first reference motion vector mapped by the previous pixel block is not stored in the first target storage address, storing the first reference motion vector mapped by the current pixel block into the first target storage address, and storing the original position information corresponding to the current pixel block into the second target storage address.
4. A method according to claim 3, characterized in that the method further comprises:
If the first reference motion vector mapped by the previous pixel block is stored in the first target storage address, original position information corresponding to the previous pixel block is read from the second memory;
If it is determined that the mapping sequence of the reference frame where the current pixel block is located lags behind the previous pixel block according to the original position information corresponding to the previous pixel block, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, and replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address.
5. A method according to claim 3, characterized in that the method further comprises: if the current pixel block and the previous pixel block are determined to be in the same reference frame according to the original position information corresponding to the previous pixel block, and the row in which the current pixel block is located below the previous pixel block or is in the same row, replacing the first reference motion vector of the current pixel block with the first reference motion vector mapped by the previous pixel block in the first target storage address, and replacing the original position information corresponding to the current pixel block with the original position information of the previous pixel block in the second target storage address.
6. The method according to claim 1, wherein the superimposing the first reference motion vector sets corresponding to the pixel units with the same mapping order in each reference frame to obtain the second reference motion vector set corresponding to the pixel units with the same mapping order in the current frame includes:
Determining a target pixel block with the same mapping position corresponding to the pixel block in the reference frame after mapping in the current pixel unit aiming at the current pixel unit of the current reference frame which is not the first reference frame;
Replacing the first reference motion vector mapped by the target pixel block in the mapped reference frame with the first reference motion vector mapped by the target pixel block in the current pixel unit, and storing the first reference motion vectors mapped by the residual pixel blocks in the current pixel unit and in the mapped reference frame into the first memory;
And taking the pixel unit at the same position in the next reference frame as the current pixel unit, executing the step of determining the target pixel block with the same mapping position corresponding to the pixel block in the pixel unit at the same position in the previous reference frame, and continuing to execute until the current reference frame is the last reference frame, and obtaining a second reference motion vector set corresponding to the pixel unit at the same position in the current frame.
7. A motion field estimation apparatus, the apparatus comprising:
The acquisition module is used for determining a current frame and a plurality of reference frames from the video sequence and dividing the current frame and each reference frame into a plurality of pixel units; wherein a reference frame leads or lags the current frame in the time domain; the reference frames are correspondingly provided with mark information for representing the arrangement sequence of the reference frames in the video sequence; the pixel units are corresponding to mark information for representing the arrangement sequence of the pixel units in the reference frame;
The motion estimation mapping module is used for determining a first mapping sequence of each reference frame based on the mark information of each reference frame, and determining a second mapping sequence of each pixel unit in the same reference frame based on the mark information of each pixel unit in the same reference frame; taking the first pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, and respectively performing motion vector mapping on the current pixel unit in each reference frame according to the first mapping sequence to obtain a first reference motion vector set corresponding to the current pixel unit in each reference frame; the first reference motion vector set corresponding to the current pixel unit is stored in a first memory, and the original position information of each pixel block in the current pixel unit is stored in a second memory; taking the next pixel unit of the current pixel unit in the second mapping sequence in each reference frame as the current pixel unit of each reference frame, returning to execute the step of respectively mapping the motion vectors of the current pixel units in each reference frame according to the first mapping sequence, and continuing to execute until the current pixel unit of each reference frame is the last pixel unit of each reference frame, so as to obtain a first reference motion vector set corresponding to the pixel units with the same mapping sequence in each reference frame; superposing the first reference motion vector sets corresponding to the pixel units with the same mapping sequence in each reference frame to obtain a second reference motion vector set corresponding to the pixel units with the same mapping sequence in the current frame; the second reference motion vector set is stored in the first memory, and the original position information of each pixel block in the pixel unit is stored in the
second memory; the original position information comprises mark information of a reference frame where the pixel block is located and a row where the pixel block is located; the superposition means that if the mapping position of the first reference motion vector corresponding to a first pixel block in a reference frame of a later motion field estimation is the same as the mapping position of the first reference motion vector corresponding to a second pixel block in a reference frame of at least one earlier motion field estimation, the original position information corresponding to the second pixel block in the reference frame of the earlier motion field estimation and the original position information corresponding to the first pixel block in the reference frame of the later motion field estimation are stored in the same storage position of the second memory; at the same mapping position of the first memory, the first reference motion vector corresponding to the second pixel block in the reference frame of the earlier motion field estimation is replaced with the first reference motion vector corresponding to the first pixel block in the reference frame of the later motion field estimation; and at the same storage position of the second memory, the original position information corresponding to the second pixel block in the reference frame of the earlier motion field estimation is replaced with the original position information corresponding to the first pixel block in the reference frame of the later motion field estimation;
The decoding module is used for acquiring the position information of the pixel unit to be decoded in the current frame, reading a second reference motion vector set corresponding to the pixel unit mapped by the position information in the first memory, and taking the second reference motion vector set as a time domain reference motion vector set of the pixel unit to be decoded.
8. The apparatus of claim 7, wherein the pixel blocks within the pixel unit correspond to flag information for characterizing an arrangement order of the pixel blocks in the pixel unit; the motion estimation mapping module is further configured to determine, for a current pixel unit in each reference frame, a third mapping sequence of each pixel block in the current pixel unit based on flag information corresponding to each pixel block in the current pixel unit; taking the first pixel block of the third mapping sequence in the current pixel unit as the current pixel block; calculating a first reference motion vector and a mapping position after mapping the current pixel block based on an original motion vector corresponding to the current pixel block; the mapping position is the storage position of the first reference motion vector in the first memory, and the original position information of the current pixel block in the reference frame is stored in the second memory; storing a first reference motion vector corresponding to the current pixel block into the first memory according to the mapping position, and storing original position information corresponding to the current pixel block into a second memory; and taking the next pixel block of the current pixel block in the third mapping sequence as the current pixel block, returning to execute the step of calculating the first reference motion vector and the mapping position after the mapping of the current pixel block based on the original motion vector corresponding to the current pixel block, and continuing to execute until the current pixel block is the last pixel block in the current pixel unit, so as to obtain the first reference motion vector set after the mapping of the current pixel unit.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202310390201.6A 2023-04-11 2023-04-11 Motion field estimation method, motion field estimation device, computer device and storage medium Active CN116527908B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310390201.6A CN116527908B (en) 2023-04-11 2023-04-11 Motion field estimation method, motion field estimation device, computer device and storage medium

Publications (2)

Publication Number Publication Date
CN116527908A CN116527908A (en) 2023-08-01
CN116527908B true CN116527908B (en) 2024-05-03

Family

ID=87403835

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310390201.6A Active CN116527908B (en) 2023-04-11 2023-04-11 Motion field estimation method, motion field estimation device, computer device and storage medium

Country Status (1)

Country Link
CN (1) CN116527908B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104811721A (en) * 2015-05-26 2015-07-29 珠海全志科技股份有限公司 Video decoding data storage method and calculation method of motion vector data
CN112954363A (en) * 2018-11-29 2021-06-11 联发科技股份有限公司 Method and apparatus for generating motion field motion vector of block of current frame in real time

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220360814A1 (en) * 2021-05-06 2022-11-10 Apple Inc. Enhanced motion vector prediction


Also Published As

Publication number Publication date
CN116527908A (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN114020756B (en) Remote sensing image real-time map service publishing method and device
CN115082358B (en) Image enhancement method and device, computer equipment and storage medium
CN113963072B (en) Binocular camera calibration method and device, computer equipment and storage medium
CN115272250B (en) Method, apparatus, computer device and storage medium for determining focus position
CN105427235A (en) Image browsing method and system
CN116527908B (en) Motion field estimation method, motion field estimation device, computer device and storage medium
CN114022518B (en) Method, device, equipment and medium for acquiring optical flow information of image
CN116758206A (en) Vector data fusion rendering method and device, computer equipment and storage medium
CN116051345A (en) Image data processing method, device, computer equipment and readable storage medium
CN108986031A (en) Image processing method, device, computer equipment and storage medium
CN115456858B (en) Image processing method, device, computer equipment and computer readable storage medium
CN116156174B (en) Data encoding processing method, device, computer equipment and storage medium
CN116563357B (en) Image matching method, device, computer equipment and computer readable storage medium
CN116977154B (en) Visible light image and infrared image fusion storage method, device, equipment and medium
CN114626991B (en) Image splicing method, device, equipment and medium
CN117078524A (en) Image restoration method, device, equipment, medium and product
CN117612192A (en) Electronic drawing information identification method, system, electronic equipment and storage medium
CN115223689A (en) Information display method, information display device, computer equipment and readable storage medium
CN114742704A (en) Image amplification method and model training method based on image amplification
CN117152428A (en) Model training method, device, computer equipment and computer readable storage medium
CN117036174A (en) Scaling control method, scaling control device, computer apparatus, storage medium, and program product
CN116208775A (en) Motion estimation method, motion estimation device, computer equipment and hardware encoder
CN117197827A (en) Bill information identification method, device, computer equipment and storage medium
CN117764819A (en) Image processing method, device, electronic equipment and storage medium
CN116883575A (en) Building group rendering method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 200135, 11th Floor, Building 3, No. 889 Bibo Road, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai

Patentee after: Granfei Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: 200135 Room 201, No. 2557, Jinke Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai

Patentee before: Gryfield Intelligent Technology Co.,Ltd.

Country or region before: China