CN103634604A - Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method - Google Patents

Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method

Info

Publication number
CN103634604A
Authority
CN
China
Prior art keywords
block
data
core
encoding
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310632104.XA
Other languages
Chinese (zh)
Other versions
CN103634604B (en)
Inventor
姜宏旭
孙士明
翟东林
李波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201310632104.XA priority Critical patent/CN103634604B/en
Publication of CN103634604A publication Critical patent/CN103634604A/en
Application granted granted Critical
Publication of CN103634604B publication Critical patent/CN103634604B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a data prefetching method oriented to motion estimation on a multi-core DSP (digital signal processor). Based on the spatial correlation of the data used in motion estimation and on the predicted motion vector, the data of the next coding block and of its reference blocks are prefetched into local memory, so that the motion estimation of the current coding block runs in parallel with loading the data of the next coding block and its reference blocks. This reduces the impact of memory access on the speed of the multi-core processor during motion estimation; tests show that the data prefetching method noticeably improves the motion estimation speed in multi-core DSP parallel video coding.

Description

A data prefetching method for multi-core DSP motion estimation
Technical field
The invention belongs to the field of multimedia encoding and decoding, and specifically relates to a data prefetching method for motion estimation in parallel video coding on embedded multi-core DSP processors, i.e. a method that accelerates the motion estimation stage of video coding by means of data prefetching.
Background technology
Motion estimation is one of the main components of video coding based on the hybrid coding framework. It operates on data blocks: prediction, motion search, motion compensation, DCT transform and quantization are all performed per block. In H.264/AVC the motion estimation blocks include macroblocks (MB), sub-macroblocks and blocks; in HEVC they include coding units (CU), prediction units (PU) and transform units (TU). Motion estimation of a P frame needs the data of the current coded frame and one reference frame; motion estimation of a B frame needs the current coded frame plus one forward and one backward reference frame, so the data throughput is very large. In a 1080i video sequence, each frame has a resolution of 1920 × 1080 and 60 fields are output per second; represented in YUV 4:2:0, the coded data generated per second reaches 0.746 Gbps, and the data handled by motion estimation exceeds 1.5 Gbps. As video quality rises, the amount of video data generated grows sharply.

In embedded systems, video coding increasingly uses multi-core DSP processors. An embedded multi-core DSP processor has a multi-level memory hierarchy: each core has its own local memory, and all cores share the MSM memory and a large external memory. Local memory has the smallest capacity and the highest speed; the MSM memory is larger but slower; the external memory is the largest and the slowest. Because the local memory of a multi-core DSP is too small to hold an entire coded frame and its reference frames, the frames must be divided into small data blocks: the data of the current coding block and its reference blocks are stored in internal memory, while the coded frame and reference frames reside in external memory.

As shown in Figure 1, in an embedded multi-core video encoder the captured video data is first buffered in the large external memory and is read into internal memory for processing during encoding. However, processor performance grows by about 60% per year while memory access performance grows by less than 10% per year, so the gap between processor and memory keeps widening and memory becomes the system bottleneck; in a multi-core processor the memory bottleneck is even more severe. Because motion estimation needs a large amount of data, the memory bottleneck becomes the key factor limiting its processing speed.
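As a rough check of these figures (assuming 1080i delivers 60 fields of 1920 × 540 pixels per second with 8-bit YUV 4:2:0 samples):

1920 × 540 pixels/field × 1.5 bytes/pixel × 60 fields/s × 8 bits/byte ≈ 746,496,000 bit/s ≈ 0.746 Gbps

Motion estimation must additionally read reference-block data for every coding block, which at least doubles this traffic and is consistent with the figure of more than 1.5 Gbps quoted above.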
To reduce the impact of the memory bottleneck, a multi-core processor with a multi-level memory hierarchy relies on the cache hit rate for memory performance. In a multi-level hierarchy, however, a cache miss causes a long-latency external memory access of up to hundreds of processor clock cycles, which lowers the execution speed of the processor; for example, on the TMS320C6678 the worst-case read-miss latency is 287 clock cycles, i.e. 287 ns with the core running at 1 GHz. Since motion estimation processes a large amount of data, the impact of cache misses is even more pronounced. Data prefetching reads data before it is used and reduces processor stalls by overlapping computation with memory access. The existing patent application 200410101465.2, "Method for reading macroblock data during video encoding and decoding", addresses cache-miss failures by building a "macroblock address mapping table", but it only provides an indexing scheme for macroblock data within a video frame: it can mitigate the penalty of cache misses but does not solve the prefetching of reference-frame data during encoding, and since motion estimation often needs to search several reference blocks, the data volume is larger and the impact on motion estimation is also larger. Patent application 200710046929.8, "Data prefetching system in video processing", inserts a "data prefetching module" between the processor and the memory to prefetch data blocks, but this approach of adding a hardware unit is not suitable for commercial embedded DSP processors, and because it lacks a synchronization mechanism it is also not suitable for parallel implementation on a multi-core DSP processor.
The present invention prefetches the data of the coded frame and the reference frame according to the spatial correlation of the data in the motion estimation of parallel encoding and the predicted motion vector, so that data reading runs in parallel with motion estimation processing, effectively reducing the impact of the memory bottleneck on the processing speed of the multi-core DSP processor; experiments show that the method effectively improves the execution speed of motion estimation on embedded multi-core DSP processors.
Summary of the invention
To overcome the memory access latency incurred when a multi-core DSP processor performs motion estimation, the present invention discloses a technique that prefetches the data of the coded frame and the reference frame according to the spatial correlation of the data in motion estimation and the predicted motion vector: while the previous data block is being encoded, DMA is used to prefetch the data of the next coding block and its reference blocks, so that the reading and the processing of motion estimation data proceed in parallel. Experiments show that the method effectively improves the speed of motion estimation on a multi-core DSP processor.
To achieve the above object, the present invention adopts the following technical solution:
A data prefetching method for multi-core DSP motion estimation (as shown in Figure 4), with the following steps:
Step 1: set the prefetch data block size, divide the frame into coding blocks and reference blocks, store the data of the coding block and reference blocks in the local memory of each core, and organize the storage areas as ping-pong ("Ping"/"Pang") buffers;
Step 2: if the coding block currently undergoing motion estimation belongs to a P frame, perform prediction and motion search, and prefetch the data of the next coding block and its reference blocks;
Step 3: if the coding block currently undergoing motion estimation belongs to a B frame, perform prediction and motion search, and prefetch the data of the next coding block and of its forward and backward reference blocks.
Step 1 specifically comprises the following operations:
(1) divide the coded frame and the reference frames into coding blocks and reference blocks according to the local memory capacity of the multi-core DSP and the data block size processed in motion estimation; the data of the coded image frames and reference frames is stored in the large external memory, and the data of the current coding block and its reference blocks is stored in local memory;
(2) divide the coded frame into coding blocks and coding rows, as shown in Figure 2; each core of the multi-core DSP performs motion estimation one coding row at a time, and when a core finishes the motion estimation of a coding row it immediately acquires the first coding row below that has not yet been motion-estimated and continues;
(3) set up a system control table containing the current-frame counter, the indicator of the first coding row of the current coded frame that has not yet been motion-estimated, and the encoding state table of the current coded frame; the system control table is shared by all cores so as to synchronize them;
(4) allocate ping-pong buffer areas in local memory for the current coding block and the reference data blocks: the current encoding uses the Pang buffer while data is prefetched into the Ping buffer (a sketch of these structures is given below).
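As an illustration only (not part of the patent text), a minimal C sketch of the shared control table and per-core ping-pong buffers described in step 1, using the block and row counts of the embodiment below (16 × 16 blocks, 44 blocks per coding row, 36 rows, 9 prefetched reference blocks); all type, field and macro names are ours:

```c
#include <stdint.h>

#define BLOCK_W        16
#define BLOCK_H        16
#define BLOCK_SIZE     (BLOCK_W * BLOCK_H)   /* luma bytes per coding block   */
#define BLOCKS_PER_ROW 44                    /* coding blocks per coding row  */
#define ROWS_PER_FRAME 36                    /* coding rows per coded frame   */
#define BREF_SIZE      9                     /* reference blocks per prefetch */

/* Shared control table, kept in the multi-core shared memory (MSM). */
typedef struct {
    volatile int32_t cur_enc_frame;                 /* current-frame counter              */
    volatile int32_t first_unenc_line;              /* first row not yet motion-estimated */
    volatile uint8_t enc_tab[ROWS_PER_FRAME][BLOCKS_PER_ROW]; /* per-block done flags     */
} ctrl_table_t;

/* Per-core ping-pong buffers in local (L2) memory: index 0 = Ping, 1 = Pang. */
typedef struct {
    uint8_t fenc[2][BLOCK_SIZE];                    /* coding-block data                  */
    uint8_t fref[2][BLOCK_SIZE * BREF_SIZE];        /* reference-block data               */
    int32_t corei_line;                             /* coding row being processed         */
    int32_t corei_block;                            /* coding block within that row       */
} core_buffers_t;
```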
Step 2 specifically comprises the following steps:
(1) if the current coding block belongs to a P frame, perform the prediction and motion search operations of block-level motion estimation;
(2) start a DMA transfer that prefetches the data of the next coding block after the current one from external memory into the core's internal memory; the source address, destination address and data block size are computed by formula (1):
DMA_src = fenc_base + corei_line × line_block × block_size + (corei_block + 1) × block_width
DMA_dst = corei_fenc_base
data_size = block_size        (1)
where block_width is the coding block width, block_hight the coding block height, block_size the coding block size (block_size = block_width × block_hight), DMA_src the source address of the DMA transfer of core i, DMA_dst the destination address of the DMA transfer of core i, data_size the amount of data transferred, corei_fenc_base the internal start address of the coding-block data of core i, fenc_base the start address of the current coded frame, corei_line the coding row currently processed by core i, line_block the coding row size (number of coding blocks per coding row), and corei_block the current coding block of core i. A C sketch of formulas (1) and (2) is given after step (6) below.
(3) start a DMA transfer that prefetches the data of the reference blocks of the next coding block from external memory into the core's internal memory, i.e. prefetch into the core's local memory the data of several reference blocks centred on the predicted motion vector in the reference frame; the number of prefetched reference blocks is determined by the motion search method, and the source address, destination address and data block size are computed by formula (2):
DMA_src = fref_base + (mvp_x × line_pix − block_width) + (mvp_y − block_hight)
DMA_dst = corei_ref_base
data_size = block_size × bref_size        (2)
where DMA_src is the source address of the DMA transfer of core i, DMA_dst the destination address of the DMA transfer of core i, fref_base the start address of the reference frame, line_pix the number of pixels in one line of the reference frame, (mvp_x, mvp_y) the motion vector predictor, which is obtained from the motion vector prediction method once the motion search of the current coding block has finished, corei_ref_base the start address at which the reference-block data is stored, bref_size the number of prefetched reference blocks, and the other parameters are as in formula (1) (see the sketch after step (6));
(4) perform motion compensation (MC), DCT transform and quantization;
(5) update the control table and set the encoded flag of the current coding block;
(6) repeat steps (1)–(5) until the motion estimation of every coding block in the current coding row has finished. Sketches of the formula computations and of the resulting prefetch/compute pipeline follow below.
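For illustration, formulas (1) and (2) expressed as C address computations; the parameter names mirror the patent's notation (including the spelling block_hight), while the function names and pointer-based interface are our own sketch, not part of the patent:

```c
#include <stdint.h>

/* Formula (1): source, destination and size for prefetching the next
 * coding block after (corei_line, corei_block). */
void next_block_dma_params(const uint8_t *fenc_base, uint8_t *corei_fenc_base,
                           int corei_line, int corei_block,
                           int line_block, int block_size, int block_width,
                           const uint8_t **dma_src, uint8_t **dma_dst,
                           int *data_size)
{
    *dma_src   = fenc_base
               + corei_line * line_block * block_size
               + (corei_block + 1) * block_width;
    *dma_dst   = corei_fenc_base;
    *data_size = block_size;
}

/* Formula (2): source, destination and size for prefetching bref_size
 * reference blocks centred on the motion vector predictor (mvp_x, mvp_y). */
void ref_block_dma_params(const uint8_t *fref_base, uint8_t *corei_ref_base,
                          int mvp_x, int mvp_y, int line_pix,
                          int block_width, int block_hight,
                          int block_size, int bref_size,
                          const uint8_t **dma_src, uint8_t **dma_dst,
                          int *data_size)
{
    *dma_src   = fref_base
               + (mvp_x * line_pix - block_width)
               + (mvp_y - block_hight);
    *dma_dst   = corei_ref_base;
    *data_size = block_size * bref_size;
}
```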
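Also for illustration, a hedged sketch of how steps (1)–(6) overlap prefetching with processing through the ping-pong buffers; predict_and_search(), compensate_transform_quantize(), mark_block_encoded(), dma_submit() and dma_wait() are placeholders for the encoder's own routines and DMA driver (not functions named in the patent), and ctrl_table_t/core_buffers_t are the hypothetical types sketched after step 1:

```c
/* Placeholder hooks supplied by the surrounding encoder / DMA driver. */
void predict_and_search(uint8_t *enc, uint8_t *ref);
void compensate_transform_quantize(uint8_t *enc);
void mark_block_encoded(ctrl_table_t *ct, int line, int block);
void dma_submit(void *dst);      /* start an asynchronous DMA transfer   */
void dma_wait(void);             /* wait for outstanding transfers       */

/* One coding row of a P frame on core i (illustrative only). */
void encode_row_p(core_buffers_t *b, ctrl_table_t *ct)
{
    int pang = 0;                                   /* buffer holding the current block       */
    for (b->corei_block = 0; b->corei_block < BLOCKS_PER_ROW; b->corei_block++) {
        int ping = pang ^ 1;                        /* buffer being filled for the next block */

        predict_and_search(b->fenc[pang], b->fref[pang]);       /* (1) prediction + search    */

        dma_submit(b->fenc[ping]);   /* (2) next coding block, addresses per formula (1)      */
        dma_submit(b->fref[ping]);   /* (3) its reference blocks, addresses per formula (2)   */

        compensate_transform_quantize(b->fenc[pang]);            /* (4) MC, DCT, quantization */
        mark_block_encoded(ct, b->corei_line, b->corei_block);   /* (5) update control table  */

        dma_wait();                  /* make sure the Ping data has arrived                   */
        pang = ping;                 /* (6) swap buffers and continue with the next block     */
    }
}
```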
Step 3 specifically comprises the following steps:
(1) if the current coding block belongs to a B frame, perform the prediction and motion search operations of block-level motion estimation;
(2) start a DMA transfer that prefetches the data of the next coding block after the current one from external memory into the core's local memory; the source address, destination address and data block size are computed by formula (1);
(3) start a DMA transfer that prefetches the data of the forward reference blocks of the next coding block into the core's internal memory, i.e. prefetch into the core's local memory the data of several reference blocks centred on the predicted motion vector in the reference frame; the number of prefetched reference blocks is determined by the motion search method, and the source address, destination address and data block size are computed by formula (2);
(4) start a DMA transfer that prefetches the data of the backward reference blocks of the next coding block into the core's internal memory, i.e. prefetch into the core's local memory the data of several reference blocks centred on the predicted motion vector in the reference frame; the number of prefetched reference blocks is determined by the motion search method, and the source address, destination address and data block size are computed by formula (2);
(5) perform motion compensation (MC), DCT transform and quantization;
(6) update the control table and set the encoded flag of the current coding block;
(7) repeat steps (1)–(6) until the motion estimation of every coding block in the current coding row has finished; a B-frame sketch follows below.
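The B-frame procedure differs from the P-frame pipeline sketched above only in issuing forward and backward reference prefetches before waiting; a minimal illustrative variant, assuming the hypothetical per-core buffer struct is extended with separate fref_fwd/fref_bwd areas for the two reference directions and reusing the same placeholder calls:

```c
/* One B-frame coding block on core i (illustrative only); pang/ping index
 * handling is identical to the P-frame loop above. */
void encode_block_b(core_buffers_t *b, ctrl_table_t *ct, int pang, int ping)
{
    predict_and_search(b->fenc[pang], b->fref_fwd[pang]);        /* (1) prediction + search    */

    dma_submit(b->fenc[ping]);      /* (2) next coding block, formula (1)                      */
    dma_submit(b->fref_fwd[ping]);  /* (3) forward reference blocks, formula (2)               */
    dma_submit(b->fref_bwd[ping]);  /* (4) backward reference blocks, formula (2)              */

    compensate_transform_quantize(b->fenc[pang]);                 /* (5) MC, DCT, quantization */
    mark_block_encoded(ct, b->corei_line, b->corei_block);        /* (6) update control table  */

    dma_wait();                     /* data for the next block is now in place                 */
}
```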
Compared with the prior art, the advantages of the present invention are:
1. data reading and data processing in motion estimation are performed in parallel, which accelerates the coding speed of the multi-core DSP processor;
2. the prefetching of motion estimation data achieved by the data prefetching method makes full use of the memory bandwidth and alleviates the memory bottleneck;
3. the invention makes full use of the high-speed DMA data transfer units of the embedded multi-core DSP processor, accelerating the data transfer process.
Brief description of the drawings
Fig. 1: memory structure of the multi-core encoder in video surveillance;
Fig. 2: division of coding rows in the present invention;
Fig. 3: data storage structure in the specific embodiment of the invention;
Fig. 4: flow chart of the data prefetching method for multi-core DSP parallel video coding in the present invention.
Embodiment
An embodiment of the present invention on TI's TMS320C6678LE evaluation board is given below.
The TI TMS320C6678LE evaluation board contains one TMS320C6678 chip and 512 MB of DDR3 memory as the external memory. The TMS320C6678 has 8 cores, core0 to core7, each running at 1.0 GHz; each core includes a 32 KB level-1 data memory (L1D) and a 32 KB level-1 program memory (L1P), each core has 512 KB of level-2 memory (L2), and the chip has 4 MB of shared MSM memory. The level-1 data memory is configured as cache; the data of the coding blocks and reference blocks used in motion estimation is stored in the level-2 memory, and the coded video and reference frames are stored in the external DDR3 memory.
Motion estimation uses the diamond search method with a 48 × 48 search window, so the data of 9 reference blocks centred on the motion vector predictor is prefetched from the reference frame. The video format is YUV 4:2:0 with a resolution of 704 × 576 and a coding block size of 16 × 16, so a coded frame has 36 coding rows with 44 coding blocks per row. The video start address is 0x82000000, the forward reference frame start address is 0x80000000 and the backward reference frame start address is 0x80800000. The start addresses of each core's current coding-block data and reference-block data are corei_fenc_base, corei_ref_base0 and corei_ref_base1; each storage area is a ping-pong buffer, i.e. the current motion estimation uses the Pang data while prefetched data is stored into Ping.
A parallel-encoding control table is set up, containing enc_tab, cur_enc_frame and first_unenc_line. The encoding control table and the motion vectors are data shared by all cores and are stored in the on-chip MSM memory, as shown in Figure 3; the cores access them in a mutually exclusive manner guarded by the semaphore semphone0, while the other coding parameters are stored in the L2 of each core. Each of core0–core7 has a current coding-row indicator core0_line–core7_line and a current coding-block indicator core0_block–core7_block; corei_line indicates the number of the coding row being processed by core i, and corei_block indicates the number of the coding block currently processed by core i.
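As an illustration of how a core might claim the next unencoded coding row under the shared semaphore, a hedged C sketch; sem_acquire()/sem_release() stand in for the platform's hardware semaphore API (whose exact calls are not given in the patent), and ctrl_table_t/core_buffers_t are the hypothetical types sketched earlier:

```c
/* Placeholder wrappers around the hardware semaphore guarding the control table. */
void sem_acquire(int sem_id);
void sem_release(int sem_id);

#define SEMPHONE0 0   /* identifier of the shared semaphore named in the text above */

/* Core i atomically takes the first coding row that has not yet been
 * motion-estimated and advances the shared indicator (cf. step (6) below). */
int claim_next_row(ctrl_table_t *ct, core_buffers_t *b)
{
    int row;

    sem_acquire(SEMPHONE0);
    row = ct->first_unenc_line;
    if (row < ROWS_PER_FRAME)
        ct->first_unenc_line = row + 1;
    sem_release(SEMPHONE0);

    if (row >= ROWS_PER_FRAME)
        return -1;                     /* no rows left in the current frame */

    b->corei_line  = row;
    b->corei_block = 0;
    return row;
}
```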
Data prefetching between DDR3 and L2 is implemented with QDMA: core0–core7 respectively use the QDMA0–QDMA7 channels (TCC0) of the EDMA3 as their data prefetch channels, configured in AB-sync, LINK mode; the PaRAM sets of QDMA0–QDMA7 are 00–02, 10–12, 20–22, 30–32, 40–42, 50–52, 60–62 and 70–72 respectively.
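A sketch of how one prefetch might be issued on such a QDMA channel; qdma_configure()/qdma_trigger()/qdma_done() are hypothetical wrappers (a real port would use TI's EDMA3 low-level driver calls), and the channel-to-PaRAM mapping simply follows the assignment listed above:

```c
#include <stdint.h>

/* Hypothetical thin wrappers over the EDMA3/QDMA hardware. */
void qdma_configure(int qdma_ch, int param_set,
                    const void *src, void *dst, int bytes);
void qdma_trigger(int qdma_ch);
int  qdma_done(int qdma_ch);

/* Issue one prefetch for core i; core i uses QDMA channel i, and its first
 * PaRAM set is assumed to follow the 00, 10, ..., 70 grouping listed above. */
void prefetch_issue(int core_id, const void *src, void *dst, int bytes)
{
    int ch        = core_id;            /* QDMA0 .. QDMA7      */
    int param_set = core_id * 10;       /* 00, 10, 20, ..., 70 */

    qdma_configure(ch, param_set, src, dst, bytes);
    qdma_trigger(ch);                   /* transfer runs while the core keeps computing */
}
```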
The specific implementation process is as follows:
(1) core0 reads in the video sequence and records the total number of frames of the sequence (total_frames);
(2) core0 initializes the global parameters: cur_enc_frame of the control table is initialized to -1 and first_unenc_line to -1;
(3) check whether there are frames left to encode, i.e. whether cur_enc_frame is greater than total_frames; if it is, encoding ends, otherwise go to (4);
(4) initialize the encoding state table to all "0" and set first_unenc_line to 0;
(5) check whether there are coding rows left, i.e. whether first_unenc_line is greater than 36; if it is, the current coded frame is finished, go to (3); otherwise go to (6);
(6) acquire a coding row: each core reads first_unenc_line into corei_line in a mutually exclusive manner, sets corei_block = 0, and increments first_unenc_line by 1;
(7) compute the start address of the coding block from cur_enc_frame, corei_line and corei_block; the addresses of the Y, U and V components are respectively:
Y = 0x82000000 + cur_enc_frame × 608256 + corei_line × 11264 + corei_block × 16
U = 0x82000000 + cur_enc_frame × 608256 + 0x63000 + corei_line × 5632 + corei_block × 8
V = 0x82000000 + cur_enc_frame × 608256 + 0x7BC00 + corei_line × 5632 + corei_block × 8
the corresponding data_size values are 256, 64 and 64 bytes respectively (this address computation is sketched in C after step (15) below);
(8) check whether the motion searches of the blocks adjacent to the current coding block on the left, top, top-right and top-left have completed; if they have, continue, otherwise wait for the motion search of the neighbouring blocks to finish;
(9) check whether the data of the current coding block has been transferred to L2; if it has, continue, otherwise wait for the transfer to finish;
(10) perform the prediction and motion search operations of the coding block;
(11) prefetch the next coding block and its reference (macro)block data as shown in Figure 4; the source address, destination address and data block size of the prefetched data of P-frame and B-frame coding blocks are computed by formulas (1) and (2), with block_width = 16, block_hight = 16, line_pix = 704, fenc_base = 0x82000000 + cur_enc_frame × 608256 and bref_size = 9;
(12) perform motion compensation (MC), DCT transform and quantization;
(13) update enc_tab in the control table, i.e. enc_tab[corei_line, corei_block] = 1;
(14) check whether the current coding row is finished: if corei_block is less than 44, go to (8) and continue processing the current coding row, otherwise go to (5) and process the next coding row;
(15) repeat (5) to (14) until all frames of the video have been processed. The address computation of step (7) and a condensed version of this per-core loop are sketched below.
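For illustration, the address computation of step (7) in C, keeping the constants exactly as given above (704 × 576 YUV 4:2:0, 608256 bytes per frame); the function name is ours:

```c
#include <stdint.h>

/* Start addresses of the Y, U and V components of coding block
 * (corei_line, corei_block) of frame cur_enc_frame, per step (7). */
void coding_block_addresses(int cur_enc_frame, int corei_line, int corei_block,
                            uint32_t *y_addr, uint32_t *u_addr, uint32_t *v_addr)
{
    uint32_t frame = 0x82000000u + (uint32_t)cur_enc_frame * 608256u;

    *y_addr = frame            + (uint32_t)corei_line * 11264u + (uint32_t)corei_block * 16u;
    *u_addr = frame + 0x63000u + (uint32_t)corei_line *  5632u + (uint32_t)corei_block *  8u;
    *v_addr = frame + 0x7BC00u + (uint32_t)corei_line *  5632u + (uint32_t)corei_block *  8u;

    /* corresponding data_size values: 256, 64 and 64 bytes respectively */
}
```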
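And a condensed, illustrative sketch of steps (5)–(14) as one per-core loop, reusing the hypothetical helpers defined above; neighbours_done() expresses the dependency of step (8) on the left, top, top-right and top-left neighbours, and ping-pong buffer switching is omitted for brevity:

```c
/* True once the left, top, top-right and top-left neighbours of
 * (line, block) are marked encoded in enc_tab (step (8)). */
static int neighbours_done(const ctrl_table_t *ct, int line, int block)
{
    if (block > 0 && !ct->enc_tab[line][block - 1])                          return 0;
    if (line > 0) {
        if (!ct->enc_tab[line - 1][block])                                   return 0;
        if (block + 1 < BLOCKS_PER_ROW && !ct->enc_tab[line - 1][block + 1]) return 0;
        if (block > 0 && !ct->enc_tab[line - 1][block - 1])                  return 0;
    }
    return 1;
}

/* Per-core processing of one coded frame, steps (5)-(14), illustrative only. */
void encode_frame(ctrl_table_t *ct, core_buffers_t *b)
{
    while (claim_next_row(ct, b) >= 0) {                       /* steps (5)-(6)                   */
        for (; b->corei_block < BLOCKS_PER_ROW; b->corei_block++) {
            while (!neighbours_done(ct, b->corei_line, b->corei_block))
                ;                                              /* step (8): wait for neighbours   */
            dma_wait();                                        /* step (9): block data is in L2   */
            predict_and_search(b->fenc[0], b->fref[0]);        /* step (10): prediction + search  */
            /* step (11): prefetch the next coding block and its reference blocks
             * using formulas (1) and (2), as in the P/B-frame sketches above.   */
            compensate_transform_quantize(b->fenc[0]);         /* step (12): MC, DCT, quantization */
            ct->enc_tab[b->corei_line][b->corei_block] = 1;    /* step (13): update enc_tab        */
        }                                                      /* step (14): row finished          */
    }
}
```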
Test results (QP = 27):
When the GOP structure of the video is IBBP BBP BBP BBP BB, the average performance improvement is approximately 15%.

Claims (4)

1. A data prefetching method for multi-core DSP motion estimation, characterized in that the steps are as follows:
Step 1: set the prefetch data block size, divide the frame into coding blocks and reference blocks, store the data of the coding block and reference blocks in the local memory of each core, and organize the storage areas as ping-pong (Ping/Pang) buffers;
Step 2: if the coding block currently undergoing motion estimation belongs to a P frame, perform prediction and motion search, and prefetch the data of the next coding block and its reference blocks;
Step 3: if the coding block currently undergoing motion estimation belongs to a B frame, perform prediction and motion search, and prefetch the data of the next coding block and of its forward and backward reference blocks.
2. The data prefetching method for multi-core DSP motion estimation according to claim 1, characterized in that step 1 specifically comprises the following operations:
(1) divide the coded frame and the reference frames into coding blocks and reference blocks according to the local memory capacity of the multi-core DSP and the data block size processed in motion estimation; the data of the coded image frames and reference frames is stored in the large external memory, and the data of the current coding block and its reference blocks is stored in local memory;
(2) divide the coded frame into coding blocks and coding rows; each core of the multi-core DSP performs motion estimation one coding row at a time, and when a core finishes the motion estimation of a coding row it immediately acquires the first coding row below that has not yet been motion-estimated and continues;
(3) set up a system control table containing the current-frame counter, the indicator of the first coding row of the current coded frame that has not yet been motion-estimated, and the encoding state table of the current coded frame; the system control table is shared by all cores so as to synchronize them;
(4) allocate ping-pong buffer areas in local memory for the current coding block and the reference data blocks: the current encoding uses the Pang buffer while data is prefetched into the Ping buffer.
3. The data prefetching method for multi-core DSP motion estimation according to claim 1, characterized in that step 2 specifically comprises the following steps:
(2.1) if the current coding block belongs to a P frame, perform the prediction and motion search operations of block-level motion estimation;
(2.2) start a DMA transfer that prefetches the data of the next coding block after the current one from external memory into the core's internal memory; the source address, destination address and data block size are computed by formula (1):
DMA_src = fenc_base + corei_line × line_block × block_size + (corei_block + 1) × block_width
DMA_dst = corei_fenc_base
data_size = block_size        (1)
where block_width is the coding block width, block_hight the coding block height, block_size the coding block size (block_size = block_width × block_hight), DMA_src the source address of the DMA transfer of core i, DMA_dst the destination address of the DMA transfer of core i, data_size the amount of data transferred, corei_fenc_base the internal start address of the coding-block data of core i, fenc_base the start address of the current coded frame, corei_line the coding row currently processed by core i, line_block the coding row size (number of coding blocks per coding row), and corei_block the current coding block of core i;
(2.3) start a DMA transfer that prefetches the data of the reference blocks of the next coding block from external memory into the core's internal memory, i.e. prefetch into the core's local memory the data of several reference blocks centred on the predicted motion vector in the reference frame; the number of prefetched reference blocks is determined by the motion search method, and the source address, destination address and data block size are computed by formula (2):
DMA_src = fref_base + (mvp_x × line_pix − block_width) + (mvp_y − block_hight)
DMA_dst = corei_ref_base
data_size = block_size × bref_size        (2)
where DMA_src is the source address of the DMA transfer of core i, DMA_dst the destination address of the DMA transfer of core i, fref_base the start address of the reference frame, line_pix the number of pixels in one line of the reference frame, (mvp_x, mvp_y) the motion vector predictor, which is obtained from the motion vector prediction method once the motion search of the current coding block has finished, corei_ref_base the start address at which the reference-block data is stored, bref_size the number of prefetched reference blocks, and the other parameters are as in formula (1);
(2.4) perform motion compensation (MC), DCT transform and quantization;
(2.5) update the control table and set the encoded flag of the current coding block;
(2.6) repeat steps (2.1)–(2.5) until the motion estimation of every coding block in the current coding row has finished.
4. The data prefetching method for multi-core DSP motion estimation according to claim 1, characterized in that step 3 specifically comprises the following steps:
(3.1) if the current coding block belongs to a B frame, perform the prediction and motion search operations of block-level motion estimation;
(3.2) start a DMA transfer that prefetches the data of the next coding block after the current one from external memory into the core's local memory; the source address, destination address and data block size are computed by formula (1);
(3.3) start a DMA transfer that prefetches the data of the forward reference blocks of the next coding block into the core's internal memory, i.e. prefetch into the core's local memory the data of several reference blocks centred on the predicted motion vector in the reference frame; the number of prefetched reference blocks is determined by the motion search method, and the source address, destination address and data block size are computed by formula (2);
(3.4) start a DMA transfer that prefetches the data of the backward reference blocks of the next coding block into the core's internal memory, i.e. prefetch into the core's local memory the data of several reference blocks centred on the predicted motion vector in the reference frame; the number of prefetched reference blocks is determined by the motion search method, and the source address, destination address and data block size are computed by formula (2);
(3.5) perform motion compensation (MC), DCT transform and quantization;
(3.6) update the control table and set the encoded flag of the current coding block;
(3.7) repeat steps (3.1)–(3.6) until the motion estimation of every coding block in the current coding row has finished.
CN201310632104.XA 2013-12-01 2013-12-01 Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method Active CN103634604B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310632104.XA CN103634604B (en) 2013-12-01 2013-12-01 Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310632104.XA CN103634604B (en) 2013-12-01 2013-12-01 Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method

Publications (2)

Publication Number Publication Date
CN103634604A true CN103634604A (en) 2014-03-12
CN103634604B CN103634604B (en) 2017-01-11

Family

ID=50215177

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310632104.XA Active CN103634604B (en) 2013-12-01 2013-12-01 Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method

Country Status (1)

Country Link
CN (1) CN103634604B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183698A (en) * 2015-09-23 2015-12-23 上海无线电设备研究所 Control processing system and method based on multi-kernel DSP
CN107329813A (en) * 2017-06-09 2017-11-07 北京中科睿芯科技有限公司 A kind of global perception data active prefetching method and system towards many-core processor
CN111225243A (en) * 2020-01-20 2020-06-02 中南大学 Video block scheduling method and system
CN111316643A (en) * 2019-03-29 2020-06-19 深圳市大疆创新科技有限公司 Video coding method, device and movable platform
CN111683249A (en) * 2020-06-24 2020-09-18 湖南国科微电子股份有限公司 Data reading method, device, decoder and storage medium
US11057637B1 (en) 2020-01-29 2021-07-06 Mellanox Technologies, Ltd. Efficient video motion estimation by reusing a reference search region

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100440973C (en) * 2004-12-21 2008-12-03 北京中星微电子有限公司 A macro block prefetching method in video encoding-decoding process
JP4910576B2 (en) * 2006-09-04 2012-04-04 富士通株式会社 Moving image processing device
CN100484246C (en) * 2007-03-15 2009-04-29 上海交通大学 Pixel prefetching device of motion compensating module in AVS video hardware decoder
CN100524357C (en) * 2007-10-11 2009-08-05 上海交通大学 Data pre-fetching system in video processing

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183698A (en) * 2015-09-23 2015-12-23 上海无线电设备研究所 Control processing system and method based on multi-kernel DSP
CN105183698B (en) * 2015-09-23 2019-03-08 上海无线电设备研究所 A kind of control processing system and method based on multi-core DSP
CN107329813A (en) * 2017-06-09 2017-11-07 北京中科睿芯科技有限公司 A kind of global perception data active prefetching method and system towards many-core processor
CN107329813B (en) * 2017-06-09 2020-08-04 北京中科睿芯科技有限公司 Global sensing data active prefetching method and system for many-core processor
CN111316643A (en) * 2019-03-29 2020-06-19 深圳市大疆创新科技有限公司 Video coding method, device and movable platform
WO2020199050A1 (en) * 2019-03-29 2020-10-08 深圳市大疆创新科技有限公司 Video encoding method and device, and movable platform
CN111225243A (en) * 2020-01-20 2020-06-02 中南大学 Video block scheduling method and system
US11057637B1 (en) 2020-01-29 2021-07-06 Mellanox Technologies, Ltd. Efficient video motion estimation by reusing a reference search region
CN111683249A (en) * 2020-06-24 2020-09-18 湖南国科微电子股份有限公司 Data reading method, device, decoder and storage medium

Also Published As

Publication number Publication date
CN103634604B (en) 2017-01-11

Similar Documents

Publication Publication Date Title
CN103634604A (en) Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method
CN102165780B (en) Video encoder and method, and video decoder and method thereof
US9948934B2 (en) Estimating rate costs in video encoding operations using entropy encoding statistics
US9351003B2 (en) Context re-mapping in CABAC encoder
CN100562114C (en) Video encoding/decoding method and decoding device
KR101177666B1 (en) Intelligent decoded picture buffering
US8577165B2 (en) Method and apparatus for bandwidth-reduced image encoding and decoding
CN101166277B (en) Method for accessing memory in apparatus for processing moving pictures
KR20050074012A (en) Image coding system
EP2747434A1 (en) Video image compression/decompression device
US10171804B1 (en) Video frame encoding scheme selection
CN100474929C (en) Loading device and method for moving compensating data
CN101350928A (en) Method and apparatus for estimating motion
KR20120066305A (en) Caching apparatus and method for video motion estimation and motion compensation
US9300975B2 (en) Concurrent access shared buffer in a video encoder
CN202995701U (en) Data information cache management system based on preliminary decoding analysis
Wang et al. Motion compensation architecture for 8K UHDTV HEVC decoder
CN101448160B (en) Pixel reconstruction method with data reconstruction feedback, and decoder
CN103327340A (en) Method and device for searching integer
CN103034455A (en) Method and system for managing data information buffer based on pre-decoding and analyzing
CN101505424B (en) An entropy decoding bit parsing method, an entropy decoder, and a video decoding chip
CN101304520A (en) Image decoding system and self-adapting fetching-rapidly method for motion compensation thereof
KR20080090238A (en) Apparatus and method for bandwidth aware motion compensation
CN104754345B (en) Method for video coding and video encoder
KR20110065335A (en) System for video processing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant