CN104053000B - Video compression (VC-2) decoding using parallel decoding paths - Google Patents

Video compression (VC-2) decoding using parallel decoding paths

Info

Publication number
CN104053000B
Authority
CN
China
Prior art keywords
data
processing
inverse discrete
wavelet conversion
discrete wavelet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410098981.8A
Other languages
Chinese (zh)
Other versions
CN104053000A (en)
Inventor
周凯正
陈亭中
黃家春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intersil Corp
Original Assignee
Intersil Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/851,821 (US9241163B2)
Application filed by Intersil Inc
Publication of CN104053000A
Application granted
Publication of CN104053000B
Legal status: Active
Anticipated expiration

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed herein is video compression (VC-2) decoding using parallel decoding paths. In an embodiment, a VC-2 decoder includes three parallel data paths: a top-band, a current-band, and a bottom-band data path. The top-band data path performs variable length decoding (VLD), inverse quantization (IQ), and inverse DC prediction (IDCP) processing of a top compressed data band. The current-band data path performs VLD, IQ, and IDCP processing of a current compressed data band. The bottom-band data path performs VLD, IQ, and IDCP processing of a bottom compressed data band. In addition, the decoder includes a three-level inverse discrete wavelet transform (IDWT) module that performs IDWT processing to synthesize decoded pixel values from the partially decompressed top, current, and bottom data bands produced by the three parallel data paths. The decoder also includes a slice bytes equalizer, a bit-stream first-in-first-out buffer (FIFO), a scan conversion FIFO, and a module that inserts horizontal and vertical blanking periods into the data received from the scan conversion FIFO.

Description

Video compression (VC-2) decoding using parallel decoding paths
Technical field
Embodiments of the invention relate generally to decoders and methods for use in decoding data.
Background
The VC-2 video compression standard is an open, freely usable video coding standard contributed by the British Broadcasting Corporation (BBC) to the Society of Motion Picture and Television Engineers (SMPTE). The VC-2 standard uses the discrete wavelet transform (DWT) and interleaved exponential-Golomb (IEG) variable length coding to achieve the desired video compression. Originally designed to compete with the prevailing H.264 standard, the DWT is expected to produce fewer blocking artifacts than the prevailing systems based on the discrete cosine transform (DCT). To meet the low-latency requirement of serial data interface (SDI) transmission systems, SMPTE standardized two low-delay profiles: Level 64, which uses the (2,2) DWT, and Level 65, which uses the overlapped (5,3) DWT. It has been shown that fitting a high definition (HD) video signal into a standard definition SDI (SD-SDI) payload with excellent video quality requires Level 65 compression.
VC-2 Level 65 is a subset of the low-delay profile with the following attributes:
1. 4:2:2 10-bit sampling, with supported resolutions 1920 × 1080i29.97, 1920 × 1080i25, 1280 × 720p59.94, and 1280 × 720p50.
2. The codec uses only the low-delay profile.
3. The codec uses only the LeGall (5,3) wavelet transform (wavelet index = 1).
4. The wavelet depth is exactly 3 levels.
5. The slice size is fixed at 16 (horizontal) × 8 (vertical) in luma and 8 (horizontal) × 8 (vertical) in chroma.
Conventionally, the overlapped DWT is used in the JPEG-2000 standard, which is widely used in digital cameras and medical imaging systems. In the literature, there are many publications on how to reduce the implementation complexity of the 2-D DWT. A common feature of this technology is that JPEG-2000-based implementations use an external frame buffer memory to process on-chip DWT/IDWT data. Accordingly, these publications have focused mainly on the following: minimizing read and write accesses to the external memory; reducing on-chip internal memory; speeding up data processing; and choosing a scan scheme that minimizes memory usage. However, an external memory typically increases the costs associated with chip package size and power consumption, as well as overall system complexity and bill-of-materials (BOM) cost.
Summary of the invention
Described below is a highly efficient three-band parallel processing VC-2 decoding architecture, and methods of implementing it, including a time-overlapped, high-throughput 2-D inverse discrete wavelet transform (IDWT) filter design, a multi-level slice-based IDWT processing method based on simultaneous real-time input, a slice bytes equalizer for easy FIFO handling, a four-process-in-one-time-slot processing architecture for reducing inter-process communication buffers, an IDWT neighborhood slice storage reduction method, and an IDWT resolution reduction method. Also described below is an analytic function for evaluating input buffer size based on the input and output video formats. In accordance with specific embodiments, a pipelined 1-D IDWT process reduces, and preferably halves, the overall 2-D IDWT processing time. In accordance with specific embodiments, input data generated in real time are fed directly to an IDWT processor (which can also be referred to as an IDWT module) without using intermediate buffers, which reduces storage and delay. Additionally, specific embodiments avoid the use of external volatile memory (as is required in most video compression systems) and eliminate a three-band internal memory. Accordingly, the disclosed architecture and methods allow a VC-2 decoder implementation to use a small amount of internal static memory and buffers, and result in a very short processing delay. This enables multiple decoder channels (for example, four decoder channels) to be packed into a single chip.
An embodiment of the present invention provides a decoder comprising:
three parallel data paths, including a top-band data path, a current-band data path, and a bottom-band data path,
wherein the top-band data path performs variable length decoding (VLD), inverse quantization (IQ), and inverse DC prediction (IDCP) processing of a top compressed data band;
the current-band data path performs VLD, IQ, and IDCP processing of a current compressed data band; and
the bottom-band data path performs VLD, IQ, and IDCP processing of a bottom compressed data band; and
a three-level inverse discrete wavelet transform (IDWT) module that performs IDWT processing to synthesize decoded pixel values from the partially decompressed top, current, and bottom data bands produced by the three parallel data paths.
An embodiment of the present invention also provides a method for use in decoding data, comprising:
(a) performing variable length decoding (VLD), inverse quantization (IQ), and inverse DC prediction (IDCP) processing of a top compressed data band;
(b) performing VLD, IQ, and IDCP processing of a current compressed data band; and
(c) performing VLD, IQ, and IDCP processing of a bottom compressed data band;
wherein steps (a), (b), and (c) are performed in parallel; and further comprising
(d) performing three-level inverse discrete wavelet transform (IDWT) processing to synthesize decoded pixel values from the partially decompressed top, current, and bottom data bands produced by steps (a), (b), and (c).
The present invention also provides a decoder comprising:
a slice bytes equalizer that equalizes the number of compressed bytes in each data slice being decoded by the decoder, and thereby equalizes the number of compressed bytes in each of three compressed data bands, which include a top compressed data band, a current compressed data band, and a bottom compressed data band;
three parallel data paths, including a top-band data path, a current-band data path, and a bottom-band data path,
wherein the top-band data path performs variable length decoding (VLD), inverse quantization (IQ), and inverse DC prediction (IDCP) processing of the top compressed data band;
the current-band data path performs VLD, IQ, and IDCP processing of the current compressed data band; and
the bottom-band data path performs VLD, IQ, and IDCP processing of the bottom compressed data band; and
a three-level inverse discrete wavelet transform (IDWT) module that performs IDWT processing using the partially decompressed top, current, and bottom data bands produced by the three parallel data paths;
wherein the three-level IDWT module includes a pipelined two-dimensional (2-D) IDWT synthesis filter that is implemented using a plurality of overlapped one-dimensional (1-D) IDWT filters.
In accordance with specific embodiments of the present invention, a serial data interface (SDI) receiver chip does not require and does not include an external memory. This is beneficial because the small compression ratio of 5 to 10 in the SDI application range cannot justify the additional cost of an external frame buffer memory. This lack of external memory is one difference between specific embodiments of the present invention and other DWT-based designs.
Compared with the much simpler non-overlapped (2,2) DWT used in the Level 64 standard, the overlapped nature of the (5,3) DWT is difficult to handle in real time. Accordingly, the overlapped nature of the (5,3) DWT can cause implementation difficulties if it is not handled properly. Specific embodiments of the present invention described herein overcome these implementation difficulties, and actually take advantage of the overlapped nature of the (5,3) DWT by using a three-band internal memory to hold the incoming real-time video data, so that the external memory used in other DWT-based designs can be eliminated.
In specific embodiments, an SDI receiver packs four decoder channels into one chip. If this is not done correctly, the SDI receiver may require an internal memory whose gate count is larger than that of all the other parts of the chip combined. In other words, it would be impractical to pack a three-band internal memory of this potential size into one chip together with the other circuitry that the SDI receiver needs. To overcome this problem, specific embodiments described herein can be used to reduce memory usage.
More generally, described below is a systematic way of eliminating the external memory and the main internal memory that may otherwise be required by an SDI receiver chip. In addition, three architectures/techniques/schemes that further reduce internal buffer usage are disclosed.
Certain embodiments relate to a parallel processing architecture that uses three small sets of variable length decoding (VLD), inverse quantization (IQ), and inverse DC prediction (IDCP) modules running simultaneously to generate, in real time, the three bands of data required by the IDWT. This repeated real-time generation of IDWT input data completely eliminates the need to store three very large bands of data in internal memory. Advantageously, the additional cost of two more sets of VLD, IQ, and IDCP modules is a gate-count increase of less than 1%, while the internal memory they replace would, if not eliminated, account for more than 50% of the total gate count.
To reduce, and preferably minimize, input buffer storage, a single-port static RAM can be used to store the compressed input stream extracted from the SD-SDI link. Also described is a technique for calculating the minimum buffer size needed to sustain real-time SD-to-HD operation. As will be appreciated from the description below, the buffer size can be evaluated using a simple formula.
In certain embodiments, to simplify the address calculation logic for simultaneously reading three variable length coded (VLC) streams, the number of bytes received per slice is equalized by padding shorter slices with "1" bits up to the equalized byte boundary. This technique allows the three required VLC streams to be read sequentially at equally spaced addresses, which significantly simplifies the input buffer design.
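The equalization idea can be illustrated with a minimal Python sketch. It assumes each slice is available as a `bytes` object, pads with whole 0xFF bytes (all "1" bits), and equalizes to the longest slice in the group. The function names `equalize_slices` and `slice_address` and these choices are illustrative assumptions, not details taken from the disclosure.

```python
def equalize_slices(slices):
    """Pad every slice with 0xFF bytes (all '1' bits) to a common length.

    Hypothetical helper: equalizing to the longest slice is an assumption
    made for this sketch.
    """
    target = max(len(s) for s in slices)
    padded = [s + b"\xff" * (target - len(s)) for s in slices]
    return padded, target


def slice_address(index, slice_len, base=0):
    """With equal-length slices, a read address is a simple multiple."""
    return base + index * slice_len


padded, slice_len = equalize_slices([b"\x12\x34\x56", b"\x9a", b"\xbc\xde"])
addresses = [slice_address(i, slice_len) for i in range(len(padded))]
```

Because every padded slice has the same length, the start address of any slice follows from its index alone, which is the property the paragraph above relies on.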
In certain embodiments, to increase the output throughput of the 2-D IDWT process, successive 1-D IDWT processes are overlapped during their two-cycle processing time, which almost doubles the overall speed and efficiency.
Also described below is a three-level, slice-based 2-D IDWT implementation method and input architecture, in which input data generated in real time are fed directly from the VLD-IQ-IDCP outputs using the disclosed parallel processing architecture. This method and input architecture replace the feed through storage buffers found in most conventional designs, which requires more storage and delay.
In conventional VC-2 decoder designs, a large amount of inter-process buffering is typically needed. To reduce buffer usage, specific embodiments combine four processes (VLD, IQ, IDCP, and IDWT) into one processing time slot so that they share only one set of communication buffers. In addition, the scheduling scheme and a faster processor design make the one-time-slot design possible. Also described below is a way to reduce the required inter-module buffers by at least 50% compared with conventional VC-2 decoder designs, which is a very significant improvement.
According to certain embodiments, the elimination of the external memory and of the three-band internal static memory reduces the IDWT storage to what is needed for a data block of 3 × 3 (that is, 9) slices that includes the slice currently being decoded. A further improvement of the decoding architecture is based on the VC-2 property that not all pixels in these 9 slices, or their delayed versions, are used for decoding, so their storage can be eliminated; this reduces storage to 3.3 slices. In addition, an addressing scheme according to an embodiment can be used to achieve at least a 50% reduction in register storage.
Certain embodiments also exploit the IDWT spatial scaling property, in which the second level is scaled down by 2 from the first level and the third level is scaled down by 2 from the second level. As a result, one fewer bit can be used to store second-level pixels, and two fewer bits to store third-level pixels. Specifically, this can be used to save about a further 10% of total buffer usage.
Brief description of the drawings
Fig. 1 shows an HD camera and SD-SDI transmission system that incorporates a VC-2 encoder and a VC-2 decoder.
Fig. 2A shows the main processing modules (VLD, IQ/IDCP, and IDWT) of an embodiment of a VC-2 decoder.
Fig. 2B shows the buffer size required to bridge the SD and HD formats for real-time HD display.
Fig. 2C shows the processing elements associated with the 2-D inverse discrete wavelet transform (IDWT) for the luminance component (Y).
Fig. 2D shows a 3-level DWT process that decomposes a source slice into 10 bands.
Fig. 2E shows a 3-level IDWT (inverse DWT) process that synthesizes 10 bands into a source slice.
Fig. 3A shows a 1-D (5,3) synthesis filter process that uses one pixel from the previous slice and two pixels from the next slice to fully synthesize and decode the 1 × 8 pixels of the current slice.
Fig. 3B shows the two steps of the 1-D (5,3) IDWT process introduced with reference to Fig. 3A, where the two steps require 2 cycles to complete.
Fig. 4 shows a slice-based first-level 2-D (5,3) synthesis filter process that uses data generated in real time from the top-band, current-band, and bottom-band slices to fully synthesize and decode the current 2 × 4 luminance (Y) pixels.
Fig. 5 shows a slice-based second-level 2-D (5,3) synthesis filter process that uses data generated in real time from the top-band, current-band, and bottom-band slices to fully synthesize and decode the current 4 × 8 luminance (Y) pixels.
Fig. 6 shows a slice-based third-level 2-D (5,3) synthesis filter process that uses data generated in real time from the top-band, current-band, and bottom-band slices to fully synthesize and decode the current 8 × 16 luminance (Y) pixels.
Fig. 7 shows a slice-based first-level 2-D (5,3) synthesis filter process that uses data generated in real time from the top-band, current-band, and bottom-band slices to fully synthesize and decode the current 2 × 2 chrominance (Cb or Cr) pixels.
Fig. 8 shows a slice-based second-level 2-D (5,3) synthesis filter process that uses data generated in real time from the top-band, current-band, and bottom-band slices to fully synthesize and decode the current 4 × 4 chrominance (Cb or Cr) pixels.
Fig. 9 shows a slice-based third-level 2-D (5,3) synthesis filter process that uses data generated in real time from the top-band, current-band, and bottom-band slices to fully synthesize and decode the current 8 × 8 chrominance (Cb or Cr) pixels.
Figure 10 shows a time-overlapped, pipelined 2-D IDWT synthesis filter design with very high throughput.
Figure 11A shows a three-band parallel processing VLD-IQ/IDCP-IDWT VC-2 decoder architecture.
Figure 11B shows a slice bytes equalizer design that simplifies input FIFO read addressing.
Figure 12A shows a four-process-in-one-time-slot (VLD-IQ/IDCP-IDWT) architecture that saves at least 50% of the inter-process interface buffers.
Figure 12B shows the three-band parallel processing architecture using the four-process-in-one-time-slot technique.
Figure 13A shows a nine-slice storage reduction technique that saves 63.3% of the IDWT process data buffers.
Figure 13B shows a nine-slice data dependency chart for processing all three levels of the 2-D IDWT.
Figure 14 shows an IDWT resolution reduction method that saves more than 10.5% of the IDWT process data buffers.
Detailed description
Fig. 1 is a high-level block diagram of an embodiment of an SDI transmission system 100 that incorporates an HD H.264 encoder 180. The system may be implemented, for example, inside a digital video recorder (DVR) for security surveillance applications. Referring to Fig. 1, the SDI transmission system 100 is shown as including an HD camera 110 coupled to the HD H.264 encoder 180, with various intermediate blocks and a transmission cable in between.
Conventionally, the HD camera 110 would be connected to the HD H.264 encoder 180 via an HD-SDI link at a rate of 1.4875 Gbps. Such a high-speed transmission link is limited to a short distance of about 90 meters over 3C-2V coaxial cable. For security surveillance applications, however, a longer distance is preferable. One way to extend the transmission distance over the same coaxial cable to about 160 meters is to use the lower-rate SD-SDI at 270 Mbps. To reduce the bit stream rate from the high definition (HD) rate of 1.4875 Gbps to the standard definition (SD) rate of 270 Mbps, video compression is applied to the HD video source. More specifically, in the SDI transmission system 100 shown in Fig. 1, video compression is achieved using a VC-2 HD-to-SD encoder 120 (which can also be referred to as a Dirac encoder). In the implementation shown, an input HD source image of 1920 × 1080 × 2 bytes is compressed into an output SD source image of 1440 × 486 × 1 bytes, which achieves a compression ratio of about 6/1. The compressed bit stream from the VC-2 HD-to-SD encoder 120 is fed to a payload formatter 130 to produce a CCIR-656 format video stream with 10-bit parallel data clocked at 27 MHz. The SD-SDI transmitter 140 converts the 10-bit parallel data at 27 MHz into 1-bit serial data clocked at 270 Mbps. According to an embodiment, the HD camera 110, the VC-2 encoder 120, the payload formatter 130, and the SD-SDI transmitter 140 are components of the camera side of the system. Although shown as separate blocks, the payload formatter 130 can be implemented as part of the VC-2 encoder 120. Additionally, it is noted that blocks 120, 130, and 140 can be collectively referred to as an SDI transmitter device, which can be referred to as an SDI transmitter chip when implemented in a single chip.
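The figures quoted above can be cross-checked with a few lines of Python. This is only arithmetic on the numbers stated in the text; the variable names are illustrative.

```python
hd_bytes = 1920 * 1080 * 2      # HD source image size stated in the text
sd_bytes = 1440 * 486 * 1       # SD output image size stated in the text
compression_ratio = hd_bytes / sd_bytes

parallel_bits = 10              # CCIR-656 parallel word width
parallel_clock_hz = 27_000_000  # 27 MHz parallel clock
serial_rate_bps = parallel_bits * parallel_clock_hz

print(f"compression ratio is about {compression_ratio:.2f}:1")
print(f"serial rate is {serial_rate_bps / 1e6:.0f} Mbps")
```

The ratio comes out slightly below 6, consistent with "about 6/1", and ten bits per 27 MHz clock gives the 270 Mbps serial rate.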
The lower-rate SDI data are transmitted over the coaxial transmission cable 145 to the receiving side of the system, which includes the HD H.264 encoder 180. More specifically, the SD-SDI receiver 150 first receives the 1-bit serial data at 270 MHz and converts it into a 10-bit parallel CCIR-656 format data stream at 27 MHz. The CCIR-656 format stream passes through the payload extraction module 160, which extracts the VC-2 compressed bit stream and stores it in the input buffer of the VC-2 SD-to-HD decoder 170. Although shown as separate blocks, the payload extraction module 160 can be implemented as part of the VC-2 decoder 170. At the VC-2 decoder 170 (which can also be referred to as a Dirac decoder), the compressed bit stream packed into the SD size of 1440 × 486 × 1 bytes is decoded into the HD size of 1920 × 1080 × 2 bytes. The reconstructed HD video data are visually lossless compared with the raw data directly from the HD camera 110, and are formatted into the HD BT-1120 format to be fed to the HD H.264 encoder 180. At the H.264 encoder 180, multiple HD scenes from various locations can be monitored in real time, and these HD scenes are also simultaneously compressed and stored for future reference. Note that blocks 150, 160, and 170 can be collectively referred to as an SDI receiver device, which can be referred to as an SDI receiver chip when implemented in a single chip. When the SDI transmitter device/chip and the SDI receiver device/chip are used to extend the distance over which an HD video stream can be transmitted, these devices can also be referred to as an HD-SDI extender transmitter and an HD-SDI extender receiver, respectively.
As can be appreciated from the above description, a benefit of using HD-SDI or SD-SDI is that HD cameras (e.g., 110) located at various surveillance locations in a security surveillance system can be connected to a centralized control point where the HD H.264 encoder (e.g., 180) inside a DVR is located. Note that a higher-grade cable can also be used to extend the camera-to-control-point distance. For example, the distance can be extended from 90 meters to 200 meters using RG59 coaxial cable, and further extended to 300 meters using RG9-grade coaxial cable. In practice, however, the transmission cable is often pre-installed, and the cost of a higher-grade cable plus its installation far exceeds the cost of adding a simple VC-2 encoder and SD-SDI transmitter at the camera side and an SD-SDI receiver and VC-2 decoder at the H.264 encoder side. This is why the VC-2 encoder and VC-2 decoder pair is economical to deploy in HD surveillance systems for the security market.
In addition, since the VC-2 decoder (e.g., 170) is located at a central control point where four or more HD channels are usually packed together with one HD H.264 encoder (e.g., 180), packing four VC-2 decoder channels into one chip is compatible with existing installations. Certain embodiments described herein focus on a simplified design of the VC-2 decoder (e.g., 170), which is the most challenging design among all the components of the described SDI transmission system 100.
Fig. 2A is a block diagram showing the main processing modules (VLD, IQ/IDCP, and IDWT) of a potential implementation of the VC-2 decoder 170. Referring to Fig. 2A, the VC-2 decoder 170 is shown as including a compressed bit stream buffer memory 210, a variable length decoder (VLD) module 220, an IQ/IDCP module 230, a 3-band buffer memory 240, an IDWT module 250, a scan conversion buffer memory 260, and an HD video output module 270. In this configuration, the 3-band internal buffer memory 240 is used in place of an external memory. However, this 3-band internal buffer memory 240 can be significantly larger than desired.
In Fig. 2A, the IDWT module 250 is the main processing module, and the VLD module 220 and the IQ/IDCP module 230 process and generate its input data. Referring to Fig. 2A, at the input of the VC-2 decoder 170, a CCIR-656 bit stream containing the VC-2 compressed bit stream in its active region is extracted and placed in the input buffer memory 210. In a real-time video decoder design, it is important to allow a continuous video display pipe at the HD video output module 270. Because the SD input CCIR-656 format and the HD output BT-1120 format have different active and blanking regions, the buffer 210 stores the input compressed data so that a task scheduler can start the decoding and video output processes at a later time, ensuring that the display pipe is not interrupted once video output begins. Conventionally, selecting a suitable size for the input buffer 210 so that the scheduler can be easily designed involves a trial-and-error process of examining and simulating the input and output data pipes for the various video formats that need to be bridged together.
To ease the simulation and experimental effort, according to specific embodiments, a duty cycle (DUCY) can be defined as follows:
DUCY = (active region) / (total envelope)    Equation (1)
It can be shown that the minimum buffer size is as follows:
Input_buffer_size = (HD_DUCY - SD_DUCY) × SD_active_size    Equation (2)
where SD_active_size is the total active payload contained in one SD image.
Fig. 2 B, which has been shown as the input of decoding SD field, can continuously show that HD field exports to generate, and be based on equation (1) And (2), required SD_active_size is 1440 × 243=349,920 bytes, and minimum input_buffer_size is 11,652 bytes.In order to which SD picture frame input conversion to HD picture frame is inputted, due to the double increase of SD_active_size, Minimum input_buffer_size is also double to increase to 23,304 bytes.Also that is, (HD_DUCY-SD_DUCY) × SD_ Active_size × 2=(0.9608-0.9275) × 349,920 × 2=23,304 bytes.Once it is slow to determined minimum input Device size is rushed, then task scheduler timing can be easy to be designed to reach this limit, while maintain seamless (also that is, continuous) and not The video signal of interruption exports display tube.
Referring again to Fig. 2A, in the VC-2 Level 65 standard, the VLD module 220 is an interleaved exponential-Golomb (IEG) decoder. To meet the timing budget required for real-time video, a one-symbol-per-cycle algorithm is used to decode the variable length code at up to N bits of data per cycle, where N is the number of bits representing the sign and magnitude of the longest code word generated by the IEG encoder. This number is usually limited by the number of bits needed to represent the lowest few frequency components of the DWT process.
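For readers unfamiliar with interleaved exponential-Golomb codes, the following Python sketch decodes signed values one bit at a time. It follows the commonly published description of the code (a "0" follow bit announces one more data bit, a "1" terminates the code word, and nonzero values carry a trailing sign bit). It is a bit-serial software illustration only, not the one-symbol-per-cycle hardware algorithm mentioned above, and the function names are assumptions.

```python
def read_uint(bits):
    """Decode one unsigned interleaved exp-Golomb value from a bit iterator."""
    value = 1
    while next(bits) == 0:               # 0 means one more data bit follows
        value = 2 * value + next(bits)   # append the data bit
    return value - 1                     # a 1 bit ends the code word


def read_sint(bits):
    """Decode one signed value: magnitude, then a sign bit if nonzero."""
    magnitude = read_uint(bits)
    if magnitude != 0 and next(bits) == 1:
        return -magnitude
    return magnitude


def decode_all(bitstring, count):
    """Decode `count` signed values from a string of '0'/'1' characters."""
    bits = iter(int(b) for b in bitstring)
    return [read_sint(bits) for _ in range(count)]


symbols = decode_all("1" + "0010" + "0011" + "0110", 4)
```

In this sketch a lone "1" bit decodes to zero, so a run of "1" bits decodes as a run of zero-valued symbols.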
In Fig. 2A, the VLD module 220 decompresses the compressed variable length bit stream into "symbols" that represent the sign and magnitude of DWT pixels. The VLD module 220 outputs VLD-decoded symbols based on the VC-2 compressed bit stream it receives from the buffer memory 210. The VLD-decoded symbols are fed to an inverse quantization (IQ) module to reconstruct their original higher-frequency AC values, and then to an inverse DC prediction (IDCP) module to reconstruct their DC values. In other words, the IQ module restores the original magnitude of the DWT-processed source symbols, and the IDCP module restores the "DC" values, which represent the per-slice average of the DWT-processed source symbols. Although the IQ and IDCP modules are shown together as the IQ/IDCP module 230 in Fig. 2A, these modules can optionally be separated.
Referring again to Fig. 2A, the output of the IQ/IDCP module 230 is shown as being provided to the 3-band buffer memory 240, whose output is provided to the inverse wavelet transform (IDWT) module 250. The IDWT module 250 synthesizes decoded pixel values from the data symbols that have undergone 10-band DWT processing (that is, 10-band decomposition). The output of the IDWT module 250 is provided to the scan conversion buffer memory 260, whose output is provided to the HD video output module 270.
In VC-2 Level 65, only one quantization index is used for all DC and AC frequency components of the DWT. To emphasize lower-frequency components, they are scaled up by 2 after each level of DWT. On the decoder side, the lower-frequency components must be scaled down by 2 to reconstruct their original values. As shown in Figure 14 and described below with reference to Figure 14, this level-scaling property allows fewer bits to be used in IDWT storage, which reduces hardware cost. Overall decoder hardware complexity is dominated by the processing requirements of the IDWT, which are discussed below with reference to Fig. 3.
Fig. 2 C shows the processing component 280 of the 2-D IDWT for luminance component (Y).Most basic unit (being designated as 282) is Vertically upper 8 line multiplies the segments of horizontal upper 16 pixels to size.It can show, in order to obtain the final result of current clip, need to come From the data of all eight neighbours, this is expanded to 3 × 3 segment data blocks 284 for size being 24 lines and 48 pixels The one segment data dependence of (it is also referred to as 9 segment data blocks).Since data are reached in real time by line scanning sequence, In order to obtain the required data for all segments in online span, three frequency bands (being designated as 286) altogether are needed, meaning can need 24 lines are wanted to multiply 1920 pixels to be stored in internal storage 240.Note that frequency band size and line width are (also that is, 1920 pictures Element) and the number of color component (also that is, Y and Cb/Cr) it is proportional.It is to fill four channels to a decoder core in hope In the case where in piece, the required size of three band buffer memories 240 will add up to 737,280 bytes.In order to avoid using The internal storage of this enormous amount, certain specific examples of the following description of the present invention provide decoder architecture more efficiently. As used herein, the term, the segment for being also known as data slot is the data processing unit of the IDWT based on segment. The frequency band for being also known as frequency ranges of data includes 8 lines × 16 pixels, 120 segments, and to be stored (with real-time video signal) with reality The block of the data of the processing of existing segment.(5,3) IDWT filter that the use being described herein overlaps synthesizes decoded In the particular embodiment of the invention of pixel value, concurrently three frequency ranges of data are handled simultaneously so as to describe above with reference to Fig. 
2A Three band buffer memories 240 can be eliminated.
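The 737,280-byte figure can be reproduced with a short calculation. The sketch below is illustrative only: the text gives the band geometry and the four-channel count, while the 2 bytes per stored sample and the 2 sample planes per pixel (Y plus interleaved Cb/Cr) are assumptions chosen because they reproduce the stated total.

```python
# Sketch: size of the three-band buffer for a four-channel decoder chip.
# Assumed (not stated in the text): 2 bytes per stored sample, and
# 2 sample planes per pixel position (Y plus interleaved Cb/Cr).
LINES_PER_BAND = 8      # a slice is 8 lines tall
BANDS = 3               # top, current and bottom band
LINE_WIDTH = 1920       # pixels per line
PLANES = 2              # assumption: Y and Cb/Cr
BYTES_PER_SAMPLE = 2    # assumption: 16-bit storage per sample
CHANNELS = 4            # decoder cores per chip


def three_band_buffer_bytes() -> int:
    """Return the total buffer size in bytes under the assumptions above."""
    samples = LINES_PER_BAND * BANDS * LINE_WIDTH * PLANES
    return samples * BYTES_PER_SAMPLE * CHANNELS


print(three_band_buffer_bytes())
```

Under these assumptions the result matches the total quoted above, which suggests how strongly the memory cost scales with line width and channel count.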
For a better understanding of the 3-level IDWT process, Fig. 2D shows how the 3-level DWT in the VC-2 Level 65 encoding process decomposes one source slice into 10 frequency components (also called sub-bands). First, the 8-line × 16-pixel source slice 2310 passes through a horizontal analysis filter and is decomposed into a horizontal low band L3 2312 and a horizontal high band H3 2314. L3 and H3 then pass through a vertical analysis filter and are decomposed into four level-3 sub-bands: LL3 2316, LH3 2318, HL3 2320 and HH3 2322. This completes the level-3 DWT, producing four band components of 4 lines × 8 pixels each. The three high-band components LH3, HL3 and HH3 have finished their DWT processing and are ready for the subsequent quantization. The low-frequency component LL3 then passes through similar level-2 horizontal and vertical analysis DWT filters to produce four level-2 frequency components: LL2 2328, LH2 2330, HL2 2332 and HH2 2334. Each level-2 component measures 2 lines × 4 pixels. The three higher-band components LH2, HL2 and HH2 have finished their level-2 DWT processing and are ready for quantization. The level-2 low band LL2 2328 then passes through similar level-1 horizontal and vertical analysis DWT filters to produce four level-1 frequency components: LL0 2340, LH1 2342, HL1 2344 and HH1 2346. Each level-1 component measures 1 line × 2 pixels. The LL0 component undergoes DC prediction, and all four bands LL0, LH1, HL1 and HH1 undergo quantization. The result of the 10-band decomposition, DC prediction and quantization then undergoes variable-length encoding (VLE) to fit the desired payload size.
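The sub-band sizes listed above follow from halving both dimensions at each level. A minimal sketch, using the sub-band names from the text (the function name and tuple layout are illustrative choices, not part of the patent):

```python
# Sketch: enumerate the 10 sub-bands of a 3-level dyadic DWT of one slice.
def subbands(lines=8, pixels=16, levels=3):
    """Return a list of (name, lines, pixels), highest level first."""
    out = []
    h, w = lines, pixels
    for level in range(levels, 0, -1):
        h, w = h // 2, w // 2            # each level halves both dimensions
        for name in ("LH", "HL", "HH"):  # three high sub-bands per level
            out.append((f"{name}{level}", h, w))
    out.append(("LL0", h, w))            # the remaining low-low (DC) band
    return out


for name, h, w in subbands():
    print(f"{name}: {h} line(s) x {w} pixel(s)")
```

The sample counts of the ten sub-bands add up to the 128 samples of the 8 × 16 source slice, so the decomposition neither adds nor drops data.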
Fig. 2 E shows anti-DWT(IDWT) program, 10 band components generated by DWT are synthesized into original source segment. IDWT since the 1st grade synthesis, wherein four the 1st grade of band component LL02440, LH12442, HL12444 and HH12446 are first Experience vertical filtering and then horizontal filtering are to synthesize the 2nd grade of low-low band component of LL22428.In these two filtering journeys During sequence, the pixel in these 4 band components is filtered staggeredly and then first to generate 2 lines × 4 pixel LL2 points of gained Amount.2nd grade of IDWT program then interlock four the 2nd grade of components (also that is, LL22428, LH22430, HL22432 and HH22434), And vertical filtering is then carried out, horizontal filtering is carried out, then to synthesize 4 lines × 8 pixel LL3 components 2416.Then, 3rd level IDWT program is interlocked four 3rd level components (also that is, LL32416, LH32418, HL32420 and HH32422), and is then hung down Straight filtering, carries out horizontal filtering, then to synthesize original source segment 2410.In the subsequent present invention, 3rd level IDWT program displays Special implementation technology based on the above basic IDWT rule.
The LeGall (5,3) synthesis filter used in the VC-2 Level 65 low-delay profile is defined by the following two steps:

Step 1: $A_{2n} \leftarrow A_{2n} - \left(A_{2n-1} + A_{2n+1} + 2\right)/4$   equation (3)

Step 2: $A_{2n+1} \leftarrow A_{2n+1} + \left(A_{2n} + A_{2n+2} + 1\right)/2$   equation (4)

In shorthand, step 1 subtracts $(A_{2n-1} + A_{2n+1} + 2)/4$ from $A_{2n}$, and step 2 adds $(A_{2n} + A_{2n+2} + 1)/2$ to $A_{2n+1}$. In these equations, $A$ denotes a pixel data value in the IDWT domain, where each such value comprises, for example, 14 to 16 bits.
For a slice of size 16 × 8, boundary conditions must be handled properly. In the VC-2 standard, pixels outside the picture edge that the 2-D IDWT needs are supplied by boundary extension at both the encoder and the decoder (meaning each is assigned the same value as the nearest boundary pixel of the same band), so that different decoder implementations produce consistent decoded results and the decoded picture has boundaries that look as smooth as those of the original source video.
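The two lifting steps and the boundary extension can be sketched in a few lines. This is a minimal illustration, not the patented hardware: it assumes integer samples, reads the divisions in equations (3) and (4) as arithmetic right shifts, and models boundary extension by clamping an out-of-range neighbor index to the nearest sample of the same parity. The matching analysis function is included only to check that synthesis inverts it exactly.

```python
# Sketch of 1-D LeGall (5,3) lifting with same-parity edge extension.
# Assumptions: integer samples, even-length input, "/4" and "/2" taken
# as arithmetic right shifts (floor division).
def _clamp(i, lo, hi):
    return max(lo, min(i, hi))


def legall53_synthesize(coeffs):
    """Apply equations (3) and (4) to interleaved low/high coefficients."""
    a = list(coeffs)
    n = len(a)
    # Step 1, equation (3): even samples from their odd neighbors.
    for i in range(0, n, 2):
        left = a[_clamp(i - 1, 1, n - 1)]
        right = a[_clamp(i + 1, 1, n - 1)]
        a[i] -= (left + right + 2) >> 2
    # Step 2, equation (4): odd samples from the updated even neighbors.
    for i in range(1, n, 2):
        left = a[_clamp(i - 1, 0, n - 2)]
        right = a[_clamp(i + 1, 0, n - 2)]
        a[i] += (left + right + 1) >> 1
    return a


def legall53_analyze(samples):
    """Forward transform: the two lifting steps undone in reverse order."""
    a = list(samples)
    n = len(a)
    for i in range(1, n, 2):
        left = a[_clamp(i - 1, 0, n - 2)]
        right = a[_clamp(i + 1, 0, n - 2)]
        a[i] -= (left + right + 1) >> 1
    for i in range(0, n, 2):
        left = a[_clamp(i - 1, 1, n - 1)]
        right = a[_clamp(i + 1, 1, n - 1)]
        a[i] += (left + right + 2) >> 2
    return a


pixels = [10, 12, 15, 11, 9, 8, 20, 25]
print(legall53_synthesize(legall53_analyze(pixels)) == pixels)
```

Because each lifting step only adds or subtracts a value computed from samples it does not modify, the round trip is exact in integer arithmetic regardless of the rounding used.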
Fig. 3 A is to show 1-D(5,3) composite filter program 300, top using an adjacent pixel (with it is previous Segment is associated) and two adjacent pixels (associated with next segment) are used in bottom, current slice is decoded with abundant synthesis 1 × 8 pixel of section.A referring to Fig. 3, the region for being designated as 310 correspond to previous fragment, and the region for being designated as 320 corresponds to current slice Section, and be designated as 330 region correspond to next segment.In addition, the rectangle in region 320 and 330 indicates even-numbered Data, and triangle indicates the data of odd-numbered, and limitary region 340 indicates previous fragment.For step 1, it is based on The A of odd-numbered2n-1And A2n+1Input calculate even-numbered A2n.Therefore, it is necessary to a top data A-1A for calculating0。 For step 2, the A based on even-numbered2nAnd A2n+2Input calculate odd number A2n+1.Need A8To calculate A7, but A8It is also required to From the A in step 19It calculates.Therefore in order to calculate the segment in Fig. 3, three additional datas beyond segment boundaries: A are needed-1、 A8And A9, for handling LeGall(5.3) and composite filter.Here, it could be assumed that, need one above segment to add Data and two additional datas below segment.In real-time video signal operation, data holding continuously arrives.Without using attached In the case where adding memory, this regular keyholed back plate needs to store the view to be 8 lines × 16 pixels fragment computations IDWT for size Interrogate the number of line.
The 1-D(5 that Fig. 3 B is introduced to show A referring to Fig. 3,3) two steps of IDWT program, two of them step need 2 circulations are to complete.After 2 circulations, the pixel 0-7 on the right is result.In more specific words it, Fig. 3 B further show for side The time dependence of the implementation of formula (3) and (4).For step 1, in T=1, the A based on odd-numbered2n-1And A2n+1Input Calculate the A of even-numbered2n.Therefore, it is necessary to a top data A-1A for calculating0.For example, it in time T=1, is based on Pixel value 7t, 0 and 1 of time T=0 generate pixel 0 using equation (3);Based on the generation of pixel value 1,2 and 3 in time T=0 Pixel 2;And pixel 0b is generated based on the pixel value 7,0b and 1b in time T=0.This program also generate time T=2 pixel 0, 2,4 and 6 final result.Note that even if pixel 0b is not needed for the final IDWT in time T=2 as a result, pixel 0b is still passed through It generates for being used in following steps 2.
For the step 2 at T=2, the A based on even-numbered2nAnd A2n+2Input calculate odd-numbered A2n+1.It needs Want A8(pixel 0b) is to calculate A7, but in step 1, it is also desirable to from A9(pixel 1b) calculates A8.For example, based in time T =1 pixel value 0,1 and 2 generates pixel 1 using equation (4);Pixel value 2,3 and 4 based on time T=1 generates pixel 3;And 6,7 and 0b of pixel value based on time T=1 generates pixel 7.This program generates the most termination in the pixel 1,3,5 and 7 of time T=2 Fruit, and carry out 2 step 1-D IDWT programs.This program usually requires two frequency cycles to complete.
Fig. 4 shows the slice-based first-level 2-D (5,3) synthesis filter process 400, which uses data generated in real time from the top, current and bottom slices to fully synthesize and decode the current 2 × 4 luminance (Y) pixels. Process 400 is the 2-D extension of the 1-D process described above. First, an array of size 7 × 5 (as shown) is formed in real time from the inverse-quantized transform data. In Fig. 4, each square of the 7 × 5 array corresponds to one pixel, for example a 16-bit pixel value. The 7 × 5 pixels (that is, (4+1+2) × (2+1+2) pixels) are what is needed to synthesize the 4 × 2 low-low band of the level-1 IDWT. In Fig. 4, the rows are labeled 410 to 430 and the columns 440 to 470. The index at the top-left corner of each small square gives its original coordinates in the y and x directions relative to the top-left corner (0,0) of the current slice. For example, the top row 410 is built from data located on the 6th line upward in the top band, the current rows 415 and 420 are built from the 0th and 1st lines of the current band, and the bottom two rows 425 and 430 are built from data located on the 8th and 9th lines below the origin, in the bottom band. Similarly, column 440 is built from the 12th column to the left, in the left slice; columns 445, 450, 455 and 460 are built from the 0th, 2nd, 1st and 3rd columns of the current slice; and columns 465 and 470 are built from the 16th and 18th columns, in the right slice. In short, data from all eight neighbors of the current slice are needed for the 2-D synthesis of the low-low level-1 (LL1) band, as the data construction shows.
According to one embodiment, the 9-slice data block is supplied directly by real-time VLD-IQ/IDCP processing with three sets of 2-slice buffers, without passing through external memory or internal memory (for example, 240 in Fig. 2A). This slice-based IDWT therefore offers an advantage over conventional designs that use external and/or internal memory. The same input mechanism can be used for the other IDWT levels and for the chroma-component IDWT described below, and it is not restated there.
Referring again to Fig. 4, to synthesize the 2-D LL1 result, a first 1-D synthesis filtering pass is carried out in the vertical direction on the seven columns 440, 445, 450, 455, 460, 465 and 470. This is followed by a 1-D synthesis filtering pass in the horizontal direction on the three rows 415, 420 and 425. In one embodiment, only the inner 5 × 3 portion of the array (marked with a thick outline) is retained, so rows 410 and 430 do not need to be filtered. The 5 × 3 sub-array is the level-1 synthesis result; it is scaled down by 2 and applied to the level-2 IDWT, as described below with reference to Fig. 5.
Fig. 5 shows the slice-based second-level (that is, level-2) 2-D (5,3) IDWT process 500 for the luminance component Y, which uses data generated in real time from the current slice and its eight neighbors (that is, data from all 3 × 3 slices) to fully synthesize the current 4 × 8 luminance (Y) pixels. First, an array of size 11 × 7 is formed from the inverse-quantized transform data by combining the level-2 DWT data, namely the high-low (HL2), low-high (LH2) and high-high (HH2) bands of the level-2 DWT from the current slice and its eight neighbors, generated directly by real-time processing without storage. In Fig. 5, the rows are labeled 510 to 540 and the columns 545 to 595.
After the array has been formed, the level-1 synthesis results obtained in the previous figure (marked "C1", "CR1", "B1" and "BR1" in rows 515, 525 and 535) fill the remaining low-low (LL1) array elements, as shown. As before, the required data are generated and applied directly in real time, without storage.
The same 1-D synthesis (vertical, then horizontal) is then executed in sequence to produce the level-2 result. Finally, the inner 9 × 5 portion is retained as the level-2 synthesis result. This 9 × 5 sub-array, named the low-low-2 band (LL2), is scaled down by 2 and applied to the level-3 IDWT, as described with reference to Fig. 6.
Fig. 6 shows level 3 of the 2-D (5,3) IDWT process 600, the final stage for the luminance component (Y) in this embodiment. First, an array of size 19 × 11 is constructed from the inverse-quantized transform data supplied by IQ/IDCP, generated in real time without storage. The 9 × 5 results obtained from the level-2 process above (marked "C2", "CR2", "B2" and "BR2" in rows 612, 616, 620, 624 and 628) are then applied in real time to fill the remaining low-low band (LL2) array elements, as shown in the figure. In Fig. 6, the rows are labeled 610 to 630 and the columns 632 to 668.
The same 1-D synthesis (vertical, then horizontal) is then executed in sequence to produce the level-3 result. Finally, the inner 16 × 8 portion is kept as the level-3 synthesis result. This output then undergoes three magnitude adjustments (a signed scale-down by 2, magnitude clipping and magnitude offset) to bring it into the data range suited to BT-1120 standard output. This completes the 2-D IDWT process for the Y component.
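The array sizes quoted for the three levels (7 × 5, 11 × 7 and 19 × 11 in, 5 × 3, 9 × 5 and 16 × 8 kept) follow from the one-above, two-below dependency of the (5,3) filter applied level by level. The sketch below derives them for one axis at a time. It is an illustration of that reasoning rather than anything stated in the text, and the function name is hypothetical.

```python
# Sketch: per-axis input-array length and retained length at each level.
def level_extents(kept, levels=3):
    """Return [(array_len, kept_len)] for level 1 .. `levels` on one axis.

    `kept` is the number of output samples A[0] .. A[kept-1] required at
    the final level (16 horizontally, 8 vertically for a luma slice).
    """
    out = []
    for _ in range(levels):
        last = kept - 1
        # An odd last sample needs even A[last+1], which needs A[last+2];
        # an even last sample only needs odd A[last+1].
        top = last + 2 if last % 2 else last + 1
        out.append((top + 2, kept))   # inputs span A[-1] .. A[top]
        kept = top // 2 + 1           # low-band inputs A[0], A[2], ...
    return out[::-1]                  # report level 1 first


for axis, size in (("horizontal", 16), ("vertical", 8)):
    print(axis, level_extents(size))
```

Calling `level_extents(8)` for both axes gives the corresponding sizes for an 8 × 8 chroma slice.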
Fig. 7 shows the slice-based first-level 2-D (5,3) IDWT process 700, which uses data generated in real time from the current slice and its eight neighbors to fully synthesize the current 2 × 2 chroma (Cb/Cr) pixels. In other words, Fig. 7 shows the first level of the 2-D (5,3) IDWT 700 for the chroma component Cb or Cr. In Fig. 7, the rows are labeled 710 to 730 and the columns 735 to 755. First, an array of size 5 × 5, as shown, is formed from data generated in real time by VLD-IQ/IDCP, without static memory storage. The index at the top left of each small square gives its original coordinates in the y and x directions relative to the top-left corner (0,0) of the current slice. For example, the top row 710 is built from data located on the 6th line upward in the top band, the current rows 715 and 720 are built from the 0th and 1st lines of the current band, and the bottom two rows 725 and 730 are built from data located on the 8th and 9th lines below the origin, in the bottom band. Similarly, column 735 is built from the 6th column to the left, in the left slice; columns 740 and 745 are built from the 0th and 2nd columns of the current slice; and columns 750 and 755 are built from the 8th and 10th columns, in the right slice. In short, data from all eight neighbors of the current slice are generated by real-time processing to enable the 2-D synthesis of the low-low level-1 (LL1) band, as the data construction shows.
To synthesize the 2-D LL1 result, a first 1-D synthesis filtering pass is carried out in the vertical direction on the five columns 735, 740, 745, 750 and 755. This is followed by a 1-D synthesis filtering pass in the horizontal direction on the three rows 715, 720 and 725. Note that only the inner 3 × 3 portion of the array (marked with a thick outline) is retained, so rows 710 and 730 do not need to be filtered.
The 3 × 3 sub-array is the level-1 synthesis result; it is scaled down by 2 and applied to the level-2 IDWT, as described with the next figure.
Fig. 8 shows the slice-based second-level 2-D (5,3) IDWT process, which uses data generated in real time from the current slice and its eight neighbors to fully synthesize the current 4 × 4 chroma (Cb/Cr) pixels. In other words, Fig. 8 shows level 2 of the 2-D (5,3) IDWT process 800 for the chroma component Cb or Cr. First, an array of size 7 × 7 is formed from the inverse-quantized transform data by combining the level-2 DWT data, namely the high-low (HL2), low-high (LH2) and high-high (HH2) bands of the level-2 DWT from the current slice and its eight neighbors, supplied in real time by the VLD-IQ/IDCP process without memory storage. In Fig. 8, the rows are labeled 810 to 840 and the columns 845 to 875.
After the array has been formed, the level-1 synthesis results obtained in the previous figure (marked "C1", "CR1", "B1" and "BR1" in rows 815, 825 and 835) fill the remaining low-low (LL1) array elements, as shown.
The same 1-D synthesis (vertical, then horizontal) is then executed in sequence to produce the level-2 result. Finally, the inner 5 × 5 portion is retained as the level-2 synthesis result. This 5 × 5 sub-array, named the low-low-2 band (LL2), is scaled down by 2 and applied to the level-3 IDWT in the next figure.
Fig. 9 shows the slice-based third-level 2-D (5,3) IDWT process 900, which uses data generated in real time from the current slice and its eight neighbors to fully synthesize the current 8 × 8 chroma (Cb/Cr) pixels. In other words, Fig. 9 shows level 3 of the 2-D (5,3) IDWT process 900 as the final stage for the chroma component Cb or Cr. In Fig. 9, the rows are labeled 910 to 930 and the columns 932 to 952. First, an array of size 11 × 11 is constructed from the real-time VLD-IQ/IDCP process together with the 5 × 5 results obtained from the level-2 process described above, marked "C2", "CR2", "B2" and "BR2" in rows 912, 916, 920, 924 and 928.
The same 1-D synthesis (vertical, then horizontal) is then executed in sequence to produce the level-3 result. Finally, the inner 8 × 8 portion is kept as the level-3 synthesis result.
This output then undergoes three magnitude adjustments (a signed scale-down by 2, magnitude clipping and magnitude offset) to bring it into the data range suited to BT-1120 standard output. This completes the 2-D IDWT process for the Cb or Cr component.
According to a particular embodiment, the Cb and Cr components are interleaved horizontally across a line. This reduces the number of internal memory instances and lowers total cost. The Cb and Cr components undergo identical data processing.
A hardware implementation of a 2-D IDWT of size N columns by M rows is obtained by applying the 1-D IDWT repeatedly in both directions. First, each M × 1 column from column 1 to column N is synthesized vertically; then the resulting N × 1 horizontal vectors from row 2 to row (M-1) are synthesized horizontally to obtain the 2-D IDWT result.
Fig. 10 shows a time-overlapped 2-D IDWT synthesis filter design with very high throughput. More specifically, Fig. 10 shows an efficient pipelined 2-D IDWT design 1000 that squeezes the two cycles required by the two steps of equations (3) and (4) into substantially one cycle. The pipelined architecture overlaps each of the 1-D filter processes 1010, 1020, 1030, 1040 ... 1050 by one cycle, so that the second step of one filter executes in parallel with the first step of the next filter and the filter logic is busy essentially all the time. The result output from a 1-D filter in each cycle is the 2-step result of one pipelined 1-D filter process.
The average throughput per 1-D synthesis equals pipe_length/(pipe_length+1), which is very close to one 1-D filter result per cycle. In other words, the pipelined two-dimensional (2-D) IDWT synthesis filter is implemented with N overlapped one-dimensional (1-D) IDWT filters, where the pipeline length N is the number of 1-D IDWT filters executed consecutively to produce the 2-D IDWT result. Using N overlapped 1-D IDWT filters achieves an average throughput of N/(1+N) 1-D IDWT filter results per clock cycle. This greatly reduces the number of 1-D filter instances needed when the overall timing budget for the IDWT process is very tight. The cost of this architecture is that the intermediate results from step 1 must be stored, but that cost is much smaller than the cost of adding complete 1-D filters when the system requires double speed.
Referring again to Fig. 10, each separate filter produces a 1-D IDWT filtering result, also called a filter output. The result, or output, of 2-D IDWT filtering is obtained by performing 1-D IDWT filtering twice. First, filtering in the vertical direction produces the 1-D vertical synthesis result. Second, filtering in the horizontal direction produces the 1-D horizontal result. The second result (that is, the 1-D horizontal result) is the 2-D IDWT result. More specifically, to perform a 2-D IDWT on an 8 × 16 slice, sixteen (16) 8 × 1 vertical 1-D IDWTs are performed first, followed by eight (8) 1 × 16 horizontal IDWTs. The outputs of those eight (8) 1 × 16 horizontal IDWTs are the 2-D IDWT result for the 8 × 16 slice. Without the overlapped IDWT operation of Fig. 10, about 48 clock cycles (that is, 16 × 2 + 8 × 2 = 48) are needed to complete the (16 + 8) = 24 1-D IDWTs, and the average throughput is 24/48 = 0.5 filter results per cycle. By contrast, with the overlapped IDWT operation of Fig. 10, only about 26 clock cycles (that is, 16 + 1 + 8 + 1 = 26) are needed to complete the 24 1-D IDWTs, and the average throughput is 24/26 = 0.923 filter results per cycle. This saves about 22 clock cycles (that is, 48 - 26 = 22), which shortens processing time and allows fewer 1-D synthesis filter instances to meet the same required processing throughput.
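The cycle counts in this comparison can be reproduced with a small model. The sketch below assumes, as the text does, that each non-overlapped 1-D filter takes 2 cycles and that each overlapped pass of N filters takes N + 1 cycles; the function and variable names are illustrative.

```python
# Sketch: clock cycles for the vertical and horizontal passes of one slice.
def total_cycles(filters_per_pass, overlapped):
    """filters_per_pass: number of 1-D filters in each pass, e.g. (16, 8)."""
    if overlapped:
        # Step 2 of one filter runs alongside step 1 of the next one,
        # so a pass of n filters finishes in n + 1 cycles.
        return sum(n + 1 for n in filters_per_pass)
    return sum(2 * n for n in filters_per_pass)


passes = (16, 8)            # 16 vertical filters, then 8 horizontal filters
n_filters = sum(passes)
for overlapped in (False, True):
    cycles = total_cycles(passes, overlapped)
    print(overlapped, cycles, round(n_filters / cycles, 3))
```

For a single pass of length N, the overlapped model reduces to the N/(N+1) throughput quoted above.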
As shown in Fig. 2C and described above with reference to Fig. 2C, a large 3-band internal memory 240 is conventionally needed to support the real-time IDWT process described above. Certain embodiments of the invention described below reduce or completely eliminate this storage.
Fig. 11A shows a 3-band parallel-processing VLD-IQ/IDCP-IDWT VC-2 decoder architecture 1100 according to one embodiment of the invention. This 3-band parallel-processing decoder architecture 1100 processes and generates the required 3-band IDWT input data in real time, so that the 3-band internal memory 240 discussed above is eliminated completely. This is achieved at the cost of two additional sets of VLD and IQ/IDCP modules 1115, 1125, 1130 and 1140. Because the VLD and IQ/IDCP modules need relatively little logic, the gate count of the additional hardware is less than 2% of the gate count of the large memory it replaces. The parallel architecture operates on three bands processed concurrently. The top-band VLD 1115, IQ/IDCP 1130 and 2-slice delay 1145 generate the top three slices of IDWT input data in real time. As shown, the left, current and right slice data are fed into the IDWT processor 1160 simultaneously in real time. The current-band VLD 1120, IQ/IDCP 1135 and 2-slice delay 1150 generate the current three slices of IDWT input data in real time. As shown, the left, current and right slice data are fed into the IDWT processor simultaneously in real time. The bottom-band VLD 1125, IQ/IDCP 1140 and 2-slice delay 1155 generate the bottom three slices of IDWT input data in real time. As shown, the left, current and right slice data are fed into the IDWT processor 1160 simultaneously in real time. The 2-slice delay 1155 can be implemented as a buffer memory that stores the output data of the IQ/IDCP process.
The IDWT module 1160 receives the required 9-slice input data and decodes it to produce output data suitable for display via the video output (VO) FIFO 1180 and the BT-1120 generator 1170. Additional implementation details according to particular embodiments of the invention are described below.
Fig. 11B illustrates a slice-bytes equalizer design 1105 that simplifies read addressing of the input FIFO. More specifically, Fig. 11B shows a first-stage slice_bytes equalizer 1105 that equalizes the compressed data length of each input slice. In VC-2, the interleaved exponential-Golomb (IEG) encoding of each 8-line by 16-pixel slice is constrained to an integer number of bytes called "slice_bytes". Because of the nature of variable-length coding, slice_bytes usually varies from one slice to the next. To make it easier to synchronize a decoder with the encoded bit stream, slices are grouped so that the sequence of slice_bytes values follows a regular periodic pattern. The slice_bytes group 1190 for a 1080i29.97 system has a complicated periodic pattern of 17 elements: "42, 43, 43, 43, 43, 43, 43, 43, 42, 43, 43, 43, 43, 43, 43, 43, 43". This means that the 1st and 9th slices in each 17-slice group are coded in 42 bytes and all others in 43 bytes. In a band 1192 of an HD picture, there are 1920/16 = 120 slices per band. The new parallel architecture needs to access three bands of variable-length data 1192, 1193 and 1194 that are 120 slices apart. Because 120 is not a simple multiple of 17, this makes input data access difficult. To ease this problem, the slice-bytes equalizer 1105 exploits a property of IEG coding: "1" bits at the end of a slice decode to "zero" values that are ignored. Whenever a short slice (in this embodiment, a 42-byte slice) is encountered in the input bit stream, the equalizer therefore inserts eight "1" bits at the end of the slice, so that all slices have equal slice_bytes, as shown at 1195. In this particular embodiment, every slice is 43 bytes.
After equalization, every slice contains the same number of compressed data bytes, and so does every band. The equalized slice_bytes turns the variable-length compressed data into fixed-length compressed data and makes the read addresses of the top, current and bottom bands, which are one band apart from each other, easy to calculate. For this particular embodiment, the cost of the equalizer is about 0.27% more storage space in the input FIFO 1110.
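The equalizer's effect on addressing and storage can be sketched as follows. The 17-element pattern, the 43-byte target and the 120 slices per band come from the text; the function names and the representation of a slice as a Python `bytes` object are illustrative assumptions, and the real design operates on a hardware bit stream rather than on byte strings.

```python
# Sketch: pad short slices with eight "1" bits so every slice is 43 bytes.
PATTERN = [42, 43, 43, 43, 43, 43, 43, 43, 42] + [43] * 8  # 17 slice sizes
TARGET = 43
SLICES_PER_BAND = 120


def equalize(slice_data: bytes, target: int = TARGET) -> bytes:
    """Append 0xFF bytes (all "1" bits) until the slice is `target` long."""
    if len(slice_data) > target:
        raise ValueError("slice longer than the equalized size")
    return slice_data + b"\xff" * (target - len(slice_data))


def band_read_offset(band_index: int) -> int:
    """Byte offset of a band once every slice has the same length."""
    return band_index * SLICES_PER_BAND * TARGET


def padding_overhead(pattern=PATTERN, target=TARGET) -> float:
    """Extra storage as a fraction of the original compressed size."""
    return (target * len(pattern) - sum(pattern)) / sum(pattern)


print(band_read_offset(0), band_read_offset(1), band_read_offset(2))
print(f"{padding_overhead():.2%}")
```

With fixed-length slices the three read addresses differ by a constant stride, which is the simplification the equalizer is meant to provide; the computed overhead is consistent with the roughly 0.27% quoted above.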
In Fig. 11A, the three VLD modules 1115, 1120 and 1125 and the three IQ/IDCP modules 1130, 1135 and 1140 decode three adjacent bands simultaneously to provide 3 × 1 slices of input data (the right-hand column), which is one third of the input data needed for the IDWT of the current slice. The three 2-slice delay modules 1145, 1150 and 1155 store and provide the previously decoded data that make up the other two thirds (the center and left-hand columns) of the 3 × 3 slices needed for the IDWT of the current slice. This parallel architecture 1100 therefore produces the required 3 × 3 slice data for any slice in the decoded picture and completely eliminates the 3-band internal memory (for example, 240 in Fig. 2A) needed for conventional one-band processing.
Fig. 12A explains the time dependency of an efficient four-processes-in-one-time-slot scheme 1200 according to a particular embodiment, which reduces by at least about 50% the amount of inter-process interface buffering (used to store the results of each of the four processes associated with each of the three parallel data paths). More specifically, the part of Fig. 12A labeled 1202 illustrates the four-processes-in-one-time-slot scheme (VLD, IQ, IDCP and IDWT), which can save at least 50% of the inter-process interface buffers in a VC-2 decoder. For comparison, a conventional decoder design 1201 is also shown, which uses a conventional 3-stage pipeline to relax the speed requirement on each functional module. Advantageously, with the four-processes-in-one-time-slot scheme 1202, all intermediate data between the four processes are fully consumed within each slice processing time, so no additional copy of these intermediate results needs to be stored for later use by the next-stage process, which eliminates 50% of the inter-process buffers. The cost of scheme 1202 is that the processing speed required of the four processes almost doubles. However, because random logic is much smaller than large buffer storage, total hardware cost is lower than with the conventional scheme 1201.
In Fig. 11A and Fig. 12A, IQ and IDCP processing are assumed to be performed together. There are therefore essentially three main processes: VLD, IQ/IDCP and the 3-level IDWT. When the three main processes are divided into three separate processing time slots, each process has a whole time slot in which to finish processing a slice, so the processing speed requirement is relaxed. According to particular embodiments of the invention, there is a separate instance of the four-processes-in-one-time-slot architecture 1202 (shown in Fig. 12A) for each of the three parallel data band paths shown in Fig. 12B (and also in Fig. 11A). In other words, there is a four-processes-in-one-time-slot scheme 1202 marked VLDt and IQ/IDCPt, where "t" denotes the top band, corresponding to the top-band data path 12800; a four-processes-in-one-time-slot scheme 1202 marked VLDc and IQ/IDCPc, where "c" denotes the current band, corresponding to the current-band data path 12820; and a four-processes-in-one-time-slot scheme 1202 marked VLDb and IQ/IDCPb, where "b" denotes the bottom band, corresponding to the bottom-band data path 12840. More generally, Fig. 12A and Fig. 12B respectively illustrate that particular embodiments of the invention exploit the time dependency and the geometric dependency associated with processing the top, current and bottom bands. Note also that each of these data paths may alternatively be called a decoder path, because the data paths are within a VC-2 decoder.
In a VC-2 decoder design, one potential problem with processing three separate data bands is that a large number of inter-process buffers may be needed. For example, so that the current data slice can be processed while the results for the previous slice are being used by the next process, each functional module may need to keep two sets of buffers operating in ping-pong fashion. This would lead to the use of (4 × 2 × 128 × 14 × 3) = 43,008 buffers, which is a large number. To save this large amount of inter-module communication buffering, the four-processes-in-one-time-slot scheme 1202 is used. Because VC-2 processes lower-band data and higher-band data sequentially, it is not necessary to wait for all high-band data from the VLD to become available before the next IQ/IDCP pipe starts. Before the next process can start its pipe, only its small 1/16 portion (the low-low band) has to be finished. Then, as the remaining 3/16 and the final 3/4 of high-band data are processed, the corresponding IQ/IDCP and IDWT processes can start. With this overlapped start scheduling, the IDWT process obtains more than half of its original time budget under the parallel pipelined architecture 1202. To reach the double speed needed for the IDWT, two small 1-D IDWT processes can be executed in parallel.
Fig. 12B, which as mentioned above is a redrawn version of Fig. 11A, shows a VC-2 decoder 12700 having parallel VLD-IQ/IDCP modules with 2-slice delays in a three-band architecture. Referring to Fig. 12B, the decoder 12700 includes a slice-bytes equalizer 12720, a FIFO 12740, parallel data paths 12800, 12820 and 12840, a 3-level IDWT module 12860, a BT-1120 generator 12880 and a VO FIFO 12900. Data path 12800 generates the top-band real-time data for the top-right slice. Through the 2-slice delay, the required top-band 3-slice data are sent to the 3-level IDWT process described with Fig. 4 to Fig. 9. Data path 12820 generates the current-band real-time data for the current right slice. Through the 2-slice delay, the required current-band 3-slice data are sent to the 3-level IDWT process described with Fig. 4 to Fig. 9. Data path 12840 generates the bottom-band real-time data for the bottom-right slice. Through the 2-slice delay, the required bottom-band 3-slice data are sent to the 3-level IDWT process described with Fig. 4 to Fig. 9. According to certain embodiments, this direct real-time data transfer is carried out without any internal memory buffer, which is an advantage of these embodiments.
More generally, Figure 12B illustrates three parallel data paths, including a top-band data path 12800, a current-band data path 12820 and a bottom-band data path 12840. The top-band data path 12800 performs variable-length decoding (VLD), inverse quantization (IQ) and inverse DC prediction (IDCP) processing of a top compressed data band. The current-band data path 12820 performs VLD, IQ and IDCP processing of a current compressed data band. The bottom-band data path 12840 performs VLD, IQ and IDCP processing of a bottom compressed data band. Figure 12B also shows a three-level inverse discrete wavelet transform (IDWT) module 12860 that performs IDWT processing to synthesize decoded pixel values in dependence on the partially decompressed top, current and bottom data bands produced using the three parallel data paths. Each of the three parallel data paths 12800, 12820 and 12840 also performs a 2-segment delay, which is used to separate its respective partially decompressed data band resulting from the VLD, IQ and IDCP processing into left, current and right data segments that are provided to the three-level IDWT module 12860. The VO FIFO 12900, also referred to as the scan conversion FIFO, converts the segment-based output of the three-level IDWT module 12860 to a line-scan-based video output that is provided to the BT-1120 generator 12880. The BT-1120 generator 12880 inserts horizontal and vertical blanking periods into the data received from the scan conversion FIFO 12900 to thereby output a video signal having the BT-1120 format.
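The role of the 2-segment delay in each data path can be pictured with a small Python sketch. It is a software analogy only, not the hardware described here: the class and function names are hypothetical, segments are represented by arbitrary objects, and picture-edge handling is ignored. The newest segment leaving VLD/IQ/IDCP is treated as the right segment, and the two held segments serve as the current and left segments.

```python
from collections import deque


class TwoSegmentDelay:
    """Holds the two most recent segments of one band (hypothetical helper)."""

    def __init__(self):
        # Oldest held segment first; a third append drops the oldest automatically.
        self._held = deque(maxlen=2)

    def push(self, right_segment):
        """Accept the newest (right) segment.

        Returns a (left, current, right) tuple once two earlier segments
        are held, otherwise None.
        """
        window = None
        if len(self._held) == 2:
            left, current = self._held
            window = (left, current, right_segment)
        self._held.append(right_segment)
        return window


def band_windows(segments):
    """Return every (left, current, right) window for one band's segment stream."""
    delay = TwoSegmentDelay()
    windows = []
    for segment in segments:
        window = delay.push(segment)
        if window is not None:
            windows.append(window)
    return windows


windows = band_windows(["s0", "s1", "s2", "s3"])
print(windows)
```

Running one such delay per band (top, current, bottom) yields the three rows of a 3 × 3 segment neighborhood that the three-level IDWT module consumes.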
Figure 12B also shows the segment byte equalizer 12720, which is equivalent to the segment byte equalizer 1105 in Figure 11A, and the bit-stream FIFO 12740, which is equivalent to the FIFO 1110 in Figure 11A. The segment byte equalizer 12720 is used to equalize the number of bytes in each data segment, and thereby to equalize the number of bytes in each of the three compressed data bands (the top compressed data band, the current compressed data band and the bottom compressed data band) before the three compressed data bands are provided to the three parallel data paths. The bit-stream FIFO 12740 buffers the compressed data received from the segment byte equalizer 12720 to accommodate real-time processing of an input video format provided to the input of the bit-stream FIFO 12740 and an output video format output at the output of the decoder 12700. According to particular embodiments, equations (1) and (2) discussed above are used to determine the size of the bit-stream FIFO 12740. The size of the bit-stream FIFO 12740 therefore depends on the difference in active-region duty cycle between the input video format and the output video format, and on the total active picture size of the input video format. This enables the bit-stream FIFO 12740 to provide seamless and non-interrupted display operation of the output video format.
Referring again to Figure 12B, according to particular embodiments, the three-level IDWT module 12860 includes a pipelined two-dimensional (2-D) IDWT synthesis filter that is implemented using N overlapped one-dimensional (1-D) IDWT filters, where N is the number of 1-D IDWT filters that are executed consecutively to produce a 2-D IDWT result, as explained above with reference to Figure 10. Use of the N overlapped 1-D IDWT filters achieves an average throughput of N/(1+N) 1-D IDWT filter results per clock cycle, as also described above with reference to Figure 10.
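The N/(1+N) throughput expression can be evaluated for a few values of N as a quick illustration. The helper below is a hypothetical sketch that only evaluates the stated formula; it does not model the filter pipeline itself.

```python
from fractions import Fraction


def average_throughput(n):
    """Return N/(1+N), the stated average number of 1-D IDWT results per clock cycle."""
    if n < 1:
        raise ValueError("N must be a positive number of overlapped 1-D filters")
    return Fraction(n, n + 1)


for n in (1, 2, 4):
    # Equivalent reading: N results are produced every N + 1 clock cycles.
    print(n, average_throughput(n))
```

The value approaches one result per clock cycle as N grows, which is the benefit of overlapping the 1-D filters.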
Figure 13A shows another technique 1300, which reduces the buffering needed for the 3 × 3 segments. More specifically, Figure 13A illustrates a 9-segment storage reduction technique 1300 that saves 63.3% of the IDWT process data buffer registers. This technique is based on a VC-2 property: not all of the pixels in the 3 × 3 neighborhood block are used to evaluate the current segment. Note that in Figure 11A, the combined VLD/IQ/IDCP module 1130 contains the top-right segment, 1135 contains the middle-right segment, and 1140 contains the bottom-right segment. The 2-segment delay module 1145 contains the top-center and top-left segments. The 2-segment delay module 1150 contains the current-center and current-left segments. The 2-segment delay module 1155 contains the bottom-center and bottom-left segments.
Figure 13B shows a 9-segment data block dependency chart 1395 for processing all 3 levels of the IDWT. For example, to process level 1 (L1), only one pixel from the top-right segment 0 is needed. Similarly, L2 needs only one pixel, and L3 also needs only one pixel. Altogether, only 3 pixels need to be stored for the top-right segment, as shown in segment 0 storage 1330, as opposed to the full 128 pixels. Based on this chart 1395, the top-center segment 1 (1320) needs only 28 buffers; the top-left segment 2 (1310) needs only 28 buffers; the current-right segment 3 (1360) needs only 14 buffers; the current-center segment 4 (1350) needs all 128 pixels; the current-left segment 5 (1340) also needs all 128 pixels; the bottom-right segment 6 (1390) needs only 6 pixels; the bottom-center segment 7 (1380) needs only 44 pixels; and the bottom-left segment 8 (1370) needs only 44 pixels. An address-specific scheme is designed so that the specific data required is delivered to each level of the IDWT process in real time.
In this embodiment, the storage requirement is reduced from 9 segment data blocks to only about 3.3 segment blocks. Compared with the prior-art approach of fully populating (4 × 2 × 9 × 128 × 14) or 129,024 buffers, this method uses only (4 × 2 × 423 × 14) or 47,376 buffers. The advantage is a saving of 63.3% of the buffers.
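The figures in this comparison follow directly from the per-segment pixel counts read from dependency chart 1395, and can be reproduced with the short sketch below. The dictionary keys and variable names are hypothetical labels; the common factor 4 × 2 × 14 is taken as given from the two products quoted in the text.

```python
# Pixels that must be kept for each of the 9 segments, as listed for chart 1395.
PIXELS_NEEDED = {
    "top-right (0)": 3,
    "top-center (1)": 28,
    "top-left (2)": 28,
    "current-right (3)": 14,
    "current-center (4)": 128,
    "current-left (5)": 128,
    "bottom-right (6)": 6,
    "bottom-center (7)": 44,
    "bottom-left (8)": 44,
}
FULL_SEGMENT_PIXELS = 128
COMMON_FACTOR = 4 * 2 * 14  # factor shared by both products in the text

total_pixels = sum(PIXELS_NEEDED.values())
reduced_buffers = COMMON_FACTOR * total_pixels
full_buffers = COMMON_FACTOR * len(PIXELS_NEEDED) * FULL_SEGMENT_PIXELS
saving = 1 - reduced_buffers / full_buffers
equivalent_segments = total_pixels / FULL_SEGMENT_PIXELS

print(total_pixels, reduced_buffers, full_buffers)
print(f"saving: {saving:.1%}, equivalent segments: {equivalent_segments:.1f}")
```

The nine counts sum to 423 pixels, which is the source of both the 47,376-buffer figure and the "about 3.3 segment blocks" description.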
Figure 14 shows an IDWT resolution reduction technique 1400 used to squeeze out the last redundancy in the IDWT implementation. More specifically, Figure 14 shows an IDWT resolution reduction method 1400 that saves more than 10.5% of the IDWT process data buffer registers. This technique is based on a VC-2 property: only one quantization index (qindex) is used for the entire segment, which the DWT process of the encoder decomposes into ten bands. To use only one qindex while emphasizing the lower bands and de-emphasizing the higher bands, VC-2 applies a scale-up by a factor of 2 after each level of the DWT on the encoder side. On the decoder side, there is a corresponding inverse scale-down by 2 from L1 to L2 and again from L2 to L3. This means that the resolution needed to compute the L2-IDWT in 1420, 1430 and 1440 is one bit less than the resolution needed to compute the L1-IDWT 1410, and the resolution needed to compute the L3-IDWT in 1450, 1460 and 1470 is one further bit less than that of the L2-IDWT. Since L3 contains 75% of the pixels, or 96 of the 128 pixels, and L2 contains 18.75% of the pixels, or 24 of the 128 pixels, most of the remaining segment storage registers can use 2 fewer bits or 1 fewer bit. In this embodiment, this saves another 10.5% of the overall remaining registers. It also reduces the data depth of the level-2 and level-3 arithmetic by the same number of bits, and therefore results in slightly faster IDWT processing speed. Accordingly, the three-level IDWT module (labeled 1160 in Figure 11A and 12860 in Figure 12B) can be configured to process one fewer bit when performing the level-2 IDWT process than when performing the level-1 IDWT process, and two fewer bits when performing the level-3 IDWT process than when performing the level-1 IDWT process. This enables approximately 10% fewer process interface buffers to be used than would be needed if the three-level IDWT module processed the same number of bits when performing each of the level-1, level-2 and level-3 IDWT processes, and it likewise allows lower-resolution and slightly faster arithmetic logic.
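The size of this saving can be estimated with a simplified model, offered here only as a sketch. The per-level pixel counts for L2 and L3 come from the text; the L1 count of 8 is inferred as the remainder of 128, the baseline bit width is not stated in this passage and is left as a parameter, and the model applies the reduction to one full 128-pixel segment rather than to the reduced 423-pixel storage.

```python
# Pixels per IDWT level in one 128-pixel segment.
# L2 and L3 are from the text; L1 is inferred as the remainder (assumption).
PIXELS_PER_LEVEL = {1: 8, 2: 24, 3: 96}
# Bits dropped relative to level 1, as described in the text.
BITS_DROPPED = {1: 0, 2: 1, 3: 2}


def storage_saving(base_bits):
    """Fraction of storage bits saved versus storing every pixel at base_bits."""
    total_pixels = sum(PIXELS_PER_LEVEL.values())
    saved_bits = sum(PIXELS_PER_LEVEL[level] * BITS_DROPPED[level]
                     for level in PIXELS_PER_LEVEL)
    return saved_bits / (total_pixels * base_bits)


for base_bits in (14, 16):  # candidate baselines; neither is confirmed by the text
    print(base_bits, f"{storage_saving(base_bits):.1%}")
```

Under this model a 16-bit baseline gives roughly 10.5% and a 14-bit baseline roughly 12.1%, both in line with the statement that the method saves more than 10.5%.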
The various modules and blocks described above can be implemented using hardware, firmware, software and/or combinations thereof, as will be appreciated by those of ordinary skill in the art after reading this disclosure. Such hardware can be implemented, for example, using one or more processors, field programmable gate arrays (FPGAs) and/or application specific integrated circuits (ASICs), but is not limited thereto.
Although particular embodiments have been illustrated and described herein, those of ordinary skill in the art will appreciate that any arrangement calculated to achieve the same purpose may be substituted for the particular embodiments shown. It is therefore manifestly intended that the present invention be limited only by the claims and their equivalents.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the invention.
The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (22)

1. A decoder, characterized in that it comprises:
three parallel data paths including a top-band data path, a current-band data path and a bottom-band data path,
the top-band data path performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of a top compressed data band;
the current-band data path performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of a current compressed data band; and
the bottom-band data path performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of a bottom compressed data band; and
a three-level inverse discrete wavelet transform module that performs three-level inverse discrete wavelet transform processing to synthesize decoded pixel values in dependence on the partially decompressed top, current and bottom data bands produced using the three parallel data paths.
2. The decoder according to claim 1, characterized in that it further comprises:
a segment byte equalizer that equalizes a number of compressed bytes in each data segment being decoded by the decoder, and thereby equalizes a number of compressed bytes in each of three compressed data bands, including the top compressed data band, the current compressed data band and the bottom compressed data band, before the three compressed data bands are provided to the three parallel data paths.
3. The decoder according to claim 2, characterized in that it further comprises:
a bit-stream first-in-first-out (FIFO) that buffers compressed data received from the segment byte equalizer;
wherein the compressed data received by the bit-stream FIFO is used to produce the top compressed data band, the current compressed data band and the bottom compressed data band that are provided to the three parallel data paths;
wherein the bit-stream FIFO accommodates real-time processing of an input video format provided to an input of the bit-stream FIFO and an output video format output at an output of the decoder, the output video format being different from the input video format;
wherein the size of the bit-stream FIFO depends on a difference in active-region duty cycle between the input video format and the output video format, and on a total active picture size of the input video format; and
wherein the bit-stream FIFO having said size enables a seamless and non-interrupted display operation of the output video format.
4. The decoder according to claim 1, characterized in that the 2-segment delay is used to separate the partially decompressed data band resulting from the variable-length decoding processing, the inverse quantization processing and the inverse DC prediction processing into left, current and right data segments that are provided to the three-level inverse discrete wavelet transform module.
5. The decoder according to claim 1, characterized in that it further comprises a scan conversion FIFO that converts a segment-based output of the three-level inverse discrete wavelet transform module to a line-scan-based video output.
6. The decoder according to claim 5, characterized in that it further comprises a module that inserts horizontal blanking periods and vertical blanking periods into the data received from the scan conversion FIFO to thereby output a video signal having a specified format.
7. The decoder according to claim 1, characterized in that the three-level inverse discrete wavelet transform module includes a pipelined two-dimensional inverse discrete wavelet transform synthesis filter that is implemented using N overlapped one-dimensional inverse discrete wavelet transform filters, where N is a number of one-dimensional inverse discrete wavelet transform filters that are executed consecutively to produce a two-dimensional inverse discrete wavelet transform result.
8. The decoder according to claim 7, characterized in that use of the N overlapped one-dimensional inverse discrete wavelet transform filters achieves an average throughput of N/(1+N) one-dimensional inverse discrete wavelet transform filter results per clock cycle.
9. The decoder according to claim 7, characterized in that:
process interface buffers are used to store results of each of four processes associated with each of the three parallel data paths, the four processes including the variable-length decoding processing, the inverse quantization processing, the inverse DC prediction processing and the three-level inverse discrete wavelet transform processing; and
a four-processes-in-one-time-slot scheme is used to enable at least 50% fewer process interface buffers to be used than would be needed if the four processes were performed in four separate time slots in a pipelined operation.
10. The decoder according to claim 1, characterized in that:
process interface buffers are used to store results of each of four processes associated with each of the three parallel data paths, the four processes including the variable-length decoding processing, the inverse quantization processing, the inverse DC prediction processing and the three-level inverse discrete wavelet transform processing; and
the three parallel data paths, and the data dependencies that exist among the different segments of each 3 × 3 data segment unit, are used to enable one-third of each 3 × 3 data segment unit to be stored in the process interface buffers at any point in time.
11. The decoder according to claim 1, characterized in that:
process interface buffers are used to store results of each of a level-1 inverse discrete wavelet transform processing, a level-2 inverse discrete wavelet transform processing and a level-3 inverse discrete wavelet transform processing performed by the three-level inverse discrete wavelet transform module; and
the three-level inverse discrete wavelet transform module processes one fewer bit when performing the level-2 inverse discrete wavelet transform processing than when performing the level-1 inverse discrete wavelet transform processing, and processes two fewer bits when performing the level-3 inverse discrete wavelet transform processing than when performing the level-1 inverse discrete wavelet transform processing, to enable 10% fewer process interface buffers to be used than would be needed if the three-level inverse discrete wavelet transform module processed the same number of bits when performing each of the level-1, level-2 and level-3 inverse discrete wavelet transform processing.
12. A method for use in decoding data, characterized in that it comprises:
(a) performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of a top compressed data band;
(b) performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of a current compressed data band; and
(c) performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of a bottom compressed data band;
wherein steps (a), (b) and (c) are performed in parallel; and further comprising
(d) performing three-level inverse discrete wavelet transform processing to synthesize decoded pixel values in dependence on the partially decompressed top, current and bottom data bands produced by steps (a), (b) and (c).
13. The method according to claim 12, characterized in that it further comprises:
before the processing of three compressed data bands, including the top compressed data band, the current compressed data band and the bottom compressed data band, is performed in parallel at steps (a), (b) and (c), equalizing a number of bytes in each data segment of each of the three compressed data bands.
14. The method according to claim 12, characterized in that the 2-segment delay is used to separate the partially decompressed data band resulting from the variable-length decoding processing, the inverse quantization processing and the inverse DC prediction processing into left, current and right data segments that are used to perform the three-level inverse discrete wavelet transform processing at step (d).
15. The method according to claim 12, characterized in that it further comprises:
(e) converting a segment-based result of step (d) to a line-scan-based video output.
16. The method according to claim 15, characterized in that it further comprises:
(f) inserting horizontal blanking periods and vertical blanking periods into the data resulting from the conversion performed at step (e) to thereby produce a video signal having a specified format.
17. The method according to claim 12, characterized in that the three-level inverse discrete wavelet transform processing performed at step (d) is implemented using N overlapped one-dimensional inverse discrete wavelet transform filters, where N is a number of one-dimensional inverse discrete wavelet transform filters that are executed consecutively to produce a two-dimensional inverse discrete wavelet transform result.
18. The method according to claim 17, characterized in that use of the N overlapped one-dimensional inverse discrete wavelet transform filters achieves an average throughput of N/(1+N) one-dimensional inverse discrete wavelet transform filter results per clock cycle.
19. The method according to claim 17, characterized in that it further comprises:
storing, in process interface buffers, results of each of four processes including the variable-length decoding processing, the inverse quantization processing, the inverse DC prediction processing and the three-level inverse discrete wavelet transform processing; and
using a four-processes-in-one-time-slot scheme to reduce, by at least 50%, the quantity of process interface buffers that would be needed if the four processes were performed in four separate time slots.
20. The method according to claim 12, characterized in that steps (a), (b) and (c) are performed in parallel, and the data dependencies that exist among the different segments of each 3 × 3 data segment unit are used, to enable one-third of each 3 × 3 data segment unit to be stored at any point in time while steps (a), (b) and (c) are being performed.
21. The method according to claim 12, characterized in that:
the three-level inverse discrete wavelet transform processing performed at step (d) includes a level-1 inverse discrete wavelet transform processing, a level-2 inverse discrete wavelet transform processing and a level-3 inverse discrete wavelet transform processing; and
when the three-level inverse discrete wavelet transform processing is performed at step (d), one fewer bit is processed when performing the level-2 inverse discrete wavelet transform processing than when performing the level-1 inverse discrete wavelet transform processing, and two fewer bits are processed when performing the level-3 inverse discrete wavelet transform processing than when performing the level-1 inverse discrete wavelet transform processing.
22. A decoder, characterized in that it comprises:
a segment byte equalizer that equalizes a number of compressed bytes in each data segment being decoded by the decoder, and thereby equalizes a number of compressed bytes in each of three compressed data bands including a top compressed data band, a current compressed data band and a bottom compressed data band;
three parallel data paths including a top-band data path, a current-band data path and a bottom-band data path,
the top-band data path performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of the top compressed data band;
the current-band data path performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of the current compressed data band; and
the bottom-band data path performing variable-length decoding, inverse quantization and inverse DC prediction processing, and a 2-segment delay, of the bottom compressed data band; and
a three-level inverse discrete wavelet transform module that performs inverse discrete wavelet transform processing using the partially decompressed top, current and bottom data bands produced by the three parallel data paths;
wherein the three-level inverse discrete wavelet transform module includes a pipelined two-dimensional inverse discrete wavelet transform synthesis filter that is implemented using a plurality of overlapped one-dimensional inverse discrete wavelet transform filters.
CN201410098981.8A 2013-03-15 2014-03-17 Video compression (VC-2) decoding using parallel decoding paths Active CN104053000B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361798790P 2013-03-15 2013-03-15
US61/798,790 2013-03-15
US13/851,821 US9241163B2 (en) 2013-03-15 2013-03-27 VC-2 decoding using parallel decoding paths
US13/851,821 2013-03-27

Publications (2)

Publication Number Publication Date
CN104053000A CN104053000A (en) 2014-09-17
CN104053000B true CN104053000B (en) 2018-12-25

Family

ID=51505314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410098981.8A Active CN104053000B (en) 2013-03-15 2014-03-17 Video compression (VC-2) decoding using parallel decoding paths

Country Status (1)

Country Link
CN (1) CN104053000B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1729690A (en) * 2002-11-13 2006-02-01 索尼电子有限公司 Method of real time MPEG-4 texture decoding for a multiprocessor environment
CN101335888A (en) * 2007-06-27 2008-12-31 中国科学院微电子研究所 Standard interframe prediction pixel generation device for digital audio and video coding and decoding technology
CN102547291A (en) * 2012-02-08 2012-07-04 中国电影科学技术研究所 Field programmable gate array (FPGA)-based joint photographic experts group (JPEG) 2000 image decoding device and method
CN102550029A (en) * 2010-07-30 2012-07-04 松下电器产业株式会社 Image decoding device, image decoding method, image encoding device, and image encoding method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6859563B2 (en) * 2001-03-30 2005-02-22 Ricoh Co., Ltd. Method and apparatus for decoding information using late contexts

Also Published As

Publication number Publication date
CN104053000A (en) 2014-09-17

Similar Documents

Publication Publication Date Title
US10051288B2 (en) Method and apparatus for compressing image data using a tree structure
JP4371120B2 (en) Image processing apparatus, image processing method, program, and recording medium
JP4360379B2 (en) Image processing apparatus, image processing method, program, and recording medium
CN101516031B (en) Image processing apparatus, image processing method
CN105120293A (en) Image cooperative decoding method and apparatus based on CPU and GPU
CN104982036A (en) Band separation filtering / inverse filtering for frame packing / unpacking higher-resolution chroma sampling formats
KR101710001B1 (en) Apparatus and Method for JPEG2000 Encoding/Decoding based on GPU
CN102263950A (en) Encoding device, encoding method, decoding device, and decoding method
CN106464887A (en) Image decoding method and device therefor, and image encoding method and device therefor
WO2015038156A1 (en) An efficient progressive jpeg decode method
CN105472442A (en) Out-chip buffer compression system for superhigh-definition frame rate up-conversion
US7676096B2 (en) Modular, low cost, memory efficient, input resolution independent, frame-synchronous, video compression system using multi stage wavelet analysis and temporal signature analysis with a highly optimized hardware implementation
US8213731B2 (en) Information processing device and method
CN105519108B (en) The weight predicting method and device of quantization matrix coding
CN104053000B (en) Video compression (VC-2) decoding using parallel decoding paths
KR102247196B1 (en) Vc-2 decoding using parallel decoding paths
CN102333222B (en) Two-dimensional discrete wavelet transform circuit and image compression method using same
CN105007444B (en) A kind of single pixel video display devices and display methods
US6125210A (en) Method and apparatus for two-dimensional wavelet recomposition
CN105359508A (en) Multi-level spatial-temporal resolution increase of video
JP5232182B2 (en) Screen division encoding apparatus and program thereof, and screen division decoding apparatus and program thereof
CN102204250A (en) Encoding method, encoding device, and encoding program for encoding interlaced image
KR20110071204A (en) Parallel processing method in wavelet-based jpeg2000
JP2014119949A (en) Super-resolution system and program
EP2254338B1 (en) Coding and decoding methods and devices, computer program and information carrier enabling the implementation of such methods

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant