JP4676888B2

JP4676888B2 - Data processing device

Info

Publication number: JP4676888B2
Application number: JP2006012621A
Authority: JP
Inventors: 将史星野; 政宏大橋
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2005-01-25
Filing date: 2006-01-20
Publication date: 2011-04-27
Anticipated expiration: 2026-01-20
Also published as: JP2006236325A

Description

本発明は、主にパイプライン処理を行うデータ処理装置において、消費電力を削減する技術に関するものである。 The present invention relates to a technique for reducing power consumption in a data processing apparatus that mainly performs pipeline processing.

画像処理や音声処理を行うデータ処理装置に対して、高速化の要求が年々高まっている。 The demand for higher speed is increasing year by year for data processing devices that perform image processing and audio processing.

複数の演算部が直列に接続されて並列処理を行うパイプライン処理装置は、処理の高速化を実現する。 A pipeline processing apparatus in which a plurality of arithmetic units are connected in series to perform parallel processing realizes high-speed processing.

従来の技術では、直列に接続された複数の演算部と、各演算部間を接続する複数のメモリで構成されたデータ処理装置が提案されている（例えば、特許文献１参照）。各メモリは、ダブルバッファで構成されている。 In the prior art, a data processing device has been proposed that includes a plurality of arithmetic units connected in series and a plurality of memories that connect the arithmetic units (for example, see Patent Document 1). Each memory is composed of a double buffer.

図１７は、従来の技術におけるデータ処理装置のブロック図である。データ処理装置２００は、４つの演算部２０１−２０４と、これら４つの演算部同士を接続する３つのメモリ２０５−２０７を備えている。演算部２０１〜２０４は、直列接続されており、入力されたデータに対して所定の演算を行う。メモリ２０５〜２０７は、ダブルバッファの構成を有しており、メモリの前段の演算部からの出力されるデータを記憶すると共に、後段の演算部へ記憶しているデータを出力する。 FIG. 17 is a block diagram of a conventional data processing apparatus. The data processing device 200 includes four arithmetic units 201 to 204 and three memories 205 to 207 that connect the four arithmetic units. The arithmetic units 201 to 204 are connected in series and perform a predetermined operation on the input data. The memories 205 to 207 have a double buffer configuration; each stores the data output from the arithmetic unit preceding it and outputs the stored data to the arithmetic unit following it.

例えば、メモリ２０５は、演算部２０１の出力データを記憶し、記憶しているデータを演算部２０２に出力する。 For example, the memory 205 stores the output data of the calculation unit 201 and outputs the stored data to the calculation unit 202.

演算部２０１に入力されたデータは、演算部２０１でまず演算され、次段の演算部２０２に演算結果が受け渡される。演算部２０２は、渡された演算結果に基づいて所定の演算をして、演算結果をメモリ２０６を介して、次段の演算部２０３に渡す。演算部２０２が演算を実行している期間に、演算部２０１は、次の入力データに対する演算を行う。すなわち、演算部２０１〜２０４は並列動作を行う。 The data input to the calculation unit 201 is first calculated by the calculation unit 201, and the calculation result is passed to the calculation unit 202 at the next stage. The calculation unit 202 performs a predetermined calculation based on the passed calculation result, and passes the calculation result to the next-stage calculation unit 203 via the memory 206. During the period when the calculation unit 202 is executing the calculation, the calculation unit 201 performs a calculation on the next input data. That is, the arithmetic units 201 to 204 perform a parallel operation.

このように、複数の演算部２０１〜２０４がメモリ２０５〜２０７を介して直列接続されることで、パイプライン処理が実現される。 Thus, pipeline processing is realized by connecting a plurality of arithmetic units 201 to 204 in series via the memories 205 to 207.
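
The conventional arrangement described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit of Patent Document 1: the class name `DoubleBuffer`, the function `run_pipeline`, and the four example stage functions are assumptions introduced here. Each memory has two banks; the upstream unit writes one bank while the downstream unit reads the other, and the banks swap roles at every cycle boundary.

```python
from typing import Callable, List, Optional


class DoubleBuffer:
    """Two-bank memory: one bank is written while the other is read."""

    def __init__(self) -> None:
        self.banks: List[Optional[int]] = [None, None]
        self.write_index = 0  # bank currently written by the upstream unit

    def write(self, value: Optional[int]) -> None:
        self.banks[self.write_index] = value

    def read(self) -> Optional[int]:
        # The downstream unit reads the bank that is not being written.
        return self.banks[1 - self.write_index]

    def swap(self) -> None:
        self.write_index = 1 - self.write_index


def run_pipeline(stages: List[Callable[[int], int]], inputs: List[int]) -> List[int]:
    """Feed one input per cycle; every unit works in every cycle (hypothetical model)."""
    memories = [DoubleBuffer() for _ in stages[:-1]]
    outputs: List[int] = []
    total_cycles = len(inputs) + len(stages) - 1
    for cycle in range(total_cycles):
        for index, stage in enumerate(stages):
            if index == 0:
                operand = inputs[cycle] if cycle < len(inputs) else None
            else:
                operand = memories[index - 1].read()
            result = stage(operand) if operand is not None else None
            if index < len(memories):
                memories[index].write(result)
            elif result is not None:
                outputs.append(result)
        # Banks exchange roles at the end of each cycle.
        for memory in memories:
            memory.swap()
    return outputs


# Four hypothetical stage functions standing in for arithmetic units 201 to 204.
example_stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
results = run_pipeline(example_stages, [1, 2, 3])
```

In this model every stage is evaluated in every cycle regardless of how often new data arrives, which is the behavior the following paragraphs identify as wasteful.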

しかしながら、従来の技術におけるデータ処理装置は、高い処理性能を必要としないアプリケーションに対応する場合であっても、同じ期間内に全ての演算部を用いるパイプライン処理を行う。データ処理装置は、最大の要求性能に対応したパイプライン演算を行う。 However, the data processing apparatus according to the conventional technique performs pipeline processing using all the arithmetic units within the same period even if it corresponds to an application that does not require high processing performance. The data processing apparatus performs a pipeline operation corresponding to the maximum required performance.

このため、アプリケーションの要求と乖離する不要な消費電力が発生する問題があった。結果として、消費電力の削減が不十分であった。特に、ピーク電力が大きくなる問題があった。 For this reason, there is a problem that unnecessary power consumption that deviates from the request of the application occurs. As a result, power consumption has been insufficiently reduced. In particular, there is a problem that peak power becomes large.

一方、画像処理や音声処理においては、同じ規格であっても、要求される画像サイズや音声サイズにより（例えば、ＣＩＦ画像とＱＣＩＦ画像）要求される処理性能が異なることが多い。 On the other hand, in image processing and audio processing, the required processing performance often differs depending on the required image size and audio size (for example, CIF image and QCIF image) even if the standards are the same.

すなわち、同一アプリケーションの処理において、要求される処理性能の変化に応じた演算処理ができなかった。結果として消費電力の無駄も発生していた。 That is, in the processing of the same application, the arithmetic processing according to the required change in processing performance could not be performed. As a result, power consumption was wasted.

特開平１０−３３４２２５号公報 特開平１１−１４９３７３号公報 特開平６−２９２１７８号公報 JP H10-334225 A; JP H11-149373 A; JP H06-292178 A

そこで本発明は、要求される処理性能の変化に応じた消費電力に対応するパイプライン処理を実現し、消費電力を削減できるデータ処理装置を提供することを目的とする。 An object of the present invention is to provide a data processing apparatus that realizes pipeline processing whose power consumption follows changes in the required processing performance, and that can thereby reduce power consumption.

第１の発明に係るデータ処理装置は、単位サイクル内に各々に割り当てられた演算を行うと共に各々が直列接続された複数の演算部と、複数の演算部の各々の間に接続された複数のメモリと、複数の演算部の内、ある単位サイクル内に各々に割り当てられた演算を行う演算部を選択する制御部を備える。 A data processing device according to a first invention includes: a plurality of calculation units that are connected in series and each perform an assigned calculation within a unit cycle; a plurality of memories, each connected between adjacent calculation units; and a control unit that selects, from among the plurality of calculation units, the calculation units that perform their assigned calculations within a given unit cycle.

この構成により、単位サイクル内に演算が不要な演算部を生じさせることができ、不要な消費電力が削減できる。 With this configuration, an operation unit that does not require an operation can be generated in a unit cycle, and unnecessary power consumption can be reduced.

第２の発明に係るデータ処理装置では、単位サイクルの時間は、複数の演算部の各々が行う演算において、最大の演算時間に基づいて定められる。 In the data processing device according to the second invention, the unit cycle time is determined based on the maximum calculation time in the calculation performed by each of the plurality of calculation units.

この構成により、単位サイクル内で動作する演算部を選択した場合でも、パイプライン処理が実現できる。 With this configuration, pipeline processing can be realized even when an arithmetic unit that operates within a unit cycle is selected.

第３の発明に係るデータ処理装置では、制御部は、複数の演算部の内、ある単位サイクル内に各々に割り当てられた演算を行う演算部の個数を決定する。 In the data processing device according to the third aspect of the invention, the control unit determines the number of operation units that perform the operation assigned to each unit within a certain unit cycle among the plurality of operation units.

この構成により、単位サイクル内に演算が不要な演算部を生じさせることができ、不要な消費電力が削減できる。特に、ある瞬間での消費電力であるピーク電力が削減できる。 With this configuration, an operation unit that does not require an operation can be generated in a unit cycle, and unnecessary power consumption can be reduced. In particular, peak power that is power consumption at a certain moment can be reduced.

第４の発明に係るデータ処理装置では、制御部は、複数の演算部の各々が行う演算に要する時間の合計に基づいて、複数の演算部の内、ある単位サイクル内に演算を行う演算部を選択する。 In the data processing device according to the fourth invention, the control unit selects, from among the plurality of calculation units, the calculation units that perform calculation within a given unit cycle, based on the total time required for the calculations performed by the plurality of calculation units.

この構成により、制御部は、アプリケーションの違いによる全体処理時間の要求を満足した上で、単位サイクル内で演算を実行する演算部を決定できる。 With this configuration, the control unit can determine a calculation unit that performs a calculation within a unit cycle after satisfying the request for the total processing time due to the difference in applications.

第５の発明に係るデータ処理装置では、制御部は、複数の演算部の初段の演算部に入力するデータの入力間隔に基づいて、複数の演算部の内、ある単位サイクル内に演算を行う演算部を決定する。 In the data processing device according to the fifth invention, the control unit determines, from among the plurality of calculation units, the calculation units that perform calculation within a given unit cycle, based on the input interval of the data input to the first-stage calculation unit.

この構成により、制御部は、アプリケーションの違いによる、入力データの入力間隔を満足した上で、単位サイクル内で演算を行う演算部を決定する。 With this configuration, the control unit determines an operation unit that performs an operation within a unit cycle while satisfying an input interval of input data due to a difference in application.

第６の発明に係るデータ処理装置では、制御部は、外部から通知される複数の演算部の各々が行う演算に要する時間の合計に関する許容時間情報に基づき、入力間隔を算出する。 In the data processing device according to the sixth aspect of the invention, the control unit calculates the input interval based on allowable time information related to the total time required for calculations performed by each of the plurality of calculation units notified from the outside.

この構成により、外部からの情報による要求を満足するように、制御部は単位サイクル内に演算を行う演算部を決定できる。 With this configuration, the control unit can determine a calculation unit that performs a calculation within a unit cycle so as to satisfy a request based on information from the outside.

第７の発明に係るデータ処理装置では、制御部は、複数の演算部の内、ある単位サイクル内に演算が不要な演算部を判定し、制御部は、判定された演算部への電力供給を遮断する第１処理、判定された演算部へのクロック信号の供給を停止する第２処理、判定された演算部へのクロック信号の周波数を低減する第３処理および判定された演算部の閾値電圧を増加する第４処理の内、少なくとも一つの処理を行う。 In the data processing device according to the seventh invention, the control unit determines which of the plurality of calculation units do not need to perform calculation within a given unit cycle, and performs at least one of the following on each such calculation unit: a first process of cutting off its power supply; a second process of stopping the supply of its clock signal; a third process of reducing the frequency of its clock signal; and a fourth process of increasing its threshold voltage.

この構成により、アプリケーションの要求を満足するデータ処理が実現されつつ、単位サイクル内で演算不要な演算部の消費電力を削減できる。結果として、データ処理装置の動作性能とデータ処理装置の消費電力の削減が両立される。 With this configuration, data processing that satisfies the application's requirements is achieved while the power consumption of calculation units that do not need to calculate within a unit cycle is reduced. As a result, the operating performance of the data processing device is maintained and its power consumption is reduced at the same time.
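
The four processes of the seventh invention can be represented as a small planning step. The sketch below is an assumption-laden illustration, not the control unit's actual logic: the names `PowerMeasure` and `plan_power_control` are hypothetical, and the function only records which measures would be applied to which unit in one unit cycle.

```python
from enum import Enum, auto
from typing import Dict, Iterable, Set


class PowerMeasure(Enum):
    CUT_POWER_SUPPLY = auto()         # first process
    STOP_CLOCK = auto()               # second process
    LOWER_CLOCK_FREQUENCY = auto()    # third process
    RAISE_THRESHOLD_VOLTAGE = auto()  # fourth process


def plan_power_control(
    all_units: Iterable[int],
    active_units: Iterable[int],
    measures: Iterable[PowerMeasure],
) -> Dict[int, Set[PowerMeasure]]:
    """Map each unit to the measures applied to it in one unit cycle.

    Units that calculate in the cycle get no measure; every other unit
    gets all of the selected measures (at least one must be selected).
    """
    chosen = set(measures)
    if not chosen:
        raise ValueError("at least one power-saving measure must be selected")
    active = set(active_units)
    return {unit: (set() if unit in active else set(chosen)) for unit in all_units}


# Hypothetical cycle in which only units 2 and 4 calculate.
plan = plan_power_control([2, 3, 4, 5], [2, 4], [PowerMeasure.STOP_CLOCK])
```

Here `plan` leaves units 2 and 4 untouched and assigns clock stopping to units 3 and 5.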

第８の発明に係るデータ処理装置では、複数のメモリのそれぞれは、演算部からのデータを記憶し、演算部の出力に接続される第１バンクと、演算部へ出力するデータを記憶し、演算部の入力に接続される第２バンクを有する。 In the data processing device according to the eighth invention, each of the plurality of memories has a first bank that is connected to the output of a calculation unit and stores data from that calculation unit, and a second bank that is connected to the input of a calculation unit and stores data to be output to that calculation unit.

第９の発明に係るデータ処理装置では、制御部は、第１バンクと第２バンクの内、ある単位サイクル内に、データ更新またはデータ出力が不要なバンクを判定し、制御部は、判定されたバンクへの電力供給を遮断する第１処理、判定されたバンクへのクロック信号の供給を停止する第２処理、判定されたバンクへのクロック信号の周波数を低減する第３処理および判定されたバンクへの閾値電圧を増加する第４処理の内、少なくとも一つの処理を行う。 In the data processing device according to the ninth aspect, the control unit determines a bank that does not require data update or data output within a certain unit cycle among the first bank and the second bank, and the control unit determines A first process for shutting off power supply to the bank, a second process for stopping the supply of the clock signal to the determined bank, a third process for reducing the frequency of the clock signal to the determined bank, and the determined At least one of the fourth processes for increasing the threshold voltage to the bank is performed.

これらの構成により、演算を行う演算部として決定された結果により生じるデータ更新もしくはデータ出力が不要なバンクの消費電力が削減できる。アプリケーションの要求を満足するデータ処理が実現されつつ、演算不要な演算部と加えて、データ処理装置の消費電力が効果的に削減できる。 With these configurations, it is possible to reduce the power consumption of a bank that does not require data update or data output caused by a result determined as a calculation unit that performs a calculation. Data processing that satisfies the application requirements can be realized, and in addition to a calculation unit that does not require calculation, power consumption of the data processing apparatus can be effectively reduced.

第１０の発明に係るデータ処理装置では、複数のメモリは、第１メモリと第２メモリを有し、複数の演算部は、第１メモリの前段に接続される演算部と後段に接続される演算部からなる第１演算部ペアと、第２メモリの前段に接続される演算部と後段に接続される演算部からなる第２演算部ペアを有し、第１演算部ペアと第２演算部ペアがある単位サイクル内で排他的に演算を行う場合には、第１演算部ペアと第２演算部ペアは、第１メモリおよび第２メモリの一方を共用する。 In the data processing device according to the tenth invention, the plurality of memories include a first memory and a second memory, and the plurality of calculation units include a first calculation unit pair, consisting of the calculation unit connected upstream of the first memory and the calculation unit connected downstream of it, and a second calculation unit pair, consisting of the calculation unit connected upstream of the second memory and the calculation unit connected downstream of it. When the first calculation unit pair and the second calculation unit pair perform calculation mutually exclusively within a given unit cycle, the two pairs share one of the first memory and the second memory.

第１１の発明に係るデータ処理装置では、第１メモリ及び第２メモリの一方が共用される場合に、制御部は、共用されないメモリへの電力供給を遮断する第１処理、共用されないメモリへのクロック信号の供給を停止する第２処理、共用されないメモリへのクロック信号の周波数を低減する第３処理および共用されないメモリへの閾値電圧を増加する第４処理の内、少なくとも一つの処理を行う。 In the data processing device according to the eleventh aspect, when one of the first memory and the second memory is shared, the control unit performs the first process of cutting off the power supply to the non-shared memory, and the non-shared memory. At least one of the second process for stopping the supply of the clock signal, the third process for reducing the frequency of the clock signal to the non-shared memory, and the fourth process for increasing the threshold voltage to the non-shared memory is performed.

これらの構成により、制御部による演算部の決定に応じて、不使用のメモリを生じさせることができる。この不使用となるメモリの消費電力を削減することにより、演算不要の演算部、不使用のバンクの消費電力削減と合わせて、データ処理装置の消費電力を効果的に削減できる。 With these configurations, an unused memory can be generated according to the determination of the calculation unit by the control unit. By reducing the power consumption of the unused memory, it is possible to effectively reduce the power consumption of the data processing apparatus together with the power consumption reduction of the calculation unit that does not require computation and the unused bank.

第１５の発明に係る半導体集積回路は、単位サイクル内に各々に割り当てられた演算を行うと共に各々が直列接続された複数の演算部と、複数の演算部の各々の間に接続された複数のメモリと、複数の演算部の内、ある単位サイクル内に各々に割り当てられた演算を行う演算部を選択する制御部を備える。 A semiconductor integrated circuit according to a fifteenth invention includes: a plurality of calculation units that are connected in series and each perform an assigned calculation within a unit cycle; a plurality of memories, each connected between adjacent calculation units; and a control unit that selects, from among the plurality of calculation units, the calculation units that perform their assigned calculations within a given unit cycle.

これらの構成により、アプリケーションの仕様を満足しつつ、半導体集積回路の消費電力を削減できる。 With these configurations, the power consumption of the semiconductor integrated circuit can be reduced while satisfying the application specifications.

本発明によれば、アプリケーションによる許容処理時間を満足すると共に、不要な消費電力が削減できるデータ処理装置が実現される。 According to the present invention, it is possible to realize a data processing apparatus that can satisfy an allowable processing time by an application and can reduce unnecessary power consumption.

また、演算不要な演算部だけでなく、データ更新やデータ出力の不要なメモリや、メモリに含まれるバンクでの不要な消費電力が削減できる。特に、ある瞬間での電力であるピーク電力が削減できる。 Further, not only a calculation unit that does not require calculation but also a memory that does not require data update and data output and unnecessary power consumption in a bank included in the memory can be reduced. In particular, the peak power that is the power at a certain moment can be reduced.

以下、図面を参照しながら、本発明の実施の形態を説明する。 Hereinafter, embodiments of the present invention will be described with reference to the drawings.

（実施の形態１） (Embodiment 1)

図１は、本発明の実施の形態１におけるデータ処理装置のブロック図である。 FIG. 1 is a block diagram of a data processing apparatus according to Embodiment 1 of the present invention.

まず、データ処理装置１の構成について説明する。 First, the configuration of the data processing apparatus 1 will be described.

データ処理装置１は、複数の演算部２〜５と、複数のメモリ６〜８と、制御部９を備えている。 The data processing apparatus 1 includes a plurality of calculation units 2 to 5, a plurality of memories 6 to 8, and a control unit 9.

複数の演算部２〜５は、所定の時間間隔である単位サイクル内に、それぞれ所定の演算を行う。また、演算部２〜５は、図１に示されるとおり直列に接続している。データ処理装置１において、演算部２が初段の演算部であり、演算部５が最終段の演算部である。 The plurality of calculation units 2 to 5 each perform a predetermined calculation within a unit cycle, which is a predetermined time interval. The calculation units 2 to 5 are connected in series as shown in FIG. 1. In the data processing device 1, the calculation unit 2 is the first-stage calculation unit, and the calculation unit 5 is the final-stage calculation unit.

図１に示されるデータ処理装置１では、データ処理装置１に入力するデータに対して、演算部２から演算部５までが、それぞれの演算を実行して、最終段の演算部５で全ての演算処理が終了する。すなわち、演算部２から演算部５は、１セットの演算処理を行う。 In the data processing device 1 shown in FIG. 1, the calculation unit 2 to the calculation unit 5 perform respective calculations on the data input to the data processing device 1, and the final calculation unit 5 performs all the calculations. The calculation process ends. That is, the calculation unit 2 to the calculation unit 5 perform one set of calculation processing.

複数のメモリ６〜８は、演算部２〜５の各々の間に接続されており、演算部同士のデータの受け渡しを行う。 The plurality of memories 6 to 8 are connected between the calculation units 2 to 5 and exchange data between the calculation units.

メモリ６は、演算部２と演算部３の間に接続され、演算部２の演算結果を記憶し、記憶している演算結果を演算部３に出力する。 The memory 6 is connected between the calculation unit 2 and the calculation unit 3, stores the calculation result of the calculation unit 2, and outputs the stored calculation result to the calculation unit 3.

メモリ７は、演算部３と演算部４の間に接続され、演算部３の演算結果を記憶し、記憶している演算結果を演算部４に出力する。 The memory 7 is connected between the calculation unit 3 and the calculation unit 4, stores the calculation result of the calculation unit 3, and outputs the stored calculation result to the calculation unit 4.

メモリ８は、演算部４と演算部５の間に接続され、演算部４の演算結果を記憶し、記憶している演算結果を演算部５に出力する。 The memory 8 is connected between the calculation unit 4 and the calculation unit 5, stores the calculation result of the calculation unit 4, and outputs the stored calculation result to the calculation unit 5.

制御部９は、複数の演算部２〜５の内、単位サイクル内に演算を行う演算部を決定する。あるいは、制御部９は、単位サイクル内に演算を行う演算部の個数を決定する。ここで、制御部９は、演算を行う演算部の決定にあたって、複数の演算部２〜５全体で行われる全体処理に要求される全体処理時間を考慮する。あるいは、制御部９は、データ処理装置１に入力するデータの入力間隔を考慮する。 The control unit 9 determines which of the calculation units 2 to 5 perform calculation within a unit cycle. Alternatively, the control unit 9 determines the number of calculation units that perform calculation within a unit cycle. In making this determination, the control unit 9 takes into account the total processing time required for the overall processing performed by the calculation units 2 to 5 as a whole. Alternatively, the control unit 9 takes into account the input interval of the data input to the data processing device 1.

また、制御部９は、演算不要な演算部の消費電力を制御する。 Further, the control unit 9 controls the power consumption of the calculation unit that does not require calculation.

次に、図２〜図４を用いて、データ処理装置１の動作を説明する。 Next, the operation of the data processing apparatus 1 will be described with reference to FIGS. 2 to 4.

図２、図３、図４は、本発明の実施の形態１におけるデータ処理装置の動作を説明する説明図である。 FIGS. 2, 3, and 4 are explanatory diagrams for explaining the operation of the data processing apparatus according to the first embodiment of the present invention.

図２、図３、図４において、横軸は時間軸であり、縦軸は、演算部の接続方向を示す。単位サイクルは、所定の時間間隔であり、任意に定められれば良いが、複数の演算部２〜５の各々の処理時間の内、最大の処理時間を基準に定められることが好適である。全ての演算部の演算が、単位サイクル内に終了するからである。 In FIGS. 2, 3, and 4, the horizontal axis is the time axis, and the vertical axis indicates the direction in which the calculation units are connected. The unit cycle is a predetermined time interval and may be set arbitrarily, but it is preferably set based on the longest of the processing times of the calculation units 2 to 5, because every calculation unit then completes its calculation within one unit cycle.
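
The preferred choice of unit cycle can be written as a one-line rule. The following is a minimal sketch; the function name `unit_cycle_time` and the per-unit times are hypothetical values chosen for illustration, not figures from the specification.

```python
from typing import Sequence


def unit_cycle_time(stage_times: Sequence[float]) -> float:
    """Return a unit-cycle length in which every unit can finish its calculation."""
    if not stage_times:
        raise ValueError("at least one calculation unit is required")
    # The slowest unit determines the cycle length.
    return max(stage_times)


# Hypothetical processing times (microseconds) for calculation units 2 to 5.
stage_times_us = [12.0, 20.0, 15.0, 8.0]
cycle_us = unit_cycle_time(stage_times_us)
```

With these assumed times the unit cycle is 20.0 microseconds, and no unit needs more than one cycle for its step.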

ある時刻に演算部２に入力したデータは、単位サイクル１では、演算部２で演算処理され、単位サイクル２では、演算部３で演算処理され、単位サイクル３では、演算部４で処理され、単位サイクル４では、演算部５で処理されて、演算処理の終了した出力データとして出力される。 Data input to the calculation unit 2 at a certain time is calculated by the calculation unit 2 in the unit cycle 1, is calculated by the calculation unit 3 in the unit cycle 2, and is processed by the calculation unit 4 in the unit cycle 3. In the unit cycle 4, the data is processed by the calculation unit 5 and output as output data after the calculation process.

図２においては、全体処理時間における要求速度が最大の場合の動作が示されている。言い換えると、入力データは、単位サイクル毎にデータ処理装置１の初段の演算部である演算部２に入力される。 FIG. 2 shows an operation in the case where the required speed in the entire processing time is maximum. In other words, the input data is input to the calculation unit 2 that is the first-stage calculation unit of the data processing device 1 for each unit cycle.

図２では、単位サイクル１において、第１入力データが演算部２に入力される。単位サイクル２において、第２入力データが、演算部２に入力される。単位サイクル３において、第３入力データが、演算部２に入力される。単位サイクル４において、第４入力データが、演算部２に入力される。第１入力データから第４入力データは、単位サイクル毎に演算部２〜５を移動しながら演算される。 In FIG. 2, in the unit cycle 1, the first input data is input to the calculation unit 2. In the unit cycle 2, the second input data is input to the calculation unit 2. In the unit cycle 3, the third input data is input to the calculation unit 2. In the unit cycle 4, the fourth input data is input to the calculation unit 2. The first input data to the fourth input data are calculated while moving the calculation units 2 to 5 for each unit cycle.

単位サイクル１では、演算部２は第１入力データを用いて演算を行い、メモリ６に演算結果を出力する。他の演算部３〜５は、第１入力データ以前の入力データを用いた演算を行う。 In unit cycle 1, operation unit 2 performs an operation using the first input data and outputs the operation result to memory 6. The other calculation units 3 to 5 perform calculations using input data before the first input data.

単位サイクル２では、演算部２は、第２入力データを用いて演算を行う。同時に、演算部３は、メモリ６に記憶されている演算結果を用いて演算を行い、演算結果をメモリ７に出力する。 In the unit cycle 2, the calculation unit 2 performs a calculation using the second input data. At the same time, the calculation unit 3 performs a calculation using the calculation result stored in the memory 6 and outputs the calculation result to the memory 7.

単位サイクル３では、演算部２は、第３入力データを用いて演算を行い、メモリ６に演算結果を出力する。同時に、演算部３は、メモリ６に記憶されている演算結果を用いて演算を行い、同時に、演算部４はメモリ７に記憶されている演算結果を用いて演算を行う。 In unit cycle 3, operation unit 2 performs an operation using the third input data and outputs the operation result to memory 6. At the same time, the calculation unit 3 performs a calculation using the calculation result stored in the memory 6, and at the same time, the calculation unit 4 performs a calculation using the calculation result stored in the memory 7.

単位サイクル４では、演算部２は、第４入力データを用いて演算を行い、メモリ６に演算結果を出力する。同時に、演算部３はメモリ６に記憶されている演算結果を用いて演算を行う。同時に、演算部４は、メモリ７に記憶されている演算結果を用いて演算を行う。同時に演算部５は、メモリ８に記憶されている演算結果を用いて演算を行う。 In unit cycle 4, operation unit 2 performs an operation using the fourth input data and outputs the operation result to memory 6. At the same time, the calculation unit 3 performs a calculation using the calculation result stored in the memory 6. At the same time, the calculation unit 4 performs a calculation using the calculation result stored in the memory 7. At the same time, the calculation unit 5 performs a calculation using the calculation result stored in the memory 8.

以上の動作により、単位サイクル４においては、演算部２において第４入力データの演算までが開始されている。 With the above operation, in the unit cycle 4, the calculation unit 2 starts to calculate the fourth input data.

すなわち、要求速度が最大の場合には、各単位サイクルにおいては、４つの演算部２〜５は全て演算を行う。当然ながら、全ての演算部が動作するので、メモリ６〜８も全て動作している。 That is, when the required speed is the maximum, all the four calculation units 2 to 5 perform calculations in each unit cycle. Of course, since all the arithmetic units operate, all the memories 6 to 8 also operate.

すなわち、全ての単位サイクルにおいて、制御部９は、全ての演算部２〜５を、演算を行う演算部として決定する。 That is, in all the unit cycles, the control unit 9 determines all the calculation units 2 to 5 as calculation units that perform the calculation.

このため、図２に示される場合には、データ処理装置１の消費電力は最大であるが、処理速度も最大になる。 For this reason, in the case shown in FIG. 2, the power consumption of the data processing apparatus 1 is maximum, but the processing speed is also maximum.

次に、図３を用いて、全体処理時間に対する要求速度が中程度の場合について説明する。この場合には、２単位サイクルおきに、入力データが演算部２に入力される。 Next, the case where the required speed for the entire processing time is medium will be described with reference to FIG. In this case, input data is input to the calculation unit 2 every two unit cycles.

２単位サイクルおきに入力データが演算部２に入力されるので、制御部９は、同一単位サイクル内では、４つの演算部２〜５の内、２つの演算部を演算を行う演算部として決定する。 Since input data is input to the calculation unit 2 every two unit cycles, the control unit 9 determines two calculation units among the four calculation units 2 to 5 as calculation units that perform calculation within the same unit cycle. To do.

図３において、斜線の施されている演算部は、該当する単位サイクル内に演算を行わない演算部である。 In FIG. 3, an operation unit that is shaded is an operation unit that does not perform an operation within a corresponding unit cycle.

単位サイクル１においては、演算部２と演算部４のみが演算を行う。単位サイクル２においては、演算部３と演算部５のみが、演算を行う。単位サイクル３においては、演算部２と演算部４のみが演算を行う。単位サイクル４においては、演算部３と演算部５のみが演算を行う。 In the unit cycle 1, only the calculation unit 2 and the calculation unit 4 perform the calculation. In the unit cycle 2, only the calculation unit 3 and the calculation unit 5 perform the calculation. In the unit cycle 3, only the calculation unit 2 and the calculation unit 4 perform the calculation. In the unit cycle 4, only the calculation unit 3 and the calculation unit 5 perform calculations.
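
The alternating pattern above follows from the input interval. The sketch below assumes steady-state operation (data entered at the same interval before unit cycle 1 as well) and uses a hypothetical helper, `active_stages`, that is not part of the specification: stage `s` (counting from 0 at calculation unit 2) calculates in unit cycle `c` exactly when the data it would hold entered the first stage on an input cycle.

```python
from typing import List

# Reference numerals of the calculation units in FIG. 1, first stage first.
UNIT_NUMBERS = [2, 3, 4, 5]


def active_stages(cycle: int, input_interval: int, num_stages: int = 4) -> List[int]:
    """Return 0-based indices of the stages that calculate in a 1-based unit cycle.

    Assumes new data enters the first stage in cycles 1, 1 + k, 1 + 2k, ...
    (and at the same spacing before cycle 1), where k is the input interval.
    """
    if input_interval < 1:
        raise ValueError("input interval must be at least one unit cycle")
    return [
        stage
        for stage in range(num_stages)
        if (cycle - 1 - stage) % input_interval == 0
    ]


def active_units(cycle: int, input_interval: int) -> List[int]:
    """Same as active_stages, but reported with the reference numerals of FIG. 1."""
    return [UNIT_NUMBERS[stage] for stage in active_stages(cycle, input_interval)]
```

With an input interval of 2, `active_units(1, 2)` gives units 2 and 4 and `active_units(2, 2)` gives units 3 and 5, matching the description of FIG. 3. An interval of 1 selects all four units in every cycle, as in FIG. 2.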

以上の動作により、単位サイクル４においては、第２入力データの演算までが開始されている。すなわち、図２に示される場合に比べて、データ処理装置１は、半分のデータ量の演算処理を終了する。しかしながら、演算を行っていない斜線が施された演算部の電力は低減する。 With the above operation, by unit cycle 4 calculation has started on inputs up to the second input data. That is, compared with the case shown in FIG. 2, the data processing device 1 completes calculation processing for half the amount of data. However, the power of the hatched calculation units, which are not performing calculation, is reduced.

このため、図３に示される場合には、データ処理装置１の消費電力が削減できる。 For this reason, in the case shown in FIG. 3, the power consumption of the data processing apparatus 1 can be reduced.

特に、演算を行っていない演算部への電力供給が遮断される場合には、これらの演算部の消費電力はゼロに近づくため、単位サイクル内において必要とされる消費電力はほぼ半減する。 In particular, when power supply to a computation unit that is not performing computation is interrupted, the power consumption of these computation units approaches zero, so the power consumption required in a unit cycle is almost halved.

次に、図４を用いて、全体処理時間に対する要求速度が低い場合について説明する。この場合には、４単位サイクルおきに入力データが演算部２に入力する。 Next, a case where the required speed for the entire processing time is low will be described with reference to FIG. In this case, input data is input to the computing unit 2 every four unit cycles.

４単位サイクルおきに入力データが演算部２に入力されるので、制御部９は、同一単位サイクル内では、４つの演算部２〜５の内、１つの演算部を、演算を行う演算部として決定する。 Since the input data is input to the calculation unit 2 every four unit cycles, the control unit 9 uses one of the four calculation units 2 to 5 as a calculation unit that performs a calculation within the same unit cycle. decide.

図４において、斜線の施されている演算部は、該当する単位サイクル内に演算を行わない演算部である。 In FIG. 4, an operation unit that is shaded is an operation unit that does not perform an operation within the corresponding unit cycle.

単位サイクル１においては、演算部２のみが演算を行い、演算結果をメモリ６に出力する。 In the unit cycle 1, only the calculation unit 2 performs the calculation and outputs the calculation result to the memory 6.

単位サイクル２においては、メモリ６に記憶されている演算結果を用いて、演算部３のみが演算を行い、演算結果をメモリ７に出力する。 In the unit cycle 2, only the operation unit 3 performs an operation using the operation result stored in the memory 6 and outputs the operation result to the memory 7.

単位サイクル３においては、メモリ７に記憶されている演算結果を用いて、演算部４のみが演算を行い、演算結果をメモリ８に出力する。 In the unit cycle 3, only the operation unit 4 performs an operation using the operation result stored in the memory 7 and outputs the operation result to the memory 8.

単位サイクル４においては、メモリ８に記憶されている演算結果を用いて、演算部５のみが演算を行う。 In the unit cycle 4, only the operation unit 5 performs an operation using the operation result stored in the memory 8.

以上の動作により、単位サイクル４においては、第１入力データの演算が終了する。すなわち、図２に示される場合に比べて、データ処理装置１は、４分の１のデータ量の演算処理を終了する。しかしながら、演算を行っていない斜線が施された演算部の電力が低減する。 With the above operation, the calculation of the first input data is completed in unit cycle 4. That is, compared with the case shown in FIG. 2, the data processing device 1 completes calculation processing for one quarter of the amount of data. However, the power of the hatched calculation units, which are not performing calculation, is reduced.

特に、演算を行っていない演算部への電力供給が遮断される場合には、これらの演算部の消費電力はゼロに近づくため、単位サイクル内において必要とされる消費電力は、ほぼ４分の１となる。すなわち、ピーク電力が４分の１となる。 In particular, when the power supply to the computation units that are not performing computation is interrupted, the power consumption of these computation units approaches zero, so the power consumption required in a unit cycle is approximately 4 minutes. 1 That is, the peak power is a quarter.
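
The halving and quartering of peak power can be checked with a rough estimate. This sketch assumes that all calculation units draw equal power when active and that an idle unit draws a fixed fraction of that power (zero when its supply is cut off); the function names and the `idle_fraction` parameter are assumptions made for illustration.

```python
import math


def peak_active_units(num_stages: int, input_interval: int) -> int:
    """Largest number of units that calculate in the same unit cycle."""
    if num_stages < 1 or input_interval < 1:
        raise ValueError("num_stages and input_interval must be positive")
    return math.ceil(num_stages / input_interval)


def relative_peak_power(
    num_stages: int, input_interval: int, idle_fraction: float = 0.0
) -> float:
    """Peak power relative to the case where every unit calculates in every cycle.

    idle_fraction is the power an idle unit still draws, as a fraction of an
    active unit's power (0.0 models a completely cut-off supply).
    """
    active = peak_active_units(num_stages, input_interval)
    idle = num_stages - active
    return (active + idle * idle_fraction) / num_stages
```

For four units, `relative_peak_power(4, 2)` is 0.5 and `relative_peak_power(4, 4)` is 0.25, in line with the "almost halved" and "approximately one quarter" figures in the text. A nonzero `idle_fraction` shows how residual idle power erodes the saving.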

ここで、図３、図４において、斜線の施された演算を行っていない演算部の消費電力削減について説明する。 Next, the reduction of power consumption in the hatched calculation units in FIGS. 3 and 4, which do not perform calculation, will be described.

ある単位サイクルにおいて演算を行わない演算部に対して（図３、図４で斜線の施されている演算部）、制御部９は、電力供給を遮断する。演算が不要な演算部への電力供給が遮断されることで、最も効果的に消費電力が削減される。 For a calculation unit that does not perform calculation in a certain unit cycle (calculation unit hatched in FIGS. 3 and 4), the control unit 9 cuts off the power supply. The power consumption is most effectively reduced by cutting off the power supply to the computation unit that does not require computation.

電力供給の遮断は、演算部への電力線路に設けられたスイッチの開放により実現されればよい。 The interruption of the power supply may be realized by opening a switch provided on the power line to the calculation unit.

また、消費電力削減にとって、制御部９が、演算を行わない演算部に対するクロック信号の周波数を低減することも、消費電力削減のためには好適である。演算部を構成する電子回路の消費電力は、クロック信号の周波数に比例して増加する。このため、クロック信号の周波数が低減されることで、データ処理装置１の消費電力が削減できる。 It is also suitable for reducing power consumption that the control unit 9 reduces the frequency of the clock signal supplied to a calculation unit that does not perform calculation. The power consumption of the electronic circuit constituting a calculation unit increases in proportion to the frequency of the clock signal. Therefore, reducing the clock frequency reduces the power consumption of the data processing device 1.

演算部を構成する電子回路は、クロック信号の周波数に比例して消費電力が大きくなる。このため、クロック信号の周波数を低減することで、演算部での消費電力が削減できる。 In the electronic circuit constituting the arithmetic unit, power consumption increases in proportion to the frequency of the clock signal. For this reason, the power consumption in a calculating part can be reduced by reducing the frequency of a clock signal.
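
The proportionality stated above is consistent with the standard first-order model of dynamic power in CMOS logic. The symbols below are not used in the specification and are introduced here only for illustration: the activity factor \alpha, the switched capacitance C, the supply voltage V_{DD}, and the clock frequency f.

```latex
P_{\mathrm{dyn}} = \alpha \, C \, V_{DD}^{2} \, f
\qquad\Longrightarrow\qquad
\frac{P_{\mathrm{dyn}}(f/2)}{P_{\mathrm{dyn}}(f)} = \frac{1}{2}
```

Under this model, halving the clock frequency of an idle calculation unit halves its dynamic power, and stopping the clock entirely removes the dynamic component. Leakage power is not covered by this expression.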

なお、クロック信号の周波数の低減の代わりに、制御部９は、演算不要な演算部に対するクロック信号を停止しても良い。 Instead of reducing the frequency of the clock signal, the control unit 9 may stop the clock signal for the calculation unit that does not require calculation.

あるいは、消費電力削減にとって、制御部９は、演算を行わない演算部の閾値電圧を増加することも、消費電力削減のために好適である。 Alternatively, it is also suitable for reducing power consumption that the control unit 9 increases the threshold voltage of a calculation unit that does not perform calculation.

閾値電圧が増加することで、演算部を構成するＭＯＳトランジスタのソース、ドレイン間の電位差が減少し、ＭＯＳトランジスタのリーク電流が低減する。結果として、演算が不要な演算部の消費電力が削減できる。 As the threshold voltage increases, the potential difference between the source and drain of the MOS transistor constituting the arithmetic unit decreases, and the leakage current of the MOS transistor decreases. As a result, the power consumption of the calculation unit that does not require calculation can be reduced.

制御部９による以上の制御により、高速の演算が不要な場合には、データ処理装置１の消費電力が削減できる。結果として、データ処理装置１に対して要求される全体処理時間と消費電力の適切なバランスによる、演算処理が実現される。 With the above control by the control unit 9, the power consumption of the data processing apparatus 1 can be reduced when high-speed computation is unnecessary. As a result, arithmetic processing is realized by an appropriate balance between the total processing time required for the data processing device 1 and power consumption.

次に、図５〜図７を用いて演算の不要な演算部に加えて、データ更新の不要なメモリの消費電力を削減することについて説明する。 Next, with reference to FIGS. 5 to 7, a description is given of reducing the power consumption of memories that do not require data update, in addition to that of arithmetic units that do not require computation.

図５は、本発明の実施の形態１におけるデータ処理装置のブロック図である。 FIG. 5 is a block diagram of the data processing apparatus according to Embodiment 1 of the present invention.

メモリ６〜８は、それぞれ複数のバンクを含み、メモリの前段の演算部からのデータの記憶を行う第１バンクと、メモリの後段の演算部に出力するデータの記憶を行う第２バンクを有している。 Each of the memories 6 to 8 includes a plurality of banks: a first bank that stores data from the arithmetic unit preceding the memory, and a second bank that stores data to be output to the arithmetic unit following the memory.

第１バンク１０は、演算部２の演算結果を記憶し、第２バンク１１は、演算部３へ出力するデータを記憶する。 The first bank 10 stores the calculation result of the calculation unit 2, and the second bank 11 stores data to be output to the calculation unit 3.

第１バンク１２は、演算部３の演算結果を記憶し、第２バンク１３は、演算部４へ出力するデータを記憶する。 The first bank 12 stores the calculation result of the calculation unit 3, and the second bank 13 stores data to be output to the calculation unit 4.

第１バンク１４は、演算部４の演算結果を記憶し、第２バンク１５は、演算部５へ出力するデータを記憶する。 The first bank 14 stores the calculation result of the calculation unit 4, and the second bank 15 stores data to be output to the calculation unit 5.

データ処理装置１に対する要求速度が最大の場合には、全ての演算部２〜５が演算を行うので、全てのメモリ６〜８も全ての単位サイクルにおいてデータ更新の必要性がある。 When the required speed for the data processing device 1 is maximum, all the calculation units 2 to 5 perform calculations, and therefore all the memories 6 to 8 need to be updated in every unit cycle.

一方、データ処理装置１に対する要求速度が中程度の場合には、図３で説明したように、同一の単位サイクル内では、４つの演算部２〜５の内、２つの演算部が演算を行う。このため、演算を行わない演算部に隣接するメモリ中のバンクにおいて、データ更新、もしくはデータ出力の不要なバンクが発生する。 On the other hand, when the required speed of the data processing device 1 is moderate, two of the four arithmetic units 2 to 5 perform computation within the same unit cycle, as described with reference to FIG. 3. As a result, among the banks of the memories adjacent to an arithmetic unit that does not perform computation, some banks require neither data update nor data output.

データ更新もしくはデータ出力の不要なバンクの発生について、図６、図７を用いて説明する。 The occurrence of banks that require neither data update nor data output is described with reference to FIGS. 6 and 7.

図６、図７は本発明の実施の形態１におけるデータ処理装置の動作を説明する説明図である。 FIGS. 6 and 7 are explanatory diagrams illustrating the operation of the data processing apparatus according to Embodiment 1 of the present invention.

図６は、図３と同じく２単位サイクルおきに、入力データが演算部２に入力される場合を示している。 As in FIG. 3, FIG. 6 shows a case where input data is input to the arithmetic unit 2 every two unit cycles.

単位サイクル１においては、演算部３と演算部５が演算不要であるので、演算の不要な演算部３と演算部５に関わる第２バンク１１と第１バンク１２と第２バンク１５は、データ更新もしくはデータ出力を不要とする。 In unit cycle 1, the arithmetic units 3 and 5 do not need to perform computation, so the second bank 11, the first bank 12, and the second bank 15, which are associated with the arithmetic units 3 and 5, require neither data update nor data output.

同様に、単位サイクル２においては、第１バンク１０、第２バンク１３と第１バンク１４は、データ更新もしくはデータ出力を不要とする。 Similarly, in the unit cycle 2, the first bank 10, the second bank 13, and the first bank 14 do not require data update or data output.

単位サイクル３においては、第２バンク１１、第１バンク１２と第２バンク１５が、データ更新もしくはデータ出力を不要とする。 In the unit cycle 3, the second bank 11, the first bank 12, and the second bank 15 do not require data update or data output.

単位サイクル４においては、第１バンク１０、第２バンク１３と第１バンク１４が、データ更新もしくはデータ出力を不要とする。 In the unit cycle 4, the first bank 10, the second bank 13, and the first bank 14 do not require data update or data output.

データ更新もしくはデータ出力が不要なバンクは、演算が不要な演算部と同じく、消費電力削減の対象となる。メモリ６〜８が、バンクで分割されていない場合には、図６の単位サイクル１において、演算部３が演算不要であっても、演算部２と演算部４が演算を行うため、メモリ６とメモリ７は、データ更新を必要とする。このため、メモリを消費電力削減の対象とできない。しかし、メモリを複数のバンクに分割することで、演算不要な演算部に対応するバンクを消費電力削減の対象とできる。 Banks that require neither data update nor data output are targets for power reduction, just like arithmetic units that do not require computation. If the memories 6 to 8 were not divided into banks, then in unit cycle 1 of FIG. 6 the memories 6 and 7 would still require data update, because the arithmetic units 2 and 4 perform computation even though the arithmetic unit 3 does not. The memories therefore could not be targeted for power reduction. Dividing each memory into a plurality of banks, however, allows the banks corresponding to arithmetic units that do not require computation to be targeted for power reduction.
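
The per-cycle lists of idle banks given above for FIG. 6 can be reproduced with a small model. This is a minimal sketch, not the control logic of the control unit 9: the names `UNITS`, `BANK_UNIT`, `active_units`, and `idle_banks` are hypothetical, and the modular scheduling rule assumes a pipeline that is already full (steady state), which the text implies but does not state as a formula.

```python
# Hypothetical model of Embodiment 1: units 2..5 in series, banks 10..15.
UNITS = [2, 3, 4, 5]

# Each bank is tied to the one unit that writes it (first bank) or reads it
# (second bank), following the bank descriptions for memories 6 to 8.
BANK_UNIT = {10: 2, 11: 3, 12: 3, 13: 4, 14: 4, 15: 5}


def active_units(cycle, interval):
    """Units that compute in 1-based `cycle` when input arrives every `interval` cycles.

    Assumes steady state: stage s works on data that entered s cycles earlier.
    """
    return [u for s, u in enumerate(UNITS) if (cycle - 1 - s) % interval == 0]


def idle_banks(cycle, interval):
    """Banks needing neither update nor output: those tied to an idle unit."""
    active = set(active_units(cycle, interval))
    return sorted(b for b, u in BANK_UNIT.items() if u not in active)


# Input every two unit cycles, as in the FIG. 6 description.
for cycle in range(1, 5):
    print(cycle, active_units(cycle, 2), idle_banks(cycle, 2))
```

With an interval of 2 the model yields banks 11, 12, and 15 as idle in unit cycles 1 and 3, and banks 10, 13, and 14 in unit cycles 2 and 4, matching the description. With an interval of 1 no bank is idle.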

データ更新もしくはデータ出力が不要なバンクに対しては、制御部９は、電力供給を遮断する。電力供給が遮断されることで、消費電力が効果的に削減できる。特に、ある瞬間での電力であるピーク電力が削減できる。 For banks that do not require data update or data output, the controller 9 cuts off the power supply. By cutting off the power supply, power consumption can be effectively reduced. In particular, the peak power that is the power at a certain moment can be reduced.

または、制御部９は、データ更新もしくはデータ出力が不要なバンクに対するクロック信号の周波数を低減、あるいはクロック信号を停止する。これは、クロック信号の周波数の低減もしくはクロック信号の停止により、消費電力が削減できるからである。 Alternatively, the control unit 9 reduces the frequency of the clock signal for a bank that does not require data update or data output, or stops the clock signal. This is because power consumption can be reduced by reducing the frequency of the clock signal or stopping the clock signal.

更に、制御部９は、データ更新もしくはデータ出力の不要なバンクの閾値電圧を増加する。閾値電圧の増加により、リーク電流が減少し、消費電力が削減できるからである。 Furthermore, the control unit 9 increases the threshold voltage of the bank that does not require data update or data output. This is because leakage current is reduced and power consumption can be reduced by increasing the threshold voltage.

なお、図６に示されるとおり、斜線が施された演算部とバンクの両方に対して、制御部９は、電力供給の遮断、クロック信号の停止、周波数の低減、及び閾値電圧の増加のいずれか（あるいは合わせて）を行う。結果として、データ処理装置１の消費電力が削減できる。 As shown in FIG. 6, for both the hatched arithmetic units and the hatched banks, the control unit 9 performs one of (or a combination of) cutting off the power supply, stopping the clock signal, reducing the clock frequency, and increasing the threshold voltage. As a result, the power consumption of the data processing apparatus 1 can be reduced.

入力データが４単位サイクルおきに入力する場合も同様であり、図７に示される。斜線が施された演算部とバンクが、消費電力削減の対象となる。 The same applies when input data is input every four unit cycles, as shown in FIG. The calculation units and banks that are shaded are the targets for power consumption reduction.

図７においては、更に多くの演算部とバンクが演算不要となり、消費電力が削減できる。 In FIG. 7, even more arithmetic units and banks are idle, so power consumption can be reduced.

なお、制御部９は、データ処理装置１に要求される全体処理時間、初段の演算部に入力する入力データの入力間隔に基づいて、複数の演算部の内、演算を行う演算部を決定する。更に、制御部９は、外部から通知される演算処理の許容時間情報に応じて、複数の演算部の内、演算を行う演算部を決定する。例えば、画像圧縮処理を行うアプリケーションに、データ処理装置１が実装される場合には、対象とする画像サイズの変更と再生速度の情報に基づく許容時間情報が、データ処理装置１に通知される。制御部９は、この許容時間情報に基づいて、入力データの入力間隔を決定し、決定された入力間隔に基づいて、複数の演算部の内、演算を行う演算部を決定する。 The control unit 9 determines which of the plurality of arithmetic units perform computation based on the total processing time required of the data processing apparatus 1 and the input interval of the input data supplied to the first-stage arithmetic unit. Furthermore, the control unit 9 determines which arithmetic units perform computation according to allowable-time information for the computation, notified from outside. For example, when the data processing apparatus 1 is used in an application that performs image compression, allowable-time information based on changes in the target image size and on the playback speed is notified to the data processing apparatus 1. The control unit 9 determines the input interval of the input data based on this allowable-time information and, based on the determined input interval, determines which of the plurality of arithmetic units perform computation.

以上のように、要求速度や入力データの入力間隔に応じて、複数の演算部の内、演算を行う演算部を決定することで、演算不要な演算部およびデータ更新もしくはデータ出力の不要なバンクを生じさせることができる。演算不要な演算部およびデータ更新もしくはデータ出力不要なバンクの消費電力を削減できる。 As described above, determining which of the plurality of arithmetic units perform computation according to the required speed and the input interval of the input data produces arithmetic units that do not require computation and banks that require neither data update nor data output. The power consumption of those arithmetic units and banks can then be reduced.

結果として、要求速度に見合った処理時間と消費電力の削減を両立できる。 As a result, it is possible to achieve both a processing time corresponding to the required speed and a reduction in power consumption.

（実施の形態２） (Embodiment 2)
次に、実施の形態２について説明する。実施の形態１では、汎用の演算部を備えたデータ処理装置について説明したが、実施の形態２では、データ処理装置が、Ｍ画素ｘＮ画素（Ｍ、Ｎは１以上の整数）の画像に対する画像処理を行う場合を例に説明する。また、データ処理装置に含まれる複数の演算部の各々は、この画像処理に含まれる複数の処理単位の各々に対応する演算を行う。 Next, Embodiment 2 is described. Embodiment 1 described a data processing apparatus with general-purpose arithmetic units; Embodiment 2 takes as an example a data processing apparatus that performs image processing on an image of M × N pixels (M and N are integers of 1 or more). Each of the plurality of arithmetic units included in the data processing apparatus performs the computation corresponding to one of the plurality of processing units that make up this image processing.

図８、図１１は、本発明の実施の形態２におけるデータ処理装置のブロック図である。 FIGS. 8 and 11 are block diagrams of the data processing apparatus according to Embodiment 2 of the present invention.

図８において、データ処理装置１００は、動画圧縮伸長規格の１つであるＭＰＥＧ−４による動画像圧縮処理を行う。 In FIG. 8, the data processing apparatus 100 performs moving image compression processing according to MPEG-4, which is one of the moving image compression / decompression standards.

データ処理装置１００は、入力メモリ１０１、動き検出部１０２、ＤＣＴ/量子化部１０３、ＤＣＡＣ予測部１０４、ＶＬＣ部１０５、逆量子化/逆ＤＣＴ部１０６、動き補償/再構成部１０７、メモリ１０８〜１１１、出力メモリ１１２と１１３、そして、制御部１１４を備える。動き検出部１０２、ＤＣＴ/量子化部１０３、ＤＣＡＣ予測部１０４、ＶＬＣ部１０５、逆量子化/逆ＤＣＴ部１０６、動き補償/再構成部１０７の各々は、データ処理装置１００に含まれる演算部である。これらの演算部は、パイプライン処理を行う。 The data processing apparatus 100 includes an input memory 101, a motion detection unit 102, a DCT/quantization unit 103, a DCAC prediction unit 104, a VLC unit 105, an inverse quantization/inverse DCT unit 106, a motion compensation/reconstruction unit 107, memories 108 to 111, output memories 112 and 113, and a control unit 114. The motion detection unit 102, the DCT/quantization unit 103, the DCAC prediction unit 104, the VLC unit 105, the inverse quantization/inverse DCT unit 106, and the motion compensation/reconstruction unit 107 are each an arithmetic unit included in the data processing apparatus 100. These arithmetic units perform pipeline processing.

メモリ１０８、１０９、１１０、１１１は、それぞれ複数のバンクを有しており、ここでは第１バンクと第２バンクを備えている。 Each of the memories 108, 109, 110, and 111 has a plurality of banks, and here includes a first bank and a second bank.

データ処理装置１００が実行するデータ単位は、１６×１６画素のマクロブロック単位であるとする。 It is assumed that the data unit executed by the data processing apparatus 100 is a macro block unit of 16 × 16 pixels.

入力メモリ１０１は、動き検出部１０２への入力データとなる符号化対象画素データと参照画素データを記憶する。符号化対象画素データと参照画素データはダイレクト・メモリ・アクセス（以下、「ＤＭＡ」という）により外部メモリ１４０から転送される。 The input memory 101 stores encoding target pixel data and reference pixel data that are input data to the motion detection unit 102. The encoding target pixel data and the reference pixel data are transferred from the external memory 140 by direct memory access (hereinafter referred to as “DMA”).

動き検出部１０２は、入力メモリ１０１からの符号化対象画素データと参照画素データを入力とし、動き検出処理を実行する。動き検出部１０２は、演算結果である差分画素データをメモリ１０８に出力する。ただし、画面内符号化を行う場合は、入力メモリ１０１から読み出した符号化対象画素データを、そのままメモリ１０８に出力する。 The motion detection unit 102 receives the encoding target pixel data and the reference pixel data from the input memory 101 and executes a motion detection process. The motion detection unit 102 outputs the difference pixel data that is the calculation result to the memory 108. However, when performing intra-screen encoding, the encoding target pixel data read from the input memory 101 is output to the memory 108 as it is.

ＤＣＴ/量子化部１０３は、メモリ１０８から、差分画素データ、または符号化対象画素データを読み出す。ＤＣＴ／量子化部１０３は、読み出したデータを用いてＤＣＴ演算と量子化演算を実行し、演算結果の係数データをメモリ１０９へ出力する。 The DCT / quantization unit 103 reads the difference pixel data or the encoding target pixel data from the memory 108. The DCT / quantization unit 103 performs a DCT operation and a quantization operation using the read data, and outputs the coefficient data of the operation result to the memory 109.

ＤＣＡＣ予測部１０４は、メモリ１０９から、係数データと係数予測処理で参照するＤＣ成分とＡＣ成分の参照係数データを、読み出す。ＤＣＡＣ予測部１０４は、ＤＣ成分とＡＣ成分の係数予測処理と、スキャン方法による、係数データの２次元配列から１次元配列への並び替えを実行する。次いで、ＤＣＡＣ予測部１０４は、演算結果である１次元配列の係数データと、将来の係数予測処理で参照されるＤＣ成分とＡＣ成分の係数データをメモリ１１０へ出力する。 The DCAC prediction unit 104 reads the coefficient data and the reference coefficient data of the DC component and the AC component that are referred to in the coefficient prediction process from the memory 109. The DCAC prediction unit 104 executes DC component and AC component coefficient prediction processing, and rearrangement of coefficient data from a two-dimensional array to a one-dimensional array by a scanning method. Next, the DCAC prediction unit 104 outputs the coefficient data of the one-dimensional array as the calculation result and the coefficient data of the DC component and the AC component referred to in the future coefficient prediction process to the memory 110.

ＶＬＣ部１０５は、メモリ１１０から、１次元配列となった係数データを読み出す。ＶＬＣ部１０５は、可変長符号化を実行し、演算結果となる可変長符号データを出力メモリ１１２へ出力する。 The VLC unit 105 reads the coefficient data that has become a one-dimensional array from the memory 110. The VLC unit 105 performs variable length coding, and outputs variable length code data that is a calculation result to the output memory 112.

逆量子化／逆ＤＣＴ部１０６は、メモリ１０９から、量子化係数データを読み出す。読み出したデータを用いて、逆量子化／逆ＤＣＴ部１０６は、逆量子化演算と逆ＤＣＴ演算を行い、演算結果となるデコードされた差分画素データ、または、符号化対象画素データをメモリ１１１へ出力する。 The inverse quantization / inverse DCT unit 106 reads the quantized coefficient data from the memory 109. Using the read data, the inverse quantization / inverse DCT unit 106 performs an inverse quantization operation and an inverse DCT operation, and outputs the decoded difference pixel data or the encoding target pixel data as the operation result to the memory 111. Output.

動き補償／再構成部１０７は、メモリ１１１から、デコードされた差分画素データと動き補償処理で用いられる参照画素データを読み出す。更に、動き補償／再構成部１０７は、符号化対象画素データと動き補償処理で用いる参照画素データを読み出す。差分画素データの場合は、動き補償／再構成部１０７は、参照画素データを用いて動き補償処理を実行する。更に、動き補償／再構成部１０７は、動き補償済み参照画素データと差分画素データを加算する再構成演算を行い、再構成された符号化対象画素データを出力メモリ１１３へ出力する。あるいは、入力されたデータがデコードされた符号化対象画素データの場合は、動き補償／再構成部１０７は、符号化対象画素データをそのまま出力メモリ１１３へ出力する。 The motion compensation / reconstruction unit 107 reads the decoded difference pixel data and reference pixel data used in the motion compensation process from the memory 111. Further, the motion compensation / reconstruction unit 107 reads out encoding target pixel data and reference pixel data used in the motion compensation process. In the case of difference pixel data, the motion compensation / reconstruction unit 107 performs a motion compensation process using the reference pixel data. Further, the motion compensation / reconstruction unit 107 performs a reconstruction operation for adding the motion compensated reference pixel data and the difference pixel data, and outputs the reconstructed encoding target pixel data to the output memory 113. Alternatively, when the input data is decoded pixel data to be encoded, the motion compensation / reconstruction unit 107 outputs the encoding target pixel data to the output memory 113 as it is.

メモリ１０８は、第１バンク１０８ａと第２バンク１０８ｂの二つのバンクを備えている。一方のバンクは、動き検出部１０２から出力される差分画素データ、または、符号化対象画素データを記憶する。他方のバンクは、ＤＣＴ／量子化部１０３へ保持している差分画素データ、または、符号化対象画素データを出力する。１回の演算が終了する度に、第１バンク１０８ａと第２バンク１０８ｂの役割が入れ替わる。つまり、ある単位サイクルにおいて、第１バンク１０８ａが、動き検出部１０２からの出力データを記憶し、第２バンク１０８ｂが、ＤＣＴ／量子化部１０３への入力データを出力している場合、次の単位サイクルでは、第１バンク１０８ａが、前の単位サイクルで記憶していたデータをＤＣＴ／量子化部１０３へ出力し、第２バンク１０８ｂが、動き検出部１０２からの出力データを記憶する。 The memory 108 includes two banks, a first bank 108a and a second bank 108b. One bank stores the difference pixel data or the encoding target pixel data output from the motion detection unit 102. The other bank outputs the difference pixel data or encoding target pixel data that it holds to the DCT/quantization unit 103. Each time one computation is completed, the roles of the first bank 108a and the second bank 108b are swapped. That is, if in a certain unit cycle the first bank 108a stores the output data from the motion detection unit 102 and the second bank 108b outputs the input data to the DCT/quantization unit 103, then in the next unit cycle the first bank 108a outputs the data stored in the previous unit cycle to the DCT/quantization unit 103, and the second bank 108b stores the output data from the motion detection unit 102.
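
The role swap described above is the usual double-buffer (ping-pong) pattern. The following is a minimal sketch of that pattern only; the class `DoubleBuffer`, its methods, and the placeholder payload strings are hypothetical and do not describe the actual circuit of the memory 108.

```python
class DoubleBuffer:
    """Two-bank memory whose banks swap roles at each unit-cycle boundary."""

    def __init__(self):
        self.banks = [None, None]
        self.write_idx = 0  # bank currently filled by the upstream unit

    def write(self, data):
        self.banks[self.write_idx] = data

    def read(self):
        # The downstream unit always reads the bank that is not being written.
        return self.banks[1 - self.write_idx]

    def swap(self):
        self.write_idx = 1 - self.write_idx


mem108 = DoubleBuffer()
mem108.write("MB n")      # one unit cycle: the upstream unit fills one bank
mem108.swap()             # boundary to the next unit cycle: roles swap
mem108.write("MB n+1")    # next unit cycle: the upstream unit fills the other bank
print(mem108.read())      # the downstream unit reads the data stored one cycle earlier
```

Because the writer and the reader never touch the same bank within a unit cycle, a bank whose associated unit is idle can be treated independently of the other bank.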

メモリ１０９も、第１バンク１０９ａと第２バンク１０９ｂを有する。 The memory 109 also has a first bank 109a and a second bank 109b.

一方のバンクはＤＣＴ／量子化部１０３から出力される係数データとＤＭＡにより外部メモリ１４０から入力されるＤＣ成分とＡＣ成分の参照係数データを記憶する。もう一方のバンクは、ＤＣＡＣ予測部１０４へ、記憶している係数データとＤＣ成分とＡＣ成分の参照係数データを出力する。 One bank stores coefficient data output from the DCT / quantization unit 103 and DC and AC component reference coefficient data input from the external memory 140 by DMA. The other bank outputs the stored coefficient data, DC component, and AC component reference coefficient data to the DCAC prediction unit 104.

メモリ１１０は、第１バンク１１０ａと第２バンク１１０ｂを有する。一方のバンクは、ＤＣＡＣ予測部１０４から出力される１次元配列の係数データと将来の係数予測処理の参照用ＤＣ成分とＡＣ成分の係数データを保持する。他方のバンクは、ＶＬＣ部１０５へ保持した１次元配列の係数データを出力する。また、将来の係数予測処理の参照用ＤＣ成分とＡＣ成分の係数データを記憶しているバンクから、ＤＭＡを用いて、外部メモリ１４０へ、将来の係数予測処理の参照用ＤＣ成分とＡＣ成分の係数データが転送される。 The memory 110 has a first bank 110a and a second bank 110b. One bank holds the one-dimensional array of coefficient data output from the DCAC prediction unit 104, together with the DC-component and AC-component coefficient data to be referenced in future coefficient prediction. The other bank outputs the one-dimensional array of coefficient data that it holds to the VLC unit 105. In addition, the DC-component and AC-component coefficient data for future coefficient prediction are transferred by DMA from the bank that stores them to the external memory 140.

メモリ１１１は、第１バンク１１１ａと第２バンク１１１ｂを有する。一方のバンクは逆量子化／逆ＤＣＴ部１０６から出力される、デコードされた差分画素データ、または符号化対象画素データを記憶する。または、このバンクは、外部メモリ１４０から入力される動き補償／再構成部１０７の動き補償処理で用いる参照画素データを記憶する。 The memory 111 has a first bank 111a and a second bank 111b. One bank stores decoded differential pixel data or encoding target pixel data output from the inverse quantization / inverse DCT unit 106. Alternatively, this bank stores reference pixel data used in the motion compensation process of the motion compensation / reconstruction unit 107 input from the external memory 140.

もう一方のバンクは、動き補償/再構成部１０７へ、記憶しているデコードされた差分画素データ、または符号化対象画素データと参照画素データを出力する。 The other bank outputs the stored decoded difference pixel data or encoding target pixel data and reference pixel data to the motion compensation / reconstruction unit 107.

出力メモリ１１２は、ＶＬＣ部１０５から出力される可変長符号データを記憶する。記憶されている可変長符号データは、符号化したビットストリームを形成するため、外部メモリ１４０へ出力される。 The output memory 112 stores variable length code data output from the VLC unit 105. The stored variable length code data is output to the external memory 140 to form an encoded bit stream.

出力メモリ１１３は、動き補償／再構成部１０７から出力される再構成された符号化対象画素データを保持する。記憶している再構成された符号化対象画素データは、次の画像フレームの動き検出時の参照画素データとして、外部メモリ１４０に出力される。 The output memory 113 holds the reconstructed encoding target pixel data output from the motion compensation / reconstruction unit 107. The stored reconstructed pixel data to be encoded is output to the external memory 140 as reference pixel data at the time of motion detection of the next image frame.

制御部１１４は、ある単位サイクル内において演算を行う演算部を決定する。あるいは、演算を行う演算部の個数を決定する。ここでは、演算部として設けられている、動き検出部１０２、ＤＣＴ／量子化部１０３、ＤＣＡＣ予測部１０４、ＶＬＣ部１０５、逆量子化／逆ＤＣＴ部１０６、動き補償／再構成部１０７の内から、ある単位サイクル内に演算を行う演算部を決定する。 The control unit 114 determines a calculation unit that performs a calculation within a certain unit cycle. Alternatively, the number of calculation units that perform calculation is determined. Here, among the motion detection unit 102, the DCT / quantization unit 103, the DCAC prediction unit 104, the VLC unit 105, the inverse quantization / inverse DCT unit 106, and the motion compensation / reconstruction unit 107, which are provided as calculation units. Thus, an arithmetic unit that performs an arithmetic operation within a certain unit cycle is determined.

また、制御部１１４は、演算を行う演算部の決定に伴い、データ更新、もしくはデータ出力の不要なバンクを決定する。 In accordance with the determination of the arithmetic units that perform computation, the control unit 114 also determines the banks that require neither data update nor data output.

データ処理装置１００が行う画像処理においては、複数の仕様やアプリケーションが選択される。例えば、データ処理装置１００の最大性能がＶＧＡサイズ（６４０×４８０画素）でフレームレート３０枚のＭＰＥＧ−４の圧縮処理であるとする。 In the image processing performed by the data processing apparatus 100, one of a plurality of specifications or applications is selected. For example, assume that the maximum performance of the data processing apparatus 100 is MPEG-4 compression at VGA size (640 × 480 pixels) and a frame rate of 30 frames per second.

このとき、第１アプリケーションは、ＶＧＡサイズ、フレームレート３０枚の圧縮処理を要求するとする。第２アプリケーションは、ＱＶＧＡサイズ（３２０ｘ２４０画素）、フレームレート３０枚の圧縮処理を要求するとする。第３アプリケーションは、ＱＣＩＦサイズ（１７６×１４４画素）、フレームレート１５枚の圧縮処理を要求するとする。 Assume that the first application requests compression at VGA size and a frame rate of 30 frames per second, the second application requests compression at QVGA size (320 × 240 pixels) and 30 frames per second, and the third application requests compression at QCIF size (176 × 144 pixels) and 15 frames per second.

以上より、データ処理装置１００にとって、第１アプリケーション時には、要求速度が最大であり、第１アプリケーション時の要求速度を値「１」とすると、第２アプリケーション時には、要求速度が約１/４であり、第３アプリケーション時には、要求速度が約１/２４である。したがって、第２アプリケーション、または第３アプリケーション時においては、全体処理に許容される時間は、第１アプリケーション時に比べて長い。 Thus, for the data processing apparatus 100, the required speed is highest in the first application. Taking that speed as 1, the required speed is about 1/4 in the second application and about 1/24 in the third application. Accordingly, in the second or third application, the time allowed for the overall processing is longer than in the first application.
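
The ratios of about 1/4 and about 1/24 can be checked with simple arithmetic. The sketch below assumes that the required speed scales with the number of pixels to be processed per second (width × height × frame rate); the text gives only the resulting ratios, so this scaling rule and the helper name `pixel_rate` are assumptions for illustration.

```python
def pixel_rate(width, height, fps):
    """Pixels to be compressed per second (assumed proxy for required speed)."""
    return width * height * fps


first = pixel_rate(640, 480, 30)   # first application: 640 x 480, 30 frames/s
second = pixel_rate(320, 240, 30)  # second application: 320 x 240, 30 frames/s
third = pixel_rate(176, 144, 15)   # third application: 176 x 144, 15 frames/s

print(first / second)              # how many times slower the second may run
print(round(first / third, 2))     # how many times slower the third may run
```

The first ratio is exactly 4 and the second is a little over 24, consistent with the stated required speeds of about 1/4 and about 1/24.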

アプリケーションの違いによる、データ処理装置１００の動作の違いを、図９〜図１２を用いて説明する。 Differences in the operation of the data processing apparatus 100 that arise from differences between applications are described with reference to FIGS. 9 to 12.

図９、図１０、図１２は、本発明の実施の形態２におけるデータ処理装置の処理を説明する説明図である。 FIGS. 9, 10, and 12 are explanatory diagrams illustrating the processing of the data processing apparatus according to Embodiment 2 of the present invention.

図９は、データ処理装置１００に最大速度を要求する第１アプリケーション時の、データ処理装置１００の動作を示す。 FIG. 9 shows the operation of the data processing apparatus 100 during the first application requesting the maximum speed from the data processing apparatus 100.

横方向に、単位サイクルＴ〜Ｔ＋４が示されている。縦方向に、直列接続された演算部が示されている。マトリクスの中に記載されたｎ−３〜ｎ＋４は、演算対象のマクロブロックの番号である。各演算部は、マクロブロック単位で演算を行う。例えば、単位サイクルＴにおいて、動き検出部１０２は、ｎ番目のマクロブロックに対して、動き検出の演算を行う。 The horizontal axis shows unit cycles T to T+4, and the vertical axis shows the serially connected arithmetic units. The entries n-3 to n+4 in the matrix are the numbers of the macroblocks being processed. Each arithmetic unit performs its computation in units of macroblocks. For example, in unit cycle T, the motion detection unit 102 performs motion detection on the n-th macroblock.

なお、単位サイクルは、データ処理装置１００に含まれる演算部の処理時間の内、最大の処理時間に基づいて定められる。 The unit cycle is determined based on the maximum processing time among the processing times of the arithmetic units included in the data processing apparatus 100.

図９に示されるように、第１アプリケーション時には、同一単位サイクル内に、動き検出部１０２から動き補償／再構成部１０７までの全ての演算部が演算を行う。また、全てのメモリ１０８〜１１１も動作する。 As shown in FIG. 9, in the first application, all the calculation units from the motion detection unit 102 to the motion compensation / reconstruction unit 107 perform calculations within the same unit cycle. All the memories 108 to 111 also operate.

制御部１１４は、要求速度が最大であるので、同一単位サイクル内に、全ての演算部を、演算を行う演算部として決定する。 Since the required speed is the maximum, the control unit 114 determines all the calculation units as the calculation units that perform the calculation within the same unit cycle.

すなわち、処理時間は最短であるが、消費電力は最大の状態である。 That is, the processing time is the shortest, but the power consumption is the maximum.

ここで、第１アプリケーション時のデータ処理装置で処理されるマクロブロックの平均処理時間を時間「Ｓ」と定義する。 Here, the average processing time of a macroblock processed by the data processing apparatus in the first application is defined as time "S".

ここで、第２アプリケーション時での処理に許容される許容処理時間が、時間「２．２Ｓ」であるとする。なお、ホストＣＰＵ１３０が、許容処理時間についての情報を、制御部１１４に通知する。 Here, it is assumed that the allowable processing time allowed for processing in the second application is time “2.2S”. The host CPU 130 notifies the control unit 114 of information about the allowable processing time.

第２アプリケーションで処理が行われる場合、制御部１１４は、２単位サイクルおきに、入力データを処理するように、演算を行う演算部を決定する。このときの状況が、図１０に示される。 When processing is performed in the second application, the control unit 114 determines a calculation unit that performs calculation so as to process input data every two unit cycles. The situation at this time is shown in FIG.

図１０において、斜線が施されている演算部は、演算を行っていない演算部である。例えば、単位サイクルＴにおいては、動き検出部１０２とＤＣＡＣ予測部１０４と逆量子化／逆ＤＣＴ部１０６が演算を行い、残りの演算部は、演算を行わない。 In FIG. 10, an operation unit that is shaded is an operation unit that is not performing an operation. For example, in the unit cycle T, the motion detection unit 102, the DCAC prediction unit 104, and the inverse quantization / inverse DCT unit 106 perform calculations, and the remaining calculation units do not perform calculations.

図１０から明らかな通り、単位サイクルＴ＋４において、ｎ＋２番目のマクロブロックまでの動き検出が終了している。すなわち、１マクロブロックの平均処理時間は、時間「２Ｓ」となり、許容処理時間「２．２Ｓ」以内に収まる。 As is apparent from FIG. 10, in the unit cycle T + 4, the motion detection up to the (n + 2) th macroblock is completed. That is, the average processing time of one macroblock is time “2S”, which falls within the allowable processing time “2.2S”.

更に、ある単位サイクル内において演算を行わない演算部については、電力供給を遮断する、あるいはクロック信号の供給を停止する、あるいはクロック信号の周波数を低減する、あるいは閾値電圧を増加することにより、消費電力を削減できる。 Furthermore, for an arithmetic unit that does not perform an operation within a certain unit cycle, the power supply is cut off, the clock signal supply is stopped, the clock signal frequency is reduced, or the threshold voltage is increased. Electric power can be reduced.

すなわち、アプリケーションにより異なる許容処理時間に応じて、制御部１１４は、演算を行う演算部を決定する。 That is, according to the allowable processing time that varies depending on the application, the control unit 114 determines a calculation unit that performs the calculation.

また、図３では、単位サイクル１において、例えば演算部３は演算を行っておらず、演算部４は、演算を行っている。演算部３と演算部４に挟まれているメモリ７に含まれる第１バンクと第２バンクの一方は、演算部３からのデータ更新が不要である。 In FIG. 3, in the unit cycle 1, for example, the calculation unit 3 does not perform calculation, and the calculation unit 4 performs calculation. One of the first bank and the second bank included in the memory 7 sandwiched between the calculation unit 3 and the calculation unit 4 does not require data update from the calculation unit 3.

図１１は、第２アプリケーション時に、データ更新が不要であるか、データ出力が不要であるバンクを、斜線で明示している。 FIG. 11 marks with hatching the banks that require no data update or no data output in the second application.

図１０に示される単位サイクルＴにおいては、ＤＣＴ／量子化部１０３と、ＶＬＣ部１０５と、動き補償／再構成部１０７が、演算を行わない。このため、第２バンク１０８ｂ、第２バンク１０９ｂ、第２バンク１１１ｂがデータ更新、もしくはデータ出力不要である。 In the unit cycle T shown in FIG. 10, the DCT / quantization unit 103, the VLC unit 105, and the motion compensation / reconstruction unit 107 do not perform calculations. For this reason, the second bank 108b, the second bank 109b, and the second bank 111b do not require data update or data output.

制御部１１４は、これらのバンクに対して、電力供給を遮断する、あるいはクロック信号の供給を停止する、あるいはクロック信号の周波数を低減する、あるいは閾値電圧を増加する。結果として、消費電力が削減できる。特に、演算不要の演算部と共に、電力供給の削減やクロック信号の供給停止などが行われることで、更に消費電力が削減できる。 The control unit 114 cuts off the power supply to these banks, stops the supply of the clock signal, reduces the frequency of the clock signal, or increases the threshold voltage. As a result, power consumption can be reduced. In particular, the power consumption can be further reduced by reducing the power supply or stopping the supply of the clock signal together with the computation unit that does not require computation.

次に、第３アプリケーションでの処理の場合について、図１２を用いて説明する。 Next, the case of processing in the third application will be described with reference to FIG.

第３アプリケーション時には、許容処理時間が時間「３．４Ｓ」であるとする。 In the third application, it is assumed that the allowable processing time is the time “3.4S”.

この場合、制御部１１４は、３単位サイクル毎に、入力データを入力して処理を実行する。図１２から明らかな通り、単位サイクルＴ＋４において、ｎ＋１番目のマクロブロックのＤＣＴ／量子化演算が終了しており、全体処理時間は時間「３Ｓ」であり、許容処理時間を満たす。 In this case, the control unit 114 inputs the input data and executes the process every 3 unit cycles. As is apparent from FIG. 12, in the unit cycle T + 4, the DCT / quantization calculation of the (n + 1) th macroblock is completed, and the total processing time is time “3S”, which satisfies the allowable processing time.
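
The two examples in this embodiment (an allowable time of 2.2S leading to input every two unit cycles, and 3.4S leading to input every three) are consistent with choosing the largest whole number of unit cycles that still fits within the allowable time. The text does not state this rule explicitly, so the sketch below is an assumption; `input_interval` is a hypothetical name, and the allowable time is expressed in multiples of S.

```python
import math


def input_interval(allowed_time, s=1.0):
    """Largest whole number of unit cycles per input that fits in `allowed_time`.

    `s` is the average per-macroblock time at maximum speed (time "S").
    At least one unit cycle is always used.
    """
    return max(1, math.floor(allowed_time / s))


for allowed in (1.0, 2.2, 3.4):
    k = input_interval(allowed)
    print(allowed, k, k * 1.0 <= allowed)  # interval and whether it meets the limit
```

With these inputs the sketch selects intervals of 1, 2, and 3 unit cycles, giving average processing times of S, 2S, and 3S, each within its allowable time.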

図１２においても、斜線が施されている演算部は、演算を行わない演算部であり、制御部１１４が決定する。 Also in FIG. 12, a calculation unit that is shaded is a calculation unit that does not perform calculation, and is determined by the control unit 114.

演算を行わない演算部がある場合には、図１１で説明したのと同様に、データ更新もしくはデータ出力が不要なバンクが発生する。 When there is an operation unit that does not perform an operation, a bank that does not require data update or data output is generated as described with reference to FIG.

制御部１１４は、演算の不要な演算部や、データ更新もしくはデータ出力が不要なバンクに対して、電力供給を遮断したり、クロック信号の供給を遮断したり、クロック信号の周波数を低減したり、閾値電圧を増加したりする。結果として、動作や処理時間に影響なく、消費電力を削減できる。 For arithmetic units that do not require computation and banks that require neither data update nor data output, the control unit 114 cuts off the power supply, cuts off the clock signal, reduces the clock frequency, or increases the threshold voltage. As a result, power consumption can be reduced without affecting operation or processing time.

例えば、図１２における単位サイクルＴにおいては、ＤＣＴ／量子化部１０３、ＤＣＡＣ予測部１０４、逆量子化／逆ＤＣＴ部１０６に対して、電力供給遮断やクロック信号の供給遮断などが行われる。 For example, in the unit cycle T in FIG. 12, power supply interruption, clock signal supply interruption, and the like are performed on the DCT / quantization unit 103, the DCAC prediction unit 104, and the inverse quantization / inverse DCT unit 106.
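
The idle units named for unit cycle T in the descriptions of FIG. 10 and FIG. 12 can be reproduced by assigning each arithmetic unit a pipeline depth. This is a sketch under stated assumptions: the depth table `DEPTH` is inferred from the data flow described for FIG. 8 (the pipeline forks after the DCT/quantization unit 103), a new macroblock is assumed to enter motion detection in unit cycle T, and the pipeline is assumed to be full. The name `idle_units` is hypothetical.

```python
# Assumed stage depths: unit cycles after motion detection at which each unit
# handles the same macroblock. Units 104 and 106 both read from memory 109.
DEPTH = {
    102: 0,  # motion detection
    103: 1,  # DCT / quantization
    104: 2,  # DC/AC prediction
    105: 3,  # VLC
    106: 2,  # inverse quantization / inverse DCT
    107: 3,  # motion compensation / reconstruction
}


def idle_units(offset, interval):
    """Units with no work `offset` cycles after unit cycle T (steady state)."""
    return sorted(u for u, d in DEPTH.items() if (offset - d) % interval != 0)


print(idle_units(0, 2))  # second application, unit cycle T
print(idle_units(0, 3))  # third application, unit cycle T
```

For an interval of 2 the idle units in unit cycle T are 103, 105, and 107, and for an interval of 3 they are 103, 104, and 106, in agreement with the text.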

なお、各演算部は、個別に電源供給と遮断が可能であり、あるいは個別にクロック信号の供給が可能であることが、消費電力削減のための制御において好適である。 In addition, it is preferable in the control for power consumption reduction that each arithmetic unit can individually supply and shut off power, or can individually supply a clock signal.

同様に、メモリに含まれるバンクも、個別に電源供給と遮断が可能であったり、個別にクロック信号の供給が可能であったりすることが好適である。 Similarly, it is preferable that each bank included in a memory can be individually powered on and off, or individually supplied with a clock signal.

以上のように、制御部１１４は、アプリケーションにより異なる許容処理時間に応じて、ある単位サイクル内で演算を行う演算部を決定し、演算の不要な演算部、及びデータ更新やデータ出力の不要なバンクの消費電力を削減する。結果として、アプリケーション毎の仕様を満足すると共に、データ処理装置１００の消費電力を削減できる。 As described above, the control unit 114 determines which arithmetic units perform computation within a given unit cycle according to the allowable processing time, which differs from application to application, and reduces the power consumption of the arithmetic units that do not require computation and of the banks that require neither data update nor data output. As a result, the specifications of each application are satisfied while the power consumption of the data processing apparatus 100 is reduced.

In Embodiment 2, the case where the data processing apparatus 100 processes data in units of macroblocks has been described, but processing may instead be performed in units of blocks, slices, or the like. The arithmetic units included in the data processing apparatus 100 are not limited to those shown in FIG. 8.

The processing performed by the data processing apparatus 100 may be something other than image processing based on the MPEG-4 standard. It may be image processing based on another standard, such as MPEG2 or JPEG, or it may be audio processing.

The configuration of the arithmetic units that execute the pipeline processing is not limited to the motion detection unit 102 through the motion compensation/reconstruction unit 107. For example, the motion detection unit 102 may be divided across two unit cycles, or the DCT/quantization unit 103 may be divided into two arithmetic units, one performing the DCT operation and one performing the quantization operation.

The two-bank configuration of the memories 108 to 111 may be implemented either with two physically separate memory cells or with a single dual-port memory cell.

Either the control unit 114 or the host CPU 130 may control the plurality of arithmetic units included in the data processing apparatus 100.

When the data processing apparatus 100 is mounted on a semiconductor integrated circuit, the power consumption of that semiconductor integrated circuit can be reduced. In particular, the peak power, that is, the power drawn at a given instant, can be reduced.

(Embodiment 3)
Next, Embodiment 3 will be described with reference to FIGS. 13 and 14.

Embodiment 3 describes a data processing apparatus that reduces power consumption by sharing a memory when only a few of the plurality of arithmetic units perform operations.

FIG. 13 is a block diagram of the data processing apparatus according to Embodiment 3 of the present invention. FIG. 14 is an explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 3 of the present invention.

In FIG. 13, the memory 108 is the first memory and the memory 110 is the second memory.

The first memory 108 has an arithmetic unit connected to its preceding stage and another connected to its subsequent stage. These two connected arithmetic units, the preceding-stage unit and the subsequent-stage unit, are together called the first arithmetic unit pair. In FIG. 13, the motion detection unit 102 and the DCT/quantization unit 103 form the first arithmetic unit pair 150.

The second memory 110 has the DCAC prediction unit 104 connected to its preceding stage and the VLC unit 105 connected to its subsequent stage. The DCAC prediction unit 104 and the VLC unit 105 form the second arithmetic unit pair 151.

Depending on the speed required by the application, the first arithmetic unit pair 150 and the second arithmetic unit pair 151 may operate exclusively within the same unit cycle. That is, within the same unit cycle, one of the two pairs performs operations and the other does not.

In such a case, the first arithmetic unit pair 150 and the second arithmetic unit pair 151 can share either the first memory 108 or the second memory 110. For example, the output of the DCAC prediction unit 104 is supplied not only to the second memory 110 but also to the first memory 108, and the output of the first memory 108 is supplied not only to the DCT/quantization unit 103 but also to the VLC unit 105.

That is, in addition to holding the data output from the motion detection unit 102 and outputting the held data to the DCT/quantization unit 103, the first memory 108 further includes a data input bus from the DCAC prediction unit 104 and an output bus to the VLC unit 105. The first memory 108 also holds the data output from the DCAC prediction unit 104 and outputs the held data to the VLC unit 105.

In this case, the second memory 110 is not used, and the first arithmetic unit pair 150 and the second arithmetic unit pair 151 share the first memory 108. The two pairs operate exclusively within the same unit cycle. Therefore, the first arithmetic unit pair 150 uses the first memory 108 in the unit cycles in which it operates, and the second arithmetic unit pair 151 uses the first memory 108 in the unit cycles in which it operates.

As a result, the second memory 110 is unused, so its power supply and clock signal supply can be cut off. That is, the power consumption of the second memory 110 can be reduced.
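
The sharing condition can be sketched in Python as follows. This is a minimal illustration rather than the implementation described in the patent: the function names `pairs_are_exclusive` and `memories_to_gate`, the string labels for the units and memories, and the four-cycle activity pattern are all assumptions made for the example.

```python
# Hypothetical labels for the arithmetic units named in the text.
FIRST_PAIR = {"motion_detection_102", "dct_quant_103"}   # pair 150, around memory 108
SECOND_PAIR = {"dcac_prediction_104", "vlc_105"}         # pair 151, around memory 110


def pairs_are_exclusive(schedule, pair_a, pair_b):
    """Return True if no unit cycle has active units from both pairs."""
    for active_units in schedule:
        if active_units & pair_a and active_units & pair_b:
            return False
    return True


def memories_to_gate(schedule):
    """Return the memories left unused under the sharing rule.

    If the two pairs never operate in the same unit cycle, both pairs
    use memory 108, so memory 110 is unused and can be powered down.
    """
    if pairs_are_exclusive(schedule, FIRST_PAIR, SECOND_PAIR):
        return ["memory_110"]
    return []


# Illustrative activity pattern (an assumption): one set of active units per unit cycle.
example_schedule = [
    {"motion_detection_102"},
    {"dct_quant_103"},
    {"dcac_prediction_104"},
    {"vlc_105"},
]

print(memories_to_gate(example_schedule))
```

If any unit cycle contained units from both pairs, the function would return an empty list, reflecting that each pair then needs its own memory.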

Next, the case where the allowable processing time permitted for the overall processing is "4.2S" will be described with reference to FIGS. 14 and 15. FIG. 15 is a block diagram of the data processing apparatus according to Embodiment 3 of the present invention.

The control unit 114 causes input data to be input to the motion detection unit 102 every four unit cycles. That is, the motion detection unit 102 processes the n-th macroblock in unit cycle T and the (n+1)-th macroblock in unit cycle T+4. Consequently, as shown in FIG. 14, at most two arithmetic units perform operations within the same unit cycle, and the remaining arithmetic units do not.

In unit cycle T, the motion detection unit 102 processes the n-th macroblock. The other arithmetic units do not perform operations. In FIG. 14, the arithmetic units corresponding to the hatched cells do not perform operations.

In unit cycle T+1, the DCT/quantization unit 103 processes the n-th macroblock, and the other arithmetic units do not perform operations.

In unit cycle T+2, the DCAC prediction unit 104 and the inverse quantization/inverse DCT unit 106 process the n-th macroblock, and the other arithmetic units do not perform operations.

In unit cycle T+3, the VLC unit 105 and the motion compensation/reconstruction unit 107 process the n-th macroblock, and the other arithmetic units do not perform operations.

In unit cycle T+4, the data of the new (n+1)-th macroblock is input to the motion detection unit 102.

As described above, the processing of the n-th macroblock finishes in four unit cycles. The allowable processing time "4.2S" is therefore satisfied.
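
The arithmetic behind this walk-through can be sketched in Python. The sketch assumes that S in "4.2S" denotes the average processing time corresponding to one unit cycle, so that the input interval is the largest whole number of unit cycles not exceeding the allowable time; this passage does not define S. The names `input_interval`, `active_units`, and `STAGE_OFFSET`, and the unit labels, are hypothetical.

```python
import math

# Unit-cycle offset at which each arithmetic unit handles a macroblock,
# following the T .. T+3 walk-through in the text (labels are hypothetical).
STAGE_OFFSET = {
    "motion_detection_102": 0,
    "dct_quant_103": 1,
    "dcac_prediction_104": 2,
    "inv_quant_inv_dct_106": 2,
    "vlc_105": 3,
    "motion_comp_recon_107": 3,
}


def input_interval(allowable_multiple):
    """Largest whole number of unit cycles not exceeding the allowable time."""
    interval = math.floor(allowable_multiple)
    if interval < 1:
        raise ValueError("allowable time is shorter than one unit cycle")
    return interval


def active_units(cycle, interval):
    """Units that operate in the given unit cycle (cycle 0 corresponds to T).

    A new macroblock enters the first stage every `interval` unit cycles.
    The returned dict maps each active unit to the macroblock index it
    processes, counted relative to the n-th macroblock.
    """
    active = {}
    for unit, offset in STAGE_OFFSET.items():
        elapsed = cycle - offset
        if elapsed >= 0 and elapsed % interval == 0:
            active[unit] = elapsed // interval
    return active


interval = input_interval(4.2)
for cycle in range(5):
    print(f"T+{cycle}: {active_units(cycle, interval)}")
```

With an interval of four unit cycles, no unit cycle in this model has more than two active units, which matches the description of FIG. 14.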

As shown in FIG. 14, the arithmetic units corresponding to the hatched cells do not perform operations.

For these arithmetic units, the control unit 114 cuts off the power supply or stops the clock signal supply. As a result, the power consumed by arithmetic units that do not need to operate is reduced. The control unit 114 may instead reduce the frequency of the clock signal supplied to such arithmetic units or increase their threshold voltage.
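
One way to picture this control step is the Python sketch below. The enum `PowerAction`, the function `plan_power_actions`, and the unit labels are hypothetical. The text does not say how the control unit 114 chooses among the four measures, so the example simply applies one caller-selected measure to every idle unit.

```python
from enum import Enum


class PowerAction(Enum):
    """The four power-saving measures named in the text."""
    CUT_POWER = "cut off power supply"
    STOP_CLOCK = "stop clock signal supply"
    LOWER_CLOCK = "reduce clock frequency"
    RAISE_VTH = "increase threshold voltage"


# Hypothetical labels for the six arithmetic units in the pipeline.
ALL_UNITS = [
    "motion_detection_102",
    "dct_quant_103",
    "dcac_prediction_104",
    "vlc_105",
    "inv_quant_inv_dct_106",
    "motion_comp_recon_107",
]


def plan_power_actions(active, action=PowerAction.STOP_CLOCK):
    """Map every unit that is idle in this unit cycle to a power-saving action."""
    return {unit: action for unit in ALL_UNITS if unit not in active}


# Example: only the DCT/quantization unit operates in this unit cycle.
plan = plan_power_actions({"dct_quant_103"})
for unit, action in plan.items():
    print(unit, "->", action.value)
```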

Furthermore, the first arithmetic unit pair 150, consisting of the motion detection unit 102 and the DCT/quantization unit 103, and the second arithmetic unit pair 151, consisting of the DCAC prediction unit 104 and the VLC unit 105, operate exclusively within the same unit cycle. The two pairs can therefore share the first memory 108.

In unit cycle T and unit cycle T+1, as shown in FIG. 14, only the motion detection unit 102 and the DCT/quantization unit 103, which belong to the first arithmetic unit pair 150, perform operations. Therefore, the first arithmetic unit pair 150 uses the first memory 108 in unit cycles T and T+1.

In unit cycles T+2 and T+3, on the other hand, as shown in FIG. 14, the DCAC prediction unit 104 and the VLC unit 105, which belong to the second arithmetic unit pair 151, perform operations, and the first arithmetic unit pair 150 does not. Therefore, the second arithmetic unit pair 151 uses the first memory 108 in unit cycles T+2 and T+3. The second memory 110 is not used in any unit cycle.
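
The per-cycle assignment of the first memory 108 can be traced with a short Python sketch. The function name `memory_108_user`, the labels, and the return values are hypothetical; the four activity sets follow the T to T+3 description in the text.

```python
# Hypothetical labels for the two arithmetic unit pairs.
FIRST_PAIR = {"motion_detection_102", "dct_quant_103"}   # pair 150
SECOND_PAIR = {"dcac_prediction_104", "vlc_105"}         # pair 151


def memory_108_user(active_units):
    """Return which pair uses the shared memory 108 in one unit cycle."""
    uses_first = bool(active_units & FIRST_PAIR)
    uses_second = bool(active_units & SECOND_PAIR)
    if uses_first and uses_second:
        raise ValueError("pairs are not exclusive; memory 108 cannot be shared")
    if uses_first:
        return "pair_150"
    if uses_second:
        return "pair_151"
    return None


# Active units in unit cycles T, T+1, T+2, and T+3, as described in the text.
cycles = [
    {"motion_detection_102"},
    {"dct_quant_103"},
    {"dcac_prediction_104", "inv_quant_inv_dct_106"},
    {"vlc_105", "motion_comp_recon_107"},
]

for i, active in enumerate(cycles):
    print(f"T+{i}: {memory_108_user(active)}")
```

The sketch raises an error when both pairs are active in the same unit cycle, since sharing is only valid under the exclusivity condition.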

The control unit 114 cuts off the power supply to the second memory 110 or stops its clock signal supply. Alternatively, the control unit 114 reduces the frequency of the clock signal supplied to the second memory 110 or increases its threshold voltage. As a result, the power consumption of the second memory 110 is reduced. Because the second memory 110 is unused in every unit cycle, the power saving is particularly large.

As shown in FIG. 15, power consumption can also be reduced in the hatched banks.

As described above, sharing a memory reduces power consumption further. In addition, because the control unit determines which arithmetic units perform operations in each unit cycle according to the allowable processing time required by the application, data processing that satisfies the allowable processing time is achieved.

If the allowable processing time becomes even longer, the number of shared memories increases, and power consumption can be reduced further.

The case where the memory 108 is the first memory and is shared has been described, but the configuration of the data processing apparatus 100 that shares a memory is not limited to this. The memory to be shared may be chosen according to the number of pipeline stages and the connection topology of the arithmetic units constituting the data processing apparatus 100.

When the data processing apparatus 100 is mounted on a semiconductor integrated circuit, the power consumption of that semiconductor integrated circuit can be reduced.

The configuration of the arithmetic units that execute the pipeline processing is not limited to the motion detection circuit 102 through the motion compensation/reconstruction circuit 107. For example, the motion detection unit 102 may be divided across two unit cycles, or the DCT/quantization unit 103 may be divided into two arithmetic units, one performing the DCT operation and one performing the quantization operation.

The two-bank configuration of the memories 108 to 111 may be implemented either with two physically separate memory cells or with a single dual-port memory cell.

Either the control unit 114 or the host CPU 130 may control the plurality of arithmetic units included in the data processing apparatus 100.

(Embodiment 4)
Next, Embodiment 4 will be described. Embodiment 4 describes the case where the data processing apparatus 100 is mounted on a semiconductor integrated circuit.

FIG. 16 is a block diagram of the semiconductor integrated circuit according to Embodiment 3 of the present invention. The semiconductor integrated circuit 160 is implemented as an IC or an LSI.

Although the semiconductor integrated circuit 160 is shown as a single LSI in FIG. 16, it may be implemented across a plurality of LSIs. The semiconductor integrated circuit 160 is sealed in a package as needed and mounted on an electronic board. The electronic board is housed in a casing as needed. Necessary electronic components and devices are connected externally to the semiconductor integrated circuit 160 to form an electronic device such as a video camera, a notebook personal computer, or a mobile terminal.

The semiconductor integrated circuit 160 includes the data processing apparatus 100 described in Embodiments 2 and 3, as well as a CPU 161, an external input/output control circuit 162, a serial input/output control circuit 163, a video input/output control circuit 164, an audio input/output control circuit 165, a memory card input/output control circuit 166, and a memory control circuit 167.

The video input/output control circuit 164 is connected to an external camera 168 and an LCD 169. The audio input/output control circuit 165 is connected to a microphone 170 and a speaker 171. The memory card input/output control circuit 166 is connected to a memory card 172. The memory control circuit 167 is connected to a DRAM 173. These control circuits included in the semiconductor integrated circuit 160 control the externally connected devices.

The data processing apparatus 100 has the configuration and functions described in Embodiments 2 and 3.

That is, the data processing apparatus 100 includes a plurality of series-connected arithmetic units that each perform a predetermined operation within a unit cycle, a plurality of memories connected between these arithmetic units, and a control unit.

The control unit determines which of the plurality of arithmetic units perform operations within a unit cycle.

Specifically, the control unit determines which of the arithmetic units perform operations within a given unit cycle according to the allowable processing time of the data processing apparatus 100, the input interval of the input data, and the like. As a result, a memory may become unused because another memory is shared, and, when a memory has a plurality of banks, a bank may become unused in a given unit cycle.

The control unit cuts off the power supply or the clock signal supply to the arithmetic units, memories, and banks that are not needed. Alternatively, the control unit reduces the frequency of the clock signal supplied to these arithmetic units, memories, and banks, or increases their threshold voltage.

This control by the control unit reduces unnecessary power consumption. It also achieves processing that satisfies the allowable processing time, which differs from application to application.

According to the semiconductor integrated circuit of the present invention, the speed required by the application is satisfied and power consumption is reduced. In particular, the peak power, that is, the power drawn at a given instant, can be reduced.

The present invention is suitable for use in, for example, the field of apparatuses and equipment that perform data processing related to image processing and audio processing.

FIG. 1: Block diagram of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 2: Explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 3: Explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 4: Explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 5: Block diagram of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 6: Explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 7: Explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 1 of the present invention
FIG. 8: Block diagram of the data processing apparatus according to Embodiment 2 of the present invention
FIG. 9: Explanatory diagram illustrating the processing of the data processing apparatus according to Embodiment 2 of the present invention
FIG. 10: Explanatory diagram illustrating the processing of the data processing apparatus according to Embodiment 2 of the present invention
FIG. 11: Block diagram of the data processing apparatus according to Embodiment 2 of the present invention
FIG. 12: Explanatory diagram illustrating the processing of the data processing apparatus according to Embodiment 2 of the present invention
FIG. 13: Block diagram of the data processing apparatus according to Embodiment 3 of the present invention
FIG. 14: Explanatory diagram illustrating the operation of the data processing apparatus according to Embodiment 3 of the present invention
FIG. 15: Block diagram of the data processing apparatus according to Embodiment 3 of the present invention
FIG. 16: Block diagram of the semiconductor integrated circuit according to Embodiment 3 of the present invention
FIG. 17: Block diagram of a data processing apparatus in the prior art

Explanation of symbols

1 Data processing apparatus
2, 3, 4, 5 Arithmetic unit
6, 7, 8 Memory
9 Control unit
10, 12, 14 First bank
11, 13, 15 Second bank
100 Data processing apparatus
101 Input memory
102 Motion detection unit
103 DCT/quantization unit
104 DCAC prediction unit
105 VLC unit
106 Inverse quantization/inverse DCT unit
107 Motion compensation/reconstruction unit
108, 109, 110, 111 Memory
112, 113 Output memory
130 Host CPU
140 External memory
160 Semiconductor integrated circuit
161 CPU
162 External input/output control circuit
163 Serial input/output control circuit
164 Video input/output control circuit
165 Audio input/output control circuit
166 Memory card input/output control circuit
167 Memory control circuit
168 Camera
169 LCD
170 Microphone
171 Speaker
172 Memory card
173 DRAM

Claims

A data processing device comprising:
a plurality of arithmetic units that are connected in series and that each perform an assigned operation within a unit cycle;
a plurality of memories connected between the respective arithmetic units; and
a control unit that selects, from among the plurality of arithmetic units, the arithmetic units that perform their assigned operations within a given unit cycle,
wherein the control unit calculates an input interval for data input to the first-stage arithmetic unit of the plurality of arithmetic units, based on allowable time information that is notified from outside, that relates to the total time required for the operations performed by the plurality of arithmetic units, and that represents an allowable time permitted for the processing.

The data processing device according to claim 1, wherein the allowable time information is expressed as a multiple of an average processing time, and
the control unit selects the arithmetic units so that the operations are performed at a value that does not exceed the allowable time information and that is an integer multiple of the average processing time.

The data processing device according to claim 1, wherein the control unit determines, based on the input interval, which of the plurality of arithmetic units perform operations within a given unit cycle.

The data processing device according to claim 1, wherein the control unit identifies, among the plurality of arithmetic units, an arithmetic unit that does not need to perform an operation within a given unit cycle, and performs at least one of:
a first process of cutting off the power supply to the identified arithmetic unit;
a second process of stopping the supply of the clock signal to the identified arithmetic unit;
a third process of reducing the frequency of the clock signal supplied to the identified arithmetic unit; and
a fourth process of increasing the threshold voltage of the identified arithmetic unit.

The data processing device according to claim 1, wherein the plurality of memories include a first memory and a second memory, the plurality of arithmetic units include a first arithmetic unit pair, consisting of the arithmetic unit connected to the preceding stage of the first memory and the arithmetic unit connected to its subsequent stage, and a second arithmetic unit pair, consisting of the arithmetic unit connected to the preceding stage of the second memory and the arithmetic unit connected to its subsequent stage, and, when the first arithmetic unit pair and the second arithmetic unit pair perform operations exclusively within a given unit cycle, the first arithmetic unit pair and the second arithmetic unit pair share one of the first memory and the second memory.

The data processing device according to claim 5, wherein, when one of the first memory and the second memory is shared, the control unit performs at least one of:
a first process of cutting off the power supply to the memory that is not shared;
a second process of stopping the supply of the clock signal to the memory that is not shared;
a third process of reducing the frequency of the clock signal supplied to the memory that is not shared; and
a fourth process of increasing the threshold voltage of the memory that is not shared.

A semiconductor integrated circuit comprising:
a plurality of arithmetic units that are connected in series and that each perform an assigned operation within a unit cycle;
a plurality of memories connected between the respective arithmetic units; and
a control unit that selects, from among the plurality of arithmetic units, the arithmetic units that perform their assigned operations within a given unit cycle,
wherein the control unit calculates an input interval for data input to the first-stage arithmetic unit of the plurality of arithmetic units, based on allowable time information that is notified from outside, that relates to the total time required for the operations performed by the plurality of arithmetic units, and that represents an allowable time permitted for the processing.