KR20090053129A - Arithmetic and logic unit of 3-dimentional graphics shader, method for determining branch condition and recording medium recorded program executing it - Google Patents
- Publication number
- KR20090053129A
- Authority
- KR
- South Korea
- Prior art keywords
- data
- result
- source
- branch
- field
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/57—Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
- G06F7/575—Basic arithmetic logic units, i.e. devices selectable to perform either addition, subtraction or one of several logical operations, using, at least partially, the same circuitry
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3005—Arrangements for executing specific machine instructions to perform operations for flow control
- G06F9/30058—Conditional branch instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
Landscapes
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Computing Systems (AREA)
- Computer Graphics (AREA)
- Advance Control (AREA)
- Executing Machine-Instructions (AREA)
Abstract
The invention relates to an arithmetic logic device for a three-dimensional graphics shader that supports branch instructions. More specifically, in an arithmetic logic device of a three-dimensional graphics shader that reads a shader instruction and performs the operation corresponding to that instruction, the device receives first source data, second source data, first Boolean data, and second Boolean data as inputs. It outputs one of the source comparison result (the result of a comparison operation between the first source data and the second source data), the first Boolean data, and the second Boolean data as the final branch condition determination result, and transmits that result to a branch control unit. This enables support for Direct3D shader model 3.0 and OpenGL ES 2.0, including branch instructions.
Shader, branch, arithmetic logic unit, judgment
Description
The present invention relates to a three-dimensional graphics shader, and more particularly to an arithmetic logic device of a three-dimensional graphics shader including a branch instruction.
The field of real-time three-dimensional graphics is developing very quickly as hardware improves and usage increases. Three-dimensional graphics performs complex operations on a large amount of data. Transferring functions that were previously implemented on the CPU to graphics hardware improves performance and lets the CPU concentrate on tasks other than graphics.
In general, a graphics processing unit (hereinafter referred to as a GPU), which is dedicated to processing graphics operations, is used to accelerate 3D graphics. The GPU is composed of various functional acceleration units such as a fixed-pipeline transform and lighting processor, a rasterizer, and a shader.
Among these, the shader provides the user with an environment in which to program the operations to be performed on the GPU. In general, rendering refers to the process of adding realism to computer graphics by incorporating three-dimensional qualities such as shadows or changes in color and density, and shaders add the relationship between objects and light to this process, which makes it possible to handle various effects. Enabling the shader option, together with texture acceleration, therefore yields a more realistic graphical image.
Shader functionality, high-level language, assembly language, and structural features are defined in the 3D graphics programming interface standards Direct3D and OpenGL (Open Graphics Library). In Direct3D, Vertex Shader Model 2.0 and Pixel Shader Model 2.x support the static flow control function, and the dynamic flow control function is supported from Vertex Shader Model 2.x and Pixel Shader Model 3.0 onward.
The flow control function allows branching based on the results of various condition determinations. This can significantly reduce the amount of computation for shaders that repeatedly apply the same instructions to large data sets. A shader that does not support branching runs the same program for all data, so it must also execute operations whose results are not needed. For example, among a number of vertices, some may need lighting calculations based on their distance to the light source, while others may not. In this case, the flow control instruction if can be used to light only some vertices according to a comparison result. In shader versions that do not support flow control, however, either the unnecessary operations are executed as they are or other, more complex methods must be used to obtain the same result.
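The lighting example above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `light_vertices`, the attenuation formula, and the evaluation counter are all hypothetical, and the code only shows that a branch produces the same output while skipping the lighting work for out-of-range vertices.

```python
import math


def light_vertices(vertices, light_pos, max_dist, use_branch):
    """Return (intensities, number of lighting evaluations).

    Hypothetical model: with use_branch=True, vertices farther than
    max_dist skip the lighting computation; without it, the computation
    runs for every vertex and the unneeded result is discarded.
    """
    evaluations = 0
    intensities = []
    for vertex in vertices:
        dist = math.dist(vertex, light_pos)
        in_range = dist <= max_dist
        if use_branch and not in_range:
            # Branch taken: no lighting work is done for this vertex.
            intensities.append(0.0)
            continue
        evaluations += 1
        intensity = 1.0 / (1.0 + dist * dist)  # assumed attenuation formula
        # Without branching, the result is computed and then masked out.
        intensities.append(intensity if in_range else 0.0)
    return intensities, evaluations
```

Calling the function twice on the same vertex list, once with `use_branch=True` and once with `use_branch=False`, returns identical intensities but a smaller evaluation count in the branching case.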
Adding flow control to the shader reduces the amount of shader code that must actually be executed, improving overall graphics acceleration. Shaders that do not support flow control have relatively low program flexibility and, depending on the program, relatively low performance.
The present invention provides an arithmetic logic device for a three-dimensional graphics shader that can support Direct3D shader model 3.0 and OpenGL ES 2.0, including branch instructions.
According to an aspect of the present invention, there is provided an arithmetic logic device of a three-dimensional graphics shader that reads a shader instruction and performs an operation corresponding to the shader instruction, wherein the arithmetic logic device receives first source data, second source data, first Boolean data, and second Boolean data as inputs, finally outputs, as a branch condition determination result, one of a source comparison result (the result of a comparison operation between the first source data and the second source data), the first Boolean data, and the second Boolean data, and transmits the result to a branch control unit.
Here, the first Boolean data may be any one of Boolean values stored in a component of a predicate register.
In addition, the second Boolean data may be any one of the values stored in a Boolean register.
Meanwhile, the arithmetic logic device may include: a floating point operation unit that performs a floating point basic operation on the first source data and the second source data to output a floating point operation result, or performs a floating point comparison operation to output a floating point comparison result; an integer operation unit that performs an integer basic operation on the first source data and the second source data to output an integer operation result, or performs an integer comparison operation to output an integer comparison result; a first MUX that outputs one of the floating point operation result and the integer operation result as basic output data; a second MUX that outputs one of the floating point comparison result and the integer comparison result as the source comparison result; and a third MUX that outputs one of the first Boolean data and the second Boolean data as a final branch determination result.
The shader instruction may include a micro operation index field, a destination field, a first source data field, and a second source data field. When the data of the micro operation index field relates to a branch instruction, the destination field may be interpreted as a destination address field, an inversion field, a source selection field, an address field, a component selection field, a comparison option field, and an integer field.
The destination address field may indicate the address to branch to according to the final branch determination result. The inversion field may store a condition indicating whether to invert the final branch determination result. The source selection field may store information for selecting which of the first Boolean data, the second Boolean data, and the source comparison result is used for branch condition determination.
In another aspect of the present invention, there is provided a method for determining a branch condition in an arithmetic logic device of a three-dimensional graphics shader, the method comprising: reading a shader instruction; determining whether the shader instruction is a branch instruction; performing a basic operation according to the shader instruction when the shader instruction is not a branch instruction, and identifying the source of the branch condition determination when the shader instruction is a branch instruction; and outputting, according to the identification result, one of the first Boolean data, the second Boolean data, and the source comparison result as a branch condition determination result, which is the result of executing the branch instruction.
After the identifying step, the method may further include: selecting one of the components of the predicate register when the identification result indicates that the predicate register is used; and outputting the Boolean value of the selected component as the first Boolean data.
After the identifying step, the method may further include: checking the address field of the branch instruction when the identification result indicates that the Boolean register is used; and outputting the Boolean value of the Boolean register corresponding to the address stored in the address field as the second Boolean data.
After the identifying step, the method may further include: outputting the result of a floating point comparison operation and the result of an integer comparison operation on the two source data; and outputting one of the floating point comparison result and the integer comparison result as the source comparison result.
Meanwhile, the branch condition determination method may be performed by a computer, and a program for executing it on the computer may be recorded on a computer-readable recording medium.
Other aspects, features, and advantages other than those described above will become apparent from the following drawings, claims, and detailed description of the invention.
The arithmetic logic unit of the 3D graphics shader according to the present invention may support Direct3D shader model 3.0 and OpenGL ES 2.0 including branch instructions.
In addition, the flow control function makes it possible to branch according to the results of various condition determinations, which can drastically reduce the amount of shader computation in three-dimensional graphics workloads that perform complex operations on large amounts of data. This increases the acceleration of 3D graphics and reduces the power consumed by the shader.
In addition, branch condition determination using the shader's arithmetic logic unit (ALU stage 0) enables flow control without additional data paths or computational units.
In addition, by using an instruction to interpret some fields differently according to operations, the usage efficiency of the instruction field may be increased. This minimizes the size of the instruction, saving space for storing instruction code.
As the invention allows for various changes and numerous embodiments, particular embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all transformations, equivalents, and substitutes included in its spirit and scope. In the following description, detailed descriptions of related known technology are omitted where they may obscure the gist of the present invention.
Terms such as first and second may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting of the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, the terms "comprise" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification is present, and it is to be understood that they do not exclude in advance the possibility of the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.
FIG. 1 shows an arithmetic logic device of a three-dimensional graphics shader according to an embodiment of the present invention, FIG. 2 is a diagram illustrating a predicate register and output data for branch determination, and FIG. 3 is a diagram illustrating a Boolean register and output data for branch determination.
The arithmetic and logic unit (ALU) stage 0 (ALU) 100 of the 3D graphics shader performs branch condition determination when a current instruction is a branch instruction. An instruction based processor such as the
The arithmetic logic unit ALU of the 3D graphics shader has a three-stage structure, and the
The branching function of the 3D graphics shader may be divided into a branch using the
The
The
The
The
If the micro operation is not a branch instruction, the basic output data (ALU_stage0_output) is output.
When the micro operation is a branch instruction, a comparison operation on the first source data and the second source data is performed according to the comparison option. One of the comparison operation result, the first Boolean data, and the second Boolean data is selected as the final branch determination result according to the source selection value. The final branch determination result (Comparison Result) is 1-bit Boolean data. This value is passed to the subsequent branch control unit, which receives it and generates and outputs a control signal so that no further processing is performed in each unit.
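The selection just described can be modeled as three multiplexers in a short Python sketch. This is an illustration under stated assumptions, not the patented circuit: the names `mux` and `alu_stage0` are hypothetical, the third MUX is modeled with three inputs indexed by the source selection value ('00', '01', '10' as 0, 1, 2), and reusing the integer flag to drive the first MUX is an assumption made only to keep the example small.

```python
import operator


def mux(select, *inputs):
    """Return the input chosen by the select index."""
    return inputs[select]


def alu_stage0(src0, src1, first_bool, second_bool, *, use_integer,
               source_select, basic_op=operator.add, compare_op=operator.gt):
    """Model of ALU stage 0: returns (basic output, 1-bit comparison result)."""
    # Both units compute in parallel; the MUXes pick what is forwarded.
    float_result = basic_op(float(src0), float(src1))
    int_result = basic_op(int(src0), int(src1))
    float_cmp = bool(compare_op(float(src0), float(src1)))
    int_cmp = bool(compare_op(int(src0), int(src1)))

    unit = 1 if use_integer else 0
    basic_output = mux(unit, float_result, int_result)   # first MUX
    source_cmp = mux(unit, float_cmp, int_cmp)           # second MUX
    # Third MUX: 0 = first Boolean data, 1 = second Boolean data,
    # 2 = source comparison result (assumed three-input form).
    final = mux(source_select, first_bool, second_bool, source_cmp)
    return basic_output, int(bool(final))
```

With source data 2.5 and 2.0 and a greater-than comparison, the floating point path reports 1 while the integer path (2 versus 2) reports 0, which shows why the choice between the two comparison results matters.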
A process of performing branch condition determination in the
The first block includes a floating point
If the microoperation is not a branch instruction, the floating point
When the micro operation is a branch instruction, the floating point
The comparison operation may be performed as any one of the following eight, depending on the comparison option.
(1) Greater than (>)
(2) Greater than or equal to (≥)
(3) Less than (<)
(4) Less than or equal to (≤)
(5) Equal to (=)
(6) Not equal to (≠)
(7) Any: The output of the comparison result is fixed to '0' regardless of the comparison result of the source data.
(8) All: The output of the comparison result is fixed to '1' regardless of the comparison result of the source data.
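The eight options listed above can be written as a small lookup table. The sketch below is illustrative only: the option keys such as `"gt"` and `"any"` and the function name `compare_sources` are hypothetical, since the source does not give the encoding of the comparison option field. The fixed outputs for Any and All follow the text as stated.

```python
import operator

# Hypothetical option names; the patent does not give the bit encoding.
COMPARE_OPTIONS = {
    "gt": operator.gt,            # (1) greater than
    "ge": operator.ge,            # (2) greater than or equal to
    "lt": operator.lt,            # (3) less than
    "le": operator.le,            # (4) less than or equal to
    "eq": operator.eq,            # (5) equal to
    "ne": operator.ne,            # (6) not equal to
    "any": lambda a, b: False,    # (7) output fixed to 0, as stated in the text
    "all": lambda a, b: True,     # (8) output fixed to 1, as stated in the text
}


def compare_sources(option, src0, src1):
    """Return the 1-bit comparison result for the given option."""
    return 1 if COMPARE_OPTIONS[option](src0, src1) else 0
```

For example, `compare_sources("ge", 3, 3)` yields 1, while `compare_sources("any", 3, 3)` yields 0 regardless of the operands.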
The second block includes a
The
The
The third block includes a
In an embodiment of the present invention, branch condition determination is performed using an arithmetic logic unit (ALU stage 0) of a 3D graphics shader, thereby supporting a flow control function without additional data paths or calculation units.
FIG. 5 is a diagram illustrating a field configuration of a shader instruction according to an embodiment of the present invention, and FIG. 6 is a detail table of a branch operation field according to an embodiment of the present invention.
The shader instruction generally includes a micro operation index (MO Index) field, a destination field, a first source data (Source # 0) field, and a second source data (Source # 1) field.
When the micro operation index field indicates a flow-control-related operation, that is, a branch instruction, the destination field of the shader instruction is interpreted as branch instruction information. The first source data field and the second source data field hold the two source data to be compared, or related information.
The destination field of the shader instruction is interpreted as shown in FIG.
The target address indicates the address of the branch destination. If the branch condition is judged to be taken, execution branches to that address. Here, taken means the case where the Comparison Result is 1 but, if necessary, may instead mean the case where it is 0.
The final branch determination result (Comparison Result) output through the
The Source Selection (ss) field represents the source of the branch condition determination. It indicates whether the predicate register, the Boolean register, or the comparison result of the two source data is used to determine the branch condition.
Depending on the value of the source selection (ss) field, the values of the Address field, the Component Selection (cs) field, the Comparison option (cmp option) field, and the Integer (I) field are used.
The data in the offset field (here, the target address field) is always valid regardless of the value of the source selection (ss) field. If the value of the source selection (ss) field is '00', the component selection (cs) field is valid. If it is '01', the Address field is valid. If it is '10', the cmp option field and the integer (I) field are valid.
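The validity rules above amount to a lookup keyed by the ss value, sketched below. The field identifiers and the function name `valid_fields` are hypothetical. The text does not say what happens for ss = '11', so rejecting it here is an assumption rather than documented behavior.

```python
# Fields that carry meaningful data for each source selection (ss) value.
# The target address (offset) field is valid in every case.
VALID_FIELDS = {
    0b00: ("target_address", "component_selection"),    # predicate register
    0b01: ("target_address", "address"),                 # Boolean register
    0b10: ("target_address", "cmp_option", "integer"),   # source comparison
}


def valid_fields(ss):
    """Return the names of the branch fields that are valid for this ss value."""
    if ss not in VALID_FIELDS:
        # Assumption: ss = '11' is not described in the text, so reject it.
        raise ValueError(f"unsupported source selection value: {ss:02b}")
    return VALID_FIELDS[ss]
```

Because only one group of fields is meaningful at a time, the same instruction bits can be shared between them, which is the space saving the description attributes to this encoding.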
The Address field is used as the address of the
The component selection (cs) field is used to select a component of the
If the value of the source selection (ss) field is '10', the integer (I) field is used to select the result of the comparison operation. The comparison operation of the source data is performed simultaneously in the floating point
As described above, by using a shader instruction that interprets some fields differently according to micro operations, the usage efficiency of the instruction field may be increased. This minimizes the size of the instruction, saving space for storing instruction code.
FIG. 7 is a flowchart illustrating a branch condition determination method in an arithmetic logic device according to an embodiment of the present invention.
The arithmetic logic device reads the shader instruction (step S700).
The data of the micro operation index field of the shader instruction is checked to determine whether the currently read shader instruction is a branch instruction (step S710). If the instruction is not a branch instruction, a basic operation is performed according to the corresponding shader instruction (step S715).
If the instruction is a branch instruction, the destination field of the shader instruction is interpreted as shown in FIG. 5. The source of the branch condition determination is identified from the data of the source selection field (step S720).
If the data of the source selection field is '00', the branch condition is determined using the data of the predicate register. A component of the predicate register is selected according to the data of the component selection field (step S730), and the value of the selected component is output as the first Boolean data (step S735).
If the data of the source selection field is '01', the branch condition is determined using the data of the Boolean register. One Boolean value among the Boolean registers is selected according to the data of the address field (step S740), and the selected Boolean value is output as the second Boolean data (step S745).
If the data of the source selection field is '10', the branch condition is determined using the comparison result of the two source data. The floating point operation unit and the integer operation unit each compare the two source data and output their results (step S750). One of the floating point comparison result and the integer comparison result is selected using the integer field and output as the source comparison result (step S755).
One of the first Boolean data, the second Boolean data, and the source comparison result is finally output as the branch condition determination result (step S760) and is transmitted to the branch control unit.
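Steps S710 to S760 can be traced in a compact Python sketch. It is a software illustration under assumptions, not the hardware itself: the `ShaderInstruction` dataclass, its attribute names, and the representation of the predicate and Boolean registers as Python sequences are all hypothetical, the comparison defaults to greater-than for brevity, and the inversion field is left out because the flowchart steps do not mention it.

```python
import operator
from dataclasses import dataclass


@dataclass
class ShaderInstruction:
    """Hypothetical decoded instruction; field names are illustrative."""
    is_branch: bool
    source_select: int = 0       # 0b00, 0b01, or 0b10
    component_select: int = 0    # index into the predicate register
    address: int = 0             # index into the Boolean registers
    use_integer: bool = False    # integer (I) field
    src0: float = 0.0
    src1: float = 0.0


def determine_branch_condition(instr, predicate_reg, bool_regs,
                               compare=operator.gt):
    """Return the 1-bit branch condition result, or None for non-branch ops."""
    if not instr.is_branch:                              # S710
        return None                                      # S715 happens elsewhere
    if instr.source_select == 0b00:                      # S720
        result = predicate_reg[instr.component_select]   # S730, S735
    elif instr.source_select == 0b01:
        result = bool_regs[instr.address]                # S740, S745
    elif instr.source_select == 0b10:
        float_cmp = compare(float(instr.src0), float(instr.src1))  # S750
        int_cmp = compare(int(instr.src0), int(instr.src1))
        result = int_cmp if instr.use_integer else float_cmp       # S755
    else:
        raise ValueError("source selection value not described in the text")
    return 1 if result else 0                            # S760
```

The returned value stands in for the Comparison Result that the description says is passed on to the branch control unit.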
In addition, the above-mentioned branch condition determination method can be implemented as a computer program. The codes and code segments constituting the program can be easily deduced by computer programmers in the art. The program is stored in a computer-readable medium and is read and executed by a computer to implement the branch condition determination method. The information storage medium includes a magnetic recording medium, an optical recording medium, and a carrier wave medium.
The present invention has been described above based on embodiments, but those skilled in the art will understand that various modifications and changes can be made to the present invention without departing from the spirit and scope of the present invention as set forth in the claims below.
FIG. 1 shows an arithmetic logic device for a three-dimensional graphics shader in accordance with one embodiment of the present invention.
FIG. 2 illustrates a predicate register and output data for branch determination.
FIG. 3 illustrates a Boolean register and output data for branch determination.
FIG. 4 is a block diagram of the arithmetic logic device shown in FIG.
FIG. 5 is a diagram illustrating a field configuration of a shader instruction according to an embodiment of the present invention.
FIG. 6 is a table of details of a branch instruction field according to an embodiment of the present invention.
FIG. 7 is a flowchart of a branch condition determination method in an arithmetic logic device according to an embodiment of the present invention.
<Description of the symbols for the main parts of the drawings>
100: Arithmetic Logic Device
112: floating point type calculation unit 114: integer type calculation unit
122, 124, 132: MUX
Claims (13)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020070119802A KR20090053129A (en) | 2007-11-22 | 2007-11-22 | Arithmetic and logic unit of 3-dimentional graphics shader, method for determining branch condition and recording medium recorded program executing it |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020070119802A KR20090053129A (en) | 2007-11-22 | 2007-11-22 | Arithmetic and logic unit of 3-dimentional graphics shader, method for determining branch condition and recording medium recorded program executing it |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20090053129A true KR20090053129A (en) | 2009-05-27 |
Family
ID=40860692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020070119802A KR20090053129A (en) | 2007-11-22 | 2007-11-22 | Arithmetic and logic unit of 3-dimentional graphics shader, method for determining branch condition and recording medium recorded program executing it |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20090053129A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190076807A (en) * | 2017-12-22 | 2019-07-02 | 한국기술교육대학교 산학협력단 | Method for vertex optimization using depth image in workspace modeling and system thereof |
-
2007
- 2007-11-22 KR KR1020070119802A patent/KR20090053129A/en not_active Application Discontinuation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5242771B2 (en) | Programmable streaming processor with mixed precision instruction execution | |
US10430912B2 (en) | Dynamic shader instruction nullification for graphics processing | |
TWI512669B (en) | Compiling for programmable culling unit | |
KR102459322B1 (en) | Primitive culling using compute shaders that compile automatically | |
US8289341B2 (en) | Texture sampling | |
JP7253488B2 (en) | Composite world-space pipeline shader stage | |
US9799094B1 (en) | Per-instance preamble for graphics processing | |
CN101802874A (en) | Fragment shader bypass in a graphics processing unit, and apparatus and method thereof | |
KR101973924B1 (en) | Per-shader preamble for graphics processing | |
KR101941832B1 (en) | Uniform predicates in shaders for graphics processing units | |
US10643369B2 (en) | Compiler-assisted techniques for memory use reduction in graphics pipeline | |
CN106575440B (en) | Constant buffer size multi-sample anti-aliasing depth compression | |
TWI720981B (en) | Method, apparatus, and non-transitory computer readable medium for handling instructions that require adding results of a plurality of multiplications | |
US11080927B2 (en) | Method and apparatus of cross shader compilation | |
KR20230010672A (en) | Data compression method and apparatus | |
KR20090053129A (en) | Arithmetic and logic unit of 3-dimentional graphics shader, method for determining branch condition and recording medium recorded program executing it | |
KR20090077432A (en) | Method of processing opengl programmable shader by using off-line compiling | |
US20200004533A1 (en) | High performance expression evaluator unit | |
KR20090075521A (en) | Shader processor and method for carrying out operations | |
Krause | A shader unit | |
KR20090075530A (en) | Shader processor, method for processing dual phase instruction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WITN | Withdrawal due to no request for examination |