CN115280368A - Filtering for rendering - Google Patents

Publication number
CN115280368A
Authority
CN
China
Prior art keywords
texture
sub
pixels
rendering
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080098477.1A
Other languages
Chinese (zh)
Inventor
刘保权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/50 Lighting effects
    • G06T15/60 Shadow generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2215/00 Indexing scheme for image rendering
    • G06T2215/12 Shadow map, environment map

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Generation (AREA)

Abstract

A graphics processing system for rendering an image having a plurality of pixels, wherein the system is configured to: for each of at least one of the plurality of pixels: determining a plurality of texels corresponding to respective pixels, each texel including a sub-pixel data point indicative of a texture sample value in a texture space; and rendering each pixel by applying a filter function to the sub-pixel data points in dependence on the sample position x to form a filtered value, wherein x is the fractional position of the texture coordinate; wherein the filtering function is performed by summing outputs of a linear interpolation function implemented by the graphics processing system with respect to sub-pixel data points at a plurality of locations equidistant from x.

Description

Filtering for rendering
Technical Field
The present invention relates to filtering during rendering, for example for rendering soft shadows in a graphics processing unit.
Background
The rendering of shadows is important to video games because the presence of shadows not only can increase the realism of the rendered image, but can also indicate to the viewer the distance and location between objects.
A simple shadow mapping algorithm (called hard shadow) does not use any filtering algorithm (i.e. no weighting calculation), but rather extracts a nearest texture sample from the shadow map texture to test whether the currently rendered fragment is within the shadow region. However, this may result in poor image quality, as shown at 101 in fig. 1, which is not realistic.
When rendering shadows in modern games, filtering shadow map samples is important to reduce jaggy artifacts. In general, there is a need to reduce aliasing due to undersampling of geometry rasterized into a shadow map, but this may also be useful in cases where the shadow map itself is undersampled by the pixel shader.
However, rendering high quality soft shadows using high quality filtering algorithms involving a large number of texture sample taps is very difficult, especially for modern mobile Graphics Processing Units (GPUs).
Shadow algorithms found in modern game engines (e.g., Unreal Engine and the Unity 3D engine) typically require texture sampling in an N × N area of the shadow map texture to obtain N × N texel data samples, which are then filtered to obtain high quality soft shadow values for rendering the final pixel in the pixel shader.
One common filtering technique is Percentage Closer Filtering (PCF), which applies multiple sample taps to the shadow map texture, then performs a shadow test on each sample, and finally filters the results of all tests.
To achieve the simplest soft shadow with smooth light transitions across pixels, at least four visibility sample taps, e.g. a 2 × 2 grid of adjacent shadow map texels, must be fetched and weighted using the sub-texel coordinate offsets as weights to generate a smooth light transition. This is called 2 × 2 bilinear PCF filtering.
Fig. 2a shows the improved rendering results when soft shadows are generated using a 2 × 2 bilinear PCF, as shown at 201. As described above, the shadow mapping algorithm fetches four texture samples from the shadow map texture in a 2 × 2 neighborhood around the target UV coordinates, shown as (s, t) in fig. 2b. A 2 × 2 bilinear weighted filtering is then performed to calculate the percentage to which the currently rendered fragment is in shadow, based on whether the surface point is closer to the light than the stored occluder and therefore not in shadow. However, the resulting image is still not realistic and exhibits saw-tooth aliasing.
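The difference between the unfiltered (hard shadow) lookup and 2 × 2 bilinear PCF can be sketched in a few lines of Python. This is only an illustrative CPU-side model, not shader code from the application: the names `shadow_test`, `hard_shadow`, `pcf_2x2` and the `bias` parameter are hypothetical, the shadow map is assumed to be a nested list of depths indexed `[row][column]`, texel k is assumed to cover the interval [k, k+1) with its centre at k + 0.5, and no border handling is included.

```python
import math


def shadow_test(stored_depth, fragment_depth, bias=1e-3):
    """Return 1.0 if the fragment is lit, 0.0 if it is in shadow.

    The fragment is lit when it is not farther from the light than the
    occluder depth stored in the shadow map (bias is an assumed tolerance).
    """
    return 1.0 if fragment_depth <= stored_depth + bias else 0.0


def hard_shadow(shadow_map, u, v, fragment_depth):
    """Unfiltered lookup: test only the texel that contains (u, v)."""
    i, j = math.floor(u), math.floor(v)
    return shadow_test(shadow_map[j][i], fragment_depth)


def pcf_2x2(shadow_map, u, v, fragment_depth):
    """2 x 2 bilinear PCF: test four texels, then blend the test results."""
    # Shift by half a texel so that (i, j) is the lower-left texel of the
    # 2 x 2 neighbourhood and (a, b) are the sub-texel offsets used as weights.
    qu, qv = u - 0.5, v - 0.5
    i, j = math.floor(qu), math.floor(qv)
    a, b = qu - i, qv - j
    s00 = shadow_test(shadow_map[j][i], fragment_depth)
    s10 = shadow_test(shadow_map[j][i + 1], fragment_depth)
    s01 = shadow_test(shadow_map[j + 1][i], fragment_depth)
    s11 = shadow_test(shadow_map[j + 1][i + 1], fragment_depth)
    row0 = (1.0 - a) * s00 + a * s10
    row1 = (1.0 - a) * s01 + a * s11
    return (1.0 - b) * row0 + b * row1


# Left column holds a near occluder (depth 0.2), right column is far (0.9).
depth_map = [[0.2, 0.9],
             [0.2, 0.9]]
soft = pcf_2x2(depth_map, 0.75, 1.0, 0.5)   # partially lit value between 0 and 1
hard = hard_shadow(depth_map, 0.9, 0.5, 0.5)  # either 0.0 or 1.0, never in between
```

The hard lookup can only return 0.0 or 1.0, whereas the PCF result varies continuously with the sub-texel offsets, which is what produces the softer transition.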
Fig. 3a shows the rendering results when a 3 × 3 bilinear PCF is used. Many modern games use 3 × 3 bilinear PCF filtering, fetching nine texture samples in a 3 × 3 neighborhood around the target UV coordinates, shown as (s, t) in fig. 3b. A 3 × 3 bilinear weighting calculation is then performed on the nine samples in the fragment shader to obtain a smooth filtered shadow. A triangular bilinear filter kernel is used for the filter weights. As can be seen from the shadow area indicated at 301, the resulting rendered image quality is improved and more natural than with 2 × 2 PCF filtering. Fig. 4 shows an enlarged view of the area 301 shown in fig. 3a of the image processed using a 3 × 3 bilinear PCF.
Fig. 5 further illustrates a 3 x 3PCF filtering algorithm that samples a 3 x 3 grid of texels around the target UV coordinate 500. The weight of each of the nine samples is calculated as the coverage area in each texel unit by a box 501, the box 501 being enclosed by sub-texel offsets [ α, β ] along the X-axis and Y-axis, respectively. The filter weights for the nine samples are calculated as the percentage coverage in texel units per sample covered by the 2D (two dimensional) sample footprint (as indicated by block 501), i.e. the weights are the covered 2D area calculated for each texel unit.
For a 3 × 3 bilinear PCF, the weight vectors along the X-axis and Y-axis are A = [1-α, 1, α] and C = [1-β, 1, β], respectively. The total area of the 2D sample footprint (as shown in block 501) is 2 × 2 = 4, so the final result is normalized by dividing by 4. Here, α and β are the sub-texel offsets along the two axes in texel space.
For a 3 × 3 PCF with bilinear filtering in two dimensions, the 3 × 3 data sample taps form the matrix B, given by:
$$B = \begin{bmatrix} f_{-1,-1} & f_{-1,0} & f_{-1,1} \\ f_{0,-1} & f_{0,0} & f_{0,1} \\ f_{1,-1} & f_{1,0} & f_{1,1} \end{bmatrix} \tag{1}$$
The footprint of the filter kernel is 2 × 2, so the result can then be normalized by dividing it by 4. In matrix representation, the 3 × 3 bilinear PCF equation can be expressed as shown in equation (2) below. Here, for ease of derivation, α and β are represented by x and y, respectively, i.e., x and y are the sub-texel offsets in texel space along the X and Y axes, respectively.
$$f_{\mathrm{pcf}}(x, y) = A\,B\,C^{\mathsf{T}} = \begin{bmatrix} 1-x & 1 & x \end{bmatrix} B \begin{bmatrix} 1-y \\ 1 \\ y \end{bmatrix} \tag{2}$$
In addition, A and C must be computed dynamically because the sub-texel offset for each target pixel is not fixed. Therefore, all weights must be computed dynamically (the weight of each sample is unique) because the shadow texels are not located at a fixed distance from the sampling location, i.e. these are dynamic sub-texel offsets, which are used to compute the bilinear weights for the nine samples. This achieves a continuous transition of the filtering results across pixels, since a non-continuous PCF (e.g., a uniform PCF where the same weight is used for all nine samples) is not desirable.
In many modern gaming engines on mobile platforms, shadow filtered 3 x 3PCF is computed directly on the 3 x 3 data samples to obtain final high quality filtered soft shadow values.
Although a large kernel PCF may produce very good image quality, as described above, it requires many texture fetch instructions to sample the data, which consumes a large amount of data bandwidth. It also requires many weighting calculations to perform weighted filtering (i.e., many ALU instructions) to color each individual fragment. This can be a heavy burden on the memory and texture bandwidth and computational power of the mobile GPU.
Mobile devices require real-time rendering performance, with high frame rate and low latency interaction. Also, when devices are held for long periods of time, they require low power consumption to extend battery life and low heat dissipation to improve user comfort. These requirements may not be fulfilled when higher order filtering algorithms are used for the rendering of high quality soft shadows.
Thus, while 3 × 3 PCF filtering allows for smooth shadow transitions, it is very time consuming because it requires nine texture fetch instructions (as shown in matrix B) into the shadow map texture, whose results then need to be filtered using a weighted calculation in the X and Y directions in a 3 × 3 texel grid. A single filtering operation involves twelve floating-point multiplications and eight additions, as shown in equation (2), where the two vectors A and C each have three weights and matrix B has nine data samples (in a 3 × 3 texel grid).
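The operation count quoted above can be made concrete with a small Python sketch of the conventional weighted sum. The function name `pcf_3x3_conventional` and the data layout are assumptions made for illustration only: `B[i][j]` is taken to hold the shadow-test result whose X index is i and Y index is j, and `alpha`, `beta` are the sub-texel offsets.

```python
def pcf_3x3_conventional(B, alpha, beta):
    """Conventional 3 x 3 bilinear PCF as a weighted sum A * B * C, divided by 4.

    B is a 3 x 3 nested list of already-tested samples (nine texture taps).
    """
    A = [1.0 - alpha, 1.0, alpha]   # weights along X
    C = [1.0 - beta, 1.0, beta]     # weights along Y

    # First product B * C: 3 rows x (3 multiplications + 2 additions)
    # = 9 multiplications and 6 additions.
    inner = [B[i][0] * C[0] + B[i][1] * C[1] + B[i][2] * C[2] for i in range(3)]

    # Second product A * inner: 3 multiplications and 2 additions,
    # giving 12 multiplications and 8 additions in total.
    total = A[0] * inner[0] + A[1] * inner[1] + A[2] * inner[2]

    # Normalise by the 2 x 2 = 4 footprint area.
    return total / 4.0


# Samples with X index -1 are in shadow (0.0); the rest are lit (1.0).
samples = [[0.0, 0.0, 0.0],
           [1.0, 1.0, 1.0],
           [1.0, 1.0, 1.0]]
shadow_value = pcf_3x3_conventional(samples, 0.25, 0.5)
```

Because both weight vectors change with every fragment, none of these multiplications can be precomputed, which is the cost the remainder of the description sets out to avoid.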
In the shadow filtering method described in US 7106326 B2, the computing unit is configured to access data values from the memory and to perform a filtering operation (e.g. linear, bilinear, trilinear, cubic or bicubic filtering) on data values of a neighborhood (Np × Np), where Np × Np is the size of the neighborhood in texels. However, it requires Np × Np data sample taps and a very complex weighted filtering calculation based on multiple samples and dynamic weights.
The method described by Gruen in "Fast Conventional Shadow Filtering" (GPU Pro, A K Peters, 2010, pages 415-445) can achieve high quality PCF filtering with fewer texture fetches by applying dynamically shifted sample positions and additional post-texture weights to each newly sampled texture value. However, for triangular weighted filtering, it involves solving a complex linear system for each sample to compute the correct post-texture weights and the correct dynamic offsets that give the shifted texture sample coordinates for all new sample positions. The computations to solve the linear system are therefore complex and may consume too many ALU shader instructions for a mobile GPU.
Higher order filtering, such as bicubic filtering, is more complicated due to the higher number of complex weighted filtering computations, and the dynamic weights of all samples are also not easy to compute since this involves the computation of higher order polynomials.
It is desirable to develop a new method for rendering high quality soft shadows to overcome these problems.
Disclosure of Invention
According to a first aspect, there is provided a graphics processing system for rendering an image having a plurality of pixels, wherein the system is configured to: for each of at least one of the plurality of pixels: determining a plurality of texels corresponding to respective pixels, each texel comprising a sub-pixel data point indicative of a texture sample value in the texture space, and rendering the respective pixel by applying a filter function to the sub-pixel data point in dependence on a sampling position x to form a filter value, wherein x is the fractional position of the texture coordinate; wherein the filtering function is performed by summing outputs of a linear interpolation function implemented by the graphics processing system with respect to sub-pixel data points at a plurality of locations equidistant from x.
This may allow faster shadow filtering in 2D image applications and may produce smooth image quality results for soft shadows.
The plurality of locations may each be spaced from x by an offset of 0.5 times the spacing of the sub-pixel data points along each dimension in texture space. The offset may be +/-0.5. This may reduce the processing required to render the image.
The number of texels covered by each pixel along each dimension in texture space may be a multiple of three. The number of texels covered by each pixel along at least one of the X, Y and Z-direction in texture space may be a multiple of three.
The sum of the outputs of the two linear interpolation functions implemented by the graphics processing system at (x-0.5) and (x+0.5) may be approximately equal to the linearly weighted sum of a 3-tap percentage closer filter over the three function values at one-dimensional integer texture coordinate positions in texture space. Thus, after extending the 1D solution to 2D image applications, the system may allow faster 3 × 3 PCF shadow filtering.
The system may be configured to apply the filter function along a plurality of orthogonal directions in the texture space. The multiple locations may each be spaced from x along one, two, or three orthogonal directions in texture space. This may allow the method to be applied to one-, two-or three-dimensional PCF filtering.
The system may be configured to implement the filtering function in one of the GLSL, HLSL and SPIR-V languages or any other suitable shading language. Thus, the system is compatible with the filtering algorithms and languages used in many modern image processing systems and video games.
The system may be configured to implement the filtering function in a single instruction or fixed function hardware unit. This may reduce the processing cost required to render the image.
The system may include a texture cache configured to store the processed sampling results for reuse by neighboring pixels in the image. This may further reduce the processing required to render the image.
At least some of the plurality of pixels may represent shadows of objects in the image. The filter function may be a PCF filter function. The filter function may be a shadow filter function. The filtered value may be a filtered shadow value. This may allow faster shadow filtering in 2D image applications and may produce smooth image quality results for soft shadows.
According to a second aspect, there is provided a method for rendering an image having a plurality of pixels, at least some of the pixels representing shadows of objects in the image, wherein the method comprises: for each of at least one of the plurality of pixels: determining a plurality of texels corresponding to the respective pixels, each texel comprising a sub-pixel data point indicative of a texture sample value in the texture space, and rendering the respective pixel by applying a filter function to the sub-pixel data point in dependence on a sampling position x to form a filter shadow value, wherein x is the fractional position of the texture coordinate;
wherein the filter function is performed by summing outputs of a linear interpolation function implemented by the graphics processing system with respect to sub-pixel data points at a plurality of locations equidistant from x.
This approach may allow faster shadow filtering and may produce smooth image quality results for soft shadows.
The plurality of locations may each be spaced from x by an offset of 0.5 times the spacing of the sub-pixel data points in texture space. The offset may be +/-0.5. This may reduce the processing required to render the image.
The number of texels covered by each pixel may be a multiple of three. The number of texels covered by each pixel may be three along at least one of the X, Y and Z directions in texture space. Thus, the system can simplify 3 × 3 PCF shadow filtering in 2D.
The method may comprise applying a filter function along a plurality of orthogonal directions in the texture space. The multiple locations may each be spaced from x along one, two, or three orthogonal directions in texture space. This may allow the method to be applied to one-, two-or three-dimensional PCF filtering.
At least some of the plurality of pixels may represent shadows of objects in the image. The filter function may be a PCF filter function. The filter function may be a shadow filter function. The filtered value may be a filtered shadow value. This may allow for faster shadow filtering in 2D image applications and may produce smooth image quality results for soft shadows.
According to a third aspect, there is provided a computer program which, when executed by a computer, causes the computer to perform the method described above. The computer program may be provided on a non-transitory computer readable storage medium.
Drawings
The invention will now be described by way of example with reference to the accompanying drawings.
In the drawings:
Fig. 1 shows the rendering results of the hard shadow algorithm without any filtering.
Fig. 2a shows the rendering results when soft shadows are generated using 2 × 2 bilinear percentage closer filtering (PCF).
Fig. 2b shows the sampling of a 2 × 2 texel grid around the target UV coordinates.
Fig. 3a shows the rendering results when soft shadows are generated using 3 × 3 bilinear percentage closer filtering (PCF).
Fig. 3b shows the sampling of a 3 × 3 texel grid around the target UV coordinates.
Fig. 4 shows an enlarged view of the rendering result of fig. 3a using a 3 × 3 bilinear PCF.
Fig. 5 schematically illustrates a 3 × 3 PCF filtering algorithm (sampling a 3 × 3 grid of texels around the target UV coordinates). The weight of each of the nine samples is computed as the area covered in each texel unit by a box (i.e. the sample footprint) that is positioned by the sub-texel offsets [α, β] along the X-axis and Y-axis, respectively.
Fig. 6 shows the calculation of the 3 × 1 linear PCF filter in 1D (one-dimensional) at position x by weighting the three adjacent function values (f_{-1}, f_0, f_1) at the integer positions to be filtered, where the filter weights are calculated based on the sub-texel offsets.
FIG. 7 illustrates a flow chart of a method for rendering an image having a plurality of pixels, at least some of which represent shadows of objects in the image.
FIG. 8 shows an example of a graphics processing system.
Detailed Description
A graphics processing system and method are described herein, such as may be used to render high quality soft shadows.
Rendering here refers to any form of generating a visible image, such as displaying an image on a computer screen, printing, or projecting.
Without any filtering, the data sample value of the shadow map texture is 1.0 or 0.0. As a result, the rendered shadow in the final image shows strong aliasing. This is because if the data sample value is equal to 1.0, it means that the pixel is completely out of shadow, and if 0.0, it means that the pixel is completely in shadow.
After filtering, the data sample value may be a floating point number between 0.0 and 1.0. This achieves a continuous transition, so pixels can be rendered as soft shadows in the final image.
A pixel P in image space has a projected area centered on x in texture space, and in a 3 x 3PCF, x has a local neighborhood of 3 x 3 texels, which is nine texels of the footprint coverage of the projected area. The PCF filter function is applied to these nine texels to obtain a filtered value, which is the final shaded value of pixel P. Thus, each texel indicates a texture sample value in the texture space. For per-pixel shading, multiple texel taps may be needed because the pixel is located in image space, which has projected footprint in texture space, which covers multiple texels. Filtering is performed in texture space. In the example described below, the number of texels per pixel is three in each direction or dimension of X, Y or Z (if present).
The application of the method in 1D will now be explained, which is then extended to 2D for soft shadow rendering. The method can also be extended to 3D (three-dimensional) for 3 × 3 × 3 trilinear PCF filtering of 3D volumetric data.
Without loss of generality, it is assumed that the samples of the continuous function f(x) are known at integer texel positions, and f(x) needs to be approximately reconstructed from these discrete integer positions as a normalized weighted sum of these discrete samples.
The piecewise linear function reconstruction is defined as follows.
General 1D linear interpolation is a method for estimating the function value f(i+x) from the known function values f_i and f_{i+1} at the integer grid positions i and i+1, where x ∈ [0,1), i.e. 0 ≤ x < 1, i ∈ Z is an integer, and x is the fractional part of (i+x). To simplify the notation, i is omitted from the symbols (f_{i-1}, f_i, f_{i+1}) in the following, and these values are denoted simply as (f_{-1}, f_0, f_1).
Similar to general 1D linear interpolation, the 1D 3 × 1 linear PCF filter is a higher-level interpolation that computes a three-tap weighted sum of the function values (f_{-1}, f_0, f_1) at three integer grid positions (-1, 0 and 1).
Here, the footprint size of the 1D filter (i.e., the 1D sample footprint) is 2 (ranging from x-1 to x+1), by which the final result is divided for normalization.
Fig. 6 illustrates the integral computation of the weighted sum of the 3 × 1 linear PCF filter in 1D. The function values (f_{-1}, f_0, f_1) at the three integer grid positions (-1, 0 and 1) are represented by the bar graphs at 601, 602, and 603, respectively.
Here, x is the fractional part of the texture coordinate (i.e., it lies between integer texture coordinate positions in texture space), and f_lin(x) is the result of a hardware linear interpolation function (i.e., a linear interpolation function implemented by a graphics processing system) at an offset x between two consecutive integer positions.
The PCF filter result can be calculated as the sum of the areas of the three shaded regions 604, 605, 606, where the area of each region 604, 605, 606 is calculated as the product of the weight and height of each region covered by the sample footprint from x-1 to x + 1.
$$f_{\mathrm{pcf}}(x) = (1-x)\,f_{-1} + 1 \cdot f_0 + x\,f_1 \tag{3}$$
Referring to fig. 6, for any offset x ∈ [0,1), linear interpolation at (x-0.5) ∈ [-0.5,0.5) can be expressed as a weighted sum of the two function values at the two integer positions (-1 and 0), and linear interpolation at (x+0.5) ∈ [0.5,1.5) can be expressed as a weighted sum of the two function values at the two integer positions (0 and 1). Here, lerp() is the linear interpolation function. This is expressed as:
$$f_{\mathrm{lin}}(x-0.5) = \mathrm{lerp}(f_{-1}, f_0, x) = (1-x)\,f_{-1} + x\,f_0 = f_{-1} - x\,f_{-1} + x\,f_0 \tag{4}$$
$$f_{\mathrm{lin}}(x+0.5) = \mathrm{lerp}(f_0, f_1, x) = (1-x)\,f_0 + x\,f_1 = f_0 + x\,f_1 - x\,f_0 \tag{5}$$
Adding these two terms together yields:
$$f_{\mathrm{lin}}(x-0.5) + f_{\mathrm{lin}}(x+0.5) = f_{-1} + f_0 + x\,(f_1 - f_{-1}) = f_{-1} - x\,f_{-1} + f_0 + x\,f_1 = (1-x)\,f_{-1} + 1 \cdot f_0 + x\,f_1 = f_{\mathrm{pcf}}(x) \tag{6}$$
or, equivalently, in the notation of matrix operations:
$$f_{\mathrm{lin}}(x-0.5) + f_{\mathrm{lin}}(x+0.5) = \begin{bmatrix} 1-x & 1 & x \end{bmatrix} \begin{bmatrix} f_{-1} \\ f_0 \\ f_1 \end{bmatrix} = f_{\mathrm{pcf}}(x) \tag{7}$$
From this derivation, it can be seen that the 3 × 1 PCF filter result at sample position x in 1D is equal to the sum of two general linear interpolation functions at the two positions (x-0.5) and (x+0.5), each of which is already supported by GPU hardware as the linearly sampled texture instruction f_lin(x).
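The 1D identity can be checked numerically with a short Python sketch. This is a software stand-in, not GPU code: `f_lin` below only emulates a hardware linear texture fetch under the assumption that the sample labelled k is centred at k + 0.5, so that fetching at (x-0.5) blends f_{-1} and f_0 with weight x. All function names are hypothetical.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between a and b with weight t in [0, 1]."""
    return (1.0 - t) * a + t * b


def f_lin(f, p):
    """Emulated hardware linear fetch at position p.

    f maps the integer positions -1, 0, 1 to sample values. The index is
    clamped so both taps stay inside the three-sample neighbourhood.
    """
    q = p - 0.5
    i = max(-1, min(0, math.floor(q)))
    return lerp(f[i], f[i + 1], q - i)


def pcf_three_taps(f, x):
    """Conventional 3-tap weighted sum (unnormalised)."""
    return (1.0 - x) * f[-1] + 1.0 * f[0] + x * f[1]


def pcf_two_fetches(f, x):
    """Same result from two linear fetches at x - 0.5 and x + 0.5."""
    return f_lin(f, x - 0.5) + f_lin(f, x + 0.5)


# Shadow-test results: position -1 is in shadow, positions 0 and 1 are lit.
samples = {-1: 0.0, 0: 1.0, 1: 1.0}
unnormalised = pcf_two_fetches(samples, 0.25)
filtered = unnormalised / 2.0   # divide by the 1D footprint size of 2
```

The two fetches use the constant offsets ±0.5, so no per-sample weight has to be computed in the shader; the sub-texel offset x enters only through the interpolation performed by the fetch itself.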
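In 2D (two-dimensional), 3 × 3 bilinear PCF filtering will involve 3 × 3 = 9 taps:
$$f_{\mathrm{pcf}}(x, y) = \begin{bmatrix} 1-x & 1 & x \end{bmatrix} \begin{bmatrix} f_{-1,-1} & f_{-1,0} & f_{-1,1} \\ f_{0,-1} & f_{0,0} & f_{0,1} \\ f_{1,-1} & f_{1,0} & f_{1,1} \end{bmatrix} \begin{bmatrix} 1-y \\ 1 \\ y \end{bmatrix}$$
The footprint size of the 2D filter (i.e., the 2D sample footprint shown in block 501 in fig. 5) is 2 × 2 = 4, by which the final result is then divided for normalization.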
As shown in equation (2), the filter used for 2D bilinear PCF filtering is a separable filter, which means that it can be separated into two one-dimensional filters in the X and Y directions, respectively.
To simplify the conventional equation (2) for 2D rendering, the 1D equation derived above may be applied first along the X direction, and then after obtaining intermediate results, the 1D equation may be applied again along the Y direction. In the notation of matrix operations, the 2D bilinear PCF filter equation can be simplified by the following derivation:
$$\begin{aligned} f_{\mathrm{pcf}}(x, y) &= \begin{bmatrix} 1-x & 1 & x \end{bmatrix} B \begin{bmatrix} 1-y \\ 1 \\ y \end{bmatrix} \\ &= f_{\mathrm{lin}}(x-0.5,\,y-0.5) + f_{\mathrm{lin}}(x-0.5,\,y+0.5) + f_{\mathrm{lin}}(x+0.5,\,y-0.5) + f_{\mathrm{lin}}(x+0.5,\,y+0.5) \end{aligned}$$
As a result of the above derivation, it can be seen that the 3 × 3 PCF filtering result at the 2D sampling position (x, y) is equal to the sum of four general 2D bilinear interpolation functions at four positions, whose 2D offset from the original 2D sample position is ±0.5 along each of the two dimensions. This function is already supported by GPU hardware as the 2D bilinearly sampled texture instruction f_lin(x, y).
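The 2D result can be verified in the same way. The sketch below emulates a bilinear texture fetch in plain Python and compares the sum of four fetches at offsets of ±0.5 against the nine-tap weighted sum. The names are hypothetical, `B[(i, j)]` is assumed to hold the sample with X index i and Y index j, and the sample labelled (i, j) is assumed to be centred at (i + 0.5, j + 0.5).

```python
import math


def lerp(a, b, t):
    """Linear interpolation between a and b with weight t in [0, 1]."""
    return (1.0 - t) * a + t * b


def f_lin2(B, px, py):
    """Emulated hardware bilinear fetch on a 3 x 3 neighbourhood."""
    qx, qy = px - 0.5, py - 0.5
    # Clamp so the 2 x 2 taps stay inside the 3 x 3 neighbourhood.
    i = max(-1, min(0, math.floor(qx)))
    j = max(-1, min(0, math.floor(qy)))
    tx, ty = qx - i, qy - j
    low = lerp(B[(i, j)], B[(i + 1, j)], tx)
    high = lerp(B[(i, j + 1)], B[(i + 1, j + 1)], tx)
    return lerp(low, high, ty)


def pcf_nine_taps(B, x, y):
    """Conventional weighted sum over nine taps (unnormalised)."""
    wx = {-1: 1.0 - x, 0: 1.0, 1: x}
    wy = {-1: 1.0 - y, 0: 1.0, 1: y}
    return sum(wx[i] * wy[j] * B[(i, j)]
               for i in (-1, 0, 1) for j in (-1, 0, 1))


def pcf_four_fetches(B, x, y):
    """Sum of four bilinear fetches at offsets of +/-0.5 (unnormalised)."""
    return sum(f_lin2(B, x + dx, y + dy)
               for dx in (-0.5, 0.5) for dy in (-0.5, 0.5))


# A checkerboard of lit (1.0) and shadowed (0.0) test results.
grid = {(i, j): float((i + j) % 2) for i in (-1, 0, 1) for j in (-1, 0, 1)}
shadow_value = pcf_four_fetches(grid, 0.25, 0.75) / 4.0  # normalise by 2 x 2
```

The agreement follows from separability: summing over the two X offsets yields the weights (1-x, 1, x), summing over the two Y offsets yields (1-y, 1, y), and their product reproduces all nine weights.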
In 3D applications, such as volume rendering, 3D textures are sampled along rays and the sample values need to be filtered, where PCF is very useful for achieving smooth rendering results.
In 3D, a 3 × 3 × 3 trilinear PCF filter would involve 3 × 3 × 3 = 27 function value taps at integer grid positions. The footprint of the 3D filter (i.e., the volume of the 3D sample footprint) is 2 × 2 × 2 = 8, by which the final result needs to be divided for normalization.
The 3 × 3 × 3 trilinear PCF filtering result in 3D can be expressed as a normalized weighted sum of the 27 function values at the 27 integer grid positions.
The 3D extension of the PCF interpolation scheme described herein can be derived similar to the 2D case. Thus, for 3D, only the results of this derivation are given below:
$$\begin{aligned} f_{\mathrm{pcf}}(x, y, z) ={} & f_{\mathrm{lin}}(x-0.5,\,y-0.5,\,z-0.5) + f_{\mathrm{lin}}(x-0.5,\,y-0.5,\,z+0.5) \\ & + f_{\mathrm{lin}}(x-0.5,\,y+0.5,\,z-0.5) + f_{\mathrm{lin}}(x-0.5,\,y+0.5,\,z+0.5) \\ & + f_{\mathrm{lin}}(x+0.5,\,y-0.5,\,z-0.5) + f_{\mathrm{lin}}(x+0.5,\,y-0.5,\,z+0.5) \\ & + f_{\mathrm{lin}}(x+0.5,\,y+0.5,\,z-0.5) + f_{\mathrm{lin}}(x+0.5,\,y+0.5,\,z+0.5) \end{aligned} \tag{9}$$
From equation (9), it can be concluded that the 3 × 3 × 3 PCF filter result at the 3D sample location (x, y, z) is equal to the sum of eight general 3D trilinear interpolation functions at eight locations, whose 3D offset from the original 3D sample location (x, y, z) is ±0.5 along each of the three dimensions. This function is already supported by GPU hardware as the 3D trilinearly sampled texture instruction f_lin(x, y, z).
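Equation (9) can be checked against the 27-tap weighted sum with the following Python sketch. As before, this is an illustrative CPU-side emulation under assumed conventions: `V[(i, j, k)]` holds the sample at integer indices (i, j, k), each sample is assumed to be centred at its index plus 0.5 along every axis, and all names are hypothetical.

```python
import math
from itertools import product


def f_lin3(V, px, py, pz):
    """Emulated hardware trilinear fetch on a 3 x 3 x 3 neighbourhood."""
    q = (px - 0.5, py - 0.5, pz - 0.5)
    # Clamp so the 2 x 2 x 2 taps stay inside the neighbourhood.
    base = tuple(max(-1, min(0, math.floor(c))) for c in q)
    t = tuple(c - b for c, b in zip(q, base))
    total = 0.0
    for corner in product((0, 1), repeat=3):
        weight = 1.0
        for axis in range(3):
            weight *= t[axis] if corner[axis] else (1.0 - t[axis])
        index = tuple(b + c for b, c in zip(base, corner))
        total += weight * V[index]
    return total


def pcf_27_taps(V, x, y, z):
    """Conventional weighted sum over 27 taps (unnormalised)."""
    def w(k, s):
        return {-1: 1.0 - s, 0: 1.0, 1: s}[k]
    return sum(w(i, x) * w(j, y) * w(k, z) * V[(i, j, k)]
               for i, j, k in product((-1, 0, 1), repeat=3))


def pcf_eight_fetches(V, x, y, z):
    """Sum of eight trilinear fetches at offsets of +/-0.5 (unnormalised)."""
    return sum(f_lin3(V, x + dx, y + dy, z + dz)
               for dx, dy, dz in product((-0.5, 0.5), repeat=3))


# An arbitrary small test volume with values in {0, 1, 2, 3}.
volume = {(i, j, k): float((7 * i + 3 * j + 5 * k) % 4)
          for i, j, k in product((-1, 0, 1), repeat=3)}
value = pcf_eight_fetches(volume, 0.25, 0.5, 0.75) / 8.0  # normalise by 2 x 2 x 2
```

Only eight fetches and seven additions appear in `pcf_eight_fetches`; the 27 per-sample weights of the conventional form are absorbed into the interpolation done by each fetch.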
Fig. 7 summarizes a method for rendering an image having a plurality of pixels. The method comprises the following steps for each of at least one of the plurality of pixels. At step 701, the method includes determining a plurality of texels corresponding to respective pixels, each texel including a sub-pixel data point indicative of a texture sample value in a texture space. At step 702, the method comprises rendering each pixel by applying a filter function to sub-pixel data points in dependence on a selected sample position x to form a filter value, where x is the fractional position of the texture coordinate; wherein the filtering function is performed by summing outputs of a linear interpolation function implemented by the graphics processing system with respect to sub-pixel data points at a plurality of locations equidistant from x.
Some advantages of using this method compared to the method using the conventional equation (2) are as follows.
In 1D, the method requires fewer GPU instructions, both tex_instruction (reduced from three to two) and ALU_instruction (reduced from three multiplications and two additions to only two additions and no multiplications).
The benefit of the simplified 2D equation compared to conventional equation (2) is that fewer GPU instructions are required, both tex_instruction (reduced from 9 to 4) and ALU_instruction (reduced from 9+3=12 multiplications and 6+2=8 additions to only 3 additions and zero multiplications). This approach also requires less register allocation, which means more concurrent GPU threads can be scheduled on the fly. This results in faster rendering speed and less power consumption.
The 2D equations described herein for a 3 x 3 bilinear PCF may be used to filter shadow map samples to obtain high quality soft shadow rendering results for a video game.
By replacing the old fragment shader using equation (2) with a shader that implements the methods described herein, the rendering methods can be deployed onto many modern video games in shaders for soft shadow rendering. In some implementations, this may result in a GPU power improvement that reduces the average power consumption by more than 30mA, i.e., approximately 5-10%.
The 3 × 3 × 3 trilinear PCF method may be used for 3D applications, such as volume rendering, where the 3D texture is sampled along a ray and the sample values need to be filtered; here the 3 × 3 × 3 trilinear PCF is very useful for achieving smooth volume rendering results. It requires fewer GPU instructions, both tex_instruction (reduced from 27 to 8) and ALU_instruction (reduced from 27+9+3=39 multiplications and 18+6+2=26 additions to only 7 additions and 0 multiplications). There is also less register allocation, which means more concurrent GPU threads can be scheduled on the fly. This results in faster rendering speed and less power consumption.
The methods described herein for PCF filtering involve fewer weight calculations, particularly for 2D and 3D.
Fewer GPU instructions can result in longer mobile battery life and can also reduce latency and increase frame rate for complex and demanding game rendering.
The method can also be implemented in GPU software using shader code. The equations in 1D, 2D, or 3D may be implemented in a graphics processing system by several lines of GPU shader code using a shading language (e.g., GLSL, HLSL, SPIR-V, or any other suitable language).
Alternatively, the Texture Unit (TU) module of the GPU may be modified to implement the 3 × 3 PCF filter equation using fixed function hardware, with a single instruction instead of shader code. For example, a single ISA-level call may be used to accomplish 2D 3 × 3 PCF filtering.
Thus, the method may be implemented by using fixed function hardware inside the TU hardware module of the GPU. The hardware functionality may be provided to the user by a single shader instruction rather than multiple lines of shader code.
In one implementation, a texture cache may be utilized to store four granularities of post-processing sampling results for reuse by neighboring pixels. In addition, data prefetching may also be performed using independent texture reads (due to the constant offset of 0.5 in the preferred implementation) to further enhance rendering performance.
The methods described herein may allow faster and cheaper triangular linear PCF filtering (filter diameter size with three texels) for 1D, 2D and 3D, respectively, which enables less weighted filtering computations than conventional methods in terms of texture sampling instructions and ALU instructions. This also enables reduced latency and increased frame rate for complex and demanding game rendering.
Fig. 8 is a schematic diagram of a system 800 configured to perform the methods described herein. System 800 may be implemented on a device such as a laptop, tablet, smartphone, TV, or any other device in which graphical data is to be processed.
The system 800 includes a graphics processor 801 configured to process data. For example, the processor 801 may be a GPU. Alternatively, the processor 801 may be implemented as a computer program running on a programmable device such as a GPU or a Central Processing Unit (CPU). The system 800 includes a memory 802 arranged in communication with the graphics processor 801. The memory 802 may be a non-volatile memory. Graphics processor 801 may also include a cache (not shown in FIG. 8) that may be used to temporarily store data from memory 802. The system may include more than one processor and more than one memory. The memory may store data that is executable by the processor. The processor may be configured to operate in accordance with a computer program stored in a non-transitory form on a machine-readable storage medium. The computer program may store instructions for causing a processor to perform its methods in the manner described herein.
The system may allow high-quality 3 × 3 PCF-filtered soft shadows to be rendered on a mobile GPU with higher performance and fewer weight computations than conventional approaches. As described above, this approach may involve fewer texture sampling instructions and ALU computations than previous approaches.
Although the above systems and methods are described primarily with respect to shadow rendering-related applications, the systems and methods may be applied to other types of image rendering applications.
The applicant hereby discloses in isolation each individual feature described herein and any combination of two or more such features, to the extent that such features or combinations are capable of being carried out based on the present specification as a whole in the light of the common general knowledge of a person skilled in the art, irrespective of whether such features or combinations of features solve any problems disclosed herein, and without limitation to the scope of the claims. The applicant indicates that aspects of the present invention may consist of any such individual feature or combination of features. In view of the foregoing description it will be evident to a person skilled in the art that various modifications may be made within the scope of the invention.

Claims (17)

1. A graphics processing system for rendering an image having a plurality of pixels, wherein the system is configured to, for each of at least one of the plurality of pixels:
determine a plurality of texels corresponding to the respective pixel, each texel including a sub-pixel data point indicative of a texture sample value in texture space; and
render the respective pixel by applying a filter function to the sub-pixel data points in dependence on the selected sampling position x to form a filtered value, where x is the fractional position of the texture coordinate;
wherein the filtering function is performed by summing the outputs of the linear interpolation function implemented by the graphics processing system for sub-pixel data points at a plurality of locations equidistant from x.
2. The system of claim 1, wherein the plurality of locations are each spaced from x by an offset of 0.5 times the spacing of the sub-pixel data points in the texture space along each dimension in the texture space.
3. The system of claim 1 or 2, wherein the number of texels covered by each pixel along each dimension in the texture space is three.
4. The system of any of the preceding claims, wherein the sum of the outputs of two linear interpolation functions implemented by the graphics processing system at (x - 0.5) and (x + 0.5) is approximately equal to a weighted sum, by a linearly weighted 3-tap percentage-closer filter (PCF), of three function values at integer texture coordinate positions along each dimension in the texture space.
5. The system of any preceding claim, wherein the system is configured to apply the filter function along a plurality of orthogonal directions in the texture space.
6. The system of any preceding claim, wherein the plurality of locations are each spaced from x along one, two or three orthogonal directions in the texture space.
7. The system of any preceding claim, wherein the system is configured to implement the filtering function in one of the GLSL, HLSL and SPIR-V languages.
8. The system of any preceding claim, wherein the system is configured to implement the filtering function in a single instruction or fixed function hardware unit.
9. The system of any preceding claim, wherein the system comprises a texture cache configured to store processed sampling results for reuse by adjacent pixels in the image.
10. The system of any preceding claim, wherein at least some of the plurality of pixels represent shadows of objects in the image.
11. A method for rendering an image having a plurality of pixels, wherein the method comprises, for each of at least one of the plurality of pixels:
determining a plurality of texels corresponding to the respective pixels, each texel including a sub-pixel data point indicative of a texture sample value in a texture space; and
rendering the respective pixel by applying a filter function to the sub-pixel data point in dependence on the selected sampling position x to form a filtered value, where x is the fractional position of the texture coordinate;
wherein the filtering function is performed by summing the outputs of the linear interpolation function implemented by the graphics processing system for sub-pixel data points at a plurality of locations equidistant from x.
12. The method of claim 11, wherein the plurality of locations are each spaced from x by an offset of 0.5 times the spacing of the sub-pixel data points in the texture space along each dimension in the texture space.
13. The method of claim 11 or 12, wherein the number of texels covered by each pixel along each dimension in the texture space is three.
14. A method according to any one of claims 11 to 13, wherein the method comprises applying the filter function along a plurality of orthogonal directions in the texture space.
15. The method of any of claims 11 to 14, wherein the plurality of locations are each spaced from x along one, two or three orthogonal directions in the texture space.
16. The method of any of claims 11 to 15, wherein at least some of the plurality of pixels represent shadows of objects in the image.
17. A computer program which, when executed by a computer, causes the computer to perform the method of any one of claims 11 to 16.
CN202080098477.1A 2020-04-24 2020-04-24 Filtering for rendering Pending CN115280368A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2020/061417 WO2021213664A1 (en) 2020-04-24 2020-04-24 Filtering for rendering

Publications (1)

Publication Number Publication Date
CN115280368A true CN115280368A (en) 2022-11-01

Family

ID=70465066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080098477.1A Pending CN115280368A (en) 2020-04-24 2020-04-24 Filtering for rendering

Country Status (2)

Country Link
CN (1) CN115280368A (en)
WO (1) WO2021213664A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422807A (en) * 2023-12-15 2024-01-19 摩尔线程智能科技(北京)有限责任公司 Method and device for determining color value, electronic device, computer storage medium
WO2024113227A1 (en) * 2022-11-30 2024-06-06 Qualcomm Incorporated Range aware spatial upscaling

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7106326B2 (en) 2003-03-03 2006-09-12 Sun Microsystems, Inc. System and method for computing filtered shadow estimates using reduced bandwidth
US9367948B2 (en) * 2013-11-14 2016-06-14 Intel Corporation Flexible filter logic for multi-mode filtering of graphical texture data
KR20180037838A (en) * 2016-10-05 2018-04-13 삼성전자주식회사 Method and apparatus for processing texture

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024113227A1 (en) * 2022-11-30 2024-06-06 Qualcomm Incorporated Range aware spatial upscaling
CN117422807A (en) * 2023-12-15 2024-01-19 摩尔线程智能科技(北京)有限责任公司 Method and device for determining color value, electronic device, computer storage medium
CN117422807B (en) * 2023-12-15 2024-03-08 摩尔线程智能科技(北京)有限责任公司 Color value determining method and device, electronic equipment and computer storage medium

Also Published As

Publication number Publication date
WO2021213664A1 (en) 2021-10-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination