CN104090742A

CN104090742A - Parallelization type progressive photon mapping method and device based on OpenCL

Info

Publication number: CN104090742A
Application number: CN201410341679.0A
Authority: CN
Inventors: 贾庆轩; 扎西次仁; 李旭龙; 孙汉旭; 宋荆洲
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2014-07-17
Filing date: 2014-07-17
Publication date: 2014-10-08

Abstract

The invention discloses a parallelized progressive photon mapping method and device based on OpenCL, applied to global illumination in virtual reality technology. Progressive photon mapping is parallelized through OpenCL. The method comprises the following steps: first, initialization is performed, a scene model is loaded, and the OpenCL computation parameters are initialized; second, viewpoint ray tracing, photon tracing and scene rendering are parallelized based on OpenCL, the workloads are assigned to the appropriate processors, and after the command queue has executed, the computation results are read back and transferred to the CPU; finally, the data resources stored on the GPU are released through OpenCL standard library functions. The method and device remarkably improve the efficiency of the progressive photon mapping algorithm: compared with an implementation designed for the CPU, efficiency improves by a factor of four to nine, portability is high, and the rendering effect is improved to a certain extent.

Description

A parallelized progressive photon mapping method and device based on OpenCL

Technical field

The present invention relates to a parallelized progressive photon mapping method and device based on OpenCL. It improves the efficiency of the progressive photon mapping algorithm, is of high practical value for realistic global illumination, and belongs to the field of virtual reality technology.

Background technology

As computer graphics is widely applied in digital entertainment, virtual navigation, education and learning, simulation training, virtual medicine, e-commerce and other fields, the demand for realism in rendered graphics keeps rising, and realistic global illumination is one of the key technologies for improving the realism of virtual scenes. Current physically based global illumination methods include ray tracing, radiosity and photon mapping. The photon mapping algorithm combines the advantages of the first two methods and can simulate a variety of lighting effects, for example caustics, diffuse interreflection and color bleeding.

Because the photon mapping algorithm is computationally expensive and storing the photon map requires a large amount of memory, it is difficult to meet real-time requirements in interactive systems. In recent years many optimizations of photon mapping have been proposed to improve its efficiency. Among them, the progressive photon mapping algorithm is robust and reformulates photon mapping on the basis of a consistency condition, which solves the memory cost problem of the photon map well. Although progressive photon mapping improves the scene rendering effect to a certain extent, algorithm efficiency remains a bottleneck. The progressive photon mapping algorithm mainly comprises three steps: viewpoint ray tracing, photon tracing and scene rendering. Accelerating these three steps is the key to improving the efficiency of the algorithm.

With the rapid development of graphics hardware, using general-purpose parallel computing on the GPU to improve the rendering efficiency of global illumination algorithms has attracted increasing attention. Some mainstream commercial rendering software has begun to use GPU general-purpose computing to implement global illumination. In the progressive photon mapping algorithm, viewpoint ray tracing, photon tracing and scene rendering are all highly parallel, so implementing parallelized progressive photon mapping on the OpenCL heterogeneous computing platform is of great significance for improving the efficiency of the algorithm.

Summary of the invention

An object of the present invention is to provide a parallelized progressive photon mapping method based on OpenCL that assigns the different parts of the workload to suitable processors, so that the running efficiency of the algorithm is significantly improved.

Another object of the present invention is to provide a parallelized progressive photon mapping device based on OpenCL that assigns the different parts of the workload to suitable processors, so that the running efficiency of the algorithm is significantly improved.

To achieve the above objects, the technical solution of the present invention is realized as follows:

A parallelized progressive photon mapping method based on OpenCL specifically comprises the following steps:

Step 1: initialize the OpenCL computation parameters, including: setting the dimensions and size of the work-group; creating the device context and the command queue; creating the scene model memory object, the view-ray intersection memory object, the photon map memory object and the pixel matrix memory object; loading the viewpoint ray tracing, photon tracing and scene rendering files; and declaring the viewpoint ray tracing kernel function, the photon tracing kernel function and the scene rendering kernel function;

Step 2: parallelize viewpoint ray tracing based on OpenCL;

Step 3: parallelize photon tracing based on OpenCL;

Step 4: parallelize scene rendering based on OpenCL;

Step 5: release the data resources stored on the GPU.

A parallelized progressive photon mapping device based on OpenCL comprises:

A configuration unit, for initializing the OpenCL computation parameters, including: setting the dimensions and size of the work-group; creating the device context and the command queue; creating the scene model memory object, the view-ray intersection memory object, the photon map memory object and the pixel matrix memory object; loading the viewpoint ray tracing, photon tracing and scene rendering files; and declaring the viewpoint ray tracing kernel function, the photon tracing kernel function and the scene rendering kernel function;

A parallelization unit, for using OpenCL to compute viewpoint ray tracing, photon tracing and scene rendering in parallel on the GPU, and for transferring the final computation results to the CPU;

A release unit, for releasing the data resources stored on the GPU with the OpenCL standard library function clReleaseMemObject().

As can be seen from the above technical solution, the benefits of the present invention are as follows:

The parallelized progressive photon mapping method is designed on the OpenCL heterogeneous computing platform and is therefore highly portable across different graphics processors. The different parts of the workload are assigned to suitable processors: the CPU mainly manages and schedules the tasks, while the GPU performs the general-purpose parallel computation, which significantly improves the execution efficiency of the algorithm.

Brief description of the drawings

Fig. 1 is a flow chart of an embodiment of the parallelized progressive photon mapping method based on OpenCL according to the present invention;

Fig. 2 is a schematic diagram of computing the intersection of a viewpoint ray with scene objects in the method embodiment of the present invention;

Fig. 3 is a schematic diagram of the parallel tasks of the work-items in viewpoint ray tracing in the method embodiment of the present invention;

Fig. 4 is a structural diagram of an embodiment of the parallelized progressive photon mapping device based on OpenCL according to the present invention.

Detailed description of the embodiments

In view of the problems in the prior art, the present invention proposes a parallelized progressive photon mapping scheme built on the OpenCL heterogeneous computing platform, in order to improve the execution efficiency of the algorithm.

The technical solution in the embodiments of the present invention is described clearly and completely below with reference to the accompanying drawings. The present invention is a parallelized progressive photon mapping method and device based on OpenCL. Using the OpenCL heterogeneous computing framework, each step of progressive photon mapping is parallelized and the different workloads are assigned to suitable processors: the CPU mainly manages and schedules the tasks, while the GPU performs the general-purpose parallel computation, so that the execution efficiency of the algorithm is greatly improved.

Fig. 1 is a flow chart of an embodiment of the parallelized progressive photon mapping method based on OpenCL according to the present invention. As shown in Fig. 1, viewpoint ray tracing, photon tracing and scene rendering are designed to run on the GPU, which makes full use of the GPU's computing performance and improves rendering efficiency. The CPU is responsible for scheduling each process, initializing the scene and loading the scene model. The method specifically comprises the following steps:

Step 101: define a pixel matrix according to the screen resolution, initialize every value in the pixel matrix to 0, and set the positions of the virtual camera and the light source.

Step 102: read the scene model, store the geometric patches of the model in a memory region, and use a KD-tree to organize and manage the geometric patches in the scene, in preparation for the geometric intersection tests of the following steps.

How to read the model and build the KD-tree is prior art and is not described further.

Step 103: initialize the OpenCL computation parameters. The work-group size is set to 16 × 16, and each dimension of the global work size is guaranteed to be evenly divisible by the corresponding dimension of the work-group. The OpenCL standard library function clCreateContextFromType() is used to create the device context, and clCreateCommandQueue() is used to create the command queue. The OpenCL standard library function clCreateBuffer() is used to create the scene model buffer, the view-ray intersection buffer, the photon map buffer and the pixel buffer, where the scene model buffer is read-only for the kernel and the view-ray intersection buffer, photon map buffer and pixel buffer are read-write for the kernel. The OpenCL standard library function clCreateProgramWithSource() is used to load the viewpoint ray tracing, photon tracing and scene rendering files, which clBuildProgram() then builds into program executables for the kernels. Finally, the OpenCL viewpoint ray tracing kernel function, photon tracing kernel function and scene rendering kernel function are created and declared.
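The divisibility requirement in step 103 can be illustrated with a minimal Python sketch. It assumes, as one possible approach that the source does not spell out, that each image dimension is rounded up to the next multiple of the 16 × 16 work-group size and that work-items falling in the padding skip their work. The function names `round_up_global_size` and `is_active` are hypothetical.

```python
def round_up_global_size(width, height, local=(16, 16)):
    """Round each image dimension up to the next multiple of the
    work-group size so the global work size divides evenly."""
    lx, ly = local
    gx = ((width + lx - 1) // lx) * lx
    gy = ((height + ly - 1) // ly) * ly
    return gx, gy


def is_active(gid_x, gid_y, width, height):
    """Work-items in the padded border have no pixel and must do nothing."""
    return gid_x < width and gid_y < height


if __name__ == "__main__":
    # A 1920 x 1080 image: the height is not a multiple of 16.
    print(round_up_global_size(1920, 1080))
```

With padding of this kind, the kernel would need a bounds check equivalent to `is_active` before touching the pixel matrix.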

Step 104: parallelize viewpoint ray tracing. First, the OpenCL standard library function clSetKernelArg() is used to pass the argument addresses to the viewpoint ray tracing kernel function; the arguments include the scene model data, the pixel matrix and the view-ray intersection data. The OpenCL standard library function clEnqueueWriteBuffer() is used to transfer the scene model data and pixel matrix needed by viewpoint ray tracing from the data buffers to the GPU through the command queue. The OpenCL standard library function clEnqueueNDRangeKernel() is then used to launch the viewpoint ray tracing kernel, whose execution is as follows:

Each viewpoint ray corresponds to one OpenCL work-item, as shown in Fig. 2. Within each work-item, the viewpoint ray is emitted, its intersection with the geometric patches of the scene is computed, and the intersection is stored and the intersection information updated. As shown in Fig. 3, rays are emitted from the viewpoint through the grid points of the imaging plane into the scene, and their intersections with the geometric patches of the scene are computed. The intersection information includes the intersection position, the ray incident direction, the pixel weight, the pixel position and so on. Depending on the number of pixels, several hundred thousand to several million rays are typically emitted into the scene;

Finally, the OpenCL standard library function clFinish() is used to wait until all commands in the command queue have completed, and clEnqueueReadBuffer() is called to transfer the view-ray intersection data computed by OpenCL to CPU memory.
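The mapping from one work-item to one viewpoint ray in step 104 can be sketched in Python as follows. This is only an illustration under stated assumptions: a pinhole camera at the origin looking along the negative z axis, a 60 degree vertical field of view, and a flat work-item index; none of these details, nor the names `eye_ray` and `work_item_to_pixel`, come from the source.

```python
import math


def work_item_to_pixel(global_id, width):
    """Map a flat work-item index to (column, row) pixel coordinates."""
    return global_id % width, global_id // width


def eye_ray(px, py, width, height, fov_deg=60.0):
    """Return (origin, unit direction) of the ray through pixel (px, py).

    Assumed camera: pinhole at the origin, looking down -z, with the
    imaging plane at z = -1 and a vertical field of view of fov_deg.
    """
    aspect = width / height
    half = math.tan(math.radians(fov_deg) / 2.0)
    # Pixel centre mapped onto the imaging plane.
    u = (2.0 * (px + 0.5) / width - 1.0) * aspect * half
    v = (1.0 - 2.0 * (py + 0.5) / height) * half
    d = (u, v, -1.0)
    norm = math.sqrt(sum(c * c for c in d))
    return (0.0, 0.0, 0.0), tuple(c / norm for c in d)


if __name__ == "__main__":
    px, py = work_item_to_pixel(4, 3)   # centre pixel of a 3 x 3 image
    print(eye_ray(px, py, 3, 3))
```

Because every ray depends only on its own pixel coordinates, the work-items need no communication with one another, which is what makes this stage straightforward to parallelize.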

How to compute the intersection of a viewpoint ray with the geometric patches of the scene is prior art and is not described further.
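The source treats the ray and patch intersection as prior art and does not name a method. For orientation only, the following Python sketch shows one widely used choice for triangular patches, the Möller-Trumbore test; whether the patented implementation uses this test is not stated, and the function name `intersect_triangle` is hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect_triangle(orig, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: return the ray parameter t of the hit,
    or None if the ray misses the triangle (v0, v1, v2)."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:            # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    t_vec = sub(orig, v0)
    u = dot(t_vec, p) * inv       # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv   # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None  # ignore hits behind the origin
```

In a full renderer the KD-tree of step 102 would restrict this test to the few patches near the ray instead of testing every patch.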

Step 105: parallelize photon tracing. The OpenCL standard library function clSetKernelArg() is used to pass the argument addresses to the photon tracing kernel function; the arguments include the scene model data and the photon map. The OpenCL standard library function clEnqueueWriteBuffer() is used to transfer the scene model data needed by photon tracing from the data buffer to the GPU through the command queue. The OpenCL standard library function clEnqueueNDRangeKernel() is then used to launch the photon tracing kernel, whose execution is as follows:

The tracing of each photon is independent of the others: each photon is created, emitted and traced within a single OpenCL work-item. First a photon is created; the photon information mainly includes the light source position, the light energy and the patch index. The photon is emitted from the light source and traced. A maximum tracing depth is set for photon tracing so that a photon is not traced indefinitely. Each time the photon is traced, the kernel first checks whether the tracing depth has reached the maximum; if it has, tracing stops and the photon is stored in the photon map;

Finally, the OpenCL standard library function clFinish() is used to wait until all commands in the command queue have completed, and clEnqueueReadBuffer() is called to transfer the photon map computed by OpenCL to CPU memory.
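The per-photon loop of step 105 can be sketched in Python as below. It is a simplified illustration: the scene query `next_hit` is a hypothetical callback standing in for the KD-tree traversal, photon power is not attenuated, and the rule that specular hits continue while diffuse hits are stored follows the photon tracing sub-steps listed in the claims.

```python
from dataclasses import dataclass


@dataclass
class Photon:
    position: tuple      # starts at the light source position
    power: float         # light energy carried by the photon
    patch_index: int     # index of the last patch hit, -1 before any hit


def trace_photon(photon, next_hit, max_depth=5):
    """Trace one photon, as one work-item would, and return the photons
    it contributes to the photon map (zero or one in this sketch).

    next_hit(photon) is a hypothetical scene query returning
    (position, patch_index, is_specular), or None if nothing is hit.
    """
    photon_map = []
    depth = 0
    while True:
        # Check the depth limit first, as the description requires.
        if depth >= max_depth:
            photon_map.append(photon)
            break
        hit = next_hit(photon)
        if hit is None:               # photon leaves the scene
            break
        position, patch_index, is_specular = hit
        photon = Photon(position, photon.power, patch_index)
        depth += 1
        if not is_specular:           # diffuse surface: store and stop
            photon_map.append(photon)
            break
    return photon_map
```

Since each call touches only its own photon, many such loops can run side by side, one per work-item, with only the final write into the shared photon map needing coordination.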

Step 106: parallelize scene rendering. The OpenCL standard library function clEnqueueWriteBuffer() is used to transfer the view-ray intersection data and photon map needed by scene rendering from the data buffers to the GPU through the command queue. The OpenCL standard library function clEnqueueNDRangeKernel() is then used to launch the scene rendering kernel, whose execution is as follows:

The color value of each pixel is determined by computing the radiance at the intersection of the corresponding viewpoint ray with the geometric patches of the scene. The intersections are independent of one another and can be computed in parallel. Equation (1) gives the radiance, where N_{emitted} is the number of emitted photons, \vec{ω} is the incident direction of the viewpoint ray at the intersection, \vec{ω}_p is the incident direction of photon p, f_r is the BRDF, and Δφ_p(x_p, \vec{ω}_p) is the flux of photon p at x_p in direction \vec{ω}_p. The last line of Equation (1) follows from the second by taking ΔA = πR(x)^2, the area of the disc of radius R(x) around the intersection, and substituting Equation (2);

\begin{aligned}
L(x, \vec{\omega}) &= \int_{2\pi} f_r(x, \vec{\omega}, \vec{\omega}')\, L(x, \vec{\omega}')\, (\vec{n} \cdot \vec{\omega}')\, d\omega' \\
&\approx \frac{1}{\Delta A} \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \Delta\phi_p(x_p, \vec{\omega}_p) \\
&= \frac{1}{\pi R(x)^2} \frac{\tau(x, \vec{\omega})}{N_{emitted}}
\end{aligned} \qquad (1)

\tau(x, \vec{\omega}) = N_{emitted} \cdot \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \Delta\phi_p(x_p, \vec{\omega}_p) \qquad (2)

Finally, the OpenCL standard library function clFinish() is used to wait until all commands in the command queue have completed, and clEnqueueReadBuffer() is called to transfer the pixel matrix color values computed by OpenCL to CPU memory.
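The radiance estimate of Equations (1) and (2) can be checked numerically with a short Python sketch. It makes several assumptions that are not in the source: photons are gathered by brute force rather than through a KD-tree, each photon is a hypothetical tuple of position, incident direction and flux, and the BRDF used in the example is a constant Lambertian term.

```python
import math


def radiance_estimate(x, omega, photons, radius, n_emitted, brdf):
    """Evaluate Equation (1) at intersection x for view direction omega.

    photons: iterable of (position, omega_p, delta_phi) tuples
             (hypothetical layout of a photon map entry).
    brdf:    callable f_r(omega, omega_p).
    """
    r2 = radius * radius
    total = 0.0
    for pos, omega_p, delta_phi in photons:
        # Keep only photons inside the disc of radius R(x) around x.
        d2 = sum((a - b) ** 2 for a, b in zip(pos, x))
        if d2 <= r2:
            total += brdf(omega, omega_p) * delta_phi
    tau = n_emitted * total                     # Equation (2)
    return tau / (n_emitted * math.pi * r2)     # last line of Equation (1)


if __name__ == "__main__":
    lambert = lambda w, wp: 1.0 / math.pi       # assumed constant BRDF
    photons = [((0.1, 0.0, 0.0), (0, 0, 1), 0.5),
               ((0.0, 0.2, 0.0), (0, 0, 1), 0.5),
               ((5.0, 0.0, 0.0), (0, 0, 1), 0.5)]   # outside the radius
    print(radiance_estimate((0, 0, 0), (0, 0, 1), photons, 1.0, 1000, lambert))
```

The factor N_emitted cancels between Equation (2) and the last line of Equation (1), so the result equals the second line of Equation (1) with ΔA = πR(x)^2.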

Step 107: use the OpenCL standard library function clReleaseMemObject() to release the data resources stored on the GPU.

This completes the parallelized progressive photon mapping process based on OpenCL according to the present invention.

Based on the above method, Fig. 4 is a structural diagram of an embodiment of the parallelized progressive photon mapping device based on OpenCL according to the present invention. As shown in Fig. 4, the device comprises:

A configuration unit 41, for initializing the OpenCL computation parameters, including: setting the dimensions and size of the work-group; creating the device context and the command queue; creating the scene model memory object, the view-ray intersection memory object, the photon map memory object and the pixel matrix memory object; loading the viewpoint ray tracing, photon tracing and scene rendering files; and declaring the viewpoint ray tracing kernel function, the photon tracing kernel function and the scene rendering kernel function;

A parallelization unit 42, for using OpenCL to compute viewpoint ray tracing, photon tracing and scene rendering in parallel on the GPU, and for transferring the final computation results to the CPU;

A release unit 43, for releasing the data resources stored on the GPU with the OpenCL standard library function clReleaseMemObject().

The configuration unit 41 may specifically comprise:

A work-group setting unit 411, for setting the work-group size to 16 × 16 and guaranteeing that each dimension of the global work size is evenly divisible by the corresponding dimension of the work-group;

A platform creation unit 412, for creating the device context with the OpenCL standard library function clCreateContextFromType() and creating the command queue with clCreateCommandQueue();

A kernel loading unit 413, for loading the viewpoint ray tracing, photon tracing and scene rendering files with the OpenCL standard library function clCreateProgramWithSource() and building them into program executables for the kernels with clBuildProgram().

The parallelization unit 42 may specifically comprise:

A parallelized viewpoint ray tracing unit 421, for passing the argument addresses to the viewpoint ray tracing kernel function with the OpenCL standard library function clSetKernelArg(), the arguments including the scene model data, the pixel matrix and the view-ray intersection data; for transferring the scene model data and pixel matrix needed by viewpoint ray tracing from the data buffers to the GPU through the command queue with the OpenCL library function clEnqueueWriteBuffer(); and for launching the viewpoint ray tracing kernel with the OpenCL standard library function clEnqueueNDRangeKernel(). Each viewpoint ray tracing kernel instance runs in its own OpenCL work-item, as shown in Fig. 2, and the work-items run in parallel independently of one another. In the viewpoint ray kernel, rays are emitted from the viewpoint through the grid points of the imaging plane into the scene and traced; when a ray meets a geometric patch in the scene, the intersection is computed, as shown in Fig. 3, and the intersection information is stored in the designated shared memory region. The intersection information includes the intersection position, the ray incident direction, the pixel weight and the pixel position. Finally, the OpenCL standard library function clFinish() is used to wait until all commands in the command queue have completed, and clEnqueueReadBuffer() is called to transfer the view-ray intersection data computed by OpenCL to CPU memory;

A parallelized photon tracing unit 422, for passing the argument addresses to the photon tracing kernel function with the OpenCL standard library function clSetKernelArg(), the arguments including the scene model data and the photon map; for transferring the scene model data needed by photon tracing from the data buffer to the GPU through the command queue with the OpenCL standard library function clEnqueueWriteBuffer(); and for launching the photon tracing kernel with the OpenCL standard library function clEnqueueNDRangeKernel(). In the photon tracing kernel a photon is created, its information mainly including the light source position, the light energy and the patch index; the photon is emitted from the light source into the scene and traced. Russian roulette decides whether a photon is reflected, refracted or absorbed. The new direction after reflection is computed from the BRDF (Bidirectional Reflectance Distribution Function) of the surface that was hit, and absorbed photons are stored in the photon map. Finally, the OpenCL standard library function clFinish() is used to wait until all commands in the command queue have completed, and clEnqueueReadBuffer() is called to transfer the photon map computed by OpenCL to CPU memory;

A parallelized scene rendering unit 423, for passing the argument addresses to the scene rendering kernel function with the OpenCL standard library function clSetKernelArg(), the arguments including the scene pixel matrix, the view-ray intersection data and the photon map; for transferring the pixel matrix, view-ray intersection data and photon map needed by scene rendering from the data buffers to the GPU through the command queue with the OpenCL standard library function clEnqueueWriteBuffer(); and for launching the scene rendering kernel with the OpenCL standard library function clEnqueueNDRangeKernel(). In the scene rendering kernel, the intersections computed during viewpoint ray tracing are looked up to obtain the intersection corresponding to each pixel; the number of photons within the specified radius centered on the intersection is counted, and the radiance at the intersection is computed from these photons using Equation (1). The color value of the pixel corresponding to the intersection is determined from the radiance. Finally, the OpenCL standard library function clFinish() is used to wait until all commands in the command queue have completed, and clEnqueueReadBuffer() is called to transfer the pixel matrix color values computed by OpenCL to CPU memory.

\begin{aligned}
L(x, \vec{\omega}) &= \int_{2\pi} f_r(x, \vec{\omega}, \vec{\omega}')\, L(x, \vec{\omega}')\, (\vec{n} \cdot \vec{\omega}')\, d\omega' \\
&\approx \frac{1}{\Delta A} \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \Delta\phi_p(x_p, \vec{\omega}_p) \\
&= \frac{1}{\pi R(x)^2} \frac{\tau(x, \vec{\omega})}{N_{emitted}}
\end{aligned} \qquad (1)

\tau(x, \vec{\omega}) = N_{emitted} \cdot \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \Delta\phi_p(x_p, \vec{\omega}_p) \qquad (2)

Here N_{emitted} is the number of emitted photons, \vec{ω} is the incident direction of the viewpoint ray at the intersection, \vec{ω}_p is the incident direction of photon p, f_r is the BRDF, and Δφ_p(x_p, \vec{ω}_p) is the flux of photon p at x_p in direction \vec{ω}_p.
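The Russian roulette decision used by the parallelized photon tracing unit 422 can be illustrated with the following Python sketch. The reflection and refraction probabilities are assumed to be given per surface material; how the patented implementation derives them is not stated, and the name `russian_roulette` is hypothetical.

```python
import random


def russian_roulette(p_reflect, p_refract, rng=random.random):
    """Pick the fate of a photon at a surface hit.

    p_reflect and p_refract are assumed material probabilities with
    p_reflect + p_refract <= 1; the remainder is the absorption
    probability. rng must return a uniform sample in [0, 1).
    """
    if p_reflect < 0.0 or p_refract < 0.0 or p_reflect + p_refract > 1.0:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    xi = rng()
    if xi < p_reflect:
        return "reflect"
    if xi < p_reflect + p_refract:
        return "refract"
    return "absorb"      # absorbed photons are stored in the photon map


if __name__ == "__main__":
    random.seed(0)
    fates = [russian_roulette(0.5, 0.3) for _ in range(10)]
    print(fates)
```

Choosing one outcome at random, instead of splitting the photon into reflected and refracted parts, keeps the work per work-item bounded, which suits the one photon per work-item layout described above.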

In short, as can be seen from the above technical solution, the benefit of the present invention is that it uses the OpenCL heterogeneous computing platform to parallelize progressive photon mapping and assigns the workload to the appropriate processors: the CPU is mainly responsible for scheduling and managing the tasks, and the GPU is mainly responsible for the general-purpose parallel computation, which ultimately improves the efficiency of the algorithm.

In summary, the above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A parallelized progressive photon mapping method based on OpenCL, characterized by comprising the following steps:

Step 2: parallelize viewpoint ray tracing based on OpenCL;

Step 4: parallelize scene rendering based on OpenCL;

Step 5: use the OpenCL standard library function clReleaseMemObject() to release the data resources stored on the GPU.

2. The parallelized progressive photon mapping method based on OpenCL according to claim 1, characterized in that step 1 specifically comprises:

Step 1-1: set the work-group size to 16 × 16, and guarantee that each dimension of the global work size is evenly divisible by the corresponding dimension of the work-group;

Step 1-2: use the OpenCL standard library function clCreateContextFromType() to create the device context, and use clCreateCommandQueue() to create the command queue;

Step 1-3: use the OpenCL standard library function clCreateBuffer() to create the scene model buffer, the view-ray intersection buffer, the photon map buffer and the pixel buffer, where the scene model buffer is read-only for the kernel and the view-ray intersection buffer, photon map buffer and pixel buffer are read-write for the kernel;

Step 1-4: use the OpenCL standard library function clCreateProgramWithSource() to load the viewpoint ray tracing, photon tracing and scene rendering files, and build them into program executables for the kernels with clBuildProgram();

Step 1-5: create and declare the OpenCL viewpoint ray tracing kernel function;

Step 1-6: create and declare the OpenCL photon tracing kernel function;

Step 1-7: create and declare the OpenCL scene rendering kernel function.

3. The parallelized progressive photon mapping method based on OpenCL according to claim 1, characterized in that step 2 specifically comprises:

Step 2-1: use the OpenCL standard library function clSetKernelArg() to pass the argument addresses to the viewpoint ray tracing kernel function, the arguments including the scene model data, the pixel matrix and the view-ray intersection data;

Step 2-2: use the OpenCL standard library function clEnqueueWriteBuffer() to transfer the scene model data and pixel matrix needed by viewpoint ray tracing from the data buffers to the GPU through the command queue;

Step 2-3: use the OpenCL standard library function clEnqueueNDRangeKernel() to launch the viewpoint ray tracing kernel, whose execution comprises:

Step 2-3-1: emit a ray from the viewpoint through a pixel into the scene;

Step 2-3-2: compute the intersection of the ray with the geometric patches of the scene;

Step 2-3-3: store the view-ray intersection information, including the intersection position, the ray incident direction, the pixel weight, the pixel position and so on;

Step 2-3-4: update the view-ray intersection information;

Step 2-4: use the OpenCL standard library function clFinish() to wait until all commands in the command queue have completed, and call clEnqueueReadBuffer() to transfer the view-ray intersection data computed by OpenCL to CPU memory.

4. The parallelized progressive photon mapping method based on OpenCL according to claim 1, characterized in that step 3 specifically comprises:

Step 3-1: use the OpenCL standard library function clSetKernelArg() to pass the argument addresses to the photon tracing kernel function, the arguments including the scene model data and the photon map;

Step 3-2: use the OpenCL standard library function clEnqueueWriteBuffer() to transfer the scene model data needed by photon tracing from the data buffer to the GPU through the command queue;

Step 3-3: use the OpenCL standard library function clEnqueueNDRangeKernel() to launch the photon tracing kernel, whose execution comprises:

Step 3-3-1: create a photon comprising the following information: photon position, light energy and patch index;

Step 3-3-2: emit the photon from the light source;

Step 3-3-3: trace the photon path;

Step 3-3-4: if the photon intersects a specular surface, go to step 3;

Step 3-3-5: if the photon meets a diffuse surface, store the photon;

Step 3-3-6: build the photon KD-tree and store the photon information;

Step 3-4: use the OpenCL standard library function clFinish() to wait until all commands in the command queue have completed, and call clEnqueueReadBuffer() to transfer the photon map computed by OpenCL to CPU memory.

5. The parallelized progressive photon mapping method based on OpenCL according to claim 1, characterized in that step 4 specifically comprises:

Step 4-1: use the OpenCL standard library function clSetKernelArg() to pass the arguments to the scene rendering kernel function, the arguments including the view-ray intersection data, the pixel matrix and the photon map;

Step 4-2: use the OpenCL standard library function clEnqueueWriteBuffer() to transfer the view-ray intersection data and photon map needed by scene rendering from the data buffers to the GPU through the command queue;

Step 4-3: use the OpenCL standard library function clEnqueueNDRangeKernel() to launch the scene rendering kernel, whose execution comprises:

Step 4-3-1: with the view-ray intersection as the center, find the number of photons within the specified radius;

Step 4-3-2: compute the radiance at the view-ray intersection by Equation (1):

\begin{aligned}
L(x, \vec{\omega}) &= \int_{2\pi} f_r(x, \vec{\omega}, \vec{\omega}')\, L(x, \vec{\omega}')\, (\vec{n} \cdot \vec{\omega}')\, d\omega' \\
&\approx \frac{1}{\Delta A} \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \Delta\phi_p(x_p, \vec{\omega}_p) \\
&= \frac{1}{\pi R(x)^2} \frac{\tau(x, \vec{\omega})}{N_{emitted}}
\end{aligned} \qquad (1)

\tau(x, \vec{\omega}) = N_{emitted} \cdot \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p)\, \Delta\phi_p(x_p, \vec{\omega}_p) \qquad (2)

where N_{emitted} is the number of emitted photons, \vec{ω} is the incident direction of the viewpoint ray at the intersection, \vec{ω}_p is the incident direction of photon p, f_r is the BRDF, and Δφ_p(x_p, \vec{ω}_p) is the flux of photon p at x_p in direction \vec{ω}_p;

Step 4-3-3: determine the color value of the pixel corresponding to the intersection from the radiance at the view-ray intersection;

Step 4-4: use the OpenCL standard library function clFinish() to wait until all commands in the command queue have completed, and call clEnqueueReadBuffer() to transfer the pixel matrix color values computed by OpenCL to CPU memory.

6. the gradual Photon Mapping device of the parallelization based on OpenCL, is characterized in that, this device comprises: dispensing unit, parallelization unit and releasing unit, wherein,

Described dispensing unit, for initialization OpenCL calculating parameter, comprise: the dimension of working group and size information, create device context and instruction queue, create model of place memory object, sight line intersection point memory object, photon pinup picture memory object and picture element matrix memory object, load viewpoint ray trace, photon tracking and scene rendering file, statement viewpoint ray trace kernel function, photon are followed the tracks of kernel function and scene rendering kernel function;

the parallelization unit is configured to use OpenCL to perform viewpoint ray tracing, photon tracing and scene rendering in parallel on the GPU, and to pass the final computation result to the CPU;

the release unit is configured to use the OpenCL standard library function clReleaseMemObject() to release the data resources stored on the GPU.

7. The device according to claim 6, characterized in that the configuration unit comprises a work-group setting unit, a platform creation unit and a kernel loading unit, wherein:

the work-group setting unit is configured to set the work-group size to 16 × 16 and to ensure that each dimension of the global work size is evenly divisible by the corresponding dimension of the work-group;

the platform creation unit is configured to create the device context with the OpenCL standard library function clCreateContextFromType() and to create the command queue with clCreateCommandQueue();

the kernel loading unit is configured to load the viewpoint ray tracing, photon tracing and scene rendering files with the OpenCL library function clCreateProgramWithSource() and to build them into executable kernel programs with clBuildProgram().
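The divisibility condition on the work-group size can be met in more than one way; the claim states only the requirement. One common approach, shown here as a sketch under that assumption, is to round each image dimension up to the next multiple of the work-group dimension. The helper name `global_work_size` is hypothetical, and a kernel launched this way would need a bounds check so that the padding work-items do not write outside the pixel matrix.

```python
def global_work_size(width, height, local=(16, 16)):
    """Round each image dimension up to a multiple of the work-group size.

    Assumption: padding the global size is acceptable; the source only
    requires that each global dimension be divisible by the work-group
    dimension (16 x 16 in the claim).
    """
    def round_up(n, m):
        # Smallest multiple of m that is >= n
        return ((n + m - 1) // m) * m

    return (round_up(width, local[0]), round_up(height, local[1]))
```

A 640 × 480 image is already divisible by 16 in both dimensions and is left unchanged, whereas a 1000 × 700 image would be padded to 1008 × 704.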

8. The device according to claim 6, characterized in that the parallelization unit comprises a parallelized viewpoint ray tracing unit, a parallelized photon tracing unit and a parallelized scene rendering unit, wherein:

the parallelized viewpoint ray tracing unit is configured to: pass the argument addresses to the viewpoint ray tracing kernel function with the OpenCL standard library function clSetKernelArg(), the arguments comprising the scene model data, the pixel matrix and the viewpoint-ray hit point data; transfer the scene model data and pixel matrix required for viewpoint ray tracing from the command-queue data buffer to the GPU with the OpenCL library function clEnqueueWriteBuffer(); launch and execute the viewpoint ray tracing kernel with the OpenCL standard library function clEnqueueNDRangeKernel(), the kernel emitting rays from the viewpoint through the grid points of the imaging plane into the scene, computing their intersections with the geometric patches of the scene, and saving and updating the hit point information; and finally wait with the OpenCL standard library function clFinish() until all commands in the command queue have completed, and call clEnqueueReadBuffer() to transfer the viewpoint-ray hit point data computed by OpenCL to CPU memory;

the parallelized photon tracing unit is configured to: pass the argument addresses to the photon tracing kernel function with the OpenCL standard library function clSetKernelArg(), the arguments comprising the scene model data and the photon map; transfer the scene model data required for viewpoint ray tracing from the command-queue data buffer to the GPU with the OpenCL standard library function clEnqueueWriteBuffer(); launch and execute the photon tracing kernel with the OpenCL standard library function clEnqueueNDRangeKernel(), the kernel creating photons, emitting them from the light source into the scene and tracing them, and storing absorbed photons in the photon map; and finally wait with the OpenCL standard library function clFinish() until all commands in the command queue have completed, and call clEnqueueReadBuffer() to transfer the photon map computed by OpenCL to CPU memory;

the parallelized scene rendering unit is configured to: pass the argument addresses to the scene rendering kernel function with the OpenCL standard library function clSetKernelArg(), the arguments comprising the scene pixel matrix, the viewpoint-ray hit point data and the photon map; transfer the pixel matrix, viewpoint-ray hit point data and photon map required for viewpoint ray tracing from the command-queue data buffer to the GPU with the OpenCL standard library function clEnqueueWriteBuffer(); launch and execute the scene rendering kernel with the OpenCL standard library function clEnqueueNDRangeKernel(), the kernel looking up the hit points computed during viewpoint ray tracing to obtain the hit point corresponding to each pixel, counting the photons within a specified radius centered on the hit point, and computing the radiance at the hit point from those photons, formula (1) being used to solve for the radiance; determine the color value of the pixel corresponding to each hit point from the radiance; and finally wait with the OpenCL standard library function clFinish() until all commands in the command queue have completed, and call clEnqueueReadBuffer() to transfer the pixel-matrix color values computed by OpenCL to CPU memory.

\begin{aligned} L(x, \vec{\omega}) &= \int_{2\pi} f_r(x, \vec{\omega}, \vec{\omega}') \, L(x, \vec{\omega}') \, (\vec{n} \cdot \vec{\omega}') \, d\omega' \\ &\approx \frac{1}{\Delta A} \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p) \, \Delta\phi_p(x_p, \vec{\omega}_p) \\ &= \frac{1}{\pi R(x)^2} \frac{\tau(x, \vec{\omega})}{N_{emitted}} \end{aligned} \qquad (1)

\tau(x, \vec{\omega}) = N_{emitted} \cdot \sum_{p=1}^{n} f_r(x, \vec{\omega}, \vec{\omega}_p) \, \Delta\phi_p(x_p, \vec{\omega}_p) \qquad (2)
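The gather and color steps performed inside the scene rendering kernel can be sketched on the CPU as follows. This is only an illustration of the logic, not the kernel itself: the brute-force search, the function names `gather_photons` and `to_pixel`, and the clamp-and-quantize mapping from radiance to an 8-bit color are all assumptions, since the claim does not specify the search structure or how radiance is converted to a color value.

```python
def gather_photons(hit_point, photon_positions, radius):
    """Return the indices of photons within `radius` of the hit point.

    Brute-force search for clarity; the source does not specify the
    search structure used on the GPU.
    """
    r2 = radius * radius
    return [
        i
        for i, p in enumerate(photon_positions)
        if sum((a - b) ** 2 for a, b in zip(hit_point, p)) <= r2
    ]

def to_pixel(radiance_rgb):
    """Clamp linear radiance to [0, 1] and quantize to 8 bits per channel.

    Hypothetical mapping chosen for illustration only.
    """
    return tuple(int(round(255 * min(max(c, 0.0), 1.0))) for c in radiance_rgb)
```

In a full renderer, the photons returned by the gather step would supply the sum in formula (2), and the resulting radiance per color channel would be passed to the color mapping.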