Summary of the Invention
It is an object of the invention to provide a method that improves the real-time performance and processing efficiency of image processing.
To achieve this object, the technical solution of the invention is as follows:
A dual-graphics-card GPU parallel computing method for image processing, comprising the following steps:
Step 1: Initialize the GPU resources of the two graphics cards;
Step 2: Divide the image memory into image data block one and image data block two, which are contiguous in physical memory, and set the size of the overlap region between image data block one and image data block two;
Step 3: Create and start two threads, each of which calls the GPU resources of one graphics card and executes the image processing program, to obtain the respective results of image data block one and image data block two;
Step 4: Discard the data of the overlap region and merge the results of image data block one and image data block two, completing the concurrent processing of one image by the GPUs of the two graphics cards.
Further, step 1 specifically comprises:
Step 101: Install the library files required for GPU computation;
Step 102: Initialize the graphics card devices by finding, on the platform, the graphics cards whose GPU resources can be called;
Step 103: Initialize the graphics card device information and create the program objects required for GPU computation;
Step 104: Compile the GPU parallel program modules.
Further, in step 2, the image memory is divided by horizontal, vertical, or oblique splitting. For horizontal splitting:
Step 201: Divide the image memory evenly into two blocks along a horizontal dividing line;
Step 202: Construct two overlap regions of equal size, overlap region one and overlap region two, on the two sides (above and below) of the horizontal dividing line; the data of overlap region one and overlap region two are identical to the data of the underlying image memory. Image data block one consists of the part above the horizontal dividing line plus overlap region two; image data block two consists of the part below the horizontal dividing line plus overlap region one;
Step 203: Perform the neighborhood computation on image data block one and image data block two; overlap region two supplies data for the neighborhood computation on image data block one, and overlap region one supplies data for the neighborhood computation on image data block two.
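Steps 201 and 202 can be illustrated with a minimal sketch. It assumes the image is held as a Python list of rows and that the overlap is counted in whole rows; the function name `split_with_overlap` and the parameter `overlap_rows` are hypothetical and do not appear in the original text.

```python
def split_with_overlap(image, overlap_rows):
    """Split a row-major image (list of rows) at its horizontal midline.

    Assumption: overlap region one is the last `overlap_rows` rows above the
    dividing line, and overlap region two is the first `overlap_rows` rows
    below it.
    """
    height = len(image)
    mid = height // 2  # row index of the horizontal dividing line
    if not 0 <= overlap_rows <= min(mid, height - mid):
        raise ValueError("overlap must fit inside each half")
    # Block one: part above the line plus overlap region two.
    block_one = image[:mid + overlap_rows]
    # Block two: overlap region one plus the part below the line.
    block_two = image[mid - overlap_rows:]
    return block_one, block_two, mid


# An 8-row, 4-column test image whose values encode (row, column).
image = [[r * 10 + c for c in range(4)] for r in range(8)]
block_one, block_two, mid = split_with_overlap(image, overlap_rows=1)
```

Because slicing a list of rows reuses the same row objects, the overlap rows in both blocks hold exactly the data of the underlying image, in line with step 202.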
Further, for vertical splitting in step 2: first transpose the image by 90°, then process it as in horizontal splitting.
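As a minimal sketch of this reduction, the example below reads the 90° transposition as a matrix transpose (rows become columns); this reading is an assumption, and a rotation would serve the same purpose. The helper name `transpose` is hypothetical.

```python
def transpose(image):
    """Swap rows and columns of a row-major image (list of rows)."""
    return [list(column) for column in zip(*image)]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]

# Transpose, split horizontally at the midline, then transpose each part back.
transposed = transpose(image)      # 4 rows, 2 columns
mid = len(transposed) // 2
left = transpose(transposed[:mid])   # columns 0 and 1 of the original image
right = transpose(transposed[mid:])  # columns 2 and 3 of the original image
```

After the transposition, the vertical dividing line of the original image is a horizontal one, so the horizontal splitting procedure can be reused unchanged.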
Further, the size of the overlap region in step 202 is determined by the size of the neighborhood computation: the size of the overlap region is twice the height of the neighborhood computation.
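The sizing rule can be worked through numerically. The sketch below assumes that the neighborhood is a centered window of odd height and that sizes are counted in rows; both function names are hypothetical, and whether the stated size applies to each overlap region or to both together is not settled by the text.

```python
def overlap_size(neighborhood_height):
    """Overlap size as stated in the text: twice the neighborhood height."""
    return 2 * neighborhood_height


def rows_read_beyond_line(neighborhood_height):
    """Rows that a centered window of this height reads on one side of a pixel.

    Assumption: the window height is odd and the window is centered.
    """
    return neighborhood_height // 2


k = 3                              # e.g. a window 3 rows high
size = overlap_size(k)             # 2 * 3 = 6 rows
needed = rows_read_beyond_line(k)  # 3 // 2 = 1 row
```

Under these assumptions, the stated size is larger than the number of rows the window actually reads across the dividing line, so it covers the neighborhood under either reading of the rule.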
Further, step 3 specifically comprises:
Step 301: Create and start two threads, each of which calls the GPU resources of one graphics card;
Step 302: Execute the image processing program in the two threads, each thread handling one image data block; use the thread control interface to wait synchronously for both threads to finish their processing, and record the longer of the two graphics cards' running times as the processing time of the image memory.
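The thread structure of steps 301 and 302 can be sketched as follows. This is a CPU stand-in, not a GPU implementation: the `kernel` argument is an ordinary Python function, whereas in a real system each thread would select one graphics card through the GPU library and launch the compiled program there. All names (`run_on_device`, `process_in_parallel`, `invert`) are hypothetical.

```python
import threading
import time


def run_on_device(device_id, block, kernel, results, elapsed):
    """Worker body: process one block and record how long it took.

    Assumption: `kernel` stands in for the image processing program that
    would run on graphics card `device_id`.
    """
    start = time.perf_counter()
    results[device_id] = kernel(block)
    elapsed[device_id] = time.perf_counter() - start


def process_in_parallel(block_one, block_two, kernel):
    results = [None, None]
    elapsed = [0.0, 0.0]
    threads = [
        threading.Thread(target=run_on_device,
                         args=(i, block, kernel, results, elapsed))
        for i, block in enumerate((block_one, block_two))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait synchronously for both threads to finish
    # The longer of the two running times is the overall processing time.
    return results[0], results[1], max(elapsed)


def invert(block):
    """Example per-pixel operation on 8-bit values."""
    return [[255 - p for p in row] for row in block]


out_one, out_two, total_time = process_in_parallel([[0, 10]], [[255, 5]], invert)
```

Each worker writes only to its own slot of `results` and `elapsed`, so the two threads need no lock; `join` plays the role of the thread control interface mentioned in step 302.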
Further, step 4 specifically comprises:
Step 401: Image data block one discards the data of overlap region two, and image data block two discards the data of overlap region one;
Step 402: Merge the results of image data block one and image data block two;
Step 403: Release the GPU resources and output the result.
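Steps 401 and 402 amount to trimming each result and concatenating the remainder. The sketch below assumes row-major results in which overlap region two forms the last rows of the first result and overlap region one forms the first rows of the second; the function name `merge_results` is hypothetical.

```python
def merge_results(result_one, result_two, overlap_rows):
    """Drop the overlap rows from each result and join what remains."""
    # Result one ends with overlap region two: discard its last rows.
    keep_one = result_one[:len(result_one) - overlap_rows]
    # Result two starts with overlap region one: discard its first rows.
    keep_two = result_two[overlap_rows:]
    return keep_one + keep_two


result_one = [["a0"], ["a1"], ["a2"], ["x"]]  # "x" came from overlap region two
result_two = [["y"], ["b0"], ["b1"], ["b2"]]  # "y" came from overlap region one
merged = merge_results(result_one, result_two, overlap_rows=1)
```

The slice bound is written as `len(result_one) - overlap_rows` rather than `-overlap_rows` so that an overlap of zero rows keeps the whole block instead of producing an empty one.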
Beneficial effects of the invention:
The invention improves the computational efficiency of image processing; overall efficiency can increase by 70% to 80%. For the 4K image frames that are now mainstream, the dual-graphics-card GPU concurrency mechanism can greatly improve the real-time performance of image processing, at a cost lower than that of a DSP or FPGA hardware design achieving a comparable effect, which is of great practical significance for the real-time processing of high-resolution images. In addition, setting the overlap region ensures that the pixels near the newly created edges are computed consistently with the original image.
Detailed Description
Preferred embodiments of the invention are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the invention, it should be understood that the invention may be implemented in various forms and should not be limited by the embodiments set forth here.
As shown in Figures 1 to 3, a dual-graphics-card GPU parallel computing method for image processing comprises the following steps:
Step 1: Initialize the GPU resources of the two graphics cards and ensure that the devices are valid. Specifically: install the library files required for GPU computation; initialize the graphics card devices by finding, on the platform, the graphics cards whose GPU resources can be called; initialize the graphics card device information and create the program objects required for GPU computation; compile the GPU parallel program modules.
Step 2: Divide the image memory into two physically contiguous blocks and set the size of the overlap region between the two memory blocks. The size of the overlap region is determined by the size of the neighborhood computation: it is twice the height of the neighborhood computation. Specifically, as shown in Figure 3, the two image memory blocks are: 1. the part above the shaded region, the shaded region itself, and the part below the shaded region that has the same area as the shaded region; 2. the shaded region and the part below it. The two blocks are computed under different considerations. The computation of the upper half does not need to enter the part below the shaded region that has the same area as the shaded region; that part only supplies data so that the neighborhood computation up to the middle line can be completed. The computation of the lower half does not enter the shaded region; the shaded region only supplies data so that the neighborhood computation at the middle line can be completed.
Step 3: Create and start two threads, each of which calls the GPU resources of one graphics card and executes the image processing program; use the thread control interface to wait synchronously for both threads to finish their processing, and record the longer of the two graphics cards' running times as the processing time of the image memory, obtaining the respective results of image data block one and image data block two;
Step 4: Discard the data of the overlap region and merge the results of image data block one and image data block two, completing the concurrent processing of one image by the GPUs of the two graphics cards. Specifically: image data block one discards the data of overlap region two, and image data block two discards the data of overlap region one; merge the results of image data block one and image data block two; release the GPU resources and output the result.
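The four steps can be checked end to end on a small example. The sketch below is a CPU stand-in under stated assumptions: the image is a list of rows, the neighborhood computation is a hypothetical 3-row vertical sum with clamped edges (`vertical_sum3`), and two Python threads take the place of the two graphics cards. It compares the split-and-merge result against processing the whole image at once.

```python
import threading


def vertical_sum3(block):
    """Sum each pixel with the pixels directly above and below it.

    Rows outside the block are replaced by the nearest edge row (clamping).
    """
    h = len(block)
    return [[block[max(r - 1, 0)][c] + block[r][c] + block[min(r + 1, h - 1)][c]
             for c in range(len(block[0]))]
            for r in range(h)]


def process_split(image, overlap_rows, kernel):
    """Split at the midline, process both blocks in threads, trim and merge."""
    mid = len(image) // 2
    blocks = [image[:mid + overlap_rows], image[mid - overlap_rows:]]
    results = [None, None]

    def worker(i):
        results[i] = kernel(blocks[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Keep only the rows each block owns; the overlap rows are discarded.
    return results[0][:mid] + results[1][overlap_rows:]


image = [[(3 * r + 7 * c) % 11 for c in range(5)] for r in range(8)]
reference = vertical_sum3(image)                     # whole image at once
with_overlap = process_split(image, 1, vertical_sum3)
without_overlap = process_split(image, 0, vertical_sum3)
```

With one overlap row the merged result equals the reference, whereas with no overlap the rows next to the dividing line differ, because each block then clamps at an edge that does not exist in the original image.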
A concurrency mechanism with more graphics cards would work in essentially the same way as the dual-card mechanism, but a typical PC architecture can hardly accommodate more than two graphics cards. Dual graphics cards are therefore used as the primary structure for multi-GPU concurrent processing.
Various embodiments of the invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments.