CN111833232A - Image processing device - Google Patents

Image processing device Download PDF

Info

Publication number
CN111833232A
Authority
CN
China
Prior art keywords
image processing
image data
accelerator card
cpu
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910312498.8A
Other languages
Chinese (zh)
Other versions
CN111833232B (en)
Inventor
张凯军 (Zhang Kaijun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910312498.8A
Priority claimed from CN201910312498.8A (external priority)
Publication of CN111833232A
Application granted
Publication of CN111833232B
Legal status: Active (current)
Anticipated expiration

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/20 Processor architectures; processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/60 Memory management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention provides an image processing device, which comprises a CPU and a GPU. An image processing accelerator card is arranged in the image processing device, and both the CPU and the GPU are communicatively connected to the image processing accelerator card. The image processing device determines a designated processor among its CPU and GPU according to the load amounts of the CPU and the GPU, and the designated processor is configured to perform: obtaining target image data to be processed; writing the target image data into an image data cache region of the image processing accelerator card, so that the image processing accelerator card acquires the target image data from the image data cache region, performs image processing on the target image data to obtain an image processing result, and writes the image processing result into an image processing result cache region of the image processing accelerator card; and reading the image processing result from the image processing result cache region. With the technical scheme provided by the embodiment of the invention, the workload of the CPU and the GPU can be reduced, thereby reducing the load of the CPU and the GPU.

Description

Image processing device
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing apparatus.
Background
With the development of science and technology, electronic devices can perform image processing on images, such as identifying targets in an image, tracking targets in an image, or measuring and estimating features of targets or regions in an image.
In the related art, an electronic device generally adopts a CPU + GPU architecture to implement image processing. However, the CPU and the GPU must perform many other tasks in addition to the image processing task, so their workload is large and their load is high.
Disclosure of Invention
An object of the embodiments of the present invention is to provide an image processing apparatus, so as to reduce the workload of a CPU and a GPU, thereby reducing the load of the CPU and the GPU. The specific technical scheme is as follows:
the embodiment of the invention provides an image processing apparatus, which comprises a CPU and a GPU, wherein an image processing accelerator card is arranged in the image processing apparatus, the CPU and the GPU of the image processing apparatus are both communicatively connected to the image processing accelerator card, the image processing apparatus determines a designated processor among the CPU and the GPU of the image processing apparatus according to the load amounts of the CPU and the GPU of the image processing apparatus, and the designated processor is configured to perform:
obtaining target image data to be processed;
writing the target image data into an image data cache region of the image processing accelerator card so that the image processing accelerator card obtains the target image data from the image data cache region, and performing image processing on the target image data to obtain an image processing result; writing the image processing result into an image processing result cache region of the image processing accelerator card;
and reading the image processing result from the image processing result buffer area.
Optionally, the determining, according to the load amounts of the CPU and the GPU of the image processing apparatus, a designated processor among the CPU and the GPU of the image processing apparatus includes:
the CPU of the image processing apparatus determines whether its load is less than a preset load;
if the load of the CPU of the image processing apparatus is less than the preset load, the CPU of the image processing apparatus is determined to be the designated processor;
and if the load of the CPU of the image processing apparatus is not less than the preset load, the GPU of the image processing apparatus is determined to be the designated processor.
Optionally, the image data buffer of the image processing accelerator card includes a plurality of sub-regions, and the image data occupation state of each sub-region is represented by a code bit value of one code bit of the first register.
Optionally, the writing, by the designated processor, the target image data into an image data cache region of the image processing accelerator card includes:
the designated processor searches for a first code bit among the code bits of the first register, wherein the code bit value of the first code bit indicates that a sub-region is not occupied by image data;
and the designated processor writes the target image data into a first sub-region corresponding to the first code bit.
Optionally, the writing, by the designated processor, the target image data into an image data buffer of the image processing accelerator card includes:
and the designated processor writes the target image data into the image data cache region of the image processing accelerator card by direct memory access (DMA).
Optionally, the designated processor is further configured to perform: importing a configuration file of the image processing algorithm into the image processing accelerator card, so that the image processing accelerator card calls the model parameters corresponding to the image processing algorithm in the configuration file and performs image processing on the target image data with the image processing algorithm to obtain the image processing result.
Optionally, the image processing result buffer area includes a plurality of sub-areas, and each sub-area is represented by one code bit of the second register.
Optionally, the reading, by the designated processor, the image processing result from the image processing result buffer area includes:
the designated processor determines a second sub-region of the image processing result cache region of the image processing accelerator card, wherein the second sub-region is: a sub-region characterized by a second code bit of the second register, the second code bit being located at the same position as the first code bit in the first register;
and the designated processor reads the image processing result from the second sub-region of the image processing result cache region.
Optionally, the determining, by the designated processor, a second sub-region of the image processing result cache region of the image processing accelerator card includes:
the designated processor calls an interrupt monitoring processing function to monitor whether the image processing accelerator card sends an interrupt signal;
and after monitoring that the image processing accelerator card sends an interrupt signal, the designated processor determines the second sub-region of the image processing result cache region of the image processing accelerator card.
Optionally, the image processing accelerator card is composed of an integrated processing system (PS) and programmable logic (PL).
According to the technical scheme provided by the embodiment of the invention, after a CPU or a GPU of an image processing device obtains target image data to be processed, the target image data is written into an image data cache region of an image processing accelerator card; the image processing accelerator card can acquire target image data from the image data cache region and perform image processing on the target image data to obtain an image processing result; and writes the image processing result into an image processing result buffer area of the image processing accelerator card, so that the CPU or GPU of the image processing apparatus can read the image processing result from the image processing result buffer area. Therefore, according to the technical scheme provided by the embodiment of the invention, the image processing is realized by the image processing accelerator card, so that the workload of the CPU and the GPU of the image processing device can be reduced, and the load of the CPU and the GPU is further reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image processing method executed by a designated processor according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a designated processor storing target image data according to an embodiment of the present invention;
FIG. 4(a) is a diagram illustrating a designated processor writing target image data into an image data buffer of an image processing accelerator card according to an embodiment of the present invention;
fig. 4(b) is a schematic diagram of a designated processor reading an image processing result from an image processing result buffer of an image processing accelerator card according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to reduce the workload of a CPU and a GPU of an image processing apparatus and further reduce the load of the CPU and the GPU, an embodiment of the invention provides an image processing apparatus.
The image processing apparatus may be an electronic device such as a server or a terminal; the image processing apparatus is not specifically limited in the embodiment of the present invention.
As shown in fig. 1, the image processing apparatus provided by the embodiment of the present invention may include a CPU110 and a GPU120, and an image processing accelerator card 130 is disposed in the image processing apparatus, and both the CPU110 and the GPU120 of the image processing apparatus are communicatively connected to the image processing accelerator card 130.
In practical applications, before the image processing apparatus performs image processing, a designated processor may be determined among the CPU and the GPU of the image processing apparatus according to the load amounts of the CPU and the GPU of the image processing apparatus.
In one embodiment, determining a designated processor among the CPU and the GPU of the image processing apparatus according to the load amounts of the CPU and the GPU of the image processing apparatus may include:
the CPU of the image processing apparatus determines whether its load is less than a preset load;
if the load of the CPU of the image processing apparatus is less than the preset load, the CPU of the image processing apparatus is determined to be the designated processor;
and if the load of the CPU of the image processing apparatus is not less than the preset load, the GPU of the image processing apparatus is determined to be the designated processor.
In this embodiment, the CPU of the image processing apparatus may determine whether its own current load amount is less than a preset load amount, and if the current load amount of the CPU of the image processing apparatus is less than the preset load amount, it indicates that the current load of the CPU of the image processing apparatus is low, and therefore, the CPU of the image processing apparatus may be determined as a designated processor to obtain the target image data to be processed by using the CPU of the image processing apparatus. If the current load of the CPU of the image processing device is not less than the preset load, the current load of the CPU of the image processing device is higher, and therefore the GPU of the image processing device can be determined as the designated processor, so that the GPU of the image processing device is utilized to obtain the target image data to be processed.
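As a minimal illustration of the selection logic described above, consider the following C sketch; the helper query_cpu_load, the 0.80 threshold, and the type names are assumptions introduced here for illustration and are not prescribed by the embodiment.

/* Hypothetical helper: how the current CPU load is measured (e.g. utilization
 * in [0.0, 1.0]) is an implementation choice, not fixed by the embodiment. */
extern double query_cpu_load(void);

#define PRESET_LOAD 0.80   /* assumed preset load; set according to the actual situation */

typedef enum { PROCESSOR_CPU, PROCESSOR_GPU } processor_t;

/* If the CPU's current load is below the preset load, the CPU is the
 * designated processor; otherwise the GPU is. */
static processor_t select_designated_processor(void)
{
    return (query_cpu_load() < PRESET_LOAD) ? PROCESSOR_CPU : PROCESSOR_GPU;
}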
It can be understood that the magnitude of the preset load may be set according to an actual situation, and the magnitude of the preset load is not specifically limited in the embodiment of the present invention.
After determining the designated processor, the designated processor may be used to perform steps S210-S230 as follows, as shown in FIG. 2.
S210, target image data to be processed are obtained.
The designated processor can acquire target image data to be processed from image acquisition equipment, wherein the image acquisition equipment can be a network camera, a network dome camera and the like. Specifically, the image capturing device may capture an image or a video, and after the image or the video is captured, the image or the video may be sent to the designated processor, so that the designated processor may obtain target image data to be processed.
It should be noted that the image capturing device may be provided with a network SDK, and the image processing apparatus may be provided with a network SDK matching the network SDK in the image capturing device, so that the designated processor may obtain the target image data from the image capturing device in real time through the network SDK.
The process of the designated processor acquiring the target image data from the image acquisition device in real time may be as follows: first, initialize the network SDK to obtain an operation handle; second, register the electronic device, where specifically the designated processor may log in using a preset user name, a preset password, and a preset IP address; third, set an exception-information callback function and handle exceptional conditions, such as reconnection after disconnection, in that callback; and fourth, start previewing and set a decoding callback function, and obtain the target image data by calling the decoding callback function.
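The four-step acquisition flow can be outlined in C as below; every function name (sdk_init, sdk_login, and so on) is a hypothetical placeholder for the corresponding network SDK call, since the embodiment does not name a concrete API, and the login parameters are placeholders as well.

/* Hypothetical outline of the real-time acquisition flow; none of these
 * functions belong to an actual SDK. */
typedef void (*decode_cb_t)(const unsigned char *yuv, int width, int height, void *user);
typedef void (*exception_cb_t)(int error_code, void *user);

extern long sdk_init(void);                                              /* step 1: initialize, obtain operation handle */
extern int  sdk_login(long handle, const char *ip,
                      const char *user, const char *password);           /* step 2: register / log in */
extern int  sdk_set_exception_callback(long handle,
                                        exception_cb_t cb, void *user);  /* step 3: exception handling */
extern int  sdk_start_preview(long handle, decode_cb_t cb, void *user);  /* step 4: preview + decode callback */

static void on_exception(int error_code, void *user)
{
    /* e.g. reconnect after disconnection */
    (void)error_code; (void)user;
}

static void on_frame(const unsigned char *yuv, int width, int height, void *user)
{
    /* the decoded target image data (YUV) is obtained here */
    (void)yuv; (void)width; (void)height; (void)user;
}

static int start_acquisition(void)
{
    long handle = sdk_init();
    if (handle < 0)
        return -1;
    if (sdk_login(handle, "192.0.2.1", "preset-user", "preset-password") != 0)
        return -1;
    sdk_set_exception_callback(handle, on_exception, 0);
    return sdk_start_preview(handle, on_frame, 0);
}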
The target image data acquired by the designated processor may be YUV data. The acquired YUV data may be stored by the designated processor in the storage format shown in fig. 3; as can be seen from fig. 3, every four Y samples correspond to one U and one V.
In order to reduce the amount of computation of the image processing accelerator card during image processing in the subsequent step, the designated processor may down-sample the acquired target image data. For example, 2 × 2 down-sampling may be adopted, that is, the four pixels in each 2 × 2 window of the target image data are merged into one pixel whose value is the average of the four pixels.
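A minimal sketch of the 2 × 2 averaging on the luminance (Y) plane, assuming planar YUV 4:2:0 storage with even width and height; the chroma planes would be handled analogously at their own resolution.

#include <stdint.h>

/* 2x2 average down-sampling of a luminance (Y) plane: each 2x2 window of
 * source pixels becomes one destination pixel whose value is the average of
 * the four source pixels. */
static void downsample_y_2x2(const uint8_t *src, int src_w, int src_h,
                             uint8_t *dst /* (src_w/2) * (src_h/2) bytes */)
{
    int dst_w = src_w / 2;
    int dst_h = src_h / 2;
    for (int y = 0; y < dst_h; y++) {
        for (int x = 0; x < dst_w; x++) {
            int sum = src[(2 * y) * src_w + 2 * x]
                    + src[(2 * y) * src_w + 2 * x + 1]
                    + src[(2 * y + 1) * src_w + 2 * x]
                    + src[(2 * y + 1) * src_w + 2 * x + 1];
            dst[y * dst_w + x] = (uint8_t)(sum / 4);
        }
    }
}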
S220, writing the target image data into an image data cache region of the image processing accelerator card so that the image processing accelerator card obtains the target image data from the image data cache region, and performing image processing on the target image data to obtain an image processing result; and writing the image processing result into an image processing result buffer area of the image processing accelerator card.
After the designated processor obtains the target image data to be processed, the target image data may be written into the image processing accelerator card through the PCIE bus, where PCIE is a high-speed serial computer expansion bus standard whose main advantage is high data transmission efficiency.
In practical applications, the designated processor continuously acquires target image data and continuously transmits the target image data to the image processing accelerator card. Therefore, in the embodiment of the present invention, the designated processor may write the target image data into an image data buffer of the image processing accelerator card, so as to prevent the target image data from being lost and ensure the smoothness of the target image data processing.
As an implementation manner of the embodiment of the present invention, the step of the designated processor writing the target image data into the image data buffer of the image processing accelerator card may include:
and the designated processor writes the target image data into the image data buffer of the image processing accelerator card by direct memory access (DMA).
In the DMA transfer mode, the target image data is copied from the address space of the designated processor to the address space of the image processing accelerator card; moreover, a DMA transfer is completed by the DMA controller without involving the designated processor, so writing the target image data into the image data buffer of the image processing accelerator card by DMA can improve the efficiency of writing the target image data into the image processing accelerator card. For completeness of the scheme and clarity of description, a specific implementation process of writing target image data into the image data buffer of the image processing accelerator card will be described in the following embodiments.
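A minimal host-side sketch of handing a frame over for DMA, assuming the accelerator card's driver exposes a character device and performs the DMA transfer on a plain write(); the device node name and the interface are assumptions, and a real driver might use an ioctl- or mmap-based DMA interface instead.

#include <fcntl.h>
#include <unistd.h>
#include <stddef.h>

/* Hand one frame to the accelerator-card driver, which copies it by DMA from
 * host memory into the card's image data buffer without occupying the
 * designated processor during the transfer. */
static int dma_write_frame(const void *frame, size_t len)
{
    int fd = open("/dev/img_accel", O_WRONLY);   /* assumed device node */
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, frame, len);           /* driver queues the DMA descriptor */
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}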
After the designated processor writes the target image data into the image data cache region of the image processing accelerator card, the image processing accelerator card may acquire the target image data from the image data cache region, perform image processing on the target image data by using an image processing algorithm to obtain an image processing result, and write the image processing result of the target image data into the image processing result cache region of the image processing accelerator card. For completeness of the scheme and clarity of description, the following embodiments will describe in detail the specific implementation process by which the image processing accelerator card acquires the target image data to be processed from the image data cache region and writes the image processing result into the image processing result cache region.
Moreover, because the CPU and the GPU are both general-purpose processors, their hardware architectures are relatively fixed, so some image processing algorithms run relatively slowly on them. The image processing accelerator card, by contrast, is composed of an integrated processing system (PS) and programmable logic (PL). Because its architecture includes the programmable logic PL, the architecture of the image processing accelerator card is flexible, and image processing accelerator cards with different architectures can be customized for different image processing algorithms, so the running speed of the image processing algorithms, and thus the speed of image processing, can be increased.
S230, reading the image processing result from the image processing result buffer area.
After the image processing accelerator card writes the image processing result in the image processing result buffer, the designated processor may read the image processing result of the target image data from the image processing result buffer.
For completeness of the scheme and clarity of description, a specific implementation process for reading the image processing result from the image processing result buffer area will be described in the following embodiments.
According to the technical scheme provided by the embodiment of the invention, after a CPU or a GPU of an image processing device obtains target image data to be processed, the target image data is written into an image data cache region of an image processing accelerator card; the image processing accelerator card can acquire target image data from the image data cache region and perform image processing on the target image data to obtain an image processing result; and writes the image processing result into an image processing result buffer area of the image processing accelerator card, so that the CPU or GPU of the image processing apparatus can read the image processing result from the image processing result buffer area. Therefore, according to the technical scheme provided by the embodiment of the invention, the image processing is realized by the image processing accelerator card, so that the workload of the CPU and the GPU of the image processing device can be reduced, and the load of the CPU and the GPU is further reduced. Moreover, because the architecture of the image processing accelerator card has flexibility, the image processing accelerator cards with different architectures can be customized for different image processing algorithms, so that the running speed of the image processing algorithms can be improved.
The specific implementation process of the designated processor writing the target image data into the image data buffer of the image processing accelerator card and reading the image processing result from the image processing result buffer will be described below.
In one embodiment, the image data buffer includes a plurality of sub-regions, and the image data occupation state of each sub-region is represented by a code bit value of one code bit of the first register.
At this time, the step of designating the processor to write the target image data into the image data buffer of the image processing accelerator card may include:
the designated processor searches for a first code bit among the code bits of the first register, wherein the code bit value of the first code bit indicates that a sub-region is not occupied by image data;
and the designated processor writes the target image data into a first sub-region corresponding to the first code bit.
In this embodiment, the image processing accelerator card includes: a first register; the image data buffer area comprises a plurality of sub-areas, and the number of code bits of the first register is the same as that of the sub-areas of the image data buffer area. Each code bit of the first register corresponds to a sub-region of the image data cache region, and the image data occupation state of the sub-region can be characterized by the code bit value of the code bit. For example, a code bit value of the code bit is 0, which may be used to characterize that the sub-region is not occupied by image data; the code bit value of the code bit is 1, which can be used to characterize that the sub-region is occupied by the image data.
When the designated processor writes the target image data into the image data cache region of the image processing accelerator card, it may search for the first code bit among the code bits of the first register, where the code bit value of the first code bit indicates that the sub-region corresponding to the first code bit is not occupied by image data; the designated processor may then write the target image data into the sub-region corresponding to the first code bit.
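A minimal sketch of this bookkeeping on the designated-processor side, assuming an 8-bit first register and an image data cache region split into eight equally sized, memory-mapped sub-regions; the sub-region size is an assumption.

#include <stdint.h>
#include <string.h>

#define NUM_SUBREGIONS  8
#define SUBREGION_SIZE  (1u << 20)   /* assumed size of one sub-region */

/* first_reg: the 8-bit first register; code bit i == 0 means sub-region i is
 * not occupied by image data, 1 means it is occupied.
 * data_buf: base address of the image data cache region. */
static int write_to_first_free_subregion(volatile uint8_t *first_reg,
                                         uint8_t *data_buf,
                                         const uint8_t *image, uint32_t len)
{
    if (len > SUBREGION_SIZE)
        return -1;
    for (int bit = 0; bit < NUM_SUBREGIONS; bit++) {
        if ((*first_reg & (1u << bit)) == 0) {   /* first code bit: sub-region free */
            memcpy(data_buf + (size_t)bit * SUBREGION_SIZE, image, len);
            /* per the embodiment (fig. 4(a)), the accelerator card then sets
             * this code bit from 0 to 1 to mark the sub-region as occupied */
            return bit;                           /* index of the first sub-region used */
        }
    }
    return -1;                                    /* every sub-region is occupied */
}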
The image processing result cache region may include a plurality of sub-regions, and each sub-region is characterized by one code bit of the second register;
at this time, the step of the designated processor reading the image processing result from the image processing result buffer may include:
the designated processor determines a second sub-region of the image processing result cache region of the image processing accelerator card, wherein the second sub-region is: a sub-region characterized by a second code bit of the second register, the second code bit being located at the same position as the first code bit in the first register;
and the designated processor reads the image processing result from the second sub-region of the image processing result cache region.
In order to enable the designated processor to accurately acquire the image processing result of the target image data from the image processing result cache region of the image processing accelerator card, the image processing accelerator card may cache the image processing result of the target image data into the second sub-region of the image processing result cache region, where the second sub-region is a sub-region characterized by a second code bit of the second register, the second code bit being located in the second register at the same position as the first code bit in the first register. The designated processor can then read the image processing result from the second sub-region of the image processing result cache region.
For example, as shown in fig. 4(a), the first register is an 8-bit register, the image data buffer includes 8 sub-regions, and each code bit of the first register corresponds to one sub-region of the image data buffer in left-to-right order. Assuming that the designated processor finds that the code bit value of the fourth code bit of the first register is 0, that is, the fourth sub-region corresponding to the fourth code bit is not occupied by image data, the designated processor may write the target image data into the fourth sub-region, and at this point the image processing accelerator card may set the code bit value of the fourth code bit of the first register from 0 to 1.
As shown in fig. 4(b), the second register is an 8-bit register, the image processing result buffer includes 8 sub-regions, and each code bit of the second register corresponds to one sub-region of the image processing result buffer in left-to-right order. Since the designated processor wrote the target image data into the fourth sub-region of the image data buffer, in order to associate the target image data with its image processing result, the image processing accelerator card may write the image processing result of the target image data into the fourth sub-region of the image processing result buffer, so that the designated processor may acquire the image processing result of the target image data from the fourth sub-region of the image processing result buffer.
In addition, in an embodiment, the step of determining the second sub-area of the image processing result buffer area of the image accelerator card may include:
the designated processor calls an interrupt monitoring processing function to monitor whether the image processing accelerator card sends an interrupt signal;
and after monitoring that the image processing accelerator card sends an interrupt signal, the designated processor determines the second sub-region of the image processing result cache region of the image processing accelerator card.
In this embodiment, the image processing accelerator card may send an interrupt signal to the designated processor after performing image processing on the target image data to obtain the image processing result. The designated processor may therefore call an interrupt monitoring processing function to monitor whether the image processing accelerator card sends an interrupt signal; if it is monitored that the image processing accelerator card has sent an interrupt signal, this indicates that the image processing result of the target image data has been stored in the image processing result cache region of the image processing accelerator card. At this point, the designated processor may determine the second sub-region of the image processing result cache region of the image processing accelerator card, so as to read the image processing result of the target image data from the second sub-region.
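A minimal sketch of such an interrupt monitoring processing function, assuming the driver reports the card's interrupt as a readable event on a device file descriptor; the poll()-based delivery and the acknowledgement read are assumptions of this sketch.

#include <poll.h>
#include <unistd.h>
#include <stdint.h>

/* Block until the accelerator card's "result ready" interrupt is reported,
 * then return 0; the caller subsequently determines the second sub-region
 * (the result-buffer sub-region at the same position as the first code bit)
 * and reads the image processing result from it. */
static int wait_for_result_interrupt(int accel_fd)
{
    struct pollfd pfd = { .fd = accel_fd, .events = POLLIN };
    if (poll(&pfd, 1, -1) <= 0)                /* wait indefinitely for the interrupt */
        return -1;
    uint32_t ack = 0;
    (void)read(accel_fd, &ack, sizeof(ack));   /* drain/acknowledge the event (assumed) */
    return 0;
}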
It can be understood that an existing neural network model is usually a floating-point model; for example, the convolutional neural network model Caffemodel trained with the convolutional neural network framework Caffe is a floating-point model. In the image processing accelerator card, the floating-point model needs to be converted into a fixed-point model, so as to improve the running efficiency of the image processing algorithm and thus the efficiency of image processing.
In one embodiment, the designated processor of the image processing apparatus may be further configured to perform:
and importing the configuration file of the image processing algorithm into an image processing accelerator card so that the image processing accelerator card calls the model parameters corresponding to the image processing algorithm in the configuration file, and carrying out image processing on the target image data by using the image processing algorithm to obtain an image processing result.
In this embodiment, the designated processor may convert the floating-point model to fixed point, generate a configuration file containing the model parameters corresponding to the image processing algorithm, and import the configuration file into the image processing accelerator card, so that the image processing accelerator card can call the model parameters corresponding to the image processing algorithm when performing image processing with that algorithm, and perform image processing on the target image data with the image processing algorithm to obtain the image processing result of the target image data.
Specifically, a floating-point model generally comprises a plurality of convolution layers, and converting it to fixed point can be achieved by determining the fixed-point parameters corresponding to each convolution layer and converting the weights and activation biases of the floating-point model to fixed point. For example, 3000 images may be run through the floating-point model, each image yielding one piece of test information, and the fixed-point parameters corresponding to each convolution layer may be determined from these 3000 runs. Those skilled in the art can understand the specific implementation process of converting a floating-point model to a fixed-point model, which is not described again in the embodiments of the present invention.
The file generated after the floating-point model is converted to fixed point may be referred to as the configuration file, and the designated processor may import the configuration file into a specified memory location for image processing acceleration.
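A minimal sketch of the per-layer fixed-point conversion, under the common assumption of symmetric int8 quantization with power-of-two scales calibrated from the maximum absolute value observed over the test images; the embodiment does not prescribe this particular scheme.

#include <math.h>
#include <stdint.h>
#include <stddef.h>

/* Step 2 of the calibration: pick, for one convolution layer, the largest
 * power-of-two scale 2^fl such that max_abs * 2^fl stays within int8 range. */
static int choose_fraction_bits(float max_abs)
{
    int fl = 0;
    while (max_abs * ldexpf(1.0f, fl + 1) <= 127.0f)
        fl++;
    return fl;
}

/* Step 3: quantize the layer's weights (or biases) with the derived scale;
 * the resulting int8 values and fraction_bits are what the configuration
 * file would carry for this layer. */
static void quantize_layer(const float *weights, size_t n,
                           int fraction_bits, int8_t *out)
{
    float scale = ldexpf(1.0f, fraction_bits);   /* 2^fraction_bits */
    for (size_t i = 0; i < n; i++) {
        float q = roundf(weights[i] * scale);
        if (q > 127.0f)  q = 127.0f;             /* clamp to the int8 range */
        if (q < -128.0f) q = -128.0f;
        out[i] = (int8_t)q;
    }
}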
For the sake of completeness and clear description of the scheme, the following describes in detail the specific implementation process of the image processing accelerator card acquiring target image data to be processed from the image data cache region and writing the image processing result into the image processing result cache region.
In one embodiment, the image data buffer comprises a plurality of sub-regions, and the data occupation state of each sub-region is characterized by the code bit value of one code bit of the first register;
at this time, the step of the image processing accelerator card obtaining the target image data to be processed from the image data buffer area may include:
determining a third sub-region in the image data cache region, wherein the third sub-region is: a sub-region characterized by a third code bit of the first register, the third code bit being a code bit whose code bit value indicates that the corresponding sub-region is occupied by image data;
and acquiring target image data to be processed from a third subarea in an image data cache region of the image processing accelerator card.
In this embodiment, the image processing accelerator card includes: a first register; the image data buffer area comprises a plurality of sub-areas, and the number of code bits of the first register is the same as that of the sub-areas of the image data buffer area. Each code bit of the first register corresponds to a sub-region of the image data cache region, and the data occupation state of the sub-region can be characterized by the code bit value of the code bit. For example, a code bit value of the code bit is 0, which may be used to characterize that the sub-region is not occupied by image data; the code bit value of the code bit is 1, which can be used to characterize that the sub-region is occupied by the image data.
When the image processing accelerator card acquires the target image data to be processed from the image data cache region, it may determine a third sub-region according to the code bits of the first register, where the third sub-region is a sub-region characterized by a third code bit of the first register, and the third code bit is a code bit whose code bit value indicates that the corresponding sub-region is occupied by image data; the image processing accelerator card can then acquire the target image data to be processed from the third sub-region.
The image processing result cache region comprises a plurality of sub-regions, and each sub-region is represented by one code bit of the second register;
at this time, the step of writing the image processing result into the image processing result buffer by the image processing accelerator card may include:
determining a fourth sub-area in the image processing result cache area, wherein the fourth sub-area is as follows: the sub-region is characterized by a fourth code bit of the second register, and the position of the fourth code bit in the second register is the same as the position of the third code bit in the first register;
and writing an image processing result of the target image data into the fourth sub-area.
In order to enable the designated processor to accurately acquire the image processing result of the target image data from the image processing result cache region of the image processing accelerator card, the image processing accelerator card may cache the image processing result of the target image data into the fourth sub-region of the image processing result cache region, where the fourth sub-region is a sub-region characterized by a fourth code bit of the second register, the fourth code bit being located in the second register at the same position as the third code bit in the first register, so that the designated processor can read the image processing result of the target image data from the fourth sub-region of the image processing result cache region.
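A minimal sketch of the accelerator-card side of this bookkeeping, mirroring the host-side sketch above; run_image_processing is a hypothetical stand-in for the card's image processing algorithm, and clearing the first-register code bit after consumption is an assumption about housekeeping that the embodiment leaves open.

#include <stdint.h>
#include <stddef.h>

#define NUM_SUBREGIONS 8

/* Hypothetical stand-in for the PL/PS image processing pipeline. */
extern void run_image_processing(const uint8_t *in, uint8_t *out);

/* One scheduling pass on the card: pick a sub-region whose first-register
 * code bit is 1 (the third code bit: occupied by image data), process it,
 * write the result into the result-cache sub-region at the same index
 * (the fourth sub-region), and flag it through the matching bit of the
 * second register. */
static void process_one_subregion(volatile uint8_t *first_reg,
                                  volatile uint8_t *second_reg,
                                  uint8_t *data_buf, uint8_t *result_buf,
                                  size_t data_stride, size_t result_stride)
{
    for (int bit = 0; bit < NUM_SUBREGIONS; bit++) {
        if (*first_reg & (1u << bit)) {
            run_image_processing(data_buf + (size_t)bit * data_stride,
                                 result_buf + (size_t)bit * result_stride);
            *second_reg |= (uint8_t)(1u << bit);      /* result ready at the same position */
            *first_reg  &= (uint8_t)~(1u << bit);     /* assumed: release the data sub-region */
            /* the card then sends an interrupt signal to the designated processor */
            return;
        }
    }
}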
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the embodiments of the apparatus, the electronic device, the image processing accelerator card and the storage medium, since they are substantially similar to the embodiments of the method, the description is simple, and the relevant points can be referred to the partial description of the embodiments of the method.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. An image processing apparatus, wherein the image processing apparatus includes a CPU and a GPU, and an image processing accelerator card is disposed in the image processing apparatus, the CPU and the GPU of the image processing apparatus are both communicatively connected to the image processing accelerator card, the image processing apparatus determines a designated processor among the CPU and the GPU of the image processing apparatus according to load amounts of the CPU and the GPU of the image processing apparatus, and the designated processor is configured to perform:
obtaining target image data to be processed;
writing the target image data into an image data cache region of the image processing accelerator card so that the image processing accelerator card obtains the target image data from the image data cache region, and performing image processing on the target image data to obtain an image processing result; writing the image processing result into an image processing result cache region of the image processing accelerator card;
and reading the image processing result from the image processing result buffer area.
2. The image processing apparatus according to claim 1, wherein said determining a specific processor among the CPU and the GPU of the image processing apparatus according to load amounts of the CPU and the GPU of the image processing apparatus comprises:
the CPU of the image processing apparatus determines whether its current load is less than a preset load;
if the current load of the CPU of the image processing apparatus is less than the preset load, the CPU of the image processing apparatus is determined to be the designated processor;
and if the current load of the CPU of the image processing apparatus is not less than the preset load, the GPU of the image processing apparatus is determined to be the designated processor.
3. The image processing apparatus according to claim 1, wherein the image data buffer of the image processing accelerator card comprises a plurality of sub-regions, and the image data occupation state of each sub-region is characterized by a code bit value of one code bit of the first register.
4. The image processing apparatus according to claim 3, wherein said designated processor writing the target image data into an image data buffer of the image processing accelerator card comprises:
the designated processor searches for a first code bit among the code bits of the first register, wherein the code bit value of the first code bit indicates that a sub-region corresponding to the first code bit is not occupied by image data;
and the designated processor writes the target image data into a first sub-region corresponding to the first code bit.
5. The image processing apparatus according to claim 1, wherein said designated processor writing said target image data into an image data buffer of said image processing accelerator card comprises:
the designated processor writes the target image data into an image data cache region of the image processing accelerator card by direct memory access (DMA).
6. The image processing apparatus according to any one of claims 1 to 5, wherein said designated processor is further configured to execute:
and importing the configuration file of the image processing algorithm into an image processing accelerator card so that the image processing accelerator card calls the model parameters corresponding to the image processing algorithm in the configuration file, and carrying out image processing on the target image data by using the image processing algorithm to obtain an image processing result.
7. The image processing apparatus according to claim 3, wherein the image processing result buffer includes a plurality of sub-regions, each sub-region being characterized by one code bit of the second register.
8. The image processing apparatus according to claim 7, wherein said designated processor reading the image processing result from said image processing result buffer comprises:
the designated processor determines a second sub-region of an image processing result cache region of the image processing accelerator card, wherein the second sub-region is: a sub-region characterized by a second code bit of the second register at the same position as the first code bit in the first register;
and the designated processor reads the image processing result from the second sub-region of the image processing result cache region.
9. The image processing apparatus of claim 8, wherein the designated processor determining a second sub-region of an image processing result cache region of the image processing accelerator card comprises:
the designated processor calls an interrupt monitoring processing function to monitor whether the image processing accelerator card sends an interrupt signal;
and after monitoring that the image processing accelerator card sends an interrupt signal, the designated processor determines the second sub-region of the image processing result cache region of the image processing accelerator card.
10. The image processing apparatus according to claim 1, wherein the image processing accelerator card is constituted by an integrated processing system PS and programmable logic PL.
CN201910312498.8A 2019-04-18 Image processing apparatus Active CN111833232B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910312498.8A CN111833232B (en) 2019-04-18 Image processing apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910312498.8A CN111833232B (en) 2019-04-18 Image processing apparatus

Publications (2)

Publication Number Publication Date
CN111833232A (en) 2020-10-27
CN111833232B (en) 2024-07-02


Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009125372A1 (en) * 2008-04-09 2009-10-15 Nxp B.V. Image stream processing circuit and a method of performing image processing operations
JP2011259511A (en) * 2011-09-29 2011-12-22 Seiko Epson Corp Image processing apparatus and image processing method
US20120147016A1 (en) * 2009-08-26 2012-06-14 The University Of Tokyo Image processing device and image processing method
CN102655440A (en) * 2011-03-03 2012-09-05 中兴通讯股份有限公司 Method and device for scheduling multiple sets of Turbo decoders
CN102693526A (en) * 2011-03-23 2012-09-26 中国科学院上海技术物理研究所 Infrared image processing method based on reconfigurable computing
CN102906726A (en) * 2011-12-09 2013-01-30 华为技术有限公司 Co-processing accelerating method, device and system
US20160189329A1 (en) * 2014-12-31 2016-06-30 Texas Instruments Incorporated Data processing system and method thereof
CN106776372A (en) * 2017-02-15 2017-05-31 北京中航通用科技有限公司 Emulation data access method and device based on FPGA
CN107483952A (en) * 2017-08-29 2017-12-15 郑州云海信息技术有限公司 A kind of method, apparatus and system of jpeg image decompression
CN109618165A (en) * 2019-01-07 2019-04-12 郑州云海信息技术有限公司 A kind of picture decoding method, system and host and image processing system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598565A (en) * 2020-12-09 2021-04-02 第四范式(北京)技术有限公司 Service operation method and device based on accelerator card, electronic equipment and storage medium
WO2022121866A1 (en) * 2020-12-09 2022-06-16 第四范式(北京)技术有限公司 Acceleration card-based service running method, apparatus, electronic device, and computer-readable storage medium
CN112598565B (en) * 2020-12-09 2024-05-28 第四范式(北京)技术有限公司 Service operation method and device based on accelerator card, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
KR101457313B1 (en) Method, apparatus and computer program product for providing object tracking using template switching and feature adaptation
CN108875535B (en) Image detection method, device and system and storage medium
CN110796624B (en) Image generation method and device and electronic equipment
CN108875519B (en) Object detection method, device and system and storage medium
CN110335313B (en) Audio acquisition equipment positioning method and device and speaker identification method and system
CN110263680B (en) Image processing method, device and system and storage medium
CN114996103A (en) Page abnormity detection method and device, electronic equipment and storage medium
CN111080544B (en) Face distortion correction method and device based on image and electronic equipment
CN109727187B (en) Method and device for adjusting storage position of multiple region of interest data
CN113158773B (en) Training method and training device for living body detection model
CN113129298A (en) Definition recognition method of text image
CN108764206B (en) Target image identification method and system and computer equipment
CN111833232B (en) Image processing apparatus
CN111833232A (en) Image processing device
CN108629219B (en) Method and device for identifying one-dimensional code
CN110969587A (en) Image acquisition method and device and electronic equipment
CN113205079B (en) Face detection method and device, electronic equipment and storage medium
CN115272682A (en) Target object detection method, target detection model training method and electronic equipment
CN114550062A (en) Method and device for determining moving object in image, electronic equipment and storage medium
CN108958929B (en) Method and device for applying algorithm library, storage medium and electronic equipment
CN113485855A (en) Memory sharing method and device, electronic equipment and readable storage medium
CN113628192A (en) Image blur detection method, device, apparatus, storage medium, and program product
CN112101284A (en) Image recognition method, training method, device and system of image recognition model
CN112070144A (en) Image clustering method and device, electronic equipment and storage medium
CN111626075A (en) Target identification method and device

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant