WO2022213718A1 - Sample image increment, image detection model training, and image detection method - Google Patents

Sample image increment, image detection model training, and image detection method

Info

Publication number
WO2022213718A1
WO2022213718A1 PCT/CN2022/075152 CN2022075152W
Authority
WO
WIPO (PCT)
Prior art keywords
image
candidate region
loss value
probability
target
Prior art date
Application number
PCT/CN2022/075152
Other languages
English (en)
French (fr)
Inventor
王云浩
张滨
辛颖
冯原
韩树民
Original Assignee
北京百度网讯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京百度网讯科技有限公司 (Beijing Baidu Netcom Science and Technology Co., Ltd.)
Priority to JP2022552961A priority Critical patent/JP2023531350A/ja
Priority to US17/939,364 priority patent/US20230008696A1/en
Publication of WO2022213718A1 publication Critical patent/WO2022213718A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/73Deblurring; Sharpening
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the present disclosure relates to the field of artificial intelligence, in particular to computer vision and deep learning technologies, which can be applied to intelligent cloud and industrial quality inspection scenarios, and in particular to a sample image increment method, an image detection model training method, an image detection method, and corresponding apparatuses, electronic devices, computer-readable storage media, and computer program products.
  • the sample increment of small samples is usually realized by transforming the sample images, such as by rotation, or based on generative adversarial networks or transfer learning.
  • the embodiments of the present disclosure provide a sample image increment method, an image detection model training method, an image detection method, and a corresponding apparatus, electronic device, computer-readable storage medium, and computer program product.
  • in a first aspect, an embodiment of the present disclosure proposes a sample image increment method, including: acquiring a first convolution feature of an original sample image; determining, according to a region generation network and the first convolution feature, a candidate region and a first probability that the candidate region contains a target object; determining a target candidate region among the candidate regions based on the first probability, and mapping the target candidate region back to the original sample image to obtain an intermediate image; and performing image enhancement processing on the part of the intermediate image corresponding to the target candidate region and/or performing image blur processing on the part of the intermediate image corresponding to the non-target candidate region, to obtain an incremental sample image.
  • in a second aspect, an embodiment of the present disclosure provides an apparatus for incrementing a sample image, including: a first convolution feature acquisition unit configured to acquire a first convolution feature of an original sample image; a candidate region and probability determination unit configured to determine, according to a region generation network and the first convolution feature, a candidate region and a first probability that the candidate region contains a target object; a target candidate region determination and mapping unit configured to determine a target candidate region among the candidate regions based on the first probability, and to map the target candidate region back to the original sample image to obtain an intermediate image; and an intermediate image processing unit configured to perform image enhancement processing on the part of the intermediate image corresponding to the target candidate region and/or image blur processing on the part of the intermediate image corresponding to the non-target candidate region, to obtain an incremental sample image.
  • in a third aspect, an embodiment of the present disclosure provides a method for training an image detection model, including: acquiring a second convolution feature of an incremental sample image, where the incremental sample image is obtained by any one of the implementations of the first aspect; determining, according to the region generation network and the second convolution feature, a new candidate region and a second probability that the new candidate region contains the target object; obtaining a first loss value corresponding to the first probability and a second loss value corresponding to the second probability; determining a comprehensive loss value based on the weighted first loss value and second loss value; and obtaining a trained image detection model based on the comprehensive loss value meeting a preset requirement.
  • in a fourth aspect, an embodiment of the present disclosure provides an apparatus for training an image detection model, including: a second convolution feature acquisition unit configured to acquire a second convolution feature of an incremental sample image, where the incremental sample image is obtained by any one of the implementations of the second aspect; a new candidate region and probability determination unit configured to determine, according to the region generation network and the second convolution feature, a new candidate region and a second probability that the new candidate region contains the target object; a loss value obtaining unit configured to obtain a first loss value corresponding to the first probability and a second loss value corresponding to the second probability; a comprehensive loss value determination unit configured to determine a comprehensive loss value based on the weighted first loss value and second loss value; and an image detection model training unit configured to obtain a trained image detection model based on the comprehensive loss value meeting a preset requirement.
  • in a fifth aspect, an embodiment of the present disclosure provides an image detection method, including: receiving an image to be detected; and invoking an image detection model to detect the image to be detected, where the image detection model is obtained by any one of the implementations of the third aspect.
  • in a sixth aspect, an embodiment of the present disclosure provides an image detection apparatus, including: a to-be-detected image receiving unit configured to receive an image to be detected; and an image detection unit configured to invoke an image detection model to detect the image to be detected, where the image detection model is obtained by any one of the implementations of the fourth aspect.
  • in a seventh aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the sample image increment method described in any implementation of the first aspect and/or the image detection model training method described in any implementation of the third aspect and/or the image detection method described in any implementation of the fifth aspect.
  • in an eighth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer instructions, where the computer instructions, when executed, enable a computer to implement the sample image increment method described in any implementation of the first aspect.
  • in a ninth aspect, embodiments of the present disclosure provide a computer program product including a computer program which, when executed by a processor, implements the sample image increment method described in any of the implementations of the first aspect.
  • in the sample image increment method, image detection model training method, image detection method, and corresponding apparatus, electronic device, computer-readable storage medium, and computer program product provided by the embodiments of the present disclosure: first, the first convolution feature of the original sample image is acquired; then, a candidate region and the first probability that the candidate region contains the target object are determined according to the region generation network and the first convolution feature; next, the target candidate region is determined among the candidate regions based on the first probability, and the target candidate region is mapped back to the original sample image to obtain an intermediate image; finally, image enhancement processing is performed on the part of the intermediate image corresponding to the target candidate region and/or image blur processing is performed on the part corresponding to the non-target candidate region, to obtain the incremental sample image.
  • the technical solution provided by the present disclosure uses the region generation network to determine candidate regions that may contain the target object, takes the candidate regions with a higher containment probability as target candidate regions, and maps the target candidate regions back to the original image.
  • the part of the original image corresponding to the target candidate region and/or the part corresponding to the non-target candidate region is then processed by corresponding sharpening or blurring, so as to obtain an incremental sample image that highlights the target object as much as possible.
  • in this way, a highly usable incremental sample image can be generated without destroying the key part of the original sample image.
  • FIG. 1 is an exemplary system architecture in which the present disclosure may be applied.
  • FIG. 2 is a flowchart of a sample image increment method provided by an embodiment of the present disclosure.
  • FIG. 3 is a flowchart of another sample image increment method provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic flowchart of a sample image increment method in an application scenario provided by an embodiment of the present disclosure.
  • FIG. 6 is a structural block diagram of an apparatus for incrementing a sample image provided by an embodiment of the present disclosure.
  • FIG. 7 is a structural block diagram of an apparatus for training an image detection model according to an embodiment of the present disclosure.
  • FIG. 8 is a structural block diagram of an image detection apparatus according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an electronic device suitable for performing a sample image increment method and/or an image detection model training method and/or an image detection method according to an embodiment of the present disclosure.
  • the acquisition, storage, and application of any user personal information involved comply with relevant laws and regulations, take necessary confidentiality measures, and do not violate public order and good morals.
  • FIG. 1 shows an exemplary system architecture 100 to which embodiments of the sample image increment method, image detection model training method, image detection method, and corresponding apparatuses, electronic devices, and computer-readable storage media of the present disclosure may be applied.
  • the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 .
  • the network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 .
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • various applications for implementing information communication between the terminal devices 101, 102, 103 and the server 105 may be installed on them, such as image transmission applications, sample image increment applications, and target detection model training applications.
  • the terminal devices 101, 102, 103 and the server 105 may be hardware or software.
  • when the terminal devices 101, 102, 103 are hardware, they can be various electronic devices with display screens, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like; when the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above and implemented as multiple software modules or as a single software module, which is not specifically limited here.
  • when the server 105 is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server; when the server 105 is software, it can be implemented as multiple software modules or as a single software module, which is not specifically limited here.
  • the server 105 can provide various services through various built-in applications. Taking an image increment application that provides a sample image increment service as an example, the server 105 can achieve the following effects when running the image increment application:
  • first, the original sample images are received from the terminal devices 101, 102, 103 via the network 104, and their first convolution features are extracted through a conventional feature extraction network;
  • then, candidate regions are determined according to the region generation network and the first convolution feature, together with the first probability that each candidate region contains the target object; next, the target candidate region is determined among the candidate regions based on the first probability, and the target candidate region is mapped back to the original sample image to obtain an intermediate image; finally, image enhancement processing is performed on the part of the intermediate image corresponding to the target candidate region and/or image blur processing is performed on the part corresponding to the non-target candidate region, to obtain an incremental sample image.
  • the server 105 can also use the generated incremental sample images to train a corresponding image detection model. For example, when the server 105 runs a model training application, the following effects can be achieved: acquiring the second convolution feature of the incremental sample images; determining, according to the region generation network and the second convolution feature, the new candidate region and the second probability that the new candidate region contains the target object; obtaining the first loss value corresponding to the first probability and the second loss value corresponding to the second probability; determining the comprehensive loss value based on the weighted first loss value and second loss value; and obtaining the trained image detection model based on the comprehensive loss value satisfying the preset requirement.
  • the server 105 can also provide an external image detection service based on the image detection model, that is, detect an image to be detected by calling the image detection model and return the detection result.
  • when the server 105 detects that the required data is already stored locally (for example, a pending sample image increment task received before processing starts), it may obtain such data directly from local storage; in this case, the exemplary system architecture 100 may not include the terminal devices 101, 102, 103 and the network 104.
  • the first convolution feature of the original sample image can also be extracted in advance through a feature extraction network, so that the extracted feature can be obtained directly for use later.
  • since the sample image increment methods provided by subsequent embodiments of the present disclosure are generally performed by the server 105, which has stronger computing power and more computing resources, the sample image increment apparatus is generally also provided in the server 105.
  • however, the terminal devices 101, 102, 103 can also use the image increment applications installed on them to perform the sample image increment method; in this case, the terminal device executes the computation, which can appropriately reduce the computing pressure on the server 105.
  • correspondingly, the sample image increment apparatus may also be provided in the terminal devices 101, 102, 103; in this case, the exemplary system architecture 100 may also not include the server 105 and the network 104.
  • the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative; there may be any number of terminal devices, networks, and servers according to implementation needs.
  • FIG. 2 is a flowchart of a sample image increment method provided by an embodiment of the present disclosure, wherein the process 200 includes the following steps:
  • Step 201: Obtain the first convolution feature of the original sample image.
  • the purpose of this step is for the execution body of the sample image increment method (for example, the server 105 shown in FIG. 1) to obtain the first convolution feature of the original sample image.
  • the first convolution feature can be extracted from the original sample image through a feature extraction network, and the specific type of the feature extraction network is not limited.
  • the original sample image is an image containing a target object.
  • the target object can be various objects in a small sample scene, such as cracks in metal materials under a microscope, microorganisms in a certain state of motion, and so on.
  • Step 202: Determine a candidate region, and a first probability that the candidate region contains the target object, according to the region generation network and the first convolution feature.
  • the purpose of this step is for the above execution body to input the first convolution feature into the region generation network, so as to use the region generation network to determine candidate regions suspected of containing the target object, together with the first probability that each candidate region contains the target object.
  • the first probability describes the likelihood that the candidate region to which it belongs actually contains the target object, and it can even be quantified as a probability score.
  • the candidate region is a region, determined by the region generation network based on the convolution feature (map), that may contain the target object; that is to say, the region generation network should have the ability to recognize the convolution features of the target object.
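As an illustrative sketch only (the disclosure does not specify the concrete network), the first probability produced by a region generation network can be modeled as a 1×1 objectness head applied at every location of the convolution feature map; the weights `w` and bias `b` below are random placeholders:

```python
import numpy as np

def rpn_objectness(feature_map, w, b):
    """Score every spatial location of a convolution feature map with a
    1x1 'objectness' head; the sigmoid output plays the role of the
    first probability that the location's candidate region contains
    the target object."""
    logits = feature_map @ w + b            # (H, W, C) @ (C,) -> (H, W)
    return 1.0 / (1.0 + np.exp(-logits))    # sigmoid

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 16))          # toy first convolution feature
probs = rpn_objectness(feat, rng.normal(size=16), 0.0)
print(probs.shape)                          # one first probability per location
```

In a real detector the head would be learned jointly with box regression; here it only illustrates how each candidate region receives a probability score.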
  • Step 203: Determine the target candidate region among the candidate regions based on the first probability, and map the target candidate region back to the original sample image to obtain an intermediate image.
  • on the basis of step 202, the purpose of this step is for the above execution body to determine the candidate regions with a higher probability of containing the target object as target candidate regions according to the first probability given to each candidate region, and to further map the target candidate regions back to the original sample image, thereby obtaining an intermediate image that frames the suspected target object.
  • since the candidate region is determined based on the convolution feature (map) extracted from the original sample image, the candidate region is a region on the convolution feature map, not directly a region on the original sample image.
  • the corresponding relationship between the convolution feature and the original sample image can be used to map the target candidate region back to the original sample image, so as to frame the existence boundary of the target object on the original sample image.
  • whether the existence boundary of the target object is accurately framed depends on the accuracy of the region generation network for extracting candidate regions and determining the first probability.
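Under the common assumption of a constant downsampling stride between the original image and the convolution feature map (an illustrative assumption; the disclosure relies only on the general correspondence between the two), mapping a target candidate region back to image coordinates can be sketched as:

```python
def map_region_to_image(region, stride):
    """Map a candidate region given in feature-map cell coordinates
    (x0, y0, x1, y1; inclusive cells) back to original-image pixel
    coordinates, assuming a constant downsampling stride."""
    x0, y0, x1, y1 = region
    return (x0 * stride, y0 * stride, (x1 + 1) * stride, (y1 + 1) * stride)

# a 3x3-cell region on a stride-16 feature map covers a 48x48-pixel box
print(map_region_to_image((2, 3, 4, 5), 16))  # -> (32, 48, 80, 96)
```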
  • Step 204: Perform image enhancement processing on the part of the intermediate image corresponding to the target candidate region and/or perform image blur processing on the part of the intermediate image corresponding to the non-target candidate region, to obtain an incremental sample image.
  • on the basis of step 203, the purpose of this step is for the above execution body to apply different image processing means to the part of the intermediate image framed as containing the target object and/or the part without the target object, and then use the processed image as the incremental sample image.
  • this step includes three different implementations:
  • first, only perform image enhancement processing on the part of the intermediate image corresponding to the target candidate region, and use the image-enhanced intermediate image as the incremental sample image;
  • second, only perform image blur processing on the part of the intermediate image corresponding to the non-target candidate region, and use the image-blurred intermediate image as the incremental sample image;
  • third, perform image enhancement processing on the part of the intermediate image corresponding to the target candidate region and image blur processing on the part corresponding to the non-target candidate region, and use the intermediate image after both operations as the incremental sample image.
  • image enhancement processing is an image processing method that improves image clarity, while image blur processing is one that reduces image clarity; the clearer a region is, the easier it is to accurately identify whether it contains the target object.
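The three implementations can be sketched with simple numpy stand-ins; the box blur and unsharp masking used here are illustrative substitutes for whatever enhancement and blurring operators a real implementation would choose:

```python
import numpy as np

def box_blur(img):
    """3x3 mean filter built from shifted copies (edges wrap; acceptable
    for a sketch)."""
    return sum(np.roll(np.roll(img, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

def make_incremental(image, target_mask, enhance=True, blur=True):
    """The three variants: enhance only, blur only, or both --
    sharpen inside the target candidate region (unsharp masking)
    and/or blur outside it."""
    img = image.astype(float)
    out = img.copy()
    detail = img - box_blur(img)                  # high-frequency component
    if enhance:
        out[target_mask] = img[target_mask] + detail[target_mask]
    if blur:
        out[~target_mask] = box_blur(img)[~target_mask]
    return np.clip(out, 0.0, 255.0)

img = np.zeros((6, 6)); img[2:4, 2:4] = 255.0     # toy intermediate image
mask = np.zeros((6, 6), dtype=bool); mask[1:5, 1:5] = True
inc = make_incremental(img, mask)                 # variant three: enhance + blur
```

Passing `enhance=False` or `blur=False` yields the other two variants described above.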
  • in summary, the embodiment of the present disclosure provides a sample image increment method.
  • the method uses a region generation network to determine candidate regions that may contain the target object, takes the candidate regions with a higher containment probability as target candidate regions, maps them back to the original image, and applies corresponding sharpening or blurring to the parts of the original image corresponding to the target candidate regions and/or the non-target candidate regions, so as to obtain an incremental sample image that highlights the target object as much as possible.
  • in this way, a highly usable incremental sample image can be generated without destroying the key part of the original sample image.
  • FIG. 3 is a flowchart of another sample image increment method provided by an embodiment of the present disclosure, wherein the process 300 includes the following steps:
  • Step 301: Obtain the first convolution feature of the original sample image.
  • Step 302: Determine a candidate region, and a first probability that the candidate region contains the target object, according to the region generation network and the first convolution feature.
  • Step 303: Determine the candidate regions whose first probability is greater than a preset probability as target candidate regions, and map the target candidate regions back to the original sample image to obtain an intermediate image.
  • through this step, this embodiment provides a specific implementation for selecting the target candidate region: a preset probability considered able to separate high from low probabilities (for example, 70%) is set in advance, so that the first probability of each candidate region only needs to be compared with the preset probability to select the target candidate regions in which the target object is highly likely to exist.
  • besides selecting target candidate regions based on the preset probability as in step 303, it is also possible to rank the first probabilities in descending order and take the top N candidate regions (the N with the larger probability values) as target candidate regions, or to select based on a top percentage, among other methods. Whichever method is used, the purpose is to determine the candidate regions most likely to contain the target object as the target candidate regions, so that after the target candidate regions are mapped back to the original sample image, the target object can be framed on the original sample image as accurately as possible.
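Both selection strategies (preset-probability threshold and top-N) can be sketched as follows; the 0.7 default threshold is the example value from the text:

```python
def select_targets(regions, first_probs, threshold=0.7, top_n=None):
    """Pick target candidate regions either by a preset probability
    threshold, or (if top_n is given) by taking the N candidates with
    the largest first probabilities."""
    order = sorted(range(len(first_probs)),
                   key=lambda i: first_probs[i], reverse=True)
    if top_n is not None:
        keep = order[:top_n]
    else:
        keep = [i for i in order if first_probs[i] > threshold]
    return [regions[i] for i in keep]

regions = [(0, 0, 4, 4), (5, 5, 9, 9), (2, 2, 6, 6)]
print(select_targets(regions, [0.9, 0.5, 0.8]))           # two exceed 0.7
print(select_targets(regions, [0.9, 0.5, 0.8], top_n=1))  # highest only
```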
  • Step 304: Perform Gaussian blur processing on the part of the intermediate image corresponding to the non-target candidate region.
  • on the basis of step 303, the purpose of this step is for the above execution body to perform Gaussian blur processing on the part of the intermediate image corresponding to the non-target candidate region.
  • Gaussian blur, also known as Gaussian smoothing, is a widely used image blurring technique.
  • the image generated by this blurring technique has the visual effect of viewing the image through frosted glass, which is clearly different from the out-of-focus bokeh produced by a lens and from the appearance of an object in shadow under ordinary illumination.
  • Gaussian smoothing is also used in the preprocessing stage of computer vision algorithms to enhance images at different scales. From a mathematical point of view, Gaussian blurring of an image is the convolution of the image with a normal distribution; since the normal distribution is also known as the Gaussian distribution, the technique is called Gaussian blur. By contrast, convolving the image with a circular box blur yields a more faithful bokeh image. Since the Fourier transform of a Gaussian function is another Gaussian function, Gaussian blur acts as a low-pass filter on the image.
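A minimal separable implementation of the Gaussian blur described above (the 2-D Gaussian kernel factorises into two 1-D passes); the sigma and kernel radius are free parameters chosen here for illustration:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Discrete, normalised 1-D Gaussian (normal-distribution) kernel."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x * x / (2.0 * sigma * sigma))
    return k / k.sum()

def gaussian_blur(img, sigma=1.0):
    """Convolve rows then columns with the same 1-D kernel; equivalent to
    convolving with the full 2-D Gaussian, and acting as a low-pass
    filter on the image."""
    k = gaussian_kernel1d(sigma, radius=max(1, int(3 * sigma)))
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, rows)

impulse = np.zeros((9, 9)); impulse[4, 4] = 1.0
blurred = gaussian_blur(impulse, sigma=1.0)   # impulse spreads into a Gaussian bump
```

Blurring an impulse image like this exposes the kernel itself: the response is largest at the centre and decays with distance, exactly the frosted-glass smoothing described above.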
  • Step 305: Perform first image enhancement processing on the first target area in the intermediate image.
  • Step 306: Perform second image enhancement processing on the second target area in the intermediate image.
  • steps 305 and 306 perform image enhancement processing with different enhancement intensities on the first target area and the second target area of the intermediate image, respectively, so as to distinguish the enhancement effects of different target areas.
  • The first target area is the overlapping part of at least two target candidate regions mapped in the original sample image; in contrast, the second target area is the part of a single target candidate region mapped in the original sample image. Understandably, the more target candidate regions that map to the same location in the original sample image, the more confidently a target object can be judged to exist at that location, whereas a single mapping only maintains the original confidence. This embodiment therefore applies stronger image enhancement to the areas more likely to contain the target object, and more conventional enhancement to the areas with ordinary likelihood.
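The split between the two target areas can be sketched as a coverage count over the mapped boxes. This is an illustrative sketch only: the `(x1, y1, x2, y2)` pixel-coordinate convention with exclusive right/bottom edges is an assumption, not specified by the method.

```python
import numpy as np

def split_target_areas(shape, boxes):
    """Return boolean masks for the 'first' target area (covered by
    at least two mapped target candidate regions) and the 'second'
    target area (covered by exactly one).

    `boxes` are (x1, y1, x2, y2) boxes, exclusive on the right and
    bottom edges (an assumed convention)."""
    h, w = shape
    cover = np.zeros((h, w), dtype=np.int32)
    for x1, y1, x2, y2 in boxes:
        cover[y1:y2, x1:x2] += 1     # count how many boxes hit each pixel
    return cover >= 2, cover == 1
```

The first mask would then receive the stronger enhancement of step 305, the second the conventional enhancement of step 306.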
  • Step 307 Use the processed image as an incremental sample image.
  • On the basis of the technical solution provided by the previous embodiment, this embodiment provides, through step 303, a specific method of determining target candidate regions based on the first probability; through step 304, the specific image blurring method of Gaussian blur for the part of the intermediate image corresponding to the non-target candidate regions; and through steps 305 to 306, image enhancement processing of different strengths for the parts of the intermediate image corresponding to the target candidate regions, according to whether multiple target candidate regions overlap, so as to highlight the target object as much as possible.
  • Any of the above embodiments provides a different sample image increment scheme. Further, in combination with the above technical solutions for generating incremental sample images, a model training method for training to obtain a target detection model is also provided; one implementation, including but not limited to the following, is shown in the flowchart of FIG. 4;
  • the process 400 includes the following steps:
  • Step 401 Obtain the second convolution feature of the incremental sample image
  • The second convolution feature is extracted from the incremental sample image; the way it is extracted may be the same as the way the first convolution feature is extracted from the original sample image, for example using the same feature extraction network.
  • Step 402 Determine a new candidate region and a second probability that the new candidate region contains the target object according to the region generation network and the second convolution feature;
  • The new candidate regions and their second probabilities are analogous to the candidate regions and their first probabilities; the difference is that the former are computed on the incremental sample image, while the latter are computed on the original sample image.
  • Step 403 Obtain a first loss value corresponding to the first probability and a second loss value corresponding to the second probability;
  • On the basis of step 402, the purpose of this step is to obtain the loss values used to guide the training of the model. Since there are both original sample images and incremental sample images, the corresponding loss values are determined from the first probability and the second probability, respectively.
  • Step 404 Determine a comprehensive loss value based on the weighted first loss value and the second loss value
  • this step aims to combine the weighted first loss value and the second loss value to determine a more reasonable comprehensive loss value.
  • the weight used for weighting the first loss value and the weight used for weighting the second loss value may be the same or different, and may be flexibly adjusted according to the actual situation.
  • An implementation manner including but not limited to: taking the sum of the weighted first loss value and the weighted second loss value as the comprehensive loss value.
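The weighted-sum implementation can be stated in one line of code. A minimal sketch follows; the default weights are illustrative hyperparameters, not values prescribed by the method.

```python
def combined_loss(loss_orig, loss_inc, w_orig=1.0, w_inc=1.0):
    """Comprehensive loss value: weighted sum of the first loss value
    (original sample images) and the second loss value (incremental
    sample images). The weights may be equal or different and are
    tuned to the task; the defaults here are placeholders."""
    return w_orig * loss_orig + w_inc * loss_inc
```

During training, `combined_loss` would be the quantity minimized in step 405.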
  • Step 405 Obtain a trained image detection model based on the comprehensive loss value satisfying the preset requirements.
  • On the basis of step 404, in this step the above-mentioned execution body obtains a trained image detection model once the comprehensive loss value satisfies the preset requirement.
  • An implementation manner that includes but is not limited to: in response to the comprehensive loss value being the minimum over a preset number of rounds of iterative training, outputting the trained image detection model. This can be understood as taking minimization of the comprehensive loss value as the training goal; the smaller the comprehensive loss value, the higher the detection accuracy of the trained model.
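The "minimum over a preset number of rounds" criterion can be sketched as a loop that tracks the best state seen so far. This is a schematic only; `step_fn` is a hypothetical callback standing in for one round of training that returns the comprehensive loss and the model state.

```python
def train(step_fn, num_rounds):
    """Run a preset number of training rounds and keep the model
    state whose comprehensive loss value is smallest.

    `step_fn(round_idx)` is an assumed callback returning a tuple
    (comprehensive_loss, model_state) for one round."""
    best_loss, best_state = float("inf"), None
    for i in range(num_rounds):
        loss, state = step_fn(i)
        if loss < best_loss:           # new minimum -> remember this model
            best_loss, best_state = loss, state
    return best_loss, best_state
```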
  • On the basis of the previous embodiments, the embodiment shown in FIG. 4 further trains the target detection model with the incremented sample images, so that the trained target detection model can subsequently be used directly to detect, accurately and efficiently, whether the target object exists in an image to be tested.
  • An image detection method can be:
  • the image to be detected is received, and then the image detection model is called to detect the image to be detected.
  • the obtained detection results can also be returned later.
  • the present disclosure also provides a specific implementation scheme in combination with a specific application scenario, please refer to the schematic flowchart shown in FIG. 5 .
  • For real-world target detection scenarios with few sample images, this embodiment provides a region-generation-enhanced target detection method that uses candidate-region generation for data augmentation and can be used together with various existing sample increment techniques, thereby improving the usability of incremental samples from different angles and finally training a target detection model with better detection performance on the incremented sample set:
  • The final classification probability is obtained as the weighted sum of classification probability a1 (from the original image branch) and classification probability b1 (from the enhanced image branch), and the corresponding regression boundaries (that is, a2 and b2) are mapped onto the original image to be detected according to a certain threshold to obtain the final detection result.
  • Because during processing the background of the image obtained after candidate-region mapping is blurred, the loss value of that image during training converges better only when the candidate regions contain all the targets to be detected in the image.
  • The above solution can also be transplanted into existing methods based on region generation networks, and can be used together with other few-shot detection techniques to jointly improve the effect and further enhance practicality.
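The weighted fusion of the two branches' classification probabilities can be sketched as follows. The equal weights and 0.5 threshold are illustrative assumptions; the scenario only requires "a certain threshold" and a weighted sum.

```python
def fuse_predictions(probs_a, probs_b, w_a=0.5, w_b=0.5, thresh=0.5):
    """Fuse per-box classification probabilities from the original
    branch (a1) and the enhanced branch (b1) by weighted sum; boxes
    whose fused probability exceeds the threshold are kept and their
    regression boundaries mapped back to the image to be detected."""
    fused = [w_a * a + w_b * b for a, b in zip(probs_a, probs_b)]
    keep = [f > thresh for f in fused]
    return fused, keep
```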
  • As implementations of the methods shown in the above figures, the present disclosure also provides device embodiments: a sample image incrementing device corresponding to the sample image increment method shown in FIG. 2, an image detection model training device corresponding to the image detection model training method shown in FIG. 4, and an image detection device corresponding to the image detection method; each device can be applied to various electronic devices.
  • the sample image incrementing device 600 in this embodiment may include: a first convolution feature acquisition unit 601 , a candidate region and probability determination unit 602 , a target candidate region determination and mapping unit 603 , and an intermediate image processing unit 604 .
  • the first convolution feature obtaining unit 601 is configured to obtain the first convolution feature of the original sample image
  • the candidate region and probability determining unit 602 is configured to determine the candidate region according to the region generating network and the first convolution feature, and the first probability that the candidate region contains the target object
  • The target candidate region determination and mapping unit 603 is configured to determine the target candidate regions among the candidate regions based on the first probability, and map the target candidate regions back to the original sample image to obtain an intermediate image;
  • the intermediate image processing unit 604 is configured to perform image enhancement processing on the part of the intermediate image corresponding to the target candidate region and/or perform image blur processing on the part of the intermediate image corresponding to the non-target candidate region to obtain an incremental sample image.
  • In this embodiment, for the specific processing of the first convolution feature acquisition unit 601, the candidate region and probability determination unit 602, the target candidate region determination and mapping unit 603, and the intermediate image processing unit 604 in the sample image incrementing device 600, and the technical effects they bring, reference may be made to the relevant descriptions of steps 201 to 204 in the embodiment corresponding to FIG. 2, which are not repeated here.
  • the intermediate image processing unit 604 may include a blurring processing subunit that performs image blurring processing on a portion of the intermediate image corresponding to the non-target candidate region, and the blurring processing subunit is further configured to:
  • Gaussian blurring is performed on the part of the intermediate image corresponding to the non-target candidate region.
  • The target candidate region determination and mapping unit 603 may include a target candidate region determination subunit configured to determine target candidate regions among the candidate regions based on the first probability, and the target candidate region determination subunit is further configured to:
  • a candidate area with a first probability greater than a preset probability is determined as a target candidate area.
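The threshold-based selection this subunit performs can be sketched directly. The 0.7 default is only an example of a "preset probability"; the method leaves the value to be configured.

```python
def select_target_regions(regions, first_probs, preset_prob=0.7):
    """Keep the candidate regions whose first probability of
    containing the target object exceeds the preset probability.
    The 0.7 default is illustrative."""
    return [r for r, p in zip(regions, first_probs) if p > preset_prob]
```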
  • the intermediate image processing unit 604 may include an enhancement processing subunit that performs image enhancement processing on a portion of the intermediate image corresponding to the target candidate region, and the enhancement processing subunit is further configured to:
  • The first image enhancement processing is performed on the first target area in the intermediate image, where the first target area is the overlapping part of at least two target candidate regions mapped in the original sample image; the second image enhancement processing is performed on the second target area in the intermediate image, where the second target area is the part of a single target candidate region mapped in the original sample image; and the image enhancement strength of the first image enhancement processing is greater than that of the second image enhancement processing.
  • the image detection model training apparatus 700 in this embodiment may include: a second convolution feature acquisition unit 701, a new candidate region and probability determination unit 702, a loss value acquisition unit 703, a comprehensive loss value determination unit 704, Image detection model training unit 705 .
  • The second convolution feature obtaining unit 701 is configured to obtain the second convolution feature of the incremental sample image, where the incremental sample image is obtained by the sample image incrementing device shown in FIG. 6. The new candidate region and probability determination unit 702 is configured to determine, according to the region generation network and the second convolution feature, the new candidate regions and the second probability that each new candidate region contains the target object. The loss value obtaining unit 703 is configured to obtain the first loss value corresponding to the first probability and the second loss value corresponding to the second probability.
  • the comprehensive loss value determination unit 704 is configured to determine the comprehensive loss value based on the weighted first loss value and the second loss value;
  • the image detection model training unit 705 is configured to meet the preset requirements based on the comprehensive loss value, and obtain a trained image detection model.
  • the comprehensive loss value determination unit may be further configured to:
  • the sum of the weighted first loss value and the weighted second loss value is taken as the comprehensive loss value.
  • the image detection model training unit is further configured to:
  • in response to the comprehensive loss value being the minimum over a preset number of rounds of iterative training, the trained image detection model is output.
  • the image detection apparatus 800 in this embodiment may include: a to-be-detected image receiving unit 801 and an image detection unit 802 .
  • the image receiving unit 801 to be detected is configured to receive the image to be detected;
  • The image detection unit 802 is configured to call the image detection model to detect the image to be detected, where the image detection model is obtained by the image detection model training device shown in FIG. 7.
  • The sample image incrementing device provided by this embodiment determines, by means of a region generation network, candidate regions that may contain the target object, takes those with a higher inclusion probability as target candidate regions, maps the target candidate regions back to the original image, and applies the corresponding sharpening or blurring processing to the parts of the original image corresponding to the target candidate regions and/or the non-target candidate regions, thereby obtaining incremental sample images that highlight the target object as much as possible. With this technical solution, highly usable incremental sample images can be generated without destroying the key parts of the original sample image.
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure.
  • Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • The device 900 includes a computing unit 901, which can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 902 or a computer program loaded from a storage unit 908 into a random access memory (RAM) 903.
  • In the RAM 903, various programs and data necessary for the operation of the device 900 can also be stored.
  • the computing unit 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904.
  • An input/output (I/O) interface 905 is also connected to bus 904 .
  • Various components in the device 900 are connected to the I/O interface 905, including: an input unit 906, such as a keyboard, mouse, etc.; an output unit 907, such as various types of displays, speakers, etc.; a storage unit 908, such as a magnetic disk, an optical disk, etc.; and a communication unit 909, such as a network card, a modem, a wireless communication transceiver, and the like.
  • the communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • Computing unit 901 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various specialized artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processors, controllers, microcontrollers, etc.
  • the computing unit 901 performs the various methods and processes described above, such as the sample image increment method.
  • the sample image increment method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 908 .
  • part or all of the computer program may be loaded and/or installed on device 900 via ROM 902 and/or communication unit 909.
  • When the computer program is loaded into RAM 903 and executed by the computing unit 901, one or more steps of the sample image increment method described above may be performed.
  • the computing unit 901 may be configured to perform the sample image increment method by any other suitable means (eg, by means of firmware).
  • Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented.
  • the program code may execute entirely on the machine, partly on the machine, partly on the machine and partly on a remote machine as a stand-alone software package or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (eg, visual feedback, auditory feedback, or tactile feedback); and can be in any form (including acoustic input, voice input, or tactile input) to receive input from the user.
  • the systems and techniques described herein may be implemented on a computing system that includes back-end components (eg, as a data server), or a computing system that includes middleware components (eg, an application server), or a computing system that includes front-end components (eg, a user's computer having a graphical user interface or web browser through which a user may interact with implementations of the systems and techniques described herein), or including such backend components, middleware components, Or any combination of front-end components in a computing system.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
  • a computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • The server can be a cloud server, also known as a cloud computing server or cloud host; it is a host product in the cloud computing service system that remedies the defects of difficult management and weak business scalability found in traditional physical host and virtual private server (VPS) services.
  • The technical solution provided by the embodiments of the present disclosure uses the region generation network to determine candidate regions that may contain the target object, takes the candidate regions with a higher inclusion probability as target candidate regions, maps the target candidate regions back to the original image, and applies the corresponding sharpening or blurring processing to the parts of the original image corresponding to the target candidate regions and/or the non-target candidate regions, thereby obtaining incremental sample images that highlight the target object as much as possible.
  • a high-availability incremental sample image can be generated without destroying the key part of the original sample image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a sample image increment method, an image detection model training method, an image detection method, and corresponding apparatuses, an electronic device, a computer-readable storage medium, and a computer program product, relating to artificial intelligence fields such as computer vision and deep learning, and applicable to smart cloud and industrial quality inspection scenarios. One specific embodiment comprises: obtaining a first convolution feature of an original sample image; determining candidate regions, and a first probability that each candidate region contains a target object, according to a region generation network and the first convolution feature; determining target candidate regions among the candidate regions based on the first probability, and mapping the target candidate regions back to the original sample image to obtain an intermediate image; and performing image enhancement processing on the part of the intermediate image corresponding to the target candidate regions and/or image blurring processing on the part corresponding to the non-target candidate regions, to obtain an incremental sample image. Incremental sample images generated by this embodiment are more usable.

Description

Sample image increment, image detection model training, and image detection methods
This patent application claims priority to Chinese patent application No. 202110371342.4, filed on April 7, 2021 and entitled "Sample image increment, image detection model training, and image detection methods", the entire content of which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to computer vision and deep learning technologies, applicable to smart cloud and industrial quality inspection scenarios, and especially to a sample image increment method, an image detection model training method, an image detection method, and corresponding apparatuses, an electronic device, a computer-readable storage medium, and a computer program product.
Background Art
In the field of target detection, machine learning algorithms usually need to learn from a large number of labeled training samples, so that the trained model can perform target detection on actual samples.
In some technical fields, because target objects are scarce or extremely difficult to obtain, it is hard to collect enough training samples, and the recognition capability of the trained model cannot be guaranteed.
The prior art usually achieves sample increment for small samples through transformations such as rotating the sample images, or by means of generative adversarial networks or transfer learning.
Summary of the Invention
Embodiments of the present disclosure propose a sample image increment method, an image detection model training method, an image detection method, and corresponding apparatuses, an electronic device, a computer-readable storage medium, and a computer program product.
In a first aspect, an embodiment of the present disclosure proposes a sample image increment method, comprising: obtaining a first convolution feature of an original sample image; determining candidate regions, and a first probability that each candidate region contains a target object, according to a region generation network and the first convolution feature; determining target candidate regions among the candidate regions based on the first probability, and mapping the target candidate regions back to the original sample image to obtain an intermediate image; and performing image enhancement processing on the part of the intermediate image corresponding to the target candidate regions and/or image blurring processing on the part of the intermediate image corresponding to the non-target candidate regions, to obtain an incremental sample image.
In a second aspect, an embodiment of the present disclosure proposes a sample image incrementing apparatus, comprising: a first convolution feature acquisition unit configured to obtain a first convolution feature of an original sample image; a candidate region and probability determination unit configured to determine candidate regions, and a first probability that each candidate region contains a target object, according to a region generation network and the first convolution feature; a target candidate region determination and mapping unit configured to determine target candidate regions among the candidate regions based on the first probability and map the target candidate regions back to the original sample image to obtain an intermediate image; and an intermediate image processing unit configured to perform image enhancement processing on the part of the intermediate image corresponding to the target candidate regions and/or image blurring processing on the part corresponding to the non-target candidate regions, to obtain an incremental sample image.
In a third aspect, an embodiment of the present disclosure proposes an image detection model training method, comprising: obtaining a second convolution feature of an incremental sample image, wherein the incremental sample image is obtained by any implementation of the first aspect; determining new candidate regions, and a second probability that each new candidate region contains the target object, according to the region generation network and the second convolution feature; obtaining a first loss value corresponding to the first probability and a second loss value corresponding to the second probability; determining a comprehensive loss value based on the weighted first loss value and second loss value; and obtaining a trained image detection model based on the comprehensive loss value satisfying a preset requirement.
In a fourth aspect, an embodiment of the present disclosure proposes an image detection model training apparatus, comprising: a second convolution feature acquisition unit configured to obtain a second convolution feature of an incremental sample image, wherein the incremental sample image is obtained by any implementation of the second aspect; a new candidate region and probability determination unit configured to determine new candidate regions, and a second probability that each new candidate region contains the target object, according to the region generation network and the second convolution feature; a loss value obtaining unit configured to obtain a first loss value corresponding to the first probability and a second loss value corresponding to the second probability; a comprehensive loss value determination unit configured to determine a comprehensive loss value based on the weighted first loss value and second loss value; and an image detection model training unit configured to obtain a trained image detection model based on the comprehensive loss value satisfying a preset requirement.
In a fifth aspect, an embodiment of the present disclosure provides an image detection method, comprising: receiving an image to be detected; and calling an image detection model to detect the image to be detected, wherein the image detection model is obtained by any implementation of the third aspect.
In a sixth aspect, an embodiment of the present disclosure provides an image detection apparatus, comprising: a to-be-detected image receiving unit configured to receive an image to be detected; and an image detection unit configured to call an image detection model to detect the image to be detected, wherein the image detection model is obtained by any implementation of the fourth aspect.
In a seventh aspect, an embodiment of the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the sample image increment method described in any implementation of the first aspect and/or the image detection model training method described in any implementation of the third aspect and/or the image detection method described in any implementation of the fifth aspect.
In an eighth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to enable a computer, when executing them, to implement the sample image increment method described in any implementation of the first aspect and/or the image detection model training method described in any implementation of the third aspect and/or the image detection method described in any implementation of the fifth aspect.
In a ninth aspect, an embodiment of the present disclosure provides a computer program product including a computer program that, when executed by a processor, implements the sample image increment method described in any implementation of the first aspect and/or the image detection model training method described in any implementation of the third aspect and/or the image detection method described in any implementation of the fifth aspect.
With the sample image increment method, image detection model training method, image detection method, and corresponding apparatuses, electronic device, computer-readable storage medium, and computer program product provided by embodiments of the present disclosure, first, the first convolution feature of the original sample image is obtained; then, candidate regions, and the first probability that each candidate region contains the target object, are determined according to the region generation network and the first convolution feature; next, target candidate regions are determined among the candidate regions based on the first probability and mapped back to the original sample image to obtain an intermediate image; finally, image enhancement processing is performed on the part of the intermediate image corresponding to the target candidate regions and/or image blurring processing is performed on the part corresponding to the non-target candidate regions, to obtain an incremental sample image.
The technical solution provided by the present disclosure uses a region generation network to determine candidate regions that may contain the target object, takes those with a higher inclusion probability as target candidate regions, maps the target candidate regions back to the original image, and applies the corresponding sharpening or blurring processing to the parts of the original image corresponding to the target candidate regions and/or the non-target candidate regions, thereby obtaining incremental sample images that highlight the target object as much as possible. With this technical solution, highly usable incremental sample images can be generated without destroying the key parts of the original sample image.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become easy to understand from the following description.
Brief Description of the Drawings
Other features, objects, and advantages of the present disclosure will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 is an exemplary system architecture to which the present disclosure may be applied;
FIG. 2 is a flowchart of a sample image increment method provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of another sample image increment method provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of an image detection model training method provided by an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of a sample image increment method in one application scenario provided by an embodiment of the present disclosure;
FIG. 6 is a structural block diagram of a sample image incrementing apparatus provided by an embodiment of the present disclosure;
FIG. 7 is a structural block diagram of an image detection model training apparatus provided by an embodiment of the present disclosure;
FIG. 8 is a structural block diagram of an image detection apparatus provided by an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device suitable for executing the sample image increment method and/or the image detection model training method and/or the image detection method provided by an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding; they should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description. It should be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other.
In the technical solution of the present disclosure, the acquisition, storage, and application of the user personal information involved all comply with the provisions of relevant laws and regulations, necessary confidentiality measures are taken, and public order and good customs are not violated.
First, FIG. 1 shows an exemplary system architecture 100 to which embodiments of the sample image increment method, image detection model training method, image detection method, and corresponding apparatuses, electronic device, and computer-readable storage medium of the present disclosure may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, to receive or send messages and so on. Various applications for information communication between the two may be installed on the terminal devices 101, 102, 103 and the server 105, such as image transmission applications, sample image increment applications, and target detection model training applications.
The terminal devices 101, 102, 103 and the server 105 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, laptop portable computers, and desktop computers; when they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module, which is not specifically limited here. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server; when it is software, it may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module, which is not specifically limited here.
The server 105 can provide various services through the various applications built into it. Taking an image increment application that can provide a sample image increment service as an example, the server 105 can achieve the following effects when running this application: first, receive an original sample image from the terminal devices 101, 102, 103 through the network 104, and extract its first convolution feature through a conventional feature extraction network; then, determine candidate regions, and the first probability that each candidate region contains the target object, according to a region generation network and the first convolution feature; next, determine target candidate regions among the candidate regions based on the first probability, and map the target candidate regions back to the original sample image to obtain an intermediate image; finally, perform image enhancement processing on the part of the intermediate image corresponding to the target candidate regions and/or image blurring processing on the part corresponding to the non-target candidate regions, to obtain an incremental sample image.
Further, the server 105 can also use the generated incremental sample images to train a corresponding image detection model. For example, when running a model training application, the server 105 can achieve the following effects: obtain a second convolution feature of the incremental sample image; determine new candidate regions, and a second probability that each new candidate region contains the target object, according to the region generation network and the second convolution feature; obtain a first loss value corresponding to the first probability and a second loss value corresponding to the second probability; determine a comprehensive loss value based on the weighted first loss value and second loss value; and obtain a trained image detection model based on the comprehensive loss value satisfying a preset requirement.
Furthermore, after the server 105 obtains the trained image detection model in the above training manner, it can also provide an external image detection service based on the image detection model, that is, detect an image to be detected by calling the image detection model and return the detection result.
It should be pointed out that, besides being acquired from the terminal devices 101, 102, 103 through the network 104, original sample images may also be pre-stored locally on the server 105 in various ways. Thus, when the server 105 detects that such data is already stored locally (for example, pending sample image increment tasks retained before processing begins), it may choose to acquire the data directly from local storage, in which case the exemplary system architecture 100 may omit the terminal devices 101, 102, 103 and the network 104. Further, the first convolution feature of the original sample image may also be extracted in advance via the feature extraction network, with the finished result acquired directly for subsequent use.
Since image incrementing occupies considerable computing resources and requires strong computing power, the sample image increment method provided by subsequent embodiments of the present disclosure is generally executed by the server 105, which has stronger computing power and more computing resources; accordingly, the sample image incrementing apparatus is generally also arranged in the server 105. It should also be pointed out, however, that when the terminal devices 101, 102, 103 have computing power and computing resources that meet the requirements, they may also complete, through the image increment applications installed on them, the computations otherwise assigned to the server 105, and output the same results as the server 105. Especially when multiple terminal devices with different computing capabilities exist at the same time, if the image increment application determines that its terminal device has strong computing power and many remaining computing resources, the terminal device may be allowed to perform the above computations, appropriately relieving the computing pressure on the server 105; accordingly, the sample image incrementing apparatus may also be arranged in the terminal devices 101, 102, 103. In this case, the exemplary system architecture 100 may omit the server 105 and the network 104.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative; there may be any number of terminal devices, networks, and servers according to implementation needs.
Please refer to FIG. 2, which is a flowchart of a sample image increment method provided by an embodiment of the present disclosure, in which the process 200 includes the following steps:
Step 201: Obtain a first convolution feature of an original sample image.
This step is intended for the execution body of the sample image increment method (for example, the server 105 shown in FIG. 1) to obtain the first convolution feature of the original sample image.
The first convolution feature may be extracted from the original sample image by a feature extraction network, whose specific type is not limited. The original sample image is an image containing the target object; depending on actual needs, the target object may be any of various objects in few-sample scenarios, such as cracks in metal materials under a microscope, or microorganisms in a certain state of motion.
Step 202: Determine candidate regions, and a first probability that each candidate region contains the target object, according to a region generation network and the first convolution feature.
On the basis of step 201, this step is intended for the above execution body to input the first convolution feature into the region generation network, so as to use the region generation network to determine candidate regions suspected of containing the target object, as well as the first probability that each candidate region contains the target object. Specifically, the first probability describes the likelihood that a candidate region indeed contains the target object, and can even be quantized as a probability score. It should be understood that candidate regions are regions, determined by the region generation network based on the convolution feature (map), that may contain the target object; that is, the region generation network should be capable of recognizing the convolution features of the target object.
Step 203: Determine target candidate regions among the candidate regions based on the first probability, and map the target candidate regions back to the original sample image to obtain an intermediate image.
On the basis of step 202, this step is intended for the above execution body to determine, according to the first probability of each candidate region, the candidate regions with a higher probability of containing the target object as target candidate regions, and further map the target candidate regions back to the original sample image, thereby obtaining an intermediate image in which suspected target objects are framed by bounding boxes.
It should be understood that, since the candidate regions are determined from the convolution feature (map) extracted from the original sample image, they are regions on the convolution feature map rather than directly on the original sample image; however, by means of the correspondence between the convolution feature and the original sample image, the target candidate regions can be mapped back onto the original sample image, thereby framing the boundaries where the target object exists in the original sample image. It should also be understood that the accuracy of this framing depends on the accuracy with which the region generation network extracts candidate regions and determines the first probability.
Step 204: Perform image enhancement processing on the part of the intermediate image corresponding to the target candidate regions and/or image blurring processing on the part corresponding to the non-target candidate regions, to obtain an incremental sample image.
On the basis of step 203, this step is intended for the above execution body to apply different image processing means to the parts of the intermediate image framed as containing the target object and/or the parts not containing it, and to take the processed image as the incremental sample image.
Specifically, this step includes three different implementations:
First: only perform image enhancement on the part of the intermediate image corresponding to the target candidate regions, and take the enhanced intermediate image as the incremental sample image;
Second: only perform image blurring on the part of the intermediate image corresponding to the non-target candidate regions, and take the blurred intermediate image as the incremental sample image;
Third: perform both image enhancement on the part of the intermediate image corresponding to the target candidate regions and image blurring on the part corresponding to the non-target candidate regions, and take the intermediate image after both processes as the incremental sample image.
Whichever implementation is used, the purpose is to highlight as much as possible the partial regions where the target object exists.
It should be understood that image enhancement is an image processing means that improves image clarity, while image blurring is one that reduces it; the clearer the image, the easier it is to accurately recognize whether it contains the target object.
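The three implementations of step 204 can be sketched as a small dispatcher over a target mask. This is an illustrative sketch only: `enhance_fn` and `blur_fn` are hypothetical whole-image operations standing in for whichever enhancement and blurring techniques are chosen.

```python
import numpy as np

def make_incremental_image(intermediate, target_mask,
                           enhance_fn, blur_fn, mode="both"):
    """Combine enhancement and/or blurring per step 204's three
    alternatives: mode='enhance' sharpens only the target part,
    mode='blur' blurs only the non-target part, mode='both' does
    both. The mask selects where each result is kept."""
    out = intermediate.astype(np.float64).copy()
    if mode in ("enhance", "both"):
        out[target_mask] = enhance_fn(intermediate)[target_mask]
    if mode in ("blur", "both"):
        out[~target_mask] = blur_fn(intermediate)[~target_mask]
    return out
```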
This embodiment of the present disclosure provides a sample image increment method that uses a region generation network to determine candidate regions that may contain the target object, takes those with a higher inclusion probability as target candidate regions, maps the target candidate regions back to the original image, and applies the corresponding sharpening or blurring processing to the parts of the original image corresponding to the target candidate regions and/or the non-target candidate regions, thereby obtaining incremental sample images that highlight the target object as much as possible. With this technical solution, highly usable incremental sample images can be generated without destroying the key parts of the original sample image.
Please refer to FIG. 3, which is a flowchart of another sample image increment method provided by an embodiment of the present disclosure, in which the process 300 includes the following steps:
Step 301: Obtain a first convolution feature of an original sample image.
Step 302: Determine candidate regions, and a first probability that each candidate region contains the target object, according to a region generation network and the first convolution feature.
Steps 301-302 above are consistent with steps 201-202 shown in FIG. 2; for the identical parts, refer to the corresponding parts of the previous embodiment, which are not repeated here.
Step 303: Determine candidate regions whose first probability is greater than a preset probability as target candidate regions, and map the target candidate regions back to the original sample image to obtain an intermediate image.
On the basis of step 203, this embodiment provides through this step a specific implementation of selecting target candidate regions: a preset probability considered capable of distinguishing high from low probabilities (for example, 70%) is set in advance, so that target candidate regions with a high probability of containing the target object can be selected simply by comparing each candidate region's first probability with the preset probability.
Besides the preset-probability-based approach provided by step 303, the top-ranked candidate regions by first probability (sorted from large to small, the top N being those with the larger probability values) may also be determined as target candidate regions, and the selection can likewise be based on a top percentage or other methods. Either way, the goal is to determine the candidate regions with the highest possible probability of containing the target object as target candidate regions, so that after the target candidate regions are mapped back to the original sample image, the target object in the original sample image can be framed as accurately as possible.
Step 304: Perform Gaussian blurring on the part of the intermediate image corresponding to the non-target candidate regions.
On the basis of step 303, this step is intended for the above execution body to perform Gaussian blurring on the part of the intermediate image corresponding to the non-target candidate regions.
Gaussian blur, also known as Gaussian smoothing, is commonly used to reduce image noise and suppress detail. An image produced by this blurring technique looks as if it is being viewed through frosted glass, clearly different from lens bokeh and from the effect of ordinary lighting shadows. Gaussian smoothing is also used in the preprocessing stage of computer vision algorithms to enhance images at different scales. Mathematically, Gaussian-blurring an image is convolving it with a normal distribution; since the normal distribution is also called the Gaussian distribution, the technique is called Gaussian blur. Convolving the image with a circular box blur instead produces a more accurate bokeh effect. Since the Fourier transform of a Gaussian function is another Gaussian function, Gaussian blur acts as a low-pass filter on the image.
Step 305: Perform first image enhancement processing on the first target area in the intermediate image.
Step 306: Perform second image enhancement processing on the second target area in the intermediate image.
On the basis of step 303, steps 305 and 306 apply image enhancement of different strengths to the first target area and the second target area of the intermediate image, respectively, to distinguish the enhancement effects of different target areas.
The first target area is the overlapping part of at least two target candidate regions mapped in the original sample image; in contrast, the second target area is the part of a single target candidate region mapped in the original sample image. Understandably, the more target candidate regions that map to the same location in the original sample image, the more confidently a target object can be judged to exist at that location, whereas a single mapping only maintains the original confidence. Therefore, through steps 305 and 306, this embodiment applies stronger image enhancement to the partial areas more likely to contain the target object, and more conventional enhancement to the partial areas with ordinary likelihood.
Step 307: Take the processed image as the incremental sample image.
On the basis of the technical solution provided by the previous embodiment, this embodiment provides through step 303 a specific method of determining target candidate regions based on the first probability; through step 304, the specific image blurring method of Gaussian blur for the part of the intermediate image corresponding to the non-target candidate regions; and through steps 305-306, image enhancement of different strengths for the parts of the intermediate image corresponding to the target candidate regions, depending on whether multiple target candidate regions overlap, so as to highlight the target object as much as possible.
It should be understood that the specific implementations provided by step 303, step 304, and steps 305-306 can each be combined separately with the embodiment shown in process 200 to form different embodiments, and there is no causal or dependency relationship among them. This embodiment is therefore merely a preferred embodiment that includes all three specific implementations at once.
上述任意实施例提供了不同的样本图像增量方案,进一步的,还可以结合上述生成增量样本图像的技术方案,提供了一种用于训练得到目标检测模型的模型训练方法,一种包括且不限于的实现方式可参见如图4所示的流程图,其流程400包括以下步骤:
步骤401:获取增量样本图像的第二卷积特征;
第二卷积特征提取自增强样本图像,用于提取得到第二卷积特征的方式可与从原始样本图像提取到第一卷积特征的方式相同,例如采用相同的特征提取网络。
步骤402:根据区域生成网络和第二卷积特征确定新候选区域、新候选区域包含所述目标对象的第二概率;
新候选区域和其第二概率类似于候选区域和其第一概率,区别为新候选区域和其第二概率的对象是增量样本图像,候选区域和其第一概率的对象是原始样本图像。
步骤403:获取对应第一概率的第一损失值,以及对应第二概率的第二损失值;
在步骤402的基础上,本步骤旨在分别获取到用于指导模型训练的损失值,由于存在原始样本图像和增量样本图像,因此分别基于第一概率和第二概率确定出相应的损失值。
步骤404:基于加权后的第一损失值和第二损失值,确定综合损失值;
在步骤403的基础上,本步骤旨在综合加权后的第一损失值和第二损失值来确定更加合理的综合损失值。其中,用于对第一损失值进行加权的权值与用于对第二损失值进行加权的权值可以相同也可以不同,可根据实际情况灵活调整。
一种包括且不限于的实现方式为:将加权后的第一损失值与加权后的第二损失值的和,作为综合损失值。
步骤405:基于综合损失值满足预设要求,得到训练完成的图像检测模型。
在步骤404的基础上,本步骤旨在由上述执行主体基于综合损失值满足预设要求,得到训练完成的图像检测模型。
一种包括且不限于的实现方式为:响应于综合损失值为预设轮数的迭代训练中的最小值,输出训练完成的图像检测模型。此种方式可以理解为以控制综合损失值最小为训练目标,综合损失值越小将使得模型的检测精度越高。
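步骤404-405所述的综合损失值计算与模型选取逻辑可示意如下(权值w1、w2与各轮损失值均为示例,仅为基于上文描述的草图):

```python
def combined_loss(loss1, loss2, w1=0.5, w2=0.5):
    """将加权后的第一损失值与加权后的第二损失值之和, 作为综合损失值"""
    return w1 * loss1 + w2 * loss2

def pick_best_round(history):
    """history 为预设轮数的迭代训练中每轮的综合损失值列表;
    返回综合损失值最小的一轮(即满足预设要求、应输出模型的一轮)"""
    best = min(range(len(history)), key=history.__getitem__)
    return best, history[best]
```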
如图4所示的实施例在之前各实施例的基础上,进一步结合增量后的样本图像训练目标检测模型,以便后续可直接使用训练好的目标检测模型来准确、高效的检测待测图像中是否存在目标对象。
一种图像检测方法可以为:
首先,接收待检测图像,然后,调用图像检测模型对待检测图像进行检测。后续还可以返回得到的检测结果。
为加深理解,本公开还结合一个具体应用场景,给出了一种具体的实现方案,请参见如图5所示的流程示意图。
针对样本图像数量较少的现实目标检测场景,本实施例提供了一种基于区域生成增强的目标检测方法,旨在利用候选区域生成进行数据增强,可与现有的各种样本增量技术一起使用,从而从不同的角度综合提升增量样本的可用性,最后基于增量后的样本集训练出检测效果更好的目标检测模型:
1)对原始图像A使用卷积神经网络提取卷积特征;
2)通过区域生成网络对提取到的卷积特征生成可能含有目标的候选区域,以及各个候选区域可能含有目标的概率得分;
3)由2)中获得的候选区域和1)提取的卷积特征经过常见的ROI (region of interest,感兴趣区域)池化,输入到两个全连接层中,得到数千个分类概率,其中每一个分类概率都有其对应的回归边界,将其记为分类概率a1和回归边界a2;
4)将2)中得到的候选区域按照概率得分从高到低排序,选择前N个候选区域映射回原图(N取50,该参数可以根据具体任务进行调整),可以获得一张标有N个检测框的中间图像;
5)将4)获得的中间图像中位于检测框之外的区域记为背景区域,将该背景区域进行高斯模糊,并对位于检测框内的前景区域使用图像增强提升其清晰度,获得图像B;
6)将图像B输入到卷积特征提取网络中,最终可以得到分类概率b1和回归边界b2;
7)将分类概率a1与分类概率b1加权求和后获得最终的分类概率,按照一定阈值将最终分类概率对应的回归边界(即a2和b2)映射到待检测的原始图像中,获得最终的检测结果。
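步骤7)的融合逻辑可示意如下(假设a1、b1为两路分类概率,a2、b2为各自对应的回归边界,权值w与阈值thresh均为示例参数):

```python
import numpy as np

def fuse_predictions(a1, b1, a2, b2, w=0.5, thresh=0.5):
    """将两路分类概率加权求和得到最终分类概率,
    并按阈值保留最终概率较高者对应的两路回归边界"""
    a1, b1 = np.asarray(a1), np.asarray(b1)
    final = w * a1 + (1.0 - w) * b1          # 加权求和后的最终分类概率
    keep = final > thresh                     # 按一定阈值筛选
    return final[keep], np.asarray(a2)[keep], np.asarray(b2)[keep]
```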
由于处理过程中会对候选区域映射后的图像进行背景模糊,所以只有当候选区域包含图像中所有待检测目标时,候选区域映射后的图像在训练过程中的损失值才能更好地收敛。
上述方案也可以移植到基于区域生成网络的现有方法中,还可以和其它用于小样本检测的技术共同提高效果,以进一步提升实用性。
作为对上述各图所示方法的实现,本公开还分别提供了装置实施例,即与图2所示的样本图像增量方法对应的样本图像增量装置、与图4所示的图像检测模型训练方法对应的图像检测模型训练装置,与图像检测方法对应的图像检测装置,各装置可具体应用于各种电子设备中。
如图6所示,本实施例的样本图像增量装置600可以包括:第一卷积特征获取单元601、候选区域及概率确定单元602、目标候选区域确定及映射单元603、中间图像处理单元604。其中,第一卷积特征获取单元601,被配置成获取原始样本图像的第一卷积特征;候选区域及概率确定单元602,被配置成根据区域生成网络和第一卷积特征确定候选区域,及候选区域中包含目标对象的第一概率;目标候选区域确定及映射单元603,被配置成基于第一概率在候选区域中确定目标候选区域,并将目标候选区域映射回原始样本图像,得到中间图像;中间图像处理单元604,被配置成对中间图像中对应目标候选区域的部分进行图像增强处理和/或对中间图像中对应非目标候选区域的部分进行图像模糊处理,得到增量样本图像。
在本实施例中,样本图像增量装置600中:第一卷积特征获取单元601、候选区域及概率确定单元602、目标候选区域确定及映射单元603、中间图像处理单元604的具体处理及其所带来的技术效果可分别参考图2对应实施例中的步骤201-204的相关说明,在此不再赘述。
在本实施例的一些可选的实现方式中,中间图像处理单元604可以包括对中间图像中对应非目标候选区域的部分进行图像模糊处理的模糊处理子单元,模糊处理子单元被进一步配置成:
对中间图像中对应非目标候选区域的部分进行高斯模糊处理。
在本实施例的一些可选的实现方式中,目标候选区域确定及映射单元603可以包括被配置成基于第一概率在候选区域中确定目标候选区域的目标候选区域确定子单元,目标候选区域确定子单元被进一步配置成:
将第一概率大于预设概率的候选区域确定为目标候选区域。
在本实施例的一些可选的实现方式中,中间图像处理单元604可以包括对中间图像中对应目标候选区域的部分进行图像增强处理的增强处理子单元,增强处理子单元被进一步配置成:
对中间图像中的第一目标区域进行第一图像增强处理,第一目标区域为至少两个目标候选区域映射在原始样本图像中的重叠部分;
对中间图像中的第二目标区域进行第二图像增强处理,第二目标区域为单个目标候选区域映射在原始样本图像中的部分,第一图像增强处理的图像增强强度大于第二图像增强处理的图像增强强度。
如图7所示,本实施例的图像检测模型训练装置700可以包括:第二卷积特征获取单元701、新候选区域及概率确定单元702、损失值获取单元703、综合损失值确定单元704、图像检测模型训练单元705。其中,第二卷积特征获取单元701,被配置成获取增量样本图像的第二卷积特征;其中,增量样本图像通过如图6所示的样本图像增量装置得到;新候选区域及概率确定单元702,被配置成根据区域生成网络和第二卷积特征确定新候选区域、新候选区域包含目标对象的第二概率;损失值获取单元703,被配置成获取对应第一概率的第一损失值,以及对应第二概率的第二损失值;综合损失值确定单元704,被配置成基于加权后的第一损失值和第二损失值,确定综合损失值;图像检测模型训练单元705,被配置成基于综合损失值满足预设要求,得到训练完成的图像检测模型。
在本实施例的一些可选的实现方式中,综合损失值确定单元可以被进一步配置成:
将加权后的第一损失值与加权后的第二损失值的和,作为综合损失值。
在本实施例的一些可选的实现方式中,图像检测模型训练单元被进一步配置成:
响应于综合损失值为预设轮数的迭代训练中的最小值,输出训练完成的图像检测模型。
如图8所示,本实施例的图像检测装置800可以包括:待检测图像接收单元801、图像检测单元802。其中,待检测图像接收单元801,被配置成接收待检测图像;图像检测单元802,被配置成调用图像检测模型对待检测图像进行检测;其中,图像检测模型通过如图7所示的图像检测模型训练装置得到。
本实施例作为对应于上述方法实施例的装置实施例存在。本公开实施例所提供的样本图像增量装置借助区域生成网络来确定可能包含目标对象的候选区域,然后将其中包含目标对象概率较高的候选区域作为目标候选区域,通过将目标候选区域映射回原图,并对原图中对应目标候选区域和/或对应非目标候选区域的部分采用相应的清晰化或模糊化处理方式,进而得到尽可能凸显目标对象的增量样本图像。通过该技术方案,得以在不破坏原始样本图像中关键部分的前提下,生成高可用性的增量样本图像。
根据本公开的实施例,本公开还提供了一种电子设备、一种可读存储介质和一种计算机程序产品。
图9示出了可以用来实施本公开的实施例的示例电子设备900的示意性框图。电子设备旨在表示各种形式的数字计算机,诸如,膝上型计算机、台式计算机、工作台、个人数字助理、服务器、刀片式服务器、大型计算机、和其它适合的计算机。电子设备还可以表示各种形式的移动装置,诸如,个人数字处理、蜂窝电话、智能电话、可穿戴设备和其它类似的计算装置。本文所示的部件、它们的连接和关系、以及它们的功能仅仅作为示例,并且不意在限制本文中描述的和/或者要求的本公开的实现。
如图9所示,设备900包括计算单元901,其可以根据存储在只读存储器(ROM)902中的计算机程序或者从存储单元908加载到随机访问存储器(RAM)903中的计算机程序,来执行各种适当的动作和处理。在RAM 903中,还可存储设备900操作所需的各种程序和数据。计算单元901、ROM 902以及RAM 903通过总线904彼此相连。输入/输出(I/O)接口905也连接至总线904。
设备900中的多个部件连接至I/O接口905,包括:输入单元906,例如键盘、鼠标等;输出单元907,例如各种类型的显示器、扬声器等;存储单元908,例如磁盘、光盘等;以及通信单元909,例如网卡、调制解调器、无线通信收发机等。通信单元909允许设备900通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。
计算单元901可以是各种具有处理和计算能力的通用和/或专用处理组件。计算单元901的一些示例包括但不限于中央处理单元(CPU)、图形处理单元(GPU)、各种专用的人工智能(AI)计算芯片、各种运行机器学习模型算法的计算单元、数字信号处理器(DSP)、以及任何适当的处理器、控制器、微控制器等。计算单元901执行上文所描述的各个方法和处理,例如样本图像增量方法。例如,在一些实施例中,样本图像增量方法可被实现为计算机软件程序,其被有形地包含于机器可读介质,例如存储单元908。在一些实施例中,计算机程序的部分或者全部可以经由ROM 902和/或通信单元909而被载入和/或安装到设备900上。当计算机程序加载到RAM 903并由计算单元901执行时,可以执行上文描述的样本图像增量方法的一个或多个步骤。备选地,在其他实施例中,计算单元901可以通过其他任何适当的方式(例如,借助于固件)而被配置为执行样本图像增量方法。
本文中以上描述的系统和技术的各种实施方式可以在数字电子电路系统、集成电路系统、场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、芯片上系统的系统(SOC)、负载可编程逻辑设备(CPLD)、计算机硬件、固件、软件、和/或它们的组合中实现。这些各种实施方式可以包括:实施在一个或者多个计算机程序中,该一个或者多个计算机程序可在包括至少一个可编程处理器的可编程系统上执行和/或解释,该可编程处理器可以是专用或者通用可编程处理器,可以从存储系统、至少一个输入装置、和至少一个输出装置接收数据和指令,并且将数据和指令传输至该存储系统、该至少一个输入装置、和该至少一个输出装置。
用于实施本公开的方法的程序代码可以采用一个或多个编程语言的任何组合来编写。这些程序代码可以提供给通用计算机、专用计算机或其他可编程数据处理装置的处理器或控制器,使得程序代码当由处理器或控制器执行时使流程图和/或框图中所规定的功能/操作被实施。程序代码可以完全在机器上执行、部分地在机器上执行,作为独立软件包部分地在机器上执行且部分地在远程机器上执行或完全在远程机器或服务器上执行。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
为了提供与用户的交互,可以在计算机上实施此处描述的系统和技术,该计算机具有:用于向用户显示信息的显示装置(例如,CRT(阴极射线管)或者LCD(液晶显示器)监视器);以及键盘和指向装置(例如,鼠标或者轨迹球),用户可以通过该键盘和该指向装置来将输入提供给计算机。其它种类的装置还可以用于提供与用户的交互;例如,提供给用户的反馈可以是任何形式的传感反馈(例如,视觉反馈、听觉反馈、或者触觉反馈);并且可以用任何形式(包括声输入、语音输入或者触觉输入)来接收来自用户的输入。
可以将此处描述的系统和技术实施在包括后台部件的计算系统(例如,作为数据服务器)、或者包括中间件部件的计算系统(例如,应用服务器)、或者包括前端部件的计算系统(例如,具有图形用户界面或者网络浏览器的用户计算机,用户可以通过该图形用户界面或者该网络浏览器来与此处描述的系统和技术的实施方式交互)、或者包括这种后台部件、中间件部件、或者前端部件的任何组合的计算系统中。可以通过任何形式或者介质的数字数据通信(例如,通信网络)来将系统的部件相互连接。通信网络的示例包括:局域网(LAN)、广域网(WAN)和互联网。
计算机系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。服务器可以是云服务器,又称为云计算服务器或云主机,是云计算服务体系中的一项主机产品,以解决传统物理主机与虚拟专用服务器(VPS,Virtual Private Server)服务中存在的管理难度大、业务扩展性弱的缺陷。
本公开实施例所提供的技术方案借助区域生成网络来确定可能包含目标对象的候选区域,然后将其中包含目标对象概率较高的候选区域作为目标候选区域,将目标候选区域映射回原图,并对原图中对应目标候选区域和/或对应非目标候选区域的部分采用相应的清晰化或模糊化处理方式,进而得到尽可能凸显目标对象的增量样本图像。通过该技术方案,得以在不破坏原始样本图像中关键部分的前提下,生成高可用性的增量样本图像。
应该理解,可以使用上面所示的各种形式的流程,重新排序、增加或删除步骤。例如,本公开中记载的各步骤可以并行地执行,也可以顺序地执行,也可以以不同的次序执行,只要能够实现本公开的技术方案所期望的结果,本文在此不进行限制。
上述具体实施方式,并不构成对本公开保护范围的限制。本领域技术人员应该明白的是,根据设计要求和其他因素,可以进行各种修改、组合、子组合和替代。任何在本公开的精神和原则之内所作的修改、等同替换和改进等,均应包含在本公开保护范围之内。

Claims (19)

  1. 一种样本图像增量方法,包括:
    获取原始样本图像的第一卷积特征;
    根据区域生成网络和所述第一卷积特征确定候选区域,及所述候选区域中包含目标对象的第一概率;
    基于所述第一概率在所述候选区域中确定目标候选区域,并将所述目标候选区域映射回所述原始样本图像,得到中间图像;
    对所述中间图像中对应所述目标候选区域的部分进行图像增强处理和/或对所述中间图像中对应非所述目标候选区域的部分进行图像模糊处理,得到增量样本图像。
  2. 根据权利要求1所述的方法,其中,所述对所述中间图像中对应非所述目标候选区域的部分进行图像模糊处理,包括:
    对所述中间图像中对应非所述目标候选区域的部分进行高斯模糊处理。
  3. 根据权利要求1所述的方法,其中,所述基于所述第一概率在所述候选区域中确定目标候选区域,包括:
    将所述第一概率大于预设概率的候选区域确定为目标候选区域。
  4. 根据权利要求1所述的方法,其中,所述对所述中间图像中对应所述目标候选区域的部分进行图像增强处理,包括:
    对所述中间图像中的第一目标区域进行第一图像增强处理,所述第一目标区域为至少两个所述目标候选区域映射在所述原始样本图像中的重叠部分;
    对所述中间图像中的第二目标区域进行第二图像增强处理,所述第二目标区域为单个所述目标候选区域映射在所述原始样本图像中的部分,所述第一图像增强处理的图像增强强度大于所述第二图像增强处理的图像增强强度。
  5. 一种图像检测模型训练方法,包括:
    获取增量样本图像的第二卷积特征;其中,所述增量样本图像根据权利要求1-4任一项所述的样本图像增量方法得到;
    根据区域生成网络和所述第二卷积特征确定新候选区域,及所述新候选区域包含目标对象的第二概率;
    获取对应第一概率的第一损失值,以及对应所述第二概率的第二损失值;
    基于加权后的第一损失值和第二损失值,确定综合损失值;
    基于所述综合损失值满足预设要求,得到训练完成的图像检测模型。
  6. 根据权利要求5所述的方法,其中,所述基于加权后的第一损失值和第二损失值,确定综合损失值,包括:
    将加权后的第一损失值与加权后的第二损失值的和,作为所述综合损失值。
  7. 根据权利要求5所述的方法,其中,所述基于所述综合损失值满足预设要求,得到训练完成的图像检测模型,包括:
    响应于所述综合损失值为预设轮数的迭代训练中的最小值,输出训练完成的图像检测模型。
  8. 一种图像检测方法,包括:
    接收待检测图像;
    调用图像检测模型对所述待检测图像进行检测;其中,所述图像检测模型根据权利要求5-7中任一项所述的图像检测模型训练方法得到。
  9. 一种样本图像增量装置,包括:
    第一卷积特征获取单元,被配置成获取原始样本图像的第一卷积特征;
    候选区域及概率确定单元,被配置成根据区域生成网络和所述第一卷积特征确定候选区域,及所述候选区域中包含目标对象的第一概率;
    目标候选区域确定及映射单元,被配置成基于所述第一概率在所述候选区域中确定目标候选区域,并将所述目标候选区域映射回所述原始样本图像,得到中间图像;
    中间图像处理单元,被配置成对所述中间图像中对应所述目标候选区域的部分进行图像增强处理和/或对所述中间图像中对应非所述目标候选区域的部分进行图像模糊处理,得到增量样本图像。
  10. 根据权利要求9所述的装置,其中,所述中间图像处理单元包括对所述中间图像中对应非所述目标候选区域的部分进行图像模糊处理的模糊处理子单元,所述模糊处理子单元被进一步配置成:
    对所述中间图像中对应非所述目标候选区域的部分进行高斯模糊处理。
  11. 根据权利要求9所述的装置,其中,所述目标候选区域确定及映射单元包括被配置成基于所述第一概率在所述候选区域中确定目标候选区域的目标候选区域确定子单元,所述目标候选区域确定子单元被进一步配置成:
    将所述第一概率大于预设概率的候选区域确定为目标候选区域。
  12. 根据权利要求9所述的装置,其中,所述中间图像处理单元包括对所述中间图像中对应所述目标候选区域的部分进行图像增强处理的增强处理子单元,所述增强处理子单元被进一步配置成:
    对所述中间图像中的第一目标区域进行第一图像增强处理,所述第一目标区域为至少两个所述目标候选区域映射在所述原始样本图像中的重叠部分;
    对所述中间图像中的第二目标区域进行第二图像增强处理,所述第二目标区域为单个所述目标候选区域映射在所述原始样本图像中的部分,所述第一图像增强处理的图像增强强度大于所述第二图像增强处理的图像增强强度。
  13. 一种图像检测模型训练装置,包括:
    第二卷积特征获取单元,被配置成获取增量样本图像的第二卷积特征;其中,所述增量样本图像根据权利要求9-12任一项所述的样本图像增量装置得到;
    新候选区域及概率确定单元,被配置成根据区域生成网络和所述第二卷积特征确定新候选区域,及所述新候选区域包含目标对象的第二概率;
    损失值获取单元,被配置成获取对应第一概率的第一损失值,以及对应所述第二概率的第二损失值;
    综合损失值确定单元,被配置成基于加权后的第一损失值和第二损失值,确定综合损失值;
    图像检测模型训练单元,被配置成基于所述综合损失值满足预设要求,得到训练完成的图像检测模型。
  14. 根据权利要求13所述的装置,其中,所述综合损失值确定单元被进一步配置成:
    将加权后的第一损失值与加权后的第二损失值的和,作为所述综合损失值。
  15. 根据权利要求13所述的装置,其中,所述图像检测模型训练单元被进一步配置成:
    响应于所述综合损失值为预设轮数的迭代训练中的最小值,输出训练完成的图像检测模型。
  16. 一种图像检测装置,包括:
    待检测图像接收单元,被配置成接收待检测图像;
    图像检测单元,被配置成调用图像检测模型对所述待检测图像进行检测;其中,所述图像检测模型通过权利要求13-15中任一项所述的图像检测模型训练装置得到。
  17. 一种电子设备,包括:
    至少一个处理器;以及
    与所述至少一个处理器通信连接的存储器;其中,
    所述存储器存储有可被所述至少一个处理器执行的指令,所述指令被所述至少一个处理器执行,以使所述至少一个处理器能够执行权利要求1-4中任一项所述的样本图像增量方法和/或权利要求5-7所述的图像检测模型训练方法和/或权利要求8所述的图像检测方法。
  18. 一种存储有计算机指令的非瞬时计算机可读存储介质,所述计算机指令用于使所述计算机执行权利要求1-4中任一项所述的样本图像增量方法和/或权利要求5-7所述的图像检测模型训练方法和/或权利要求8所述的图像检测方法。
  19. 一种计算机程序产品,包括计算机程序,所述计算机程序在被处理器执行时实现根据权利要求1-4中任一项所述的样本图像增量方法和/或权利要求5-7所述的图像检测模型训练方法和/或权利要求8所述的图像检测方法。