CN114612732A - Sample data enhancement method, system and device, medium and target classification method - Google Patents


Info

Publication number
CN114612732A
CN114612732A (application number CN202210509914.5A)
Authority
CN
China
Prior art keywords
image
sample
information
data enhancement
enhanced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210509914.5A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Shuzhilian Technology Co Ltd
Original Assignee
Chengdu Shuzhilian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Shuzhilian Technology Co Ltd filed Critical Chengdu Shuzhilian Technology Co Ltd
Priority to CN202210509914.5A
Publication of CN114612732A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/187 Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a sample data enhancement method, system, device, and medium, and a target classification method, relating to the field of data enhancement. The method comprises the following steps: training a convolutional neural network with a first sample to obtain a first model; using the first model to obtain a class activation map corresponding to a target in a first image; locating key information in the first image based on the class activation map to obtain positioning information; performing data enhancement processing on the first image to obtain an enhanced image; and obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, the enhanced image is ignored, and if the loss is less than or equal to the threshold, a second image is obtained based on the enhanced image. The method can effectively enhance the sample data, and the enhanced samples better retain the key information of the original samples.

Description

Sample data enhancement method, system and device, medium and target classification method
Technical Field
The present invention relates to the field of data enhancement, and in particular to a sample data enhancement method, system, device, and medium, and to a target classification method.
Background
Since the launch of the first artificial satellite in 1957, more than sixty years of progress in geospatial science and technology have greatly improved the resolution, particularly the spatial resolution, of remote sensing images; many satellites, such as IKONOS, QuickBird, WorldView, GeoEye-1, Pléiades, Gaofen-2, and Ziyuan-3, now deliver imagery at meter or even sub-meter spatial resolution. Compared with natural images, high-resolution remote sensing images carry richer spectral information, shape and texture characteristics, and scene semantic information. Semantic categories are high-level abstractions and generalizations of scene content that are predefined manually, so a series of interpretation steps is required to extract information meaningful to production activities from the acquired high-resolution images. Scene classification is an important way of understanding the information in remote sensing images: the "scene" expressed by the content of a remote sensing image is understood and labeled as a certain semantic category. It is of great significance for the interpretation of images and the understanding of the real world, and is one of the research hot spots in the remote sensing field.
According to the feature extraction method used, scene classification methods can be divided into three major categories: methods based on low-level visual features, methods based on mid-level visual representations, and methods based on high-level visual information. Traditional methods for remote sensing image scene classification mainly extract low-level and mid-level image features, which represent concrete attributes of the image such as color and shape. Deep learning methods mainly extract high-level semantic information, which represents abstract content of the image. Current research on remote sensing image scene classification tends toward automating feature extraction and classification, and deep-learning-based methods have been the most popular classification approach in recent years. However, it is very difficult to further improve accuracy in remote sensing scene image classification using traditional methods or deep learning methods alone, so how to efficiently improve remote sensing image classification accuracy remains an important topic.
Deep-learning-based methods require a large number of training samples and consume considerable computing resources. Because preparing a large number of high-quality training samples takes a long time, improving the recognition accuracy of remote sensing ground features from small samples is an important current research direction. Data enhancement is an important means to this end: it can greatly reduce the time required for manual labeling and improve the extraction accuracy for target objects. Sample enhancement techniques based on image processing are among the most commonly used methods, but they may cause loss of key information.
Data enhancement methods based on image processing are the current baseline, and include many classical operations such as horizontal/vertical flipping, color space transformation, translation, rotation, and noise injection. Flipping about the horizontal or vertical axis is very widely applied; it is one of the easiest extensions to implement and has proven useful on datasets such as CIFAR-10 and ImageNet.
(1) Flipping (Flip)
The image is simply flipped horizontally or vertically to expand the image dataset. However, some frameworks do not provide vertical flipping, since for some data this operation does not yield a meaningful expansion.
(2) Rotation (Rotation)
The image is randomly rotated by a certain angle clockwise or counterclockwise. Rotation by a given angle may change the image size: a square image rotated by a right angle keeps its size, whereas a rectangular image has its length and width exchanged, and rotation by finer angles likewise changes the final image size.
(3) Scaling (Scale)
Image scaling falls into two categories. One is outward scaling, where the final image is larger than the original; most image manipulation methods then cut out a part of the new image equal in size to the original. The other is inward scaling, where the missing portions are filled with fixed pixel values.
(4) Cropping (Crop)
Unlike rotation-type operations, cropping changes the content extent of an image. Random cropping is generally used: a portion of the original image is randomly sampled and then resized to the size of the original. Cropping followed by resizing in this way produces an effect similar to the scaling method.
(5) Translation (Shift)
The image is translated by a certain number of pixels in the horizontal, vertical, or diagonal direction. This data enhancement method is very practical, because appropriate shifts expose the model to image features at all positions, effectively improving the training of the model. A minimal composition of these classical operations is sketched below.
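For illustration, the five classical operations above can be composed with an off-the-shelf image library. The following minimal sketch assumes torchvision is available; the parameter values are arbitrary illustrative choices, not values prescribed by this document:

```python
from torchvision import transforms  # assumed available; other image libraries work similarly

# A minimal composition of the classical enhancement operations listed above.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                    # (1) flipping
    transforms.RandomRotation(degrees=15),                     # (2) rotation
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),  # (3)+(4) scaling and cropping
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # (5) translation
])

# augmented = augment(pil_image)  # apply to a PIL image
```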
On text recognition datasets (e.g., MNIST, Street View House Numbers), however, these methods may destroy label information: a flipped text sample becomes meaningless, or even takes on a different meaning (for example, 6 and 9 under rotation).
Likewise, on enhanced samples of images such as remote sensing scenes, these methods can lose key information and, in severe cases, even change the semantic information of the image. Such erroneous samples reduce the overall quality of the dataset, which weakens the generalization ability of the trained neural network model and leads to a large number of false identifications in application.
Disclosure of Invention
The invention aims to perform sample enhancement on a small number of images while preserving more key information, so as to improve the quality of the sample data set.
In order to achieve the above object, the present invention provides a sample data enhancement method, including:
training the convolutional neural network by using a first sample to obtain a first model;
preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
The principle of the method is as follows. A first model is first obtained by training, and the first model is then used to obtain a class activation map corresponding to a target in the first image. The class activation map addresses the insensitivity to class information representation in the field of image processing and recognition, i.e., the fact that a neural network cannot intuitively explain why it can determine the class to which the information in an image belongs. Class activation maps are a feature learning method for discriminative region localization: they operate on classification CNNs trained with global average pooling and enable the network to learn object localization without using any bounding-box annotation. The class activation map is then used to locate the key information in the first image, the positioning information is used to judge whether the enhanced image has lost too much key information after data enhancement, and enhanced images with small loss are retained, thereby achieving data enhancement while preserving the key information of the original image.
Preferably, the learning rate of the convolutional neural network training is adjusted by an exponential decay schedule.
The learning rate is an important hyperparameter in neural network optimization. In gradient descent, its value is critical: if it is too large, training cannot converge; if it is too small, convergence is too slow. Exponential decay is an ordered adjustment strategy in which the rate is adjusted according to a fixed rule. Its advantage is that the initially large learning rate helps the optimization escape local optima, while the later small learning rate aids convergence and fine refinement of the model.
Preferably, the learning rate of the convolutional neural network training is adjusted using the following formula:

$lr = lr_0 \times \gamma^{\lfloor epochnum / step \rfloor}$

where $\gamma$ is a decay coefficient, $epochnum$ is the number of iterations, $step$ is the step size, $lr_0$ is the initial learning rate, and $lr$ is the adjusted learning rate.
Preferably, the class activation map corresponding to the target in the first image is obtained using the following formulas:

$S_c = \sum_k w_k^c F_k$

$M_c(x,y) = \sum_k w_k^c f_k(x,y)$

where $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid; the result of global average pooling of each unit $k$ is $F_k$; for object class $c$, the input value of the classifier is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$; and $M_c(x,y)$ denotes the weighted sum of the activation units identified with class $c$, i.e., the class activation map.
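A minimal NumPy sketch of these two formulas (the array shapes and function names are assumptions made for illustration):

```python
import numpy as np

def class_score(features: np.ndarray, weights: np.ndarray, c: int) -> float:
    """S_c = sum_k w_k^c * F_k, with F_k the global average pool of unit k.

    features: last-conv-layer activations f_k(x, y), shape (K, H, W)
    weights:  classifier weights w_k^c, shape (num_classes, K)
    """
    F = features.mean(axis=(1, 2))  # global average pooling, shape (K,)
    return float(weights[c] @ F)

def class_activation_map(features: np.ndarray, weights: np.ndarray, c: int) -> np.ndarray:
    """M_c(x, y) = sum_k w_k^c * f_k(x, y), normalised to [0, 1]."""
    cam = np.tensordot(weights[c], features, axes=([0], [0]))  # shape (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()  # values can now be read as per-pixel probabilities
    return cam
```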
Preferably, the method calculates the loss of key information through the Intersection over True (IoT) ratio, which is computed with the following formula:

$IoT = \dfrac{area(B_t \cap B_e)}{area(B_t)}$

where $IoT$ is the Intersection over True ratio, $B_t$ is the position and area of the target prediction box in the first image acquired by the first model, and $B_e$ is the position and area of the boundary of the enhanced image within the first image.
The invention proposes this information loss evaluation method, derived from the IoU method, to evaluate whether the amount of sample information retained by the sample enhancement method is acceptable, i.e., to quantitatively analyze the loss of key information.
Preferably, if the computed IoT is greater than a threshold, it is determined that the loss exceeds the threshold, and the enhanced image is ignored; if the computed IoT is less than or equal to the threshold, it is determined that the loss is less than or equal to the threshold, and the second image is obtained based on the enhanced image.
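A minimal sketch of the IoT computation and the threshold test; boxes are assumed to be (x1, y1, x2, y2) tuples, a convention chosen here for illustration:

```python
def iot(truth_box, enh_box):
    """Intersection over True: area(truth ∩ enhanced) / area(truth)."""
    ax1, ay1, ax2, ay2 = truth_box  # target prediction box in the first image
    bx1, by1, bx2, by2 = enh_box    # enhanced-image boundary in the first image
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # intersection height
    return (iw * ih) / max((ax2 - ax1) * (ay2 - ay1), 1e-9)

def keep_enhanced(truth_box, enh_box, threshold=0.5):
    # Per the rule above: an IoT above the threshold means too much key
    # information was lost, so the enhanced image is discarded.
    return iot(truth_box, enh_box) <= threshold
```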
Preferably, locating the key information in the first image based on the class activation map to obtain the positioning information specifically includes:
converting the class activation map into a binary map according to the probability information of the target object in the class activation map, and obtaining the positioning information of the target object through the binary map and region connectivity.
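A minimal sketch of this localization step, assuming SciPy is available for connected-component labelling and using 0.5 as an illustrative binarization threshold:

```python
import numpy as np
from scipy import ndimage  # assumed available for connected-component labelling

def locate_key_region(cam: np.ndarray, prob_threshold: float = 0.5):
    """Binarise the CAM and return the bounding box (x1, y1, x2, y2)
    of the largest connected foreground region, or None if there is none."""
    binary = cam >= prob_threshold
    labels, n = ndimage.label(binary)  # label the connected regions
    if n == 0:
        return None
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    largest = int(np.argmax(sizes)) + 1  # label of the largest region
    ys, xs = np.nonzero(labels == largest)
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```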
Preferably, performing data enhancement processing on the first image to obtain an enhanced image specifically includes:
flipping the first image;
and/or, performing rotation processing on the first image;
and/or, scaling the first image;
and/or, performing cropping processing on the first image;
and/or, performing translation processing on the first image;
and/or, performing noise injection processing on the first image.
Preferably, when the data enhancement processing mode is noise injection, the method specifically includes:
calculating the weight information for the Gaussian noise to be added according to the probability information of the target object in the class activation map;
resampling the computed weight information to the size of the first image and multiplying it by a Gaussian noise matrix to obtain the information to be added;
and adding that information to the first image to obtain the enhanced image.
Gaussian noise is one of the most commonly used choices in noise-injection image enhancement. It is typically sensor noise caused by poor illumination and high temperature, follows a Gaussian (normal) distribution, and shows up most clearly in RGB images. The invention proposes an improved method that combines the class activation map (CAM) with Gaussian noise.
The weight of the Gaussian noise to be added is calculated from the probability information of the target object in the CAM, so that less noise is injected into regions with a higher probability of containing the target object. Enhancement data produced by Gaussian noise injection computed through the class activation map retains more target object information and distinguishes foreground from background, helping the deep learning network learn feature information in a more targeted way.
Preferably, the method calculates the weight information added to the Gaussian noise using the following formula:

$G_c(x,y) = 1 - \sum_k w_k^c f_k(x,y)$

where $G_c(x,y)$ denotes the weight with which the different activation units add Gaussian noise, $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid, and $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$.
The present invention also provides a sample data enhancement system, said system comprising:
the training unit is used for training the convolutional neural network by utilizing a first sample to obtain a first model;
the data enhancement unit is used for preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
The invention also provides a target classification method, which comprises the following steps:
obtaining a basic sample;
performing data enhancement processing on the basic sample based on the sample data enhancement method to obtain a training sample;
constructing a first target classification model, and training the first target classification model by using the training sample to obtain a second target classification model;
and acquiring an image to be processed, inputting the image to be processed into the second target classification model, and outputting a target classification result in the image to be processed.
The invention also provides a sample data enhancement device, which comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the sample data enhancement method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the sample data enhancement method.
One or more technical schemes provided by the invention at least have the following technical effects or advantages:
the method can effectively enhance the data of the sample, and the enhanced sample can better retain the key information in the original sample.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention;
FIG. 1 is a schematic flow chart of a sample data enhancement method;
FIG. 2 is a technical flow chart of a sample enhancement method based on a class activation mechanism;
fig. 3 is a schematic diagram of the composition of the sample data enhancement system.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflicting with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention; however, the present invention may be practiced in ways other than those specifically described, and thus the scope of the present invention is not limited by the specific embodiments disclosed below.
It should be understood that "system," "device," "unit," and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions, or assemblies at different levels. Other words may be substituted if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include plural referents as well, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flowcharts are used in this specification to illustrate the operations performed by the system according to embodiments of this specification. It should be understood that the operations are not necessarily performed exactly in the order shown. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or several steps may be removed from them.
Example one
Referring to fig. 1, which is a schematic flow chart of a sample data enhancement method, the first embodiment of the present invention provides a sample data enhancement method, including:
training the convolutional neural network by using a first sample to obtain a first model;
preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
This embodiment provides a supervised data enhancement method that combines the class activation map with traditional data enhancement methods; the steps of the scheme are as follows:
Pre-training a model:
the small sample dataset before augmentation was first trained using classical convolutional neural networks (ResNet-18, SqueezeNet, DenseNet) to obtain an initial model. The convolutional neural network can adopt networks in various forms or frames in the prior art, the specific form, implementation mode and specific structure of the convolutional neural network are not specifically limited, and the related functions in the invention can be realized. For optimizing the model training, the learning rate is adjusted by an Exponential slow-down (explicit slow), that is, the learning rate of the model training is adjusted by an Exponential slow-down method, as shown in the following formula:
$lr = lr_0 \times \gamma^{\lfloor epochnum / step \rfloor}$

where $\gamma$ is a coefficient with value 0.85, the initial learning rate $lr_0$ is 0.001, $epochnum$ is the number of iterations, $step$ is the step size with value 10, and $lr$ is the adjusted learning rate.
Generating a CAM from the pre-trained model:
Next, a class activation map of the ground-truth label is generated with the initial model; the probability of the target object in the sample image can be obtained through this class activation map. Bolei Zhou et al. proposed the Class Activation Map (CAM) at CVPR 2016 to address the insensitivity to class information representation in the field of image processing and recognition, i.e., the fact that a neural network cannot intuitively explain why it can determine the class to which the information in an image belongs. Class activation maps are a feature learning method for discriminative region localization: they operate on classification CNNs trained with global average pooling and enable the network to learn object localization without using any bounding-box annotation. The CAM of an object in the image is derived as in the following formulas:
$S_c = \sum_k w_k^c F_k$

$M_c(x,y) = \sum_k w_k^c f_k(x,y)$

where $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid, and the result of global average pooling (GAP) of each unit $k$ is $F_k$. For object class $c$, the input value of the softmax classifier is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$. $M_c(x,y)$, the weighted sum of the activation units identified with class $c$, is resampled to the original image size to obtain the class activation map (CAM) of class $c$. Class activation mapping allows the predicted class score to be visualized on any given image, highlighting the discriminative object parts detected by the CNN.
Key information positioning based on CAM and IoT:
IoU, the Intersection over Union ratio, is a measure of the accuracy of detecting a corresponding object in a particular dataset; the higher the overlap, the larger the value. IoU is calculated as the ratio of the intersection and the union of the "predicted bounding box" and the "true bounding box," as shown in the following equation.
$IoU = \dfrac{area(B_p \cap B_t)}{area(B_p \cup B_t)}$

where $B_p$ is the predicted bounding box and $B_t$ the true bounding box.
To evaluate whether the amount of sample information retained by the sample enhancement method is acceptable, i.e., to quantitatively analyze the loss of key information, the invention proposes an information loss evaluation method based on the IoU method; it evaluates by computing IoT (Intersection over True), as in the following formula.
$IoT = \dfrac{area(B_t \cap B_e)}{area(B_t)}$

In IoT, $B_t$ is defined as the position and area of the target prediction box in the original image acquired by the pre-trained model, where the position refers to its location in the image; $B_e$ is the position and area, in the original image, of the boundary of the new image obtained after image processing (e.g., random translation, random cropping). Once both positions are known, the intersection of the two boxes is taken, and the area of this intersection is divided by the area of $B_t$ to yield the value of IoT.
IoU can be used to evaluate the positioning accuracy of target detection in a neural network: the closer the IoU value is to 1, the more consistent the detection box is with the true label box in position and size. However, IoU cannot readily judge how much information the prediction box has lost compared with the true box, which is why the invention proposes IoT.
Performing constrained dataset augmentation based on the key information:
Finally, under the supervision of the CAM, the method performs sample augmentation while ensuring that a large amount of key information is not lost. A dataset augmented in this semi-supervised manner lets the model learn more correct information and improves recognition accuracy. Two methods are distinguished according to the image processing used, as shown in fig. 2; "1-CAM" in fig. 2 denotes 1 minus the value of the CAM (for example, a pixel with value 0.2 in the CAM becomes 0.8).
Method 1: CAM-based random cropping/random translation
Image cropping randomly samples a portion of an original image and resizes that portion to the size of the original image; this is commonly called random cropping. Random image translation shifts the image by a certain number of pixels horizontally, vertically, or in both directions. The approach is very practical, because appropriate shifts expose the model to image features at all positions, effectively improving the training of the model. These methods have a drawback: the selected region may not contain the real target area, or a large amount of key information may be lost, and such information loss can make the label of the training sample wrong. The method specifically includes:
1) The CAM is converted into a binary map according to its target object probability information, and the approximate region of the target object is obtained through the binary map and region connectivity. Region connectivity can be obtained through an image algorithm: connected pixels are treated as one region, and the minimum bounding rectangle of the connected region with the largest area is taken as the required approximate region.
2) The sample to be processed is randomly cropped or randomly translated, and IoT is used to evaluate whether key information has been lost: if the IoT value is less than 0.5, the sample is qualified; otherwise it is unqualified (a minimal version of this loop is sketched after this list).
3) Sampling the dataset with this method yields high-quality enhanced samples, enabling the neural network to learn more useful features.
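A minimal sketch of this constrained augmentation loop, reusing the iot()/keep_enhanced() helpers sketched earlier; the crop size and retry count are illustrative assumptions:

```python
import random

def constrained_random_crop(image, truth_box, iot_threshold=0.5, max_tries=20):
    """Draw random crops until one passes the IoT check, else give up."""
    h, w = image.shape[:2]
    ch, cw = h // 2, w // 2                       # illustrative crop size
    for _ in range(max_tries):
        x1 = random.randint(0, w - cw)
        y1 = random.randint(0, h - ch)
        crop_box = (x1, y1, x1 + cw, y1 + ch)     # crop boundary in the original
        if keep_enhanced(truth_box, crop_box, iot_threshold):
            return image[y1:y1 + ch, x1:x1 + cw]  # qualified enhanced sample
    return None                                    # no qualified crop was found
```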
Method 2: CAM-based Gaussian noise injection
Gaussian noise is one of the most commonly used choices in noise-injection image enhancement. It is typically sensor noise caused by poor illumination and high temperature, follows a Gaussian (normal) distribution, and shows up most clearly in RGB images. The invention proposes an improved method that combines the class activation map (CAM) with Gaussian noise, specifically comprising:
1) The weight added to the Gaussian noise is calculated from the target object probability information of the CAM, so that less noise is injected into regions with a higher probability of containing the target object; the weight is computed as in the following formula. Enhancement data produced by Gaussian noise injection computed through the class activation map retains more target object information and distinguishes foreground from background, helping the deep learning network learn feature information in a more targeted way.
$G_c(x,y) = 1 - \sum_k w_k^c f_k(x,y)$

where $G_c(x,y)$ denotes the weight with which the different activation units add Gaussian noise.
2) The obtained weight map is resampled to the size of the original image, multiplied by the Gaussian noise matrix, and finally added to the original image to obtain the enhanced image.
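A minimal sketch of this injection step; the noise level sigma and the nearest-neighbour resampling are illustrative assumptions:

```python
import numpy as np

def cam_weighted_gaussian_noise(image: np.ndarray, cam: np.ndarray,
                                sigma: float = 10.0) -> np.ndarray:
    """Add Gaussian noise scaled by the (1 - CAM) weight map, so that regions
    with a high target-object probability receive less noise.

    image: H x W x C uint8 array; cam: normalised CAM at feature-map size.
    """
    h, w = image.shape[:2]
    # Nearest-neighbour resampling of the weight map to the image size.
    ys = np.minimum(np.arange(h) * cam.shape[0] // h, cam.shape[0] - 1)
    xs = np.minimum(np.arange(w) * cam.shape[1] // w, cam.shape[1] - 1)
    weight = (1.0 - cam)[np.ix_(ys, xs)]                    # H x W weight map
    noise = np.random.normal(0.0, sigma, size=image.shape)  # Gaussian noise matrix
    noisy = image.astype(np.float64) + weight[..., None] * noise
    return np.clip(noisy, 0, 255).astype(image.dtype)
```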
The scheme provided by the invention explores the importance of existing sample enhancement methods for remote sensing images and their effect on improving deep learning networks;
it innovatively sets out a quantitative evaluation method for key information in image samples;
and it improves on existing methods by introducing the CAM probability distribution map, which guides sample augmentation and can improve the quality of the dataset.
Example two
Referring to fig. 3, which is a schematic diagram of the composition of the sample data enhancement system, the second embodiment of the present invention provides a sample data enhancement system comprising:
the training unit is used for training the convolutional neural network by utilizing a first sample to obtain a first model;
the data enhancement unit is used for preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
Example three
The third embodiment of the invention provides a target classification method, which comprises the following steps:
obtaining a basic sample;
performing data enhancement processing on the basic sample based on the sample data enhancement method to obtain a training sample;
constructing a first target classification model, and training the first target classification model by using the training sample to obtain a second target classification model;
and acquiring an image to be processed, inputting the image to be processed into the second target classification model, and outputting a target classification result in the image to be processed.
Example four
The fourth embodiment of the present invention provides a sample data enhancement apparatus, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of the sample data enhancement method when executing the computer program.
Example five
An embodiment five of the present invention provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the sample data enhancement method are implemented.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the sample data enhancement device by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function or an image playing function). Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or other solid-state storage device.
If the sample data enhancement device is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method in the embodiments of the present invention may also be implemented by a computer program stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunications signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction.
Having described the basic concept of the invention, it should be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely as illustrative and not restrictive of the broad invention. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be embodied as a computer product, including computer-readable program code, embodied in one or more computer-readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, or Python, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, or ABAP, a dynamic programming language such as Python, Ruby, or Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which elements and sequences are described in this specification, the use of numbers and letters, and other designations are not intended to limit the order of the processes and methods described herein, unless explicitly stated in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, the claimed embodiments may have fewer than all of the features of a single embodiment disclosed above.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents are hereby incorporated by reference. Application history documents that are inconsistent with or conflict with the contents of this specification are excepted, as are documents that would limit the broadest scope of the claims of this specification (currently or later appended). It should be understood that if there is any inconsistency or conflict between the descriptions, definitions, and/or use of terms in the accompanying materials of this specification and the contents of this specification, the descriptions, definitions, and/or use of terms in this specification shall prevail.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (14)

1. A method for sample data enhancement, the method comprising:
training the convolutional neural network by using a first sample to obtain a first model;
preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
2. The method of claim 1, wherein the learning rate of the convolutional neural network training is adjusted by exponential decay.
3. The method of claim 2, wherein the learning rate of the convolutional neural network training is adjusted using the following formula:

$lr = lr_0 \times \gamma^{\lfloor epochnum / step \rfloor}$

wherein $\gamma$ is a decay coefficient, $epochnum$ is the number of iterations, $step$ is the step size, $lr_0$ is the initial learning rate, and $lr$ is the adjusted learning rate.
4. The method according to claim 1, wherein the class activation map corresponding to the object in the first image is obtained using the following formulas:

$S_c = \sum_k w_k^c F_k$

$M_c(x,y) = \sum_k w_k^c f_k(x,y)$

wherein $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid; the result of global average pooling of each unit $k$ is $F_k$; for object class $c$, the input value of the classifier is $S_c = \sum_k w_k^c F_k$, wherein $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$; and $M_c(x,y)$ denotes the weighted sum of the activation units identified with class $c$, i.e., the class activation map.
5. The sample data enhancement method according to claim 1, wherein the loss of key information is calculated through the Intersection over True (IoT) ratio, which is calculated with the following formula:

$IoT = \dfrac{area(B_t \cap B_e)}{area(B_t)}$

wherein $IoT$ is the Intersection over True ratio, $B_t$ is the position and area of the target prediction box in the first image acquired by the first model, and $B_e$ is the position and area of the boundary of the enhanced image within the first image.
6. The sample data enhancement method according to claim 5, wherein if the computed IoT is greater than a threshold, it is determined that the loss exceeds the threshold and the enhanced image is ignored; and if the computed IoT is less than or equal to the threshold, it is determined that the loss is less than or equal to the threshold, and the second image is obtained based on the enhanced image.
7. The sample data enhancement method according to claim 1, wherein locating the key information in the first image based on the class activation map to obtain the positioning information specifically includes:
converting the class activation map into a binary map according to the probability information of the target object in the class activation map, and obtaining the positioning information of the target object through the binary map and region connectivity.
8. The method of claim 1, wherein performing data enhancement processing on the first image to obtain an enhanced image comprises:
flipping the first image;
and/or, performing rotation processing on the first image;
and/or, scaling the first image;
and/or, performing cropping processing on the first image;
and/or, performing translation processing on the first image;
and/or, performing noise injection processing on the first image.
9. The method according to claim 8, wherein when the data enhancement processing mode is noise injection, the method specifically comprises:
calculating the weight information for the Gaussian noise to be added according to the probability information of the target object in the class activation map;
resampling the computed weight information to the size of the first image and multiplying it by a Gaussian noise matrix to obtain the information to be added;
and adding that information to the first image to obtain the enhanced image.
10. The method of claim 9, wherein the method calculates the weight information added to the Gaussian noise using the following formula:

$G_c(x,y) = 1 - \sum_k w_k^c f_k(x,y)$

wherein $G_c(x,y)$ denotes the weight with which the different activation units add Gaussian noise, $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid, and $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$.
11. A sample data enhancement system, characterized in that said system comprises:
the training unit is used for training the convolutional neural network by utilizing a first sample to obtain a first model;
the data enhancement unit is used for preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
12. A sample data enhancement device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the sample data enhancement method according to any one of claims 1 to 10 when executing the computer program.
13. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the sample data enhancement method according to any one of claims 1 to 10.
14. A target classification method, the method comprising:
obtaining a basic sample;
performing data enhancement processing on the basic sample based on the sample data enhancement method according to any one of claims 1 to 10 to obtain a training sample;
constructing a first target classification model, and training the first target classification model by using the training sample to obtain a second target classification model;
and acquiring an image to be processed, inputting the image to be processed into the second target classification model, and outputting a target classification result for the image to be processed.
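A minimal sketch of the classification pipeline in claim 14, assuming a PyTorch model; the loader of enhanced training samples, the epoch count, and the learning rate are illustrative:

    import torch
    import torch.nn as nn

    def train_and_classify(model, train_loader, image, epochs=10, lr=1e-3):
        # Train the first target classification model on the enhanced training
        # samples to obtain the second model, then classify a new image with it.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for images, labels in train_loader:
                opt.zero_grad()
                loss = loss_fn(model(images), labels)
                loss.backward()
                opt.step()
        model.eval()
        with torch.no_grad():
            logits = model(image.unsqueeze(0))  # image to be processed
        return logits.argmax(dim=1).item()      # target classification result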
CN202210509914.5A 2022-05-11 2022-05-11 Sample data enhancement method, system and device, medium and target classification method Pending CN114612732A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210509914.5A CN114612732A (en) 2022-05-11 2022-05-11 Sample data enhancement method, system and device, medium and target classification method

Publications (1)

Publication Number Publication Date
CN114612732A 2022-06-10

Family

ID=81870576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210509914.5A Pending CN114612732A (en) 2022-05-11 2022-05-11 Sample data enhancement method, system and device, medium and target classification method

Country Status (1)

Country Link
CN (1) CN114612732A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110832499A (en) * 2017-11-14 2020-02-21 谷歌有限责任公司 Weak supervision action localization over sparse time pooling networks
CN109410204A (en) * 2018-10-31 2019-03-01 电子科技大学 A kind of processing of cortex cataract image and Enhancement Method based on CAM
CN110689081A (en) * 2019-09-30 2020-01-14 中国科学院大学 Weak supervision target classification and positioning method based on bifurcation learning
US20210133943A1 (en) * 2019-10-31 2021-05-06 Lg Electronics Inc. Video data quality improving method and apparatus
CN111860789A (en) * 2020-07-31 2020-10-30 Oppo广东移动通信有限公司 Model training method, terminal and storage medium
CN113033549A (en) * 2021-03-09 2021-06-25 北京百度网讯科技有限公司 Training method and device for positioning diagram acquisition model
CN113657285A (en) * 2021-08-18 2021-11-16 中国人民解放军陆军装甲兵学院 Real-time target detection method based on small-scale target
CN113721905A (en) * 2021-08-30 2021-11-30 武汉真蓝三维科技有限公司 Code-free programming system and method for three-dimensional digital software development

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
RAMPRASAATH R. SELVARAJU et al.: "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization", 2017 IEEE International Conference on Computer Vision *
WEI ZHANG et al.: "A new data augmentation method of remote sensing dataset", 2021 International Conference on Computer Technology and GIS *
ZHOU B et al.: "Learning Deep Features for Discriminative Localization", 2016 IEEE Conference on Computer Vision and Pattern Recognition *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821204A (en) * 2022-06-30 2022-07-29 山东建筑大学 Meta-learning-based embedded semi-supervised learning image classification method and system
CN114896307A (en) * 2022-06-30 2022-08-12 北京航空航天大学杭州创新研究院 Time series data enhancement method and device and electronic equipment
CN114896307B (en) * 2022-06-30 2022-09-27 北京航空航天大学杭州创新研究院 Time series data enhancement method and device and electronic equipment
CN116863277A (en) * 2023-07-27 2023-10-10 北京中关村科金技术有限公司 RPA-combined multimedia data detection method and system
CN117636073A (en) * 2024-01-24 2024-03-01 贵州科筑创品建筑技术有限公司 Concrete defect detection method, device and storage medium
CN117636073B (en) * 2024-01-24 2024-04-26 贵州科筑创品建筑技术有限公司 Concrete defect detection method, device and storage medium

Similar Documents

Publication Publication Date Title
CN111080628B (en) Image tampering detection method, apparatus, computer device and storage medium
CN114612732A (en) Sample data enhancement method, system and device, medium and target classification method
CN106547880B (en) Multi-dimensional geographic scene identification method fusing geographic area knowledge
CN105868758B (en) method and device for detecting text area in image and electronic equipment
CN108537269B (en) Weak interactive object detection deep learning method and system thereof
CN108399386A (en) Information extracting method in pie chart and device
CN112580507B (en) Deep learning text character detection method based on image moment correction
WO2021077947A1 (en) Image processing method, apparatus and device, and storage medium
CN113673338A (en) Natural scene text image character pixel weak supervision automatic labeling method, system and medium
CN112800955A (en) Remote sensing image rotating target detection method and system based on weighted bidirectional feature pyramid
CN112101344B (en) Video text tracking method and device
Zhang et al. Small object detection in remote sensing images based on attention mechanism and multi-scale feature fusion
CN113657393B (en) Shape prior missing image semi-supervised segmentation method and system
CN110991437A (en) Character recognition method and device, and training method and device of character recognition model
Wan et al. Random Interpolation Resize: A free image data augmentation method for object detection in industry
CN113378642B (en) Method for detecting illegal occupation buildings in rural areas
CN116152575B (en) Weak supervision target positioning method, device and medium based on class activation sampling guidance
KR102026280B1 (en) Method and system for scene text detection using deep learning
Zhou et al. Self-supervised saliency estimation for pixel embedding in road detection
CN114067221B (en) Remote sensing image woodland extraction method, system, device and medium
CN114758123A (en) Remote sensing image target sample enhancement method
CN116543246A (en) Training method of image denoising model, image denoising method, device and equipment
CN114241470A (en) Natural scene character detection method based on attention mechanism
CN117422787B (en) Remote sensing image map conversion method integrating discriminant and generative model
Wu et al. Industrial equipment detection algorithm under complex working conditions based on ROMS R-CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220610)