CN114612732A - Sample data enhancement method, system and device, medium and target classification method - Google Patents


Info

Publication number
CN114612732A
CN114612732A (application number CN202210509914.5A)
Authority
CN
China
Prior art keywords
image
sample
information
data enhancement
enhanced
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210509914.5A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Shuzhilian Technology Co Ltd
Original Assignee
Chengdu Shuzhilian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Shuzhilian Technology Co Ltd filed Critical Chengdu Shuzhilian Technology Co Ltd
Priority to CN202210509914.5A
Publication of CN114612732A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/187 Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a sample data enhancement method, system, device, and medium, and a target classification method, relating to the field of data enhancement. The method comprises the following steps: training a convolutional neural network with a first sample to obtain a first model; using the first model to obtain a class activation map corresponding to a target in a first image; locating key information in the first image based on the class activation map to obtain positioning information; performing data enhancement processing on the first image to obtain an enhanced image; and obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, the enhanced image is ignored, and if the loss is less than or equal to the threshold, a second image is obtained based on the enhanced image. The method can effectively enhance the sample data, and the enhanced samples better retain the key information of the original samples.

Description

Sample data enhancement method, system and device, medium and target classification method
Technical Field
The present invention relates to the field of data enhancement, and in particular to a sample data enhancement method, system, device, and medium, and to a target classification method.
Background
Since the launch of the first artificial satellite in 1957, more than sixty years of progress in geospatial science and technology have greatly improved the resolution, particularly the spatial resolution, of remote sensing images; many satellites, such as IKONOS, QuickBird, WorldView, GeoEye-1, Pléiades, Gaofen-2, and Ziyuan-3, now deliver imagery at meter or even sub-meter spatial resolution. Compared with natural images, high-resolution remote sensing images carry richer spectral information, shape and texture characteristics, and scene semantic information. Semantic categories are high-level abstractions and generalizations of scene content that are predefined manually, so a series of interpretation steps is required to extract information meaningful to production activities from the acquired high-resolution images. Scene classification is an important way of understanding the information in remote sensing images: the "scene" expressed by the content of a remote sensing image is understood and labeled as a certain semantic category. It is of great significance for the interpretation of images and the understanding of the real world, and is one of the research hot spots in the remote sensing field.
According to the feature extraction method used, scene classification methods can be divided into three major categories: methods based on low-level visual features, methods based on mid-level visual representations, and methods based on high-level visual information. Traditional methods for remote sensing image scene classification mainly extract low-level and mid-level image features, which represent concrete attributes of the image such as color and shape. Deep learning methods mainly extract high-level semantic information, which represents abstract content of the image. Current research on remote sensing image scene classification tends toward automating feature extraction and classification, and deep-learning-based methods have been the most popular classification approach in recent years. However, it is very difficult to further improve accuracy in remote sensing scene image classification using traditional methods or deep learning methods alone, so how to efficiently improve remote sensing image classification accuracy remains an important topic.
Deep-learning-based methods require a large number of training samples and consume considerable computing resources. Because preparing a large number of high-quality training samples takes a long time, improving the recognition accuracy of remote sensing ground features from small samples is an important current research direction. Data enhancement is an important means to this end: it can greatly reduce the time required for manual labeling and improve the extraction accuracy for target objects. Sample enhancement techniques based on image processing are among the most commonly used methods, but they may cause loss of key information.
Data enhancement methods based on image processing are the current baseline, and include many classical operations such as horizontal/vertical flipping, color space transformation, translation, rotation, and noise injection. Flipping about the horizontal or vertical axis is very widely applied; it is one of the easiest extensions to implement and has proven useful on datasets such as CIFAR-10 and ImageNet.
(1) Flipping (Flip)
The image is simply flipped horizontally or vertically to expand the image dataset. However, some frameworks do not provide vertical flipping, since for some data this operation does not yield a meaningful expansion.
(2) Rotation (Rotation)
The image is randomly rotated by a certain angle clockwise or counterclockwise. Rotation by a given angle may change the image size: a square image rotated by a right angle keeps its size, whereas a rectangular image has its length and width exchanged, and rotation by finer angles likewise changes the final image size.
(3) Scaling (Scale)
Image scaling falls into two categories. One is outward scaling, where the final image is larger than the original; most image manipulation methods then cut out a part of the new image equal in size to the original. The other is inward scaling, where the missing portions are filled with fixed pixel values.
(4) Cropping (Crop)
Unlike rotation-type operations, cropping changes the content extent of an image. Random cropping is generally used: a portion of the original image is randomly sampled and then resized to the size of the original. Cropping followed by resizing in this way produces an effect similar to the scaling method.
(5) Translation (Shift)
The image is translated by a certain number of pixels in the horizontal, vertical, or diagonal direction. This data enhancement method is very practical, because appropriate shifts expose the model to image features at all positions, effectively improving the training of the model. A minimal composition of these classical operations is sketched below.
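For illustration, the five classical operations above can be composed with an off-the-shelf image library. The following minimal sketch assumes torchvision is available; the parameter values are arbitrary illustrative choices, not values prescribed by this document:

```python
from torchvision import transforms  # assumed available; other image libraries work similarly

# A minimal composition of the classical enhancement operations listed above.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                    # (1) flipping
    transforms.RandomRotation(degrees=15),                     # (2) rotation
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),  # (3)+(4) scaling and cropping
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # (5) translation
])

# augmented = augment(pil_image)  # apply to a PIL image
```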
On text recognition datasets (e.g., MNIST, Street View House Numbers), however, these methods may destroy label information: a flipped text sample becomes meaningless, or even takes on a different meaning (for example, 6 and 9 under rotation).
Likewise, on enhanced samples of images such as remote sensing scenes, these methods can lose key information and, in severe cases, even change the semantic information of the image. Such erroneous samples reduce the overall quality of the dataset, which weakens the generalization ability of the trained neural network model and leads to a large number of false identifications in application.
Disclosure of Invention
The invention aims to perform sample enhancement on a small number of images while preserving more key information, so as to improve the quality of the sample data set.
In order to achieve the above object, the present invention provides a sample data enhancement method, including:
training the convolutional neural network by using a first sample to obtain a first model;
preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
The principle of the method is as follows. A first model is first obtained by training, and the first model is then used to obtain a class activation map corresponding to a target in the first image. The class activation map addresses the insensitivity to class information representation in the field of image processing and recognition, i.e., the fact that a neural network cannot intuitively explain why it can determine the class to which the information in an image belongs. Class activation maps are a feature learning method for discriminative region localization: they operate on classification CNNs trained with global average pooling and enable the network to learn object localization without using any bounding-box annotation. The class activation map is then used to locate the key information in the first image, the positioning information is used to judge whether the enhanced image has lost too much key information after data enhancement, and enhanced images with small loss are retained, thereby achieving data enhancement while preserving the key information of the original image.
Preferably, the learning rate of the convolutional neural network training is adjusted by an exponential decay schedule.
The learning rate is an important hyperparameter in neural network optimization. In gradient descent, its value is critical: if it is too large, training cannot converge; if it is too small, convergence is too slow. Exponential decay is an ordered adjustment strategy in which the rate is adjusted according to a fixed rule. Its advantage is that the initially large learning rate helps the optimization escape local optima, while the later small learning rate aids convergence and fine refinement of the model.
Preferably, the learning rate of the convolutional neural network training is adjusted using the following formula:

$lr = lr_0 \times \gamma^{\lfloor epochnum / step \rfloor}$

where $\gamma$ is a decay coefficient, $epochnum$ is the number of iterations, $step$ is the step size, $lr_0$ is the initial learning rate, and $lr$ is the adjusted learning rate.
Preferably, the class activation map corresponding to the target in the first image is obtained using the following formulas:

$S_c = \sum_k w_k^c F_k$

$M_c(x,y) = \sum_k w_k^c f_k(x,y)$

where $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid; the result of global average pooling of each unit $k$ is $F_k$; for object class $c$, the input value of the classifier is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$; and $M_c(x,y)$ denotes the weighted sum of the activation units identified with class $c$, i.e., the class activation map.
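A minimal NumPy sketch of these two formulas (the array shapes and function names are assumptions made for illustration):

```python
import numpy as np

def class_score(features: np.ndarray, weights: np.ndarray, c: int) -> float:
    """S_c = sum_k w_k^c * F_k, with F_k the global average pool of unit k.

    features: last-conv-layer activations f_k(x, y), shape (K, H, W)
    weights:  classifier weights w_k^c, shape (num_classes, K)
    """
    F = features.mean(axis=(1, 2))  # global average pooling, shape (K,)
    return float(weights[c] @ F)

def class_activation_map(features: np.ndarray, weights: np.ndarray, c: int) -> np.ndarray:
    """M_c(x, y) = sum_k w_k^c * f_k(x, y), normalised to [0, 1]."""
    cam = np.tensordot(weights[c], features, axes=([0], [0]))  # shape (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()  # values can now be read as per-pixel probabilities
    return cam
```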
Preferably, the method calculates the loss of key information through the Intersection over True (IoT) ratio, which is computed with the following formula:

$IoT = \dfrac{area(B_t \cap B_e)}{area(B_t)}$

where $IoT$ is the Intersection over True ratio, $B_t$ is the position and area of the target prediction box in the first image acquired by the first model, and $B_e$ is the position and area of the boundary of the enhanced image within the first image.
The invention proposes this information loss evaluation method, derived from the IoU method, to evaluate whether the amount of sample information retained by the sample enhancement method is acceptable, i.e., to quantitatively analyze the loss of key information.
Preferably, if the computed IoT is greater than a threshold, it is determined that the loss exceeds the threshold, and the enhanced image is ignored; if the computed IoT is less than or equal to the threshold, it is determined that the loss is less than or equal to the threshold, and the second image is obtained based on the enhanced image.
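A minimal sketch of the IoT computation and the threshold test; boxes are assumed to be (x1, y1, x2, y2) tuples, a convention chosen here for illustration:

```python
def iot(truth_box, enh_box):
    """Intersection over True: area(truth ∩ enhanced) / area(truth)."""
    ax1, ay1, ax2, ay2 = truth_box  # target prediction box in the first image
    bx1, by1, bx2, by2 = enh_box    # enhanced-image boundary in the first image
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # intersection height
    return (iw * ih) / max((ax2 - ax1) * (ay2 - ay1), 1e-9)

def keep_enhanced(truth_box, enh_box, threshold=0.5):
    # Per the rule above: an IoT above the threshold means too much key
    # information was lost, so the enhanced image is discarded.
    return iot(truth_box, enh_box) <= threshold
```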
Preferably, locating the key information in the first image based on the class activation map to obtain the positioning information specifically includes:
converting the class activation map into a binary map according to the probability information of the target object in the class activation map, and obtaining the positioning information of the target object through the binary map and region connectivity.
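A minimal sketch of this localization step, assuming SciPy is available for connected-component labelling and using 0.5 as an illustrative binarization threshold:

```python
import numpy as np
from scipy import ndimage  # assumed available for connected-component labelling

def locate_key_region(cam: np.ndarray, prob_threshold: float = 0.5):
    """Binarise the CAM and return the bounding box (x1, y1, x2, y2)
    of the largest connected foreground region, or None if there is none."""
    binary = cam >= prob_threshold
    labels, n = ndimage.label(binary)  # label the connected regions
    if n == 0:
        return None
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    largest = int(np.argmax(sizes)) + 1  # label of the largest region
    ys, xs = np.nonzero(labels == largest)
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```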
Preferably, performing data enhancement processing on the first image to obtain an enhanced image specifically includes:
flipping the first image;
and/or, performing rotation processing on the first image;
and/or, scaling the first image;
and/or, performing cropping processing on the first image;
and/or, performing translation processing on the first image;
and/or, performing noise injection processing on the first image.
Preferably, when the data enhancement processing mode is noise injection, the method specifically includes:
calculating the weight information for the Gaussian noise to be added according to the probability information of the target object in the class activation map;
resampling the computed weight information to the size of the first image and multiplying it by a Gaussian noise matrix to obtain the information to be added;
and adding that information to the first image to obtain the enhanced image.
Gaussian noise is one of the most commonly used choices in noise-injection image enhancement. It is typically sensor noise caused by poor illumination and high temperature, follows a Gaussian (normal) distribution, and shows up most clearly in RGB images. The invention proposes an improved method that combines the class activation map (CAM) with Gaussian noise.
The weight of the Gaussian noise to be added is calculated from the probability information of the target object in the CAM, so that less noise is injected into regions with a higher probability of containing the target object. Enhancement data produced by Gaussian noise injection computed through the class activation map retains more target object information and distinguishes foreground from background, helping the deep learning network learn feature information in a more targeted way.
Preferably, the method calculates the weight information added to the Gaussian noise using the following formula:

$G_c(x,y) = 1 - \sum_k w_k^c f_k(x,y)$

where $G_c(x,y)$ denotes the weight with which the different activation units add Gaussian noise, $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid, and $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$.
The present invention also provides a sample data enhancement system, said system comprising:
the training unit is used for training the convolutional neural network by utilizing a first sample to obtain a first model;
the data enhancement unit is used for preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
The invention also provides a target classification method, which comprises the following steps:
obtaining a basic sample;
performing data enhancement processing on the basic sample based on the sample data enhancement method to obtain a training sample;
constructing a first target classification model, and training the first target classification model by using the training sample to obtain a second target classification model;
and acquiring an image to be processed, inputting the image to be processed into the second target classification model, and outputting a target classification result in the image to be processed.
The invention also provides a sample data enhancement device, which comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the sample data enhancement method when executing the computer program.
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the sample data enhancement method.
One or more technical schemes provided by the invention at least have the following technical effects or advantages:
the method can effectively enhance the data of the sample, and the enhanced sample can better retain the key information in the original sample.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention;
FIG. 1 is a schematic flow chart of a sample data enhancement method;
FIG. 2 is a technical flow chart of a sample enhancement method based on a class activation mechanism;
fig. 3 is a schematic diagram of the composition of the sample data enhancement system.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflicting with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention; however, the present invention may be practiced in ways other than those specifically described, and thus the scope of the present invention is not limited by the specific embodiments disclosed below.
It should be understood that "system," "device," "unit," and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions, or assemblies at different levels. Other words may be substituted if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include plural referents as well, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flowcharts are used in this specification to illustrate the operations performed by the system according to embodiments of this specification. It should be understood that the operations are not necessarily performed exactly in the order shown. Rather, the various steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or several steps may be removed from them.
Example one
Referring to fig. 1, which is a schematic flow chart of a sample data enhancement method, the first embodiment of the present invention provides a sample data enhancement method, including:
training the convolutional neural network by using a first sample to obtain a first model;
preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
This embodiment provides a supervised data enhancement method that combines the class activation map with traditional data enhancement methods; the steps of the scheme are as follows:
Pre-training a model:
the small sample dataset before augmentation was first trained using classical convolutional neural networks (ResNet-18, SqueezeNet, DenseNet) to obtain an initial model. The convolutional neural network can adopt networks in various forms or frames in the prior art, the specific form, implementation mode and specific structure of the convolutional neural network are not specifically limited, and the related functions in the invention can be realized. For optimizing the model training, the learning rate is adjusted by an Exponential slow-down (explicit slow), that is, the learning rate of the model training is adjusted by an Exponential slow-down method, as shown in the following formula:
$lr = lr_0 \times \gamma^{\lfloor epochnum / step \rfloor}$

where $\gamma$ is a coefficient with value 0.85, the initial learning rate $lr_0$ is 0.001, $epochnum$ is the number of iterations, $step$ is the step size with value 10, and $lr$ is the adjusted learning rate.
Generating a CAM from the pre-trained model:
Next, a class activation map of the ground-truth label is generated with the initial model; the probability of the target object in the sample image can be obtained through this class activation map. Bolei Zhou et al. proposed the Class Activation Map (CAM) at CVPR 2016 to address the insensitivity to class information representation in the field of image processing and recognition, i.e., the fact that a neural network cannot intuitively explain why it can determine the class to which the information in an image belongs. Class activation maps are a feature learning method for discriminative region localization: they operate on classification CNNs trained with global average pooling and enable the network to learn object localization without using any bounding-box annotation. The CAM of an object in the image is derived as in the following formulas:
$S_c = \sum_k w_k^c F_k$

$M_c(x,y) = \sum_k w_k^c f_k(x,y)$

where $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid, and the result of global average pooling (GAP) of each unit $k$ is $F_k$. For object class $c$, the input value of the softmax classifier is $S_c = \sum_k w_k^c F_k$, where $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$. $M_c(x,y)$, the weighted sum of the activation units identified with class $c$, is resampled to the original image size to obtain the class activation map (CAM) of class $c$. Class activation mapping allows the predicted class score to be visualized on any given image, highlighting the discriminative object parts detected by the CNN.
Key information positioning based on CAM and IoT:
IoU, the Intersection over Union ratio, is a measure of the accuracy of detecting a corresponding object in a particular dataset; the higher the overlap, the larger the value. IoU is calculated as the ratio of the intersection and the union of the "predicted bounding box" and the "true bounding box," as shown in the following equation.
$IoU = \dfrac{area(B_p \cap B_t)}{area(B_p \cup B_t)}$

where $B_p$ is the predicted bounding box and $B_t$ the true bounding box.
To evaluate whether the amount of sample information retained by the sample enhancement method is acceptable, i.e., to quantitatively analyze the loss of key information, the invention proposes an information loss evaluation method based on the IoU method; it evaluates by computing IoT (Intersection over True), as in the following formula.
$IoT = \dfrac{area(B_t \cap B_e)}{area(B_t)}$

In IoT, $B_t$ is defined as the position and area of the target prediction box in the original image acquired by the pre-trained model, where the position refers to its location in the image; $B_e$ is the position and area, in the original image, of the boundary of the new image obtained after image processing (e.g., random translation, random cropping). Once both positions are known, the intersection of the two boxes is taken, and the area of this intersection is divided by the area of $B_t$ to yield the value of IoT.
IoU can be used to evaluate the positioning accuracy of target detection in a neural network: the closer the IoU value is to 1, the more consistent the detection box is with the true label box in position and size. However, IoU cannot readily judge how much information the prediction box has lost compared with the true box, which is why the invention proposes IoT.
Performing constrained dataset augmentation based on the key information:
Finally, under the supervision of the CAM, the method performs sample augmentation while ensuring that a large amount of key information is not lost. A dataset augmented in this semi-supervised manner lets the model learn more correct information and improves recognition accuracy. Two methods are distinguished according to the image processing used, as shown in fig. 2; "1-CAM" in fig. 2 denotes 1 minus the value of the CAM (for example, a pixel with value 0.2 in the CAM becomes 0.8).
Method 1: CAM-based random cropping/random translation
Image cropping randomly samples a portion of an original image and resizes that portion to the size of the original image; this is commonly called random cropping. Random image translation shifts the image by a certain number of pixels horizontally, vertically, or in both directions. The approach is very practical, because appropriate shifts expose the model to image features at all positions, effectively improving the training of the model. These methods have a drawback: the selected region may not contain the real target area, or a large amount of key information may be lost, and such information loss can make the label of the training sample wrong. The method specifically includes:
1) The CAM is converted into a binary map according to its target object probability information, and the approximate region of the target object is obtained through the binary map and region connectivity. Region connectivity can be obtained through an image algorithm: connected pixels are treated as one region, and the minimum bounding rectangle of the connected region with the largest area is taken as the required approximate region.
2) The sample to be processed is randomly cropped or randomly translated, and IoT is used to evaluate whether key information has been lost: if the IoT value is less than 0.5, the sample is qualified; otherwise it is unqualified (a minimal version of this loop is sketched after this list).
3) Sampling the dataset with this method yields high-quality enhanced samples, enabling the neural network to learn more useful features.
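A minimal sketch of this constrained augmentation loop, reusing the iot()/keep_enhanced() helpers sketched earlier; the crop size and retry count are illustrative assumptions:

```python
import random

def constrained_random_crop(image, truth_box, iot_threshold=0.5, max_tries=20):
    """Draw random crops until one passes the IoT check, else give up."""
    h, w = image.shape[:2]
    ch, cw = h // 2, w // 2                       # illustrative crop size
    for _ in range(max_tries):
        x1 = random.randint(0, w - cw)
        y1 = random.randint(0, h - ch)
        crop_box = (x1, y1, x1 + cw, y1 + ch)     # crop boundary in the original
        if keep_enhanced(truth_box, crop_box, iot_threshold):
            return image[y1:y1 + ch, x1:x1 + cw]  # qualified enhanced sample
    return None                                    # no qualified crop was found
```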
Method 2: CAM-based Gaussian noise injection
Gaussian noise is one of the most commonly used choices in noise-injection image enhancement. It is typically sensor noise caused by poor illumination and high temperature, follows a Gaussian (normal) distribution, and shows up most clearly in RGB images. The invention proposes an improved method that combines the class activation map (CAM) with Gaussian noise, specifically comprising:
1) The weight added to the Gaussian noise is calculated from the target object probability information of the CAM, so that less noise is injected into regions with a higher probability of containing the target object; the weight is computed as in the following formula. Enhancement data produced by Gaussian noise injection computed through the class activation map retains more target object information and distinguishes foreground from background, helping the deep learning network learn feature information in a more targeted way.
$G_c(x,y) = 1 - \sum_k w_k^c f_k(x,y)$

where $G_c(x,y)$ denotes the weight with which the different activation units add Gaussian noise.
2) The obtained weight map is resampled to the size of the original image, multiplied by the Gaussian noise matrix, and finally added to the original image to obtain the enhanced image.
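A minimal sketch of this injection step; the noise level sigma and the nearest-neighbour resampling are illustrative assumptions:

```python
import numpy as np

def cam_weighted_gaussian_noise(image: np.ndarray, cam: np.ndarray,
                                sigma: float = 10.0) -> np.ndarray:
    """Add Gaussian noise scaled by the (1 - CAM) weight map, so that regions
    with a high target-object probability receive less noise.

    image: H x W x C uint8 array; cam: normalised CAM at feature-map size.
    """
    h, w = image.shape[:2]
    # Nearest-neighbour resampling of the weight map to the image size.
    ys = np.minimum(np.arange(h) * cam.shape[0] // h, cam.shape[0] - 1)
    xs = np.minimum(np.arange(w) * cam.shape[1] // w, cam.shape[1] - 1)
    weight = (1.0 - cam)[np.ix_(ys, xs)]                    # H x W weight map
    noise = np.random.normal(0.0, sigma, size=image.shape)  # Gaussian noise matrix
    noisy = image.astype(np.float64) + weight[..., None] * noise
    return np.clip(noisy, 0, 255).astype(image.dtype)
```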
The scheme provided by the invention explores the importance of existing sample enhancement methods for remote sensing images and their effect on improving deep learning networks;
it innovatively sets out a quantitative evaluation method for key information in image samples;
and it improves on existing methods by introducing the CAM probability distribution map, which guides sample augmentation and can improve the quality of the dataset.
Example two
Referring to fig. 3, which is a schematic diagram of the composition of the sample data enhancement system, the second embodiment of the present invention provides a sample data enhancement system comprising:
the training unit is used for training the convolutional neural network by utilizing a first sample to obtain a first model;
the data enhancement unit is used for preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
Example three
The third embodiment of the invention provides a target classification method, which comprises the following steps:
obtaining a basic sample;
performing data enhancement processing on the basic sample based on the sample data enhancement method to obtain a training sample;
constructing a first target classification model, and training the first target classification model by using the training sample to obtain a second target classification model;
and acquiring an image to be processed, inputting the image to be processed into the second target classification model, and outputting a target classification result in the image to be processed.
Example four
The fourth embodiment of the present invention provides a sample data enhancement apparatus, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of the sample data enhancement method when executing the computer program.
Example five
An embodiment five of the present invention provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the sample data enhancement method are implemented.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the sample data enhancement device by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function or an image playing function). Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or other solid-state storage device.
If the sample data enhancement device is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method in the embodiments of the present invention may also be implemented by a computer program stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunications signal, a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction.
Having described the basic concept of the invention, it should be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely as illustrative and not restrictive of the broad invention. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be embodied as a computer product, including computer-readable program code, embodied in one or more computer-readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, or Python, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, or ABAP, a dynamic programming language such as Python, Ruby, or Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which elements and sequences are described in this specification, the use of numbers and letters, and other designations are not intended to limit the order of the processes and methods described herein, unless explicitly stated in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, the claimed embodiments may have fewer than all of the features of a single embodiment disclosed above.
For each patent, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents are hereby incorporated by reference. Application history documents that are inconsistent with or conflict with the contents of this specification are excepted, as are documents that would limit the broadest scope of the claims of this specification (currently or later appended). It should be understood that if there is any inconsistency or conflict between the descriptions, definitions, and/or use of terms in the accompanying materials of this specification and the contents of this specification, the descriptions, definitions, and/or use of terms in this specification shall prevail.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (14)

1. A method for sample data enhancement, the method comprising:
training the convolutional neural network by using a first sample to obtain a first model;
preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
2. The method of claim 1, wherein the learning rate of the convolutional neural network training is adjusted by exponential decay.
3. The method of claim 2, wherein the learning rate of the convolutional neural network training is adjusted using the following formula:

$lr = lr_0 \times \gamma^{\lfloor epochnum / step \rfloor}$

wherein $\gamma$ is a decay coefficient, $epochnum$ is the number of iterations, $step$ is the step size, $lr_0$ is the initial learning rate, and $lr$ is the adjusted learning rate.
4. The method according to claim 1, wherein the class activation map corresponding to the object in the first image is obtained using the following formulas:

$S_c = \sum_k w_k^c F_k$

$M_c(x,y) = \sum_k w_k^c f_k(x,y)$

wherein $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid; the result of global average pooling of each unit $k$ is $F_k$; for object class $c$, the input value of the classifier is $S_c = \sum_k w_k^c F_k$, wherein $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$; and $M_c(x,y)$ denotes the weighted sum of the activation units identified with class $c$, i.e., the class activation map.
5. The sample data enhancement method according to claim 1, wherein the loss of key information is calculated through the Intersection over True (IoT) ratio, which is calculated with the following formula:

$IoT = \dfrac{area(B_t \cap B_e)}{area(B_t)}$

wherein $IoT$ is the Intersection over True ratio, $B_t$ is the position and area of the target prediction box in the first image acquired by the first model, and $B_e$ is the position and area of the boundary of the enhanced image within the first image.
6. The sample data enhancement method according to claim 5, wherein if the computed IoT is greater than a threshold, it is determined that the loss exceeds the threshold and the enhanced image is ignored; and if the computed IoT is less than or equal to the threshold, it is determined that the loss is less than or equal to the threshold, and the second image is obtained based on the enhanced image.
7. The sample data enhancement method according to claim 1, wherein locating the key information in the first image based on the class activation map to obtain the positioning information specifically includes:
converting the class activation map into a binary map according to the probability information of the target object in the class activation map, and obtaining the positioning information of the target object through the binary map and region connectivity.
8. The method of claim 1, wherein performing data enhancement processing on the first image to obtain an enhanced image comprises:
flipping the first image;
and/or, performing rotation processing on the first image;
and/or, scaling the first image;
and/or, performing cropping processing on the first image;
and/or, performing translation processing on the first image;
and/or, performing noise injection processing on the first image.
9. The method according to claim 8, wherein when the data enhancement processing mode is noise injection, the method specifically comprises:
calculating the weight information for the Gaussian noise to be added according to the probability information of the target object in the class activation map;
resampling the computed weight information to the size of the first image and multiplying it by a Gaussian noise matrix to obtain the information to be added;
and adding that information to the first image to obtain the enhanced image.
10. The method of claim 9, wherein the method calculates the weight information added to the Gaussian noise using the following formula:

$G_c(x,y) = 1 - \sum_k w_k^c f_k(x,y)$

wherein $G_c(x,y)$ denotes the weight with which the different activation units add Gaussian noise, $f_k(x,y)$ denotes the value of unit $k$ of the last convolutional layer at location $(x,y)$ of the spatial grid, and $w_k^c$ denotes the weight of unit $k$ in the classifier corresponding to object class $c$.
11. A sample data enhancement system, characterized in that said system comprises:
the training unit is used for training the convolutional neural network by utilizing a first sample to obtain a first model;
the data enhancement unit is used for preprocessing a first image in the first sample to obtain a second image, and obtaining a data-enhanced second sample based on the first image and the second image;
the preprocessing comprises the following steps:
obtaining a class activation map corresponding to a target in the first image by using the first model;
locating key information in the first image based on the class activation map to obtain positioning information;
performing data enhancement processing on the first image to obtain an enhanced image;
obtaining, based on the positioning information, the amount of key information lost in the enhanced image relative to the key information in the first image; if the loss exceeds a threshold, ignoring the enhanced image, and if the loss is less than or equal to the threshold, obtaining the second image based on the enhanced image.
12. A sample data enhancement device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the sample data enhancement method according to any one of claims 1 to 10 when executing the computer program.
13. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the sample data enhancement method according to any one of claims 1 to 10.
14. A target classification method, the method comprising:
obtaining a basic sample;
performing data enhancement processing on the basic sample based on the sample data enhancement method according to any one of claims 1 to 10 to obtain a training sample;
constructing a first target classification model, and training the first target classification model by using the training sample to obtain a second target classification model;
and acquiring an image to be processed, inputting the image to be processed into the second target classification model, and outputting a target classification result for the image to be processed.
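A minimal sketch of the classification pipeline in claim 14, assuming a PyTorch model; the loader of enhanced training samples, the epoch count, and the learning rate are illustrative:

    import torch
    import torch.nn as nn

    def train_and_classify(model, train_loader, image, epochs=10, lr=1e-3):
        # Train the first target classification model on the enhanced training
        # samples to obtain the second model, then classify a new image with it.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for images, labels in train_loader:
                opt.zero_grad()
                loss = loss_fn(model(images), labels)
                loss.backward()
                opt.step()
        model.eval()
        with torch.no_grad():
            logits = model(image.unsqueeze(0))  # image to be processed
        return logits.argmax(dim=1).item()      # target classification result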
CN202210509914.5A 2022-05-11 2022-05-11 Sample data enhancement method, system and device, medium and target classification method Pending CN114612732A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210509914.5A CN114612732A (en) 2022-05-11 2022-05-11 Sample data enhancement method, system and device, medium and target classification method

Publications (1)

Publication Number Publication Date
CN114612732A 2022-06-10

Family

ID=81870576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210509914.5A Pending CN114612732A (en) 2022-05-11 2022-05-11 Sample data enhancement method, system and device, medium and target classification method

Country Status (1)

Country Link
CN (1) CN114612732A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110832499A (en) * 2017-11-14 2020-02-21 谷歌有限责任公司 Weak supervision action localization over sparse time pooling networks
CN109410204A (en) * 2018-10-31 2019-03-01 电子科技大学 A kind of processing of cortex cataract image and Enhancement Method based on CAM
CN110689081A (en) * 2019-09-30 2020-01-14 中国科学院大学 Weak supervision target classification and positioning method based on bifurcation learning
US20210133943A1 (en) * 2019-10-31 2021-05-06 Lg Electronics Inc. Video data quality improving method and apparatus
CN111860789A (en) * 2020-07-31 2020-10-30 Oppo广东移动通信有限公司 Model training method, terminal and storage medium
CN113033549A (en) * 2021-03-09 2021-06-25 北京百度网讯科技有限公司 Training method and device for positioning diagram acquisition model
CN113657285A (en) * 2021-08-18 2021-11-16 中国人民解放军陆军装甲兵学院 Real-time target detection method based on small-scale target
CN113721905A (en) * 2021-08-30 2021-11-30 武汉真蓝三维科技有限公司 Code-free programming system and method for three-dimensional digital software development

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
RAMPRASAATH R. SELVARAJU et al.: "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization", 2017 IEEE International Conference on Computer Vision *
WEI ZHANG et al.: "A new data augmentation method of remote sensing dataset", 2021 International Conference on Computer Technology and GIS *
ZHOU B et al.: "Learning Deep Features for Discriminative Localization", 2016 IEEE Conference on Computer Vision and Pattern Recognition *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114821204A (en) * 2022-06-30 2022-07-29 山东建筑大学 Meta-learning-based embedded semi-supervised learning image classification method and system
CN114896307A (en) * 2022-06-30 2022-08-12 北京航空航天大学杭州创新研究院 Time series data enhancement method and device and electronic equipment
CN114896307B (en) * 2022-06-30 2022-09-27 北京航空航天大学杭州创新研究院 Time series data enhancement method and device and electronic equipment
CN116863277A (en) * 2023-07-27 2023-10-10 北京中关村科金技术有限公司 RPA-combined multimedia data detection method and system
CN117636073A (en) * 2024-01-24 2024-03-01 贵州科筑创品建筑技术有限公司 Concrete defect detection method, device and storage medium
CN117636073B (en) * 2024-01-24 2024-04-26 贵州科筑创品建筑技术有限公司 Concrete defect detection method, device and storage medium

Similar Documents

Publication Publication Date Title
CN111080628B (en) Image tampering detection method, apparatus, computer device and storage medium
CN114612732A (en) Sample data enhancement method, system and device, medium and target classification method
CN106547880B (en) Multi-dimensional geographic scene identification method fusing geographic area knowledge
CN105868758B (en) method and device for detecting text area in image and electronic equipment
CN108537269B (en) Weak interactive object detection deep learning method and system thereof
CN108399386A (en) Information extracting method in pie chart and device
CN112580507B (en) Deep learning text character detection method based on image moment correction
WO2021077947A1 (en) Image processing method, apparatus and device, and storage medium
CN113673338A (en) Natural scene text image character pixel weak supervision automatic labeling method, system and medium
CN112800955A (en) Remote sensing image rotating target detection method and system based on weighted bidirectional feature pyramid
CN112101344B (en) Video text tracking method and device
Zhang et al. Small object detection in remote sensing images based on attention mechanism and multi-scale feature fusion
CN113657393B (en) Shape prior missing image semi-supervised segmentation method and system
CN110991437A (en) Character recognition method and device, and training method and device of character recognition model
Wan et al. Random Interpolation Resize: A free image data augmentation method for object detection in industry
CN113378642B (en) Method for detecting illegal occupation buildings in rural areas
CN116152575B (en) Weak supervision target positioning method, device and medium based on class activation sampling guidance
KR102026280B1 (en) Method and system for scene text detection using deep learning
Zhou et al. Self-supervised saliency estimation for pixel embedding in road detection
CN114067221B (en) Remote sensing image woodland extraction method, system, device and medium
CN114758123A (en) Remote sensing image target sample enhancement method
CN116543246A (en) Training method of image denoising model, image denoising method, device and equipment
CN114241470A (en) Natural scene character detection method based on attention mechanism
CN117422787B (en) Remote sensing image map conversion method integrating discriminant and generative model
Wu et al. Industrial equipment detection algorithm under complex working conditions based on ROMS R-CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220610)