CN111582345A - Target identification method for complex environment under small sample - Google Patents

Target identification method for complex environment under small sample Download PDF

Info

Publication number
CN111582345A
CN111582345A CN202010358400.5A CN202010358400A
Authority
CN
China
Prior art keywords
training
network
data
data set
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010358400.5A
Other languages
Chinese (zh)
Inventor
姚远
郑志浩
张学睿
张帆
尚明生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Chongqing Institute of Green and Intelligent Technology of CAS
Original Assignee
Chongqing Institute of Green and Intelligent Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Institute of Green and Intelligent Technology of CAS filed Critical Chongqing Institute of Green and Intelligent Technology of CAS
Priority to CN202010358400.5A priority Critical patent/CN111582345A/en
Publication of CN111582345A publication Critical patent/CN111582345A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a target identification method for complex environments under small samples, belonging to the technical field of image processing. The method comprises the following steps: 1) data expansion, specifically comprising: S11: constructing and training a GAN network; S12: after the GAN network training is finished, screening the data set generated by the GAN and mixing the result with the real data set to form a new data set, thereby obtaining an expanded small-sample data set; labeling the new data set, and taking the labeled new data set as the input of YOLOv3; 2) target identification, specifically comprising: S21: constructing and training a YOLOv3 network; S22: after the coordinate, confidence and classification training of the YOLOv3 network is completed, inputting the new data set into the YOLOv3 network, performing NMS processing on the detection frames finally remaining in the picture, deleting redundant frames, and outputting the picture with the detection frames. The method solves the problem that targets are difficult to identify clearly in complex environments under small samples.

Description

Target identification method for complex environment under small sample
Technical Field
The invention belongs to the technical field of image processing, and relates to a target identification method for a complex environment under a small sample.
Background
In actual engineering, the acquired data samples are usually insufficient, so the model is under-trained and tends to overfit. In complex scenes, overexposure occurs and the number of positive samples exceeds the number of negative samples. Under such circumstances, a method for identifying targets in complex environments from small samples is urgently needed.
In recent years, recognition technology based on deep learning has developed rapidly, and deep convolutional classification networks represented by GoogLeNet, VGG, ResNet and SENet have been particularly successful in industry and academia. Compared with traditional image classification and recognition techniques, deep convolutional classification networks unify feature extraction and feature classification into a single framework for joint training, which removes the manual feature engineering and the semantic gap of traditional recognition methods. However, these classification models are end-to-end supervised models whose high accuracy depends on a large amount of labeled data; when data are scarce, the models easily overfit, yielding poor generalization and lower accuracy. Data augmentation and regularization techniques can only alleviate the small-sample target identification problem and cannot solve it fundamentally.
In order to solve the problem of a small number of collected samples, a natural approach is to expand the sample data with a Generative Adversarial Network (GAN), a deep learning model that has become one of the most promising methods for unsupervised learning over complex distributions in recent years. The model contains two modules, a generative model and a discriminative model, whose mutual adversarial game produces high-quality outputs and thereby yields expanded samples.
Disclosure of Invention
In view of this, the present invention aims to provide a target identification method for complex environments under small samples, which expands the sample data through a GAN network to solve the problem of difficult sample collection, and then feeds the expanded sample data into a YOLOv3 network, improving the network's recognition ability by learning the offset of the bounding-box center point, thereby solving the problem that targets are difficult to identify clearly in complex environments under small samples.
In order to achieve the purpose, the invention provides the following technical scheme:
a target identification method of a complex environment under a small sample comprises the following steps:
S1: data expansion, specifically comprising:
S11: constructing and training a GAN network;
S12: after the GAN network training is finished, screening the data set generated by the GAN and mixing the result with the real data set to form a new data set, thereby obtaining an expanded small-sample data set; labeling the new data set, and taking the labeled new data set as the input of YOLOv3;
S2: target identification, which specifically comprises the following steps:
S21: constructing and training a YOLOv3 network;
S22: after the coordinate, confidence and classification training of the YOLOv3 network is completed, inputting the new data set into the YOLOv3 network, performing NMS processing on the detection frames finally remaining in the picture, deleting redundant frames, and outputting the picture with the detection frames.
Further, in step S11, the constructed GAN network comprises a generator C and a discriminator T. The generator C has a single input, namely noise data conforming to a probability distribution such as a Gaussian, Bernoulli or uniform distribution; here the noise data are assumed to follow a Gaussian distribution, and the role of C is to generate a new picture from the input noise data. The discriminator T has two inputs: one is the real data set, whose label is automatically set to 1, and the other is the data generated by the generator C, whose label is automatically set to 0; the function of T is to distinguish the real data from the generated data as well as possible, so T can be regarded as a binary classification network;
the loss function of the GAN network is:
$$\min_C \max_T V(T,C) = \mathbb{E}_{t \sim A_{true}(t)}[\log T(t)] + \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
where $t \sim A_{true}(t)$ indicates that the data $t$ come from the real data set and $T(t)$ is the discriminator's score for $t$; $n \sim A_{noise}(n)$ indicates that the noise $n$ comes from the noise distribution and $C(n)$ is the data set generated by the generator C.
Further, in step S11, the GAN network is trained by single alternating iterations of the generator C and the discriminator T; before training, the generator C is initialized randomly and the discriminator T is pre-trained so that T has a certain classification ability when training starts;
the specific steps of GAN network training are as follows:
1) fix the generator C, train the discriminator T, and execute the following steps K times in a loop:
① sample n objects from the noise data $A_{noise}$ to generate a set $n \sim A_{noise}(n)$;
② sample t objects from the real data set $A_{true}(t)$ to generate a set $t \sim A_{true}(t)$;
③ input $n \sim A_{noise}(n)$ into the generator C to generate a new data set C(n);
④ input C(n) and $t \sim A_{true}(t)$ into T and train with the following formula as the loss function; the loss function is similar to that of a binary classification network, and when discriminating, T tends to push the score of the data in $A_{true}(t)$ close to 1 and the score of the data in C(n) close to 0:
$$\max_T \; \mathbb{E}_{t \sim A_{true}(t)}[\log T(t)] + \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
cross-entropy loss is adopted as the loss function, the network parameters are updated by gradient descent, and the loop is run K times to find the optimal discriminator T for the current GAN;
2) fix the discriminator T, train the generator C, and execute the following steps once:
① sample n objects from the noise data $A_{noise}$ to generate a set $n \sim A_{noise}(n)$;
② input $n \sim A_{noise}(n)$ into C and denote the output data as C(n);
③ sample n data from C(n) and $A_{true}(t)$ to form a set and input it into T;
④ train C according to the loss function below and the output of T, updating the network parameters by gradient descent:
$$\min_C \; \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
the objective of this loss function is to find, under the current discriminator, the network parameters of C that minimize the KL divergence between the real data set $A_{true}(t)$ and the generated data set C(n); since the parameters of the discriminator T are fixed while training the generator C, C is optimized according to the discrimination result T(C(n));
3) the single training process ends; return to the beginning and train again.
Further, in step S21, the YOLOv3 network is constructed as follows: the backbone network is the 53-layer Darknet-53, image features are extracted using a residual structure and small convolution kernels, and a feature pyramid structure with three detection layers of different sizes is used to detect larger and smaller targets respectively.
Further, in step S21, the training of the YOLOv3 network specifically includes:
1) obtaining the clustering centers of the data set by using the K-means clustering algorithm, and setting the clustering centers as the values of the anchors;
2) using random resizing for data augmentation, adjusting the size of the input picture to any multiple of 16;
3) inputting a picture into the network, extracting the picture's features through Darknet-53, dividing the feature maps into three grids with different numbers of cells, sending the extracted features to the three YOLO detection layers respectively, and outputting a picture with prediction boxes drawn by the YOLO layers;
4) comparing the coordinates of the prediction boxes drawn by the YOLO layers with the anchor coordinates, regressing the coordinate offsets in a logistic manner, and calculating with the following four formulas:
$$b_m = \mathrm{sigmoid}(O_m) + R_m, \qquad b_n = \mathrm{sigmoid}(O_n) + R_n$$
$$b_w = A_w e^{O_w}, \qquad b_h = A_h e^{O_h}$$
where $R_m$ and $R_n$ are the coordinates of the top-left corner of the cell containing the prediction box, $\mathrm{sigmoid}(O_m)$ and $\mathrm{sigmoid}(O_n)$ are the offsets of the prediction-box center relative to the anchor center, $O_m$ and $O_n$ are the network outputs for the center point of the prediction box, $b_m$ and $b_n$ are the normalized center coordinates of the prediction box relative to the top-left corner of the cell, $A_w$ and $A_h$ are the width and height of the anchor, $O_w$ and $O_h$ are the network outputs for the width and height of the detection box, and $b_w$ and $b_h$ are the normalized width and height of the prediction box relative to the width and height of the anchor;
5) meanwhile, scoring the probability that an object exists in each detection box using logistic regression, recording the score as the confidence, keeping the detection box with the highest confidence and deleting the remaining boxes;
6) after the confidence scores are obtained, the network classifies the objects in the detection boxes according to the classification loss function;
Further, in step 4) of training the YOLOv3 network, the loss functions of the center coordinates and of the width and height are as follows. The loss function of the center coordinates is:
$$Loss_{center} = \alpha_{exit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ (m_a - \hat{m}_a)^2 + (n_a - \hat{n}_a)^2 \right]$$
where $\alpha_{exit}$ is the weight coefficient of the center-coordinate loss function, $l \times l$ represents the number of cells into which the feature map is divided, $K$ represents the number of prediction boxes, and $I_{ab}^{obj}$ judges whether the b-th prediction box of the a-th cell is responsible for detecting the current object, taking the value 1 if so and 0 otherwise; the squared error is then used, where $\hat{m}_a$ and $\hat{n}_a$ are the center coordinates of the manually labeled box and $m_a$ and $n_a$ are the center coordinates of the prediction box;
the loss function for width and height is:
$$Loss_{wh} = \alpha_{exit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ (p_a - \hat{p}_a)^2 + (q_a - \hat{q}_a)^2 \right]$$
where $p_a$ and $q_a$ are the width and height of the prediction box, and $\hat{p}_a$ and $\hat{q}_a$ are the width and height of the manually labeled box.
Further, in step 5) of training the YOLOv3 network, the loss function of the confidence during training is:
$$Loss_{obj} = -\sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ \hat{e}_a \log(e_a) + (1 - \hat{e}_a) \log(1 - e_a) \right]$$
$$Loss_{noobj} = -\alpha_{noexit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{noobj} \left[ \hat{e}_a \log(e_a) + (1 - \hat{e}_a) \log(1 - e_a) \right]$$
The confidence loss function adopts the cross-entropy error: the first expression represents the confidence loss when an object is present in the prediction box, and the second expression represents the confidence loss when no object is present; $\alpha_{noexit}$ is the weight coefficient used when no object is present, which reduces the influence of object-free prediction boxes on the update of the network parameters; $\hat{e}_a$ represents the ground-truth confidence that an object exists, equal to 1 when an object is present and 0 when it is not; $e_a$ represents the confidence computed by the network itself.
Further, in step 6) of training the YOLOv3 network, the classification loss function adopts the cross-entropy error, and the calculation formula is:
$$Loss_{class} = -\sum_{a=0}^{l \times l} I_{a}^{obj} \sum_{e \in classes} \left[ \hat{G}_a(e) \log(G_a(e)) + (1 - \hat{G}_a(e)) \log(1 - G_a(e)) \right]$$
where $e \in classes$ represents the object class to which the object in the prediction box belongs, $\hat{G}_a(e)$ is 1 when $e$ is the correct class of the object and 0 when it belongs to another class, and $G_a(e)$ represents the score given by the network after classifying $e$.
The invention has the following beneficial effects: the GAN network is adopted to expand the sample data, which solves the problem of difficult sample collection; the expanded sample data are then fed into a YOLOv3 network, whose recognition ability is improved by learning the offset of the bounding-box center point, thereby solving the problem that targets are difficult to identify clearly in complex environments under small samples.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a block diagram of a target identification method of the present invention;
FIG. 2 is an original picture obtained according to an embodiment of the present invention;
fig. 3 shows the recognition result of the embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
Referring to fig. 1, a method for identifying a target in a complex environment under a small sample includes the following steps:
S1: data expansion, specifically comprising:
S11: constructing and training a GAN network;
The constructed GAN network comprises a generator C and a discriminator T. The generator C has a single input, namely noise data conforming to a probability distribution such as a Gaussian, Bernoulli or uniform distribution; here the noise data are assumed to follow a Gaussian distribution, and the role of C is to generate a new picture from the input noise data. The discriminator T has two inputs: one is the real data set, whose label is automatically set to 1, and the other is the data generated by the generator C, whose label is automatically set to 0; the function of T is to distinguish the real data from the generated data as well as possible, so T can be regarded as a binary classification network;
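By way of illustration only, the following is a minimal PyTorch sketch of such a generator/discriminator pair; the 64×64 picture size, the 100-dimensional Gaussian noise vector and the layer widths are assumptions made for the example and are not specified by this description:

```python
import torch
import torch.nn as nn

class GeneratorC(nn.Module):
    """Generator C: maps a Gaussian noise vector to a 64x64 RGB picture (sizes assumed)."""
    def __init__(self, noise_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(noise_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),  # picture values in [-1, 1]
        )

    def forward(self, n):  # n: (batch, noise_dim, 1, 1), drawn from a Gaussian distribution
        return self.net(n)

class DiscriminatorT(nn.Module):
    """Discriminator T: binary classifier scoring a picture as real (towards 1) or generated (towards 0)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.LeakyReLU(0.2, True),
            nn.Conv2d(32, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, True),
            nn.Conv2d(256, 1, 4, 1, 0), nn.Sigmoid(),  # probability that the input is real
        )

    def forward(self, x):  # x: (batch, 3, 64, 64)
        return self.net(x).view(-1)
```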
the loss function of the GAN network is:
$$\min_C \max_T V(T,C) = \mathbb{E}_{t \sim A_{true}(t)}[\log T(t)] + \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
where $t \sim A_{true}(t)$ indicates that the data $t$ come from the real data set and $T(t)$ is the discriminator's score for $t$; $n \sim A_{noise}(n)$ indicates that the noise $n$ comes from the noise distribution and $C(n)$ is the data set generated by the generator C.
The GAN network is trained by single alternating iterations of the generator C and the discriminator T; before training, the generator C is initialized randomly and the discriminator T is pre-trained so that T has a certain classification ability when training starts;
the specific steps of GAN network training are as follows:
1) fix the generator C, train the discriminator T, and execute the following steps K times in a loop:
① sample n objects from the noise data $A_{noise}$ to generate a set $n \sim A_{noise}(n)$;
② sample t objects from the real data set $A_{true}(t)$ to generate a set $t \sim A_{true}(t)$;
③ input $n \sim A_{noise}(n)$ into the generator C to generate a new data set C(n);
④ input C(n) and $t \sim A_{true}(t)$ into T and train with the following formula as the loss function; the loss function is similar to that of a binary classification network, and when discriminating, T tends to push the score of the data in $A_{true}(t)$ close to 1 and the score of the data in C(n) close to 0:
$$\max_T \; \mathbb{E}_{t \sim A_{true}(t)}[\log T(t)] + \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
because the network parameters of the generator C are fixed and unchanged while the discriminator T is trained, cross-entropy loss is adopted as the loss function, the network parameters are updated by gradient descent, and the loop is run K times to find the optimal discriminator T for the current GAN;
2) fix the discriminator T, train the generator C, and execute the following steps once:
① sample n objects from the noise data $A_{noise}$ to generate a set $n \sim A_{noise}(n)$;
② input $n \sim A_{noise}(n)$ into C and denote the output data as C(n);
③ sample n data from C(n) and $A_{true}(t)$ to form a set and input it into T;
④ train C according to the loss function below and the output of T, updating the network parameters by gradient descent:
$$\min_C \; \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
the objective of this loss function is to find, under the current discriminator, the network parameters of C that minimize the KL divergence between the real data set $A_{true}(t)$ and the generated data set C(n); since the parameters of the discriminator T are fixed while training the generator C, C is optimized according to the discrimination result T(C(n));
3) the single training process ends; return to the beginning and train again.
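The alternating procedure above can be sketched as follows (illustrative only). It assumes the GeneratorC/DiscriminatorT modules from the previous sketch, a data loader real_loader that yields batches of real pictures, and externally created optimizers opt_C and opt_T; the value of K, the batch size of 64 and the use of the non-saturating generator loss (a common practical substitute for directly minimizing log(1 - T(C(n)))) are assumptions:

```python
import itertools
import torch
import torch.nn as nn

def gan_training_round(C, T, real_loader, opt_C, opt_T, K=5, noise_dim=100, device="cpu"):
    """One round of the single alternating iteration described above:
    K updates of the discriminator T with C fixed, then one update of the generator C with T fixed."""
    bce = nn.BCELoss()  # cross-entropy loss for the binary discriminator

    # 1) fix C, train T for K loops
    for real_batch in itertools.islice(real_loader, K):
        t_real = real_batch.to(device)                                   # t ~ A_true(t)
        n = torch.randn(t_real.size(0), noise_dim, 1, 1, device=device)  # n ~ A_noise(n), Gaussian
        c_n = C(n).detach()                                              # C(n); generator kept fixed
        loss_T = bce(T(t_real), torch.ones(t_real.size(0), device=device)) \
               + bce(T(c_n), torch.zeros(c_n.size(0), device=device))
        opt_T.zero_grad(); loss_T.backward(); opt_T.step()               # gradient descent on T

    # 2) fix T, train C once (non-saturating form of the generator loss)
    n = torch.randn(64, noise_dim, 1, 1, device=device)
    c_n = C(n)
    loss_C = bce(T(c_n), torch.ones(c_n.size(0), device=device))         # push T(C(n)) towards 1
    opt_C.zero_grad(); loss_C.backward(); opt_C.step()
    return loss_T.item(), loss_C.item()
```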
S12: after the GAN network training is finished, screening the data set generated by the GAN and mixing the result with the real data set to form a new data set, thereby obtaining an expanded small-sample data set; the new data set is labeled, and the labeled new data set is taken as the input of YOLOv3.
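The screening criterion is not specified here; purely as an illustration, one possible sketch keeps only the generated pictures whose discriminator score exceeds a threshold before mixing them with the real data set (the threshold value and the number of generated candidates are assumptions):

```python
import torch

def screen_and_mix(C, T, real_images, num_generate=1000, threshold=0.5,
                   noise_dim=100, device="cpu"):
    """Screen the GAN-generated data set and mix it with the real data set.
    The rule 'keep samples whose discriminator score exceeds threshold' is an assumed
    screening criterion used only for illustration."""
    C.eval(); T.eval()
    with torch.no_grad():
        n = torch.randn(num_generate, noise_dim, 1, 1, device=device)
        generated = C(n)                      # candidate pictures C(n)
        scores = T(generated)                 # discriminator scores in [0, 1]
        kept = generated[scores > threshold]  # screened generated samples
    # the new (expanded) data set = screened generated samples + real samples,
    # which is then labeled and used as the input of YOLOv3
    return torch.cat([kept.cpu(), real_images.cpu()], dim=0)
```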
S2: the target identification specifically comprises the following steps:
S21: constructing and training a YOLOv3 network;
The constructed YOLOv3 network is as follows: the backbone network is the 53-layer Darknet-53, image features are extracted using a residual structure and small convolution kernels, and a feature pyramid structure with three detection layers of different sizes is used to detect larger and smaller targets respectively.
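As an illustration of the residual structure with small convolution kernels mentioned above, a minimal PyTorch sketch of one Darknet-53 residual unit follows (the channel counts are illustrative):

```python
import torch.nn as nn

class DarknetResidual(nn.Module):
    """One residual unit of Darknet-53: a 1x1 bottleneck convolution followed by a
    3x3 convolution, whose output is added back to the input (the residual connection)."""
    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels // 2),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(channels // 2, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.LeakyReLU(0.1, inplace=True),
        )

    def forward(self, x):
        return x + self.block(x)  # small-kernel convolutions plus the residual shortcut
```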
Training the YOLOv3 network specifically includes:
1) obtaining the clustering centers of the data set by using the K-means clustering algorithm, and setting the clustering centers as the values of the anchors;
2) using random resizing for data augmentation, adjusting the size of the input picture to any multiple of 16;
3) inputting a picture into the network, extracting the picture's features through Darknet-53, dividing the feature maps into three grids with different numbers of cells, sending the extracted features to the three YOLO detection layers respectively, and outputting a picture with prediction boxes drawn by the YOLO layers;
4) comparing the coordinates of the prediction boxes drawn by the YOLO layers with the anchor coordinates, regressing the coordinate offsets in a logistic manner, and calculating with the following four formulas:
$$b_m = \mathrm{sigmoid}(O_m) + R_m, \qquad b_n = \mathrm{sigmoid}(O_n) + R_n$$
$$b_w = A_w e^{O_w}, \qquad b_h = A_h e^{O_h}$$
where $R_m$ and $R_n$ are the coordinates of the top-left corner of the cell containing the prediction box, $\mathrm{sigmoid}(O_m)$ and $\mathrm{sigmoid}(O_n)$ are the offsets of the prediction-box center relative to the anchor center, $O_m$ and $O_n$ are the network outputs for the center point of the prediction box, $b_m$ and $b_n$ are the normalized center coordinates of the prediction box relative to the top-left corner of the cell, $A_w$ and $A_h$ are the width and height of the anchor, $O_w$ and $O_h$ are the network outputs for the width and height of the detection box, and $b_w$ and $b_h$ are the normalized width and height of the prediction box relative to the width and height of the anchor.
The loss functions of the center coordinates and of the width and height are as follows. The loss function of the center coordinates is:
$$Loss_{center} = \alpha_{exit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ (m_a - \hat{m}_a)^2 + (n_a - \hat{n}_a)^2 \right]$$
where $\alpha_{exit}$ is the weight coefficient of the center-coordinate loss function, $l \times l$ represents the number of cells into which the feature map is divided, $K$ represents the number of prediction boxes, and $I_{ab}^{obj}$ judges whether the b-th prediction box of the a-th cell is responsible for detecting the current object, taking the value 1 if so and 0 otherwise; the squared error is then used, where $\hat{m}_a$ and $\hat{n}_a$ are the center coordinates of the manually labeled box and $m_a$ and $n_a$ are the center coordinates of the prediction box;
the loss function for width and height is:
$$Loss_{wh} = \alpha_{exit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ (p_a - \hat{p}_a)^2 + (q_a - \hat{q}_a)^2 \right]$$
where $p_a$ and $q_a$ are the width and height of the prediction box, and $\hat{p}_a$ and $\hat{q}_a$ are the width and height of the manually labeled box.
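The four decoding formulas of step 4) can be illustrated with the following sketch (plain Python, all quantities in grid-cell units as in the text; the sample values in the usage line are arbitrary):

```python
import math

def decode_box(O_m, O_n, O_w, O_h, R_m, R_n, A_w, A_h):
    """Decode one prediction following the formulas above:
    b_m = sigmoid(O_m) + R_m, b_n = sigmoid(O_n) + R_n, b_w = A_w*exp(O_w), b_h = A_h*exp(O_h)."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    b_m = sigmoid(O_m) + R_m   # normalized center x, offset from the cell's top-left corner
    b_n = sigmoid(O_n) + R_n   # normalized center y
    b_w = A_w * math.exp(O_w)  # width relative to the anchor width
    b_h = A_h * math.exp(O_h)  # height relative to the anchor height
    return b_m, b_n, b_w, b_h

# usage: raw outputs (0.2, -0.3, 0.1, 0.4) for the cell at (3, 5) with a 3.6 x 4.9 anchor
print(decode_box(0.2, -0.3, 0.1, 0.4, 3, 5, 3.6, 4.9))
```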
5) Meanwhile, the probability that an object exists in each detection box is scored using logistic regression and recorded as the confidence; the detection box with the highest confidence is kept and the remaining detection boxes are deleted.
The loss function of the confidence during training is:
$$Loss_{obj} = -\sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ \hat{e}_a \log(e_a) + (1 - \hat{e}_a) \log(1 - e_a) \right]$$
$$Loss_{noobj} = -\alpha_{noexit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{noobj} \left[ \hat{e}_a \log(e_a) + (1 - \hat{e}_a) \log(1 - e_a) \right]$$
The confidence loss function adopts the cross-entropy error: the first expression represents the confidence loss when an object is present in the prediction box, and the second expression represents the confidence loss when no object is present; $\alpha_{noexit}$ is the weight coefficient used when no object is present, which reduces the influence of object-free prediction boxes on the update of the network parameters; $\hat{e}_a$ represents the ground-truth confidence that an object exists, equal to 1 when an object is present and 0 when it is not; $e_a$ represents the confidence computed by the network itself.
6) After the confidence scores are obtained, the network classifies the objects in the detection boxes according to the classification loss function.
The classification loss function adopts the cross-entropy error, and the calculation formula is:
$$Loss_{class} = -\sum_{a=0}^{l \times l} I_{a}^{obj} \sum_{e \in classes} \left[ \hat{G}_a(e) \log(G_a(e)) + (1 - \hat{G}_a(e)) \log(1 - G_a(e)) \right]$$
where $e \in classes$ represents the object class to which the object in the prediction box belongs, $\hat{G}_a(e)$ is 1 when $e$ is the correct class of the object and 0 when it belongs to another class, and $G_a(e)$ represents the score given by the network after classifying $e$.
S22: after the coordinate, confidence and classification training of the YOLOV3 network is completed, inputting the new data set into the YOLOV3 network, performing NMS processing on the finally remaining detection frames in the picture, deleting redundant frames, and outputting the picture with the detection frames.
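The NMS post-processing of step S22 can be sketched as follows (boxes are assumed to be (x1, y1, x2, y2, confidence) tuples; the IoU threshold of 0.5 is an assumption):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, iou_threshold=0.5):
    """Non-maximum suppression: repeatedly keep the highest-confidence box and delete
    the redundant boxes that overlap it too much."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while boxes:
        best = boxes.pop(0)
        kept.append(best)
        boxes = [b for b in boxes if iou(best[:4], b[:4]) < iou_threshold]
    return kept
```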
Example 1: a citrus data set captured by an unmanned aerial vehicle is used to compare the present invention with common target recognition algorithms; the obtained citrus data are shown in Fig. 2. In this embodiment, the method is compared with three common target recognition algorithms, YOLO V3, Mask fast rcnn and Fast rcnn; the specific recognition results are shown in Table 1, and the method of the present invention achieves the highest recognition accuracy.
TABLE 1 Comparative experimental results
Serial number | Algorithm name | Recognition accuracy
1             | GAN + YOLO V3  | 90.1%
2             | YOLO V3        | 85.4%
3             | Mask fast rcnn | 76.5%
4             | Fast rcnn      | 69.4%
As can be seen from FIG. 3, the method of the present invention has better recognition under the condition of high density.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (8)

1. A target identification method of a complex environment under a small sample is characterized by comprising the following steps:
S1: data expansion, specifically comprising:
S11: constructing and training a generative adversarial network (GAN);
S12: after the GAN network training is finished, screening the data set generated by the GAN and mixing the result with the real data set to form a new data set, thereby obtaining an expanded small-sample data set; labeling the new data set, and taking the labeled new data set as the input of YOLOv3;
S2: target identification, which specifically comprises the following steps:
S21: constructing and training a YOLOv3 network;
S22: after the coordinate, confidence and classification training of the YOLOv3 network is completed, inputting the new data set into the YOLOv3 network, performing NMS processing on the detection frames finally remaining in the picture, deleting redundant frames, and outputting the picture with the detection frames.
2. The method for identifying a target in a complex environment under a small sample according to claim 1, wherein in step S11, the constructed GAN network comprises a generator C and a discriminator T; the generator C has a single input, namely noise data conforming to a certain probability distribution, and the role of C is to generate a new picture based on the input noise data; the discriminator T has two inputs: one is the real data set, whose label is automatically set to 1, and the other is the data generated by the generator C, whose label is automatically set to 0; the function of T is to distinguish the real data from the generated data;
the loss function of the GAN network is:
$$\min_C \max_T V(T,C) = \mathbb{E}_{t \sim A_{true}(t)}[\log T(t)] + \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
where $t \sim A_{true}(t)$ indicates that the data $t$ come from the real data set and $T(t)$ is the discriminator's score for $t$; $n \sim A_{noise}(n)$ indicates that the noise $n$ comes from the noise distribution and $C(n)$ is the data set generated by the generator C.
3. The method for target recognition in a complex environment under a small sample according to claim 2, wherein in step S11, the GAN network is trained by single alternating iterations of the generator C and the discriminator T; before training, the generator C is initialized randomly and the discriminator T is pre-trained so that T has a certain classification ability when training starts;
the specific steps of GAN network training are as follows:
1) fix the generator C, train the discriminator T, and execute the following steps K times in a loop:
① sample n objects from the noise data $A_{noise}$ to generate a set $n \sim A_{noise}(n)$;
② sample t objects from the real data set $A_{true}(t)$ to generate a set $t \sim A_{true}(t)$;
③ input $n \sim A_{noise}(n)$ into the generator C to generate a new data set C(n);
④ input C(n) and $t \sim A_{true}(t)$ into T and train with the following formula as the loss function; the loss function is similar to that of a binary classification network, and when discriminating, T tends to push the score of the data in $A_{true}(t)$ close to 1 and the score of the data in C(n) close to 0:
$$\max_T \; \mathbb{E}_{t \sim A_{true}(t)}[\log T(t)] + \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
cross-entropy loss is adopted as the loss function, the network parameters are updated by gradient descent, and the loop is run K times to find the optimal discriminator T for the current GAN;
2) fix the discriminator T, train the generator C, and execute the following steps once:
① sample n objects from the noise data $A_{noise}$ to generate a set $n \sim A_{noise}(n)$;
② input $n \sim A_{noise}(n)$ into C and denote the output data as C(n);
③ sample n data from C(n) and $A_{true}(t)$ to form a set and input it into T;
④ train C according to the loss function below and the output of T, updating the network parameters by gradient descent:
$$\min_C \; \mathbb{E}_{n \sim A_{noise}(n)}[\log(1 - T(C(n)))]$$
3) the single training process ends; return to the beginning and train again.
4. The method for target recognition in a complex environment under a small sample according to claim 1, wherein in step S21, the YOLOv3 network is constructed as follows: the backbone network is the 53-layer Darknet-53, image features are extracted using a residual structure and small convolution kernels, and a feature pyramid structure with three detection layers of different sizes is used to detect larger and smaller targets respectively.
5. The method for target recognition in a complex environment under a small sample according to claim 4, wherein the step S21 of training the YOLOv3 network specifically includes:
1) obtaining the clustering centers of the data set by using the K-means clustering algorithm, and setting the clustering centers as the values of the anchors;
2) using random resizing for data augmentation, adjusting the size of the input picture to any multiple of 16;
3) inputting a picture into the network, extracting the picture's features through Darknet-53, dividing the feature maps into three grids with different numbers of cells, sending the extracted features to the three YOLO detection layers respectively, and outputting a picture with prediction boxes drawn by the YOLO layers;
4) comparing the coordinates of the prediction boxes drawn by the YOLO layers with the anchor coordinates, regressing the coordinate offsets in a logistic manner, and calculating with the following four formulas:
$$b_m = \mathrm{sigmoid}(O_m) + R_m, \qquad b_n = \mathrm{sigmoid}(O_n) + R_n$$
$$b_w = A_w e^{O_w}, \qquad b_h = A_h e^{O_h}$$
where $R_m$ and $R_n$ are the coordinates of the top-left corner of the cell containing the prediction box, $\mathrm{sigmoid}(O_m)$ and $\mathrm{sigmoid}(O_n)$ are the offsets of the prediction-box center relative to the anchor center, $O_m$ and $O_n$ are the network outputs for the center point of the prediction box, $b_m$ and $b_n$ are the normalized center coordinates of the prediction box relative to the top-left corner of the cell, $A_w$ and $A_h$ are the width and height of the anchor, $O_w$ and $O_h$ are the network outputs for the width and height of the detection box, and $b_w$ and $b_h$ are the normalized width and height of the prediction box relative to the width and height of the anchor;
5) meanwhile, scoring the probability that an object exists in each detection box using logistic regression, recording the score as the confidence, keeping the detection box with the highest confidence and deleting the remaining boxes;
6) after the confidence scores are obtained, the network classifies the objects in the detection boxes according to the classification loss function.
6. The method for target recognition in a complex environment under a small sample according to claim 5, wherein in step 4) of training the YOLOv3 network, the loss function of the center coordinates is:
$$Loss_{center} = \alpha_{exit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ (m_a - \hat{m}_a)^2 + (n_a - \hat{n}_a)^2 \right]$$
where $\alpha_{exit}$ is the weight coefficient of the center-coordinate loss function, $l \times l$ represents the number of cells into which the feature map is divided, $K$ represents the number of prediction boxes, and $I_{ab}^{obj}$ judges whether the b-th prediction box of the a-th cell is responsible for detecting the current object, taking the value 1 if so and 0 otherwise; the squared error is then used, where $\hat{m}_a$ and $\hat{n}_a$ are the center coordinates of the manually labeled box and $m_a$ and $n_a$ are the center coordinates of the prediction box;
the loss function for width and height is:
$$Loss_{wh} = \alpha_{exit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ (p_a - \hat{p}_a)^2 + (q_a - \hat{q}_a)^2 \right]$$
where $p_a$ and $q_a$ are the width and height of the prediction box, and $\hat{p}_a$ and $\hat{q}_a$ are the width and height of the manually labeled box.
7. The method for target recognition in a complex environment under a small sample according to claim 5, wherein in step 5) of training the YOLOv3 network, the loss function of the confidence during training is:
$$Loss_{obj} = -\sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{obj} \left[ \hat{e}_a \log(e_a) + (1 - \hat{e}_a) \log(1 - e_a) \right]$$
$$Loss_{noobj} = -\alpha_{noexit} \sum_{a=0}^{l \times l} \sum_{b=0}^{K} I_{ab}^{noobj} \left[ \hat{e}_a \log(e_a) + (1 - \hat{e}_a) \log(1 - e_a) \right]$$
The confidence loss function adopts the cross-entropy error: the first expression represents the confidence loss when an object is present in the prediction box, and the second expression represents the confidence loss when no object is present; $\alpha_{noexit}$ is the weight coefficient used when no object is present, which reduces the influence of object-free prediction boxes on the update of the network parameters; $\hat{e}_a$ represents the ground-truth confidence that an object exists, equal to 1 when an object is present and 0 when it is not; $e_a$ represents the confidence computed by the network itself.
8. The method for identifying a target in a complex environment under a small sample according to claim 5, wherein in step 6) of training the YOLOv3 network, the classification loss function adopts the cross-entropy error, and the calculation formula is:
$$Loss_{class} = -\sum_{a=0}^{l \times l} I_{a}^{obj} \sum_{e \in classes} \left[ \hat{G}_a(e) \log(G_a(e)) + (1 - \hat{G}_a(e)) \log(1 - G_a(e)) \right]$$
where $e \in classes$ represents the object class to which the object in the prediction box belongs, $\hat{G}_a(e)$ is 1 when $e$ is the correct class of the object and 0 when it belongs to another class, and $G_a(e)$ represents the score given by the network after classifying $e$.
CN202010358400.5A 2020-04-29 2020-04-29 Target identification method for complex environment under small sample Pending CN111582345A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010358400.5A CN111582345A (en) 2020-04-29 2020-04-29 Target identification method for complex environment under small sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010358400.5A CN111582345A (en) 2020-04-29 2020-04-29 Target identification method for complex environment under small sample

Publications (1)

Publication Number Publication Date
CN111582345A true CN111582345A (en) 2020-08-25

Family

ID=72117097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010358400.5A Pending CN111582345A (en) 2020-04-29 2020-04-29 Target identification method for complex environment under small sample

Country Status (1)

Country Link
CN (1) CN111582345A (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190065901A1 (en) * 2017-08-29 2019-02-28 Vintra, Inc. Systems and methods for a tailored neural network detector
CN107844770A (en) * 2017-11-03 2018-03-27 东北大学 A kind of electric melting magnesium furnace unusual service condition automatic recognition system based on video
CN108805070A (en) * 2018-06-05 2018-11-13 合肥湛达智能科技有限公司 A kind of deep learning pedestrian detection method based on built-in terminal
CN109171605A (en) * 2018-08-29 2019-01-11 合肥工业大学 Intelligent edge calculations system with target positioning and hysteroscope video enhancing processing function
CN109409365A (en) * 2018-10-25 2019-03-01 江苏德劭信息科技有限公司 It is a kind of that method is identified and positioned to fruit-picking based on depth targets detection
CN109671018A (en) * 2018-12-12 2019-04-23 华东交通大学 A kind of image conversion method and system based on production confrontation network and ResNets technology
CN109635875A (en) * 2018-12-19 2019-04-16 浙江大学滨海产业技术研究院 A kind of end-to-end network interface detection method based on deep learning
CN109753946A (en) * 2019-01-23 2019-05-14 哈尔滨工业大学 A kind of real scene pedestrian's small target deteection network and detection method based on the supervision of body key point
CN109934117A (en) * 2019-02-18 2019-06-25 北京联合大学 Based on the pedestrian's weight recognition detection method for generating confrontation network
CN109919058A (en) * 2019-02-26 2019-06-21 武汉大学 A kind of multisource video image highest priority rapid detection method based on Yolo V3
CN109928107A (en) * 2019-04-08 2019-06-25 江西理工大学 A kind of automatic classification system
CN110472467A (en) * 2019-04-08 2019-11-19 江西理工大学 The detection method for transport hub critical object based on YOLO v3
CN110135267A (en) * 2019-04-17 2019-08-16 电子科技大学 A kind of subtle object detection method of large scene SAR image
CN110060248A (en) * 2019-04-22 2019-07-26 哈尔滨工程大学 Sonar image submarine pipeline detection method based on deep learning
CN110135468A (en) * 2019-04-24 2019-08-16 中国矿业大学(北京) A kind of recognition methods of gangue
CN110070072A (en) * 2019-05-05 2019-07-30 厦门美图之家科技有限公司 A method of generating object detection model
CN110659702A (en) * 2019-10-17 2020-01-07 黑龙江德亚文化传媒有限公司 Calligraphy copybook evaluation system and method based on generative confrontation network model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Feng: "Application of artificial intelligence in drawing recognition for door and window inspection", Sichuan Cement *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767907A (en) * 2020-09-03 2020-10-13 江苏铨铨信息科技有限公司 Method of multi-source data fire detection system based on GA and VGG network
CN112184673A (en) * 2020-09-30 2021-01-05 中国电子科技集团公司电子科学研究院 Tablet target detection method for medication compliance management
CN112200055A (en) * 2020-09-30 2021-01-08 深圳市信义科技有限公司 Pedestrian attribute identification method, system and device of joint countermeasure generation network
CN112200055B (en) * 2020-09-30 2024-04-30 深圳市信义科技有限公司 Pedestrian attribute identification method, system and device of combined countermeasure generation network
CN113139476A (en) * 2021-04-27 2021-07-20 山东英信计算机技术有限公司 Data center-oriented human behavior attribute real-time detection method and system
CN113239813B (en) * 2021-05-17 2022-11-25 中国科学院重庆绿色智能技术研究院 YOLOv3 distant view target detection method based on third-order cascade architecture
CN113239813A (en) * 2021-05-17 2021-08-10 中国科学院重庆绿色智能技术研究院 Three-order cascade architecture-based YOLOv3 prospective target detection method
CN113744262A (en) * 2021-09-17 2021-12-03 浙江工业大学 Target segmentation detection method based on GAN and YOLO-v5
CN113744262B (en) * 2021-09-17 2024-02-02 浙江工业大学 Target segmentation detection method based on GAN and YOLO-v5
CN113780480A (en) * 2021-11-11 2021-12-10 深圳佑驾创新科技有限公司 Method for constructing multi-target detection and category identification model based on YOLOv5
CN114170597A (en) * 2021-11-12 2022-03-11 天健创新(北京)监测仪表股份有限公司 Algae detection equipment and detection method
CN113903009A (en) * 2021-12-10 2022-01-07 华东交通大学 Railway foreign matter detection method and system based on improved YOLOv3 network
CN114155453A (en) * 2022-02-10 2022-03-08 深圳爱莫科技有限公司 Training method for ice chest commodity image recognition, model and occupancy calculation method

Similar Documents

Publication Publication Date Title
CN111582345A (en) Target identification method for complex environment under small sample
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN109657584B (en) Improved LeNet-5 fusion network traffic sign identification method for assisting driving
CN112418117B (en) Small target detection method based on unmanned aerial vehicle image
CN107247956B (en) Rapid target detection method based on grid judgment
CN103605972B (en) Non-restricted environment face verification method based on block depth neural network
Xu et al. Augmenting strong supervision using web data for fine-grained categorization
WO2019140767A1 (en) Recognition system for security check and control method thereof
CN111832608B (en) Iron spectrum image multi-abrasive particle identification method based on single-stage detection model yolov3
CN110929679B (en) GAN-based unsupervised self-adaptive pedestrian re-identification method
CN113011319A (en) Multi-scale fire target identification method and system
Wang et al. LPR-Net: Recognizing Chinese license plate in complex environments
Silva-Rodriguez et al. Self-learning for weakly supervised gleason grading of local patterns
CN107833213A (en) A kind of Weakly supervised object detecting method based on pseudo- true value adaptive method
CN113643228B (en) Nuclear power station equipment surface defect detection method based on improved CenterNet network
CN105930792A (en) Human action classification method based on video local feature dictionary
CN114092742B (en) Multi-angle-based small sample image classification device and method
CN112766170B (en) Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
WO2022062419A1 (en) Target re-identification method and system based on non-supervised pyramid similarity learning
CN114092747A (en) Small sample image classification method based on depth element metric model mutual learning
CN112613428B (en) Resnet-3D convolution cattle video target detection method based on balance loss
CN112597324A (en) Image hash index construction method, system and equipment based on correlation filtering
CN112232395B (en) Semi-supervised image classification method for generating countermeasure network based on joint training
JP2022082493A (en) Pedestrian re-identification method for random shielding recovery based on noise channel
CN114818963B (en) Small sample detection method based on cross-image feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210527

Address after: 400714 No. 266 Fangzheng Road, Beibei District, Chongqing.

Applicant after: Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences

Applicant after: Chongqing University

Address before: 400714 No. 266 Fangzheng Road, Beibei District, Chongqing.

Applicant before: Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences

CB03 Change of inventor or designer information

Inventor after: Zhang Xuerui

Inventor after: Yao Yuan

Inventor after: Zheng Zhihao

Inventor after: Zhang Fan

Inventor after: Shang Mingsheng

Inventor before: Yao Yuan

Inventor before: Zheng Zhihao

Inventor before: Zhang Xuerui

Inventor before: Zhang Fan

Inventor before: Shang Mingsheng

RJ01 Rejection of invention patent application after publication

Application publication date: 20200825