WO2021056914A1 - Automatic modeling method and device for a target detection model - Google Patents
Automatic modeling method and device for a target detection model
- Publication number
- WO2021056914A1 (application PCT/CN2019/130024)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target detection
- model
- detection model
- network
- parameters
- Prior art date
Classifications
- G06V10/40—Extraction of image or video features
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
- G06V10/776—Validation; Performance evaluation
- G06V10/82—Arrangements for image or video recognition or understanding using neural networks
- G06V2201/07—Target detection
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/048—Activation functions
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/09—Supervised learning
- G06N3/092—Reinforcement learning
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
- G06N20/20—Ensemble learning
- Y02T10/40—Engine management systems
Definitions
- the invention relates to the field of target detection, in particular to an automatic modeling method and device in the field of target detection.
- object detection has a wide range of applications in traffic monitoring, image retrieval, human-computer interaction, and so on. It aims to detect the target objects of interest in a static image (or dynamic video).
- the more popular algorithms include Yolo, SSD, R-CNN algorithms and so on.
- existing target detection algorithms use a fixed network structure to extract image features and cannot use different network structures to extract the most suitable image features for different tasks and data characteristics, so a manually designed target detection model can only achieve high accuracy on specific tasks and lacks flexibility.
- the technical problem to be solved by the present invention is to provide an automatic modeling method of a target detection model, which can search for different models according to different tasks and improve the target detection effect.
- an embodiment of the present invention provides an automatic modeling method for a target detection model, including:
- S1: search for a NAS network model by neural architecture search according to a predetermined first neural network;
- S2: use the training set to train the first target detection model; when the number of training iterations reaches a first preset number, use the verification set to evaluate the current first target detection model and output the evaluation result; the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model;
- S3: calculate the target detection task reward corresponding to the current first target detection model;
- S4: adjust the parameters used to search the NAS network model according to the reward;
- Steps S1-S4 are repeated a second preset number of times, and the optimal first target detection model is determined according to the evaluation results.
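The S1-S4 search loop can be sketched in Python. Everything below (the function names `search_nas_model`, `train_and_evaluate`, `compute_reward`, `update_controller`, and the stubbed demo values) is hypothetical scaffolding to show the control flow, not the patent's actual implementation:

```python
# Hypothetical sketch of the S1-S4 loop: search, train/evaluate,
# compute a reward, update the search parameters, keep the best model.
def automatic_modeling(search_nas_model, train_and_evaluate,
                       compute_reward, update_controller,
                       second_preset_times):
    best_model, best_map = None, float("-inf")
    for _ in range(second_preset_times):
        nas_model = search_nas_model()                     # S1: search NAS backbone
        fused, map_score = train_and_evaluate(nas_model)   # S2: train + validate
        reward = compute_reward(fused)                     # S3: task reward
        update_controller(reward)                          # S4: adjust search params
        if map_score > best_map:                           # keep the largest MAP
            best_model, best_map = fused, map_score
    return best_model, best_map

# Demo with stubbed components (purely illustrative):
_scores = iter([0.3, 0.7, 0.5])
best, best_score = automatic_modeling(
    search_nas_model=lambda: "nas-arch",
    train_and_evaluate=lambda nas: ("fused-" + nas, next(_scores)),
    compute_reward=lambda model: 1.0,
    update_controller=lambda reward: None,
    second_preset_times=3)
```

The loop keeps whichever fused model achieved the largest validation score across the repetitions, matching the "largest MAP within the second preset number of times" criterion.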
- the first neural network is a recurrent neural network RNN;
- Step S1 includes:
- Step S11: sample the recurrent neural network (RNN) controller according to the pre-configured hyperparameters to obtain the sub-network structure encoding; the hyperparameters include sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters include the number of sub-network layers, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters include parameters related to the controller learning rate and optimizer configuration parameters;
- Step S12: decode the sub-network structure encoding through a decoder and output the NAS network model.
- the NAS network model is a NAS network without a fully connected layer
- the fusion of the NAS network and the detection part of the first target detection model refers to:
- Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
- the method includes:
- the training set is used to train the first target detection model with the best evaluation result, and the trained model is then tested on the test set.
- the second target detection model includes Yolo.
- using the verification set to evaluate the current first target detection model includes: performing mean average precision MAP evaluation of the current first target detection model on the verification set;
- Determining the optimal first target detection model according to the evaluation result includes:
- the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
- calculating the target detection task reward reward corresponding to the current first target detection model includes:
- L_cla denotes the classification loss, L_reg denotes the regression loss, and the baseline is a preset value or formula.
- adjusting the parameters used for searching the NAS network model according to the reward includes:
- in the backpropagation gradient, θ represents the trainable parameters, γ represents a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model searched when step S1 is executed for the t-th time, and R_t represents the reward value corresponding to that NAS network model.
- an embodiment of the present invention also provides an automatic modeling device for a target detection model, including: a memory and a processor;
- the memory is used to store a program for automatic modeling of a target detection model
- the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
- S1: search for a NAS network model by neural architecture search according to a predetermined first neural network;
- S2: use the training set to train the first target detection model; when the number of training iterations reaches a first preset number, use the verification set to evaluate the current first target detection model and output the evaluation result; the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model;
- S3: calculate the target detection task reward corresponding to the current first target detection model;
- S4: adjust the parameters used to search the NAS network model according to the reward;
- Steps S1-S4 are repeated a second preset number of times, and the optimal first target detection model is determined according to the evaluation results.
- the first neural network is a recurrent neural network RNN;
- Step S1 includes:
- Step S11: sample the recurrent neural network (RNN) controller according to the pre-configured hyperparameters to obtain the sub-network structure encoding; the hyperparameters include sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters include the number of sub-network layers, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters include parameters related to the controller learning rate and optimizer configuration parameters;
- Step S12: decode the sub-network structure encoding through a decoder and output the NAS network model.
- the NAS network model is a NAS network without a fully connected layer
- the fusion of the NAS network and the detection part of the first target detection model refers to:
- Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
- the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
- the training set is used to train the first target detection model with the best evaluation result, and the trained model is then tested on the test set.
- the second target detection model includes Yolo.
- using the verification set to evaluate the current first target detection model includes: performing mean average precision MAP evaluation of the current first target detection model on the verification set;
- Determining the optimal first target detection model according to the evaluation result includes:
- the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
- calculating the target detection task reward corresponding to the current first target detection model includes:
- L_cla denotes the classification loss, L_reg denotes the regression loss, and the baseline is a preset value or formula.
- adjusting the parameters used for searching the NAS network model according to the reward includes:
- in the backpropagation gradient, θ represents the trainable parameters, γ represents a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model searched when step S1 is executed for the t-th time, and R_t represents the reward value corresponding to that NAS network model.
- the embodiments of the present invention provide an automatic modeling method and device for a target detection model.
- a new target detection model is formed by fusing the feature extraction model searched for a given task with a prior-art target detection model, improving the target detection effect.
- Fig. 1 is a schematic diagram of an automatic modeling method of a target detection model according to an embodiment of the present invention.
- Fig. 2 is a flowchart of automatic modeling of a target detection model according to an embodiment of the present invention.
- Fig. 3 is a schematic diagram of Yolo3 according to an embodiment of the present invention.
- Fig. 4 is a schematic diagram of automatic modeling of a Yolo3-NAS model according to an embodiment of the present invention.
- Fig. 5 is a schematic diagram of an automatic modeling device for a target detection model according to an embodiment of the present invention.
- Fig. 1 is a schematic diagram of an automatic modeling method of a target detection model according to an embodiment of the present invention. As shown in Fig. 1, the automatic modeling method of this embodiment includes:
- the first neural network may be a recurrent neural network RNN;
- the NAS network model refers to a neural architecture search (Neural Architecture Search) network model
- Step S1 may include:
- Step S11: sample the recurrent neural network (RNN) controller according to the pre-configured hyperparameters to obtain the sub-network structure encoding; the hyperparameters include sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters include the number of sub-network layers, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters include parameters related to the controller learning rate and optimizer configuration parameters;
- Step S12: decode the sub-network structure encoding through a decoder and output the NAS network model.
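Steps S11-S12 (controller sampling followed by decoding) can be illustrated with a minimal sketch. The op vocabulary and the uniform sampling are assumptions for illustration; a real RNN controller emits the tokens sequentially, conditioned on earlier choices:

```python
import random

# Hypothetical sketch of S11-S12: sample a sub-network structure
# encoding, then decode it into a readable network description.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]  # assumed search space

def sample_encoding(num_layers, num_branches, rng=random.Random(0)):
    # S11: one token per decision. Here tokens are drawn uniformly just to
    # show the shape of the encoding; an RNN controller would produce them.
    return [[rng.randrange(len(OPS)) for _ in range(num_branches)]
            for _ in range(num_layers)]

def decode(encoding):
    # S12: decoder maps token indices to concrete cell operations.
    return [[OPS[tok] for tok in cell] for cell in encoding]

enc = sample_encoding(num_layers=3, num_branches=2)
model_spec = decode(enc)
```

The encoding is a nested list of integers (one sub-list per layer/cell); the decoder turns it into the structure from which the NAS network model is built.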
- use the training set to train the first target detection model; when the number of training iterations reaches the first preset number, use the verification set to evaluate the current first target detection model and output the evaluation result; the first target detection model is a model obtained by fusing the NAS network model with the detection part of the second target detection model;
- the NAS network model is a NAS network without a fully connected layer
- the fusion of the NAS network and the detection part of the first target detection model refers to:
- Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
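The multi-scale fusion just described can be shown with a small numpy sketch: a 1*1 convolution is a per-pixel linear map across channels, which adapts each NAS output scale to the channel count the detection head expects. The channel counts and spatial sizes below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def conv1x1(feature_map, weights):
    # feature_map: (C_in, H, W); weights: (C_out, C_in).
    # A 1*1 convolution mixes channels independently at every pixel,
    # which is just a matrix product over the flattened spatial grid.
    c_in, h, w = feature_map.shape
    flat = feature_map.reshape(c_in, h * w)        # (C_in, H*W)
    return (weights @ flat).reshape(-1, h, w)      # (C_out, H, W)

rng = np.random.default_rng(0)
# Three assumed NAS output scales, each with 64 channels:
nas_outputs = [rng.normal(size=(64, s, s)) for s in (52, 26, 13)]
# Each scale gets its own 1*1 conv projecting to the head's input channels:
heads_in = [conv1x1(f, rng.normal(size=(255, 64))) for f in nas_outputs]
```

Each projected tensor then serves as the input of the detection part of the second target detection model at the corresponding scale.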
- using the verification set to evaluate the current first target detection model may include: performing mean average precision MAP evaluation of the current first target detection model on the verification set;
- the second target detection model may include Yolo.
- Yolo stands for You Only Look Once.
- Yolo is an object recognition and localization algorithm based on deep neural networks. It is a one-stage algorithm: it is applied directly to the input image and outputs the categories and corresponding locations.
- determining the optimal first target detection model according to the evaluation result may include:
- the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
- calculating the target detection task reward corresponding to the current first target detection model includes:
- L_cla denotes the classification loss, L_reg denotes the regression loss, and the baseline is a preset value or formula.
- adjusting the parameters used for searching the NAS network model according to the reward includes:
- in the backpropagation gradient, θ represents the trainable parameters, γ represents a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model searched when step S1 is executed for the t-th time, and R_t represents the reward value corresponding to that NAS network model.
- the parameters used by the NAS network model may include the trainable parameters of the RNN controller.
- determining the optimal first target detection model according to the evaluation result may include:
- the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
- after the optimal first target detection model is determined, the method may include:
- the training set is used to train the first target detection model with the best evaluation result, and the trained model is then tested on the test set.
- Fig. 2 is a flowchart of automatic modeling of a target detection model according to an embodiment of the present invention. As shown in Fig. 2, the steps are as follows:
- Step 201 Initialize input.
- Initialization input includes hyper-parameter configuration and reading the data set from the database.
- the hyper-parameters may include sub-network hyper-parameters and controller hyper-parameters.
- the sub-network hyperparameters mainly include the number of layers of the sub-network, the number of cell branches, the parameters related to the model learning rate, and the number of output channels.
- the number of layers of the sub-network is the number of cells, and the parameters related to the learning rate of the sub-network refer to the decay rate and decay steps in exponential decay.
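The exponentially decayed sub-network learning rate mentioned above is commonly computed as `lr * decay_rate ** (step / decay_steps)`; a minimal sketch of that schedule (the staircase variant is an added assumption, not stated in the text):

```python
def exponential_decay(initial_lr, decay_rate, decay_steps, step,
                      staircase=False):
    # lr = initial_lr * decay_rate ** (step / decay_steps)
    exponent = step / decay_steps
    if staircase:
        exponent = step // decay_steps  # decay in discrete jumps instead
    return initial_lr * decay_rate ** exponent

lr0 = exponential_decay(0.1, 0.5, 100, 0)    # at step 0: the initial rate
lr1 = exponential_decay(0.1, 0.5, 100, 100)  # after decay_steps: halved
```

Here `decay_rate` and `decay_steps` correspond to the decay rate and decay steps named above as the sub-network's learning-rate-related parameters.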
- the aforementioned data set may include a training set, a verification set, and a test set.
- Step 202 Using the RNN network as the controller, sampling and outputting the sub-network structure code.
- Step 203 Output the NAS network model through the decoder, and merge it with the output part of Yolo3 to form a Yolo3-NAS model.
- the current Yolo3 uses the Darknet-53 network structure to extract image features.
- its detection part draws on the idea of FPN (feature pyramid networks).
- the present invention replaces the feature extraction network Darknet-53 in Yolo3 with an automatically searched NAS network (a stack of normal cells and reduction cells): the fully connected layer of the NAS network model is removed, a 1*1 convolution is added at each of the multiple outputs of the NAS network model, and these outputs are merged with Yolo's detection output part to form a Yolo3-NAS model. It should be noted that this embodiment is described on the basis of Yolo3; practical applications are not limited to Yolo3, and other target detection models can also be improved in this way.
- Step 204 Train the Yolo3-NAS model on the training set, and after the preset number of training times is reached, verify the Yolo3-NAS model through the validation set, and output the evaluation result.
- the evaluation result can be the mean average precision MAP.
- MAP evaluation is an existing technique and will not be repeated here.
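For readers unfamiliar with MAP, a simplified sketch follows. It assumes detections have already been matched to ground truth at a single IoU threshold and sorted by confidence; both simplifications are assumptions, not the full evaluation protocol:

```python
def average_precision(is_tp, num_gt):
    # is_tp: detections sorted by descending confidence; True if matched
    # to a ground-truth box. Precision is sampled at each true positive.
    tp = fp = 0
    precisions = []
    for flag in is_tp:
        tp += flag
        fp += not flag
        if flag:
            precisions.append(tp / (tp + fp))
    return sum(precisions) / num_gt if num_gt else 0.0

def mean_average_precision(per_class):
    # per_class: list of (is_tp list, num_gt) pairs, one per class;
    # MAP is the mean of the per-class average precisions.
    aps = [average_precision(t, n) for t, n in per_class]
    return sum(aps) / len(aps)

map_score = mean_average_precision([
    ([True, True], 2),          # class detected perfectly
    ([True, False, True], 2),   # one false positive in between
])
```

The model whose MAP is largest across the search repetitions is the one kept as the optimal first target detection model.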
- Step 205 Calculate the target detection task reward corresponding to the Yolo3-NAS model.
- L_cla denotes the classification loss, L_reg denotes the regression loss, and the baseline is a preset value or formula.
- Target detection has two parts: classification and regression. Classification concerns whether the target's category is correctly predicted, and regression concerns whether the target's position is correct.
- the loss corresponding to classification and regression is classification loss and regression loss.
- the present invention expands the reward range through a function (from (-1, 1) to (-∞, ∞)), so that the controller parameters are updated faster, that is, the optimal model structure is found faster.
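The text does not name the expanding function; arctanh is one mapping that takes (-1, 1) onto (-∞, ∞), so the sketch below uses it purely as an assumption:

```python
import math

def expand_reward(r, eps=1e-6):
    # arctanh maps (-1, 1) onto (-inf, inf); clipping guards the endpoints.
    # ASSUMPTION: the patent does not name the function; arctanh is one
    # candidate with exactly this range-expanding property.
    r = max(-1 + eps, min(1 - eps, r))
    return math.atanh(r)

# Near the boundaries the expanded reward grows without bound, giving the
# controller a stronger update signal than the raw bounded reward.
```

Any strictly increasing bijection from (-1, 1) to the real line would have the same qualitative effect of amplifying rewards near the extremes.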
- Step 206 Feedback the reward to the controller, and update the trainable parameters of the controller.
- θ represents the trainable parameters of the RNN controller
- γ represents a preset parameter
- log π_θ(s_t, a_t) can be understood as the cross-entropy loss corresponding to the t-th structure (the NAS network model searched during the t-th execution of step S1).
- R_t is the reward value corresponding to the t-th structure (the NAS network model searched when step S1 is executed for the t-th time).
- the reward feeds back whether the gradient computed from the cross-entropy is trustworthy. If the reward is small or negative, the gradient descent is heading in the wrong direction and the parameters should be updated in the opposite direction; if the reward is positive or large, the gradient descent is heading in the right direction and the parameters continue to be updated in that direction.
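The controller update described in steps 205-206 follows the REINFORCE pattern; a minimal sketch with a softmax policy over a few discrete actions (the policy form, action count, and learning rate are assumptions for illustration):

```python
import math

def softmax(theta):
    # Numerically stable softmax over the controller's logits.
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_update(theta, action, reward, alpha=0.1):
    # REINFORCE step: theta <- theta + alpha * R * d(log pi(a))/d(theta).
    # For a softmax policy, d log pi(a) / d theta_i = 1{i == a} - pi(i).
    probs = softmax(theta)
    return [t + alpha * reward * ((i == action) - p)
            for i, (t, p) in enumerate(zip(theta, probs))]

theta = [0.0, 0.0, 0.0]
theta = reinforce_update(theta, action=1, reward=1.0)  # positive reward
```

A positive reward raises the probability of the sampled action, and a negative reward lowers it, which is exactly the "update in this direction / the other direction" behavior described above.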
- Step 207 Repeat the above steps 202-206 a preset number of times; the model with the largest evaluation result within the preset number of times is taken as the best searched sub-network model (that is, the model obtained by fusing the NAS network model with the detection part of the second target detection model), which is then retrained and tested for its final effect on the test set.
- the embodiments of the present invention turn target detection modeling into an automatic search for the best model: different models can be searched out according to different tasks, thereby improving the detection effect in a targeted manner.
- FIG. 5 is a schematic diagram of an automatic modeling device of a target detection model according to an embodiment of the present invention.
- the automatic modeling device of this embodiment includes: a memory and a processor;
- the memory is used to store a program for automatic modeling of a target detection model
- the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
- S1: search for a NAS network model by neural architecture search according to a predetermined first neural network;
- S2: use the training set to train the first target detection model; when the number of training iterations reaches a first preset number, use the verification set to evaluate the current first target detection model and output the evaluation result; the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model;
- S3: calculate the target detection task reward corresponding to the current first target detection model;
- S4: adjust the parameters used to search the NAS network model according to the reward;
- Steps S1-S4 are repeated a second preset number of times, and the optimal first target detection model is determined according to the evaluation results.
- the first neural network is a recurrent neural network RNN;
- step S1 includes:
- Step S11: sample the recurrent neural network (RNN) controller according to the pre-configured hyperparameters to obtain the sub-network structure encoding; the hyperparameters include sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters include the number of sub-network layers, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters include parameters related to the controller learning rate and optimizer configuration parameters;
- Step S12: decode the sub-network structure encoding through a decoder and output the NAS network model.
- the NAS network model is a NAS network without a fully connected layer
- the fusion of the NAS network and the detection part of the first target detection model refers to:
- Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
- the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
- the training set is used to train the first target detection model with the best evaluation result, and the trained model is then tested on the test set.
- the second target detection model includes Yolo.
- using the verification set to evaluate the current first target detection model includes: performing mean average precision MAP evaluation of the current first target detection model on the verification set;
- determining the optimal first target detection model according to the evaluation result includes:
- the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
- calculating the target detection task reward corresponding to the current first target detection model includes:
- L_cla denotes the classification loss, L_reg denotes the regression loss, and the baseline is a preset value or formula.
- adjusting the parameters used for searching the NAS network model according to the reward includes:
- in the backpropagation gradient, θ represents the trainable parameters, γ represents a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model searched when step S1 is executed for the t-th time, and R_t represents the reward value corresponding to that NAS network model.
Claims (16)
- An automatic modeling method for a target detection model, characterized by comprising: S1, searching for a NAS network model by neural architecture search according to a predetermined first neural network; S2, training a first target detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first target detection model with a validation set and outputting the evaluation result, wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model; S3, calculating the target detection task reward corresponding to the current first target detection model; S4, adjusting the parameters used to search the NAS network model according to the reward; repeating steps S1-S4 a second preset number of times, and determining the optimal first target detection model according to the evaluation results.
- The method according to claim 1, characterized in that the first neural network is a recurrent neural network RNN, and step S1 comprises: step S11, sampling through the RNN controller according to pre-configured hyperparameters to obtain a sub-network structure encoding, wherein the hyperparameters include sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters include the number of sub-network layers, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters include parameters related to the controller learning rate and optimizer configuration parameters; step S12, decoding the sub-network structure encoding through a decoder and outputting the NAS network model.
- The method according to claim 2, characterized in that the NAS network model is a NAS network without a fully connected layer; fusing the NAS network with the detection part of the first target detection model means: each of multiple outputs of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer serves as the input of the detection part of the second target detection model.
- The method according to claim 1, characterized in that, after the optimal first target detection model is determined according to the evaluation result, the method comprises: training the first target detection model with the best evaluation result on the training set, and testing the trained first target detection model with the best evaluation result on the test set.
- The method according to claim 1, characterized in that the second target detection model comprises Yolo.
- The method according to claim 1, characterized in that evaluating the current first target detection model with the validation set comprises: performing mean average precision MAP evaluation of the current first target detection model with the validation set; and determining the optimal first target detection model according to the evaluation result comprises: determining the first target detection model corresponding to the largest MAP within the second preset number of times as the optimal first target detection model.
- An automatic modeling device for a target detection model, comprising a memory and a processor, characterized in that: the memory is used to store a program for automatic modeling of a target detection model; the processor is configured to read and execute the program for automatic modeling of the target detection model and perform the following operations: S1, searching for a NAS network model by neural architecture search according to a predetermined first neural network; S2, training a first target detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first target detection model with a validation set and outputting the evaluation result, wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model; S3, calculating the target detection task reward corresponding to the current first target detection model; S4, adjusting the parameters used to search the NAS network model according to the reward; repeating steps S1-S4 a second preset number of times, and determining the optimal first target detection model according to the evaluation results.
- The device according to claim 9, characterized in that the first neural network is a recurrent neural network RNN, and step S1 comprises: step S11, sampling through the RNN controller according to pre-configured hyperparameters to obtain a sub-network structure encoding, wherein the hyperparameters include sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters include the number of sub-network layers, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters include parameters related to the controller learning rate and optimizer configuration parameters; step S12, decoding the sub-network structure encoding through a decoder and outputting the NAS network model.
- The device according to claim 10, characterized in that the NAS network model is a NAS network without a fully connected layer; fusing the NAS network with the detection part of the first target detection model means: each of multiple outputs of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer serves as the input of the detection part of the second target detection model.
- The device according to claim 9, characterized in that the processor is configured to read and execute the program for automatic modeling of the target detection model and perform the following operations: after the optimal first target detection model is determined according to the evaluation result, training the first target detection model with the best evaluation result on the training set, and testing the trained first target detection model with the best evaluation result on the test set.
- The device according to claim 9, characterized in that the second target detection model comprises Yolo.
- The device according to claim 9, characterized in that evaluating the current first target detection model with the validation set comprises: performing mean average precision MAP evaluation of the current first target detection model with the validation set; and determining the optimal first target detection model according to the evaluation result comprises: determining the first target detection model corresponding to the largest MAP within the second preset number of times as the optimal first target detection model.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19946971.9A EP4036796A4 (en) | 2019-09-25 | 2019-12-30 | AUTOMATIC MODELING METHOD AND APPARATUS FOR OBJECT DETECTION MODEL |
JP2022517307A JP7335430B2 (ja) | 2019-09-25 | 2019-12-30 | 目標検出モデルの自動モデリング方法及び装置 |
KR1020227009936A KR20220051383A (ko) | 2019-09-25 | 2019-12-30 | 표적 검출 모델의 자동 모델링 방법 및 장치 |
US17/642,816 US20220383627A1 (en) | 2019-09-25 | 2019-12-30 | Automatic modeling method and device for object detection model |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910912868.1 | 2019-09-25 | ||
CN201910912868.1A CN110705573A (zh) | 2019-09-25 | 2019-09-25 | Automatic modeling method and device for object detection model |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021056914A1 (zh) | 2021-04-01 |
Family
ID=69196577
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/130024 WO2021056914A1 (zh) | Automatic modeling method and device for object detection model | 2019-09-25 | 2019-12-30 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20220383627A1 (zh) |
EP (1) | EP4036796A4 (zh) |
JP (1) | JP7335430B2 (zh) |
KR (1) | KR20220051383A (zh) |
CN (1) | CN110705573A (zh) |
WO (1) | WO2021056914A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117291845A (zh) * | 2023-11-27 | 2023-12-26 | 成都理工大学 | Point cloud ground filtering method and system, electronic device, and storage medium |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738098B (zh) * | 2020-05-29 | 2022-06-17 | 浪潮(北京)电子信息产业有限公司 | Vehicle recognition method, apparatus, device, and storage medium |
CN113869521A (zh) * | 2020-06-30 | 2021-12-31 | 华为技术有限公司 | Method, apparatus, computing device, and storage medium for building a prediction model |
CN111930795B (zh) * | 2020-07-02 | 2022-11-29 | 苏州浪潮智能科技有限公司 | Distributed model search method and system |
CN112149551A (zh) * | 2020-09-21 | 2020-12-29 | 上海孚聪信息科技有限公司 | Safety-helmet recognition method based on embedded devices and deep learning |
CN116821513B (zh) * | 2023-08-25 | 2023-11-10 | 腾讯科技(深圳)有限公司 | Parameter search method, apparatus, device, and medium for recommendation scenarios |
CN117036869B (zh) * | 2023-10-08 | 2024-01-09 | 之江实验室 | Model training method and device based on diversity and random strategies |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120119987A1 (en) * | 2010-11-12 | 2012-05-17 | Soungmin Im | Method and apparatus for performing gesture recognition using object in multimedia devices |
CN109063759A (zh) * | 2018-07-20 | 2018-12-21 | 浙江大学 | Neural architecture search method applied to multi-attribute image prediction |
CN109325454A (zh) * | 2018-09-28 | 2019-02-12 | 合肥工业大学 | Real-time static gesture recognition method based on YOLOv3 |
CN109788222A (zh) * | 2019-02-02 | 2019-05-21 | 视联动力信息技术股份有限公司 | Method and device for processing video-networking video |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7023613B2 (ja) | 2017-05-11 | 2022-02-22 | キヤノン株式会社 | Image recognition device and learning device |
CN117892774A (zh) * | 2017-07-21 | 2024-04-16 | 谷歌有限责任公司 | Neural architecture search for convolutional neural networks |
EP3688673A1 (en) | 2017-10-27 | 2020-08-05 | Google LLC | Neural architecture search |
CN107886117A (zh) | 2017-10-30 | 2018-04-06 | 国家新闻出版广电总局广播科学研究院 | Object detection algorithm based on multi-feature extraction and multi-task fusion |
FR3082963A1 (fr) * | 2018-06-22 | 2019-12-27 | Amadeus S.A.S. | System and method for evaluating and deploying unsupervised or semi-supervised machine learning models |
US10713491B2 (en) * | 2018-07-27 | 2020-07-14 | Google Llc | Object detection using spatio-temporal feature maps |
- 2019-09-25 CN CN201910912868.1A patent/CN110705573A/zh not_active Withdrawn
- 2019-12-30 WO PCT/CN2019/130024 patent/WO2021056914A1/zh unknown
- 2019-12-30 JP JP2022517307A patent/JP7335430B2/ja active Active
- 2019-12-30 US US17/642,816 patent/US20220383627A1/en active Pending
- 2019-12-30 KR KR1020227009936A patent/KR20220051383A/ko unknown
- 2019-12-30 EP EP19946971.9A patent/EP4036796A4/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120119987A1 (en) * | 2010-11-12 | 2012-05-17 | Soungmin Im | Method and apparatus for performing gesture recognition using object in multimedia devices |
CN109063759A (zh) * | 2018-07-20 | 2018-12-21 | 浙江大学 | Neural architecture search method applied to multi-attribute image prediction |
CN109325454A (zh) * | 2018-09-28 | 2019-02-12 | 合肥工业大学 | Real-time static gesture recognition method based on YOLOv3 |
CN109788222A (zh) * | 2019-02-02 | 2019-05-21 | 视联动力信息技术股份有限公司 | Method and device for processing video-networking video |
Non-Patent Citations (1)
Title |
---|
See also references of EP4036796A4 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117291845A (zh) * | 2023-11-27 | 2023-12-26 | 成都理工大学 | Point cloud ground filtering method and system, electronic device, and storage medium |
CN117291845B (zh) * | 2023-11-27 | 2024-03-19 | 成都理工大学 | Point cloud ground filtering method and system, electronic device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
EP4036796A1 (en) | 2022-08-03 |
JP2022548293A (ja) | 2022-11-17 |
CN110705573A (zh) | 2020-01-17 |
EP4036796A4 (en) | 2023-10-18 |
JP7335430B2 (ja) | 2023-08-29 |
KR20220051383A (ko) | 2022-04-26 |
US20220383627A1 (en) | 2022-12-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021056914A1 (zh) | Automatic modeling method and device for object detection model | |
US11651259B2 (en) | Neural architecture search for convolutional neural networks | |
US11669744B2 (en) | Regularized neural network architecture search | |
KR102114564B1 (ko) | Learning system, learning device, learning method, learning program, training-data creation device, training-data creation method, training-data creation program, terminal device, and threshold changing device | |
CN109242149B (zh) | Early-warning method and system for student performance based on educational data mining | |
CN112232476B (zh) | Method and device for updating a test sample set | |
WO2021238262A1 (zh) | Vehicle recognition method, apparatus, device, and storage medium | |
TW201626293A (zh) | Knowledge-graph biased classification for data | |
US11416743B2 (en) | Swarm fair deep reinforcement learning | |
JP2022508091A (ja) | 動的再構成訓練コンピュータアーキテクチャ | |
US20210073628A1 (en) | Deep neural network training method and apparatus, and computer device | |
WO2022246843A1 (zh) | Risk assessment method and apparatus for software project, computer device, and storage medium | |
CN114842343A (zh) | ViT-based aerial image recognition method | |
CN112749737A (zh) | Image classification method and apparatus, electronic device, and storage medium | |
CN117173568A (zh) | Object detection model training method and object detection method | |
CN115062779A (zh) | Event prediction method and device based on dynamic knowledge graph | |
CN114627402A (zh) | Cross-modal video moment localization method and system based on spatio-temporal graphs | |
CN111242176B (zh) | Method and device for processing computer vision tasks, and electronic system | |
CN112580581A (зh) | Object detection method and device, and electronic device | |
CN117152528A (зh) | Insulator state recognition method, apparatus, device, storage medium, and program product | |
CN113761337B (зh) | Event prediction method and device based on implicit event elements and explicit connections | |
CN115410250A (зh) | Array-based facial beauty prediction method, device, and storage medium | |
CN114970732A (зh) | Posterior calibration method and apparatus for a classification model, computer device, and medium | |
CN112861474A (зh) | Information annotation method, apparatus, device, and computer-readable storage medium | |
CN112861689A (зh) | Search method and device for a coordinate recognition model based on NAS technology | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19946971 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2022517307 Country of ref document: JP Kind code of ref document: A |
ENP | Entry into the national phase |
Ref document number: 20227009936 Country of ref document: KR Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
ENP | Entry into the national phase |
Ref document number: 2019946971 Country of ref document: EP Effective date: 20220425 |