WO2021056914A1 - Automatic modeling method and device for a target detection model - Google Patents


Info

Publication number
WO2021056914A1
Authority
WO
WIPO (PCT)
Prior art keywords
target detection
model
detection model
network
parameters
Application number
PCT/CN2019/130024
Other languages
English (en)
French (fr)
Inventor
刘红丽
李峰
刘鑫
Original Assignee
苏州浪潮智能科技有限公司
Application filed by 苏州浪潮智能科技有限公司
Priority to EP19946971.9A (EP4036796A4)
Priority to JP2022517307A (JP7335430B2)
Priority to KR1020227009936A (KR20220051383A)
Priority to US17/642,816 (US20220383627A1)
Publication of WO2021056914A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/092Reinforcement learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • the invention relates to the field of target detection, in particular to an automatic modeling method and device in the field of target detection.
  • object detection has a wide range of applications in traffic monitoring, image retrieval, human-computer interaction, and so on. It aims to detect the target objects of interest in a static image (or dynamic video).
  • the more popular algorithms include Yolo, SSD, R-CNN algorithms and so on.
  • existing target detection algorithms use a fixed network structure to extract image features and cannot use different network structures to extract the most suitable image features according to different tasks and data characteristics, so a manually designed target detection model can only achieve high accuracy on a specific task and lacks flexibility.
  • the technical problem to be solved by the present invention is to provide an automatic modeling method of a target detection model, which can search for different models according to different tasks and improve the target detection effect.
  • an embodiment of the present invention provides an automatic modeling method of a target detection model, including:
  • S1: searching for a neural architecture search (NAS) network model according to a predetermined first neural network; S2: using the training set to train the first target detection model; when the number of training iterations reaches a first preset number, using the validation set to evaluate the current first target detection model and outputting the evaluation result; wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of the second target detection model; S3: calculating the target detection task reward corresponding to the current first target detection model; S4: adjusting the parameters used for searching the NAS network model according to the reward;
  • Steps S1-S4 are repeated for a second preset number of times, and the optimal first target detection model is determined according to the evaluation result.
  • the first neural network is a recurrent neural network RNN;
  • Step S1 includes:
  • Step S11: sampling via the recurrent neural network (RNN) controller according to pre-configured hyper-parameters to obtain a sub-network structure code; wherein the hyper-parameters include sub-network hyper-parameters and controller hyper-parameters; the sub-network hyper-parameters include the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyper-parameters include parameters related to the controller learning rate and optimizer configuration parameters;
  • Step S12: decoding the sub-network structure code through a decoder, and outputting the NAS network model.
  • the NAS network model is a NAS network without a fully connected layer
  • the fusion of the NAS network with the detection part of the second target detection model refers to:
  • Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
  • the method includes:
  • the training set is used to train the first target detection model with the best evaluation result, and the first target detection model with the best evaluation result after the training is tested on the test set.
  • the second target detection model includes Yolo.
  • using the validation set to evaluate the current first target detection model includes: performing mean average precision (MAP) evaluation of the current first target detection model on the validation set;
  • determining the optimal first target detection model according to the evaluation result includes:
  • determining the first target detection model corresponding to the largest MAP within the second preset number of times as the optimal first target detection model.
  • calculating the target detection task reward reward corresponding to the current first target detection model includes:
  • where L cla is the classification loss, L reg is the regression loss, and the baseline is a preset value or formula.
  • adjusting the parameters used for searching the NAS network model according to the reward includes:
  • where θ represents the trainable parameters of the RNN controller, ∇ θ represents the backpropagation gradient of the trainable parameters, γ represents a preset parameter, logπ θ(s t, a t) is the cross-entropy loss corresponding to the NAS network model searched at the t-th execution of step S1, and R t represents the reward value corresponding to the NAS network model searched at the t-th execution of step S1.
  • an embodiment of the present invention also provides an automatic modeling device for a target detection model, including: a memory and a processor;
  • the memory is used to store a program for automatic modeling of a target detection model
  • the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
  • S1: searching for a neural architecture search (NAS) network model according to a predetermined first neural network; S2: using the training set to train the first target detection model; when the number of training iterations reaches a first preset number, using the validation set to evaluate the current first target detection model and outputting the evaluation result; wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of the second target detection model; S3: calculating the target detection task reward corresponding to the current first target detection model; S4: adjusting the parameters used for searching the NAS network model according to the reward;
  • Steps S1-S4 are repeated for a second preset number of times, and the optimal first target detection model is determined according to the evaluation result.
  • the first neural network is a recurrent neural network RNN;
  • Step S1 includes:
  • Step S11: sampling via the recurrent neural network (RNN) controller according to pre-configured hyper-parameters to obtain a sub-network structure code; wherein the hyper-parameters include sub-network hyper-parameters and controller hyper-parameters; the sub-network hyper-parameters include the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyper-parameters include parameters related to the controller learning rate and optimizer configuration parameters;
  • Step S12: decoding the sub-network structure code through a decoder, and outputting the NAS network model.
  • the NAS network model is a NAS network without a fully connected layer
  • the fusion of the NAS network with the detection part of the second target detection model refers to:
  • Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
  • the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
  • the training set is used to train the first target detection model with the best evaluation result, and the trained first target detection model with the best evaluation result is tested on the test set.
  • the second target detection model includes Yolo.
  • using the validation set to evaluate the current first target detection model includes: performing mean average precision (MAP) evaluation of the current first target detection model on the validation set;
  • determining the optimal first target detection model according to the evaluation result includes:
  • determining the first target detection model corresponding to the largest MAP within the second preset number of times as the optimal first target detection model.
  • calculating the target detection task reward corresponding to the current first target detection model includes:
  • where L cla is the classification loss, L reg is the regression loss, and the baseline is a preset value or formula.
  • adjusting the parameters used for searching the NAS network model according to the reward includes:
  • where θ represents the trainable parameters of the RNN controller, ∇ θ represents the backpropagation gradient of the trainable parameters, γ represents a preset parameter, logπ θ(s t, a t) is the cross-entropy loss corresponding to the NAS network model searched at the t-th execution of step S1, and R t represents the reward value corresponding to the NAS network model searched at the t-th execution of step S1.
  • the embodiments of the present invention provide an automatic modeling method and device for a target detection model.
  • a new target detection model is formed by fusing the feature extraction model searched out according to the task at hand with a target detection model of the prior art, which improves the target detection effect.
  • Fig. 1 is a schematic diagram of an automatic modeling method of a target detection model according to an embodiment of the present invention.
  • Fig. 2 is a flowchart of automatic modeling of a target detection model according to an embodiment of the present invention.
  • Fig. 3 is a schematic diagram of Yolo3 according to an embodiment of the present invention.
  • Fig. 4 is a schematic diagram of automatic modeling of a Yolo3-NAS model according to an embodiment of the present invention.
  • Fig. 5 is a schematic diagram of an automatic modeling device for a target detection model according to an embodiment of the present invention.
  • Fig. 1 is a schematic diagram of an automatic modeling method of a target detection model according to an embodiment of the present invention. As shown in Fig. 1, the automatic modeling method of this embodiment includes:
  • the first neural network may be a recurrent neural network RNN;
  • the NAS network model refers to a neural architecture search (Neural Architecture Search) network model
  • Step S1 may include:
  • Step S11: sampling via the recurrent neural network (RNN) controller according to pre-configured hyper-parameters to obtain a sub-network structure code; wherein the hyper-parameters include sub-network hyper-parameters and controller hyper-parameters; the sub-network hyper-parameters include the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyper-parameters include parameters related to the controller learning rate and optimizer configuration parameters;
  • Step S12: decoding the sub-network structure code through a decoder, and outputting the NAS network model.
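The sample-then-decode loop of steps S11-S12 can be sketched as follows. This is a minimal illustrative sketch: the RNN controller is replaced by a plain random sampler, and the operation names, hyper-parameter values, and encoding scheme are assumptions, not taken from the patent.

```python
import random

# Assumed sub-network hyper-parameters (number of layers, cell branches, channels).
SUBNET_HPARAMS = {"num_layers": 4, "num_branches": 3, "out_channels": 64}

# Candidate operations a cell branch may choose from (an assumed search space).
OPS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]

def sample_structure_code(hparams, rng):
    """Step S11 (stand-in): sample a structure code, one op id per branch per layer."""
    return [
        [rng.randrange(len(OPS)) for _ in range(hparams["num_branches"])]
        for _ in range(hparams["num_layers"])
    ]

def decode(structure_code):
    """Step S12 (stand-in): decode the code into a readable model description."""
    return [[OPS[op_id] for op_id in layer] for layer in structure_code]

rng = random.Random(0)
code = sample_structure_code(SUBNET_HPARAMS, rng)
model = decode(code)
print(model)  # a list of layers, each a list of chosen operations
```

A real controller would emit the code step by step from an RNN's softmax outputs; the random sampler above only mirrors the interface between sampling and decoding.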
  • S2: using the training set to train the first target detection model; when the number of training iterations reaches a first preset number, using the validation set to evaluate the current first target detection model and outputting the evaluation result; wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of the second target detection model;
  • the NAS network model is a NAS network without a fully connected layer
  • the fusion of the NAS network with the detection part of the second target detection model refers to:
  • Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
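The 1*1-convolution fusion described above can be illustrated numerically: a 1*1 convolution mixes only the channels at each spatial position, so it can map each multi-scale NAS output to the channel count the detection part expects while preserving spatial size. The shapes and channel numbers below are illustrative assumptions.

```python
import numpy as np

def conv1x1(feature_map, weights):
    """Apply a 1*1 convolution: (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)."""
    return feature_map @ weights  # per-pixel channel mixing only

rng = np.random.default_rng(0)
# Three NAS outputs at different scales (as in Yolo3's three detection scales);
# sizes and channel counts are assumed for illustration.
nas_outputs = [rng.standard_normal((s, s, c)) for s, c in [(52, 96), (26, 192), (13, 384)]]
head_channels = 256  # assumed channel count expected by the detection part

fused_inputs = [
    conv1x1(f, rng.standard_normal((f.shape[-1], head_channels)))
    for f in nas_outputs
]
print([x.shape for x in fused_inputs])  # spatial sizes preserved, channels unified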
  • using the validation set to evaluate the current first target detection model may include: performing mean average precision (MAP) evaluation of the current first target detection model on the validation set.
  • the second target detection model may include Yolo.
  • Yolo: You Only Look Once
  • Yolo is an object recognition and localization algorithm based on deep neural networks. It is a one-stage algorithm: the network is applied directly to the input image, and the categories and corresponding locations are output.
  • determining the optimal first target detection model according to the evaluation result may include:
  • the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
  • calculating the target detection task reward corresponding to the current first target detection model includes:
  • where L cla is the classification loss, L reg is the regression loss, and the baseline is a preset value or formula.
  • adjusting the parameters used for searching the NAS network model according to the reward includes:
  • where θ represents the trainable parameters of the RNN controller, ∇ θ represents the backpropagation gradient of the trainable parameters, γ represents a preset parameter, logπ θ(s t, a t) is the cross-entropy loss corresponding to the NAS network model searched at the t-th execution of step S1, and R t represents the reward value corresponding to the NAS network model searched at the t-th execution of step S1.
  • the parameters used by the NAS network model may include the trainable parameters of the RNN controller.
  • determining the optimal first target detection model according to the evaluation result may include:
  • the first target detection model corresponding to the largest MAP within the second preset number of times is determined as the optimal first target detection model.
  • after determining the optimal first target detection model, the method may include:
  • the training set is used to train the first target detection model with the best evaluation result, and the first target detection model with the best evaluation result after the training is tested on the test set.
  • Fig. 2 is a flowchart of the automatic modeling of a target detection model according to an embodiment of the present invention. As shown in Fig. 2, the steps are as follows:
  • Step 201 Initialize input.
  • Initialization includes configuring the hyper-parameters and reading the data set from the database.
  • the hyper-parameters may include sub-network hyper-parameters and controller hyper-parameters.
  • the sub-network hyperparameters mainly include the number of layers of the sub-network, the number of cell branches, the parameters related to the model learning rate, and the number of output channels.
  • the number of layers of the sub-network is the number of cells, and the parameters related to the learning rate of the sub-network refer to the decay rate and decay steps in exponential decay.
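The exponential decay mentioned for the sub-network learning rate can be computed as below: the rate is multiplied by the decay rate once per `decay_steps` training steps. The concrete values are illustrative assumptions.

```python
def exponential_decay(base_lr, decay_rate, decay_steps, step, staircase=True):
    """Exponentially decayed learning rate at a given training step."""
    exponent = step // decay_steps if staircase else step / decay_steps
    return base_lr * decay_rate ** exponent

lr0 = 0.1
print(exponential_decay(lr0, 0.5, 100, 0))    # 0.1 (no decay yet)
print(exponential_decay(lr0, 0.5, 100, 100))  # 0.05 (one full decay)
print(exponential_decay(lr0, 0.5, 100, 250))  # 0.025 (two full decays, staircase)
```

The `staircase` flag mirrors the usual choice between stepwise and continuous decay; the patent only names the decay rate and decay steps, so the rest is a common-convention assumption.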
  • the aforementioned data set may include a training set, a verification set, and a test set.
  • Step 202 Using the RNN network as the controller, sampling and outputting the sub-network structure code.
  • Step 203 Output the NAS network model through the decoder, and fuse it with the detection output part of Yolo3 to form the Yolo3-NAS model.
  • the current Yolo3 uses the Darknet-53 network structure to extract image features.
  • for the detection part, it draws on the idea of FPN (feature pyramid networks).
  • the present invention replaces the feature extraction network Darknet-53 in Yolo3 with an automatically searched and generated NAS network (a stack of normal cells and reduction cells): the fully connected layer of the NAS network model is removed, a 1*1 convolution is added at each of the multiple outputs of the NAS network model, and these outputs are merged with Yolo's detection output part to form the Yolo3-NAS model. It should be noted that this embodiment is described on the basis of Yolo3, but practical applications are not limited to Yolo3; other target detection models can also be improved in this way.
  • Step 204 Train the Yolo3-NAS model on the training set, and after the preset number of training times is reached, verify the Yolo3-NAS model through the validation set, and output the evaluation result.
  • the evaluation can use the mean average precision (MAP) metric.
  • the MAP evaluation is an existing technology, and will not be repeated here.
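As a reminder of the quantity behind MAP, a minimal average-precision sketch for a single class is shown below; real evaluators additionally perform IoU matching and interpolation, so this is illustrative only.

```python
def average_precision(is_tp, num_gt):
    """AP for one class: is_tp holds true/false-positive flags of detections
    sorted by confidence; num_gt is the number of ground-truth objects."""
    tp = 0
    precisions = []
    for i, hit in enumerate(is_tp, start=1):
        if hit:
            tp += 1
            precisions.append(tp / i)  # precision at this recall step
    return sum(precisions) / num_gt if num_gt else 0.0

# Perfect ranking of 3 detections covering all 3 ground-truth boxes -> AP = 1.0
print(average_precision([True, True, True], num_gt=3))
# One false positive ranked second: precisions 1/1 and 2/3 -> AP = (1 + 2/3) / 3
print(average_precision([True, False, True], num_gt=3))
```

MAP is then the mean of these per-class AP values over all classes.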
  • Step 205 Calculate the target detection task reward corresponding to the Yolo3-NAS model.
  • where L cla is the classification loss, L reg is the regression loss, and the baseline is a preset value or formula.
  • Target detection has two parts: classification and regression. Classification concerns whether the target's category is correctly predicted, and regression concerns whether the target's position is correctly predicted.
  • the losses corresponding to classification and regression are the classification loss and the regression loss.
  • the present invention expands the range of the reward signal (from (-1, 1) to (-∞, ∞)) through the function, so that the controller parameters are updated faster; that is, the optimal model structure is found faster.
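The patent does not name the expanding function; the inverse hyperbolic tangent (artanh) maps (-1, 1) to (-∞, ∞) exactly as described, so it is used below purely as an illustrative stand-in.

```python
import math

def expand_reward(x, eps=1e-7):
    """Map a raw reward in (-1, 1) to (-inf, inf) via artanh (an assumed choice)."""
    x = max(-1 + eps, min(1 - eps, x))  # guard against the open-interval bounds
    return math.atanh(x)

print(expand_reward(0.0))   # small rewards stay roughly unchanged
print(expand_reward(0.99))  # rewards near 1 are strongly amplified (~2.65)
```

Whatever the actual function, the effect described in the text is the same: near-boundary rewards are amplified, which enlarges the gradient signal fed back to the controller.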
  • Step 206 Feedback the reward to the controller, and update the trainable parameters of the controller.
  • θ represents the trainable parameters of the RNN controller
  • γ represents the preset parameter
  • logπ θ(s t, a t) can be understood as the cross-entropy loss corresponding to the t-th structure (the NAS network model searched during the t-th execution of step S1).
  • R t is the reward value corresponding to the t-th structure (the NAS network model searched during the t-th execution of step S1)
  • the role of the reward is to feed back whether the gradient computed from the cross-entropy is a trustworthy gradient. If the reward is small or negative, the gradient descent is heading in the wrong direction, and the parameters should be updated in the opposite direction; if the reward is large and positive, the gradient descent is heading in the right direction, and the parameters continue to be updated in that direction.
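This direction-flipping behavior can be seen in a one-dimensional REINFORCE-style sketch; all quantities here are scalar stand-ins for the controller's actual parameters and gradients, assumed for illustration.

```python
def update(theta, grad_log_pi, reward, gamma=0.1):
    """theta <- theta + gamma * reward * grad(log pi); gamma is the preset step size."""
    return theta + gamma * reward * grad_log_pi

theta = 0.0
theta_pos = update(theta, grad_log_pi=2.0, reward=1.0)   # good model: move with the gradient
theta_neg = update(theta, grad_log_pi=2.0, reward=-1.0)  # bad model: move against it
print(theta_pos, theta_neg)
```

A negative reward reverses the sign of the step, which is exactly the "update the parameters in the other direction" behavior described above.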
  • Step 207 Repeat the above steps 202-206 a preset number of times; the model with the largest evaluation result within the preset number of times is taken as the best searched sub-network model (that is, the model obtained by fusing the NAS network model with the detection part of the second target detection model), which is then retrained and its final effect tested on the test set.
  • the present invention turns target detection modeling into an automatic search for the best model;
  • different models can be searched out according to different tasks, thereby improving the detection effect in a targeted manner.
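The overall search loop of steps 202-207 has the following shape; sampling, training, and MAP evaluation are replaced by stubs, so the code only illustrates how the best candidate is tracked across repetitions.

```python
import random

def search(num_trials, evaluate):
    """Repeat sample -> build -> evaluate, keeping the candidate with the best score."""
    best_map, best_model = -1.0, None
    for t in range(num_trials):
        model = f"candidate-{t}"  # stands in for sampling + decoding a NAS model
        score = evaluate(model)   # stands in for train-then-validate MAP
        if score > best_map:      # keep the best evaluation seen so far
            best_map, best_model = score, model
        # (reward computation and controller update would happen here)
    return best_model, best_map

rng = random.Random(0)
best_model, best_map = search(20, evaluate=lambda m: rng.random())
print(best_model, round(best_map, 3))
```

The returned model is the one that would then be retrained on the training set and tested on the test set, as step 207 describes.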
  • FIG. 5 is a schematic diagram of an automatic modeling device of a target detection model according to an embodiment of the present invention.
  • the automatic modeling device of this embodiment includes: a memory and a processor;
  • the memory is used to store a program for automatic modeling of a target detection model
  • the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
  • S1: searching for a neural architecture search (NAS) network model according to a predetermined first neural network; S2: using the training set to train the first target detection model; when the number of training iterations reaches a first preset number, using the validation set to evaluate the current first target detection model and outputting the evaluation result; wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of the second target detection model; S3: calculating the target detection task reward corresponding to the current first target detection model; S4: adjusting the parameters used for searching the NAS network model according to the reward;
  • Steps S1-S4 are repeated for a second preset number of times, and the optimal first target detection model is determined according to the evaluation result.
  • the first neural network is a recurrent neural network RNN;
  • step S1 includes:
  • Step S11: sampling via the recurrent neural network (RNN) controller according to pre-configured hyper-parameters to obtain a sub-network structure code; wherein the hyper-parameters include sub-network hyper-parameters and controller hyper-parameters; the sub-network hyper-parameters include the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyper-parameters include parameters related to the controller learning rate and optimizer configuration parameters;
  • Step S12: decoding the sub-network structure code through a decoder, and outputting the NAS network model.
  • the NAS network model is a NAS network without a fully connected layer
  • the fusion of the NAS network with the detection part of the second target detection model refers to:
  • Multiple output terminals of different scales of the NAS network model are each connected to a 1*1 convolutional layer, and the output of the 1*1 convolutional layer is used as the input of the detection part of the second target detection model.
  • the processor is configured to read and execute the program for automatic modeling of the target detection model, and perform the following operations:
  • the training set is used to train the first target detection model with the best evaluation result, and the trained first target detection model with the best evaluation result is tested on the test set.
  • the second target detection model includes Yolo.
  • using the validation set to evaluate the current first target detection model includes: performing mean average precision (MAP) evaluation of the current first target detection model on the validation set;
  • determining the optimal first target detection model according to the evaluation result includes:
  • determining the first target detection model corresponding to the largest MAP within the second preset number of times as the optimal first target detection model.
  • calculating the target detection task reward corresponding to the current first target detection model includes:
  • where L cla is the classification loss, L reg is the regression loss, and the baseline is a preset value or formula.
  • adjusting the parameters used for searching the NAS network model according to the reward includes:
  • where θ represents the trainable parameters of the RNN controller, ∇ θ represents the backpropagation gradient of the trainable parameters, γ represents a preset parameter, logπ θ(s t, a t) is the cross-entropy loss corresponding to the NAS network model searched at the t-th execution of step S1, and R t represents the reward value corresponding to the NAS network model searched at the t-th execution of step S1.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Analysis (AREA)

Abstract

An automatic modeling method for a target detection model, comprising: S1, searching for a NAS network model according to a predetermined first neural network; S2, training a first target detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first target detection model with a validation set and outputting the evaluation result, wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model; S3, calculating the reward corresponding to the current first target detection model; S4, adjusting the parameters used for searching the NAS network model according to the reward; repeating steps S1-S4 a second preset number of times, and determining the optimal first target detection model according to the evaluation results. The invention further discloses an automatic modeling device. The method and device provided by the invention can search out different models for different tasks, improving the target detection effect.

Description

Automatic modeling method and device for a target detection model
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on September 25, 2019, with application number 201910912868.1 and the invention title "Automatic modeling method and device for a target detection model", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of target detection, and in particular to an automatic modeling method and device in the field of target detection.
Background Art
As a classic topic in image processing and computer vision, target detection is widely applied in traffic monitoring, image retrieval, human-computer interaction, and so on. It aims to detect the target objects of interest in a static image (or dynamic video). Currently popular algorithms include Yolo, SSD, and the R-CNN family. However, existing target detection algorithms use a fixed network structure to extract image features and cannot use different network structures to extract the most suitable image features according to different tasks and data characteristics, so a manually designed target detection model can only achieve high accuracy on a specific task and lacks flexibility.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an automatic modeling method for a target detection model that can search out different models for different tasks and improve the target detection effect.
To solve the above technical problem, an embodiment of the present invention provides an automatic modeling method for a target detection model, comprising:
S1. Searching for a neural architecture search (NAS) network model according to a predetermined first neural network;
S2. Training a first target detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first target detection model with a validation set, and outputting the evaluation result; wherein the first target detection model is a model obtained by fusing the NAS network model with the detection part of a second target detection model;
S3. Calculating the target detection task reward corresponding to the current first target detection model;
S4. Adjusting the parameters used for searching the NAS network model according to the target detection task reward;
Repeating steps S1-S4 a second preset number of times, and determining the optimal first target detection model according to the evaluation results.
Preferably, the first neural network is a recurrent neural network (RNN);
step S1 comprises:
Step S11. Sampling via the recurrent neural network (RNN) controller according to pre-configured hyper-parameters to obtain a sub-network structure code; wherein the hyper-parameters include sub-network hyper-parameters and controller hyper-parameters; the sub-network hyper-parameters include the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyper-parameters include parameters related to the controller learning rate and optimizer configuration parameters;
Step S12. Decoding the sub-network structure code through a decoder, and outputting the NAS network model.
Preferably, the NAS network model is a NAS network without a fully connected layer;
the fusion of the NAS network with the detection part of the second target detection model refers to:
each of the multiple outputs of different scales of the NAS network model being connected to a 1*1 convolutional layer, with the output of the 1*1 convolutional layer serving as the input of the detection part of the second target detection model.
Preferably, after determining the optimal first target detection model according to the evaluation results, the method comprises:
training the first target detection model with the best evaluation result on the training set, and testing the trained first target detection model with the best evaluation result on the test set.
Preferably, the second target detection model includes Yolo.
Preferably, evaluating the current first target detection model with the validation set comprises:
performing mean average precision (MAP) evaluation of the current first target detection model on the validation set;
determining the optimal first target detection model according to the evaluation results comprises:
determining the first target detection model corresponding to the largest MAP within the second preset number of times as the optimal first target detection model.
Preferably, calculating the target detection task reward corresponding to the current first target detection model comprises:
Figure PCTCN2019130024-appb-000001
where
Figure PCTCN2019130024-appb-000002
where L cla is the classification loss, L reg is the regression loss, and the baseline is a preset value or formula.
Preferably, adjusting the parameters used for searching the NAS network model according to the reward comprises:
feeding the reward back to the RNN controller, and
updating the trainable parameters of the RNN controller via
Figure PCTCN2019130024-appb-000003
where θ denotes the trainable parameters of the RNN controller,
Figure PCTCN2019130024-appb-000004
denotes the backpropagation gradient of the trainable parameters, γ denotes a preset parameter, logπ θ(s t,a t) is the cross-entropy loss corresponding to the NAS network model searched at the t-th execution of step S1, and R t denotes the reward value corresponding to the NAS network model searched at the t-th execution of step S1.
To solve the above technical problem, an embodiment of the present invention further provides an automatic modeling device for an object detection model, comprising a memory and a processor;
the memory is configured to store a program for automatic modeling of an object detection model;
the processor is configured to read and execute the program for automatic modeling of an object detection model and perform the following operations:
S1, searching for a neural architecture search (NAS) network model according to a predetermined first neural network;
S2, training a first object detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first object detection model with a validation set and outputting an evaluation result; wherein the first object detection model is a model obtained by fusing the NAS network model with the detection part of a second object detection model;
S3, calculating an object detection task reward corresponding to the current first object detection model;
S4, adjusting the parameters used for searching for the NAS network model according to the object detection task reward;
repeating steps S1-S4 a second preset number of times, and determining the optimal first object detection model according to the evaluation results.
Preferably, the first neural network is a recurrent neural network (RNN);
step S1 comprises:
step S11, sampling via an RNN controller according to preconfigured hyperparameters to obtain a sub-network structure encoding; wherein the hyperparameters comprise sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters comprise the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters comprise parameters related to the controller learning rate and optimizer configuration parameters;
step S12, decoding the sub-network structure encoding via a decoder, and outputting the NAS network model.
Preferably, the NAS network model is a NAS network without a fully connected layer;
fusing the NAS network with the detection part of the second object detection model means:
each of a plurality of output ends of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the outputs of the 1*1 convolutional layers serve as inputs to the detection part of the second object detection model.
Preferably, the processor is configured to read and execute the program for automatic modeling of an object detection model and perform the following operations:
after determining the optimal first object detection model according to the evaluation results, training the first object detection model with the best evaluation result on the training set, and testing this trained model on a test set.
Preferably, the second object detection model comprises Yolo.
Preferably, evaluating the current first object detection model with the validation set comprises:
performing mean average precision (MAP) evaluation of the current first object detection model with the validation set;
and determining the optimal first object detection model according to the evaluation results comprises:
determining the first object detection model corresponding to the largest MAP within the second preset number of repetitions as the optimal first object detection model.
Preferably, calculating the object detection task reward corresponding to the current first object detection model comprises:
reward = arctanh(baseline - L)
where
L = L_cla + L_reg
wherein L_cla is the classification loss, L_reg is the regression loss, and the baseline is a preset value or formula.
Preferably, adjusting the parameters used for searching for the NAS network model according to the reward comprises:
feeding the reward back to the RNN controller, and updating the trainable parameters of the RNN controller by
θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t
wherein θ denotes the trainable parameters of the RNN controller, ∇_θ denotes the backpropagation gradient with respect to the trainable parameters, γ denotes a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model found in the t-th execution of step S1, and R_t denotes the reward value corresponding to the NAS network model found in the t-th execution of step S1.
In summary, embodiments of the present invention provide an automatic modeling method and device for an object detection model, which improve the object detection effect by fusing a feature extraction model searched out according to different tasks with an existing object detection model to form a new object detection model.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of an automatic modeling method for an object detection model according to an embodiment of the present invention.
Fig. 2 is a flowchart of automatic modeling of an object detection model according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of the principle of Yolo3 according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of automatic modeling of the Yolo3-NAS model according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of an automatic modeling device for an object detection model according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the drawings. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
Embodiment 1
Fig. 1 is a schematic diagram of an automatic modeling method for an object detection model according to an embodiment of the present invention. As shown in Fig. 1, the automatic modeling method of this embodiment comprises:
S1, searching for a NAS network model according to a predetermined first neural network.
In an exemplary embodiment, the first neural network may be a recurrent neural network (RNN); the NAS network model refers to a Neural Architecture Search network model.
Step S1 may comprise:
step S11, sampling via an RNN controller according to preconfigured hyperparameters to obtain a sub-network structure encoding; wherein the hyperparameters comprise sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters comprise the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters comprise parameters related to the controller learning rate and optimizer configuration parameters;
step S12, decoding the sub-network structure encoding via a decoder, and outputting the NAS network model.
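The sample-then-decode pattern of steps S11-S12 can be sketched as follows. The operation vocabulary, function names, and sizes are illustrative assumptions (the text does not fix them), and a uniform random choice stands in for the RNN controller, which in the real method learns which codes to emit:

```python
import random

# Illustrative operation vocabulary for one cell branch (an assumption,
# not taken from the text).
OPS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]

def sample_encoding(num_layers, num_branches, rng):
    """Stand-in for controller sampling: one op index per branch per layer."""
    return [[rng.randrange(len(OPS)) for _ in range(num_branches)]
            for _ in range(num_layers)]

def decode(encoding):
    """Stand-in for the decoder: turn integer codes back into op names."""
    return [[OPS[i] for i in layer] for layer in encoding]

arch = decode(sample_encoding(num_layers=3, num_branches=2,
                              rng=random.Random(0)))
```

The number of layers and branches here correspond to the sub-network hyperparameters (number of cells, number of cell branches) configured beforehand.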
S2, training a first object detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first object detection model with a validation set and outputting an evaluation result; wherein the first object detection model is a model obtained by fusing the NAS network model with the detection part of a second object detection model.
In an exemplary embodiment, the NAS network model is a NAS network without a fully connected layer.
In an exemplary embodiment, fusing the NAS network with the detection part of the second object detection model means:
each of a plurality of output ends of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the outputs of the 1*1 convolutional layers serve as inputs to the detection part of the second object detection model.
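To illustrate what the 1*1 fusion layer does, the sketch below implements a 1*1 convolution in plain Python as a per-pixel linear map across channels; the function name, the toy feature-map sizes, and the weight values are our own, and in practice the weights are learned:

```python
def conv1x1(feature, weights):
    """1*1 convolution: out[co][y][x] = sum_ci weights[co][ci] * feature[ci][y][x].
    feature: C_in x H x W nested lists; weights: C_out x C_in."""
    c_in = len(feature)
    h, w = len(feature[0]), len(feature[0][0])
    return [[[sum(weights[co][ci] * feature[ci][y][x] for ci in range(c_in))
              for x in range(w)]
             for y in range(h)]
            for co in range(len(weights))]

# Toy 2-channel 2x2 feature map adapted to 3 channels, the way a fusion
# layer adapts one NAS output scale to the channel count the detection
# part expects; spatial size is unchanged.
feature = [[[1, 2], [3, 4]],
           [[5, 6], [7, 8]]]
weights = [[1, 0], [0, 1], [1, 1]]  # illustrative, normally learned
fused = conv1x1(feature, weights)
```

Because the kernel is 1*1, only the channel dimension changes, which is why this layer can bridge an arbitrary searched backbone to a fixed detection head.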
In an exemplary embodiment, evaluating the current first object detection model with the validation set may comprise:
performing mean average precision (MAP) evaluation of the current first object detection model with the validation set;
wherein the second object detection model may comprise Yolo. Yolo (You Only Look Once) is an object recognition and localization algorithm based on deep neural networks. It is a one-stage algorithm, i.e., the algorithm is applied directly to the input image and outputs the categories and the corresponding locations.
In an exemplary embodiment, determining the optimal first object detection model according to the evaluation results may comprise:
determining the first object detection model corresponding to the largest MAP within the second preset number of repetitions as the optimal first object detection model.
S3, calculating the object detection task reward corresponding to the current first object detection model.
In an exemplary embodiment, calculating the object detection task reward corresponding to the current first object detection model comprises:
reward = arctanh(baseline - L)
where
L = L_cla + L_reg
wherein L_cla is the classification loss, L_reg is the regression loss, and the baseline is a preset value or formula.
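Reading the reward as reward = arctanh(baseline - (L_cla + L_reg)) — the arctanh being an assumption consistent with the stretch from (-1, 1) to (-∞, ∞) described in Embodiment 2 — a minimal sketch looks like this; the clipping to keep arctanh finite is our own safeguard:

```python
import math

def detection_reward(l_cla, l_reg, baseline):
    """reward = arctanh(baseline - L) with L = L_cla + L_reg.
    arctanh stretches the raw difference from (-1, 1) to (-inf, inf),
    amplifying the incentive; inputs are clipped so it stays finite."""
    x = baseline - (l_cla + l_reg)
    x = max(-0.999999, min(0.999999, x))  # clipping is our addition
    return math.atanh(x)
```

A loss below the baseline yields a positive reward, a loss above it a negative one, and the stretching makes the controller update in larger steps as the gap grows.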
S4, adjusting the parameters used for searching for the NAS network model according to the object detection task reward.
In an exemplary embodiment, adjusting the parameters used for searching for the NAS network model according to the reward comprises:
feeding the reward back to the RNN controller, and updating the trainable parameters of the RNN controller by
θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t
wherein θ denotes the trainable parameters of the RNN controller, ∇_θ denotes the backpropagation gradient with respect to the trainable parameters, γ denotes a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model found in the t-th execution of step S1, and R_t denotes the reward value corresponding to the NAS network model found in the t-th execution of step S1.
The parameters used for searching for the NAS network model may include the trainable parameters of the RNN controller.
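For a flat parameter vector, the controller update θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t reduces to an elementwise step; in the sketch below, grad_logp stands for the backpropagation gradient, which a real implementation obtains from the framework's autograd rather than by hand:

```python
def reinforce_step(theta, grad_logp, reward, gamma):
    """One update of the controller's trainable parameters:
    theta <- theta + gamma * grad_logp * reward, elementwise.
    A positive reward pushes theta along the sampled architecture's
    log-probability gradient; a negative reward pushes it away."""
    return [t + gamma * g * reward for t, g in zip(theta, grad_logp)]

updated = reinforce_step(theta=[0.0, 1.0], grad_logp=[1.0, -1.0],
                         reward=2.0, gamma=0.5)
```

Scaling the gradient by the reward is what lets the search reinforce architectures that evaluated well and suppress those that did not.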
S5, repeating steps S1-S4 a second preset number of times, and determining the optimal first object detection model according to the evaluation results.
In an exemplary embodiment, determining the optimal first object detection model according to the evaluation results may comprise:
determining the first object detection model corresponding to the largest MAP within the second preset number of repetitions as the optimal first object detection model.
In an exemplary embodiment, after determining the optimal first object detection model according to the evaluation results, the method may comprise:
training the first object detection model with the best evaluation result on the training set, and testing this trained model on a test set.
Embodiment 2
Fig. 2 is a flowchart of an automatic modeling method for an object detection model according to an embodiment of the present invention. As shown in Fig. 2, the method comprises the following steps:
Step 201: initialize the input.
The initialization input includes the hyperparameter configuration and the dataset read in from the database. The hyperparameters may include sub-network hyperparameters and controller hyperparameters. The sub-network hyperparameters mainly include the number of layers of the sub-network, the number of cell branches, parameters related to the model learning rate, the number of output channels, etc. The number of layers of the sub-network is the number of cells; the parameters related to the sub-network learning rate refer to, for example, the decay rate and decay steps of exponential decay. The dataset may include a training set, a validation set, and a test set.
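One possible shape for this initialization input is sketched below; every key name and value is illustrative (the text names the hyperparameter categories but prescribes no concrete names or numbers):

```python
# Hypothetical configuration layout; keys and values are assumptions.
config = {
    "subnet": {                  # sub-network hyperparameters
        "num_layers": 6,         # number of cells
        "cell_branches": 4,      # branches per cell
        "out_channels": 48,
        "lr_decay_rate": 0.1,    # exponential-decay learning-rate parameters
        "lr_decay_steps": 1000,
    },
    "controller": {              # controller hyperparameters
        "learning_rate": 3.5e-4,
        "optimizer": {"name": "adam", "beta1": 0.9},
    },
}
dataset = {"train": [], "valid": [], "test": []}  # read in from the database
```

Keeping the sub-network and controller hyperparameters in separate sub-dictionaries mirrors the two categories the method configures independently.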
Step 202: use an RNN network as the controller, and sample and output a sub-network structure encoding.
Step 203: output the NAS network model via the decoder, and fuse it with the output part of Yolo3 to form the Yolo3-NAS model.
As shown in Fig. 3, the current Yolo3 uses the Darknet-53 network structure to extract image features, and its detection part draws on the idea of FPN (feature pyramid networks).
As shown in Fig. 4, the present invention replaces the feature extraction network Darknet-53 in Yolo3 with an automatically searched NAS network (a stack of normal cells and reduce cells): the fully connected layer of the NAS network model is removed, and 1*1 convolutions are added at the multiple outputs of the NAS network model to fuse with the detection output part of Yolo, forming the Yolo3-NAS model. It should be noted that this embodiment is described on the basis of Yolo3, but practical applications are not limited to Yolo3; other object detection models can also be improved in this way.
Step 204: train the Yolo3-NAS model on the training set; after the preset number of training iterations is reached, validate the Yolo3-NAS model on the validation set and output the evaluation result.
The evaluation result may be measured by mean average precision (MAP). MAP evaluation is prior art and is not described in detail here.
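For completeness, a minimal sketch of the MAP metric over ranked detections is given below. It assumes each class contributes a list of correct/incorrect flags sorted by detection confidence, which simplifies away the IoU-matching and interpolation details of the full protocol:

```python
def average_precision(ranked_hits):
    """AP for one class: detections sorted by confidence, ranked_hits[i]
    True when the i-th detection is correct; AP averages the precision
    observed at each correct detection."""
    hits, precisions = 0, []
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(per_class_hits):
    """MAP: the mean of the per-class average precisions."""
    return sum(average_precision(h) for h in per_class_hits) / len(per_class_hits)
```

This single scalar per searched model is what the loop compares when picking the best architecture.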
Step 205: calculate the object detection task reward corresponding to the Yolo3-NAS model.
The reward is calculated as:
reward = arctanh(baseline - L)
where
L = L_cla + L_reg
wherein L_cla is the classification loss, L_reg is the regression loss, and the baseline is a preset value or formula. Object detection has two parts, classification and regression: classification concerns whether the object category is classified correctly, and regression concerns whether the object's location is correct; the corresponding losses are the classification loss and the regression loss. The present invention also amplifies the incentive effect through this function (stretching it from (-1, 1) to (-∞, ∞)), so that the controller parameters are updated faster, i.e., the optimal model structure is found sooner.
Step 206: feed the reward back to the controller, and update the trainable parameters of the controller.
The controller parameter update formula is
θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t
wherein θ denotes the trainable parameters of the RNN controller, ∇_θ denotes the backpropagation gradient with respect to the trainable parameters, γ denotes a preset parameter, and log π_θ(s_t, a_t) can be understood as the cross-entropy loss corresponding to the t-th structure (the NAS network model found in the t-th execution of step S1). To ensure that this structure is really "correct", the loss is multiplied by R_t (the reward value corresponding to the t-th structure), using the reward to indicate whether the gradient computed from this cross-entropy is trustworthy. If the reward is small or negative, this gradient direction is wrong and the parameters should be updated in the opposite direction; if the reward is positive or large, this gradient direction is correct and the parameters are updated along it.
Step 207: repeat steps 202-206 a preset number of times; the model with the largest evaluation result over these repetitions is taken as the best sub-network model found (i.e., the model obtained by fusing the NAS network model with the detection part of the second object detection model), which is then retrained and tested on the test set for its final performance.
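The overall loop of steps 202-207 can be sketched as a skeleton; every model-related quantity below is a numeric placeholder (a real run samples, trains, and evaluates a Yolo3-NAS model where the comments indicate):

```python
import random

def search_best_model(search_rounds, rng):
    """Skeleton of steps 202-206 repeated search_rounds times, keeping
    the architecture with the best validation result (step 207)."""
    best_score, best_arch = float("-inf"), None
    theta = 0.0                   # controller parameter (placeholder)
    for _ in range(search_rounds):
        arch = rng.random()       # stands in for a sampled Yolo3-NAS model
        val_map = arch            # stands in for MAP on the validation set
        reward = val_map - 0.5    # stands in for the reward computation
        theta += 0.1 * reward     # stands in for the controller update
        if val_map > best_score:  # step 207: keep the best evaluation
            best_score, best_arch = val_map, arch
    return best_arch, best_score

best_arch, best_score = search_best_model(5, random.Random(1))
```

The returned best architecture is the one that would then be retrained from scratch and tested on the test set.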
By turning object detection into an automatic search for the best model, the present invention can search out different models for different tasks, thereby improving the detection effect in a targeted manner.
Fig. 5 is a schematic diagram of an automatic modeling device for an object detection model according to an embodiment of the present invention. As shown in Fig. 5, the automatic modeling device of this embodiment comprises a memory and a processor;
the memory is configured to store a program for automatic modeling of an object detection model;
the processor is configured to read and execute the program for automatic modeling of an object detection model and perform the following operations:
S1, searching for a neural architecture search (NAS) network model according to a predetermined first neural network;
S2, training a first object detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first object detection model with a validation set and outputting an evaluation result; wherein the first object detection model is a model obtained by fusing the NAS network model with the detection part of a second object detection model;
S3, calculating an object detection task reward corresponding to the current first object detection model;
S4, adjusting the parameters used for searching for the NAS network model according to the object detection task reward;
repeating steps S1-S4 a second preset number of times, and determining the optimal first object detection model according to the evaluation results.
Optionally, the first neural network is a recurrent neural network (RNN).
Optionally, step S1 comprises:
step S11, sampling via an RNN controller according to preconfigured hyperparameters to obtain a sub-network structure encoding; wherein the hyperparameters comprise sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters comprise the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters comprise parameters related to the controller learning rate and optimizer configuration parameters;
step S12, decoding the sub-network structure encoding via a decoder, and outputting the NAS network model.
Optionally, the NAS network model is a NAS network without a fully connected layer.
Optionally, fusing the NAS network with the detection part of the second object detection model means:
each of a plurality of output ends of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the outputs of the 1*1 convolutional layers serve as inputs to the detection part of the second object detection model.
Optionally, the processor is configured to read and execute the program for automatic modeling of an object detection model and perform the following operations:
after determining the optimal first object detection model according to the evaluation results, training the first object detection model with the best evaluation result on the training set, and testing this trained model on a test set.
Optionally, the second object detection model comprises Yolo.
Optionally, evaluating the current first object detection model with the validation set comprises:
performing mean average precision (MAP) evaluation of the current first object detection model with the validation set.
Optionally, determining the optimal first object detection model according to the evaluation results comprises:
determining the first object detection model corresponding to the largest MAP within the second preset number of repetitions as the optimal first object detection model.
Optionally, calculating the object detection task reward corresponding to the current first object detection model comprises:
reward = arctanh(baseline - L)
where
L = L_cla + L_reg
wherein L_cla is the classification loss, L_reg is the regression loss, and the baseline is a preset value or formula.
Optionally, adjusting the parameters used for searching for the NAS network model according to the reward comprises:
feeding the reward back to the RNN controller, and updating the trainable parameters of the RNN controller by
θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t
wherein θ denotes the trainable parameters of the RNN controller, ∇_θ denotes the backpropagation gradient with respect to the trainable parameters, γ denotes a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model found in the t-th execution of step S1, and R_t denotes the reward value corresponding to the NAS network model found in the t-th execution of step S1.
Those of ordinary skill in the art will understand that all or some of the steps of the above methods can be completed by a program instructing the relevant hardware, and the program can be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments can also be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments can be implemented in the form of hardware or in the form of a software functional module. The present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention. Of course, the present invention may have various other embodiments. Without departing from the spirit and essence of the present invention, those skilled in the art can make various corresponding changes and variations according to the present invention, but these corresponding changes and variations shall all fall within the protection scope of the appended claims of the present invention.

Claims (16)

  1. An automatic modeling method for an object detection model, characterized by comprising:
    S1, searching for a neural architecture search (NAS) network model according to a predetermined first neural network;
    S2, training a first object detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first object detection model with a validation set and outputting an evaluation result; wherein the first object detection model is a model obtained by fusing the NAS network model with the detection part of a second object detection model;
    S3, calculating an object detection task reward corresponding to the current first object detection model;
    S4, adjusting the parameters used for searching for the NAS network model according to the object detection task reward;
    repeating steps S1-S4 a second preset number of times, and determining the optimal first object detection model according to the evaluation results.
  2. The method according to claim 1, characterized in that
    the first neural network is a recurrent neural network (RNN);
    step S1 comprises:
    step S11, sampling via an RNN controller according to preconfigured hyperparameters to obtain a sub-network structure encoding; wherein the hyperparameters comprise sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters comprise the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters comprise parameters related to the controller learning rate and optimizer configuration parameters;
    step S12, decoding the sub-network structure encoding via a decoder, and outputting the NAS network model.
  3. The method according to claim 2, characterized in that
    the NAS network model is a NAS network without a fully connected layer;
    fusing the NAS network with the detection part of the second object detection model means:
    each of a plurality of output ends of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the outputs of the 1*1 convolutional layers serve as inputs to the detection part of the second object detection model.
  4. The method according to claim 1, characterized in that, after determining the optimal first object detection model according to the evaluation results, the method comprises:
    training the first object detection model with the best evaluation result on the training set, and testing this trained model on a test set.
  5. The method according to claim 1, characterized in that
    the second object detection model comprises Yolo.
  6. The method according to claim 1, characterized in that
    evaluating the current first object detection model with the validation set comprises:
    performing mean average precision (MAP) evaluation of the current first object detection model with the validation set;
    and determining the optimal first object detection model according to the evaluation results comprises:
    determining the first object detection model corresponding to the largest MAP within the second preset number of repetitions as the optimal first object detection model.
  7. The method according to claim 1, characterized in that calculating the object detection task reward corresponding to the current first object detection model comprises:
    reward = arctanh(baseline - L)
    where
    L = L_cla + L_reg
    wherein L_cla is the classification loss, L_reg is the regression loss, and the baseline is a preset value or formula.
  8. The method according to claim 2, characterized in that
    adjusting the parameters used for searching for the NAS network model according to the reward comprises:
    feeding the reward back to the RNN controller, and updating the trainable parameters of the RNN controller by
    θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t
    wherein θ denotes the trainable parameters of the RNN controller, ∇_θ denotes the backpropagation gradient with respect to the trainable parameters, γ denotes a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model found in the t-th execution of step S1, and R_t denotes the reward value corresponding to the NAS network model found in the t-th execution of step S1.
  9. An automatic modeling device for an object detection model, comprising a memory and a processor, characterized in that:
    the memory is configured to store a program for automatic modeling of an object detection model;
    the processor is configured to read and execute the program for automatic modeling of an object detection model and perform the following operations:
    S1, searching for a neural architecture search (NAS) network model according to a predetermined first neural network;
    S2, training a first object detection model with a training set; after the number of training iterations reaches a first preset number, evaluating the current first object detection model with a validation set and outputting an evaluation result; wherein the first object detection model is a model obtained by fusing the NAS network model with the detection part of a second object detection model;
    S3, calculating an object detection task reward corresponding to the current first object detection model;
    S4, adjusting the parameters used for searching for the NAS network model according to the object detection task reward;
    repeating steps S1-S4 a second preset number of times, and determining the optimal first object detection model according to the evaluation results.
  10. The device according to claim 9, characterized in that
    the first neural network is a recurrent neural network (RNN);
    step S1 comprises:
    step S11, sampling via an RNN controller according to preconfigured hyperparameters to obtain a sub-network structure encoding; wherein the hyperparameters comprise sub-network hyperparameters and controller hyperparameters; the sub-network hyperparameters comprise the number of layers of the sub-network, the number of cell branches, parameters related to the sub-network learning rate, and the number of output channels; the controller hyperparameters comprise parameters related to the controller learning rate and optimizer configuration parameters;
    step S12, decoding the sub-network structure encoding via a decoder, and outputting the NAS network model.
  11. The device according to claim 10, characterized in that
    the NAS network model is a NAS network without a fully connected layer;
    fusing the NAS network with the detection part of the second object detection model means:
    each of a plurality of output ends of different scales of the NAS network model is connected to a 1*1 convolutional layer, and the outputs of the 1*1 convolutional layers serve as inputs to the detection part of the second object detection model.
  12. The device according to claim 9, characterized in that the processor is configured to read and execute the program for automatic modeling of an object detection model and perform the following operations:
    after determining the optimal first object detection model according to the evaluation results, training the first object detection model with the best evaluation result on the training set, and testing this trained model on a test set.
  13. The device according to claim 9, characterized in that
    the second object detection model comprises Yolo.
  14. The device according to claim 9, characterized in that
    evaluating the current first object detection model with the validation set comprises:
    performing mean average precision (MAP) evaluation of the current first object detection model with the validation set;
    and determining the optimal first object detection model according to the evaluation results comprises:
    determining the first object detection model corresponding to the largest MAP within the second preset number of repetitions as the optimal first object detection model.
  15. The device according to claim 9, characterized in that calculating the object detection task reward corresponding to the current first object detection model comprises:
    reward = arctanh(baseline - L)
    where
    L = L_cla + L_reg
    wherein L_cla is the classification loss, L_reg is the regression loss, and the baseline is a preset value or formula.
  16. The device according to claim 10, characterized in that
    adjusting the parameters used for searching for the NAS network model according to the reward comprises:
    feeding the reward back to the RNN controller, and updating the trainable parameters of the RNN controller by
    θ ← θ + γ·∇_θ log π_θ(s_t, a_t)·R_t
    wherein θ denotes the trainable parameters of the RNN controller, ∇_θ denotes the backpropagation gradient with respect to the trainable parameters, γ denotes a preset parameter, log π_θ(s_t, a_t) is the cross-entropy loss corresponding to the NAS network model found in the t-th execution of step S1, and R_t denotes the reward value corresponding to the NAS network model found in the t-th execution of step S1.
PCT/CN2019/130024 2019-09-25 2019-12-30 Automatic modeling method and device for object detection model WO2021056914A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP19946971.9A EP4036796A4 (en) 2019-09-25 2019-12-30 AUTOMATIC MODELING METHOD AND APPARATUS FOR OBJECT DETECTION MODEL
JP2022517307A JP7335430B2 (ja) 2019-09-25 2019-12-30 目標検出モデルの自動モデリング方法及び装置
KR1020227009936A KR20220051383A (ko) 2019-09-25 2019-12-30 표적 검출 모델의 자동 모델링 방법 및 장치
US17/642,816 US20220383627A1 (en) 2019-09-25 2019-12-30 Automatic modeling method and device for object detection model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910912868.1 2019-09-25
CN201910912868.1A CN110705573A (zh) 2019-09-25 2019-09-25 一种目标检测模型的自动建模方法及装置

Publications (1)

Publication Number Publication Date
WO2021056914A1 true WO2021056914A1 (zh) 2021-04-01

Family

ID=69196577

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130024 WO2021056914A1 (zh) 2019-09-25 2019-12-30 一种目标检测模型的自动建模方法及装置

Country Status (6)

Country Link
US (1) US20220383627A1 (zh)
EP (1) EP4036796A4 (zh)
JP (1) JP7335430B2 (zh)
KR (1) KR20220051383A (zh)
CN (1) CN110705573A (zh)
WO (1) WO2021056914A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291845A (zh) * 2023-11-27 2023-12-26 成都理工大学 Point cloud ground filtering method, system, electronic device, and storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738098B (zh) * 2020-05-29 2022-06-17 浪潮(北京)电子信息产业有限公司 Vehicle recognition method, apparatus, device, and storage medium
CN113869521A (zh) * 2020-06-30 2021-12-31 华为技术有限公司 Method, apparatus, computing device, and storage medium for building a prediction model
CN111930795B (zh) * 2020-07-02 2022-11-29 苏州浪潮智能科技有限公司 Distributed model search method and system
CN112149551A (zh) * 2020-09-21 2020-12-29 上海孚聪信息科技有限公司 Safety helmet recognition method based on embedded devices and deep learning
CN116821513B (zh) * 2023-08-25 2023-11-10 腾讯科技(深圳)有限公司 Parameter search method, apparatus, device, and medium in recommendation scenarios
CN117036869B (zh) * 2023-10-08 2024-01-09 之江实验室 Model training method and apparatus based on diversity and random strategies

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120119987A1 (en) * 2010-11-12 2012-05-17 Soungmin Im Method and apparatus for performing gesture recognition using object in multimedia devices
CN109063759A (zh) * 2018-07-20 2018-12-21 浙江大学 一种应用于图片多属性预测的神经网络结构搜索方法
CN109325454A (zh) * 2018-09-28 2019-02-12 合肥工业大学 一种基于YOLOv3的静态手势实时识别方法
CN109788222A (zh) * 2019-02-02 2019-05-21 视联动力信息技术股份有限公司 一种视联网视频的处理方法及装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7023613B2 (ja) 2017-05-11 2022-02-22 キヤノン株式会社 Image recognition device and learning device
CN117892774A (zh) * 2017-07-21 2024-04-16 谷歌有限责任公司 Neural architecture search for convolutional neural networks
EP3688673A1 (en) 2017-10-27 2020-08-05 Google LLC Neural architecture search
CN107886117A (zh) 2017-10-30 2018-04-06 国家新闻出版广电总局广播科学研究院 Object detection algorithm based on multi-feature extraction and multi-task fusion
FR3082963A1 (fr) 2018-06-22 2019-12-27 Amadeus S.A.S. System and method for evaluating and deploying unsupervised or semi-supervised machine learning models
US10713491B2 (en) * 2018-07-27 2020-07-14 Google Llc Object detection using spatio-temporal feature maps


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4036796A4


Also Published As

Publication number Publication date
EP4036796A1 (en) 2022-08-03
JP2022548293A (ja) 2022-11-17
CN110705573A (zh) 2020-01-17
EP4036796A4 (en) 2023-10-18
JP7335430B2 (ja) 2023-08-29
KR20220051383A (ko) 2022-04-26
US20220383627A1 (en) 2022-12-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19946971

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022517307

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20227009936

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019946971

Country of ref document: EP

Effective date: 20220425