CN112633494A - Automatic neural network structure searching method based on automatic machine learning - Google Patents

Automatic neural network structure searching method based on automatic machine learning

Info

Publication number
CN112633494A
CN112633494A
Authority
CN
China
Prior art keywords
search
neural network
automatic
cell
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011492102.1A
Other languages
Chinese (zh)
Inventor
陈波
史特
左御丁
王庆先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202011492102.1A
Publication of CN112633494A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an automatic neural network structure searching method based on automatic machine learning, which comprises the following steps: S1: determining a search space S based on the cells of a neural network; S2: selecting a specific structure s from the search space S by using a search strategy; S3: based on the search space S, evaluating the specific structure s and returning the result to the search strategy to complete the automatic search of the neural network structure. The search method has strong generalization capability and can be applied to scenarios such as computer vision and natural language processing; its effect is especially pronounced in intelligent customer service for the financial sector, where it markedly reduces the time cost of modeling and improves modeling efficiency.

Description

Automatic neural network structure searching method based on automatic machine learning
Technical Field
The invention belongs to the technical field of neural networks, and particularly relates to an automatic neural network structure searching method based on automatic machine learning.
Background
With the rapid development of artificial intelligence technology, machine learning is widely applied in fields such as natural language processing and computer vision, and the structures of the corresponding neural networks have grown increasingly complex. In the deep learning field, the neural network structure normally has to be designed by hand, and obtaining a structure with the desired performance consumes a great deal of human and material resources. To reduce this manual effort, research on the automatic search of neural network structures has attracted growing attention. Against this background, the invention provides an automatic neural network structure searching method based on automatic machine learning.
Disclosure of Invention
The invention aims to solve the problems of overlong search time, the depth gap in the search process, and performance loss in neural network structure search, and provides an automatic neural network structure searching method based on automatic machine learning.
The technical scheme of the invention is as follows: an automatic neural network structure searching method based on automatic machine learning comprises the following steps:
S1: determining a search space S by using the cells of a neural network, based on the model generation stage of automatic machine learning;
S2: selecting a specific structure s from the search space S by using a search strategy;
S3: based on the search space S, evaluating the specific structure s and returning the result to the search strategy to complete the automatic search of the neural network structure.
The invention has the beneficial effects that:
(1) The method has strong generalization capability and can be applied to scenarios such as computer vision and natural language processing; its effect is especially pronounced in intelligent customer service for the financial sector, where it markedly reduces the time cost of modeling and improves modeling efficiency.
(2) The search time is reduced. The invention uses a cell-based structure and performs the search in stages; in each stage, candidate operations with lower scores are pruned according to their scores, only the higher-scoring candidates are retained, and the search enters the next stage. This reduces the number of operations, so the search space becomes comparatively simple. In addition, the number of epochs set for each stage is reduced to a certain extent. Together these measures lower GPU memory usage and thereby shorten the search time.
(3) The performance loss problem is addressed. During the search in each stage, when the scores of the candidate operations are evaluated, a 0-1 loss function replaces the original cross-entropy loss, reducing the discrepancy introduced when the continuous encoding is discretized; the scores of different candidate operations are pushed closer to 0 or 1, and the differences between them become more pronounced. This lets the search select the required operation from the candidates more accurately, so the performance loss of the searched network is kept under control.
(4) The depth gap problem is addressed. The proposed method performs the search in stages, alleviating the depth gap that arises because the network used during search differs from, and is much smaller than, the network used during evaluation. The staged procedure makes the network searched in the last stage close to the network structure used at evaluation, avoids the architecture overfitting caused by searching directly in a large network, and, by approximating the search space in this way, relieves the depth gap to a certain extent.
(5) In controlling the candidate operations, search-space regularization and an early-stop mechanism are used: if more than two skip connections appear in a cell, the search of the cell is stopped. This directly controls the number of skip connections and improves search performance.
Further, in step S1, cells are stacked to obtain the search space S; each cell consists of two input nodes, intermediate nodes, an output node, and edges; each edge of the cell represents the candidate operations.
The beneficial effects of this further scheme are as follows: the cell-based search space has two advantages: first, it effectively reduces the size of the search space S; second, it makes model migration easier.
Further, step S3 includes the following sub-steps:
S31: based on the search space S, performing the first-stage search on the specific structure s with the search strategy;
S32: normalizing the edges of the cell to obtain the weights of the edges in the cell;
S33: sorting the weights of the edges in the cell and screening out the candidate operations with the highest weights;
S34: according to the candidate operations with the highest weights, performing the second-stage and third-stage searches in turn, and returning the search result to the search strategy to complete the automatic search of the neural network structure.
The beneficial effects of this further scheme are as follows: a staged method is adopted, i.e., the search process is divided into three stages; the number of cells in each stage grows progressively and the depth of the search network increases gradually, while the search space shrinks step by step. This addresses the depth gap caused by searching in a shallow network but testing in a deeper one.
Further, in step S31, the structure searched in the first stage includes 5 cells, and the candidate operations on every edge of each cell in the first stage are: max_pool_3x3, avg_pool_3x3, skip_connect, sep_conv_3x3, sep_conv_5x5, dil_conv_3x3, dil_conv_5x5, and none.
The beneficial effects of this further scheme are as follows: the search network in the first stage is small, with only 5 cells, but the number of candidate operations on each edge of a cell is the largest, i.e., all operations are included.
Further, in step S32, the weight w_{i,j}(x_i) of an edge in the cell is calculated as:

w_{i,j}(x_i) = \sum_{b \in B_{i,j}} \frac{\exp\left(\alpha_b^{(i,j)}\right)}{\sum_{b' \in B_{i,j}} \exp\left(\alpha_{b'}^{(i,j)}\right)} \, b(x_i)

where the fraction \exp(\alpha_b^{(i,j)}) / \sum_{b' \in B_{i,j}} \exp(\alpha_{b'}^{(i,j)}) denotes the softmax operation; B_{i,j} denotes the operation space of edge (i, j), i and j denote the nodes connected by the edge, b denotes a candidate function (operation) on the edge, \alpha_b^{(i,j)} denotes the structural parameter of candidate b, \alpha denotes the weight matrix of all edges, \exp(\cdot) denotes the exponential function, b' indexes the candidate functions in the normalizing sum, \alpha_{b'}^{(i,j)} denotes the structural parameter of candidate b', and x_i denotes node i.
The beneficial effects of this further scheme are as follows: the weight of each operation on an edge is normalized over all candidate operations using the softmax function.
Further, in step S33, the screening of candidate operations is modified by a 0-1 loss function, calculated as:

L_{total} = l_{val}\left(\omega^{*}(\alpha), \alpha\right) + \omega_{0\text{-}1} L

L = -\frac{1}{N} \sum_{i,j} \sum_{b \in B_{i,j}} \left( \frac{\exp\left(\alpha_b^{(i,j)}\right)}{\sum_{b' \in B_{i,j}} \exp\left(\alpha_{b'}^{(i,j)}\right)} - \frac{1}{2} \right)^{2}

\omega^{*} = \arg\min_{\omega} l_{train}(\omega, \alpha)

where L_{total} denotes the total 0-1 loss, L the 0-1 loss term, l_{val} the loss value on the validation set, \alpha the weight matrix of all edges, \omega the parameter matrix, \omega^{*} the optimal parameter matrix after the search space has been relaxed to be continuous, N the total number of edges, \exp(\cdot) the exponential function, i and j the edge indices, \omega_{0\text{-}1} the weight coefficient of L, B_{i,j} the operation space of edge (i, j), \alpha_b^{(i,j)} the structural parameters, \arg\min(\cdot) the variable value at which the objective function attains its minimum, l_{train} the loss value on the training set, b' the index over candidate functions, and \alpha_{b'}^{(i,j)} the structural parameter of candidate b'.
The beneficial effects of this further scheme are as follows: a suitable operation is selected from the candidate operations according to the continuous structural weight scores, but the large gap between these scores and the discrete values 0 and 1 easily biases the selection, so a 0-1 loss function is used to mitigate this. The 0-1 loss reduces the discrepancy of discretizing the continuous encoding, makes the differences between the weight parameters of different operations more pronounced, enlarges their relative gaps, and pushes them toward 0 or 1, so that operation selection is more accurate and convenient. Meanwhile, to control the weight of the loss term, \omega_{0\text{-}1} is used as its weight coefficient; l_{val} is determined by \omega and \alpha.
Further, in step S33, skip connections are partially cut off during the weight screening by a regularized dropout method.
The beneficial effects of this further scheme are as follows: as the number of epochs (one epoch is one pass of all data through the neural network for forward computation and backpropagation) increases, competition among the weights gradually intensifies, so the weight of skip-connect grows ever higher and skip-connect ends up being selected. An excess of skip-connections amounts to overfitting in the search process and harms the performance of automatic neural network structure search. To counter this overfitting, the number of skip-connections is controlled by search-space regularization and early stopping, with the aim of improving search performance.
After the skip-connect operation, a regularization method (dropout) is added at the operation level to partially cut off skip-connect, so that the algorithm explores other candidate operations. A stronger dropout is used at the start of training, its probability is gradually decayed during training, and a lighter dropout is used in the later stage so that the final learning of the network structure parameters is not affected.
Further, in step S3, the search process is controlled by an early-stop mechanism, as follows: if the number of skip connections in a cell exceeds two, the search is stopped.
The beneficial effects of this further scheme are as follows: an intuitive method of controlling the number of skip-connections is used during the search: if more than two skip-connections occur in a normal cell, the search is stopped.
Controlling skip-connections through both regularization and early stopping alleviates the performance degradation caused by having too few trainable parameters.
Further, in step S34, the second stage uses 5 candidate operations and 11 cells, and the third stage uses 3 candidate operations and 17 cells.
The beneficial effects of this further scheme are as follows: the number of cells increases while the number of candidate operations decreases; by gradually increasing the number of cells, the finally searched network approaches the deeper network used in the final evaluation.
Drawings
FIG. 1 is a flow chart of an automatic search method for neural network structures based on automatic machine learning;
FIG. 2 is a block diagram of a cell;
FIG. 3 shows the structures of the first to third stages.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
Before describing specific embodiments of the present invention, in order to make the solution of the present invention more clear and complete, the definitions of the abbreviations and key terms appearing in the present invention will be explained first:
dil_conv_3x3 and dil_conv_5x5: dilated separable convolutions with 3×3 and 5×5 kernels; both are candidate operations.
As shown in fig. 1, the present invention provides an automatic neural network structure search method based on automatic machine learning, which includes the following steps:
S1: determining a search space S by using the cells of a neural network, based on the model generation stage of automatic machine learning;
S2: selecting a specific structure s from the search space S by using a search strategy;
S3: based on the search space S, evaluating the specific structure s and returning the result to the search strategy to complete the automatic search of the neural network structure.
In the embodiment of the present invention, as shown in FIG. 2, in step S1, cells are stacked to obtain the search space S; each cell consists of two input nodes, intermediate nodes, an output node, and edges; each edge of the cell represents the candidate operations.
In the invention, the cell-based search space has two advantages: first, it effectively reduces the size of the search space S; second, it makes model migration easier.
In the embodiment of the present invention, as shown in FIG. 2, the cell is a directed acyclic graph; each cell is composed of N nodes, and each node is a layer in the neural network. Input nodes: for a convolutional network the input nodes are the outputs of the previous two layers, and for a recurrent network they are the input of the current layer and the state of the previous layer; here C_{k-1} and C_{k-2} denote the input nodes. Intermediate nodes: each intermediate node is obtained by applying the operations on its incoming edges to its predecessors and summing the results; here N_0, N_1, N_2 and N_3 together form the intermediate nodes. Output node: obtained by concatenating the intermediate nodes; C_k denotes the output node. Edges: the edges represent the candidate operations, and every edge contains the candidate operations, such as a 3×3 convolution.
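To make this concrete, the following is a minimal sketch of such a cell in PyTorch. It is an illustrative assumption rather than the patent's own code: the module names (MixedEdge, Cell) and the identity-op toy usage are hypothetical.

```python
# Illustrative sketch (assumed PyTorch implementation, not the patent's code):
# a cell as a DAG whose edges are softmax-weighted mixtures of candidate ops.
import torch
import torch.nn as nn
import torch.nn.functional as F

CANDIDATE_OPS = ["max_pool_3x3", "avg_pool_3x3", "skip_connect", "sep_conv_3x3",
                 "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5", "none"]

class MixedEdge(nn.Module):
    """One edge of the cell: a softmax-weighted sum of all candidate ops."""
    def __init__(self, ops: nn.ModuleList):
        super().__init__()
        self.ops = ops  # one module per name in CANDIDATE_OPS

    def forward(self, x, alpha_edge):
        w = F.softmax(alpha_edge, dim=-1)            # normalized edge weights
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

class Cell(nn.Module):
    """Two input nodes (C_{k-2}, C_{k-1}), 4 intermediate nodes, concat output."""
    def __init__(self, edges: nn.ModuleList, n_inter: int = 4):
        super().__init__()
        self.edges, self.n_inter = edges, n_inter    # 2+3+4+5 = 14 edges

    def forward(self, c_km2, c_km1, alphas):
        states, k = [c_km2, c_km1], 0
        for _ in range(self.n_inter):
            # each intermediate node sums the edge outputs of all predecessors
            s = sum(self.edges[k + j](h, alphas[k + j])
                    for j, h in enumerate(states))
            k += len(states)
            states.append(s)
        return torch.cat(states[-self.n_inter:], dim=1)  # output node C_k

# toy usage: identity candidates keep shapes, so the wiring can be checked
make_ops = lambda: nn.ModuleList(nn.Identity() for _ in CANDIDATE_OPS)
edges = nn.ModuleList(MixedEdge(make_ops()) for _ in range(14))
alphas = torch.randn(14, len(CANDIDATE_OPS))
out = Cell(edges)(torch.randn(1, 8, 4, 4), torch.randn(1, 8, 4, 4), alphas)
print(out.shape)  # torch.Size([1, 32, 4, 4]): 4 intermediate nodes concatenated
```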
In the embodiment of the present invention, as shown in FIG. 1, step S3 includes the following sub-steps:
S31: based on the search space S, performing the first-stage search on the specific structure s with the search strategy;
S32: normalizing the edges of the cell to obtain the weights of the edges in the cell;
S33: sorting the weights of the edges in the cell and screening out the candidate operations with the highest weights;
S34: according to the candidate operations with the highest weights, performing the second-stage and third-stage searches in turn, and returning the search result to the search strategy to complete the automatic search of the neural network structure.
In the invention, a staged method is adopted, i.e., the search process is divided into three stages; the number of cells in each stage grows progressively and the depth of the search network increases gradually, while the search space shrinks step by step. This addresses the depth gap caused by searching in a shallow network but testing in a deeper one.
In the embodiment of the present invention, as shown in FIG. 1, in step S31, the structure searched in the first stage includes 5 cells, and the candidate operations on every edge of each cell in the first stage are: max_pool_3x3, avg_pool_3x3, skip_connect, sep_conv_3x3, sep_conv_5x5, dil_conv_3x3, dil_conv_5x5, and none.
In the present invention, the search network in the first stage is small, with only 5 cells, but the number of candidate operations on each edge of a cell is the largest, i.e., all operations are included.
In the embodiment of the present invention, as shown in FIG. 1, in step S32, the weight w_{i,j}(x_i) of an edge in the cell is calculated as:

w_{i,j}(x_i) = \sum_{b \in B_{i,j}} \frac{\exp\left(\alpha_b^{(i,j)}\right)}{\sum_{b' \in B_{i,j}} \exp\left(\alpha_{b'}^{(i,j)}\right)} \, b(x_i)

where the fraction \exp(\alpha_b^{(i,j)}) / \sum_{b' \in B_{i,j}} \exp(\alpha_{b'}^{(i,j)}) denotes the softmax operation; B_{i,j} denotes the operation space of edge (i, j), i and j denote the nodes connected by the edge, b denotes a candidate function (operation) on the edge, \alpha_b^{(i,j)} denotes the structural parameter of candidate b, \alpha denotes the weight matrix of all edges, \exp(\cdot) denotes the exponential function, b' indexes the candidate functions in the normalizing sum, \alpha_{b'}^{(i,j)} denotes the structural parameter of candidate b', and x_i denotes node i.
In the present invention, the weight of each operation on an edge is normalized over all candidate operations using the softmax function.
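As a small numeric illustration of this normalization (an assumption of how the formula above would be computed in PyTorch, with made-up parameter values):

```python
# Softmax normalization of one edge's architecture parameters; the values
# below are invented for illustration only.
import torch
import torch.nn.functional as F

alpha_edge = torch.tensor([0.3, -1.2, 0.8, 0.1, -0.5, 0.0, 0.4, -2.0])
weights = F.softmax(alpha_edge, dim=-1)  # exp(a_b) / sum_{b'} exp(a_{b'})
print(weights, weights.sum())            # weights of the 8 candidates; sum = 1
```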
In the embodiment of the present invention, as shown in FIG. 1, in step S33, the screening of candidate operations is modified by a 0-1 loss function, calculated as:

L_{total} = l_{val}\left(\omega^{*}(\alpha), \alpha\right) + \omega_{0\text{-}1} L

L = -\frac{1}{N} \sum_{i,j} \sum_{b \in B_{i,j}} \left( \frac{\exp\left(\alpha_b^{(i,j)}\right)}{\sum_{b' \in B_{i,j}} \exp\left(\alpha_{b'}^{(i,j)}\right)} - \frac{1}{2} \right)^{2}

\omega^{*} = \arg\min_{\omega} l_{train}(\omega, \alpha)

where L_{total} denotes the total 0-1 loss, L the 0-1 loss term, l_{val} the loss value on the validation set, \alpha the weight matrix of all edges, \omega the parameter matrix, \omega^{*} the optimal parameter matrix after the search space has been relaxed to be continuous, N the total number of edges, \exp(\cdot) the exponential function, i and j the edge indices, \omega_{0\text{-}1} the weight coefficient of L, B_{i,j} the operation space of edge (i, j), \alpha_b^{(i,j)} the structural parameters, \arg\min(\cdot) the variable value at which the objective function attains its minimum, l_{train} the loss value on the training set, b' the index over candidate functions, and \alpha_{b'}^{(i,j)} the structural parameter of candidate b'.
In the present invention, a suitable operation is selected from the candidate operations according to the continuous structural weight scores, but the large gap between these scores and the discrete values 0 and 1 easily biases the selection, so a 0-1 loss function is used to mitigate this. The 0-1 loss reduces the discrepancy of discretizing the continuous encoding, makes the differences between the weight parameters of different operations more pronounced, enlarges their relative gaps, and pushes them toward 0 or 1, so that operation selection is more accurate and convenient. Meanwhile, to control the weight of the loss term, \omega_{0\text{-}1} is used as its weight coefficient; l_{val} is determined by \omega and \alpha.
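A hedged sketch of this 0-1 regularization follows; it assumes the reconstructed form of L above (a penalty rewarding softmax scores near 0 or 1), and zero_one_loss and total_loss are hypothetical names.

```python
# Sketch of the 0-1 loss under the reconstructed formula above: scores far
# from 0 or 1 are penalized, sharpening the differences between candidates.
import torch
import torch.nn.functional as F

def zero_one_loss(alphas: torch.Tensor) -> torch.Tensor:
    """alphas: (num_edges, num_ops) architecture parameters."""
    w = F.softmax(alphas, dim=-1)
    # minimizing -(w - 0.5)^2 pushes each weight toward 0 or 1
    return -((w - 0.5) ** 2).sum(dim=-1).mean()

def total_loss(l_val: torch.Tensor, alphas: torch.Tensor,
               coeff: float = 1.0) -> torch.Tensor:
    return l_val + coeff * zero_one_loss(alphas)  # coeff plays the role of w_{0-1}
```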
In the embodiment of the present invention, as shown in FIG. 1, in step S33, skip connections are partially cut off during the weight screening by a regularized dropout method.
In the invention, as the number of epochs (one epoch is one pass of all data through the neural network for forward computation and backpropagation) increases, competition among the weights gradually intensifies, so the weight of skip-connect grows ever higher and skip-connect ends up being selected. An excess of skip-connections amounts to overfitting in the search process and harms the performance of automatic neural network structure search. To counter this overfitting, the number of skip-connections is controlled by search-space regularization and early stopping, with the aim of improving search performance.
After the skip-connect operation, a regularization method (dropout) is added at the operation level to partially cut off skip-connect, so that the algorithm explores other candidate operations. A stronger dropout is used at the start of training, its probability is gradually decayed during training, and a lighter dropout is used in the later stage so that the final learning of the network structure parameters is not affected.
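The sketch below shows one way such a decaying operation-level dropout could wrap the skip-connect path; the linear decay schedule and the specific rates are illustrative assumptions, not values given by the patent.

```python
# Hypothetical wrapper: dropout on the skip-connect path whose rate decays
# from a strong initial value to a light final value over training.
import torch.nn as nn

class DropSkip(nn.Module):
    def __init__(self, p_start: float = 0.6, p_end: float = 0.05,
                 total_epochs: int = 25):
        super().__init__()
        self.p_start, self.p_end, self.total = p_start, p_end, total_epochs
        self.drop = nn.Dropout(p_start)

    def set_epoch(self, epoch: int) -> None:
        t = min(epoch / max(self.total - 1, 1), 1.0)  # training progress in [0, 1]
        self.drop.p = self.p_start + t * (self.p_end - self.p_start)

    def forward(self, x):
        return self.drop(x)  # the identity path, partially cut off by dropout
```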
In the embodiment of the present invention, as shown in FIG. 1, in step S3, the search process is controlled by an early-stop mechanism, as follows: if the number of skip connections in a cell exceeds two, the search is stopped.
In the invention, an intuitive method of controlling the number of skip-connections is used during the search: if more than two skip-connections occur in a normal cell, the search is stopped.
Controlling skip-connections through both regularization and early stopping alleviates the performance degradation caused by having too few trainable parameters.
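A minimal sketch of this early-stop check (an assumed helper; the (num_edges, num_ops) layout of the architecture parameters follows the earlier cell sketch):

```python
# Early-stop check: derive the op each edge would select (argmax of its
# weights) and stop if the cell contains more than two skip connections.
import torch

OPS = ["max_pool_3x3", "avg_pool_3x3", "skip_connect", "sep_conv_3x3",
       "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5", "none"]
SKIP_IDX = OPS.index("skip_connect")

def should_stop(alphas: torch.Tensor, max_skips: int = 2) -> bool:
    """alphas: (num_edges, num_ops) parameters of one normal cell."""
    chosen = alphas.argmax(dim=-1)               # best op per edge
    return int((chosen == SKIP_IDX).sum()) > max_skips
```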
In the embodiment of the present invention, as shown in FIG. 3, in step S34, the second stage uses 5 candidate operations and 11 cells, and the third stage uses 3 candidate operations and 17 cells.
In the invention, the number of cells increases while the number of candidate operations decreases; by gradually increasing the number of cells, the finally searched network approaches the deeper network used in the final evaluation.
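The three-stage schedule can be summarized as follows; build_supernet and train_and_score are hypothetical helpers standing in for the supernet construction and staged training described above.

```python
# Progressive schedule: cells grow 5 -> 11 -> 17 while the candidate set
# shrinks 8 -> 5 -> 3, keeping only the highest-scoring operations.
STAGES = [(5, 8), (11, 5), (17, 3)]  # (number of cells, candidate ops used)

def staged_search(ops, build_supernet, train_and_score):
    """ops starts as the full list of 8 candidates (see step S31)."""
    for num_cells, num_ops in STAGES:
        ops = ops[:num_ops]                       # keep the top-ranked ops
        net = build_supernet(num_cells=num_cells, candidate_ops=ops)
        scores = train_and_score(net)             # dict: op name -> score
        ops = sorted(ops, key=scores.get, reverse=True)  # rank for next stage
    return ops
```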
In the invention, the technical scheme can be applied to the field of automatic machine learning (AutoML). AutoML comprises four stages: data preprocessing, feature engineering, model generation, and model evaluation. Automatic search of the neural network structure is the core link of model generation.
The success of deep learning in perceptual tasks is mainly attributable to its automation of the feature engineering process: hierarchical feature extractors are learned from data in an end-to-end fashion rather than designed by hand. However, this success has been accompanied by rising demands on network architectures, with ever more complex neural network architectures designed manually. Manual design of network structures is time-consuming and error-prone. By contrast, automatic neural network structure search can find the best-performing structure by traversal, and it can also break the limits of human thinking to find structural organizations that humans have not conceived. Automatic neural network structure search can be regarded as a subfield of AutoML, intersecting hyper-parameter optimization and meta-learning, and is a natural development direction for automatic machine learning. It has also shown its strength in deep learning fields such as image and natural language processing, and plays an important role in fields related to deep learning, such as intelligent financial risk control and autonomous driving.
The working principle and process of the invention are as follows: the search strategy selects a specific structure s from the predefined search space S, and the structure is passed to the performance evaluation stage. After evaluation, the performance result of the specific structure s is returned to the search strategy, which then guides the selection of the next structure. During the search, a 0-1 loss function makes the differentiation of the candidate operations more pronounced and thus improves the accuracy of candidate selection, and search-space regularization and an early-stop mechanism are added to further improve the performance of the searched model.
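Schematically, this feedback loop can be expressed as below; strategy.propose, strategy.update and evaluate are hypothetical interfaces for illustration, not APIs defined by the patent.

```python
# Sketch of the search-strategy / performance-evaluation feedback loop.
def nas_loop(search_space, strategy, evaluate, budget: int = 50):
    best_s, best_score = None, float("-inf")
    for _ in range(budget):
        s = strategy.propose(search_space)   # step S2: pick a structure s
        score = evaluate(s)                  # performance-evaluation stage
        strategy.update(s, score)            # feedback guides the next pick
        if score > best_score:
            best_s, best_score = s, score
    return best_s
```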
The invention has the beneficial effects that:
(1) The method has strong generalization capability and can be applied to scenarios such as computer vision and natural language processing; its effect is especially pronounced in intelligent customer service for the financial sector, where it markedly reduces the time cost of modeling and improves modeling efficiency.
(2) The search time is reduced. The invention uses a cell-based structure and performs the search in stages; in each stage, candidate operations with lower scores are pruned according to their scores, only the higher-scoring candidates are retained, and the search enters the next stage. This reduces the number of operations, so the search space becomes comparatively simple. In addition, the number of epochs set for each stage is reduced to a certain extent. Together these measures lower GPU memory usage and thereby shorten the search time.
(3) The performance loss problem is addressed. During the search in each stage, when the scores of the candidate operations are evaluated, a 0-1 loss function replaces the original cross-entropy loss, reducing the discrepancy introduced when the continuous encoding is discretized; the scores of different candidate operations are pushed closer to 0 or 1, and the differences between them become more pronounced. This lets the search select the required operation from the candidates more accurately, so the performance loss of the searched network is kept under control.
(4) The depth gap problem is addressed. The proposed method performs the search in stages, alleviating the depth gap that arises because the network used during search differs from, and is much smaller than, the network used during evaluation. The staged procedure makes the network searched in the last stage close to the network structure used at evaluation, avoids the architecture overfitting caused by searching directly in a large network, and, by approximating the search space in this way, relieves the depth gap to a certain extent.
(5) In controlling the candidate operations, search-space regularization and an early-stop mechanism are used: if more than two skip connections appear in a cell, the search of the cell is stopped. This directly controls the number of skip connections and improves search performance.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and the scope of the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (9)

1. An automatic neural network structure searching method based on automatic machine learning, characterized by comprising the following steps:
S1: determining a search space S by using the cells of a neural network, based on the model generation stage of automatic machine learning;
S2: selecting a specific structure s from the search space S by using a search strategy;
S3: based on the search space S, evaluating the specific structure s and returning the result to the search strategy to complete the automatic search of the neural network structure.
2. The automatic neural network structure searching method based on automatic machine learning of claim 1, wherein in step S1, cells are stacked to obtain the search space S; each cell consists of two input nodes, intermediate nodes, an output node, and edges; each edge of the cell represents the candidate operations.
3. The automatic machine learning-based neural network structure searching method of claim 2, wherein step S3 includes the following sub-steps:
S31: based on the search space S, performing the first-stage search on the specific structure s with the search strategy;
S32: normalizing the edges of the cell to obtain the weights of the edges in the cell;
S33: sorting the weights of the edges in the cell and screening out the candidate operations with the highest weights;
S34: according to the candidate operations with the highest weights, performing the second-stage and third-stage searches in turn, and returning the search result to the search strategy to complete the automatic search of the neural network structure.
4. The automatic search method for neural network structure based on automatic machine learning of claim 3, wherein in step S31, the structure searched in the first stage includes 5 cells, and the candidate operations on every edge of each cell in the first stage are: max_pool_3x3, avg_pool_3x3, skip_connect, sep_conv_3x3, sep_conv_5x5, dil_conv_3x3, dil_conv_5x5, and none.
5. The automatic machine learning-based neural network structure searching method of claim 3, wherein in step S32, the weight w_{i,j}(x_i) of an edge in the cell is calculated as:

w_{i,j}(x_i) = \sum_{b \in B_{i,j}} \frac{\exp\left(\alpha_b^{(i,j)}\right)}{\sum_{b' \in B_{i,j}} \exp\left(\alpha_{b'}^{(i,j)}\right)} \, b(x_i)

where the fraction \exp(\alpha_b^{(i,j)}) / \sum_{b' \in B_{i,j}} \exp(\alpha_{b'}^{(i,j)}) denotes the softmax operation; B_{i,j} denotes the operation space of edge (i, j), i and j denote the nodes connected by the edge, b denotes a candidate function (operation) on the edge, \alpha_b^{(i,j)} denotes the structural parameter of candidate b, \alpha denotes the weight matrix of all edges, \exp(\cdot) denotes the exponential function, b' indexes the candidate functions in the normalizing sum, \alpha_{b'}^{(i,j)} denotes the structural parameter of candidate b', and x_i denotes node i.
6. The automatic machine learning-based neural network structure searching method of claim 3, wherein in step S33, the screening of candidate operations is modified by a 0-1 loss function, calculated as:

L_{total} = l_{val}\left(\omega^{*}(\alpha), \alpha\right) + \omega_{0\text{-}1} L

L = -\frac{1}{N} \sum_{i,j} \sum_{b \in B_{i,j}} \left( \frac{\exp\left(\alpha_b^{(i,j)}\right)}{\sum_{b' \in B_{i,j}} \exp\left(\alpha_{b'}^{(i,j)}\right)} - \frac{1}{2} \right)^{2}

\omega^{*} = \arg\min_{\omega} l_{train}(\omega, \alpha)

where L_{total} denotes the total 0-1 loss, L the 0-1 loss term, l_{val} the loss value on the validation set, \alpha the weight matrix of all edges, \omega the parameter matrix, \omega^{*} the optimal parameter matrix after the search space has been relaxed to be continuous, N the total number of edges, \exp(\cdot) the exponential function, i and j the edge indices, \omega_{0\text{-}1} the weight coefficient of L, B_{i,j} the operation space of edge (i, j), \alpha_b^{(i,j)} the structural parameters, \arg\min(\cdot) the variable value at which the objective function attains its minimum, l_{train} the loss value on the training set, b' the index over candidate functions, and \alpha_{b'}^{(i,j)} the structural parameter of candidate b'.
7. The automatic machine learning-based neural network structure searching method of claim 3, wherein in step S33, skip connections are partially cut off during the weight screening by a regularized dropout method.
8. The automatic neural network structure searching method based on automatic machine learning of claim 3, wherein in step S33, an early-stop mechanism is used to control the search process, as follows: if the number of skip connections in a cell exceeds two, the search is stopped.
9. The automatic search method for neural network structure based on automatic machine learning of claim 3, wherein in step S34, the second stage uses 5 candidate operations and 11 cells, and the third stage uses 3 candidate operations and 17 cells.
CN202011492102.1A 2020-12-17 2020-12-17 Automatic neural network structure searching method based on automatic machine learning Pending CN112633494A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011492102.1A CN112633494A (en) 2020-12-17 2020-12-17 Automatic neural network structure searching method based on automatic machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011492102.1A CN112633494A (en) 2020-12-17 2020-12-17 Automatic neural network structure searching method based on automatic machine learning

Publications (1)

Publication Number Publication Date
CN112633494A (en) 2021-04-09

Family

ID=75316216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011492102.1A Pending CN112633494A (en) 2020-12-17 2020-12-17 Automatic neural network structure searching method based on automatic machine learning

Country Status (1)

Country Link
CN (1) CN112633494A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113256593A (en) * 2021-06-07 2021-08-13 四川国路安数据技术有限公司 Tumor image detection method based on task self-adaptive neural network architecture search


Similar Documents

Publication Publication Date Title
CN110276765B (en) Image panorama segmentation method based on multitask learning deep neural network
Belouadah et al. Scail: Classifier weights scaling for class incremental learning
CN114240892B (en) Knowledge distillation-based unsupervised industrial image anomaly detection method and system
CN111461325B (en) Multi-target layered reinforcement learning algorithm for sparse rewarding environmental problem
CN110674965A (en) Multi-time step wind power prediction method based on dynamic feature selection
CN115424177A (en) Twin network target tracking method based on incremental learning
CN114282443A (en) Residual service life prediction method based on MLP-LSTM supervised joint model
CN112633494A (en) Automatic neural network structure searching method based on automatic machine learning
CN113467481B (en) Path planning method based on improved Sarsa algorithm
CN112922609B (en) Intelligent tunneling method of shield tunneling machine
CN116701665A (en) Deep learning-based traditional Chinese medicine ancient book knowledge graph construction method
CN111340637A (en) Medical insurance intelligent auditing system based on machine learning feedback rule enhancement
CN115936303A (en) Transient voltage safety analysis method based on machine learning model
CN115719478A (en) End-to-end automatic driving method for accelerated reinforcement learning independent of irrelevant information
CN111539989B (en) Computer vision single target tracking method based on optimized variance reduction
CN111259860B (en) Multi-order characteristic dynamic fusion sign language translation method based on data self-driving
CN115101136A (en) Large-scale aluminum electrolysis cell global anode effect prediction method
CN114019846A (en) Intelligent ventilation control design method and system for long road tunnel
CN109409306B (en) Active video behavior detection system and method based on deep reinforcement learning
CN114626284A (en) Model processing method and related device
CN105528681B (en) A kind of smelter by-product energy resource system method of real-time adjustment based on hidden tree-model
CN111860776A (en) Lightweight time convolution network oriented to time sequence data rapid prediction
Chetoui et al. Course recommendation model based on Knowledge Graph Embedding
CN111950691A (en) Reinforced learning strategy learning method based on potential action representation space
Peng et al. SCLIFD: Supervised contrastive knowledge distillation for incremental fault diagnosis under limited fault data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210409)