CN114997360B - Evolution parameter optimization method, system and storage medium of neural architecture search algorithm - Google Patents
- Publication number: CN114997360B (application number CN202210551112.0A)
- Authority
- CN
- China
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
Abstract
The invention discloses an evolution parameter optimization method, system, and storage medium for a neural architecture search algorithm. The method comprises: obtaining an object to be identified and determining a neural network model for identifying it; selecting a neural architecture search algorithm for constructing the neural network model and a neural network architecture in a search space; reading the node number and node operation types of the neural network architecture, and constructing a theoretical model of the expected first hitting time with which the neural architecture search algorithm finds the neural network model; drawing a relationship diagram of population size and iteration number by the controlled-variable method according to the theoretical model; and selecting the optimal population size and optimal iteration number of the neural architecture search algorithm according to the relationship diagram.
Description
Technical Field
The invention belongs to the field of artificial intelligence algorithms, and particularly relates to an evolution parameter optimization method, an evolution parameter optimization system and a storage medium of a neural architecture search algorithm.
Background
Medical image processing, speech recognition, semantic segmentation, and the like are current research hotspots. To achieve high recognition accuracy on such data, a neural network model with a well-designed, high-performance architecture is usually required, and that architecture is often treated as a "black box" to be designed. Because high-performance network architectures are generally "deep", the same model performs differently in different applications, and setting the architecture parameters usually depends on experienced experts, the design of neural network models used in medical image processing, speech recognition, semantic segmentation, and similar fields is very limited.
In view of this situation, the prior art has applied the evolutionary neural architecture search algorithm (ENAS algorithm for short) to the design of neural-network-based models for medical image, speech recognition, and semantic segmentation processing, so as to reduce excessive reliance on neural network experts; researchers in related fields can thus design high-performance models even without network-design experience.
Although the ENAS algorithm performs excellently in constructing neural networks, it consumes enormous computing power. The biggest contributor to this cost is that the evolution parameters involved in the ENAS algorithm (i.e., the "population size" and the "maximum number of iterations") are unknown in advance, so researchers must spend large amounts of computing power on experiments to find satisfactory parameter settings.
Disclosure of Invention
Aiming at the above defects in the prior art, the evolution parameter optimization method, system, and storage medium of a neural architecture search algorithm provided herein solve the problem that existing neural architecture search algorithms rely excessively on expert experience with neural networks to determine the evolution parameters.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
in a first aspect, a method for optimizing evolution parameters of a neural architecture search algorithm is provided, including:
acquiring an object to be identified, and determining a neural network model for identifying the object to be identified;
selecting a neural architecture search algorithm for constructing a neural network model and a neural network architecture in a search space;
reading the node number and the node operation types of the neural network architecture, and constructing a theoretical model of the expected first hitting time with which the neural architecture search algorithm finds the neural network model;
drawing a relationship diagram of population size and iteration number by the controlled-variable method according to the expected first hitting time theoretical model;
and selecting the optimal population size and the optimal iteration number of the neural architecture search algorithm according to the relationship diagram.
Further, the theoretical model of the expected first hitting time is determined by the following quantities:
wherein EHT is the expected first hitting time; n is the size (encoding length) of the neural network architecture; v is the number of nodes of the neural network architecture; the number of all possible offspring after one mutation operation is a combination number C(n, q), where q is the number of mutation bits of the mutation operator; Ω* is the optimal subspace of the population space Ω; Ω_d is the subspace of the population space Ω whose minimum distance function value is d, and Ω_{d,j} is a secondary subspace of Ω_d; λ is the size of the population;
P_t(Ω*) is the probability that the population at the t-th iteration belongs to the optimal subspace Ω*; P_0(Ω_d) is the probability that the initial population belongs to the population subspace Ω_d; P_t(Ω_{d,j}) is the probability that the population at the t-th iteration belongs to the secondary subspace Ω_{d,j}; and N(d→d') is the number of all possible new individuals whose distance function value is d' after an individual with distance function value d undergoes the mutation operation.
Further, the method for calculating the distance function value d of the neural network architecture comprises the following steps:
converting the connection relation of the nodes of the neural network architecture into a v×v upper triangular adjacency matrix;
determining a series of real-number codes according to the node operation types of the neural network architecture, the upper triangular adjacency matrix and the real-number codes together forming the coding result of the neural network architecture;
computing the distance d(s, s*) from a neural network architecture s to the optimal neural network architecture s* using the Hamming distance:
d(s, s*) = Σ_{i=1}^{n} |s_i − s*_i|
wherein s_i and s*_i are respectively the values of the i-th bit in the coding results of the neural network architecture s and the optimal neural network architecture s*, and |·| is the absolute value sign.
Further, the sizes of the population subspace Ω_d and of its secondary subspaces are calculated from the function m(·) that counts individuals at a given distance, wherein L is the number of node operation types; n1 is the number of edges that may be present in the upper triangular adjacency matrix; n2 is the number of intermediate vertices other than the input and output vertices; and m(d) and m(d') are the numbers of individuals in the individual search space S whose distance function values are d and d', respectively.
Further, the number of new individuals N(d→d') is calculated using the intermediate variables d1 and d2 = d − d1, wherein L is the number of node operation types; n1 is the number of edges that may be present in the upper triangular adjacency matrix; n2 is the number of intermediate vertices other than the input and output vertices; and z is a summation variable.
Further, the neural architecture search algorithm is an ENAS algorithm.
Further, the object to be identified corresponds to an image classification, semantic segmentation, or graph-structure data processing task;
when the task is image classification, the search space is the NAS-Bench-101 search space; when the task is semantic segmentation, the search space is the Auto-DeepLab search space; when the task is graph-structure data processing, the search space is the GraphNAS search space.
In a second aspect, an optimization system for evolution parameters of a neural architecture search algorithm is provided, comprising:
the screening module is used for acquiring the object to be identified and determining a neural network model for identifying the object to be identified;
the selection module is used for selecting a neural architecture search algorithm for constructing the neural network model and a neural network architecture in a search space;
the model construction module is used for reading the node number and the node operation types of the neural network architecture and constructing a theoretical model of the expected first hitting time with which the neural architecture search algorithm finds the neural network model;
the drawing module is used for drawing a relationship diagram of population size and iteration number by the controlled-variable method according to the expected first hitting time theoretical model;
and the parameter selection module is used for selecting the optimal population size and the optimal iteration times of the neural architecture search algorithm according to the relation diagram.
In a third aspect, a storage medium is provided, the storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of a method for optimizing evolution parameters of a neural architecture search algorithm.
The beneficial effects of the invention are as follows: the expected first hitting time theoretical model is constructed based on the search algorithm and the parameters of the neural network architecture in the search space; it can reflect the nonlinear relation among the expected first hitting time, the population size, and the iteration number, and the optimal evolution parameters of the search algorithm can be found by combining it with the relationship diagram drawn by the controlled-variable method.
Determining the population size and the iteration number in this way reduces excessive dependence on neural network experts, so that researchers in related fields can quickly determine good evolution parameters even without network-design experience.
In the subsequent neural network search process, the evolution parameters determined by the scheme avoid the situation in the prior art where an excessively large population and iteration number are selected and a large amount of computing power is wasted, thereby saving resources.
In addition, the neural architecture search algorithm and the neural network architecture of the scheme are selected based on the object to be identified, so that the subsequently constructed neural network model is better suited to identifying that object, ensuring that the neural network later built from the evolution parameters has high recognition accuracy.
Drawings
FIG. 1 is a flow chart of a method of optimizing evolution parameters of a neural architecture search algorithm.
FIG. 2 is a graph of the relationship between the lower bound of the EHT of the ENAS algorithm and the parameters λ (population size) and n (individual size).
FIG. 3 is a graph of the relationship between the lower bound of the running time of the ENAS algorithm and the parameters λ and n.
Fig. 4 is a CNN structure and its encoding process.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art; however, the present invention is not limited to the scope of these embodiments, and to those skilled in the art, all inventions making use of the inventive concept fall within the spirit and scope of the present invention as defined in the appended claims.
Referring to FIG. 1, FIG. 1 shows an evolution parameter optimization method of a neural architecture search algorithm; the method includes steps S1 to S5.
In step S1, an object to be identified is obtained, and a neural network model for identifying the object is determined. The neural network model selected here may be one with high recognition accuracy for that object; for example, if the object to be identified is a medical image and convolutional neural networks are used for such segmentation and recognition, a convolutional neural network model may be selected.
Selecting the neural network model according to the object to be identified ensures the recognition accuracy of the neural network constructed by the later search when it is applied to that object.
When the scheme is implemented, the object to be identified preferably corresponds to an image classification, semantic segmentation, or graph-structure data processing task;
when the task is image classification, such as identifying digits, cats, dogs, airplanes, and the like, the search space is the NAS-Bench-101 search space; when the task is semantic segmentation, such as urban street-scene semantic understanding or target-category segmentation (e.g., of medical images), the search space is the Auto-DeepLab search space; when the task is graph-structure data processing, such as recommendation systems, knowledge-graph completion, or citation prediction, the search space is the GraphNAS search space.
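As a minimal sketch of this task-to-search-space selection (the snake_case task identifiers are hypothetical; NAS-Bench-101, Auto-DeepLab, and GraphNAS are the usual renderings of the space names above):

```python
# Hypothetical helper mapping a recognition task to its search space.
SEARCH_SPACES = {
    "image_classification": "NAS-Bench-101",   # digits, cats, dogs, airplanes...
    "semantic_segmentation": "Auto-DeepLab",   # street scenes, medical images...
    "graph_data_processing": "GraphNAS",       # recommendation, KG completion...
}

def select_search_space(task: str) -> str:
    """Return the search space name for a task, failing loudly on unknown tasks."""
    try:
        return SEARCH_SPACES[task]
    except KeyError:
        raise ValueError(f"unsupported task: {task}") from None

print(select_search_space("semantic_segmentation"))  # Auto-DeepLab
```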
In step S2, a neural architecture search algorithm for constructing a neural network model and a neural network architecture in a search space are selected; the neural architecture search algorithm of the present approach is preferably the ENAS algorithm.
In step S3, the node number and node operation types of the neural network architecture are read, and a theoretical model of the expected first hitting time with which the neural architecture search algorithm finds the neural network model is constructed.
In one embodiment of the invention, the theoretical model of the expected first hitting time is determined by the following quantities:
wherein EHT is the expected first hitting time; n is the size (encoding length) of the neural network architecture; v is the number of nodes of the neural network architecture; the number of all possible offspring after one mutation operation is a combination number C(n, q), where q is the number of mutation bits of the mutation operator; Ω* is the optimal subspace of the population space Ω; Ω_d is the subspace of the population space Ω whose minimum distance function value is d, and Ω_{d,j} is a secondary subspace of Ω_d; λ is the size of the population;
P_t(Ω*) is the probability that the population at the t-th iteration belongs to the optimal subspace Ω*; P_0(Ω_d) is the probability that the initial population belongs to the population subspace Ω_d; P_t(Ω_{d,j}) is the probability that the population at the t-th iteration belongs to the secondary subspace Ω_{d,j}; and N(d→d') is the number of all possible new individuals whose distance function value is d' after an individual with distance function value d undergoes the mutation operation.
In this scheme, the population space contains all possible populations; each population consists of λ individuals (an individual is a neural network architecture), and the manner in which individuals form a population follows the "unordered, repeatable" principle (i.e., a population is an unordered multiset in which individuals may repeat). The size of the population space Ω is accordingly a number of combinations with repetition.
To facilitate understanding of the population space, the following description is provided in connection with a small case:
assume the population space Ω has size 5, with all populations denoted X1, ..., X5, and that the minimum distance function value of the individuals in each population is known. Based on these distance function values, the population space Ω can be divided into 3 types of subspaces, the optimal subspace Ω* and two subspaces Ω_{d1} and Ω_{d2}, with the inclusion relationship Ω = Ω* ∪ Ω_{d1} ∪ Ω_{d2}, wherein Ω* is the optimal subspace of the population space Ω.
Assuming the numbers of individuals attaining each distance function value in every population are also known, each non-optimal subspace can be further divided into secondary subspaces: according to the second-smallest distance values of its populations, the subspace Ω_{d1} can be divided into 2 classes of secondary subspaces, and likewise the subspace Ω_{d2} can be divided into 2 classes of secondary subspaces.
The size |Ω_d| of a population subspace and the sizes of its secondary subspaces can be calculated from the function m(·), where m(d) represents the number of individuals in the individual search space S whose distance function value is d (m(d) and m(d') respectively denote the counts at distance values d and d'); in the function m, the distance function value d is the variable. The function m is calculated as follows:
m(d) = Σ_{d1+d2=d} C(n1, d1) · C(n2, d2) · (L−1)^{d2}
wherein C(n1, d1) and C(n2, d2) both represent "numbers of combinations", d1 counts differing bits in the adjacency-matrix part of the code, and d2 = d − d1 counts differing operation codes, each of which has L − 1 alternative values.
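Assuming this counting interpretation — binary adjacency bits plus operation codes with L − 1 alternatives each, which is a reconstruction where the original formula image is not legible — the function m can be sketched and sanity-checked against the total size of the search space:

```python
from math import comb

def m(d, n1, n2, L):
    """Number of individuals at Hamming distance d from the optimum: choose d1
    differing adjacency bits (binary, one alternative each) and d2 = d - d1
    differing operation codes ((L - 1) alternatives each)."""
    total = 0
    for d1 in range(0, min(d, n1) + 1):
        d2 = d - d1
        if d2 > n2:
            continue
        total += comb(n1, d1) * comb(n2, d2) * (L - 1) ** d2
    return total

# Setting from the text: v = 7 nodes, L = 3 operation types
v, L = 7, 3
n1 = v * (v - 1) // 2   # possible edges in the upper triangular adjacency matrix (21)
n2 = v - 2              # intermediate vertices (5); individual length n = 26
n = n1 + n2
sizes = [m(d, n1, n2, L) for d in range(n + 1)]
# summing over all distances must recover the whole search space, 2^n1 * L^n2
print(sum(sizes) == 2**n1 * L**n2)
```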
In step S4, according to the theoretical model of the expected first hitting time, a relationship diagram of the population size and the number of iterations is drawn by the controlled-variable method:
parameters such as the number of network nodes v and the number of operation types L are fixed, and the variable values are substituted into the theoretical model to obtain a running-time formula with the population size λ as the independent variable; a diagram of the relationship between the independent variable and the dependent variable is then drawn. In addition, v and L can also be taken as variables; to facilitate observing how the parameters interact, the relationship between these variables can be visualized (the independent variables are the population size, the number of network nodes, and so on, and the dependent variable is the EHT), yielding three-dimensional relationship diagrams; refer specifically to FIG. 2 and FIG. 3.
In step S5, according to the relationship diagram, the optimal population size and iteration number of the neural architecture search algorithm are selected.
As can be seen from FIG. 2, as the size of the neural network increases, the ENAS algorithm takes longer to search for the optimal network structure; as the population size increases, the number of generations the ENAS algorithm needs to find the optimal network structure decreases and gradually stabilizes. Therefore, to conserve computing resources, taking the individual (network) size n = 26 as an example, the present scheme gives the following parameter-setting descriptions:
1) If only the most suitable population size λ is considered, it can be seen from FIG. 2 that the EHT value starts to stabilize when λ is around 20, so the researcher can set the parameter λ to 20.
2) If the concern is only how many algorithm iterations T make the algorithm perform well while sparing the experimenter from debugging the parameter, the researcher may start setting from the minimum value of the lower bound of the EHT (i.e., T = 331).
3) If the overall running time is to be optimal, the inflection point of the running-time lower bound with respect to the variable λ (i.e., the value of λ at which the second derivative of the function is 0) needs to be observed. For this purpose, a three-dimensional diagram of population size, individual size, and running time is shown in FIG. 3; from FIG. 3, for n = 26, the value of λ at which the second derivative of the running-time lower bound is 0 can be read off, and the parameter values can be set accordingly in this example.
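The selection rules in 1) and 3) can be mimicked numerically. The decaying EHT curve below is a hypothetical stand-in for the patent's closed-form model, and the stabilization tolerance is an assumed threshold:

```python
import math

def pick_lambda_stable(eht, lambdas, tol=1e-3):
    """Rule 1): smallest population size after which consecutive EHT values
    change by less than `tol` (the curve has 'started to stabilize')."""
    for i in range(len(lambdas) - 1):
        if abs(eht(lambdas[i + 1]) - eht(lambdas[i])) < tol:
            return lambdas[i]
    return lambdas[-1]

def pick_lambda_inflection(runtime, lambdas):
    """Rule 3): population size where the discrete second derivative of the
    running-time curve is closest to zero (the flattening point)."""
    best, best_abs = lambdas[1], float("inf")
    for i in range(1, len(lambdas) - 1):
        d2 = runtime(lambdas[i - 1]) - 2 * runtime(lambdas[i]) + runtime(lambdas[i + 1])
        if abs(d2) < best_abs:
            best, best_abs = lambdas[i], abs(d2)
    return best

# hypothetical decaying EHT curve, standing in for the patent's model
eht = lambda lam: 331 + 500 * math.exp(-0.4 * lam)
lambdas = list(range(1, 60))
print(pick_lambda_stable(eht, lambdas))
```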
In one embodiment of the present invention, the method for calculating the distance function value d of the neural network architecture includes:
converting the connection relation of the nodes of the neural network architecture into a v×v upper triangular adjacency matrix;
determining a series of real-number codes according to the node operation types of the neural network architecture, the upper triangular adjacency matrix and the real-number codes together forming the coding result of the neural network architecture;
computing the distance d(s, s*) from a neural network architecture s to the optimal neural network architecture s* using the Hamming distance:
d(s, s*) = Σ_{i=1}^{n} |s_i − s*_i|
wherein s_i and s*_i are respectively the values of the i-th bit in the coding results of s and s*, and |·| is the absolute value sign.
The calculation of the distance function value d of the neural network architecture is illustrated with a specific example:
for a network architecture with v nodes, a v×v upper triangular adjacency matrix can be formed according to the node connection relation (refer to the middle part of FIG. 4), where "1" indicates that a connection exists between two nodes (i.e., describes an edge). From this adjacency matrix, n1 = v(v−1)/2 denotes the number of possible edges in the upper triangular adjacency matrix, n2 = v − 2 denotes the number of intermediate vertices other than the input and output vertices, and n = n1 + n2 denotes the individual length.
FIG. 4 shows the coding process of a convolutional neural network structure with node number v = 7 and operation type number L = 3; the coding result s of the neural network is a character string of length n = 21 + 5 = 26. The distance from an individual s (a neural network architecture) to the optimal solution s* (the optimal neural network architecture) uses the Hamming distance.
Specifically, after s and s* are encoded, they are two character strings of the same length, with s_i and s*_i respectively denoting the value of the i-th bit; the distance function d(s, s*) = Σ_i |s_i − s*_i| in practice compares the two strings and gives the number of bits at which they differ.
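A sketch of the encoding and distance computation for the v = 7, L = 3 case; the chain-shaped architecture and its operation codes below are made up for illustration:

```python
def encode(adj, ops):
    """Flatten the strict upper triangle of a v x v adjacency matrix row by row,
    then append the operation codes of the v - 2 intermediate vertices."""
    v = len(adj)
    bits = [adj[i][j] for i in range(v) for j in range(i + 1, v)]
    return bits + list(ops)

def hamming(s, s_star):
    """Distance function d: number of positions where the two encodings differ."""
    return sum(a != b for a, b in zip(s, s_star))

v, L = 7, 3
# a hypothetical chain-like architecture: node i feeds node i + 1
adj = [[1 if j == i + 1 else 0 for j in range(v)] for i in range(v)]
ops = [1, 2, 3, 1, 2]           # one code in {1..L} per intermediate vertex
s = encode(adj, ops)
print(len(s))                   # 21 adjacency bits + 5 operation codes = 26
```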
In practice, the sizes of the population subspaces and secondary subspaces are preferably calculated from the counting function m(·), wherein L is the number of node operation types; n1 is the number of edges that may be present in the upper triangular adjacency matrix; n2 is the number of intermediate vertices other than the input and output vertices; and m(d) and m(d') are the numbers of individuals in the individual search space S whose distance function values are d and d', respectively.
The number of new individuals N(d→d') is calculated using the intermediate variables d1 and d2 = d − d1, wherein L is the number of node operation types; n1 is the number of edges that may be present in the upper triangular adjacency matrix; n2 is the number of intermediate vertices other than the input and output vertices; and z is a summation variable.
The scheme also provides an optimization system of the evolution parameter optimization method of the neural architecture search algorithm, which comprises the following steps:
the screening module is used for acquiring the object to be identified and determining a neural network model for identifying the object to be identified;
the selection module is used for selecting a neural architecture search algorithm for constructing the neural network model and a neural network architecture in a search space;
the model construction module is used for reading the node number and the node operation types of the neural network architecture and constructing a theoretical model of the expected first hitting time with which the neural architecture search algorithm finds the neural network model;
the drawing module is used for drawing a relationship diagram of population size and iteration number by the controlled-variable method according to the expected first hitting time theoretical model;
and the parameter selection module is used for selecting the optimal population size and the optimal iteration times of the neural architecture search algorithm according to the relation diagram.
Finally, the present solution also provides a storage medium, where the storage medium stores a plurality of instructions, the instructions being adapted to be loaded by a processor to perform the steps of the method for optimizing the evolution parameters of the neural architecture search algorithm.
In conclusion, the evolution parameters obtained by the scheme can greatly reduce the computing power required to search for a neural network, shorten the architecture search time, and lower the neural-network expertise demanded of researchers.
Claims (7)
1. The evolution parameter optimization method of the neural architecture search algorithm is characterized by comprising the following steps of:
acquiring an object to be identified, and determining a neural network model for identifying the object to be identified; the task of the object to be identified is image classification or semantic segmentation; when the task is image classification, the search space is the NAS-Bench-101 search space; when the task is semantic segmentation, the search space is the Auto-DeepLab search space;
selecting a neural architecture search algorithm for constructing the neural network model and a neural network architecture in a search space;
reading the node number and the node operation types of the neural network architecture, and constructing a theoretical model of the expected first hitting time with which the neural architecture search algorithm finds the neural network model;
drawing a relationship diagram of population size and iteration number by the controlled-variable method according to the expected first hitting time theoretical model;
selecting the optimal population size and iteration times of the neural architecture search algorithm according to the relation diagram;
the calculation formula of the expected first-time theoretical model is as follows:
,/>
wherein,EHTfor the expected number of times;nis the size of the neural network architecture;vthe number of nodes for the neural network architecture;the number of possibility of all offspring after one mutation operation is +.>Is the number of combinations;qthe number of mutation bits of the mutation operator; />Is the population space->An optimal subspace of (a); />Is the population space->Intermediate distance function valuedIs used in the space of the sub-space of (a),is a population subspace->Is a secondary subspace of (2); />Is the size of the population;
is the firsttThe population at the time of generation iteration belongs to the optimal subspace +.>Probability of (2); />For the initial time the population belongs to the population subspace +.>Probability of (2); />Is->The population at the iteration time belongs to the secondary subspace +.>Probability of (2);for distance function value +.>After mutation operation, its offspring has a distance function value of +.>All possible new individual numbers of the neural network.
2. The method for optimizing evolution parameters of a neural architecture search algorithm according to claim 1, wherein the method for calculating the distance function value d of the neural network architecture comprises:
converting the connection relation of the nodes of the neural network architecture into a v×v upper triangular adjacency matrix;
determining a series of real-number codes according to the node operation types of the neural network architecture, the upper triangular adjacency matrix and the real-number codes together forming the coding result of the neural network architecture;
computing the distance d(s, s*) from a neural network architecture s to the optimal neural network architecture s* using the Hamming distance:
d(s, s*) = Σ_{i=1}^{n} |s_i − s*_i|
wherein s_i and s*_i are respectively the values of the i-th bit in the coding results of the neural network architecture s and the optimal neural network architecture s*, and |·| is the absolute value sign.
3. The method for optimizing evolution parameters of a neural architecture search algorithm according to claim 2, wherein the sizes of the population subspace Ω_d and of its secondary subspaces are calculated from the function m(·), which counts the individuals in the individual search space S with a given distance function value:
m(d) = Σ_{d1+d2=d} C(n1, d1) · C(n2, d2) · (L−1)^{d2}
wherein L is the number of node operation types; n1 is the number of edges that may be present in the upper triangular adjacency matrix; n2 is the number of intermediate vertices other than the input and output vertices; and m(d) and m(d') are the numbers of individuals in the individual search space S whose distance function values are d and d', respectively.
4. The evolution parameter optimization method of a neural architecture search algorithm according to claim 1, wherein the number of new individuals N_{d→d'} is calculated by summing over the variable z, with
d2 = d − d1
wherein L is the number of node operation types; n1 is the number of edges that may be present in the upper triangular adjacency matrix; n2 is the number of intermediate vertices other than the input and output vertices; z is a summation variable ranging over its admissible values; d1 and d2 are both intermediate variables.
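The closed forms in claims 3 and 4 survive only as figures, but their combinatorial shape (splitting a distance d into d1 differing edge bits and d2 = d − d1 differing operation positions) can be sketched. The function below is an assumption, not the patent's formula: it counts encodings with n1 binary edge bits and n2 operation positions over L types, treating each differing position as contributing exactly 1 to the distance:

```python
from math import comb

def count_at_distance(d, n1, n2, L):
    """Number of encodings at distance d from a fixed optimum, assuming
    n1 binary edge bits and n2 operation positions with L types each, and
    that every differing position contributes 1 to the distance (an
    assumption; the patent's exact closed form is not reproduced here)."""
    total = 0
    for d1 in range(min(d, n1) + 1):
        d2 = d - d1  # d2 = d - d1, matching the claim's intermediate variables
        if 0 <= d2 <= n2:
            # choose which edge bits differ, which operation positions
            # differ, and one of (L - 1) alternative types per position
            total += comb(n1, d1) * comb(n2, d2) * (L - 1) ** d2
    return total

print(count_at_distance(2, n1=3, n2=2, L=3))  # → 19
```

With n1 = 3, n2 = 2, L = 3 and d = 2, the three admissible splits contribute 4 + 12 + 3 = 19 encodings.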
5. The evolution parameter optimization method of a neural architecture search algorithm according to any one of claims 1 to 4, wherein the neural architecture search algorithm is the ENAS algorithm.
6. An optimization system implementing the evolution parameter optimization method of the neural architecture search algorithm according to any one of claims 1 to 5, comprising:
the screening module is used for acquiring the object to be identified and determining a neural network model for identifying the object to be identified;
the selection module is used for selecting a neural architecture search algorithm for constructing the neural network model and a neural network architecture in a search space;
the model construction module is used for reading the node number and the node operation types of the neural network architecture and constructing a theoretical model of the expected first hitting time for the neural architecture search algorithm to find the neural network model;
the drawing module is used for drawing a relationship diagram of population size versus iteration count by a control variable method according to the expected-first-hitting-time theoretical model;
and the parameter selection module is used for selecting the optimal population size and the optimal number of iterations of the neural architecture search algorithm according to the relationship diagram.
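The five module responsibilities above can be sketched as a single pipeline. Everything in the skeleton below except the module responsibilities is a placeholder: the class and method names, the toy EHT closed form, and the returned dictionaries are illustrative assumptions, not the patented implementation:

```python
class EvolutionParameterOptimizer:
    """Skeleton of the five claimed modules; all bodies are placeholders."""

    def screen(self, target):
        # screening module: pick a neural network model for the object
        return {"object": target, "model": "cnn"}

    def select(self, model):
        # selection module: NAS algorithm plus a search-space architecture
        return {"algorithm": "ENAS", "search_space": "cell-based"}

    def build_theory(self, num_nodes, num_op_types):
        # model construction module: EHT theoretical model from the node
        # count and operation types (toy closed form, not the patent's)
        return lambda pop_size: num_nodes * num_op_types / max(pop_size, 1)

    def plot_relation(self, eht_model, pop_sizes):
        # drawing module: population size vs expected iterations, swept
        # by a control-variable method (actual plotting omitted)
        return {lam: eht_model(lam) for lam in pop_sizes}

    def pick_parameters(self, relation):
        # parameter selection module: population size with fewest iterations
        return min(relation, key=relation.get)

opt = EvolutionParameterOptimizer()
relation = opt.plot_relation(opt.build_theory(7, 3), [4, 8, 16])
print(opt.pick_parameters(relation))  # → 16
```

The design point worth noting is that the drawing and parameter-selection modules only consume the theoretical model; no architectures are trained while tuning the evolution parameters.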
7. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the evolution parameter optimization method of the neural architecture search algorithm according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210551112.0A CN114997360B (en) | 2022-05-18 | 2022-05-18 | Evolution parameter optimization method, system and storage medium of neural architecture search algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114997360A CN114997360A (en) | 2022-09-02 |
CN114997360B true CN114997360B (en) | 2024-01-19 |
Family
ID=83027695
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210551112.0A Active CN114997360B (en) | 2022-05-18 | 2022-05-18 | Evolution parameter optimization method, system and storage medium of neural architecture search algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114997360B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117611974B (en) * | 2024-01-24 | 2024-04-16 | 湘潭大学 | Image recognition method and system based on multi-population alternating evolutionary neural architecture search |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105701542A (en) * | 2016-01-08 | 2016-06-22 | 浙江工业大学 | Neural network evolution method based on multi-local search |
CN108320062A (en) * | 2018-03-21 | 2018-07-24 | 广东电网有限责任公司电力科学研究院 | A combined scheduling method and system based on a multi-objective population search algorithm |
WO2018161468A1 (en) * | 2017-03-10 | 2018-09-13 | 东莞理工学院 | Global optimization, searching and machine learning method based on lamarck acquired genetic principle |
CN111144555A (en) * | 2019-12-31 | 2020-05-12 | 中国人民解放军国防科技大学 | Recurrent neural network architecture search method, system and medium based on improved evolutionary algorithm |
CN111353582A (en) * | 2020-02-19 | 2020-06-30 | 四川大学 | Particle swarm algorithm-based distributed deep learning parameter updating method |
CN112465120A (en) * | 2020-12-08 | 2021-03-09 | 上海悠络客电子科技股份有限公司 | Fast attention neural network architecture searching method based on evolution method |
WO2021043193A1 (en) * | 2019-09-04 | 2021-03-11 | 华为技术有限公司 | Neural network structure search method and image processing method and device |
CN112561039A (en) * | 2020-12-26 | 2021-03-26 | 上海悠络客电子科技股份有限公司 | Improved search method of evolutionary neural network architecture based on hyper-network |
WO2021057056A1 (en) * | 2019-09-25 | 2021-04-01 | 华为技术有限公司 | Neural architecture search method, image processing method and device, and storage medium |
CN112766466A (en) * | 2021-02-23 | 2021-05-07 | 北京市商汤科技开发有限公司 | Neural network architecture searching method and device and electronic equipment |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220147680A1 (en) * | 2020-11-12 | 2022-05-12 | Samsung Electronics Co., Ltd. | Method for co-design of hardware and neural network architectures using coarse-to-fine search, two-phased block distillation and neural hardware predictor |
2022-05-18: CN application CN202210551112.0A filed; patent CN114997360B active
Non-Patent Citations (4)
Title |
---|
BenchENAS: A Benchmarking Platform for Evolutionary Neural Architecture Search; Xiangning Xie et al.; arXiv; vol. 2021; pp. 1-14 *
A Survey of Gradient-Free Evolutionary Neural Architecture Search Algorithms; Shang Diya, Sun Hua, Hong Zhenhou, Zeng Qingliang; Computer Engineering; vol. 46, no. 9 *
Research on Neural Network Architecture Search Methods Based on Evolutionary Optimization; Hu Wenyue; CNKI; vol. 2022, no. 4 *
Parameter Tuning Method for Radial Basis Function Neural Networks Using the Seeker Optimization Algorithm; He Li, Xiao Mingfang, Zhang Weiya; Journal of Huaqiao University (Natural Science); no. 2 *
Also Published As
Publication number | Publication date |
---|---|
CN114997360A (en) | 2022-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11256487B2 (en) | Vectorized representation method of software source code | |
WO2022083624A1 (en) | Model acquisition method, and device | |
CN112507699B (en) | Remote supervision relation extraction method based on graph convolution network | |
CN109388565B (en) | Software system performance optimization method based on generating type countermeasure network | |
CN109857457B (en) | Function level embedding representation method in source code learning in hyperbolic space | |
CN113033786B (en) | Fault diagnosis model construction method and device based on time convolution network | |
US20230306035A1 (en) | Automatic recommendation of analysis for dataset | |
CN113128622B (en) | Multi-label classification method and system based on semantic-label multi-granularity attention | |
CN113010683B (en) | Entity relationship identification method and system based on improved graph attention network | |
CN114997360B (en) | Evolution parameter optimization method, system and storage medium of neural architecture search algorithm | |
CN112860904A (en) | External knowledge-integrated biomedical relation extraction method | |
CN114462623A (en) | Data analysis method, system and platform based on edge calculation | |
CN116384576A (en) | Wind speed prediction method, device, system and storage medium | |
CN114897085A (en) | Clustering method based on closed subgraph link prediction and computer equipment | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
CN115599918B (en) | Graph enhancement-based mutual learning text classification method and system | |
CN115827878B (en) | Sentence emotion analysis method, sentence emotion analysis device and sentence emotion analysis equipment | |
WO2023078009A1 (en) | Model weight acquisition method and related system | |
CN111126443A (en) | Network representation learning method based on random walk | |
CN116108127A (en) | Document level event extraction method based on heterogeneous graph interaction and mask multi-head attention mechanism | |
CN113779994B (en) | Element extraction method, element extraction device, computer equipment and storage medium | |
CN114298290A (en) | Neural network coding method and coder based on self-supervision learning | |
CN116187310A (en) | Document-level relation extraction method, device, equipment and storage medium | |
CN114595641A (en) | Method and system for solving combined optimization problem | |
CN113077003A (en) | Graph attention network inductive learning method based on graph sampling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||