CN111382868B - Neural network structure searching method and device

Info

Publication number
CN111382868B
Authority
CN
China
Prior art keywords
neural network
search space
network
searching
network layer
Prior art date
Legal status
Active
Application number
CN202010109054.7A
Other languages
Chinese (zh)
Other versions
CN111382868A (en)
Inventor
陈醒濠
杨朝晖
王云鹤
许春景
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202010109054.7A
Publication of CN111382868A
Application granted
Publication of CN111382868B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Abstract

The present application provides a neural network structure search method and apparatus that use artificial intelligence technology. In this technical scheme, a sampling model is trained according to a given search space and the resource constraint condition of a target device, so that the sampling model samples, from the given search space, neural network structures that satisfy the resource constraint condition; a candidate search space is then obtained from the given search space by means of the sampling model, and the target neural network structure is searched from the candidate search space. This scheme ensures that the searched neural network satisfies the resource constraint condition of the target device, thereby improving the efficiency of neural network structure search. In addition, the present application also provides a technical scheme in which the key layers of the neural network structure are searched first and the non-key layers are searched afterwards, which not only yields a better neural network structure but also further improves the efficiency of neural network structure search.

Description

Neural network structure searching method and device
Technical Field
The present application relates to the field of artificial intelligence, and more particularly, to a neural network structure search method and a neural network structure search apparatus.
Background
Deep neural networks are widely used and highly successful in many visual tasks, such as image classification, image detection, and image segmentation. The design of conventional neural network structures, such as ResNet, MobileNet, and ShuffleNet, is highly dependent on expert knowledge.
In recent years, neural architecture search (NAS) has been proposed in the field of artificial intelligence; it can automatically generate a better or even optimal neural network structure.
In general, a neural network structure obtained by NAS will run on a resource-constrained device. If the resource requirement of the neural network structure obtained by NAS is too large, the target device cannot run the neural network structure normally, NAS may need to be performed again, and NAS efficiency is therefore low.
Disclosure of Invention
The neural network structure search method and apparatus provided by the present application can ensure that the searched neural network satisfies the resource constraint condition of the target device, thereby improving the efficiency of neural network structure search.
In a first aspect, the present application provides a neural network structure search method, including: determining a constraint condition on the resources that a target device provides for running a neural network, where the neural network is used for processing images, text, or speech; acquiring a given search space of the neural network; training a sampling model according to the constraint condition and the given search space, so that the resource requirement, when running on the target device, of a neural network sampled from the given search space by the sampling model satisfies the constraint condition; determining a candidate search space according to the sampling model and the given search space, where the candidate search space includes neural networks sampled from the given search space by the sampling model; and searching for the target neural network according to the candidate search space.
In this method, the trained sampling model can sample, from the given search space, neural networks that satisfy the resource constraint condition of the target device, so the candidate search space determined based on the sampling model and the given search space contains neural networks that satisfy the resource constraint condition of the target device. Therefore, searching the candidate search space for the target neural network is guaranteed to yield a neural network structure that satisfies the resource constraint condition of the target device, which improves search efficiency.
In addition, searching for the target neural network in the candidate search space rather than in the given search space avoids searching neural networks that do not satisfy the resource constraint condition of the target device, which also improves search efficiency.
Because the search efficiency of the method is higher, the target neural network can be searched using larger-scale training data, so that a better neural network for running on the target device can be found.
With reference to the first aspect, in a first possible implementation manner, the searching for the target neural network according to the candidate search space includes: searching for a first network layer according to the candidate search space, where the first network layer is a network layer included in every neural network in the given search space; searching for a second network layer according to the candidate search space and the first network layer; and determining the target neural network according to the first network layer and the second network layer, where the second network layer includes the network layers, other than the first network layer, among the network layers included in the neural networks in the given search space.
In this implementation, a key network layer (i.e., the first network layer) may be searched from the candidate search space, and a non-key network layer (i.e., the second network layer) may then be searched from the candidate search space based on the network layer already found, so that the target neural network formed by the two kinds of network layers is obtained.
The network layers that every neural network in the search space must include are usually the key layers of the neural network, and the key layers usually play an important role when the neural network performs a task; that is, the performance of the key layers largely determines the performance of the neural network. Therefore, in this implementation, the key layers (i.e., the first network layer) are searched first, which removes the influence of the non-key layers (i.e., the second network layer), so that key layers with better performance can be found, and a neural network with better performance can be obtained.
With reference to the first aspect or any one of the foregoing possible implementation manners, in a second possible implementation manner, the sampling model samples the given search space based on a gumbel-softmax sampling method to obtain a neural network.
With reference to the first aspect or any one of the foregoing possible implementation manners, in a third possible implementation manner, the neural network for processing images may be used for classifying images, segmenting images, detecting images, recognizing images, or generating images; the neural network for processing text may be used for translating text, paraphrasing text, or generating text; and the neural network for processing speech may be used for recognizing speech, translating speech, or generating speech.
With reference to the first aspect or any one of the foregoing possible implementation manners, in a fourth possible implementation manner, the resources include computing resources and/or storage resources.
In a second aspect, the present application provides a neural network structure search method, the method including: a server receives a first message from a target device, where the first message is used to request the server to perform neural network structure search, and the neural network is used for processing images, text, or speech; the server searches for a first network layer according to a given search space, where the first network layer is a network layer included in every neural network in the given search space; the server searches for a second network layer according to the given search space and the first network layer, where the second network layer includes the network layers, other than the first network layer, among the network layers included in the neural networks in the given search space; and the server sends a target neural network to the target device, the target neural network including the first network layer and the second network layer.
The network layers that every neural network in the search space must include are usually the key layers of the neural network, and the key layers usually play an important role when the neural network performs a task; that is, the performance of the key layers largely determines the performance of the neural network. Therefore, in this implementation, the server searches the key layers (i.e., the first network layer) first, which removes the influence of the non-key layers (i.e., the second network layer), so that key layers with better performance can be found, and a neural network with better performance can be obtained.
In some possible implementations, the neural network for processing images may be used for classifying images, segmenting images, detecting images, recognizing images, or generating images; the neural network for processing text may be used for translating text, paraphrasing text, or generating text; and the neural network for processing speech may be used for recognizing speech, translating speech, or generating speech.
In a third aspect, the present application provides a neural network structure search apparatus, the apparatus including: a determining module, configured to determine a constraint condition on the resources that a target device provides for running a neural network, where the neural network is used for processing images, text, or speech; an acquisition module, configured to acquire a given search space of the neural network; a training module, configured to train a sampling model according to the constraint condition and the given search space, so that the resource requirement, when running on the target device, of a neural network sampled from the given search space by the sampling model satisfies the constraint condition; the determining module is further configured to determine a candidate search space according to the sampling model and the given search space, where the candidate search space includes neural networks sampled from the given search space by the sampling model; and a search module, configured to search for the target neural network according to the candidate search space.
In this apparatus, the trained sampling model can sample, from the given search space, neural networks that satisfy the resource constraint condition of the target device, so the candidate search space determined based on the sampling model and the given search space contains neural networks that satisfy the resource constraint condition of the target device. Therefore, searching the candidate search space for the target neural network is guaranteed to yield a neural network structure that satisfies the resource constraint condition of the target device, which improves search efficiency.
In addition, searching for the target neural network in the candidate search space rather than in the given search space avoids searching neural networks that do not satisfy the resource constraint condition of the target device, which also improves search efficiency.
Because the search efficiency of the apparatus is higher, the target neural network can be searched using larger-scale training data, so that a better neural network can be obtained by searching.
With reference to the third aspect, in a first possible implementation manner, the search module is specifically configured to: search for a first network layer according to the candidate search space, where the first network layer is a network layer included in every neural network in the given search space; search for a second network layer according to the candidate search space and the first network layer, where the second network layer includes the network layers, other than the first network layer, among the network layers included in the neural networks in the given search space; and determine the target neural network according to the first network layer and the second network layer.
The network layers that every neural network in the search space must include are usually the key layers of the neural network, and the key layers usually play an important role when the neural network performs a task; that is, the performance of the key layers largely determines the performance of the neural network. Therefore, in this implementation, the key layers (i.e., the first network layer) are searched first, which removes the influence of the non-key layers (i.e., the second network layer), so that key layers with better performance can be found, and a neural network with better performance can be obtained.
With reference to the third aspect or any one of the foregoing possible implementation manners, in a second possible implementation manner, the sampling model samples the given search space based on a gumbel-softmax sampling method to obtain a neural network.
With reference to the third aspect or any one of the foregoing possible implementation manners, in a third possible implementation manner, the neural network for processing images may be used for classifying images, segmenting images, detecting images, recognizing images, or generating images; the neural network for processing text may be used for translating text, paraphrasing text, or generating text; and the neural network for processing speech may be used for recognizing speech, translating speech, or generating speech.
With reference to the third aspect or any one of the foregoing possible implementation manners, in a fourth possible implementation manner, the resources include computing resources and/or storage resources.
In a fourth aspect, the present application provides a neural network structure search apparatus, the apparatus including: a receiving module, configured to receive a first message from a target device, where the first message is used to request the server to perform neural network structure search; an acquisition module, configured to acquire a given search space of a neural network, where the neural network is used for processing images, text, or speech; a search module, configured to search for a first network layer according to the given search space, where the first network layer is a network layer included in every neural network in the given search space; the search module is further configured to search for a second network layer according to the given search space and the first network layer, where the second network layer includes the network layers, other than the first network layer, among the network layers included in the neural networks in the given search space; and a sending module, configured to send a target neural network to the target device, where the target neural network includes the first network layer and the second network layer.
The network layers that every neural network in the search space must include are usually the key layers of the neural network, and the key layers usually play an important role when the neural network performs a task; that is, the performance of the key layers largely determines the performance of the neural network. Therefore, in this implementation, the key layers (i.e., the first network layer) are searched first, which removes the influence of the non-key layers (i.e., the second network layer), so that key layers with better performance can be found, and a neural network with better performance can be obtained.
In some possible implementations, the neural network for processing images may be used for classifying images, segmenting images, detecting images, recognizing images, or generating images; the neural network for processing text may be used for translating text, paraphrasing text, or generating text; and the neural network for processing speech may be used for recognizing speech, translating speech, or generating speech.
In a fifth aspect, the present application provides a neural network structure search apparatus, the apparatus comprising: a memory for storing instructions; a processor for executing the memory-stored instructions, which when executed is for performing the method of the first aspect.
In a sixth aspect, the present application provides a neural network structure search apparatus, the apparatus comprising: a memory for storing instructions; a processor for executing the memory-stored instructions, which when executed is for performing the method of the second aspect.
In a seventh aspect, the application provides a computer readable medium storing instructions for execution by a device for performing the method of the first aspect.
In an eighth aspect, the application provides a computer readable medium storing instructions for execution by a device for performing the method of the second aspect.
In a ninth aspect, the application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect.
In a tenth aspect, the application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the second aspect.
In an eleventh aspect, the present application provides a chip comprising a processor and a data interface, the processor reading instructions stored on a memory through the data interface, performing the method of the first aspect.
Optionally, as an implementation manner, the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored on the memory, where the instructions, when executed, are configured to perform the method in the first aspect.
In a twelfth aspect, the present application provides a chip comprising a processor and a data interface, the processor reading instructions stored on a memory through the data interface, performing the method of the second aspect.
Optionally, as an implementation manner, the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored on the memory, where the processor is configured to perform the method in the second aspect when the instructions are executed.
In a thirteenth aspect, the present application provides a computing device comprising a processor and a memory, wherein: the memory has stored therein computer instructions that are executed by the processor to implement the method of the first aspect.
In a fourteenth aspect, the present application provides a computing device comprising a processor and a memory, wherein: the memory has stored therein computer instructions that are executed by the processor to implement the method of the second aspect.
Drawings
FIG. 1 is an exemplary flow chart of a neural network structure search method of the present application;
FIG. 2 is another exemplary flow chart of a neural network structure search method of the present application;
FIG. 3 is another exemplary flow chart of a neural network structure search method of the present application;
FIG. 4 is another exemplary flow chart of a neural network structure search method of the present application;
FIG. 5 is an exemplary structural diagram of a neural network structure search apparatus of the present application;
FIG. 6 is another exemplary structural diagram of a neural network structure search apparatus of the present application;
FIG. 7 is another exemplary structural diagram of a neural network structure search apparatus of the present application.
Detailed Description
Some terms in the embodiments of the present application will be explained first.
The embodiments of the present application relate to related applications of neural networks, and in order to better understand the schemes of the embodiments of the present application, related terms and other related concepts of the neural networks to which the embodiments of the present application may relate are described below.
(1) Neural network
A neural network (NN) is a complex network system formed by a large number of simple processing units (called neurons) that are widely interconnected. It reflects many basic features of human brain function and is a highly complex nonlinear dynamic learning system.
The neural network may be composed of neural units. A neural unit may refer to an arithmetic unit that takes x_s and an intercept of 1 as inputs, and the output of the arithmetic unit may be as shown in formula (1-1):

h_{W,b}(x) = f(W^T x) = f\left( \sum_{s=1}^{n} W_s x_s + b \right)    (1-1)

where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, which is used to introduce a nonlinear characteristic into the neural network so as to convert the input signal of the neural unit into an output signal. The output signal of the activation function may be used as the input of the next convolutional layer, and the activation function may be a sigmoid function. A neural network is a network formed by connecting many of the above single neural units together, i.e., the output of one neural unit may be the input of another neural unit. The input of each neural unit may be connected to the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field may be an area composed of several neural units.
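As a minimal illustration of formula (1-1) (not part of the original patent text), the computation of a single neural unit can be sketched as follows; the function name, the example values, and the choice of a sigmoid activation are assumptions made only for this sketch.

```python
import numpy as np

def neural_unit(x, W, b, f=lambda z: 1.0 / (1.0 + np.exp(-z))):
    """Single neural unit as in formula (1-1): f(sum_s W_s * x_s + b).

    x: inputs (x_1 ... x_n), W: weights (W_1 ... W_n), b: bias,
    f: activation function (sigmoid by default)."""
    return f(np.dot(W, x) + b)

# Example with three inputs and illustrative weights.
x = np.array([0.5, -1.2, 0.3])
W = np.array([0.8, 0.1, -0.4])
print(neural_unit(x, W, b=0.2))
```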
(2) Deep neural network
A deep neural network (DNN), also known as a multi-layer neural network, can be understood as a neural network with multiple hidden layers. The layers of a DNN are divided, according to their positions, into three types: the input layer, the hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all the layers in between are hidden layers. The layers are fully connected, that is, every neuron in the i-th layer is connected to every neuron in the (i+1)-th layer.
Although a DNN appears complex, the work of each layer is actually not complex; it is simply the following linear relational expression: \vec{y} = \alpha(W \vec{x} + \vec{b}), where \vec{x} is the input vector, \vec{y} is the output vector, \vec{b} is the offset vector, W is the weight matrix (also called the coefficients), and \alpha() is the activation function. Each layer simply performs this operation on the input vector \vec{x} to obtain the output vector \vec{y}. Because a DNN has many layers, the number of coefficient matrices W and offset vectors \vec{b} is also large. These parameters are defined in the DNN as follows, taking the coefficient W as an example: assume that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as W^3_{24}, where the superscript 3 represents the layer in which the coefficient W is located, and the subscripts correspond to the output index 2 of the third layer and the input index 4 of the second layer.
In summary, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W^L_{jk}.
It should be noted that the input layer is devoid of W parameters. In deep neural networks, more hidden layers make the network more capable of characterizing complex situations in the real world. Theoretically, the more parameters the higher the model complexity, the greater the "capacity", meaning that it can accomplish more complex learning tasks. The process of training the deep neural network, i.e. learning the weight matrix, has the final objective of obtaining a weight matrix (a weight matrix formed by a number of layers of vectors W) for all layers of the trained deep neural network.
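As a small companion to the notation above (again not part of the patent text), a stack of fully connected layers computing y = α(Wx + b) layer by layer might be sketched as follows; the shapes, the tanh activation, and the random values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dnn_forward(x, weights, biases, act=np.tanh):
    """Apply y = act(W x + b) layer by layer; weights[L][j, k] plays the role of
    the coefficient W^L_jk from the k-th neuron of layer L-1 to the j-th neuron
    of layer L (the input layer itself has no W parameters)."""
    for W, b in zip(weights, biases):
        x = act(W @ x + b)
    return x

# A three-layer DNN: 3 inputs -> 5 hidden neurons -> 2 outputs.
weights = [rng.normal(size=(5, 3)), rng.normal(size=(2, 5))]
biases = [np.zeros(5), np.zeros(2)]
print(dnn_forward(rng.normal(size=3), weights, biases))
```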
(3) Convolutional neural network (convolutional neural network, CNN)
A convolutional neural network is a deep neural network with a convolutional structure. The convolutional neural network comprises a feature extractor consisting of a convolutional layer and a sub-sampling layer. The feature extractor can be seen as a filter and the convolution process can be seen as a convolution with an input image or convolution feature plane (feature map) using a trainable filter. The convolution layer refers to a neuron layer in the convolution neural network, which performs convolution processing on an input signal. In the convolutional layer of the convolutional neural network, one neuron may be connected with only a part of adjacent layer neurons. A convolutional layer typically contains a number of feature planes, each of which may be composed of a number of neural elements arranged in a rectangular pattern. Neural elements of the same feature plane share weights, where the shared weights are convolution kernels. Sharing weights can be understood as the way image information is extracted is independent of location. The underlying principle in this is: the statistics of a certain part of the image are the same as other parts. I.e. meaning that the image information learned in one part can also be used in another part. So we can use the same learned image information for all locations on the image. In the same convolution layer, a plurality of convolution kernels may be used to extract different image information, and in general, the greater the number of convolution kernels, the more abundant the image information reflected by the convolution operation.
The convolution kernel can be initialized in the form of a matrix with random size, and reasonable weight can be obtained through learning in the training process of the convolution neural network. In addition, the direct benefit of sharing weights is to reduce the connections between layers of the convolutional neural network, while reducing the risk of overfitting.
(4) Back propagation algorithm
The convolutional neural network can adopt a Back Propagation (BP) algorithm to correct the parameter in the initial super-resolution model in the training process, so that the reconstruction error loss of the super-resolution model is smaller and smaller. Specifically, the input signal is transmitted forward until the output is generated with error loss, and the parameters in the initial super-resolution model are updated by back-propagating the error loss information, so that the error loss is converged. The back propagation algorithm is a back propagation motion that dominates the error loss, and aims to obtain parameters of the optimal super-resolution model, such as a weight matrix.
(5) Recurrent neural network (recurrent neural network, RNN)
RNNs are used to process sequence data. In a conventional neural network model, the layers from the input layer to the hidden layer to the output layer are fully connected, while the nodes within each layer are unconnected. Such a conventional neural network, however, cannot handle many problems. For example, to predict the next word of a sentence, it is generally necessary to use the preceding words, because the words in a sentence are not independent of one another. An RNN is called a recurrent neural network because its current output for a sequence is also related to the previous outputs. Concretely, the network memorizes the preceding information and applies it to the computation of the current output; that is, the nodes between the hidden layers are no longer unconnected but connected, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous moment. In theory, an RNN can process sequence data of any length.
Training an RNN is the same as training a conventional artificial neural network (ANN): the error back propagation (BP) algorithm is also used, but with some differences. If the RNN is unfolded over time, the parameters W, U, and V are shared, whereas they are not shared in a conventional neural network. In addition, when a gradient descent algorithm is used, the output of each step depends not only on the network of the current step but also on the network states of the previous several steps. For example, at t = 4, the gradient needs to be propagated back through three additional steps, and the gradients of all these steps are accumulated. This learning algorithm is called back propagation through time.
The technical scheme of the present application can be applied to fields that require convolutional neural networks, such as cloud services, image retrieval, album management, safe city, and automatic driving. In practical applications, according to the specific application scenario and the limits of the computing resources and storage resources of the application device (such as a mobile phone terminal), the technical scheme of the present application can be used to search for a neural network model that satisfies the given resource constraint, and the neural network model is then used to perform the corresponding task, such as image recognition, object detection, or object segmentation.
Several exemplary application scenarios of the technical solution of the present application are described below.
Application scenario 1: album management system
A user stores a large number of pictures in the mobile phone album and hopes to manage the pictures in the album by category. For example, the user may wish the mobile phone to automatically group all bird images together and group all photographs of people together.
In this scenario, the technical scheme provided by the present application can be used to search, based on the computing resources of the user's mobile phone, for an image classification model structure matched with those computing resources. Running the image classification model on the mobile phone then allows the images of different categories in the album to be classified and managed, which makes them easier for the user to find, saves the user's management time, and improves album management efficiency.
Application scenario 2: object detection and segmentation
In automatic driving, detecting and segmenting targets such as pedestrians and vehicles on the street is important for making safe driving decisions for the vehicle. In this application scenario, the technical scheme provided by the present application can be used to search, based on the computing resources of the vehicle, for an object detection and segmentation model structure matched with those computing resources. Running an object detection and segmentation model with this structure on the vehicle then allows the objects in the images captured by the vehicle to be accurately detected, located, and segmented.
Fig. 1 is an exemplary flowchart of a neural network structure search method of the present application. As shown in fig. 1, the method may include S110 to S150.
S110, acquiring a given search space of a neural network, where the neural network is used for processing images, text, or speech.
The neural network may be, for example, a neural network for classifying images, a neural network for segmenting images, a neural network for detecting images, a neural network for recognizing images, a neural network for generating a specified image, a neural network for translating text, a neural network for paraphrasing text, a neural network for generating specified text, a neural network for recognizing speech, a neural network for translating speech, or a neural network for generating specified speech.
From another dimension, the neural network may be a convolutional neural network, a recurrent neural network, or the like.
It is understood that in this embodiment, the two concepts of the neural network and the neural network structure may be treated as equivalent. For example, acquiring a given search space of a neural network may be understood as acquiring a given search space of a neural network structure used for processing images, text, or speech, and the neural network used for processing images, text, or speech may equivalently be understood as the corresponding neural network structure.
The given search space may include many operations, and all or some of these operations may constitute different neural networks under different connection schemes; in other words, under different connection schemes, the network structures of the neural networks constituted by all or some of these operations are different.
S120, determining a constraint condition on the resources that the target device provides for running the neural network.
The resources include computing resources and/or storage resources of the target device. One example of a computing resource of the target device is the number of floating-point operations per second (FLOPS) that the target device can perform, and one example of a storage resource of the target device is the memory of the target device.
The computing resources may be used to constrain the amount of computation of the searched target neural network, and the storage resources may be used to constrain the amount of parameters of the searched target neural network.
For example, it is preset that the target device can provide 5 Mbytes of memory for running the neural network, i.e., when the target device runs the neural network, 5 Mbytes of memory can be provided. In this case, the parameter amount of the neural network should be about 5 Mbytes.
For another example, it is preset that the target device can perform 500M computations per second when running the neural network, and the target device is expected to finish running the neural network within a target duration, i.e., when the target device runs the neural network, 500M computations can be performed per second. In this case, the computation amount of the neural network should be about 500M multiplied by the target duration.
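The two examples above amount to checking a sampled network's resource requirements against the budgets of the target device. A minimal sketch of such a check is given below; the function name, the tolerance, and all numbers are illustrative assumptions rather than values taken from the patent.

```python
def satisfies_constraints(param_bytes, flops, mem_budget_bytes, flops_budget,
                          tolerance=0.05):
    """Return True if the parameter amount and computation amount are within
    the budgets (with a small tolerance, since the requirement only needs to be
    "about" the budgeted value)."""
    return (param_bytes <= mem_budget_bytes * (1 + tolerance)
            and flops <= flops_budget * (1 + tolerance))

# Example: a 5-Mbyte memory budget and 500M operations per second over a
# 2-second target duration.
print(satisfies_constraints(param_bytes=4_900_000, flops=950e6,
                            mem_budget_bytes=5_000_000,
                            flops_budget=500e6 * 2))
```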
The target device may be an intelligent terminal device, such as a smart phone, tablet computer, smart home device, vehicle, robot, drone, or the like.
S130, training a sampling model according to the constraint condition and the given search space, so that the resource requirement, when running on the target device, of a neural network sampled from the given search space by the sampling model satisfies the constraint condition.
The sampling model is used for sampling the given search space to obtain the neural network in the given search space.
In this embodiment, training the sampling model according to the given search space may include: sampling a neural network from the given search space by using the sampling model; calculating the resource requirement of the neural network, for example, calculating the parameter amount and computation amount of the neural network; judging whether the resource requirement of the neural network satisfies the preset resource constraint condition; if not, adjusting the parameters of the sampling model; and repeating the preceding four steps with the adjusted sampling model until a training stop condition is satisfied, for example, a preset number of training iterations is reached, or the resource requirement of any neural network sampled with the sampling model satisfies the resource constraint condition.
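A minimal numerical sketch of this training loop follows. It is only an illustration under assumed values: instead of sampling one network at a time and checking it, it adjusts the structural parameters so that the expected parameter amount of sampled networks approaches the budget, which is one possible update rule and not a rule prescribed by the patent; the per-operation costs, the budget, and the learning rate are all assumptions.

```python
import numpy as np

# Assumed toy setup: 3 network layers, each choosing one of 4 candidate
# operations, with a per-operation parameter cost in Mbytes.
OP_COST = np.array([1.0, 2.0, 3.0, 4.0])   # Mbytes per candidate operation
BUDGET = 5.0                               # constraint: about 5 Mbytes in total
theta = np.zeros((3, 4))                   # structural parameters of the sampling model
lr = 0.5

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

for step in range(500):
    p = softmax(theta)                     # per-layer sampling distribution
    expected_cost = (p * OP_COST).sum()    # expected parameter amount
    diff = expected_cost - BUDGET
    if abs(diff) < 0.05:                   # requirement is "about" the budget
        break
    # Gradient of (expected_cost - BUDGET)^2 w.r.t. theta through the softmax.
    grad_p = 2 * diff * OP_COST
    grad_theta = p * (grad_p - (p * grad_p).sum(axis=1, keepdims=True))
    theta -= lr * grad_theta

rng = np.random.default_rng(0)
ops = [rng.choice(4, p=pi) for pi in softmax(theta)]
print("sampled operations:", ops, "parameter amount:",
      OP_COST[np.array(ops)].sum(), "Mbytes")
```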
For example, when an image classification model needs to run on a mobile phone for album management, it can be determined in advance, according to the memory size of the mobile phone, that the mobile phone can allocate 5 Mbytes of memory to the image classification model when running it, and the memory constraint condition of the mobile phone is set to 5 Mbytes. In this way, when the neural network structure is searched, the sampling model can be trained according to the constraint condition of a 5-Mbyte parameter amount, so that the sampling model samples, from the given search space, neural networks whose parameter amount is about 5 Mbytes.
For another example, when an object detection model needs to run on a vehicle to detect target objects on the street, it can be determined in advance, according to the memory size of the vehicle control system, that the vehicle control system can allocate 10 Mbytes to the object detection model when running it, and the memory constraint condition of the vehicle control system is set to 10 Mbytes. In this way, when the neural network structure is searched, the sampling model can be trained according to the constraint condition of 10 Mbytes, so that the sampling model samples, from the given search space, neural networks whose parameter amount is about 10 Mbytes.
For convenience of description, parameters of the sampling model are referred to as structural parameters in this embodiment.
In some possible implementations, the sampling model may sample the given search space based on gumbel-softmax sampling methods to arrive at a neural network.
It will be appreciated that the present embodiment is not limited to the sampling method used by the sampling model to sample. For example, the sampling model may also sample the given search space based on the gumbel-max method or the ST-gumbel-softmax method.
It is to be understood that, in the present application, the resource requirement of the sampled neural network meets the resource constraint condition, which does not mean that the resource requirement of the sampled neural network is identical to the resource requirement indicated in the constraint condition, but means that the difference between the resource requirement of the sampled neural network and the resource requirement indicated in the constraint condition is within a reasonable range, for example, the difference is less than or equal to a preset threshold.
And S140, determining a candidate search space according to the sampling model and the given search space, wherein the candidate search space comprises a neural network sampled from the given search space based on the sampling model.
For example, candidate search spaces refer to search spaces that are constructed from neural networks that can be sampled from the given search space using the sampling model.
And S150, searching the target neural network according to the candidate search space.
In other words, the target neural network is obtained by searching from the candidate search space, and the network structure of the target neural network is the searched target neural network structure.
In some implementations, the target neural network structure may be searched from the candidate search space with reference to a neural network structure search method in the prior art.
In this embodiment, because the sampling model is trained to sample, from the given search space, neural networks that satisfy the resource constraint condition of the target device, the candidate search space determined based on the sampling model and the given search space contains neural networks that satisfy the resource constraint condition of the target device. Therefore, searching the candidate search space for the target neural network is guaranteed to yield a neural network structure that satisfies the resource constraint condition of the target device, which improves search efficiency.
In addition, searching for the target neural network in the candidate search space rather than in the given search space avoids searching neural networks that do not satisfy the resource constraint condition of the target device, which also improves search efficiency.
In other implementations, the searching for the target neural network according to the candidate search space may include: searching a first network layer according to the candidate search space; searching a second network layer according to the candidate search space and the first network layer; and determining the target neural network according to the first network layer and the second network layer.
In other words, a part of network layers may be searched in the candidate search space, and the network layer that has been searched is taken as a network layer of the target neural network, and then other network layers of the target neural network are searched.
For convenience of description, the network layer searched first is referred to as a first network layer, and the network layer searched based on the first network layer is referred to as a second network layer. The first network layer may be one or more than one; likewise, the second network layer may be one or more.
In this implementation, since the second network layer is not required to be searched when the first network layer is searched, the first network layer can be searched more efficiently; when searching the second network layer, the first network layer is already determined, so that the second network layer can be searched more efficiently, and finally the searching efficiency of the neural network is improved.
In some implementations, the first network layer may be a network layer that is included in every neural network in the candidate search space, or the first network layer may be a network layer that is included in every neural network in the given search space.
Whether in the candidate search space or in the given search space, the network layers that all neural networks must contain are typically the key layers of the neural network. That is, in this implementation, the key layers are searched first, and then the non-key layers are searched.
Because the key layers usually play an important role when the neural network performs a task, that is, the performance of the key layers largely determines the performance of the neural network, searching the key layers first in this implementation removes the influence of the non-key layers, so that key layers with better performance can be found, and a neural network with better performance can be obtained.
An exemplary flow chart of a search method of searching for a critical layer and then searching for a non-critical layer in the present application is shown in fig. 2. The method includes S210 to S270.
S210, acquiring a given search space.
S220, obtaining given resource constraint conditions.
And S230, sampling according to the given search space and the given resource constraint condition to obtain candidate search spaces.
S240, determining a key layer.
S250, searching the structure of the key layer from the candidate search space.
And S260, searching the structure of the non-key layer from the candidate search space.
S270, searching to obtain the structure of the target neural network.
In this embodiment, since the candidate search space is sampled according to a given resource constraint condition, the target neural network structure searched from the candidate search space can satisfy the resource constraint condition, and the search range can be narrowed, thereby improving the search efficiency. In addition, the importance of different layers in the neural network is fully considered by searching the key layers and then searching the non-key layers, so that the neural network structure with better performance can be obtained by searching.
An exemplary implementation of the present application for determining a critical layer is described below.
First, a super-network (SuperNet) corresponding to a given search space is established, wherein the super-network contains operations in the given search space and connection modes between the operations, and is denoted as N.
And then converting the super network N into a directed acyclic graph G= (V, E), wherein V is a node of the directed acyclic graph, corresponds to a characteristic graph output by an operation in the super network, and E is an edge of the directed acyclic graph, corresponds to the operation in the super network. Each sub-graph in the directed acyclic graph is a path from a start point to an end point, and each sub-graph corresponds to a neural network or a neural network structure.
Assuming that the entire super-network contains n different sub-graphs {G_i}, where i is an integer from 1 to n, the key layers of the neural networks in the given search space can be obtained using the following method:
Step 1: S = G_1 (initialize the key layer set S), i = 1;
Step 2: i = i + 1;
Step 3: S = S ∩ G_i;
Repeat Step 2 and Step 3 until i is greater than n. At this point, the layers in which the operations contained in S are located are the key layers.
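A minimal sketch of this procedure is given below, with each sub-graph represented as the set of (layer index, operation) edges it uses; the intersection reading of Step 3 and all names and example values are assumptions made for illustration only.

```python
def key_layers(subgraphs):
    """Given the n sub-graphs G_1..G_n of the super-network, each represented as
    a set of (layer_index, operation) edges, return the layers whose operations
    appear in every sub-graph, i.e. the key layers."""
    S = set(subgraphs[0])            # Step 1: S = G_1
    for G_i in subgraphs[1:]:        # Steps 2 and 3: i = i + 1, S = S intersect G_i
        S &= set(G_i)
    return sorted({layer for layer, _ in S})

# Toy example: three candidate networks that all pass through layers 0 and 3.
G1 = {(0, "conv3x3"), (1, "conv5x5"), (3, "pool")}
G2 = {(0, "conv3x3"), (2, "skip"), (3, "pool")}
G3 = {(0, "conv3x3"), (3, "pool")}
print(key_layers([G1, G2, G3]))      # -> [0, 3]
```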
In this embodiment of the present application, when the sampling model samples the given search space based on the gumbel-softmax sampling method, the neural networks obtained by sampling all satisfy the following constraint:
A_{θ,i} = GumbelSoftmax(θ)
where θ represents the structural parameters of the sampling model, GumbelSoftmax(θ) represents gumbel-softmax sampling, A_{θ,i} represents the i-th resource requirement of the neural network sampled by the gumbel-softmax sampling method, R_i represents the i-th resource requirement included in the resource constraint condition, α is a preset value, L_i should be less than or equal to a preset threshold, and M_i represents the i-th resource requirement of the neural network, among all the neural networks in the given search space, whose i-th resource requirement is the largest.
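The expression defining L_i does not survive in the text above. One plausible reading, consistent with the listed symbols (θ, A_{θ,i}, R_i, α, M_i, L_i) but offered only as an assumption rather than the patent's exact formula, is a normalized penalty of the following form:

```latex
A_{\theta} = \mathrm{GumbelSoftmax}(\theta), \qquad
L_i = \frac{\left| A_{\theta,i} - \alpha R_i \right|}{M_i} \le \varepsilon_i
```

where \varepsilon_i denotes the preset threshold for the i-th resource.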
In various embodiments of the present application, the sampling model may optionally be trained multiple times to obtain different structural parameters. Further, based on different structural parameters, different candidate search spaces can be obtained by sampling from a given search space, and searching is performed based on different search spaces, so that a neural network model with better performance can be obtained by searching.
One implementation of the sampling model to sample the given search space based on gumbel-softmax sampling method is described below.
In this implementation, the given search space may be divided into M sub-search spaces, each of which includes one or more operations, where M is a positive integer. The M sub-search spaces are in one-to-one correspondence with the M network layers of the neural network; that is, each sub-search space is used to search for the operation that the corresponding network layer of the neural network should contain. From the M sub-search spaces, the M operations that the M network layers should contain can be searched out, and these M operations constitute the target neural network, i.e., the target neural network structure.
It is understood that the search in a sub-search space may also yield no operation. When no operation is found in a sub-search space, the target neural network may not contain the corresponding network layer.
The sampling model may include a plurality of structural parameters, and the plurality of structural parameters may be divided into M structural parameter sets, where the M structural parameter sets are in one-to-one correspondence with the M sub-search spaces. Each set of structural parameters may include one or more structural parameters, and the structural parameters in each set of structural parameters are in one-to-one correspondence with operations in the sub-search space to which the set of structural parameters corresponds.
When the sampling model samples the given search space based on the gumbel-softmax sampling method, gumbel-softmax sampling is performed on the structural parameters in the structural parameter set corresponding to each sub-search space, and the operation corresponding to the sampled structural parameter is the operation searched out in that sub-search space.
Performing gumbel-softmax sampling on each of the M sub-search spaces yields M operations, and the neural network formed by these M operations is the sampled neural network.
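A minimal numpy sketch of this per-layer sampling follows, with the M structural-parameter sets stored as the rows of a matrix; the temperature value, the shapes, and the final argmax selection are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax_sample(theta_layer, temperature=1.0):
    """One gumbel-softmax draw over the structural parameters of a single
    sub-search space; returns soft weights over the candidate operations."""
    gumbel_noise = -np.log(-np.log(rng.uniform(size=theta_layer.shape)))
    logits = (theta_layer + gumbel_noise) / temperature
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sample_architecture(theta):
    """theta has shape (M, K): M sub-search spaces with K candidate operations
    each; the operation with the largest sampled weight is selected per layer."""
    return [int(np.argmax(gumbel_softmax_sample(row))) for row in theta]

theta = np.zeros((7, 4))           # e.g. 7 network layers, 4 operations each
print(sample_architecture(theta))  # one sampled neural network structure
```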
In this embodiment, a neural network is sampled from the given search space by using the sampling model based on the gumbel-softmax sampling method; the resource requirement of the neural network is then calculated, for example, its parameter amount and computation amount; it is then judged whether the resource requirement of the neural network satisfies the preset resource constraint condition; if not, the parameters of the sampling model are adjusted; and the preceding four steps are repeated with the adjusted sampling model until a training stop condition is satisfied, for example, a preset number of training iterations is reached, or the resource requirement of any neural network sampled with the sampling model satisfies the resource constraint condition.
After the sampling model is trained, any neural network sampled from the given search space by the sampling model based on the gumbel-softmax sampling method satisfies the resource constraint condition.
One implementation of the present application of searching for critical layers and then non-critical layers is described below in connection with fig. 3. Arrows in fig. 3 indicate data flow.
As shown in fig. 3, a given search space may be divided into 7 sub-search spaces, the 7 sub-search spaces corresponding one-to-one to 7 network layers of the neural network, and each sub-search space including 4 different operations therein.
As shown in fig. 3 (a), among the 7 network layers of the neural network in the given search space, the first, fourth, and seventh network layers are key layers, and the other network layers are non-key layers. The key layers may be determined using the method described above.
Accordingly, the first, fourth, and seventh sub-search spaces in a given search space are key sub-search spaces and are used to search for operations of the first, fourth, and seventh network layers, respectively.
As shown in (b) of fig. 3, an operation that the first network layer should include may be searched from among four operations in the first sub-search space, an operation that the fourth network layer should include may be searched from among four operations in the fourth sub-search space, and an operation that the seventh network layer should include may be searched from among four operations in the seventh sub-search space.
One implementation of searching for operations from the first, fourth, and seventh sub-search spaces in the given search space includes the following steps:
The same training data is input to the four operations in the first sub-search space; the output of each operation is weighted by the structural parameter corresponding to that operation, and the four weighted values are summed. For convenience of description, this sum is called the first output data. The first output data is then input to the four operations in the fourth sub-search space; their outputs are likewise weighted by the corresponding structural parameters and summed, and the sum is called the second output data. Next, the second output data is input to the four operations in the seventh sub-search space; their outputs are weighted by the corresponding structural parameters and summed, and the sum is called the third output data. Finally, according to the third output data and the training data, the parameters of the operations in the first, fourth, and seventh sub-search spaces are adjusted, and the structural parameters corresponding to these operations are adjusted.
The above steps are repeated until training stops, so that the loss of the third output data relative to the training data becomes smaller and smaller. Since the structural parameters corresponding to the operations are used to weight and sum the outputs of the operations when the operations in the key sub-search spaces are trained (that is, the outputs of the operations in a key sub-search space serve, under the constraint of the corresponding structural parameters, as the input of the operations of the next network layer), the search of the key network layers can be considered to be performed in the candidate search space.
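A minimal sketch of this weighted-sum forward pass through the three key sub-search spaces is shown below; the candidate operations, the softmax normalization of the structural parameters, and the input shapes are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def make_ops():
    """Four illustrative candidate operations of one key sub-search space."""
    return [lambda x: np.maximum(x, 0),   # relu-like
            lambda x: 0.5 * x,            # scaling
            lambda x: x,                  # identity / skip
            lambda x: np.tanh(x)]         # tanh

key_ops = {1: make_ops(), 4: make_ops(), 7: make_ops()}      # key layers 1, 4, 7
theta = {layer: rng.normal(size=4) for layer in key_ops}     # structural parameters

def key_layer_forward(x):
    """In each key layer, the outputs of all candidate operations are weighted
    by the (normalized) structural parameters and summed, and the sum is fed to
    the next key layer; the final sum plays the role of the third output data."""
    for layer in sorted(key_ops):
        w = softmax(theta[layer])
        x = sum(wi * op(x) for wi, op in zip(w, key_ops[layer]))
    return x

# After training, the operation with the largest structural parameter in each
# key sub-search space can be taken as the searched operation for that layer.
print(key_layer_forward(rng.normal(size=8)))
print({layer: int(np.argmax(theta[layer])) for layer in key_ops})
```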
After training is stopped, the target operation can be obtained by sampling from each key sub-search space according to the adjusted structural parameters. As shown in fig. 3 (c), a third operation is searched from the first sub-search space, a first operation is searched from the fourth sub-search space, and a second operation is searched from the seventh sub-search space.
Next, it may be assumed that the first, fourth, and seventh network layers of the target neural network respectively include the three operations searched as described above, and then the operations of the other network layers of the target neural network are searched in the other sub-search spaces.
The method of searching other network layers of the target neural network in the other sub-search space is similar to the method of searching the key network layers in the key sub-search space, and will not be repeated here.
As shown in (d) of fig. 3, a third operation is searched from the second sub-search space, a second operation is searched from the third sub-search space, a fourth operation is searched from the fifth sub-search space, and a first operation is searched from the sixth sub-search space.
That is, the neural network constituted by seven operations shown in fig. 3 (d) is the target neural network.
Table 1 shows statistics of the image classification models obtained by the neural network structure searching method of the present application and by other neural network structure searching methods. It will be appreciated that an image classification model may be used alone to perform image classification, or may be migrated into other models to assist in performing other tasks; for example, the image classification model may be migrated into an image detection model or an image segmentation model.
Table 1 Statistics of models searched by various neural network structure search methods
In table 1, the model (model) in the first column indicates the name of the model searched for using the corresponding search method; the type (type) in the second column refers to the way the corresponding model is obtained, for example, manual design (manual) or automatic search (auto); the search dataset (search dataset) in the third column refers to the dataset used to search for or train the corresponding model, e.g., the CIFAR-10 dataset or the ImageNet dataset; the search cost (search cost) in the fourth column refers to the number of graphics processor days (GPU days) consumed to search for the corresponding model; the parameters (params) in the fifth column refer to the number of parameters of the corresponding model, in M; the computation amount in the sixth column refers to the FLOPs of the corresponding model, in M; Top-1 in the seventh column refers to the probability that the first-ranked result among the results obtained by performing the task with the corresponding model is the true result; Top-5 in the eighth column refers to the probability that the top five results obtained by performing the task with the corresponding model contain the true result; and "-" indicates that the data item is not available.
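As an aside, for readers unfamiliar with these metrics, the following minimal sketch (assuming PyTorch and randomly generated scores; not part of the application) shows how Top-1 and Top-5 accuracy are typically computed:

```python
import torch

def topk_accuracy(logits, labels, k):
    """Fraction of samples whose true label appears among the model's top-k predictions."""
    topk = logits.topk(k, dim=1).indices              # indices of the k highest scores per sample
    hits = (topk == labels.unsqueeze(1)).any(dim=1)   # is the true label among the top-k?
    return hits.float().mean().item()

logits = torch.randn(4, 1000)            # scores for 4 images over 1000 classes (e.g., ImageNet)
labels = torch.tensor([3, 17, 999, 0])   # ground-truth class indices
print(topk_accuracy(logits, labels, 1))  # Top-1 accuracy
print(topk_accuracy(logits, labels, 5))  # Top-5 accuracy
```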
HiNAS-A, HiNAS-B and HiNAS-C are three image classification models obtained by searching with the neural network structure searching method of the present application; the difference among the three models is that they are obtained by searching based on different resource constraint conditions. The resource constraint condition corresponding to HiNAS-A is a parameter count of 4.8M and a computation amount of 3M; the resource constraint condition corresponding to HiNAS-B is a parameter count of 5.5M and a computation amount of 4M; the resource constraint condition corresponding to HiNAS-C is 5M.
CIFAR-10 is a small dataset for general object recognition, compiled by Hinton's students Alex Krizhevsky and Ilya Sutskever; ImageNet refers to the public dataset used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC); GPU is the abbreviation of graphics processing unit; GPU days refers to the number of days required when one GPU is used, for example, 0.2 GPU days means the task can be completed in 0.2 days using one GPU.
The method of the present application can complete the search within 0.2 GPU days. Compared with FBNet-B, it requires a shorter search time at the same accuracy.
In addition, with the same FLOPs and parameter count, the top-1 accuracy of the HiNAS-B model obtained by the method is 0.4% higher than that of FBNet-C. When only the FLOPs are constrained to 500M, the top-1 accuracy of the obtained HiNAS-C model is 75.6%, which is superior to existing methods.
The present application also provides a neural network structure searching method, as shown in fig. 4, which may include S410 to S450.
S410, the server receives a first message from the target device, wherein the first message is used for requesting the server to search the neural network structure.
The first message may also indicate the function of the neural network that the target device requests to be searched for, for example, an image classification model, an image detection model, or an image segmentation model.
The target device may be an intelligent terminal device, such as a smart phone, tablet computer, smart home device, vehicle, robot, drone, or the like.
S420, the server acquires a given search space of a neural network for processing an image, text, or voice.
S430, the server searches a first network layer according to the given search space, wherein the first network layer is a network layer contained in any neural network in the given candidate search space.
Here, the server searching the first network layer means searching for the operation to be included in the first network layer of the target neural network. That is, all of the neural networks in the given candidate search space contain certain identical network layers, referred to as first network layers; it can therefore be determined that the target neural network will also contain the first network layers, so the operation to serve as the first network layer of the target neural network is searched from the search space of the first network layers within the given search space.
S440, the server searches a second network layer according to the given search space and the first network layer, wherein the second network layer comprises network layers except the first network layer in network layers included by any neural network in the given search space.
Here, the server searching the second network layer means searching for the operation to be included in the second network layer of the target neural network. That is, the operation to serve as the second network layer of the target neural network is searched from the search space of the second network layer within the given search space.
S450, the server sends a target neural network to the target device, wherein the target neural network comprises the first network layer and the second network layer, and the target neural network is used for processing images, text, or speech.
After searching the first network layer and the second network layer, the server can obtain the neural network formed by the first network layer and the second network layer and send the neural network to the target device.
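A minimal sketch of the S410 to S450 flow is given below; all function names, message fields, and the trivial "pick the first candidate" stand-in for the actual layer search are illustrative assumptions, not the implementation of the present application.

```python
def search_layers(sub_spaces, fixed=None):
    # Stand-in for the search described earlier: simply pick the first candidate operation
    # of every layer (a real implementation would train structural parameters and sample).
    return {layer: ops[0] for layer, ops in sub_spaces.items()}

def handle_first_message(first_message, given_search_space, key_layer_names):
    task = first_message.get("task", "image_classification")               # S410: request received

    # S420: the given search space maps each network layer to its candidate operations
    key_spaces = {k: v for k, v in given_search_space.items() if k in key_layer_names}
    other_spaces = {k: v for k, v in given_search_space.items() if k not in key_layer_names}

    first_layers = search_layers(key_spaces)                                # S430: search first (key) layers
    second_layers = search_layers(other_spaces, fixed=first_layers)         # S440: search remaining layers

    target_network = {**first_layers, **second_layers}                      # assemble the target neural network
    return {"task": task, "network": target_network}                        # S450: returned to the target device

space = {f"layer{i}": ["conv3x3", "conv5x5", "sep_conv3x3", "identity"] for i in range(1, 8)}
print(handle_first_message({"task": "image_classification"}, space, {"layer1", "layer4", "layer7"}))
```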
The implementation of each step in the method may refer to the foregoing related content, and will not be described herein.
In the method, the first network layer can be searched more efficiently because the second network layer is not required to be searched when the first network layer is searched; when searching the second network layer, the first network layer is already determined, so that the second network layer can be searched more efficiently, and finally the searching efficiency of the neural network is improved.
The network layers that all neural networks in the search space must include are usually the key layers of the neural network, and the key layers usually play an important role when the neural network performs a task; that is, the performance of the key layers generally determines the performance of the neural network. In this implementation, the key layers (i.e., the first network layer) are therefore searched first, which removes the performance influence of the non-key layers (i.e., the second network layer), so that key layers with better performance, and hence a neural network with better performance, can be obtained by searching.
For example, when an image classification model needs to be run on a mobile phone to perform album management, the mobile phone may send a first message to the server to request a search for an image classification model. After searching for the image classification model, the server may retrain it and then send it to the mobile phone.
For another example, when an object detection model needs to be run on a vehicle to detect objects on the street, the vehicle may send a first message to the server requesting a search for an image detection model. After searching for the image detection model, the server may retrain it and then send it to the vehicle.
Fig. 5 is an exemplary structural view of the neural network structure search apparatus of the present application. The apparatus 500 includes an acquisition module 510, a training module 520, a determination module 530, and a search module 540. The apparatus 500 may implement the method shown in any of the foregoing figures 1-3.
For example, the determining module 530 is used to perform S110 and S140, the acquiring module 510 is used to perform S120, the training module 520 is used to perform S130, and the searching module 540 is used to perform S150.
Alternatively, the search module 540 may be specifically used to perform S240 to S270.
In some implementations, the apparatus 500 may be deployed in a cloud environment, which is an entity that utilizes underlying resources to provide cloud services to users in a cloud computing mode. The cloud environment includes a cloud data center including a large number of underlying resources (including computing resources, storage resources, and network resources) owned by a cloud service provider, and a cloud service platform, where the computing resources included in the cloud data center may be a large number of computing devices (e.g., servers). The apparatus 500 may be a server for neural network structure search in a cloud data center. The apparatus 500 may also be a virtual machine created in a cloud data center for neural network structure searching. The apparatus 500 may also be a software apparatus deployed on a server or virtual machine in a cloud data center for conducting neural network structure searches, which may be deployed distributed on multiple servers, or distributed on multiple virtual machines, or distributed on virtual machines and servers. For example, the training module 520, the determining module 530, and the searching module 540 in the apparatus 500 may be distributed across multiple servers, or distributed across multiple virtual machines, or distributed across virtual machines and servers.
The apparatus 500 may be abstracted by a cloud service provider into a cloud service for searching neural network structures on a cloud service platform and provided to users. After a user purchases the cloud service on the cloud service platform, the cloud environment uses the apparatus 500 to provide the neural network structure search service to the user. The user may upload resource constraint conditions to the cloud environment through an application program interface (API) or through a web page interface provided by the cloud service platform; the apparatus 500 receives the resource constraint conditions and performs the neural network structure search according to them, and the neural network structure obtained by the search is finally returned by the apparatus 500 to the edge device where the user is located.
When the apparatus 500 is a software apparatus, the apparatus 500 may also be deployed separately on one computing device in any environment.
Fig. 6 is an exemplary structural view of a neural network structure search device of the present application. The apparatus 600 includes a receiving module 610, an obtaining module 620, a searching module 630, and a transmitting module 640. The apparatus 600 may implement the method described above with respect to fig. 4.
For example, the receiving module 610 may be used to perform S410, the acquiring module 620 is used to perform S420, the searching module 630 is used to perform S430 to S440, and the transmitting module 640 is used to perform S450.
In some implementations, the apparatus 600 may be deployed in a cloud environment, which is an entity that utilizes underlying resources to provide cloud services to users in a cloud computing mode. The cloud environment includes a cloud data center including a large number of underlying resources (including computing resources, storage resources, and network resources) owned by a cloud service provider, and a cloud service platform, where the computing resources included in the cloud data center may be a large number of computing devices (e.g., servers). The apparatus 600 may be a server for neural network structure search in a cloud data center. The apparatus 600 may also be a virtual machine created in a cloud data center for neural network structure searching. The apparatus 600 may also be a software apparatus deployed on a server or virtual machine in a cloud data center for conducting neural network structure searches, which may be deployed distributed on multiple servers, or distributed on multiple virtual machines, or distributed on virtual machines and servers. For example, the searching module 630 in the apparatus 600 may be distributed across multiple servers, or distributed across multiple virtual machines, or distributed across virtual machines and servers.
The apparatus 600 may be abstracted by a cloud service provider into a cloud service for searching neural network structures on a cloud service platform and provided to users. After a user purchases the cloud service on the cloud service platform, the cloud environment uses the apparatus 600 to provide the neural network structure search service to the user. The user may upload a model type to the cloud environment through an application program interface (API) or through a web page interface provided by the cloud service platform; the apparatus 600 receives the model type (such as image classification, image detection, or image segmentation) and performs the neural network structure search according to it, and the neural network structure of that type obtained by the search is finally returned by the apparatus 600 to the edge device where the user is located.
When the apparatus 600 is a software apparatus, the apparatus 600 may also be deployed separately on one computing device in any environment.
The present application also provides an apparatus 700 as shown in fig. 7, the apparatus 700 comprising a processor 702, a communication interface 703 and a memory 704. One example of an apparatus 700 is a chip. Another example of an apparatus 700 is a computing device. Another example of an apparatus 700 is a server.
The processor 702, the memory 704, and the communication interface 703 may communicate via a bus. The memory 704 stores executable code that the processor 702 reads from the memory 704 to perform the corresponding method. The memory 704 may also include software modules required by the operating system or by other running processes. The operating system may be Linux™, Unix™, Windows™, etc.
For example, executable code in the memory 704 is used to implement the method shown in any of fig. 1-4, and the processor 702 reads the executable code in the memory 704 to perform the method shown in any of fig. 1-4.
The processor 702 may be a central processing unit (CPU). The memory 704 may include volatile memory, such as random access memory (RAM). The memory 704 may also include non-volatile memory (NVM), such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present application, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A neural network structure search method, comprising:
determining constraint conditions of resources provided by target equipment for running a neural network, wherein the neural network is used for processing images, texts or voices;
acquiring a given search space of a neural network;
training a sampling model according to the constraint condition and the given search space, so that the requirement on the resource, when running in the target equipment, of the neural network obtained by sampling the given search space by using the sampling model meets the constraint condition;
determining a candidate search space according to the sampling model and the given search space, wherein the candidate search space comprises a neural network sampled from the given search space based on the sampling model;
searching the target neural network according to the candidate search space.
2. The method of claim 1, wherein the searching for the target neural network according to the candidate search space comprises:
searching a first network layer according to the candidate search space, wherein the first network layer is a network layer contained in any neural network in the candidate search space;
searching a second network layer according to the candidate search space and the first network layer, wherein the second network layer comprises network layers except the first network layer in network layers included by any neural network in the given search space;
and determining the target neural network according to the first network layer and the second network layer.
3. The method according to claim 1 or 2, characterized in that the sampling model is based on a gumbel-softmax sampling method when sampling the given search space.
4. The method according to claim 1 or 2, wherein the resources comprise computing resources and/or storage resources.
5. A neural network structure search method, comprising:
The method comprises the steps that a server receives a first message from a target device, wherein the first message is used for requesting the server to perform neural network structure searching;
acquiring a given search space of a neural network, wherein the neural network is used for processing images, texts or voices;
the server searches a first network layer according to the given search space, wherein the first network layer is a network layer contained in any neural network in the candidate search space;
The server searches a second network layer according to the given search space and the first network layer, wherein the second network layer comprises network layers except the first network layer in network layers included by any neural network in the given search space;
the server sends a target neural network to the target device, the target neural network including the first network layer and the second network layer.
6. A neural network structure search apparatus, comprising:
The determining module is used for determining constraint conditions of resources provided by the target equipment for running the neural network, and the neural network is used for processing images, texts or voices;
An acquisition module for acquiring a given search space of the neural network;
the training module is used for training a sampling model according to the constraint condition and the given search space, so that the requirement on the resources, when running in the target equipment, of the neural network obtained by sampling the given search space by using the sampling model meets the constraint condition;
The determining module is further configured to determine a candidate search space according to the sampling model and the given search space, where the candidate search space includes a neural network sampled from the given search space based on the sampling model;
and the searching module is used for searching the target neural network according to the candidate searching space.
7. The apparatus of claim 6, wherein the search module is specifically configured to:
searching a first network layer according to the candidate search space, wherein the first network layer is a network layer contained in any neural network in the candidate search space;
searching a second network layer according to the candidate search space and the first network layer, wherein the second network layer comprises network layers except the first network layer in network layers included by any neural network in the given search space;
and determining the target neural network according to the first network layer and the second network layer.
8. The apparatus of claim 6 or 7, wherein the sampling model samples the given search space based on a gumbel-softmax sampling method.
9. The apparatus according to claim 6 or 7, wherein the resources comprise computing resources and/or storage resources.
10. A neural network structure search apparatus, comprising:
the receiving module is used for receiving a first message from the target equipment, wherein the first message is used for requesting the server to search the neural network structure;
The acquisition module is used for acquiring a given search space of a neural network, and the neural network is used for processing images, texts or voices;
the searching module is used for searching a first network layer according to the given searching space, wherein the first network layer is a network layer contained in any neural network in the candidate searching space;
The searching module is further configured to search a second network layer according to the given search space and the first network layer, where the second network layer includes network layers other than the first network layer among network layers included in any neural network in the given search space;
And the sending module is used for sending a target neural network to the target equipment, wherein the target neural network comprises the first network layer and the second network layer.
11. A neural network structure search apparatus, comprising: a processor coupled to the memory;
The memory is used for storing instructions;
the processor is configured to execute instructions stored in the memory to cause the apparatus to perform the method of any one of claims 1 to 5.
12. A server comprising a processor and a memory, wherein:
The memory stores computer instructions;
the processor executing the computer instructions to implement the method of any one of claims 1 to 5.
13. A computer readable medium comprising instructions which, when run on a processor, cause the processor to perform the method of any one of claims 1 to 5.
CN202010109054.7A 2020-02-21 2020-02-21 Neural network structure searching method and device Active CN111382868B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010109054.7A CN111382868B (en) 2020-02-21 2020-02-21 Neural network structure searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010109054.7A CN111382868B (en) 2020-02-21 2020-02-21 Neural network structure searching method and device

Publications (2)

Publication Number Publication Date
CN111382868A CN111382868A (en) 2020-07-07
CN111382868B true CN111382868B (en) 2024-06-18

Family

ID=71217808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010109054.7A Active CN111382868B (en) 2020-02-21 2020-02-21 Neural network structure searching method and device

Country Status (1)

Country Link
CN (1) CN111382868B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112585583B (en) * 2020-07-17 2021-12-03 华为技术有限公司 Data processing method and device and intelligent vehicle
CN112001491A (en) * 2020-07-27 2020-11-27 三星(中国)半导体有限公司 Search method and device for determining neural network architecture for processor
CN112019510B (en) * 2020-07-28 2021-07-06 北京大学 Self-adaptive search method and system of deep neural network architecture
CN111754532B (en) * 2020-08-12 2023-07-11 腾讯科技(深圳)有限公司 Image segmentation model searching method, device, computer equipment and storage medium
CN112100468A (en) * 2020-09-25 2020-12-18 北京百度网讯科技有限公司 Search space generation method and device, electronic equipment and storage medium
CN112100466A (en) * 2020-09-25 2020-12-18 北京百度网讯科技有限公司 Method, device and equipment for generating search space and storage medium
CN113407806B (en) * 2020-10-12 2024-04-19 腾讯科技(深圳)有限公司 Network structure searching method, device, equipment and computer readable storage medium
CN114595375A (en) * 2020-12-03 2022-06-07 北京搜狗科技发展有限公司 Searching method and device and electronic equipment
CN113743606A (en) * 2021-09-08 2021-12-03 广州文远知行科技有限公司 Neural network searching method and device, computer equipment and storage medium
CN114548384A (en) * 2022-04-28 2022-05-27 之江实验室 Method and device for constructing impulse neural network model with abstract resource constraint
CN117454959A (en) * 2022-07-14 2024-01-26 北京字跳网络技术有限公司 Neural network model structure determining method, device, equipment, medium and product
CN117688984A (en) * 2022-08-25 2024-03-12 华为云计算技术有限公司 Neural network structure searching method, device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919304A (en) * 2019-03-04 2019-06-21 腾讯科技(深圳)有限公司 Neural network searching method, device, readable storage medium storing program for executing and computer equipment
CN110188878A (en) * 2019-05-31 2019-08-30 北京市商汤科技开发有限公司 Neural network searching method and device
CN110503192A (en) * 2018-05-18 2019-11-26 百度(美国)有限责任公司 The effective neural framework of resource

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101478551B (en) * 2009-01-19 2011-12-28 清华大学 Multi-domain network packet classification method based on multi-core processor
FR2976377A1 (en) * 2011-06-10 2012-12-14 Eads Europ Aeronautic Defence Method for evaluating reliability of transmission between terminals of multiplexed command control network in e.g. aeronautical field, involves determining probability of transmission failure based on combination of probabilities of states
CN107491518B (en) * 2017-08-15 2020-08-04 北京百度网讯科技有限公司 Search recall method and device, server and storage medium
CN110175671B (en) * 2019-04-28 2022-12-27 华为技术有限公司 Neural network construction method, image processing method and device
CN110543944B (en) * 2019-09-11 2022-08-02 北京百度网讯科技有限公司 Neural network structure searching method, apparatus, electronic device, and medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503192A (en) * 2018-05-18 2019-11-26 百度(美国)有限责任公司 The effective neural framework of resource
CN109919304A (en) * 2019-03-04 2019-06-21 腾讯科技(深圳)有限公司 Neural network searching method, device, readable storage medium storing program for executing and computer equipment
CN110188878A (en) * 2019-05-31 2019-08-30 北京市商汤科技开发有限公司 Neural network searching method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jianlong Chang et al. "Differentiable Architecture Search with Ensemble Gumbel-Softmax". arXiv. 2019, Sections 3-4. *

Also Published As

Publication number Publication date
CN111382868A (en) 2020-07-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant