CN113869501B

CN113869501B - Neural network generation method and device, electronic equipment and storage medium

Info

Publication number: CN113869501B
Application number: CN202111215815.8A
Authority: CN
Inventors: 薛超; 李乾; 李明明; 陶大程
Original assignee: Jingdong Technology Information Technology Co Ltd
Current assignee: Jingdong Technology Information Technology Co Ltd
Priority date: 2021-10-19
Filing date: 2021-10-19
Publication date: 2024-06-18
Anticipated expiration: 2041-10-19
Also published as: CN113869501A

Abstract

The embodiments of the invention disclose a neural network generation method and device, an electronic device, and a storage medium. In the method, multi-path nodes are selected from a pre-trained super network. For each multi-path node, a first game is constructed by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value, thereby constructing a topological-structure game of the super network. A first Nash equilibrium strategy combination of the first game, comprising a retention probability and a discard probability corresponding to each input edge, is determined; at least one input edge whose retention probability or discard probability meets a preset condition is selected from the first Nash equilibrium strategy combination; and a neural network is generated based on the input edges selected for each multi-path node, which improves the accuracy of the determined neural network.

Description

Neural network generation method and device, electronic equipment and storage medium

Technical Field

The embodiment of the invention relates to the technical field of artificial intelligence, in particular to a neural network generation method, a device, electronic equipment and a storage medium.

Background

Over the past few years, much effort has been devoted to developing neural architecture search (NAS) algorithms that find optimal neural network structures for specific tasks.

The method commonly used at present is to first relax an initial neural network structure into a super network structure, then calculate the network weight value corresponding to each input edge of each node in the super network structure by using a differentiable method, and finally take the single-path network structure consisting of each node and the edge with the highest network weight value of each node as the optimal neural network structure.

In the process of realizing the invention, the prior art is found to have at least the following technical problems:

The accuracy of the optimal neural network structure determined from the network weight values of the edges still needs to be improved.

Disclosure of Invention

The embodiment of the invention provides a method, a device, electronic equipment and a storage medium for generating a neural network, so as to improve the accuracy of the determined neural network.

In a first aspect, an embodiment of the present invention provides a method for generating a neural network, where the method includes:

acquiring a super network pre-trained using training samples, and selecting multi-path nodes in the super network, wherein a multi-path node is a node having a plurality of input edges;

for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value; determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge whose retention probability or discard probability in the first Nash equilibrium strategy combination meets a preset condition, wherein the first Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each input edge; and

generating a neural network based on the input edges selected for each multi-path node.

In a second aspect, an embodiment of the present invention further provides a device for generating a neural network, where the device includes:

a node selection module, configured to acquire a super network pre-trained using training samples and to select multi-path nodes in the super network, wherein a multi-path node is a node having a plurality of input edges;

an input edge selection module, configured to, for each multi-path node, construct a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value; and to determine a first Nash equilibrium strategy combination of the first game and select at least one input edge whose retention probability or discard probability in the first Nash equilibrium strategy combination meets a preset condition, wherein the first Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each input edge; and

a neural network generation module, configured to generate a neural network based on the input edges selected for each multi-path node.

In a third aspect, an embodiment of the present invention further provides an electronic device, including:

one or more processors; and

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for generating a neural network provided by any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements a method for generating a neural network as provided in any embodiment of the present invention.

The embodiments of the above invention have the following advantages or benefits:

Multi-path nodes are selected in the pre-trained super network. For each multi-path node, a first game is constructed by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value, which realizes the construction of a topological-structure game of the super network. A first Nash equilibrium strategy combination of the first game, comprising the retention probability and discard probability corresponding to each input edge, is then determined, which yields the Nash equilibrium of the topological-structure game. At least one input edge whose retention probability or discard probability meets a preset condition is selected from the first Nash equilibrium strategy combination, so that input edges with a higher probability of remaining connected are selected. A neural network is then generated based on the input edges selected for each multi-path node, thereby improving the accuracy of the determined neural network.

Drawings

In order to more clearly illustrate the technical solution of the exemplary embodiments of the present invention, a brief description is given below of the drawings required for describing the embodiments. It is obvious that the drawings presented are only drawings of some of the embodiments of the invention to be described, and not all the drawings, and that other drawings can be made according to these drawings without inventive effort for a person skilled in the art.

Fig. 1A is a flowchart of a method for generating a neural network according to a first embodiment of the present invention;

Fig. 1B is a schematic diagram of cells in a super network according to the first embodiment of the present invention;

Fig. 1C is a schematic diagram of the process of a first game according to the first embodiment of the present invention;

Fig. 2 is a flowchart of a method for generating a neural network according to a second embodiment of the present invention;

Fig. 3A is a flowchart of a method for generating a neural network according to a third embodiment of the present invention;

Fig. 3B is a schematic diagram of a second game according to the third embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a neural network generating device according to a fourth embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.

Detailed Description

The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.

Before explaining the embodiments provided by the application, an application scenario of the neural network generating method provided by the application is first described in an exemplary way. The method for generating the neural network provided by the application can be applied to generating neural networks such as an image classification network, an image segmentation network, an image feature extraction network, an image compression network, an image enhancement network, an image noise reduction network, an image label generation network, a text classification network, a text translation network, a text abstract extraction network, a text prediction network, a keyword conversion network, a text semantic analysis network, a voice recognition network, an audio noise reduction network, an audio synthesis network, an audio equalizer conversion network, a weather prediction network, a commodity recommendation network, an article recommendation network, an action recognition network, a face recognition network, a facial expression recognition network and the like. The above application scenario is merely illustrative, and the application scenario of the neural network generating method is not limited in the present application.

Example 1

Fig. 1A is a flowchart of a method for generating a neural network according to a first embodiment of the present invention, where the method may be executed by a generating device of the neural network, and the device may be implemented by hardware and/or software, and the method specifically includes the following steps:

S110, acquiring a super network pre-trained using training samples, and selecting multi-path nodes in the super network, wherein a multi-path node is a node having a plurality of input edges.

Wherein the training samples may be training data sets for training the super network. For example, the training samples may be image data, and the predicted result is an image processing result; or the training sample is text data, and the prediction result is a text processing result; or the training sample is audio data, and the prediction result is an audio processing result.

For example, if the training samples are image data, the super network may be an image classification super network, and the prediction result output by the super network may be an image classification result. Likewise, the super network may be an image segmentation, image feature extraction, image compression, image enhancement, image noise reduction, or image label generation super network, with the prediction result being, respectively, an image segmentation result, an image feature extraction result, an image compression result, an image enhancement result, an image noise reduction result, or an image label.

If the training samples are text data, the super network may be a text classification, text prediction, text abstract extraction, text translation, keyword conversion, or text semantic analysis super network, with the prediction result being the corresponding text classification, text prediction, text abstract extraction, text translation, keyword conversion, or text semantic analysis result.

If the training samples are audio data, the super network may be a speech recognition, audio noise reduction, audio synthesis, or audio equalizer conversion super network, with the prediction result being the corresponding speech recognition, audio noise reduction, audio synthesis, or audio equalizer conversion result.

In this embodiment, the super network may be a network obtained by relaxing an initial network. A complete super network may be formed by connecting a plurality of cells in series, for example 20 cells; Fig. 1B shows a schematic diagram of a cell in the super network. The super network comprises nodes and the input edges of the nodes. The input edges of a node may be parallel edges that reach the node, and each input edge includes at least one operator. Operators include, but are not limited to, max pooling with a 3×3 kernel, average pooling with a 3×3 kernel, skip connections, separable convolutions with 3×3 and 5×5 kernels, and dilated (atrous) convolutions with 3×3 and 5×5 kernels. Data passing through an operator on an input edge undergoes the corresponding operation, for example a 3×3 dilated convolution. A node in the super network represents the feature representation obtained by fusing the outputs of the operators on its input edges.

Specifically, in this embodiment, after a super network pre-trained using training samples is obtained, multi-path nodes having multiple input edges are selected from the super network. One or more multi-path nodes may be selected from the super network in this embodiment.
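The selection in S110 can be sketched as follows. This is a minimal illustration, assuming the super network is represented as a plain mapping from a node name to the list of its input-edge names; that representation and the names `supernet` and `select_multipath_nodes` are hypothetical and are not part of the embodiment.

```python
from typing import Dict, List


def select_multipath_nodes(supernet: Dict[str, List[str]]) -> List[str]:
    """Return the nodes that have more than one input edge."""
    return [node for node, edges in supernet.items() if len(edges) > 1]


# Hypothetical toy cell: node "a" has three input edges, "b" has two, "c" has one.
supernet = {"a": ["a1", "a2", "a3"], "b": ["b1", "b2"], "c": ["c1"]}
multipath_nodes = select_multipath_nodes(supernet)
```

Here only nodes "a" and "b" would take part in a first game; node "c" has a single input edge and is left as it is.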

S120, for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value.

In this embodiment, the optimal neural network may be obtained by discretizing the pre-trained super network. Drawing on game theory, the mathematical study of strategic interaction between rational decision makers, the task of extracting an appropriate neural network structure from the pre-trained super network can be formulated as a game between competitors (the input edges of a multi-path node) whose strategies are retention and discarding.

Specifically, for each multi-path node, each input edge of the multi-path node is taken as a competitor, the retention and discarding of each input edge are taken as strategies, and the accuracy of the prediction result output by the super network is taken as a utility manifold value, so that a first game is constructed for each multi-path node. In this embodiment, optionally, the training sample is an image, and the prediction result is an image classification result, an image segmentation result, a target detection result, or a target tracking result; or the training sample is text, and the prediction result is a text classification result or a natural language processing result.

The utility manifold value may include the accuracy of the prediction result output by the super network after one or more input edges of the multi-path node are deleted from the super network. Specifically, this embodiment can calculate the utility manifold values of each multi-path node of the super network and construct the first game of each multi-path node from them, so that input edges can be selected in the first game of each multi-path node.

S130, determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge with retention probability or discarding probability meeting a preset condition in the first Nash equilibrium strategy combination.

The first Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each input edge. Specifically, the first Nash equilibrium strategy combination of a multi-path node may be calculated based on the utility manifold values of that multi-path node.

In an alternative embodiment, determining the first Nash equilibrium strategy combination of the first game includes: acquiring the utility manifold values of the current multi-path node, and determining the first Nash equilibrium strategy combination of the first game based on those utility manifold values and a Nash equilibrium solving algorithm. The utility manifold values of the current multi-path node comprise the accuracy of the prediction result output by the super network after any i input edges of the current multi-path node are deleted from the super network, where i is an integer from 1 to M and M is the number of input edges of the current multi-path node. In other words, the utility manifold values of a multi-path node are the accuracies of the prediction results output by the super network after one or more input edges of that node are deleted. Illustratively, if a multi-path node in the super network has three input edges a1, a2 and a3, its utility manifold values comprise the accuracy of the prediction result output by the super network after deleting, respectively: a1; a2; a3; a1 and a2; a1 and a3; a2 and a3; and a1, a2 and a3.
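The enumeration of deleted-edge subsets described above can be sketched as follows. This is a minimal illustration under stated assumptions: `evaluate_accuracy` is a hypothetical callback standing in for running the super network on evaluation data with the given edges removed, and the accuracy numbers in `toy_accuracy` are made up for the example.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence


def utility_values(
    edges: Sequence[str],
    evaluate_accuracy: Callable[[FrozenSet[str]], float],
) -> Dict[FrozenSet[str], float]:
    """Map every non-empty set of deleted edges to the resulting accuracy."""
    table: Dict[FrozenSet[str], float] = {}
    for i in range(1, len(edges) + 1):  # i = 1 .. M deleted edges
        for deleted in combinations(edges, i):
            key = frozenset(deleted)
            table[key] = evaluate_accuracy(key)
    return table


def toy_accuracy(deleted: FrozenSet[str]) -> float:
    """Hypothetical stand-in for evaluating the super network (made-up numbers)."""
    penalty = {"a1": 0.01, "a2": 0.20, "a3": 0.25}
    return round(0.90 - sum(penalty[e] for e in deleted), 2)


table = utility_values(["a1", "a2", "a3"], toy_accuracy)
```

For the three edges a1, a2 and a3 the table has seven entries, matching the seven cases listed above. In general a node with M input edges yields 2^M − 1 entries, so the number of super network evaluations grows exponentially with M.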

In this alternative embodiment, the Nash equilibrium solving algorithm may be a Monte Carlo algorithm. For a specific method of determining the first Nash equilibrium strategy combination of the first game based on the utility manifold values and the Nash equilibrium solving algorithm, reference may be made to the paper "Fast Complete Algorithm for Multiplayer Nash Equilibrium" published by Sam Ganzfried in July 2020, with the MIQCP software used in that paper replaced by the Nash equilibrium solving algorithm, for example the Monte Carlo algorithm.

In this optional implementation, the accuracy of the prediction result output by the super network after any input edges of a multi-path node are deleted from the super network is used as the utility manifold value of that multi-path node, and the first Nash equilibrium strategy combination of the first game is determined based on these utility manifold values and a Nash equilibrium solving algorithm. This allows the retention probability and discard probability corresponding to each input edge of the multi-path node to be determined accurately, which in turn improves the accuracy of the selected input edges.
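To make the idea of a keep/discard equilibrium concrete, the sketch below approximates mixed strategies with fictitious play. This is a swapped-in, simple iterative heuristic chosen for illustration; it is neither the Monte Carlo solver nor the algorithm of the cited paper. It assumes a hypothetical `utility` table that maps every keep/discard profile (1 = keep, 0 = discard, one entry per input edge) to an accuracy shared by all competitors, including the all-kept profile; the numbers are made up.

```python
from itertools import product
from typing import Dict, List, Tuple


def fictitious_play(
    utility: Dict[Tuple[int, ...], float], n_players: int, n_rounds: int = 200
) -> List[float]:
    """Approximate each player's retention probability by fictitious play."""
    # counts[p] = [times p discarded, times p kept], starting from a uniform prior
    counts = [[1, 1] for _ in range(n_players)]
    for _ in range(n_rounds):
        freqs = [[c / sum(cs) for c in cs] for cs in counts]
        actions = []
        for p in range(n_players):
            others = [q for q in range(n_players) if q != p]
            expected = [0.0, 0.0]  # expected utility of discarding / keeping
            for own in (0, 1):
                for combo in product((0, 1), repeat=len(others)):
                    profile = [0] * n_players
                    profile[p] = own
                    weight = 1.0
                    for q, a in zip(others, combo):
                        profile[q] = a
                        weight *= freqs[q][a]
                    expected[own] += weight * utility[tuple(profile)]
            # best response to the empirical play of the other players
            actions.append(1 if expected[1] > expected[0] else 0)
        for p, a in enumerate(actions):
            counts[p][a] += 1
    return [cs[1] / sum(cs) for cs in counts]


# Made-up utilities: keeping a1 lowers accuracy, keeping a2 or a3 raises it.
utility = {
    k: 0.5 - 0.1 * k[0] + 0.2 * k[1] + 0.2 * k[2]
    for k in product((0, 1), repeat=3)
}
keep_probs = fictitious_play(utility, n_players=3)
```

With these toy utilities the retention frequency of the first edge drifts toward 0 and those of the other two toward 1; the corresponding discard probability of each edge is one minus its retention probability.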

In this embodiment, after the first Nash equilibrium strategy combination of the first game is determined, at least one input edge may be selected according to a preset condition and the retention probabilities and discard probabilities of the input edges of the multi-path node in the first Nash equilibrium strategy combination. The preset condition may be a preset probability screening condition. Illustratively, selecting at least one input edge whose retention probability or discard probability in the first Nash equilibrium strategy combination meets the preset condition may be: selecting N input edges whose retention probability is higher than a preset retention threshold, or selecting N input edges whose discard probability is lower than a preset discard threshold. The value of N can be adjusted according to actual requirements.

S140, generating a neural network based on the input edges selected for each multi-path node.

Specifically, after at least one input edge whose retention probability or discard probability meets the preset condition is selected from the first Nash equilibrium strategy combination, a neural network may be generated based on each multi-path node and the input edges selected for it. Alternatively, the input edges other than the selected ones may be deleted from the super network, and the neural network may be generated from the super network after the deletion.
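The deletion-based variant of S140 can be sketched as follows, again assuming the hypothetical node-to-input-edges mapping; `prune_supernet` and `selected` are illustrative names only.

```python
from typing import Dict, List


def prune_supernet(
    supernet: Dict[str, List[str]],
    selected: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Keep only the selected input edges of each multi-path node."""
    pruned: Dict[str, List[str]] = {}
    for node, edges in supernet.items():
        if node in selected:
            keep = set(selected[node])
            pruned[node] = [e for e in edges if e in keep]
        else:
            pruned[node] = list(edges)  # nodes without a game are left unchanged
    return pruned


supernet = {"a": ["a1", "a2", "a3"], "c": ["c1"]}
network = prune_supernet(supernet, {"a": ["a2", "a3"]})
```

The function returns a new mapping and leaves the original super network untouched, so the same pre-trained super network can be reused for other selections.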

The process of selecting input edges for a multi-path node in this embodiment is described by way of example with reference to the schematic diagram of the first game in Fig. 1C. Item 101 in Fig. 1C shows a hybrid-weighted super network comprising nodes a, b, c and d, where the bottom multi-path node a has three input edges a1, a2 and a3. The three input edges are regarded as the three competitors of a game. As shown at 102 in Fig. 1C, retention and discarding are the strategies of the three competitors (K denotes keep, D denotes discard). The Nash equilibrium strategy combination of multi-path node a is obtained by a Nash equilibrium solving algorithm; for example, 103 in Fig. 1C shows a Nash equilibrium strategy combination in which a1 has a retention probability of 0.1 and a discard probability of 0.9, a2 has a retention probability of 0.75 and a discard probability of 0.25, and a3 has a retention probability of 0.8 and a discard probability of 0.2. The two input edges with the highest retention probability are then selected from the Nash equilibrium strategy combination by an Argmax(2) function (the higher the keep probability, the more likely the input edge remains connected); as shown at 104 in Fig. 1C, a2 and a3 are selected from a1, a2 and a3. When the game ends, as shown at 105 in Fig. 1C, the remaining input edge (a1) is deleted, and the bottom multi-path node a retains two parallel input edges.
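The top-two selection in the Fig. 1C example can be reproduced with a few lines. This is a minimal sketch; `top_n_edges` is a hypothetical helper name, and the probabilities are the ones quoted in the example.

```python
from typing import Dict, List


def top_n_edges(keep_prob: Dict[str, float], n: int) -> List[str]:
    """Return the n edges with the highest retention probability."""
    return sorted(keep_prob, key=lambda e: keep_prob[e], reverse=True)[:n]


keep_prob = {"a1": 0.1, "a2": 0.75, "a3": 0.8}  # values from the Fig. 1C example
selected = top_n_edges(keep_prob, 2)
```

In this example each edge's retention and discard probabilities sum to 1, so ranking by highest retention probability and ranking by lowest discard probability select the same edges.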

According to the technical solution of this embodiment, multi-path nodes are selected in the pre-trained super network. For each multi-path node, a first game is constructed by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value, which realizes the construction of a topological-structure game of the super network. A first Nash equilibrium strategy combination of the first game, comprising the retention probability and discard probability corresponding to each input edge, is then determined, which yields the Nash equilibrium of the topological-structure game. At least one input edge whose retention probability or discard probability meets the preset condition is selected from the first Nash equilibrium strategy combination, so that input edges with a higher likelihood of remaining connected are selected. A neural network is generated based on the input edges selected for each multi-path node, which improves the accuracy of the determined neural network.

Example 2

Fig. 2 is a flowchart of a method for generating a neural network according to a second embodiment of the present invention. In this embodiment, optionally, selecting multi-path nodes in the super network includes: selecting multi-path nodes whose number of input edges is greater than N; and selecting at least one input edge whose retention probability or discard probability in the first Nash equilibrium strategy combination meets a preset condition includes: selecting N input edges whose retention probability or discard probability in the first Nash equilibrium strategy combination meets the preset condition, where N is an integer not less than 1. Terms that are the same as or correspond to those in the above embodiment are not explained again here. Referring to Fig. 2, the method for generating a neural network provided in this embodiment includes the following steps:

S210, acquiring a super network pre-trained using training samples, and selecting multi-path nodes whose number of input edges is greater than N in the super network, wherein a multi-path node is a node having a plurality of input edges.

Here N is an integer not less than 1. N may be set according to the accuracy required of the neural network in practice; for example, N may be 3, 4, 5, and so on. Specifically, in this embodiment, multi-path nodes with more than N input edges are selected from the super network, and the first game is then constructed for each of these multi-path nodes.

S220, for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value.

S230, determining a first Nash equilibrium strategy combination of the first game, and selecting N input edges with retention probability or discarding probability meeting preset conditions in the first Nash equilibrium strategy combination.

The first Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each input edge. Specifically, after multi-path nodes with more than N input edges are selected from the super network, a first game is constructed for each of these multi-path nodes, and N input edges whose retention probability or discard probability meets the preset condition are selected from the first Nash equilibrium strategy combination of the first game.

Optionally, selecting N input edges whose retention probability or discard probability in the first Nash equilibrium strategy combination meets the preset condition includes: selecting the N input edges with the highest retention probability in the first Nash equilibrium strategy combination; or selecting the N input edges with the lowest discard probability in the first Nash equilibrium strategy combination.

Specifically, the retention probabilities of the input edges in the first Nash equilibrium strategy combination may be sorted to obtain the N input edges with the highest retention probability; or the discard probabilities of the input edges in the first Nash equilibrium strategy combination may be sorted to obtain the N input edges with the lowest discard probability. In this optional implementation, selecting the N input edges with the highest retention probability, or the N input edges with the lowest discard probability, enables accurate selection of the input edges of each multi-path node, thereby improving the accuracy of the optimal single-path network.

Of course, selecting N input edges whose retention probability or discard probability in the first Nash equilibrium strategy combination meets the preset condition may also be: selecting N input edges whose retention probability in the first Nash equilibrium strategy combination is higher than a preset retention threshold; or selecting N input edges whose discard probability in the first Nash equilibrium strategy combination is lower than a preset discard threshold.
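The threshold-based variant can be sketched as follows. The embodiment does not say which edges to take when more than N pass the threshold, or what happens when fewer than N pass; this sketch assumes that the highest-probability edges are preferred and that fewer than N edges may be returned. The function name and the numbers are hypothetical.

```python
from typing import Dict, List


def edges_above_threshold(
    keep_prob: Dict[str, float], threshold: float, n: int
) -> List[str]:
    """Return up to n edges whose retention probability exceeds the threshold."""
    candidates = [e for e, p in keep_prob.items() if p > threshold]
    # Assumption: when more than n edges qualify, prefer the most likely ones.
    candidates.sort(key=lambda e: keep_prob[e], reverse=True)
    return candidates[:n]


keep_prob = {"a1": 0.1, "a2": 0.75, "a3": 0.8, "a4": 0.6}  # made-up values
selected = edges_above_threshold(keep_prob, threshold=0.5, n=2)
```

Filtering on a discard probability below a discard threshold would follow the same pattern with the comparison reversed.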

In this embodiment, N input edges whose retention probability or discard probability meets the preset condition can be selected from the first Nash equilibrium strategy combination, so that N input edges are selected for each multi-path node that has more than N input edges, which ensures that each such multi-path node has N input edges in the generated neural network.

S240, generating a neural network based on the input edges selected for the multipath nodes.

According to the technical solution of this embodiment, multi-path nodes with more than N input edges are selected in the super network, and N input edges whose retention probability or discard probability meets the preset condition are selected from the first Nash equilibrium strategy combination of each such multi-path node. N input edges are thus selected for every multi-path node that has more than N input edges, which ensures that these multi-path nodes have N input edges in the neural network, prevents the multi-path nodes in the neural network from having too few input edges, and further improves the accuracy of the generated neural network.

Example 3

Fig. 3A is a flowchart of a method for generating a neural network according to a third embodiment of the present invention. In this embodiment, optionally, the method further includes: for each edge in the super network, constructing a second game by taking each candidate operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operator as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value; and determining a second Nash equilibrium strategy combination of the second game, and selecting at least one operator whose retention probability or discard probability in the second Nash equilibrium strategy combination meets a preset condition, wherein the second Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each candidate operator. Generating a neural network based on the input edges selected for each multi-path node then includes: generating the neural network based on the input edges selected for each multi-path node and the operators selected for each edge. Referring to Fig. 3A, the method for generating a neural network provided in this embodiment includes the following steps:

S310, acquiring a super network pre-trained using training samples, and selecting multi-path nodes in the super network.

A multi-path node is a node having a plurality of input edges.

S320, for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking reservation and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value.

S330, determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge with retention probability or discarding probability meeting a preset condition in the first Nash equilibrium strategy combination; the first Nash equilibrium strategy combination comprises a retention probability and a discarding probability which correspond to each input edge respectively.

S340, for each edge in the super network, constructing a second game by taking each candidate operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operator as strategies, and taking the accuracy of the prediction result output by the super network as the utility manifold value.

The candidate operators on each edge may be, for example, 3×3 dilated (hole) convolution, skip connection, max pooling, 5×5 separable convolution, and so on. In this embodiment, in order to construct a more accurate neural network, an optimal operator may be selected for each edge. Drawing on game theory, which studies mathematical models of strategic interaction between rational decision makers, the task of extracting a suitable neural network structure from the pre-trained super network can therefore be expressed as a game between competitors (the candidate operators contained in an edge), with retention and discarding as the strategies.

Specifically, for each edge in the super network, each candidate operator contained in the edge is taken as a competitor, the retention and discarding of the candidate operators are taken as strategies, and the accuracy of the prediction result output by the super network is taken as the utility manifold value, so that a second game is constructed for each edge.

The utility manifold value used to construct the second game may include the accuracy of the prediction result output by the super network after one or more operators of the current edge are deleted from the super network. Specifically, the utility manifold value of each edge of the super network can be calculated and the second game of each edge constructed, so that an operator is selected from the second Nash equilibrium strategy combination of each edge's second game.

S350, determining a second Nash equilibrium strategy combination of the second game, and selecting at least one operator with retention probability or discarding probability meeting preset conditions in the second Nash equilibrium strategy combination.

The second Nash equilibrium strategy combination comprises a retention probability and a discarding probability corresponding to each candidate operator. Specifically, the second Nash equilibrium strategy combination of each edge may be calculated based on the utility manifold value of that edge.

Optionally, determining the second Nash equilibrium strategy combination of the second game includes: acquiring the utility manifold value of the current edge, wherein the utility manifold value of the current edge comprises the accuracy of the prediction result output by the super network after any j candidate operators contained in the current edge are deleted from the super network, j being an integer taking values from 1 to P and P being the number of candidate operators contained in the current edge; and determining the second Nash equilibrium strategy combination of the second game based on the utility manifold value of the current edge and a Nash equilibrium solving algorithm.

The utility manifold value of the current edge may specifically be the accuracy of the prediction result output by the super network after one or more candidate operators of the current edge are deleted from the super network. Illustratively, if one edge of the super network includes 3 candidate operators, b1, b2 and b3, the utility manifold value of the edge includes the accuracy of the prediction result output by the super network after each of the following deletions: b1; b2; b3; b1 and b2; b1 and b3; b2 and b3; and b1, b2 and b3. Optionally, the Nash equilibrium solving algorithm may be a Monte Carlo algorithm.
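
The enumeration in the example above can be sketched as follows. This is a minimal illustration rather than the patented implementation: `evaluate_accuracy` is a hypothetical callback standing in for "run the super network with these operators removed from the current edge and measure accuracy", and the toy accuracy values are invented for the example.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence


def build_utility_table(
    operators: Sequence[str],
    evaluate_accuracy: Callable[[FrozenSet[str]], float],
) -> Dict[FrozenSet[str], float]:
    """Map every non-empty set of deleted operators to the resulting accuracy.

    For P candidate operators this covers j = 1..P deletions,
    i.e. 2**P - 1 subsets in total.
    """
    table: Dict[FrozenSet[str], float] = {}
    for j in range(1, len(operators) + 1):
        for deleted in combinations(operators, j):
            key = frozenset(deleted)
            table[key] = evaluate_accuracy(key)
    return table


def toy_accuracy(deleted: FrozenSet[str]) -> float:
    # Hypothetical stand-in for evaluating the super network:
    # pretend each deleted operator costs a fixed amount of accuracy.
    penalty = {"b1": 0.30, "b2": 0.05, "b3": 0.10}
    return round(0.95 - sum(penalty[op] for op in deleted), 2)


utility = build_utility_table(["b1", "b2", "b3"], toy_accuracy)
print(len(utility))  # 7 deletion sets, matching the b1/b2/b3 example
```

Keying the table by `frozenset` makes the deletion set order-independent, so "b1 and b2" and "b2 and b1" refer to the same entry.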

In this optional implementation, the accuracy of the prediction result output by the super network after candidate operators of an edge are deleted from the super network is used as the utility manifold value of that edge, and the second Nash equilibrium strategy combination of the second game is then determined based on the utility manifold value of each edge and the Nash equilibrium solving algorithm. In this way the retention probability and discarding probability corresponding to each candidate operator of each edge are determined accurately, which further improves the accuracy of the selected operators.
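
The text leaves the Nash equilibrium solving algorithm open (a Monte Carlo algorithm is named as one option) and describes the result as retention and discarding probabilities. As a deliberately simpler stand-in, the sketch below only checks the equilibrium condition for pure keep/drop profiles by brute force: a profile is an equilibrium if no single competitor can raise the payoff by flipping its own choice. It assumes that all competitors share the same payoff (the accuracy) and that an accuracy is available for every profile, including the one where nothing is deleted. The function name and the toy accuracies are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Profile = Tuple[bool, ...]  # True = keep, False = drop, one flag per competitor


def pure_nash_profiles(
    players: Sequence[str],
    accuracy: Callable[[Profile], float],
) -> List[Dict[str, bool]]:
    """List keep/drop profiles from which no single competitor gains by flipping.

    Assumes a common payoff: every competitor receives the same accuracy.
    """
    equilibria = []
    for profile in product((True, False), repeat=len(players)):
        base = accuracy(profile)
        # Stable if every unilateral flip leaves the accuracy no higher.
        stable = all(
            accuracy(profile[:i] + (not profile[i],) + profile[i + 1:]) <= base
            for i in range(len(players))
        )
        if stable:
            equilibria.append(dict(zip(players, profile)))
    return equilibria


# Hypothetical accuracies for an edge with two candidate operators (b1, b2).
toy = {
    (True, True): 0.90,    # nothing deleted
    (True, False): 0.92,   # b2 deleted
    (False, True): 0.80,   # b1 deleted
    (False, False): 0.10,  # both deleted
}
equilibria = pure_nash_profiles(["b1", "b2"], toy.__getitem__)
print(equilibria)  # [{'b1': True, 'b2': False}]
```

Brute force visits all 2**P profiles, so this is only practical for small P. A solver that returns mixed-strategy probabilities, as described in this embodiment, would replace this function.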

Optionally, selecting at least one operator whose retention probability or discarding probability in the second Nash equilibrium strategy combination satisfies the preset condition includes: selecting at least one operator with the highest retention probability in the second Nash equilibrium strategy combination; or selecting at least one operator with the lowest discarding probability in the second Nash equilibrium strategy combination. Alternatively, it includes: selecting at least one operator whose retention probability in the second Nash equilibrium strategy combination is higher than a preset retention threshold; or selecting at least one operator whose discarding probability in the second Nash equilibrium strategy combination is lower than a preset discard threshold.

In this embodiment, the execution order of S340-S350 relative to S310-S330 is not limited. Specifically, S340-S350 may be performed after S310-S330, i.e., operators may be selected for the input edges that have been selected for each multi-path node and for the input edges of the other, non-multi-path nodes. Alternatively, S340-S350 may be performed concurrently with or prior to S310-S330, i.e., an operator may be selected for each edge in the original super network.

S360, generating a neural network based on the input edges selected for the multipath nodes and the operation operators selected for the edges.

Specifically, after at least one operator whose retention probability or discarding probability satisfies the preset condition is selected from the second Nash equilibrium strategy combination, a neural network may be generated according to the operators selected for each edge and the input edges selected for each multi-path node. Alternatively, the input edges other than the selected input edges and the candidate operators other than the selected operators may be deleted from the super network, and a neural network generated based on the super network after the deletion.

Illustratively, the generating the neural network based on the input edges selected for each multi-path node and the operator selected for each edge includes: for each multi-path node, deleting other input edges of the current multi-path node except the input edge selected for the current multi-path node in the super network; for each edge, deleting other operators contained in the current edge except the operator selected for the current edge in the super network; a neural network is generated based on the current super network.

In this exemplary embodiment, after an input edge is selected for each multi-path node and an operator is selected for each edge, the input edges other than the selected input edges are deleted from each multi-path node, the operators other than the selected operators are deleted from each edge, and a neural network is generated from the super network after deletion.
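
The deletion step can be sketched on a plain dictionary representation of the super network. The representation (`node_inputs`, `edge_ops`), the function name and the operator names are assumptions made for illustration; the text does not prescribe a data structure.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source node, target node)


def prune_supernet(
    node_inputs: Dict[str, List[Edge]],
    edge_ops: Dict[Edge, List[str]],
    kept_inputs: Dict[str, List[Edge]],
    kept_ops: Dict[Edge, List[str]],
) -> Tuple[Dict[str, List[Edge]], Dict[Edge, List[str]]]:
    """Return a pruned copy with unselected input edges and operators removed."""
    pruned_inputs: Dict[str, List[Edge]] = {}
    for node, edges in node_inputs.items():
        # Nodes without a selection keep all of their input edges.
        selected = kept_inputs.get(node, edges)
        pruned_inputs[node] = [e for e in edges if e in selected]

    surviving = {e for edges in pruned_inputs.values() for e in edges}
    pruned_ops: Dict[Edge, List[str]] = {}
    for edge, ops in edge_ops.items():
        if edge not in surviving:
            continue  # the whole edge was deleted, so its operators go too
        selected_ops = kept_ops.get(edge, ops)
        pruned_ops[edge] = [op for op in ops if op in selected_ops]
    return pruned_inputs, pruned_ops


# Hypothetical super network fragment: n3 has three input edges, n2 has two.
node_inputs = {
    "n2": [("n0", "n2"), ("n1", "n2")],
    "n3": [("n0", "n3"), ("n1", "n3"), ("n2", "n3")],
}
ops = ["dil_conv_3x3", "skip", "max_pool", "sep_conv_5x5"]
edge_ops = {e: list(ops) for edges in node_inputs.values() for e in edges}

kept_inputs = {"n3": [("n1", "n3"), ("n2", "n3")]}  # result of the first game
kept_ops = {e: ["skip"] for e in edge_ops}           # result of the second game
kept_ops[("n1", "n3")] = ["sep_conv_5x5"]

pruned_inputs, pruned_ops = prune_supernet(node_inputs, edge_ops, kept_inputs, kept_ops)
print(pruned_inputs["n3"])         # [('n1', 'n3'), ('n2', 'n3')]
print(pruned_ops[("n1", "n3")])    # ['sep_conv_5x5']
```

The function returns new dictionaries instead of mutating its inputs, so the original super network stays available for further evaluation.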

By way of example, the process of selecting an operator for each edge in this embodiment is described with reference to the schematic process diagram of the second game shown in Fig. 3B. As shown at 301 in Fig. 3B, each line may represent one candidate operator: L1, L2, L3, L4 and L5. As shown at 302 in Fig. 3B, the five candidate operators may be treated as a game with five competitors, with the retention and discarding of each candidate operator as the strategies (K denotes retention, D denotes discarding). The Nash equilibrium strategy combination of the edge is obtained through a Nash equilibrium solving algorithm; for example, the combination shown at 303 gives a retention probability of 0.99 and a discarding probability of 0.01 for L1, 0.10 and 0.90 for L2, 0.15 and 0.85 for L3, 0.90 and 0.10 for L4, and 0.19 and 0.81 for L5. The candidate operator with the greatest retention probability in the Nash equilibrium strategy combination is selected, such as operator L1 at 304 in Fig. 3B. When the game ends, as shown at 305 in Fig. 3B, the candidate operators other than the selected operator L1 are deleted, leaving one operator to form the final architecture.
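
Using the probabilities of the Fig. 3B example, the selection rules can be sketched as below. The function names and the 0.5 threshold are assumptions chosen for illustration; only the probability values come from the example.

```python
from typing import Dict, List

# Retention and discarding probabilities from the Fig. 3B example.
retention = {"L1": 0.99, "L2": 0.10, "L3": 0.15, "L4": 0.90, "L5": 0.19}
discard = {"L1": 0.01, "L2": 0.90, "L3": 0.85, "L4": 0.10, "L5": 0.81}


def select_top_k(probabilities: Dict[str, float], k: int = 1) -> List[str]:
    """Return the k operators with the highest retention probability."""
    ranked = sorted(probabilities, key=lambda op: probabilities[op], reverse=True)
    return ranked[:k]


def select_above_threshold(probabilities: Dict[str, float], threshold: float) -> List[str]:
    """Return every operator whose retention probability exceeds the threshold."""
    return [op for op, p in probabilities.items() if p > threshold]


best = select_top_k(retention)
print(best)                                    # ['L1']
print(min(discard, key=lambda op: discard[op]))  # L1, the lowest discarding probability
print(select_above_threshold(retention, 0.5))  # ['L1', 'L4'] with a hypothetical 0.5 threshold
```

Because each operator's two probabilities sum to 1 in this example, picking the highest retention probability and picking the lowest discarding probability give the same operator.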

According to the technical solution of this embodiment, for each edge in the super network, a second game is constructed by taking each candidate operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operator as strategies, and taking the accuracy of the prediction result output by the super network as the utility manifold value, thereby constructing an operator game of the super network. A second Nash equilibrium strategy combination of the second game, comprising the retention probability and discarding probability corresponding to each candidate operator, is then determined to obtain the Nash equilibrium of the operator game, and at least one operator whose retention probability or discarding probability satisfies the preset condition is selected from it, which realizes selection of an optimal operator. A neural network is then generated based on the input edges selected through the first game and the operators selected through the second game, further improving the accuracy of the resulting neural network.

Example IV

Fig. 4 is a schematic structural diagram of a neural network generating device according to a fourth embodiment of the present invention. This embodiment is applicable to generating a neural network from a super network pre-trained using training samples, by constructing a first game over the topology and determining its Nash equilibrium strategy combination. The device specifically includes: a node selection module 410, an input edge selection module 420, and a neural network generation module 430.

The node selection module 410 is configured to obtain a super network pre-trained using training samples, and to select multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;

the input edge selection module 420 is configured to construct, for each multi-path node, a first game by taking each input edge of the current multi-path node as a competitor, taking retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as the utility manifold value; to determine a first Nash equilibrium strategy combination of the first game; and to select at least one input edge whose retention probability or discarding probability in the first Nash equilibrium strategy combination satisfies a preset condition; wherein the first Nash equilibrium strategy combination comprises a retention probability and a discarding probability corresponding to each input edge;

the neural network generation module 430 is configured to generate a neural network based on the input edges selected for each multi-path node.

Optionally, the training sample is image data, and the prediction result is an image processing result; or the training sample is text data, and the prediction result is a text processing result; or the training sample is audio data, and the prediction result is an audio processing result.

Optionally, the node selection module 410 includes a first selection unit, configured to select multi-path nodes whose number of input edges is greater than N in the super network; the input edge selection module 420 includes a second selection unit, configured to select N input edges whose retention probability or discarding probability in the first Nash equilibrium strategy combination satisfies a preset condition; wherein N is an integer not less than 1.

Optionally, the second selecting unit is specifically configured to:

selecting N input edges with highest retention probability in the first Nash equilibrium strategy combination; or selecting N input edges with lowest discarding probability in the first Nash equilibrium strategy combination.
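
The two selection units can be sketched together: one filters nodes by input-edge count, the other keeps the N edges with the highest retention probability. The names, the edge labels and the probabilities below are hypothetical; in the device the retention probabilities would come from the first Nash equilibrium strategy combination.

```python
from typing import Dict, List


def select_multipath_nodes(node_inputs: Dict[str, List[str]], n: int) -> List[str]:
    """First selection unit: nodes with more than n input edges."""
    return [node for node, edges in node_inputs.items() if len(edges) > n]


def select_n_input_edges(retention: Dict[str, float], n: int) -> List[str]:
    """Second selection unit: the n input edges with the highest retention probability."""
    ranked = sorted(retention, key=lambda edge: retention[edge], reverse=True)
    return ranked[:n]


# Hypothetical fragment: node -> labels of its input edges.
node_inputs = {
    "n1": ["e01"],
    "n2": ["e02", "e12"],
    "n3": ["e03", "e13", "e23"],
}
N = 2
nodes = select_multipath_nodes(node_inputs, N)
print(nodes)  # ['n3'] is the only node with more than 2 input edges

# Hypothetical retention probabilities for the input edges of n3.
retention_n3 = {"e03": 0.20, "e13": 0.80, "e23": 0.70}
kept = select_n_input_edges(retention_n3, N)
print(kept)  # ['e13', 'e23']
```

Filtering on "more than N" before selecting N edges means every processed node loses at least one input edge and ends with exactly N.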

Optionally, the input edge selection module 420 includes a first policy combination determining unit, configured to obtain a utility manifold value of a current multi-path node; wherein the utility manifold value of the current multipath node comprises: after deleting any i input edges of the current multipath node from the super network, the accuracy of a prediction result output by the super network; wherein i is an integer which takes a value from 1 to M, M is the number of input edges of the current multipath node; and determining a first Nash equilibrium policy combination of the first game based on the utility manifold values of the current multipath nodes and a Nash equilibrium solving algorithm.
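
As a side calculation (an observation added here, not a statement from the patent), deleting any i of the M input edges for i = 1, ..., M means one accuracy evaluation per non-empty subset of input edges, so the number of evaluations per multi-path node is:

```latex
\sum_{i=1}^{M} \binom{M}{i} = 2^{M} - 1
```

For a node with M = 4 input edges this gives 15 evaluations, and the count roughly doubles with each additional input edge.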

Optionally, the neural network generating device further includes an operator selection module, configured to construct, for each edge in the super network, a second game by taking each candidate operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operator as strategies, and taking the accuracy of the prediction result output by the super network as the utility manifold value; to determine a second Nash equilibrium strategy combination of the second game; and to select at least one operator whose retention probability or discarding probability in the second Nash equilibrium strategy combination satisfies a preset condition; wherein the second Nash equilibrium strategy combination comprises a retention probability and a discarding probability corresponding to each candidate operator; the neural network generation module 430 includes a first generation unit configured to generate a neural network based on the input edges selected for each multi-path node and the operator selected for each edge.

Optionally, the operator selecting module includes a second policy combination determining unit, configured to obtain a utility manifold value of the current edge; wherein the utility manifold value of the current edge comprises: after deleting any j candidate operators contained in the current edge from the super network, the accuracy of a prediction result output by the super network; wherein j is an integer which takes a value from 1 to P, and P is the number of candidate operators contained in the current edge; and determining a second Nash equilibrium policy combination of the second game based on the utility manifold value of the current side and a Nash equilibrium solving algorithm.

Optionally, the operator selecting module includes an operator selecting unit, configured to select at least one operator with the highest retention probability in the second Nash equilibrium strategy combination; or to select at least one operator with the lowest discarding probability in the second Nash equilibrium strategy combination.

Optionally, the first generating unit is specifically configured to:

for each multi-path node, deleting other input edges of the current multi-path node except the input edge selected for the current multi-path node in the super network;

for each edge, deleting other operation operators contained in the current edge except the operation operator selected for the current edge in the super network;

A neural network is generated based on the current super network.

In this embodiment, the node selection module selects multi-path nodes in the pre-trained super network. For each multi-path node, the input edge selection module constructs a first game by taking each input edge of the current multi-path node as a competitor, taking retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as the utility manifold value, thereby constructing a topology game of the super network. The module then determines the first Nash equilibrium strategy combination of the first game, which comprises the retention probability and discarding probability corresponding to each input edge, to obtain the Nash equilibrium of the topology game, and selects from it at least one input edge whose retention probability or discarding probability satisfies the preset condition, so that input edges with a higher likelihood of connection are selected. The neural network generation module generates a neural network based on the input edges selected for each multi-path node, which improves the accuracy of the resulting neural network.

The neural network generation device provided by the embodiment of the invention can execute the neural network generation method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

It should be noted that the units and modules included in the above device are divided only according to functional logic; the division is not limited to the above, as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for distinguishing them from each other and are not used to limit the protection scope of the embodiments of the present invention.

Example V

Fig. 5 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present invention. The electronic device 12 shown in Fig. 5 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention. Device 12 is typically an electronic device that performs the neural network generation function.

As shown in fig. 5, the electronic device 12 is in the form of a general purpose computing device. Components of the electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 connecting the different components, including the memory 28 and the processing unit 16.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

Electronic device 12 typically includes a variety of computer-readable media. Such media can be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 28 may include computer device readable media in the form of volatile memory, such as random access memory (Random Access Memory, RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, storage device 34 may be used to read from or write to non-removable, non-volatile magnetic media (not shown in Fig. 5, commonly referred to as a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable nonvolatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from and writing to a removable nonvolatile optical disk (e.g., a Compact Disc-Read Only Memory (CD-ROM), a Digital Video Disc-Read Only Memory (DVD-ROM), or other optical media), may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product 40 having a set of program modules 42 configured to perform the functions of embodiments of the present invention. Program product 40 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.

The electronic device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, mouse, camera, or display), with one or more devices that enable a user to interact with the electronic device 12, and/or with any device (e.g., network card, modem, etc.) that enables the electronic device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Also, electronic device 12 may communicate with one or more networks, such as a local area network (Local Area Network, LAN), a wide area network (Wide Area Network, WAN), and/or a public network such as the Internet, via network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 over the bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, disk array (Redundant Arrays of Independent Disks, RAID) devices, tape drives, data backup storage, and the like.

The processor 16 executes various functional applications and data processing by running programs stored in the memory 28, for example, implementing the neural network generation method provided by the above embodiments of the present invention.

Of course, it will be understood by those skilled in the art that the processor may also implement the technical solution of the method for generating a neural network provided in any embodiment of the present invention.

Example VI

A sixth embodiment of the present invention further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for generating a neural network provided in any embodiment of the present invention.

The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for embodiments of the present invention may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims

1. A method for generating a neural network, comprising:

For each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking reservation and discarding of each input edge as a strategy and taking the accuracy of a predicted result output by a super network as a utility manifold value; determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge with retention probability or discarding probability meeting a preset condition in the first Nash equilibrium strategy combination; the first Nash equilibrium policy combination comprises a retention probability and a discarding probability which correspond to the input edges respectively;

generating a neural network based on the input edges selected for each of the multi-path nodes;

the training samples are image data, and the prediction results are image processing results;

or

the training samples are text data, and the prediction results are text processing results;

or

the training samples are audio data, and the prediction results are audio processing results;

the utility manifold value includes the accuracy of the predicted result output by the super network after deleting one or more input edges of the current multi-path node in the super network.

2. The method of claim 1, wherein the selecting multi-path nodes in the super network comprises:

selecting multi-path nodes whose number of input edges is greater than N in the super network;

the selecting at least one input edge whose retention probability or discarding probability in the first Nash equilibrium strategy combination satisfies the preset condition comprises:

selecting N input edges whose retention probability or discarding probability in the first Nash equilibrium strategy combination satisfies the preset condition; wherein N is an integer not less than 1.

3. The method of claim 2, wherein selecting N input edges whose retention probability or discarding probability in the first Nash equilibrium strategy combination satisfies the preset condition comprises:

selecting N input edges with the highest retention probability in the first Nash equilibrium strategy combination; or

selecting N input edges with the lowest discarding probability in the first Nash equilibrium strategy combination.

4. The method of claim 1, wherein the determining a first Nash equilibrium strategy combination of the first game comprises:

acquiring the utility manifold value of the current multi-path node; wherein the utility manifold value of the current multi-path node comprises: the accuracy of the prediction result output by the super network after any i input edges of the current multi-path node are deleted from the super network; wherein i is an integer taking values from 1 to M, and M is the number of input edges of the current multi-path node;

and determining the first Nash equilibrium strategy combination of the first game based on the utility manifold value of the current multi-path node and a Nash equilibrium solving algorithm.

5. The method according to any one of claims 1-4, further comprising:

for each edge in the super network, constructing a second game by taking each candidate operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operator as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value; determining a second Nash equilibrium strategy combination of the second game, and selecting at least one operator whose retention probability or discard probability in the second Nash equilibrium strategy combination satisfies a preset condition; wherein the second Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each candidate operator;

wherein the generating a neural network based on the input edges selected for each of the plurality of nodes comprises:

generating a neural network based on the input edges selected for each of the plurality of nodes and the operator selected for each of the edges;

wherein the utility manifold value used to construct the second game comprises the accuracy of the prediction result output by the super network after one or more operators of the current edge are deleted from the super network.
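
By way of a hedged example, the combination of the two selections in claim 5 might be sketched as below. The function `build_architecture`, the node and edge labels, the operator names, and all probabilities are hypothetical; the sketch assumes the retention probabilities from both games are already available, keeps the top-ranked input edges per node, and keeps a single operator per retained edge.

```python
from typing import Dict


def build_architecture(
    edge_retention: Dict[str, Dict[str, float]],
    op_retention: Dict[str, Dict[str, float]],
    n_edges: int,
) -> Dict[str, Dict[str, str]]:
    """For each node keep its n_edges best input edges, each with its best operator."""
    architecture: Dict[str, Dict[str, str]] = {}
    for node, probs in edge_retention.items():
        # First game: rank the node's input edges by retention probability.
        kept_edges = sorted(probs, key=lambda e: probs[e], reverse=True)[:n_edges]
        # Second game: on each kept edge, keep the operator with the highest retention.
        architecture[node] = {
            edge: max(op_retention[edge], key=lambda op: op_retention[edge][op])
            for edge in kept_edges
        }
    return architecture


# Hypothetical equilibrium outputs for one multipath node and its two input edges.
edge_retention = {"n2": {"n0->n2": 0.8, "n1->n2": 0.3}}
op_retention = {
    "n0->n2": {"conv3x3": 0.7, "skip": 0.2},
    "n1->n2": {"conv3x3": 0.1, "skip": 0.6},
}

architecture = build_architecture(edge_retention, op_retention, n_edges=1)
```

Here node `n2` keeps only the edge `n0->n2`, carrying the operator `conv3x3`.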

6. The method of claim 5, wherein the determining a second Nash equilibrium strategy combination of the second game comprises:

acquiring utility manifold values of the current edge; wherein the utility manifold values of the current edge comprise: the accuracy of the prediction result output by the super network after any j candidate operators contained in the current edge are deleted from the super network; wherein j is an integer from 1 to P, and P is the number of candidate operators contained in the current edge; and

determining the second Nash equilibrium strategy combination of the second game based on the utility manifold values of the current edge and a Nash equilibrium solving algorithm.

7. The method of claim 5, wherein the selecting at least one operator whose retention probability or discard probability in the second Nash equilibrium strategy combination satisfies a preset condition comprises:

selecting at least one operator with the highest retention probability in the second Nash equilibrium strategy combination; or

selecting at least one operator with the lowest discard probability in the second Nash equilibrium strategy combination.

8. The method of claim 5, wherein generating the neural network based on the input edges selected for each of the plurality of nodes and the operator selected for each of the edges comprises:

generating a neural network based on the current super network.

9. A neural network generation apparatus, comprising:

an input edge selection module, configured to, for each multipath node, construct a first game by taking each input edge of the current multipath node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as a utility manifold value; determine a first Nash equilibrium strategy combination of the first game, and select at least one input edge whose retention probability or discard probability in the first Nash equilibrium strategy combination satisfies a preset condition; wherein the first Nash equilibrium strategy combination comprises a retention probability and a discard probability corresponding to each input edge; and

a neural network generation module, configured to generate a neural network based on the input edges selected for each multipath node;

wherein the training sample is image data and the prediction result is an image processing result; or the training sample is text data and the prediction result is a text processing result; or the training sample is audio data and the prediction result is an audio processing result.

10. An electronic device, comprising:

one or more processors; and

a storage means for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the neural network generation method of any one of claims 1-8.

11. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the neural network generation method of any one of claims 1-8.