CN109615073B - Neural network model construction method, device and storage medium

Info

Publication number
CN109615073B
Authority
CN
China
Prior art keywords: array, neural network, network model, classification result, cell
Legal status: Active
Application number
CN201811463775.7A
Other languages
Chinese (zh)
Other versions
CN109615073A (en)
Inventor
刘红丽
李峰
刘宏刚
Current Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811463775.7A
Publication of CN109615073A
Application granted
Publication of CN109615073B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Abstract

The invention discloses a method for constructing a neural network model for image classification, which comprises the following steps: S1, constructing a cell structure search network, an architecture search network, an image training set and a random coding array; S2, generating a neural network model by using the cell structure search network, the architecture search network and the random coding array; S3, inputting the image training set into the neural network model to obtain an actual classification result; S4, judging whether the actual classification result meets a preset condition, and if not, performing step S5; S5, updating the cell structure search network and the architecture search network according to the actual classification result and the theoretical classification of the image training set; S6, repeating S2-S5 until it is judged at S4 that the actual classification result meets the preset condition. The disclosed method converts the original search space into two spaces, cell structure search and architecture search, and searches for the optimal architecture by automatic learning, which enhances the flexibility of the generated model architecture.

Description

Neural network model construction method, device and storage medium
Technical Field
The present invention relates to the field of image classification, and more particularly, to a method and an apparatus for constructing a neural network model, and a readable storage medium.
Background
A neural network model is a model structure that can be stacked arbitrarily. Its basic components include the FC (fully connected) layer, the convolution layer, the pooling layer, the activation function and so on; the output of one component serves as the input of the next, and different component connection patterns and hyper-parameter configurations perform differently in different application scenarios. Neural Architecture Search (NAS) aims to search for an optimal neural network model from a collection of neural network components. Common search methods include random search, Bayesian optimization, evolutionary algorithms, reinforcement learning, gradient-based algorithms, and the like.
In 2016, Zoph et al. proposed using an RNN to search for an optimal network structure, but the search space was so large that the search took 22,400 GPU-days. In 2017 the approach was changed to use reinforcement learning to search for the best-performing convolution unit (conv cell) on CNNs and then build a larger network from these conv cells, but the algorithm still needs about 2,000 GPU-days to obtain the current best architectures on CIFAR-10 and ImageNet. Many acceleration methods have since been proposed, such as weight sharing among multiple architectures and differentiable architecture search based on gradient descent over a continuous search space. However, these algorithms still rely on a manually designed overall network architecture, so the flexibility of the architecture remains limited.
Therefore, the current neural architecture search algorithm has the following problems:
(1) because there are too many possible combinations, the search space is huge and the evaluation cost is enormous;
(2) the model architecture is designed manually and lacks flexibility.
Disclosure of Invention
In view of the above, in order to overcome at least one aspect of the above problems, an embodiment of the present invention provides a method for constructing a neural network model for implementing classification of an image, including the following steps:
S1, constructing a cell structure search network, an architecture search network, an image training set and a random coding array;
S2, generating the neural network model by using the cell structure search network, the architecture search network and the random coding array;
S3, inputting the image training set into the neural network model to obtain an actual classification result;
S4, judging whether the actual classification result meets a preset condition according to the theoretical classification of the image training set, and if not, performing step S5;
S5, updating the cell structure search network and the architecture search network according to the actual classification result and the theoretical classification;
S6, repeating steps S2-S5 until it is judged at S4 that the actual classification result meets the preset condition.
In some embodiments, the neural network model that obtains the actual classification result that satisfies the preset condition is the optimal neural network model.
In some embodiments, the step S2 further includes:
S21, searching the random coding array by using the cell structure search network and the architecture search network to obtain a cell structure coding array and an architecture coding array; and
S22, decoding the cell structure coding array and the architecture coding array by using a decoder to obtain the neural network model.
In some embodiments, the cell structure coding array includes a reduction cell array and a normal cell array.
In some embodiments, the reduction cell array and the normal cell array each include a plurality of data blocks, wherein each data block includes constraint condition information, deep learning operation information, and concatenation operation information.
In some embodiments, the architecture coding array is used to select the deep learning operation information and the concatenation operation information of the cell structure coding array.
In some embodiments, the step S4 further includes:
S41, calculating an error value of the actual classification result according to the theoretical classification of the image training set;
S42, judging whether the error value is smaller than a threshold value, and if the error value is not smaller than the threshold value, executing the step S5.
In some embodiments, the step S5 further includes:
S51, calculating a loss function value by using the actual classification result and the theoretical classification of the image training set;
S52, updating the cell structure search network and the architecture search network with the loss function value.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer apparatus, including:
at least one processor; and
a memory storing a computer program operable on the processor to perform the steps of any of the methods of constructing a neural network model described above when the program is executed by the processor.
Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of any one of the methods for constructing a neural network model as described above.
The invention has the following beneficial technical effects: the embodiments provided by the invention convert the original search space into two spaces, cell structure search and architecture search, and search for the optimal architecture by automatic learning, thereby enhancing the flexibility of the generated model architecture, reducing the computational complexity and achieving an efficient architecture search.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained by using the drawings without creative efforts.
FIG. 1 is a schematic diagram of a cell structure according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a neural network model provided in an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a method for constructing a neural network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a cell structure search network according to an embodiment of the present invention;
FIG. 5 is a flow chart of the decoding of the cell structure coding array and the decoding of the architecture coding array according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a computer device provided in an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two entities or parameters that have the same name but are not identical; "first" and "second" are merely for convenience of description and should not be construed as limiting the embodiments of the present invention, and this is not repeated in the following embodiments.
According to one aspect of the invention, a method for constructing a neural network model for image classification is provided. The underlying idea is as follows: a random array is generated first; the random array is then fed into an encoder formed by a cell structure search network and an architecture search network to obtain coding arrays (a cell structure coding array and an architecture coding array); the coding arrays are then parsed into a concrete neural network model according to decoding rules. When the actual neural network model is trained on the data of the image training set, a corresponding loss function (loss) value is generated, and finally the encoder is updated according to this loss value, thereby updating the actual neural network model.
In the invention, a plurality of cell structures is obtained through the cell structure coding array, where each cell structure (cell) is a basic building block of the final architecture; the cell structures are then connected in series according to the architecture to form a convolutional network, thereby obtaining the neural network model.
As shown in FIG. 1(a), a cell is a directed acyclic graph composed of N ordered nodes. Each node x^(i) is a feature map in the convolutional network, and each directed edge (i, j) corresponds to some operation o^(i,j) applied to x^(i). Suppose each cell has an input node h_(i-1) and an output node h_(i+1). For a convolution cell, the input node is defined as the output of the previous cell, and node h_i is the feature map obtained by applying a convolution operation to the input node. The output of the cell is obtained by applying a concatenation operation in the channel dimension to all unused intermediate nodes.
Each intermediate node is computed based on all nodes before it:
x^(j) = Σ_{i<j} o^(i,j)(x^(i)) when γ = 1, and x^(j) = concat_{i<j} o^(i,j)(x^(i)) when γ = 0,
where γ takes values in {0, 1}: when γ is 1 the intermediate results are summed, and when γ is 0 they are concatenated. A special zero operation is also included to indicate that there is no connection between two nodes. The task of learning the cell is thus reduced to learning the operations on its edges.
Let O denote the set of candidate operations (e.g., convolution, max pooling, zero, etc.), where each operation is denoted o(·). To fully represent the search space, the present invention parameterizes all possible choices of operation on an edge as a mixture:
ō^(i,j)(x) = Σ_{o∈O} α_o^(i,j) · o(x)
where the operation mixing weights for a pair of nodes (i, j) are parameterized by a vector α^(i,j) of dimension |O|. With the above formula, the task of searching the cell structure is converted into learning the discrete variables α = {α^(i,j)}. Finally, the cell structure parameter corresponding to the most likely discrete operation is obtained; α is the coding (encoding) of the cell structure and takes values in {0, 1}. As shown in FIG. 1(b), the dashed lines correspond to the zero operation.
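To make the two formulas above concrete, the following is a minimal Python sketch, given for illustration only and not taken from the patent, of how an edge operation selected by a one-hot α vector can be applied and how an intermediate node can be formed from its predecessors, with γ = 1 giving summation and γ = 0 giving channel-wise concatenation. The helper names and the toy operation set are assumptions introduced here.

    import numpy as np

    def apply_edge(alpha_ij, candidate_ops, x):
        """Apply the single operation selected by the one-hot vector alpha^(i,j)."""
        op_index = int(np.argmax(alpha_ij))          # position of the 1 in alpha
        return candidate_ops[op_index](x)            # o^(i,j)(x^(i))

    def intermediate_node(predecessors, alphas, candidate_ops, gamma):
        """Combine the outputs of all predecessor edges into one intermediate node."""
        edge_outputs = [apply_edge(a, candidate_ops, x)
                        for a, x in zip(alphas, predecessors)]
        if gamma == 1:
            return sum(edge_outputs)                 # summation
        return np.concatenate(edge_outputs, axis=0)  # concatenation along channels

    # Toy usage: two predecessor feature maps of shape (channels, height, width)
    ops = [lambda x: x,                              # identity
           lambda x: np.maximum(x, 0.0),             # stand-in for some operation
           lambda x: np.zeros_like(x)]               # zero (no connection)
    h_prev, h_cur = np.random.rand(8, 4, 4), np.random.rand(8, 4, 4)
    alphas = [np.array([1, 0, 0]), np.array([0, 1, 0])]
    print(intermediate_node([h_prev, h_cur], alphas, ops, gamma=1).shape)  # (8, 4, 4)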
The operations in O herein include 14 in total:
identity; 1x3 then 3x1 convolution; 1x7 then 7x1 convolution; 3x3 dilated convolution; 3x3 average pooling; 3x3 max pooling; 5x5 max pooling; 7x7 max pooling; 1x1 convolution; 3x3 convolution; 3x3 depthwise-separable convolution; 5x5 depthwise-separable convolution; 7x7 depthwise-separable convolution; zero (no connection).
Table 1: list of selectable operations in the search space of the present invention
For the i-th intermediate node, a total of (i + 1) × 14 parameters would be required. Taking i = 5 (the 6 candidate inputs comprise 4 intermediate nodes and 2 input nodes), 6 × 14 parameters are needed, which is excessive. Since each intermediate node is fixed to have exactly 2 inputs, in order to reduce the amount of computation the present invention splits the O space into two spaces, O1 and O2, where O1 indicates the selection of the inputs of an intermediate node and O2 indicates the selection of an operation.
The encoding module represents the model structure parametrically, i.e., different encodings correspond to different model structures, so the process of searching for the optimal model structure can be reduced to the process of searching for the optimal encoding.
In the present invention, the operations in O2 include: identity, 1x3 then 3x1 convolution, 1x7 then 7x1 convolution, 3x3 dilated convolution, 3x3 average pooling, 3x3 max pooling, 5x5 max pooling, 7x7 max pooling, 1x1 convolution, 3x3 convolution, 3x3 depthwise-separable convolution, 5x5 depthwise-separable convolution, and 7x7 depthwise-separable convolution. All operations use stride 1, and their convolved feature maps are padded to preserve the spatial resolution. Convolution operations are applied in the ReLU-Conv-BN order, and each separable convolution is always applied twice.
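As an illustration of how such an operation set could be realized, the following is a minimal PyTorch sketch, not the patent's code: it builds the 13 O2 operations for an assumed channel count C, keeps stride 1 and resolution-preserving padding, wraps convolutions as ReLU-Conv-BN, and applies each separable convolution twice. All helper names are introduced here for illustration.

    import torch.nn as nn

    def relu_conv_bn(C, kernel_size, padding, dilation=1, groups=1):
        """Convolution applied in the ReLU-Conv-BN order, stride 1, resolution preserved."""
        return nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(C, C, kernel_size, stride=1, padding=padding,
                      dilation=dilation, groups=groups, bias=False),
            nn.BatchNorm2d(C),
        )

    def sep_conv(C, k):
        """Depthwise-separable convolution (depthwise then 1x1 pointwise), applied twice."""
        block = lambda: nn.Sequential(relu_conv_bn(C, k, padding=k // 2, groups=C),
                                      relu_conv_bn(C, 1, padding=0))
        return nn.Sequential(block(), block())

    def candidate_ops(C):
        """The 13 O2 operations for feature maps with C channels."""
        return {
            "identity":     nn.Identity(),
            "conv_1x3_3x1": nn.Sequential(relu_conv_bn(C, (1, 3), padding=(0, 1)),
                                          relu_conv_bn(C, (3, 1), padding=(1, 0))),
            "conv_1x7_7x1": nn.Sequential(relu_conv_bn(C, (1, 7), padding=(0, 3)),
                                          relu_conv_bn(C, (7, 1), padding=(3, 0))),
            "dil_conv_3x3": relu_conv_bn(C, 3, padding=2, dilation=2),
            "avg_pool_3x3": nn.AvgPool2d(3, stride=1, padding=1),
            "max_pool_3x3": nn.MaxPool2d(3, stride=1, padding=1),
            "max_pool_5x5": nn.MaxPool2d(5, stride=1, padding=2),
            "max_pool_7x7": nn.MaxPool2d(7, stride=1, padding=3),
            "conv_1x1":     relu_conv_bn(C, 1, padding=0),
            "conv_3x3":     relu_conv_bn(C, 3, padding=1),
            "sep_conv_3x3": sep_conv(C, 3),
            "sep_conv_5x5": sep_conv(C, 5),
            "sep_conv_7x7": sep_conv(C, 7),
        }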
In the present invention, the convolution cell contains N = 5 nodes (in addition to h_(i-1), h_i and h_(i+1)), where the output node is defined as the depth-wise concatenation of all unused intermediate nodes. The architecture is formed by stacking (connecting in series) a plurality of cell structures. The input node of cell k is equal to the output of cell k-1, and a 1x1 convolution must be inserted between them. In order to reduce the feature-map dimensions of the model, the architecture also needs to insert reduction cells (reduce cells), in which all operations connected to the input nodes use stride 2. The structure is thus encoded as (α_normal, α_reduce): all normal cells share α_normal and all reduce cells share α_reduce. That is, the cell structure coding array includes a reduction cell array and a normal cell array, each of which includes a plurality of data blocks, where each data block includes constraint condition information, deep learning operation information and concatenation operation information. The architecture coding array is used to select the deep learning operations and concatenation operations of the cell structure coding array, i.e., normal cells and reduce cells are connected in series to construct the model. FIG. 2 shows a neural network model in which M, L and N denote the numbers of repeated normal cells.
In some embodiments, according to an aspect of the present invention, a method for constructing a neural network model for image classification is provided, as shown in FIG. 3, which may include the following steps:
S1, constructing a cell structure search network, an architecture search network, an image training set and a random coding array.
In some embodiments, the input of the cell structure search network is the output of the previous cell structure search, and its final output represents the coding matrix of the cell structure. The present invention assumes that one cell structure (as shown in FIG. 4(a), where the dashed lines represent skip connections) is composed of five blocks with the same structure (as shown in FIG. 4(b)). The cell coding matrix is interpreted as follows for each block: the first 6 columns correspond to the selection of hidden layer A, and the next 13 columns correspond to the selection of one of the 13 deep learning operations in the O2 space; since each block contains two operations, this gives a 2 x 19 matrix; finally, a 1 x 2 concatenation-operation selection is added. The 2 x 19 matrix is flattened into a 1 x 38 row and followed by the 1 x 2 selection, so a 1 x 40 coding matrix is obtained.
The invention assumes 5 blocks per cell in total, which gives a 5 x 40 matrix. There are two types of cells in total (normal and reduce), so the cell search finally outputs a 10 x 40 matrix.
The constraint conditions in the cell structure search are as follows (a sketch of a block encoding that respects them is given after this list):
(1) the first 6 columns correspond to the selection range of the inputs: the input of the first intermediate node can only be chosen from h_(i-1), h_i; the input of the second can only be chosen from h_(i-1), h_i, h_0; and so on, until the selectable input range of the fifth intermediate node is h_(i-1), h_i, h_0, h_1, h_2, h_3; the blocks that are not selected are concatenated and output at the end; 0 means not used, 1 means used, and only one 1 may appear among the 6 columns.
(2) the selection range of the 13 deep learning operations: identity, 1x3 then 3x1 convolution, 1x7 then 7x1 convolution, 3x3 dilated convolution, 3x3 average pooling, 3x3 max pooling, 5x5 max pooling, 7x7 max pooling, 1x1 convolution, 3x3 convolution, 3x3 depthwise-separable convolution, 5x5 depthwise-separable convolution, 7x7 depthwise-separable convolution; 0 means not used, 1 means used, and only one 1 may appear among the 13 columns.
(3) the selection range of the combination operation applied after the two operations in a block: element-wise addition or concatenation along the channel dimension; 0 means not used, 1 means used, and only one 1 may appear among the 2 columns.
(4) compared with the normal cell, the reduce cell reduces the spatial dimensions and increases the number of channels; the cells are therefore divided into these 2 classes for implementation.
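The sketch below is a minimal Python illustration under the assumptions above, not the patent's code: it generates one random 1 x 40 block encoding that obeys constraints (1)-(3), placing a single 1 among the allowed input columns and a single 1 among the 13 operation columns for each of the two operations, and finally a single 1 among the 2 combination columns. The function names are introduced here for illustration.

    import numpy as np

    def random_block_encoding(block_index, rng=np.random.default_rng()):
        """block_index: 0..4, the position of the block inside the cell."""
        row = np.zeros(40, dtype=int)
        # Constraint (1): h_(i-1) and h_i are always selectable; earlier blocks
        # h_0 .. h_(block_index-1) become selectable as the block index grows.
        allowed_inputs = 2 + block_index
        for op in range(2):                                # two operations per block
            base = op * 19
            row[base + rng.integers(allowed_inputs)] = 1   # input selection (6 columns)
            row[base + 6 + rng.integers(13)] = 1           # operation selection (13 columns)
        row[38 + rng.integers(2)] = 1                      # constraint (3): add vs. concatenate
        return row

    cell = np.stack([random_block_encoding(b) for b in range(5)])
    print(cell.shape)   # (5, 40); the normal and reduce cells together give 10 x 40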
The input of the architecture search network is the output of the previous architecture search, and its output is the encoding result of the architecture (a matrix of size number-of-architecture-layers x 3, as shown in Table 2).
Normal cell   Reduce cell   None
1             0             0
0             0             1
0             1             0
Table 2: architecture coding (assuming 10 architecture layers, a 10 x 3 matrix is formed)
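For illustration, the following minimal Python sketch (an assumption, not the patent's code) builds a random architecture encoding of this form, one one-hot row per layer selecting normal cell, reduce cell or none, and reads the cell type back from each row.

    import numpy as np

    CELL_TYPES = ("normal cell", "reduce cell", "none")

    def random_architecture_encoding(layer_num, rng=np.random.default_rng()):
        """One one-hot row per architecture layer, as in Table 2."""
        N = np.zeros((layer_num, 3), dtype=int)
        N[np.arange(layer_num), rng.integers(3, size=layer_num)] = 1
        return N

    def architecture_to_types(N):
        """Read back which cell type each layer row selects."""
        return [CELL_TYPES[int(np.argmax(row))] for row in N]

    N = random_architecture_encoding(10)        # 10 architecture layers -> 10 x 3 matrix
    print(architecture_to_types(N))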
Unlike previous work, in which the architecture is set manually, the invention also encodes the architecture and searches it for the optimal model, which enhances the flexibility of the model architecture and is a major advantage of the invention.
The original search space dimension equals (cell space dimension x architecture space dimension), whereas the present search space dimension equals (cell space dimension + architecture space dimension). Since the cell space dimension is far beyond billions and the architecture space dimension is about 3^layer_num, the dimension of the original search space is far larger than that of the present search space, and the original approach would need roughly 3^layer_num times as much time; the advantage of the adopted encoding and decoding scheme becomes more pronounced as the number of network layers increases.
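A small worked example of this comparison follows; the specific cell-space size used below is an assumed stand-in for "far beyond billions", not a figure from the patent.

    layer_num = 10
    architecture_dim = 3 ** layer_num          # each layer: normal cell / reduce cell / none
    cell_dim = 10 ** 10                        # assumed cell-space size ("far beyond billions")

    original = cell_dim * architecture_dim     # joint search space
    proposed = cell_dim + architecture_dim     # decoupled search spaces
    print(original // proposed)                # roughly 3 ** layer_num times smaller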
S2, generating the neural network model by using the cell structure search network, the architecture search network and the random coding array.
In some embodiments, step S2 may further include the steps of:
S21, searching the random coding array by using the cell structure search network and the architecture search network to obtain a cell structure coding array and an architecture coding array; and
S22, decoding the cell structure coding array and the architecture coding array by using a decoder to obtain the neural network model.
In some embodiments, as shown in FIG. 5, FIG. 5 illustrates the decoding rule of the cell structure coding array and the decoding rule of the architecture coding array. The decoding rule of the cell structure coding array is as follows: h_(i-1) is input and h_i is obtained through a convolution operation; the index of the 1 in M[0][0:5] gives hidden layer A according to the coding rule; the index of the 1 in M[0][6:18] gives the operation applied to hidden layer A; similarly, the index of the 1 in M[0][19:24] gives hidden layer B, and the index of the 1 in M[0][25:37] gives the operation applied to hidden layer B; the index of the 1 in M[0][38:39] gives the hidden-layer fusion operation, so that a new hidden layer is obtained. The structure of the 2nd block is decoded from M[1][:] in the same way; the 5 blocks are decoded in sequence and finally concatenated together to obtain the cell structure. The decoding rule of the architecture coding array is as follows: an image batch is input; the index of the 1 in N[0] determines, according to the coding rule, which cell type this layer connects to; by analogy the decoded cell structures are connected in series, and finally the model structure is output.
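As an illustration of this decoding rule, the following minimal Python sketch (an assumed illustration, not the patent's decoder; the column ranges follow the 1 x 40 layout described above, written as Python half-open slices) reads the indices of the 1s out of one block row and decodes the five blocks of a cell in sequence.

    import numpy as np

    def decode_block(row, op_names, fuse_names=("add", "concat")):
        """Decode one 1 x 40 block row: inputs, operations and fusion operation."""
        a_input = int(np.argmax(row[0:6]))              # hidden layer A input choice
        a_op    = op_names[int(np.argmax(row[6:19]))]   # operation on hidden layer A
        b_input = int(np.argmax(row[19:25]))            # hidden layer B input choice
        b_op    = op_names[int(np.argmax(row[25:38]))]  # operation on hidden layer B
        fuse    = fuse_names[int(np.argmax(row[38:40]))]
        return (a_input, a_op), (b_input, b_op), fuse

    def decode_cell(M, op_names):
        """Decode the five 1 x 40 rows of a cell coding matrix in sequence."""
        return [decode_block(M[b], op_names) for b in range(M.shape[0])]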
S3, inputting the image training set into the neural network model to obtain an actual classification result.
S4, judging whether the actual classification result meets the preset condition according to the theoretical classification of the image training set.
In some embodiments, step S4 may include:
S41, calculating an error value of the actual classification result according to the theoretical classification of the image training set;
in some embodiments, the error value may be a value of error/total value in the actual classification result, for example, if there are 100 classification results in total, and there are 50 correct classification results, the error value is 0.5.
S42, judging whether the error value is smaller than a threshold value, and if the error value is not smaller than the threshold value, performing the subsequent steps.
The threshold may be set according to actual requirements, for example in the range 0.05-0.15. If a more accurate result is desired, the threshold may be set to a lower value, such as 0.1, or even lower, such as 0.05.
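A small Python sketch of this error computation and threshold check follows (an illustration under the interpretation above; the function names and the 0.1 threshold are assumptions).

    def error_value(predicted, expected):
        """Error value = number of wrong results / total number of results."""
        wrong = sum(1 for p, t in zip(predicted, expected) if p != t)
        return wrong / len(expected)

    def meets_preset_condition(predicted, expected, threshold=0.1):
        """Threshold chosen inside the suggested 0.05-0.15 range."""
        return error_value(predicted, expected) < threshold

    # 100 results of which 50 are correct -> error value 0.5, condition not met
    labels      = [0] * 50 + [1] * 50
    predictions = [0] * 100
    print(error_value(predictions, labels))             # 0.5
    print(meets_preset_condition(predictions, labels))  # False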
S5, updating the cell structure search network and the architecture search network according to the actual classification result and the theoretical classification.
In some embodiments, step S5 may further include the steps of:
S51, calculating a loss function value by using the actual classification result and the theoretical classification of the image training set;
S52, updating the cell structure search network and the architecture search network with the loss function value.
In some embodiments, existing techniques may be adopted to update the cell structure search network and the architecture search network with the loss function value. For example, the loss function value may be used to update the parameters of the cell structure search network and of the architecture search network.
In some embodiments, the cell structure search network and the architecture search network are updated continuously with the loss function value until an optimal loss function value is finally obtained; the neural network model obtained at that point is the optimal model.
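As one possible realization of S51/S52, the following hedged PyTorch sketch computes a cross-entropy loss from the actual classification result (logits) and the theoretical classification (labels) and performs one gradient step on an optimizer holding the parameters of both search networks. It assumes the search networks are differentiable end-to-end; the patent only states that existing techniques may be used, so this is an illustrative assumption rather than the patent's method.

    import torch
    import torch.nn.functional as F

    def update_search_networks(optimizer, logits, labels):
        """One update of the search networks from the classification loss.

        `optimizer` is assumed to have been built over the parameters of both
        the cell structure search network and the architecture search network,
        e.g. torch.optim.Adam(list(cell_net.parameters()) + list(arch_net.parameters())).
        """
        loss = F.cross_entropy(logits, labels)   # S51: loss function value
        optimizer.zero_grad()
        loss.backward()                          # gradients w.r.t. the search networks
        optimizer.step()                         # S52: update their parameters
        return loss.item()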
S6, repeating the steps S2-S5 until a judgment is made at S4 that the actual classification result meets the preset conditions.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 6, an embodiment of the present invention further provides a computer apparatus 501, including:
at least one processor 520; and
a memory 510, said memory 510 storing a computer program 511 executable on said processor, said processor 520 when executing said program performing the steps of any of the methods of constructing a neural network model as described above.
Based on the same inventive concept, according to another aspect of the present invention, as shown in fig. 7, an embodiment of the present invention further provides a computer-readable storage medium 601, the computer-readable storage medium 601 stores a computer program 610, and the computer program 610, when executed by a processor, performs the steps of any one of the methods for constructing a neural network model as described above.
Finally, it should be noted that, as will be understood by those skilled in the art, all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.
In addition, the apparatuses, devices and the like disclosed in the embodiments of the present invention may be various electronic terminal devices, such as a mobile phone, a Personal Digital Assistant (PDA), a tablet computer (PAD), a smart television and the like, or may be a large terminal device, such as a server and the like, and therefore the scope of protection disclosed in the embodiments of the present invention should not be limited to a specific type of apparatus, device. The client disclosed in the embodiment of the present invention may be applied to any one of the above electronic terminal devices in the form of electronic hardware, computer software, or a combination of both.
Furthermore, the method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the above-described functions defined in the method disclosed in the embodiments of the present invention.
Further, the above method steps and system elements may also be implemented using a controller and a computer readable storage medium for storing a computer program for causing the controller to implement the functions of the above steps or elements.
Further, it should be appreciated that the computer-readable storage media (e.g., memory) described herein can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. By way of example, and not limitation, nonvolatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which can act as external cache memory. By way of example and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with the following components designed to perform the functions described herein: a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, and/or any other such configuration.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, of embodiments of the invention is limited to these examples; within the idea of an embodiment of the invention, also technical features in the above embodiment or in different embodiments may be combined and there are many other variations of the different aspects of an embodiment of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims (9)

1. A method of constructing a neural network model for enabling classification of images, the method comprising the steps of:
S1, constructing a cell structure search network, an architecture search network, an image training set and a random coding array;
S2, generating the neural network model by using the cell structure search network, the architecture search network and the random coding array;
S3, inputting the image training set into the neural network model to obtain an actual classification result;
S4, judging whether the actual classification result meets a preset condition according to the theoretical classification of the image training set, and if not, performing step S5;
S5, updating the cell structure search network and the architecture search network according to the actual classification result and the theoretical classification;
S6, repeating steps S2-S5 until it is judged at S4 that the actual classification result meets the preset condition;
wherein the step S2 further includes:
S21, searching the random coding array by using the cell structure search network and the architecture search network to obtain a cell structure coding array and an architecture coding array; and
S22, decoding the cell structure coding array and the architecture coding array by using a decoder to obtain the neural network model.
2. The method of claim 1, wherein the neural network model that yields the actual classification result satisfying the preset condition is an optimal neural network model.
3. The method of claim 1, wherein the cell structure coding array includes a reduction cell array and a normal cell array.
4. The method of claim 3, wherein the reduction cell array and the normal cell array each include a plurality of data blocks, wherein each data block includes constraint condition information, deep learning operation information, and concatenation operation information.
5. The method of claim 4, wherein the architecture coding array is used to select the deep learning operation information and the concatenation operation information of the cell structure coding array.
6. The method of claim 1, wherein the step S4 further comprises:
S41, calculating an error value of the actual classification result according to the theoretical classification of the image training set;
S42, judging whether the error value is smaller than a threshold value, and if the error value is not smaller than the threshold value, executing the step S5.
7. The method of claim 1, wherein the step S5 further comprises:
S51, calculating a loss function value by using the actual classification result and the theoretical classification of the image training set;
S52, updating the cell structure search network and the architecture search network with the loss function value.
8. A computer device, comprising:
at least one processor; and
memory storing a computer program operable on the processor, wherein the processor, when executing the program, performs the method of any of claims 1-7.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 7.
CN201811463775.7A 2018-12-03 2018-12-03 Neural network model construction method, device and storage medium Active CN109615073B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811463775.7A CN109615073B (en) 2018-12-03 2018-12-03 Neural network model construction method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811463775.7A CN109615073B (en) 2018-12-03 2018-12-03 Neural network model construction method, device and storage medium

Publications (2)

Publication Number Publication Date
CN109615073A CN109615073A (en) 2019-04-12
CN109615073B true CN109615073B (en) 2021-06-04

Family

ID=66006198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811463775.7A Active CN109615073B (en) 2018-12-03 2018-12-03 Neural network model construction method, device and storage medium

Country Status (1)

Country Link
CN (1) CN109615073B (en)

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018176000A1 (en) 2017-03-23 2018-09-27 DeepScale, Inc. Data synthesis for autonomous control systems
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
CN115512173A (en) 2018-10-11 2022-12-23 特斯拉公司 System and method for training machine models using augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN110059804B (en) * 2019-04-15 2021-10-08 北京迈格威科技有限公司 Data processing method and device
DE102019206620A1 (en) * 2019-04-18 2020-10-22 Robert Bosch Gmbh Method, device and computer program for creating a neural network
CN110175671B (en) * 2019-04-28 2022-12-27 华为技术有限公司 Neural network construction method, image processing method and device
CN110278370B (en) * 2019-06-21 2020-12-18 上海摩象网络科技有限公司 Method and device for automatically generating shooting control mechanism and electronic equipment
CN110659721B (en) * 2019-08-02 2022-07-22 杭州未名信科科技有限公司 Method and system for constructing target detection network
CN110555514B (en) * 2019-08-20 2022-07-12 北京迈格威科技有限公司 Neural network model searching method, image identification method and device
CN110428046B (en) * 2019-08-28 2023-12-15 腾讯科技(深圳)有限公司 Method and device for acquiring neural network structure and storage medium
CN110659690B (en) * 2019-09-25 2022-04-05 深圳市商汤科技有限公司 Neural network construction method and device, electronic equipment and storage medium
CN110751267B (en) * 2019-09-30 2021-03-30 京东城市(北京)数字科技有限公司 Neural network structure searching method, training method, device and storage medium
CN111191785B (en) * 2019-12-20 2023-06-23 沈阳雅译网络技术有限公司 Structure searching method based on expansion search space for named entity recognition
CN113326929A (en) * 2020-02-28 2021-08-31 深圳大学 Progressive differentiable network architecture searching method and system based on Bayesian optimization
CN113469891A (en) * 2020-03-31 2021-10-01 武汉Tcl集团工业研究院有限公司 Neural network architecture searching method, training method and image completion method
CN113705276A (en) * 2020-05-20 2021-11-26 武汉Tcl集团工业研究院有限公司 Model construction method, model construction device, computer apparatus, and medium
CN111931904A (en) * 2020-07-10 2020-11-13 华为技术有限公司 Neural network construction method and device
CN114926698B (en) * 2022-07-19 2022-10-14 深圳市南方硅谷半导体股份有限公司 Image classification method for neural network architecture search based on evolutionary game theory

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874956A (en) * 2017-02-27 2017-06-20 陕西师范大学 The construction method of image classification convolutional neural networks structure
CN107172428A (en) * 2017-06-06 2017-09-15 西安万像电子科技有限公司 The transmission method of image, device and system
CN108021983A (en) * 2016-10-28 2018-05-11 谷歌有限责任公司 Neural framework search

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140249882A1 (en) * 2012-10-19 2014-09-04 The Curators Of The University Of Missouri System and Method of Stochastic Resource-Constrained Project Scheduling
CN105303252A (en) * 2015-10-12 2016-02-03 国家计算机网络与信息安全管理中心 Multi-stage nerve network model training method based on genetic algorithm
CN106295803A (en) * 2016-08-10 2017-01-04 中国科学技术大学苏州研究院 The construction method of deep neural network
US10019655B2 (en) * 2016-08-31 2018-07-10 Adobe Systems Incorporated Deep-learning network architecture for object detection

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021983A (en) * 2016-10-28 2018-05-11 谷歌有限责任公司 Neural framework search
CN106874956A (en) * 2017-02-27 2017-06-20 陕西师范大学 The construction method of image classification convolutional neural networks structure
CN107172428A (en) * 2017-06-06 2017-09-15 西安万像电子科技有限公司 The transmission method of image, device and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Progressive Neural Architecture Search; Chenxi Liu et al.; arXiv; 2018-07-26; pp. 1-20 *

Also Published As

Publication number Publication date
CN109615073A (en) 2019-04-12

Similar Documents

Publication Publication Date Title
CN109615073B (en) Neural network model construction method, device and storage medium
US11531889B2 (en) Weight data storage method and neural network processor based on the method
JP2019032808A (en) Mechanical learning method and device
KR20160117537A (en) Hierarchical neural network device, learning method for determination device, and determination method
CN112149797B (en) Neural network structure optimization method and device and electronic equipment
KR20210092286A (en) Image restoration method and device, electronic device, storage medium
CN110009048B (en) Method and equipment for constructing neural network model
CN109902808B (en) Method for optimizing convolutional neural network based on floating point digital variation genetic algorithm
CN113642726A (en) System and method for compressing activation data
CN116109920A (en) Remote sensing image building extraction method based on transducer
CN113128432B (en) Machine vision multitask neural network architecture searching method based on evolution calculation
WO2021038793A1 (en) Learning system, learning method, and program
KR20230069578A (en) Sign-Aware Recommendation Apparatus and Method using Graph Neural Network
KR102382491B1 (en) Method and apparatus for sequence determination, device and storage medium
CN112381147A (en) Dynamic picture similarity model establishing method and device and similarity calculating method and device
CN115294337B (en) Method for training semantic segmentation model, image semantic segmentation method and related device
CN111312340A (en) SMILES-based quantitative structure effect method and device
CN113905066B (en) Networking method of Internet of things, networking device of Internet of things and electronic equipment
CN115359211A (en) Model loading method and device, storage medium and electronic equipment
CN112561050B (en) Neural network model training method and device
CN111444180B (en) Double-layer structure index and query method thereof
EP4295276A1 (en) Accelerated execution of convolution operation by convolutional neural network
CN110147804B (en) Unbalanced data processing method, terminal and computer readable storage medium
CN111143641A (en) Deep learning model training method and device and electronic equipment
CN109614587B (en) Intelligent human relationship analysis modeling method, terminal device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant