CN113705276A - Model construction method, model construction device, computer apparatus, and medium - Google Patents

Model construction method, model construction device, computer apparatus, and medium

Info

Publication number
CN113705276A
Authority
CN
China
Prior art keywords
network
trained
training
model
super
Prior art date
Legal status
Pending
Application number
CN202010431405.6A
Other languages
Chinese (zh)
Inventor
李叶伟
陈浩鹏
熊宇龙
李渊
向少雄
Current Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202010431405.6A
Publication of CN113705276A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present application, which falls within the technical field of model construction, provides a model construction method, a model construction device, computer equipment, and a medium. The model construction method trains a super network constructed with a search space to obtain a trained super network; searches the trained super network for a network framework to be trained according to a preset search condition; and finally trains and optimizes the network framework to be trained on a training sample set to obtain a target model. Because no structure optimization or content optimization of an original model framework is required, the efficiency of model construction is improved.

Description

Model construction method, model construction device, computer apparatus, and medium
Technical Field
The present application relates to a model building method, a model building apparatus, a computer device, and a computer-readable storage medium.
Background
In recent years, with the development of artificial intelligence technology, more and more fields have adopted mathematical operation models to recognize and judge real-world objects and thereby reduce manual labor. For example, face recognition technology uses a face recognition model to perform feature recognition on a face image and then determines whether the source of the face image is legitimate, that is, whether the user's identity is legitimate.
In the related art, a face recognition model is constructed by first building an original model based on a neural network and then training and verifying that model with a constructed training sample set and verification set. However, building the original model requires selecting a suitable model framework according to actual requirements and performing structure optimization and content optimization on it, for example deleting or adding hierarchical levels of the model framework, or deleting or adding channels at a given level of the framework. The model building process in existing model construction schemes is therefore cumbersome, and model construction efficiency is low.
Disclosure of Invention
In view of this, embodiments of the present application provide a model building method, a model building apparatus, a computer device, and a computer-readable storage medium, so as to solve the problem of low model construction efficiency in existing model construction schemes.
A first aspect of an embodiment of the present application provides a model building method, including:
training a super network constructed with a search space to obtain a trained super network, wherein the search space contains a plurality of candidate network frameworks;
searching the trained super network for a network framework to be trained according to a preset search condition;
and training the network framework to be trained with a training sample set to obtain a target model.
In the foregoing scheme, before the step of training the network frame to be trained by using the training sample set to obtain the target model, the method further includes:
acquiring a sample image set containing a human face;
intercepting a face area from each sample image in the sample image set to obtain a face image sample set;
and scaling and labeling each face image sample in the face image sample set to obtain the training sample set.
In the above solution, a plurality of candidate network frames in the search space are connected to each other, and each candidate network frame includes a plurality of substructures;
the training of the super network with the search space to obtain the trained super network comprises the following steps:
performing structure search based on all the substructures in the search space to determine a single-path supernet;
and carrying out sampling training on the super network according to the single-path super network to obtain the trained super network.
In the foregoing solution, the performing a structure search based on all the substructures in the search space to determine a single-path supernet includes:
performing a structure search on all the substructures in the search space according to preset super-network attribute information to obtain a single-path super network; wherein the transition probability between any two adjacent substructures in the single-path super network is the largest.
In the scheme, the preset search condition corresponds to a target deployment platform;
the searching from the trained super network to obtain the network frame to be trained according to the preset search condition comprises the following steps:
searching a network frame to be trained from the trained hyper-network based on the following formula contained in the preset search condition;
max ACC_val(a), s.t. a ∈ A
Latency(a, h) ≤ LatC_h
wherein a denotes a candidate network framework and A denotes the search space containing all the candidate network frameworks (a ∈ A, A non-empty); ACC_val(a) is the validation accuracy of a; h denotes the target deployment platform; Latency(a, h) is the objective function giving the latency of a on h; and LatC_h is the latency constraint of the target deployment platform.
In the above scheme, obtaining a training sample set and training the network framework to be trained with the training sample set to obtain a target model includes:
training the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain the target model.
In the above scheme, training the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model includes:
identifying the objective function Latency(a, h) as a convergence condition, wherein Latency(a, h) ≤ LatC_h and LatC_h is the latency constraint of the target deployment platform;
and training and optimizing the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model meeting the convergence condition.
A second aspect of an embodiment of the present application provides a model building apparatus, including:
the sampling training unit is used for training the super network constructed with the search space to obtain a trained super network; wherein the search space contains a plurality of candidate network frameworks;
the searching unit is used for searching the trained super network to obtain a network frame to be trained according to a preset searching condition;
and the model training unit is used for training the network frame to be trained by utilizing a training sample set to obtain a target model.
In the foregoing solution, the model building apparatus further includes:
the image acquisition unit is used for acquiring a sample image set containing a human face;
the image intercepting unit is used for intercepting a face area from each sample image in the sample image set to obtain a face image sample set;
and the sample generating unit is used for scaling and labeling each face image sample in the face image sample set to obtain the training sample set.
In the above solution, a plurality of candidate network frames in the search space are connected to each other, and each candidate network frame includes a plurality of substructures;
the sampling training unit is specifically configured to perform structure search based on all the substructures in the search space, and determine a single-path supernet; and carrying out sampling training on the super network according to the single-path super network to obtain the trained super network.
In the above scheme, the sampling training unit is further specifically configured to perform structure search on all the substructures in the search space according to preset attribute information of the super-network, so as to obtain a single-path super-network; wherein a transition probability between two adjacent substructures in the single-path super-network is the largest.
In the scheme, the preset search condition corresponds to a target deployment platform;
the searching unit is specifically configured to search a network frame to be trained from the trained super network based on the following formula included in the preset searching condition;
max ACC_val(a), s.t. a ∈ A
Latency(a, h) ≤ LatC_h
wherein a denotes a candidate network framework and A denotes the search space containing all the candidate network frameworks (a ∈ A, A non-empty); ACC_val(a) is the validation accuracy of a; h denotes the target deployment platform; Latency(a, h) is the objective function giving the latency of a on h; and LatC_h is the latency constraint of the target deployment platform.
In the above scheme, the model training unit is specifically configured to train the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain the target model.
In the above scheme, the model training unit is specifically configured to identify the objective function Latency(a, h) as a convergence condition, wherein Latency(a, h) ≤ LatC_h and LatC_h is the latency constraint of the target deployment platform; and to train and optimize the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model meeting the convergence condition.
A third aspect of embodiments of the present application provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the model building method provided in the first aspect when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium, which stores a computer program that, when executed by a processor, implements the steps of the model construction method provided by the first aspect.
A fifth aspect of embodiments of the present application provides a computer program product, which, when run on a computer device, causes the computer device to perform the steps of the model building method according to any one of the first aspect.
The model construction method, the model construction device, the computer equipment and the computer readable storage medium provided by the embodiment of the application have the following beneficial effects:
according to the model construction method provided by the embodiment of the application, a trained hyper-network is obtained by training the hyper-network with a search space, and the search space is pre-constructed in the hyper-network, so that all candidate network frames needing to be searched can be contained in the hyper-network, and when the hyper-network is trained, all internal substructures can share parameters when different sub-networks are constructed, so that the hyper-network can be trained to a certain degree, and the sub-networks can be sampled and indexes can be evaluated; according to the preset search conditions, the network frame to be trained is searched from the trained hyper-network, and finally the network frame to be trained is trained and optimized based on the training sample set to obtain the target model, so that the structure optimization and the content optimization of the original model frame are not needed, the steps of model construction are simplified, and the efficiency of model construction is improved.
In addition, the model searching condition corresponds to the target deployment platform, and the network frame to be trained is searched out from the trained hyper-network and is the optimal network path, so that the network frame to be trained is the network frame which is searched out to be matched with the model deployment limiting condition best, and finally the target model is obtained by training and optimizing the network frame to be trained, and the matching degree between the computing capabilities of the target model and the target deployment platform can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a flow chart of an implementation of a model building method provided in an embodiment of the present application;
FIG. 2 is a flow chart of an implementation of a model building method according to another embodiment of the present application;
FIG. 3 is a schematic diagram of a candidate network framework in an embodiment of the present application;
fig. 4 is a block diagram of a model building apparatus according to an embodiment of the present application;
fig. 5 is a block diagram of a computer device according to another embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that, in the model building methods provided in all embodiments of the present application, the execution subject is a computer device for building a model, such as a server for model deployment, a computer node for model deployment in a distributed system, and the like.
Referring to fig. 1, fig. 1 is a flowchart illustrating an implementation of a model building method according to an embodiment of the present disclosure.
The model construction method shown in fig. 1 includes the following steps:
S11: training a super network constructed with a search space to obtain a trained super network; wherein the search space contains a plurality of candidate network frameworks.
In step S11, a search space is pre-constructed in the super network, and the search space includes a plurality of candidate network frames. Each candidate network frame is composed of a plurality of substructures, a part of the substructures among the candidate network frames are connected to form a sub-network, and all the sub-networks form a super network.
It should be noted that the super network is constructed based on the Neural Architecture Search (NAS) method. In NAS, an algorithm replaces the manual design of model frameworks and automatically searches a massive search space for the optimal neural network framework, that is, it searches a plurality of candidate network frameworks for the candidate to be optimized. Here, integrating a plurality of candidate network frameworks means connecting part or all of the substructures in each candidate network framework according to certain integration logic, so as to form a plurality of sub-networks, which together constitute the super network.
The search space is constructed by taking the substructures contained in all the sub-networks of the super network as the structure search objects and, on the basis of the super network's structure, integrating substructures with different channel numbers (that is, different widths) into the super network. Because integrating substructures of different widths into a super network is difficult with conventional NAS methods, constructing a search space that contains all the structures to be searched greatly reduces the search cost.
Before the search space is constructed, each candidate network framework can be regarded as an independent module (block), and the blocks differ from one another in width and spatial resolution. Based on all the blocks in the super network, that is, all the candidate network frameworks, constructing the search space consists of connecting blocks of different widths and spatial resolutions with one another: the hierarchical levels inside each block are treated as the substructures of the corresponding candidate network framework and are then connected in a purposeful, directed manner.
In this embodiment, training the super network in which the search space is constructed actually means performing sampling training on it. The core idea of the super network is to train a large number of network structures, that is, a large number of candidate network frameworks, simultaneously in a parameter-sharing manner. In order to treat all candidate network frameworks equally, the sampling training of this embodiment uses no hyper-parameters and samples the super network uniformly, thereby obtaining the trained super network.
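To make this concrete, the following is a minimal single-path supernet sketch in Python (PyTorch). It illustrates the uniform-sampling idea only and is not the patent's implementation: the layer count, the three candidate kernel sizes per layer, and all class and function names are assumptions.

import random

import torch
import torch.nn as nn

class ChoiceLayer(nn.Module):
    # One supernet layer holding several candidate substructures that share
    # the layer's position but differ in structure (kernel size here).
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.choices = nn.ModuleList(
            nn.Sequential(nn.Conv2d(in_ch, out_ch, k, padding=k // 2),
                          nn.BatchNorm2d(out_ch), nn.ReLU())
            for k in (1, 3, 5))

    def forward(self, x, idx):
        return self.choices[idx](x)

class Supernet(nn.Module):
    def __init__(self, num_layers=4, ch=16, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, ch, 3, padding=1)
        self.layers = nn.ModuleList(ChoiceLayer(ch, ch) for _ in range(num_layers))
        self.head = nn.Linear(ch, num_classes)

    def forward(self, x, path):
        x = self.stem(x)
        for layer, idx in zip(self.layers, path):
            x = layer(x, idx)
        return self.head(x.mean(dim=(2, 3)))  # global average pooling

def train_step(net, opt, images, labels):
    # Uniform sampling: one substructure per layer, drawn uniformly at
    # random, with no hyper-parameter guiding the selection.
    path = [random.randrange(len(layer.choices)) for layer in net.layers]
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(images, path), labels)
    loss.backward()
    opt.step()
    return loss.item()

net = Supernet()
opt = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)
# train_step(net, opt, images, labels)  # one uniformly sampled path per step

Because a fresh path is drawn at every step, all candidate substructures share and update the same supernet weights, which is what later allows sub-networks to be evaluated without retraining.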
S12: and searching a network frame to be trained from the trained super network according to a preset search condition.
In step S12, the preset search condition is related to the predicted performance index of the network framework to be trained.
In this embodiment, since all the substructures in the super network share parameters when constructing different sub-networks, the sub-networks can be sampled and their indexes evaluated once the super network has been trained to a certain degree, without retraining each sub-network. The optimal candidate network framework is selected by evaluating the performance of the candidate network frameworks on the super network, that is, by searching the trained super network for the network framework to be trained.
It should be noted that, because blocks of different widths and spatial resolutions are interconnected in the search space, searching the trained super network for the network framework to be trained under the pre-configured model deployment limiting conditions amounts to optimizing the transition probabilities between blocks so as to select an optimal path; the searched network framework to be trained is thus the optimal candidate network framework under the operational capability of the target deployment platform represented by those conditions.
It can be understood that, in practical application, the corresponding model deployment limiting conditions may be set according to the specific performance or speed requirement of the target deployment platform, so as to search and deploy the optimal candidate network framework of the target deployment platform.
S13: and training the network frame to be trained by utilizing a training sample set to obtain a target model.
In step S13, the training sample set includes a plurality of training samples, where each training sample includes a sample feature.
Taking the target model for face recognition as an example, the training samples are a plurality of image samples containing faces, and in each image sample containing a face, the face region is a sample feature of the image sample.
The network framework to be trained is trained on the training sample set to obtain the target model; for example, a target model obtained by training the network framework on a plurality of image samples containing faces can be used to recognize the face region in an image containing a face.
It should be understood that, in the model construction process, different training sample sets can be selected and obtained according to different purposes or purposes of the target model.
As can be seen from the above, in the model construction method provided in this embodiment, a trained super network is obtained by training a super network constructed with a search space. Because the search space is pre-constructed inside the super network, all candidate network frameworks that need to be searched are contained in the super network, and during training all internal substructures share parameters when constructing different sub-networks; the super network therefore only needs to be trained to a certain degree before sub-networks can be sampled and their indexes evaluated. A network framework to be trained is then searched from the trained super network according to the preset search condition, and finally trained and optimized on the training sample set to obtain the target model. No structure optimization or content optimization of an original model framework is required, which simplifies the steps of model construction and improves its efficiency.
Referring to fig. 2, fig. 2 is a flowchart illustrating an implementation of a model building method according to another embodiment of the present application. With respect to the embodiment corresponding to fig. 1, the model building method provided in this embodiment may further include steps S21 to S22 before step S11, and further include steps S23 to S25 before step S13. The details are as follows:
s21: a plurality of candidate network frameworks are obtained.
S22: and constructing a search space of the super network based on a plurality of the candidate network frameworks.
In this embodiment, the plurality of candidate network frames may be candidate network frames selected by the model deployment tool, or candidate network frames obtained from the target database.
In practical application, in the process of model deployment, a network framework meeting the model construction requirement can be selected from a target database as a candidate network framework through a model deployment tool, wherein information in the target database is used for describing a corresponding relation between the network framework and applicable equipment information thereof, and the applicable equipment information is used for distinguishing a deployment platform. When a search space of the hyper-network is constructed, a network frame can be obtained from a target database as a candidate network frame based on the applicable device information corresponding to the target deployment platform.
It should be appreciated that different candidate network frameworks differ in computational logic and memory complexity. Fig. 3 shows a schematic diagram of a candidate network framework in this embodiment. The module blocks shown in fig. 3 may be constructed based on four basic structures of existing effective models, that is, A, B, C, and D in fig. 3. The search space in this embodiment contains 32 candidate blocks.
The candidate network frameworks in all embodiments of the application can be constructed based on at least one of ShuffleNet V2, SPOS, DARTS and MobileNet V3.
In this embodiment, candidate network architectures can be searched for deployment platforms such as a DSP, an ARM CPU, and an NPU, and the deployment platforms are not limited thereto.
After the search space of the super network is constructed, the super network is subjected to sampling training, that is, step S11 is performed.
S11: and training the super network constructed with the search space to obtain the trained super network.
As a possible implementation manner of this embodiment, a plurality of candidate network frames in the search space are connected to each other, each of the candidate network frames includes a plurality of substructures, and step S11 specifically includes: performing structure search based on all the substructures in the search space to determine a single-path supernet; and carrying out sampling training on the super network according to the single-path super network to obtain the trained super network.
In this embodiment, to reduce the weight coupling of the super network, only a single-path super network inside the super network is activated in each iteration of training, which is achieved by determining the single-path super network. The multiple substructures of a single candidate network framework lie at different levels. When sampling training is carried out on the super network via the single-path super network, no hyper-parameters guide the selection of substructures; instead, a uniform sampling mode treats all substructures in the super network equally.
In practical application, the attribute information of the single-path super network can be configured according to actual requirements, and the structure search is carried out according to that attribute information. Specifically, different types of selection units are defined to search different structure variables, which further supports channel-number search in a search space for complex model structures. A selection unit is used to search one substructure, such as the number of channels of a convolutional layer: a weight tensor with the maximum number of channels is pre-allocated, and during super-network training a channel number is randomly selected and the corresponding sub-tensor is sliced out for convolution, thereby realizing the structure search. Substructures located at different levels are then selected from different candidate network frameworks and connected to form the single-path super network.
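A hedged sketch of such a selection unit for channel-number search follows. Pre-allocating the weight tensor at the maximum channel count and slicing a sub-tensor per sampled channel number is the mechanism described above; the concrete candidate channel counts and all names are illustrative assumptions.

import random

import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSelectionConv(nn.Module):
    # Selection-unit sketch: the weight tensor is pre-allocated at the
    # maximum channel count; each training step randomly picks a channel
    # number and slices the corresponding sub-tensor for the convolution.
    def __init__(self, in_ch, max_out_ch, kernel_size, candidates):
        super().__init__()
        self.candidates = candidates                 # e.g. (8, 16, 24, 32)
        self.weight = nn.Parameter(
            torch.randn(max_out_ch, in_ch, kernel_size, kernel_size) * 0.01)

    def forward(self, x, out_ch=None):
        if out_ch is None:
            out_ch = random.choice(self.candidates)  # uniform channel choice
        w = self.weight[:out_ch]                     # slice the sub-tensor
        return F.conv2d(x, w, padding=self.weight.shape[-1] // 2)

conv = ChannelSelectionConv(in_ch=16, max_out_ch=32, kernel_size=3,
                            candidates=(8, 16, 24, 32))
y = conv(torch.randn(1, 16, 32, 32))                 # output width varies per call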
It should be noted that the key reason why the model structure search performed with the super network can find the network framework to be trained is that, on the verification set, the accuracy of any substructure using the multiplexed weights is highly reliable; that is, when the weights are required to approximate the optimal weights, the quality of the approximation is proportional to the degree to which the training loss function is minimized. It follows that optimization of the super network's weights should be done simultaneously with optimization of all the substructures in the search space. Therefore, when the super network is uniformly sampled and trained, all internal substructures share parameters when constructing different sub-networks, and the sub-networks can be sampled and their indexes evaluated once the super network has been trained to a certain degree; that is, the sub-networks do not need to be retrained.
In all embodiments of the present application, where the super network has multiple selectable substructures per layer, the super network is typically trained by selecting a single path through uniform path sampling, so that all candidate network frameworks can optimize their weights simultaneously. To reduce weight coupling in the super network, a simple search space containing only single-path architectures, that is, a single-path super network, is used. For training, a method without hyper-parameters is used, and all candidate network architectures are treated equally through uniform sampling, which automates the model search while improving model construction efficiency.
As a possible implementation manner of this embodiment, the performing a structure search based on all the substructures in the search space to determine a single-path supernet includes:
performing a structure search on all the substructures in the search space according to preset super-network attribute information to obtain a single-path super network; wherein the transition probability between any two adjacent substructures in the single-path super network is the largest.
In this embodiment, since the candidate network frameworks in the search space are connected with each other, a corresponding connection relationship also exists between the substructures of each candidate network framework and the substructures of the other candidate network frameworks, and together they form the super network. Preset super-network attribute information is defined to control the number of candidate network frameworks, the transmission logic within each candidate network framework (such as its input channels, output channels, and spatial size), and the stride used in each candidate network framework. A structure search is then performed on all the substructures in the search space, yielding the single-path super network in which the transition probability between every two adjacent substructures is the largest.
It should be understood that the transition probability describes the degree of matching between two adjacent substructures. During the structure search, because the candidate network frameworks in the search space are interconnected, substructures of one candidate network framework are also connected to substructures of other candidate network frameworks; two adjacent substructures therefore need not come from the same candidate network framework, but, based on the preset super-network attribute information, their degree of matching is the highest.
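The patent does not spell out the selection algorithm, but a greedy reading of "the transition probability between two adjacent substructures is the largest" could be sketched as follows, assuming per-layer transition-probability matrices are available:

import numpy as np

def select_single_path(transition_probs):
    # transition_probs[l][i][j]: probability of moving from substructure i
    # of layer l to substructure j of layer l + 1.
    first = transition_probs[0]
    path = [int(np.argmax(first.max(axis=1)))]       # start with the best row
    for trans in transition_probs:
        path.append(int(np.argmax(trans[path[-1]]))) # follow the largest probability
    return path                                      # one substructure index per layer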
The preset super-network attribute information is defined as shown in Table 1, where a candidate network framework to be searched is indicated by the mark TBS in the module Block column. In Table 1, Input shape denotes the shape of the input, Block denotes the module, channels denotes the number of channels, repeat denotes the number of repetitions, and stride denotes the step size.
Table 1: preset super-network attribute information (the rows are reproduced as images in the original publication and are not recoverable here).
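Since the rows of Table 1 are only available as images in the original publication, the following hypothetical snippet merely illustrates how attribute information of this shape (input shape; block type, with TBS marking a block to be searched; channels; repeat; stride) could be encoded. All concrete values are made up.

SUPERNET_CONFIG = [
    # input_shape,    block,     channels, repeat, stride
    ((3, 112, 112),  "Conv3x3",  16,  1, 2),
    ((16, 56, 56),   "TBS",      32,  4, 2),   # TBS: block to be searched
    ((32, 28, 28),   "TBS",      64,  4, 2),
    ((64, 14, 14),   "TBS",     128,  8, 2),
    ((128, 7, 7),    "Conv1x1", 256,  1, 1),
]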
S12: and searching the trained hyper-network to obtain a network frame to be trained according to a preset search condition.
As a possible implementation manner of this embodiment, the preset search condition corresponds to the target deployment platform, and step S12 specifically includes:
searching a network frame to be trained from the trained hyper-network based on the following formula contained in the preset search condition;
max ACC_val(a), s.t. a ∈ A
Latency(a, h) ≤ LatC_h
wherein a denotes a candidate network framework and A denotes the search space containing all the candidate network frameworks (a ∈ A, A non-empty); ACC_val(a) is the validation accuracy of a; h denotes the target deployment platform; Latency(a, h) is the objective function giving the latency of a on h; and LatC_h is the latency constraint of the target deployment platform.
In this embodiment, the preset search condition corresponds to the target deployment platform: it is related to, and can be used to characterize, the operational capability of the target deployment platform, thereby distinguishing the candidate network frameworks suitable for that platform.
It is understood that a target deployment platform refers to a hardware device for deploying and providing computational resources for a target model.
By constraining the objective function with the latency of the target deployment platform, a network framework to be trained that is better suited to the target deployment platform can be searched from the trained super network. In practical application, the latency constraint LatC_h of the target deployment platform may be obtained through a latency lookup table (LUT) or a latency predictor.
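A sketch of this constrained search is given below. Random path sampling stands in for whatever search strategy the patent intends; the per-block latency lookup table follows the LUT approach just mentioned; and evaluate is a hypothetical callable that measures ACC_val(a) on the verification set by reusing the trained supernet weights.

import random

def search_network_to_train(num_choices, latency_lut, lat_constraint,
                            evaluate, num_samples=1000):
    # num_choices[l]: number of candidate substructures in layer l.
    # latency_lut[l][i]: measured latency of choice i in layer l on platform h.
    best_path, best_acc = None, -1.0
    for _ in range(num_samples):
        path = [random.randrange(n) for n in num_choices]
        # Latency(a, h): sum of per-block latencies from the lookup table.
        latency = sum(latency_lut[layer][idx] for layer, idx in enumerate(path))
        if latency > lat_constraint:      # reject paths violating LatC_h
            continue
        acc = evaluate(path)              # ACC_val(a), reusing supernet weights
        if acc > best_acc:
            best_path, best_acc = path, acc
    return best_path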
In this embodiment, the LFW (Labeled Faces in the Wild) dataset may preferably be used as the verification set.
With respect to the embodiment shown in fig. 1, the model building method provided in this embodiment further includes steps S23 to S25 before step S13. The details are as follows:
s23: a sample image set containing a human face is acquired.
S24: and intercepting a face area from each sample image in the sample image set to obtain a face image sample set.
S25: and zooming and labeling each face image sample in the face image sample set to obtain the training sample set.
In this embodiment, the sample image set includes a plurality of sample images, and each sample image includes at least one face region. For each sample image, the face region is identified and intercepted: the face region is first located in the sample image, that is, its position in the sample image is identified, and it is then cropped from the sample image, yielding the face image sample set.
It should be noted that each face image sample in the face image sample set is scaled, which applies a uniform normalization to the samples, and is annotated, so that the samples can be distinguished by finer-grained features.
Because different faces differ in size within the same or different images, scaling each face image sample standardizes the samples, which helps improve the efficiency of model training. The scaling may be long-edge scaling, that is, scaling the face image sample in the horizontal direction.
In order to distinguish the face image samples, each face image sample in the face image sample set is also labeled in this embodiment, where the label may be an identifier for distinguishing a face image, such as at least one of a name, a gender, a race, and a serial number.
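A sketch of this preprocessing pipeline is shown below, using OpenCV's bundled Haar cascade as a stand-in face detector; the patent names no particular detector, and the 112x112 target size is an assumption.

import cv2

def build_training_samples(image_paths, labels, size=112):
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    samples = []
    for path, label in zip(image_paths, labels):
        img = cv2.imread(path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
            face = img[y:y + h, x:x + w]           # intercept the face region
            face = cv2.resize(face, (size, size))  # uniform scaling
            samples.append((face, label))          # labelled training sample
    return samples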
In this embodiment, steps S23 to S25 and steps S11 to S12 need not be performed in a particular order relative to each other, and step S13 may be performed once each face image sample in the face image sample set has been scaled and labeled to obtain the training sample set.
S13: and acquiring a training sample set, and training the network frame to be trained by using the training sample set to obtain a target model.
As a possible implementation manner of this embodiment, step S13 specifically includes:
and training and optimizing the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model.
In this embodiment, a Back Propagation (BP) algorithm and a gradient optimization method are used to train and optimize the network framework to be trained according to the objective function and the training sample set. The gradient optimization method may be, but is not limited to, a known optimization algorithm such as the Adam, RMSprop, or SGD algorithm; the objective function may be, but is not limited to, an AM-Softmax, ArcNegFace, or CosFace function.
As a possible implementation manner of this embodiment, training the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain the target model includes:
identifying the objective function Latency(a, h) as a convergence condition, wherein Latency(a, h) ≤ LatC_h and LatC_h is the latency constraint of the target deployment platform;
and training and optimizing the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model meeting the convergence condition.
In this embodiment, the convergence condition configured for training the network framework to be trained is the same as the objective function in the preset search condition. Using the objective function of the preset search condition as the training convergence condition makes the target model better suited to the target deployment platform, that is, better matched to its latency requirement.
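For illustration, a sketch of this final training stage follows: plain back propagation with a gradient optimizer. SGD is used for concreteness (Adam or RMSprop plug in the same way), CrossEntropyLoss stands in for the margin-based objectives named above, and the latency condition is treated as already satisfied at search time, since it depends on the architecture rather than on the weights.

import torch
import torch.nn as nn

def train_target_model(model, train_loader, epochs=50, lr=0.1):
    criterion = nn.CrossEntropyLoss()   # stand-in for AM-Softmax / ArcNegFace
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()             # back propagation
            optimizer.step()            # gradient-based weight update
    return model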
It should be understood that, since the embodiments of the present application do not concern how to configure an optimization strategy, and training a model with an optimization algorithm and an objective function is prior art in the field of deep neural networks, the details of the optimization scheme and the configuration of the objective function are not described here.
As can be seen from the above, in the model construction method provided in this embodiment, a trained super network is obtained by training a super network constructed with a search space. Because the search space is pre-constructed inside the super network, all candidate network frameworks that need to be searched are contained in the super network, and during training all internal substructures share parameters when constructing different sub-networks; the super network therefore only needs to be trained to a certain degree before sub-networks can be sampled and their indexes evaluated. A network framework to be trained is then searched from the trained super network according to the preset search condition, and finally trained and optimized on the training sample set to obtain the target model. No structure optimization or content optimization of an original model framework is required, which simplifies the steps of model construction and improves its efficiency.
In addition, because the preset search condition corresponds to the target deployment platform, the network framework to be trained searched out of the trained super network is the optimal network path, that is, the framework that best matches the model deployment limiting conditions; training and optimizing it into the target model therefore improves the match between the target model and the computing capability of the target deployment platform.
In addition, by constructing the search space of the super network from a plurality of candidate network frameworks, the method takes width search as its starting point while also searching the positions of network down-sampling and the global depth; it is not limited to a fixed number of layers per candidate network framework, and the number of substructures of each candidate network framework can itself be searched. This improves the flexibility of network structure search and provides a basis for deploying the model on deployment platforms with different requirements.
Referring to fig. 4, fig. 4 is a block diagram illustrating a model building apparatus according to an embodiment of the present disclosure. The model building apparatus in this embodiment includes units for performing the steps in the embodiments corresponding to fig. 1 and fig. 2; please refer to the descriptions of those embodiments for details. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 4, the model building apparatus 30 includes: a sampling training unit 31, a search unit 32, and a model training unit 33. Wherein:
a sampling training unit 31, configured to train a super network in which a search space is constructed, to obtain a trained super network; wherein the search space contains a plurality of candidate network frameworks.
And the searching unit 32 is configured to search the trained super network to obtain a network frame to be trained according to a preset search condition.
And the model training unit 33 is configured to train the network frame to be trained by using a training sample set to obtain a target model.
As an embodiment of the present application, the model building apparatus 30 further includes: an image acquisition unit 36, an image intercepting unit 37, and a sample generating unit 38.
An image acquisition unit 36, configured to acquire a sample image set containing a human face.
An image intercepting unit 37, configured to intercept a face region from each sample image in the sample image set to obtain a face image sample set.
And the sample generating unit 38 is configured to scale and label each facial image sample in the facial image sample set to obtain the training sample set.
As an embodiment of the application, a plurality of candidate network frameworks in a search space are connected with each other, and each candidate network framework comprises a plurality of substructures.
The sampling training unit 31 is specifically configured to perform structure search based on all the substructures in the search space, and determine a single-path supernet; and carrying out sampling training on the super network according to the single-path super network to obtain the trained super network.
As an embodiment of the present application, the sampling training unit 31 is further specifically configured to perform a structure search on all the substructures in the search space according to preset super-network attribute information to obtain a single-path super network; wherein the transition probability between any two adjacent substructures in the single-path super network is the largest.
As an embodiment of the application, a preset search condition corresponds to a target deployment platform; the search unit 32 is used in particular for,
searching a network frame to be trained from the trained hyper-network based on the following formula contained in the preset search condition;
max ACC_val(a), s.t. a ∈ A
Latency(a, h) ≤ LatC_h
wherein a denotes a candidate network framework and A denotes the search space containing all the candidate network frameworks (a ∈ A, A non-empty); ACC_val(a) is the validation accuracy of a; h denotes the target deployment platform; Latency(a, h) is the objective function giving the latency of a on h; and LatC_h is the latency constraint of the target deployment platform.
As an embodiment of the present application, the model training unit 33 is specifically configured to train the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain the target model.
As an embodiment of the present application, the model training unit 33 is specifically configured to identify the objective function Latency(a, h) as a convergence condition, wherein Latency(a, h) ≤ LatC_h and LatC_h is the latency constraint of the target deployment platform; and to train and optimize the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model meeting the convergence condition.
As can be seen from the above, in the scheme provided in this embodiment, a trained super network is obtained by training a super network in which a search space is constructed. Because the search space is constructed in advance inside the super network, all candidate network frameworks that need to be searched are contained in the super network, and during training all internal substructures share parameters when constructing different sub-networks; the super network therefore only needs to be trained to a certain degree before sub-networks can be sampled and their indexes evaluated. A network framework to be trained is then searched from the trained super network according to the preset search condition, and finally trained and optimized on the training sample set to obtain the target model. No structure optimization or content optimization of an original model framework is required, which simplifies the steps of model construction and improves its efficiency.
In addition, because the preset search condition corresponds to the target deployment platform, the network framework to be trained searched out of the trained super network is the optimal network path, that is, the framework that best matches the model deployment limiting conditions; training and optimizing it into the target model therefore improves the match between the target model and the computing capability of the target deployment platform.
In addition, by constructing the search space of the super network from a plurality of candidate network frameworks, the method takes width search as its starting point while also searching the positions of network down-sampling and the global depth; it is not limited to a fixed number of layers per candidate network framework, and the number of substructures of each candidate network framework can itself be searched. This improves the flexibility of network structure search and provides a basis for deploying the model on deployment platforms with different requirements.
Fig. 5 is a block diagram of a computer device according to an embodiment of the present disclosure. As shown in fig. 5, the computer device 4 of this embodiment includes: a processor 40, a memory 41, and a computer program 42, such as a program of a model building method, stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in the embodiments of the model construction methods described above, such as S11 to S13 shown in fig. 1, or S21 to S25 and S11 to S13 shown in fig. 2. Alternatively, when the processor 40 executes the computer program 42, the functions of the units in the embodiment corresponding to fig. 4 are implemented, for example the functions of units 31 to 33 or units 31 to 38 shown in fig. 4; refer to the description of the embodiment corresponding to fig. 4 for details, which are not repeated here.
Illustratively, the computer program 42 may be divided into one or more units, which are stored in the memory 41 and executed by the processor 40 to accomplish the present application. The one or more units may be a series of computer program instruction segments capable of performing certain functions, which are used to describe the execution of the computer program 42 in the computer device 4. For example, the computer program 42 may be partitioned into a sampling training unit, a search unit, and a model training unit, each unit functioning as described above.
The computer device may include, but is not limited to, a processor 40, a memory 41. Those skilled in the art will appreciate that fig. 5 is merely an example of a computer device 4 and is not intended to limit computer device 4 and may include more or fewer components than those shown, or some of the components may be combined, or different components, e.g., the computer device may also include input output devices, network access devices, buses, etc.
The Processor 40 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. The memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the computer device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the computer device 4. The memory 41 is used for storing the computer program and other programs and data required by the computer device. The memory 41 may also be used to temporarily store data that has been output or is to be output.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of model construction, comprising:
training a super network constructed with a search space to obtain a trained super network, wherein the search space contains a plurality of candidate network frameworks;
searching the trained super network for a network framework to be trained according to a preset search condition;
and training the network framework to be trained with a training sample set to obtain a target model.
2. The model building method according to claim 1, wherein before the step of training the network framework to be trained by using the training sample set to obtain the target model, the method further comprises:
acquiring a sample image set containing a human face;
intercepting a face area from each sample image in the sample image set to obtain a face image sample set;
and scaling and labeling each face image sample in the face image sample set to obtain the training sample set.
3. The model building method of claim 1, wherein a plurality of candidate network frameworks in the search space are interconnected, each of the candidate network frameworks comprising a plurality of substructures;
the training of the super network with the search space to obtain the trained super network comprises the following steps:
performing structure search based on all the substructures in the search space to determine a single-path supernet;
and carrying out sampling training on the super network according to the single-path super network to obtain the trained super network.
4. The model building method of claim 3, wherein said performing a structure search based on all of said substructures in said search space to determine a single-path supernet comprises:
performing a structure search on all the substructures in the search space according to preset super-network attribute information to obtain a single-path super network; wherein the transition probability between any two adjacent substructures in the single-path super network is the largest.
5. The model building method according to claim 1, wherein the preset search condition corresponds to a target deployment platform;
the searching from the trained super network to obtain the network frame to be trained according to the preset search condition comprises the following steps:
searching a network frame to be trained from the trained hyper-network based on the following formula contained in the preset search condition;
max ACC_val(a), s.t. a ∈ A
Latency(a, h) ≤ LatC_h
wherein a denotes a candidate network framework and A denotes the search space containing all the candidate network frameworks (a ∈ A, A non-empty); ACC_val(a) is the validation accuracy of a; h denotes the target deployment platform; Latency(a, h) is the objective function giving the latency of a on h; and LatC_h is the latency constraint of the target deployment platform.
6. The model building method according to claim 5, wherein obtaining a training sample set and training the network framework to be trained with the training sample set to obtain a target model comprises:
training the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain the target model.
7. The model building method according to claim 6, wherein training the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain the target model comprises:
identifying the objective function Latency(a, h) as a convergence condition, wherein Latency(a, h) ≤ LatC_h and LatC_h is the latency constraint of the target deployment platform;
and training and optimizing the network framework to be trained on the training sample set using back propagation and a gradient optimization method to obtain a target model meeting the convergence condition.
8. A model building apparatus, comprising:
the sampling training unit is used for training the super network constructed with the search space to obtain a trained super network; wherein the search space contains a plurality of candidate network frameworks;
the searching unit is used for searching the trained super network to obtain a network frame to be trained according to a preset searching condition;
and the model training unit is used for training the network frame to be trained by utilizing a training sample set to obtain a target model.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the model building method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the model building method according to any one of claims 1 to 7.
CN202010431405.6A 2020-05-20 2020-05-20 Model construction method, model construction device, computer apparatus, and medium Pending CN113705276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010431405.6A CN113705276A (en) 2020-05-20 2020-05-20 Model construction method, model construction device, computer apparatus, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010431405.6A CN113705276A (en) 2020-05-20 2020-05-20 Model construction method, model construction device, computer apparatus, and medium

Publications (1)

Publication Number Publication Date
CN113705276A 2021-11-26

Family

ID=78645633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010431405.6A Pending CN113705276A (en) 2020-05-20 2020-05-20 Model construction method, model construction device, computer apparatus, and medium

Country Status (1)

Country Link
CN (1) CN113705276A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017015390A1 (en) * 2015-07-20 2017-01-26 University Of Maryland, College Park Deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition
KR101713891B1 (en) * 2016-01-08 2017-03-09 (주)모자이큐 User Admittance System using Partial Face Recognition and Method therefor
CN108491812A (en) * 2018-03-29 2018-09-04 百度在线网络技术(北京)有限公司 The generation method and device of human face recognition model
CN109615073A (en) * 2018-12-03 2019-04-12 郑州云海信息技术有限公司 A kind of construction method of neural network model, equipment and storage medium
CN110782034A (en) * 2019-10-31 2020-02-11 北京小米智能科技有限公司 Neural network training method, device and storage medium
CN110956262A (en) * 2019-11-12 2020-04-03 北京小米智能科技有限公司 Hyper network training method and device, electronic equipment and storage medium
CN111047563A (en) * 2019-11-26 2020-04-21 深圳度影医疗科技有限公司 Neural network construction method applied to medical ultrasonic image

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114743041A (en) * 2022-03-09 2022-07-12 中国科学院自动化研究所 Construction method and device of pre-training model decimation frame
CN116188834A (en) * 2022-12-08 2023-05-30 赛维森(广州)医疗科技服务有限公司 Full-slice image classification method and device based on self-adaptive training model
CN116188834B (en) * 2022-12-08 2023-10-20 赛维森(广州)医疗科技服务有限公司 Full-slice image classification method and device based on self-adaptive training model
CN116051964A (en) * 2023-03-30 2023-05-02 阿里巴巴(中国)有限公司 Deep learning network determining method, image classifying method and device
CN116051964B (en) * 2023-03-30 2023-06-27 阿里巴巴(中国)有限公司 Deep learning network determining method, image classifying method and device

Similar Documents

Publication Publication Date Title
CN111819580A (en) Neural architecture search for dense image prediction tasks
CN110689038A (en) Training method and device of neural network model and medical image processing system
US11625433B2 (en) Method and apparatus for searching video segment, device, and medium
CN113705276A (en) Model construction method, model construction device, computer apparatus, and medium
US20190317965A1 (en) Methods and apparatus to facilitate generation of database queries
CN113761261A (en) Image retrieval method, image retrieval device, computer-readable medium and electronic equipment
JP2022543954A (en) KEYPOINT DETECTION METHOD, KEYPOINT DETECTION DEVICE, ELECTRONIC DEVICE, AND STORAGE MEDIUM
JP2023523029A (en) Image recognition model generation method, apparatus, computer equipment and storage medium
CN113361636B (en) Image classification method, system, medium and electronic device
US20230092619A1 (en) Image classification method and apparatus, device, storage medium, and program product
CN114329029B (en) Object retrieval method, device, equipment and computer storage medium
CN113033507B (en) Scene recognition method and device, computer equipment and storage medium
CN114925238B (en) Federal learning-based video clip retrieval method and system
CN114298997B (en) Fake picture detection method, fake picture detection device and storage medium
CN113192175A (en) Model training method and device, computer equipment and readable storage medium
CN114492601A (en) Resource classification model training method and device, electronic equipment and storage medium
CN113761282B (en) Video duplicate checking method and device, electronic equipment and storage medium
KR102435035B1 (en) The Fake News Video Detection System and Method thereby
CN116152938A (en) Method, device and equipment for training identity recognition model and transferring electronic resources
CN111063000B (en) Magnetic resonance rapid imaging method and device based on neural network structure search
CN117095460A (en) Self-supervision group behavior recognition method and system based on long-short time relation predictive coding
CN116740078A (en) Image segmentation processing method, device, equipment and medium
Lu et al. Siamese graph attention networks for robust visual object tracking
US11755671B2 (en) Projecting queries into a content item embedding space
CN117010480A (en) Model training method, device, equipment, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination