CN110288030B - Image identification method, device and equipment based on lightweight network model - Google Patents
- Publication number: CN110288030B (application CN201910566189.3A)
- Authority: CN (China)
- Prior art keywords: image, layer, network, nodes, network model
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045: Neural networks; architecture; combinations of networks
- G06N3/08: Neural networks; learning methods
- Y02T10/40: Climate change mitigation technologies related to transportation; engine management systems
Abstract
The invention discloses an image recognition method, device and equipment based on a lightweight network model. The image recognition method comprises the following steps: S1, acquiring a target image to be recognized; S2, inputting the target image into the trained lightweight network model; and S3, classifying the target image by using the trained lightweight network model. The process of obtaining the lightweight network model comprises the steps of: S21, constructing a variant convolutional neural network without a fully connected layer; S22, classifying the image through a softmax classifier and updating the weights of the convolutional layers; S23, extracting the features of the image again by using the weight-updated variant convolutional neural network and normalizing the features; and S24, generating feature nodes and enhancement nodes from the normalized features according to the construction method of the broad network, determining the final numbers of feature nodes and enhancement nodes, and constructing the lightweight network model.
Description
Technical Field
The invention relates to the technical field of pattern recognition, and in particular to an image recognition method, device and equipment based on a lightweight network model, and a readable storage medium.
Background
When deep neural networks are applied to image recognition, they are difficult to analyze theoretically because they involve a large number of hyper-parameters and a complex structure; most work therefore consists of tuning parameters or stacking more layers to obtain better accuracy, so deep networks achieve high accuracy at the cost of long computation and training times. The Broad Learning System (BLS), proposed in the article "Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture", is designed on the idea of the random vector functional-link neural network (RVFLNN). Compared with a "deep" structure, the "broad" structure is very simple because there is no coupling between layers, and the BLS training process reduces the dependence on computing and storage resources. Because there are no multi-layer connections, the BLS does not need gradient descent to update weights: it finds the required connection weights through the pseudoinverse of a ridge regression matrix, improves accuracy by increasing the width of the network when the accuracy does not meet requirements, and uses an incremental learning algorithm to rapidly reconstruct the network without retraining, so its computation speed is far superior to that of deep learning. Although the BLS is markedly faster, when applied to image classification and recognition its classification accuracy is not high enough.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art by providing an image recognition method, device and equipment based on a lightweight network model, and a readable storage medium, so as to balance the size, efficiency, resource consumption and accuracy of an image classification model, achieve fast and accurate image classification, and solve the problems that deep networks depend on expensive hardware and are time-consuming to compute and train while broad networks lack accuracy.
In order to achieve the above purpose, the invention provides the following technical scheme:
an image recognition method based on a lightweight network model comprises the following steps:
s1, acquiring a target image to be identified;
s2, inputting the target image into the trained lightweight network model;
and S3, classifying the target image by using the trained lightweight network model.
Wherein the process of obtaining the lightweight network model comprises the steps of:
s21, constructing a variant convolutional neural network without a fully connected layer according to the construction method of convolutional neural networks, wherein the variant convolutional neural network comprises one or more network layers, each network layer comprising a convolutional layer and a pooling layer;
s22, passing the labeled image to be classified through the variant convolutional neural network to obtain the feature image of the image, inputting the features into a softmax classifier, classifying the image through the softmax classifier, and updating the weights of the convolutional layers through a loss function according to the classification result and the true label of the image;
s23, extracting the features of the image again by using the weight-updated variant convolutional neural network and normalizing the features;
and S24, generating feature nodes and enhancement nodes from the normalized features according to the construction method of the broad network, determining the final numbers of feature nodes and enhancement nodes, and constructing the lightweight network model.
Preferably, the convolution operation of the convolutional layer in step S21 proceeds as follows:
Suppose the original image data set is $\{X_i\}$, where $X_i$ denotes the i-th image, i = 1, 2, ..., n; l denotes the network layer index; the l-th layer convolution filter has size $k_l \times k_l$, depth $d_l$ and stride $s_l$.
The series of convolution operations applied to the i-th image $X_i$ is represented as:
$$C_i^{(l)} = W^{(l)} \ast P_i^{(l-1)} + b^{(l)}$$
where l and l-1 index the network layers, l being the current layer and l-1 the previous layer; $C_i^{(l)}$ is the image output by the convolutional layer of the l-th network layer; $P_i^{(l-1)}$ is the image output after the i-th image is processed by the pooling layer of the (l-1)-th network layer, with $P_i^{(0)} = X_i$ the original image; W is the weight of the convolutional layer and b its bias, both randomly generated, i.e. $W^{(l)}$ is the weight matrix and $b^{(l)}$ the bias matrix of the l-th convolutional layer; $\ast$ denotes the convolution operation.
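The convolution just described can be sketched in NumPy; this is an illustrative sketch only, with the single-channel valid convolution, the toy 4 × 4 input and the kernel values all being assumptions rather than part of the claimed method:

```python
import numpy as np

def conv2d_single_channel(X, W, b, s=1):
    """Slide a c x c kernel W over image X with stride s and add bias b,
    i.e. a single-channel version of C = W * P + b."""
    c = W.shape[0]
    m = X.shape[0]
    out = (m - c) // s + 1              # output size of a valid convolution
    C = np.empty((out, out))
    for g in range(out):
        for h in range(out):
            patch = X[g * s:g * s + c, h * s:h * s + c]
            C[g, h] = np.sum(patch * W) + b
    return C

# Toy example: 4x4 image, 2x2 kernel of ones, stride 2 -> 2x2 window sums
X = np.arange(16, dtype=float).reshape(4, 4)
C = conv2d_single_channel(X, np.ones((2, 2)), b=0.0, s=2)
```

With a kernel of ones, each output entry is simply the sum of the corresponding 2 × 2 window.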
Preferably, the pooling operation of the pooling layer in step S21 proceeds as follows:
A max-pooling operation is adopted, with the stride of each pooling layer being $t_l$. One channel of a feature image obtained after a convolution operation is therefore pooled as:
$$C(g, h) = \max_{0 \le p < t_l,\; 0 \le q < t_l} C^{(l)}(g \cdot t_l + p,\; h \cdot t_l + q)$$
wherein C is the feature image output after the pooling operation.
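The max-pooling operation can likewise be sketched; the non-overlapping window (window size equal to the stride $t_l$) is an assumption consistent with the formula above:

```python
import numpy as np

def max_pool(C, t):
    """Non-overlapping max pooling with window size and stride t."""
    out = C.shape[0] // t
    P = np.empty((out, out))
    for g in range(out):
        for h in range(out):
            P[g, h] = C[g * t:(g + 1) * t, h * t:(h + 1) * t].max()
    return P

C = np.arange(16, dtype=float).reshape(4, 4)
P = max_pool(C, 2)   # each 2x2 window keeps only its maximum value
```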
Preferably, the detailed process of step S22 is as follows:
The feature image data of the training samples output by the multi-layer convolution and pooling operations is denoted $C' = [U'_1, U'_2, \ldots, U'_n]^T$, where C' is the set of feature images output after multi-layer convolution and pooling, n is the number of images, and the vector $U'_i$ is the feature image extracted from the i-th image by the variant CNN, $i \in [1, n]$. The images are to be divided into K classes, and the output feature images after multi-layer convolution and pooling are fully connected to K nodes, represented as:
$$Y_y = W_Y C' + b_Y, \quad y \in [1, K];$$
where $Y_y$ is the output of the y-th node, $W_Y$ the weight of the node operation and $b_Y$ its bias.
After full connection to the K nodes, the operation result is classified with a softmax classifier, whose algorithm is:
$$S_y = \frac{e^{a_y}}{\sum_{k=1}^{K} e^{a_k}}$$
where $S_y$ is the probability that the image is classified into the y-th class, $a_y$ is the value of the y-th class and $a_k$ the value of the k-th class, $k \in [1, K]$; $a_y$ is exactly the output $Y_y$ of the y-th fully connected node.
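The softmax formula above can be sketched directly; subtracting the maximum before exponentiating is a standard numerical-stability detail not stated in the text:

```python
import numpy as np

def softmax(a):
    """S_y = exp(a_y) / sum_k exp(a_k), shifted by max(a) for stability."""
    e = np.exp(a - a.max())
    return e / e.sum()

S = softmax(np.array([2.0, 1.0, 0.1]))   # probabilities over K = 3 classes
```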
Using the cross entropy L as the loss function, with $\hat{y}_y$ the true (one-hot) label of the image, the cross entropy is expressed as: $L = -\sum_{y=1}^{K} \hat{y}_y \ln S_y$.
and updating the weight of the variant convolutional neural network through an Adam algorithm based on a cross entropy loss function.
Preferably, the detailed process of step S23 is as follows:
The features of the image are extracted again by using the weight-updated variant convolutional neural network and normalized; the normalization is:
$$U_i = \frac{U'_i - \mu_i}{\sigma_i}$$
where $U'_i$ is the feature image of the i-th image extracted by the trained, weight-updated variant convolutional neural network, $\mu_i$ is the mean of the i-th output feature image and $\sigma_i$ its standard deviation; $U_i$ is the normalized feature image of the i-th image.
The normalized feature set of the feature images of all images is represented as:
$$C'' = [U_1, U_2, \ldots, U_n]^T$$
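The per-image normalization can be sketched as follows, treating each row of the feature matrix as one flattened feature image; the small epsilon guarding against a zero standard deviation is an added assumption:

```python
import numpy as np

def normalize_features(U):
    """Standardize each row (one feature image per row): subtract its
    mean and divide by its standard deviation."""
    mu = U.mean(axis=1, keepdims=True)
    sigma = U.std(axis=1, keepdims=True)
    return (U - mu) / (sigma + 1e-8)     # epsilon avoids division by zero

U = np.array([[1.0, 2.0, 3.0],           # varying feature image
              [10.0, 10.0, 10.0]])       # constant feature image
C2 = normalize_features(U)
```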
preferably, the detailed process of step S24 is as follows:
The feature nodes of the broad network are represented as:
$$Z_r = \phi(C'' W_{e_r} + \beta_{e_r}), \quad r = 1, 2, \ldots, e$$
where $Z_r$ denotes a feature node of the broad network; $\phi$ is an arbitrary function; $C''$ is the normalized feature image set; $W_{e_r}$ are random weight coefficients of appropriate dimensions for the feature nodes; $\beta_{e_r}$ is the bias of the feature nodes; e is the number of feature nodes of the broad network.
Define $Z^E = [Z_1, Z_2, \ldots, Z_e]$. The enhancement nodes are generated from the feature nodes as
$$H_j = \zeta(Z^E W_{h_j} + \beta_{h_j}), \quad j = 1, 2, \ldots, f$$
and define $H^F = [H_1, H_2, \ldots, H_f]$, where $H_j$ denotes an enhancement node of the broad network; $\zeta$ is the activation function; $W_{h_j}$ are random weight coefficients of appropriate dimensions for the enhancement nodes; $\beta_{h_j}$ is the bias of the enhancement nodes; f is the number of enhancement nodes of the broad network.
The overall representation of the lightweight network structure is:
$$Y = [Z^E \mid H^F] W^m$$
where the output weight $W^m = [Z^E \mid H^F]^{+} Y$ is solved through the ridge-regression pseudoinverse.
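The broad-network step can be sketched end to end: random feature nodes, random enhancement nodes, and a ridge-regression readout. The tanh activations, node counts, regularization strength and toy data are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def broad_layer(C, n_feat, n_enh):
    """Build [Z^E | H^F]: random-weight feature nodes Z followed by
    random-weight enhancement nodes H computed from Z."""
    n, d = C.shape
    We, be = rng.normal(size=(d, n_feat)), rng.normal(size=n_feat)
    Z = np.tanh(C @ We + be)                  # feature nodes Z^E
    Wh, bh = rng.normal(size=(n_feat, n_enh)), rng.normal(size=n_enh)
    H = np.tanh(Z @ Wh + bh)                  # enhancement nodes H^F
    return np.hstack([Z, H])                  # A = [Z^E | H^F]

def ridge_readout(A, Y, reg=1e-2):
    """W_m = (A^T A + reg * I)^(-1) A^T Y, the ridge pseudoinverse."""
    return np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)

C = rng.normal(size=(50, 8))     # 50 toy normalized feature vectors
Y = rng.normal(size=(50, 3))     # targets for K = 3 classes
A = broad_layer(C, n_feat=10, n_enh=20)
Wm = ridge_readout(A, Y)         # output weights, no gradient descent
```

The readout is a single linear solve, which is why the broad network trains far faster than a deep network.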
an image recognition device based on a lightweight network model comprises a target image acquisition module, a target image input module, a classification recognition module and a target model acquisition module;
the target image acquisition module is used for acquiring a target image to be identified;
the target image input module is used for inputting the target image into the target lightweight network model, wherein the lightweight network model is obtained by combining a variant convolutional neural network without a fully connected layer with a broad network structure;
the classification identification module is used for classifying the target image by using the target lightweight network model to obtain an identification result;
an object model acquisition module comprising:
the model building unit is used for building a variant convolutional neural network without a fully connected layer, obtaining the feature nodes and enhancement nodes of the broad network based on the construction steps of the broad network, and building the lightweight network model;
a loss function insertion unit for inserting a loss function in the variant convolutional neural network;
and the training unit is used for training the variant convolutional neural network by utilizing a loss function in combination with the softmax classifier, updating parameters and obtaining a target model.
An image recognition apparatus based on a lightweight network model, comprising:
a memory for storing a computer program;
and a processor for implementing the steps of the above image recognition method based on the lightweight network model when executing the computer program.
A readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above image recognition method based on the lightweight network model.
Compared with the prior art, the invention has the following beneficial effects: the image features are extracted by the variant convolutional neural network and fed into a network structure built according to the construction steps of the broad network, yielding the lightweight network model. This model reduces the dependence on computer storage resources during image recognition, effectively shortens training time, improves image classification accuracy, and balances the size, efficiency, resource consumption and accuracy of the image classification model.
Description of the drawings:
fig. 1 is a diagram of a lightweight network model structure of exemplary embodiment 1 of the present invention;
fig. 2 is a flowchart of an image recognition method based on a lightweight network model according to exemplary embodiment 1 of the present invention;
fig. 3 is a detailed flowchart of step S2 of the image recognition method based on a lightweight network model according to exemplary embodiment 1 of the present invention;
fig. 4 is a characteristic effect diagram of an MNIST data set extracted by an image recognition method based on a lightweight network model in exemplary embodiment 2 of the present invention;
fig. 5 is a schematic structural diagram of an image recognition apparatus based on a lightweight network model in exemplary embodiment 3 of the present invention;
fig. 6 is a schematic structural diagram of an image recognition apparatus based on a lightweight network model in exemplary embodiment 4 of the present invention;
fig. 7 is a schematic structural diagram of an image recognition apparatus based on a lightweight network model in exemplary embodiment 4 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to test examples and specific embodiments. It should be understood that the scope of the subject matter of the present invention is not limited to the following examples; any technique realized based on the contents of the present invention falls within the scope of the present invention.
Example 1
As shown in fig. 1 to 3, the present embodiment provides an image recognition method based on a lightweight network model, which specifically includes the following steps:
s1, acquiring a target image to be identified;
s2, inputting the target image into the trained lightweight network model;
and S3, classifying the target image by using the trained lightweight network model.
Wherein the process of obtaining the lightweight network model comprises the steps of:
s21, constructing a variant convolutional neural network (CNN) without a fully connected layer according to the construction method of convolutional neural networks, wherein the variant convolutional neural network comprises one or more network layers, each network layer comprising a convolutional layer and a pooling layer;
s22, inputting the labeled image to be classified; passing the labeled image through the variant convolutional neural network to obtain the features of the image, inputting the features into a softmax classifier, classifying the image through the softmax classifier, and updating the weights of the convolutional layers through a loss function according to the classification result and the true label of the image;
s23, extracting the features of the image again by using the weight-updated variant convolutional neural network and normalizing the features;
and S24, generating a certain number of feature nodes and enhancement nodes from the normalized features according to the construction method of the broad network, determining the final numbers of feature nodes and enhancement nodes, and constructing the lightweight network model.
The convolutional neural network without a fully connected layer is used for image feature extraction: weight sharing in the convolution operation and pooling reduce the order of magnitude of the network parameters and improve the quality of the features fed into the broad network (BLS) as input data. By combining the variant convolutional neural network with the broad network, a lightweight network structure is constructed that achieves fast and accurate image classification.
LeNet, AlexNet, VGGNet, ResNet and the like are common network structure models in the convolutional neural network field; they differ in construction and suit data sets of different sizes. This embodiment selects an appropriate number of convolutional and pooling layers, convolution filter size, convolution filter stride and pooling mode according to the size of the image data set to be classified, so as to construct the variant convolutional neural network without a fully connected layer. The pooling operation has two modes: taking the maximum value and taking the average value.
The convolution operation of the convolutional layer in step S21 is as follows:
Suppose the original image data set is $\{X_i\}$, where $X_i$ denotes the i-th image, i = 1, 2, ..., n; l denotes the network layer index; the l-th layer convolution filter has size $k_l \times k_l$, depth $d_l$ and stride $s_l$.
The series of convolution operations applied to the i-th image $X_i$ is represented as:
$$C_i^{(l)} = W^{(l)} \ast P_i^{(l-1)} + b^{(l)}$$
where l and l-1 index the network layers, l being the current layer and l-1 the previous layer; $C_i^{(l)}$ is the image output by the convolutional layer of the l-th network layer; $P_i^{(l-1)}$ is the image output after the i-th image is processed by the pooling layer of the (l-1)-th network layer, with $P_i^{(0)} = X_i$ the original image; W is the weight of the convolutional layer and b its bias, both randomly generated, i.e. $W^{(l)}$ is the weight matrix and $b^{(l)}$ the bias matrix of the l-th convolutional layer; $\ast$ denotes the convolution operation.
One channel of an image undergoing the convolution operation expands as:
$$C(g', h') = \sum_{p=1}^{c} \sum_{q=1}^{c} W_{pq} \, X_{(g'-1)s+p,\;(h'-1)s+q} + b$$
where $W_{pq}$ is a weight coefficient of the weight matrix, with $p \in (1, c)$, $q \in (1, c)$ and c the size of the convolution kernel; $X_{gh}$ is the value of an image pixel input to the convolutional layer, with $g \in (1, m)$, $h \in (1, m)$ and m the dimension of the image; b is the bias of the convolutional layer; s is the stride of the convolution filter. Because an image is usually resized, enhanced and denoised before recognition, the image input to the network is usually square with resolution m × m; the images of the present application may nevertheless be of other shapes, and various substitutions, modifications and improvements made by those skilled in the relevant art without departing from the principle and scope of the present invention fall within the protection scope of the present invention.
After the convolution operation the height new_height and the width new_width of the feature matrix of the output image are both m/s. To facilitate the matrix calculation, the matrix of the image input to the convolution operation is expanded, the value of each padded pixel being recorded as 0. The number of pixels by which the input image matrix must be expanded in height is:
Pad_needed_height = (new_height - 1) * s + c - m;
The number of pixels to pad above the input matrix, pad_top, and below it, pad_bottom, are computed as:
pad_top = pad_needed_height / 2 (rounded down), pad_bottom = pad_needed_height - pad_top;
Since the image is square, the numbers of pixels to pad on the left and right of the input matrix are:
pad_left = pad_top, pad_right = pad_bottom.
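The padding arithmetic above can be checked with a short sketch; the ceiling division for strides that do not divide m evenly is an assumption extending the text's m/s case:

```python
def same_padding(m, c, s):
    """Pixel counts (top, bottom) needed so a c x c kernel with stride s
    keeps the output size at ceil(m / s); left/right are identical for a
    square image."""
    new = (m + s - 1) // s                       # output size, ceil(m / s)
    pad_needed = max((new - 1) * s + c - m, 0)   # total padding in height
    pad_top = pad_needed // 2
    pad_bottom = pad_needed - pad_top
    return pad_top, pad_bottom

top, bottom = same_padding(28, 5, 1)   # the 28x28 MNIST case, 5x5 kernel
```

For m = 28, c = 5, s = 1 this gives two padded pixels on every side, keeping the output at 28 × 28.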
the operation process of the pooling layer in the step S21 is as follows:
the pooling operation of the embodiment adopts an operation mode of taking the maximum value, and the step length of the pooling layer of the layer-I network layer is t l And the moving step length of the l-th layer network convolution filter is s l Therefore, a channel pooling operation expansion of a feature picture after a convolution operation is calculated as:
wherein, C is the characteristic image output after the pooling operation.
The detailed process of step S22 is as follows.
The feature image data of the training samples output by the multi-layer convolution and pooling operations is denoted $C' = [U'_1, U'_2, \ldots, U'_n]^T$, where C' is the set of feature images output after multi-layer convolution and pooling, n is the number of images, and the vector $U'_i$ is the set of pixel values of the feature image extracted from the i-th image by the variant CNN, $i \in [1, n]$. The images are to be divided into K classes, and the output feature images after multi-layer convolution and pooling are fully connected to K nodes, represented as:
$$Y_y = W_Y C' + b_Y, \quad y \in [1, K],$$
where $Y_y$ is the output of the y-th node, $W_Y$ the weight of the node operation and $b_Y$ its bias.
After full connection to the K nodes, the operation result is classified with a softmax classifier, whose algorithm is:
$$S_y = \frac{e^{a_y}}{\sum_{k=1}^{K} e^{a_k}},$$
where $S_y$ is the probability that the image is classified into the y-th class, $a_y$ is the value of the y-th class and $a_k$ the value of the k-th class, $k \in [1, K]$; $a_y$ is exactly the output $Y_y$ of the y-th fully connected node.
This implementation uses the cross entropy L as the loss function; with $\hat{y}_y$ the true (one-hot) label, it is expressed as:
$$L = -\sum_{y=1}^{K} \hat{y}_y \ln S_y.$$
the weight update is performed on the variant convolutional neural network by using the Adam algorithm based on the calculated value of the loss function. In the process of constructing the variant convolutional neural network, weight and bias parameters are randomly generated, so that the ideal characteristics extracted after convolutional pooling operation cannot be ensured, and in order to solve the problem, the Adam algorithm is adopted for updating the network weight. The Adam algorithm designs independent adaptive learning rates for different parameters through first moment estimation and second moment estimation of random gradients, has advantages in a non-convex optimization problem, has better optimization effect compared with other existing optimization algorithms (such as Gradient Descent, adadelta and Adagrad algorithms), and has high classification precision of network images optimized by the Adam algorithm.
The Adam algorithm for updating the weights of the variant convolutional neural network is:
$$z_{\lambda} = \beta_1 z_{\lambda-1} + (1-\beta_1) f'(\theta_{\lambda-1}),$$
$$v_{\lambda} = \beta_2 v_{\lambda-1} + (1-\beta_2) f'(\theta_{\lambda-1})^2,$$
$$\hat{z}_{\lambda} = \frac{z_{\lambda}}{1-\beta_1^{\lambda}}, \quad \hat{v}_{\lambda} = \frac{v_{\lambda}}{1-\beta_2^{\lambda}},$$
$$\theta_{\lambda} = \theta_{\lambda-1} - \frac{\alpha \, \hat{z}_{\lambda}}{\sqrt{\hat{v}_{\lambda}} + \varepsilon};$$
where $\lambda$ is the iteration number; $\alpha$ is the learning-rate hyper-parameter; $\beta_1$, $\beta_2$ are exponential decay rates for the moment estimates, controlling the decay of the moving averages; $\varepsilon$ is a smoothing term; $\theta$ is an arbitrary variable (here a network weight); z and v are moment vectors initialized to zero, and the bias-corrected $\hat{z}_{\lambda}$, $\hat{v}_{\lambda}$ are computed so that the estimates are not biased toward zero; $\theta_{\lambda}$ is the $\theta$ vector at the $\lambda$-th iteration.
In this example the parameters $\beta_1$ and $\beta_2$ take the values 0.9 and 0.999 respectively, the smoothing term $\varepsilon$ takes the value $10^{-8}$, and the learning rate $\alpha$ is fine-tuned during network training. A small batch of samples is selected for each weight update, reducing the number of training iterations as far as possible.
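One Adam update step can be sketched as below; minimizing the scalar function f(θ) = θ² is a toy assumption used only to exercise the update rule:

```python
import numpy as np

def adam_step(theta, grad, z, v, lam, alpha=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam iteration (lam starts at 1): moment estimates z, v with
    bias correction, then the parameter update."""
    z = beta1 * z + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    z_hat = z / (1 - beta1 ** lam)       # bias-corrected first moment
    v_hat = v / (1 - beta2 ** lam)       # bias-corrected second moment
    theta = theta - alpha * z_hat / (np.sqrt(v_hat) + eps)
    return theta, z, v

# Toy run: minimize f(theta) = theta**2, gradient f'(theta) = 2 * theta
theta, z, v = 1.0, 0.0, 0.0
for lam in range(1, 501):
    theta, z, v = adam_step(theta, 2 * theta, z, v, lam)
```

Because the bias-corrected ratio stays close to the sign of a consistent gradient, each step shrinks θ by roughly the learning rate α.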
The parameters are updated by a gradient descent algorithm, iterating step by step to minimize the loss function and obtain the corresponding model parameter values. By the chain rule, the derivative of the loss function with respect to the convolutional-layer weight is:
$$\frac{\partial L}{\partial W^{(l)}_{pq}} = \sum_{g,h} \frac{\partial L}{\partial C^{(l)}_{gh}} \, X^{(l)}_{g+p-1,\,h+q-1}$$
and the derivative of the loss function with respect to the convolutional-layer bias is:
$$\frac{\partial L}{\partial b^{(l)}} = \sum_{g,h} \frac{\partial L}{\partial C^{(l)}_{gh}}$$
where $W^{(l)}_{pq}$ is the weight coefficient in row p, column q of the l-th layer weight matrix, $X^{(l)}_{gh}$ is the value of the pixel in row g, column h of the image input to the l-th convolutional layer, $b^{(l)}$ is the bias of layer l, and $C^{(l)}_{gh}$ is the value of the pixel in row g, column h of the image after processing by the l-th convolutional layer.
A variant convolutional neural network without a fully connected layer is constructed according to the construction method of convolutional neural networks; it comprises convolutional and pooling layers, and weight sharing in the convolution operation together with pooling reduces the order of magnitude of the network parameters, so that image features are better extracted. The image is then classified by the softmax classifier, and the variant convolutional neural network is weight-updated by the Adam algorithm based on the value of the cross-entropy loss function. Feature extraction by the weight-updated variant convolutional neural network is more accurate.
In step S23, the features of the image are extracted again by using the weight-updated variant convolutional neural network and normalized; the normalization is:
$$U_i = \frac{U'_i - \mu_i}{\sigma_i}$$
where $U'_i$ is the set of pixel values of the feature image of the i-th image extracted by the trained, weight-updated variant CNN, $\mu_i$ is the mean of the i-th output feature image and $\sigma_i$ its standard deviation; $U_i$ is the set of normalized pixel values of the feature image of the i-th image.
The normalized feature set of the feature images of all images is represented as:
$$C'' = [U_1, U_2, \ldots, U_n]^T$$
In step S24, a certain number of feature nodes and enhancement nodes are generated from the normalized features according to the construction method of the broad network; the final numbers of feature nodes and enhancement nodes are determined by a grid search method, and the lightweight network model is constructed. The feature nodes of the broad network extract the features of the image, while the enhancement nodes increase the overall nonlinearity of the network for classification. The detailed construction process is as follows:
The feature nodes of the broad network are represented as:
$$Z_r = \phi(C'' W_{e_r} + \beta_{e_r}), \quad r = 1, 2, \ldots, e,$$
where $Z_r$ denotes a feature node of the broad network; $\phi$ is an arbitrary function; $C''$ is the normalized feature image set; $W_{e_r}$ are random weight coefficients of appropriate dimensions for the feature nodes; $\beta_{e_r}$ is the bias of the feature nodes; e is the number of feature nodes of the broad network.
Define $Z^E = [Z_1, Z_2, \ldots, Z_e]$. The enhancement nodes are generated from the feature nodes as
$$H_j = \zeta(Z^E W_{h_j} + \beta_{h_j}), \quad j = 1, 2, \ldots, f,$$
and define $H^F = [H_1, H_2, \ldots, H_f]$, where $H_j$ denotes an enhancement node of the broad network; $\zeta$ is the activation function; $W_{h_j}$ are random weight coefficients of appropriate dimensions for the enhancement nodes; $\beta_{h_j}$ is the bias of the enhancement nodes; f is the number of enhancement nodes of the broad network.
The overall representation of the lightweight network structure is:
$$Y = [Z^E \mid H^F] W^m,$$
where the output weight $W^m = [Z^E \mid H^F]^{+} Y$ is solved through the ridge-regression pseudoinverse.
Through the above steps, image features are extracted by the variant convolutional neural network (CNN) and input into the invented lightweight network model built according to the construction steps of the broad network, and image recognition is carried out rapidly. The image recognition method based on the lightweight network model reduces the dependence on computer storage resources, effectively shortens training time, improves image classification accuracy, and better balances the three requirements of efficiency, resources and accuracy.
Example 2
The image recognition method based on the lightweight network model of embodiment 1 can be widely applied in the image recognition field. In this embodiment the MNIST data set is used to test and train the network; the detailed process is as follows:
a variant convolutional neural network (CNN) is constructed to extract image features, taking the LeNet network structure as reference. The first convolutional layer, first pooling layer, second convolutional layer and second pooling layer of the variant CNN are connected in sequence; the convolution filters of both convolutional layers have size 5 × 5 and stride 1, the filter depth of the first convolutional layer is 32 and that of the second is 64. Both pooling layers use max pooling with a filter stride of 2. The result of the last pooling layer is connected to a certain number of nodes, giving a variant CNN without a fully connected layer, which is classified with a softmax classifier using the cross entropy as the loss function.
60000 images of 28 × 1 in the MNIST dataset were input as training samples for variant CNNs, denoted asThe images were processed as described in example 1 for the convolutional pooling operation as follows:
The computation of X_i through one pass of the first convolutional layer's filter expands to:

X_i^(1) = W^(1) ⊗ C_i^(0) + b^(1), with C_i^(0) = X_i;
the result of the above channel after the first pooling layer's filter becomes:

C_i^(1) = max-pooling of X_i^(1) over non-overlapping 2 × 2 blocks;
thus, the final output result for any image in the dataset is:

U′_i = C_i^(2),

i.e., the feature image produced after the second convolution (X_i^(2) = W^(2) ⊗ C_i^(1) + b^(2)) and the second pooling layer.
The last-layer pooling results of all training sample images form the feature set:

C′ = [U′_1, U′_2, …, U′_60000]^T.
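The "take the maximum over each 2 × 2 block" pooling used above can be written out directly. This is a minimal pure-Python sketch for one channel; the function name is ours, not the patent's:

```python
def max_pool(img, t=2):
    """Non-overlapping max pooling with step t over a square 2-D channel."""
    n = len(img)
    return [[max(img[i + p][j + q] for p in range(t) for q in range(t))
             for j in range(0, n, t)]
            for i in range(0, n, t)]

channel = [[1, 3, 2, 0],
           [4, 2, 1, 5],
           [6, 0, 7, 8],
           [1, 2, 3, 4]]
print(max_pool(channel))  # [[4, 5], [6, 8]]
```

Each 2 × 2 block is replaced by its maximum, halving the spatial size, which is how a 24 × 24 convolution output becomes 12 × 12.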
A variant CNN without a fully connected layer is thus constructed; classification is then performed with a softmax classifier, with cross entropy as the loss function, and the network weights are updated by the Adam algorithm.
For weight updating, 100 samples are selected each time and 1000 iterations are performed; the number of iterations can be increased or decreased according to the required accuracy, weighed against the time cost. After 1000 iterations, let the network weights be W′ and the bias be b′. The variant convolutional neural network with updated weights is then used to extract features from the images again, and the feature set of all extracted images is represented as C′ = [U′_1, U′_2, …, U′_60000]^T.
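The Adam update used for this weight refinement can be sketched on a scalar toy objective. The hyperparameters below are the common Adam defaults, which are our assumption (the patent does not state them); the toy loss (w − 3)² simply stands in for the cross entropy:

```python
def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update: exponential moving averages of the gradient and
    # squared gradient, bias correction, then the parameter step.
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (v_hat ** 0.5 + eps)
    return w, m, v

# Toy objective: minimize (w - 3)^2, gradient g = 2 (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 1001):   # 1000 iterations, as in the embodiment
    g = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, g, m, v, t)
print(w)
```

In the embodiment the same update is applied to the convolution weight and bias matrices, with the gradient taken over each mini-batch of 100 samples.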
the features extracted by the variant CNN are more scientific than simple projection, direction and gravity center. The fitting ability of the overall model can be controlled by using different convolution, pooling and the size of the final output feature vector. The dimension of the feature vector can be reduced during overfitting, the output dimension of the convolutional layer can be improved during underfitting, and compared with other feature extraction methods, the method is more flexible.
The image feature set extracted by the weight-updated variant convolutional neural network is then standardized.
Image standardization centers the data by removing the mean. According to convex optimization theory and knowledge of data probability distributions, centered data better conforms to the data distribution, so a good generalization effect after training is more easily obtained. The features U′_i of each image are standardized as U_i = (U′_i − μ_i)/σ, and the standardized feature set of all images is C″ = [U_1, U_2, …, U_n]^T.
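The standardization step is ordinary zero-mean, unit-variance scaling. The claim calls σ the "variance"; the conventional formulation divides by the standard deviation, which is what this sketch (ours, not the patent's code) does:

```python
def standardize(u):
    # Zero-mean, unit-variance scaling of one feature vector.
    mu = sum(u) / len(u)
    var = sum((x - mu) ** 2 for x in u) / len(u)
    sd = var ** 0.5
    return [(x - mu) / sd for x in u]

features = [2.0, 4.0, 6.0, 8.0]
z = standardize(features)
print(z)  # mean 0, unit variance
```

Applied to every U′_i, this yields the standardized set C″ fed to the width-network construction.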
The image features extracted by the trained variant CNN are taken as input to construct the lightweight network structure.
From the standardized features, a certain number of feature nodes and enhancement nodes are generated according to the construction method of the width network (broad learning system, BLS). Using grid search, the number of feature nodes is determined to be 10 and the number of enhancement nodes to be 11000. The feature nodes are generated from the image features as:

Z_r = φ(C″ W_{e_r} + β_{e_r}), r = 1, 2, …, 10;
define Z^E = [Z_1, Z_2, …, Z_10]; the enhancement nodes are then generated from Z^E as H_j = ζ(Z^E W_{h_j} + β_{h_j}), j = 1, 2, …, 11000, with H^F = [H_1, H_2, …, H_11000];
the overall representation based on the lightweight network structure is:

Y = [Z^E | H^F] W^f.
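A minimal forward pass of the width-network part can be sketched in pure Python. The dimensions are toy values, and the random matrices stand in for the W_{e_r}, β_{e_r}, W_{h_j}, β_{h_j} of the construction above; solving for W^f (normally done by ridge regression / pseudo-inverse in BLS) is omitted, and a random readout is used only to show the shapes:

```python
import math
import random

random.seed(0)

def matmul(A, B):
    # Plain list-of-lists matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def rand_mat(r, c):
    return [[random.uniform(-1, 1) for _ in range(c)] for _ in range(r)]

def tanh_map(A, bias):
    # Elementwise nonlinearity standing in for phi / zeta.
    return [[math.tanh(x + bias) for x in row] for row in A]

n, d, e, f = 5, 8, 3, 4                          # samples, feature dim, node counts
C2 = rand_mat(n, d)                              # standardized features C''
Z = tanh_map(matmul(C2, rand_mat(d, e)), 0.1)    # feature nodes Z^E
H = tanh_map(matmul(Z, rand_mat(e, f)), 0.1)     # enhancement nodes H^F
A = [zr + hr for zr, hr in zip(Z, H)]            # concatenation [Z^E | H^F]
Y = matmul(A, rand_mat(e + f, 10))               # readout Y = [Z^E | H^F] W^f
print(len(Y), len(Y[0]))
```

In the embodiment e = 10 and f = 11000; since only W^f is solved for, training this stage is much cheaper than back-propagating through a deep network.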
The trained model is saved and used to classify the MNIST test set images.
Fig. 4 illustrates the features extracted by the variant CNN on the MNIST dataset: specifically, the feature maps obtained by passing the handwritten digit 7 through the first convolutional layer, the first pooling layer, the second convolutional layer and the second pooling layer, respectively.
The image recognition method based on the lightweight network model provided by this embodiment better balances the three requirements of efficiency, resources and accuracy. The detailed experimental results are shown in Table 1, which lists the performance of the deep neural network (LeNet-5), the width neural network (BLS) and the lightweight network when applied to image recognition on the MNIST dataset. The experimental metrics are accuracy, training time and testing time.
Table 1

| Network model | Accuracy (%) | Training time (s) | Testing time (s) |
|---|---|---|---|
| Deep neural network (LeNet-5) | 98.96 | 598.21 | 4.92 |
| Width neural network (BLS) | 98.85 | 142.91 | 3.67 |
| Lightweight network | 99.27 | 359.22 | 4.11 |
As can be seen from the experimental data in Table 1, the image recognition method based on the lightweight network model provided by this embodiment better balances the three requirements of efficiency, resources and accuracy, and achieves a better image recognition effect.
Example 3
Corresponding to the above method embodiments, this embodiment also provides an image recognition apparatus based on a lightweight network model; the apparatus described below and the image recognition method based on a lightweight network model described above may be cross-referenced.
Referring to fig. 5, the apparatus includes the following modules: a target image acquisition module 101, a target image input module 102, a classification recognition module 103 and a target model acquisition module 104;
the target image acquiring module 101 is configured to acquire a target image to be identified;
a target image input module 102, configured to input the target image into the target lightweight network model, where the lightweight network model is obtained from a variant convolutional neural network without a fully connected layer and a width network structure;
the classification identification module 103 is used for performing classification processing on the target image by using the target lightweight network model to obtain an identification result;
an object model acquisition module 104, comprising:
the model building unit is used for building a variant convolutional neural network without a full connection layer, obtaining width network characteristic nodes and enhanced nodes based on the building steps of the width network, and building a lightweight network model;
a loss function insertion unit for inserting a loss function in the variant convolutional neural network;
and the training unit is used for training the variant convolutional neural network by utilizing a loss function in combination with the softmax classifier, updating parameters and obtaining a target model.
By applying the apparatus provided by this embodiment of the invention, the target image to be identified is acquired and input to the classification recognition module to obtain the recognition result.
The target image to be identified is acquired and input into the lightweight network model, which is obtained based on a variant convolutional neural network without a fully connected layer and a width network. That is, the lightweight network model reduces the magnitude of the network parameters through the weight sharing and pooling of convolution operations, enhancing the features of the width network (BLS) input data. The classification recognition module then classifies the target image to obtain the recognition result. Because the lightweight network model adopted by the classification recognition module is obtained based on a variant convolutional neural network without a fully connected layer and a width network, the dependence on computer storage resources is reduced, training time is effectively shortened, image classification accuracy is improved, and the three requirements of efficiency, resources and accuracy are better balanced.
In an embodiment of the invention, the loss function insertion unit is specifically configured to insert a cross-entropy loss function in the variant convolutional neural network.
Example 4
Corresponding to the above method embodiments, the present embodiment also provides a lightweight network model-based image recognition apparatus, and a lightweight network model-based image recognition apparatus described below and a lightweight network model-based image recognition method described above may be referred to in correspondence with each other.
Referring to fig. 6, the image recognition apparatus based on a lightweight network model includes:
a memory D1 for storing a computer program;
and a processor D2, configured to implement the steps of the image identification method based on the lightweight network model according to the above method embodiment when executing the computer program.
Specifically, fig. 7 shows a structural diagram of the image recognition device based on the lightweight network model provided in this embodiment. The device may vary considerably in configuration or performance and may include one or more central processing units (CPUs) 322 (e.g., one or more processors), a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing an application 342 or data 344. The memory 332 and the storage media 330 may be transient or persistent storage. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instructions operating on a data processing device. Further, the central processor 322 may be configured to communicate with the storage medium 330 and execute the series of instruction operations in the storage medium 330 on the lightweight-network-model-based image recognition device 301.
The lightweight-network-model-based image recognition device 301 may also include one or more power sources 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, for example Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps in the above-described lightweight network model-based image recognition method may be implemented by the structure of a lightweight network model-based image recognition apparatus.
Example 5
Corresponding to the above method embodiments, the present embodiment further provides a readable storage medium, and a readable storage medium described below and an image recognition method based on a lightweight network model described above may be referred to correspondingly.
A readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the image recognition method based on a lightweight network model of the above-described method embodiments.
The readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other readable storage media capable of storing program code.
The foregoing is merely a detailed description of specific embodiments of the invention and is not intended to limit the invention. Various alterations, modifications and improvements will occur to those skilled in the art without departing from the spirit and scope of the invention.
Claims (7)
1. An image recognition method based on a lightweight network model is characterized by comprising the following steps:
s1, acquiring a target image to be identified;
s2, inputting the target image into the trained lightweight network model;
s3, classifying the target images by using the trained lightweight network model;
wherein the process of obtaining the lightweight network model comprises the steps of:
s21, constructing a variant convolutional neural network without a full connection layer according to a construction mode of the convolutional neural network, wherein the variant convolutional neural network comprises one or more network layers, and the network layers comprise a convolutional layer and a pooling layer;
s22, passing the labeled images to be classified through the variant convolutional neural network to obtain feature images, inputting the features into a softmax classifier, classifying the images by the softmax classifier, and updating the weights of the convolutional layers through a loss function according to the classification result and the true values of the labeled images;
the detailed process is as follows:
the feature image data of the training samples output by the multi-layer convolution and pooling operations is represented as C′ = [U′_1, U′_2, …, U′_n]^T, wherein C′ is the feature image set output after multi-layer convolution and pooling, n is the number of images, the vector U′_i is the feature image extracted from the ith image by the variant CNN, and i ∈ [1, n]; the images are to be divided into K classes, and the output feature images after the multi-layer convolution and pooling operations are fully connected to K nodes, represented as:
Y_y = W_Y C′ + b_Y, y ∈ [1, K];

wherein Y_y is the output of the yth node, W_Y is the weight of the node operation, and b_Y is the bias of the node operation;
after full connection to the K nodes, the operation results are classified with a softmax classifier, where the softmax algorithm is:

S_y = exp(a_y) / Σ_{k=1}^{K} exp(a_k);

wherein S_y represents the probability that the image is classified into the yth class, a_y is the value of the yth class, a_k is the value of the kth class, and k ∈ [1, K]; a_y is the output Y_y of the yth node obtained by the full connection;
using the cross entropy L as the loss function, the cross entropy L is expressed as:

L = − Σ_{y=1}^{K} p_y log S_y,

where p_y is the true (labeled) probability of the yth class;
updating the weight of the variant convolutional neural network through an Adam algorithm based on a cross entropy loss function;
s23, extracting the features of the image again by adopting the weight-updated variant convolutional neural network, and carrying out standardization processing on the features;
s24, generating feature nodes and enhanced nodes by the standardized features according to a construction method of a width network, determining the number of the final feature nodes and enhanced nodes, and constructing a lightweight network model;
the detailed process is as follows:
the width network feature nodes are represented as:

Z_r = φ(C″ W_{e_r} + β_{e_r}), r = 1, 2, …, e;

wherein Z_r denotes a feature node of the width network; φ is an arbitrary function; C″ is the standardized feature image; W_{e_r} are random weight coefficients of the feature nodes with appropriate dimensions; β_{e_r} is the bias of the feature nodes; e is the number of feature nodes of the width network;
define Z^E = [Z_1, Z_2, …, Z_e]; the width network enhancement nodes are represented as:

H_j = ζ(Z^E W_{h_j} + β_{h_j}), j = 1, 2, …, f;

define H^F = [H_1, H_2, …, H_f];

wherein H_j denotes an enhancement node of the width network; ζ is the activation function; W_{h_j} are random weight coefficients of the enhancement nodes with appropriate dimensions; β_{h_j} is the bias of the enhancement nodes; f is the number of enhancement nodes of the width network;
the overall representation based on the lightweight network structure is:

Y = [Z^E | H^F] W^f;

wherein W^f represents the weight coefficients when the number of enhancement nodes is f.
2. The method for image recognition based on a lightweight network model according to claim 1, wherein the convolutional layer convolution operation process in step S21 is as follows:
suppose X = {X_1, X_2, …, X_n} is the original image data set and X_i, i = 1, 2, …, n, is the ith image; l denotes the network layer number, the size of the lth-layer convolution filter is k_l × k_l, its depth is d_l, and the convolution filter moves with step s_l;
the series of convolution operations performed on the ith image X_i is represented as:

X_i^(l) = W^(l) ⊗ C_i^(l−1) + b^(l);

wherein l and l−1 denote the layer numbers, l being the current layer and l−1 the previous layer; X_i^(l) denotes the image output by the lth convolutional layer; C_i^(l−1) denotes the image output after the ith image passes through the pooling layer of the (l−1)th network layer, and C_i^(0) is the original image X_i; W is the weight of the convolutional layer and b its bias, both randomly generated, i.e., W^(l) is the weight matrix of the lth convolutional layer and b^(l) its bias matrix; ⊗ is the convolution operation.
3. The image recognition method based on the lightweight network model according to claim 1, wherein the pooling operation of the pooling layer in the step S21 is as follows:
a max-value pooling operation is adopted, and the step of each pooling layer is t_l; therefore, the pooling of one channel of a feature image after the convolution operation expands to:

C_{u,v} = max(x_{u,v});

wherein C is the feature image output after the pooling operation; t_l denotes the step of the lth pooling layer; s_l denotes the moving step of the lth-layer convolution filter; M denotes the size of the image input to the convolutional layer, and the height and width of the output image feature matrix after the convolution operation are both (M − k_l)/s_l + 1; x_{u,v} denotes the (u, v)th t_l × t_l block of pixel values in the pooled image.
4. The method for image recognition based on a lightweight network model according to claim 1, wherein the detailed procedure of step S23 is as follows:
the features of the images are extracted again with the weight-updated variant convolutional neural network and standardized, where the detailed standardization process is:

U_i = (U′_i − μ_i) / σ;

wherein U′_i is the feature image of the ith image extracted by the trained, weight-updated variant convolutional neural network; μ_i is the mean of the feature image of the ith output image, and σ is the variance of the feature image of the ith output image; U_i is the standardized feature image of the ith image;

the standardized feature set of the feature images of all images is represented as:

C″ = [U_1, U_2, …, U_n]^T.
5. an image recognition device based on a lightweight network model is characterized by comprising a target image acquisition module, a target image input module, a classification recognition module and a target model acquisition module;
the target image acquisition module is used for acquiring a target image to be identified;
the target image input module is used for inputting the target image into the target lightweight network model, where the lightweight network model is obtained from a variant convolutional neural network without a fully connected layer and a width network structure;
the width network feature nodes are represented as:

Z_r = φ(C″ W_{e_r} + β_{e_r}), r = 1, 2, …, e;

wherein Z_r denotes a feature node of the width network; φ is an arbitrary function; C″ is the standardized feature image; W_{e_r} are random weight coefficients of the feature nodes with appropriate dimensions; β_{e_r} is the bias of the feature nodes; e is the number of feature nodes of the width network;
the classification identification module is used for classifying the target image by using the target lightweight network model to obtain an identification result;
an object model acquisition module comprising:
the model building unit is used for building a variant convolutional neural network without a full connection layer, obtaining width network characteristic nodes and enhanced nodes based on the building steps of the width network, and building a lightweight network model;
a loss function insertion unit for inserting a loss function in the variant convolutional neural network;
the training unit is used for training the variant convolutional neural network by utilizing a loss function in combination with the softmax classifier, updating parameters and obtaining a target model;
the feature image data of the training samples output by the multi-layer convolution and pooling operations is represented as C′ = [U′_1, U′_2, …, U′_n]^T, wherein C′ is the feature image set output after multi-layer convolution and pooling, n is the number of images, the vector U′_i is the feature image extracted from the ith image by the variant CNN, and i ∈ [1, n]; the images are to be divided into K classes, and the output feature images after the multi-layer convolution and pooling operations are fully connected to K nodes, represented as:
Y_y = W_Y C′ + b_Y, y ∈ [1, K];

wherein Y_y is the output of the yth node, W_Y is the weight of the node operation, and b_Y is the bias of the node operation;
after full connection to the K nodes, the operation results are classified with a softmax classifier, where the softmax algorithm is:

S_y = exp(a_y) / Σ_{k=1}^{K} exp(a_k);

wherein S_y represents the probability that the image is classified into the yth class, a_y is the value of the yth class, a_k is the value of the kth class, and k ∈ [1, K]; a_y is the output Y_y of the yth node obtained by the full connection;
using the cross entropy L as the loss function, the cross entropy L is expressed as:

L = − Σ_{y=1}^{K} p_y log S_y,

where p_y is the true (labeled) probability of the yth class;
updating the weight of the variant convolutional neural network through an Adam algorithm based on a cross entropy loss function;
the overall representation based on the lightweight network structure is:

Y = [Z^E | H^F] W^f;

wherein W^f represents the weight coefficients when the number of enhancement nodes is f.
6. An image recognition device based on a lightweight network model, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the image recognition method based on a lightweight network model according to any one of claims 1 to 4 when executing the computer program.
7. A readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the steps of the method for image recognition based on a lightweight network model according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910566189.3A CN110288030B (en) | 2019-06-27 | 2019-06-27 | Image identification method, device and equipment based on lightweight network model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110288030A CN110288030A (en) | 2019-09-27 |
CN110288030B true CN110288030B (en) | 2023-04-07 |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |