CN109858618B - Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method - Google Patents

Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method

Info

Publication number
CN109858618B
CN109858618B
Authority
CN
China
Prior art keywords
convolution
convolution kernel
convolutional neural
kernel
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910170897.5A
Other languages
Chinese (zh)
Other versions
CN109858618A (en)
Inventor
李帅
朱策
张铁
郑龙飞
高艳博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201910170897.5A priority Critical patent/CN109858618B/en
Publication of CN109858618A publication Critical patent/CN109858618A/en
Application granted granted Critical
Publication of CN109858618B publication Critical patent/CN109858618B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a convolutional neural unit block, a neural network formed from it, and an image classification method. The convolutional neural unit block comprises n convolution kernels in different directions and one m × m convolution kernel, which are stacked and then followed by a 1 × 1 convolution kernels; it also includes a skip connection with identity transform, where a equals the number of channels of the input feature map. Decomposing the convolution kernel reduces the original parameter count while preserving a large receptive field; the method adopts diagonal convolution to directly capture the correlation of the original feature map in the diagonal direction, enhancing adaptability to spatial transformations.

Description

Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method
Technical Field
The invention relates to the technical field of construction and application of a neural network, in particular to a convolutional neural unit block, a constructed neural network and an image classification method.
Background
In the field of computer vision, the convolutional neural network is a common tool. Convolutional neural networks are generally formed by stacking successive convolutional layers, activation layers, and pooling layers. A convolutional layer is composed of a number of convolutional neural units; it receives the feature map output by the previous layer, computes with its own convolutional neural units, and outputs a feature map whose channel count equals the number of convolutional neural units in the layer. An activation layer is composed of a rectified linear unit; generally one activation layer contains only one rectified linear unit and applies a nonlinear mapping to the feature map output by the previous layer. A pooling layer likewise applies a pooling mapping to the feature map output by the previous layer.
Existing neural networks generally fall into two kinds. The first is the Resnet network: Resnet is a network framework proposed in 2016, which adds identity-mapping skip connections between the originally directly stacked convolutional layers, so that the network only needs to fit the residual of the original mapping; before this, networks implemented nonlinear transformations by stacking convolutional, activation, and pooling layers. The second is the third version of Inception: in Inception v3, part of the structure decomposes a 3 × 3 convolution kernel into a stack of successive 3 × 1 and 1 × 3 convolution kernels. This part of the structure is similar to the structure of the present invention.
However, because existing convolutional neural networks have many layers, their model parameters are huge. To reduce the number of parameters, convolutional neural networks mainly adopt 3 × 3 and 1 × 1 convolution kernels, which makes the direct receptive field of a convolutional layer small. If convolution kernels with a large receptive field are adopted instead, the parameter count and the computation increase, and the adaptability to spatial transformations remains poor.
Disclosure of Invention
The invention provides a convolutional neural unit block capable of reducing the number of parameters, a neural network formed by the convolutional neural unit block and an image classification method.
The technical scheme adopted by the invention is as follows: a convolutional neural unit block comprises n convolution kernels in different directions and one m × m convolution kernel, which are stacked and then followed by a 1 × 1 convolution kernels; the block also includes a skip connection with identity transform, where a equals the number of channels of the input feature map.
Further, the convolution kernels in different directions include a b × 1 left-diagonal convolution kernel and a 1 × b right-diagonal convolution kernel; the left-diagonal convolution kernel is equivalent to a b × b convolution kernel in which all positions except the left diagonal are kept at 0, and the right-diagonal convolution kernel is equivalent to a b × b convolution kernel in which all positions except the right diagonal are kept at 0.
Further, each of the n convolution kernels in different directions includes a stack of directional convolutions in two perpendicular directions.
A neural network adopting the convolutional neural unit blocks comprises, connected in sequence: an m × m convolution kernel, c convolutional neural unit blocks, a convolution kernel with stride d, e convolutional neural unit blocks, a residual block with stride d, e convolutional neural unit blocks, a pooling layer with stride f and receptive field f × f, and a fully connected layer.
An image classification method using a neural network, comprising the steps of:
step 1: constructing a neural network;
step 2: training the neural network constructed in the step 1;
Step 3: perform data enhancement on the test-set pictures and input them into the neural network trained in step 2 to complete the classification of the pictures.
The invention has the beneficial effects that:
(1) the convolution kernel is decomposed, so that the number of original parameters is reduced while the convolution kernel is ensured to have a large receptive field;
(2) by adopting diagonal convolution, the correlation of the original feature map in the diagonal direction is obtained directly, enhancing the adaptability to spatial transformation;
(3) applied to image classification, the method achieves higher accuracy while reducing the number of parameters that must be stored.
Drawings
FIG. 1 is a diagonal convolution kernel employed in the present invention, a being a left diagonal convolution kernel and b being a right diagonal convolution kernel.
Fig. 2 is a block structure of a convolutional neural unit employed in an embodiment of the present invention.
FIG. 3 is another convolutional neural unit block structure employed in an embodiment of the present invention.
FIG. 4 is a further convolutional neural unit block structure employed in an embodiment of the present invention.
FIG. 5 is a convolutional neural unit block structure, improved from the resnet bottleneck, employed in an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments.
A convolutional neural unit block comprises n convolution kernels in different directions and one m × m convolution kernel, which are stacked and then followed by a 1 × 1 convolution kernels; the block also includes a skip connection with identity transform, where a equals the number of channels of the input feature map.
The convolution kernels in different directions include a b × 1 left-diagonal convolution kernel and a 1 × b right-diagonal convolution kernel; the left-diagonal convolution kernel is equivalent to a b × b convolution kernel in which all positions except the left diagonal are kept at 0, and the right-diagonal convolution kernel is equivalent to a b × b convolution kernel in which all positions except the right diagonal are kept at 0.
Each of the n convolution kernels in different directions comprises a stack of directional convolutions in two perpendicular directions.
This convolutional neural unit differs from a common convolutional neural unit: the receptive field of a common unit is generally square, whereas the unit proposed by the invention is a diagonal convolution unit; as shown in fig. 1, a is the left-diagonal convolution kernel and b is the right-diagonal convolution kernel. Such a kernel can extract the correlation of the previous layer's feature map in the diagonal direction. One diagonal convolutional neural unit occupies g × 1 × h parameters, where g is the size of a convolution kernel and h is the number of convolution kernels; this matches the parameter count of a normal g × 1 × h or 1 × g × h convolution kernel. In fig. 1, a is a 5 × 1 left-diagonal convolution kernel; it operates in the same manner as a normal convolution kernel and is equivalent to a 5 × 5 normal convolution kernel in which all positions except the left diagonal are kept at 0, and b is analogous.
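The diagonal kernels described above can be sketched in a few lines of numpy. This is an illustrative reconstruction, not code from the patent (the parameter vector `p` and the helper names are hypothetical); it shows that a b × b diagonal kernel stores only b free parameters, the same cost as a b × 1 or 1 × b kernel:

```python
import numpy as np

def left_diag_kernel(params):
    """Embed b free parameters on the main (left) diagonal of a b x b kernel;
    every off-diagonal position is fixed at 0, as in fig. 1(a)."""
    return np.diag(params)

def right_diag_kernel(params):
    """Embed b free parameters on the anti- (right) diagonal of a b x b kernel."""
    return np.fliplr(np.diag(params))

p = np.array([1., 2., 3., 4., 5.])   # g = 5 free parameters
L = left_diag_kernel(p)
R = right_diag_kernel(p)
# Only 5 parameters are stored although the receptive field is 5 x 5.
assert L.shape == (5, 5) and np.count_nonzero(L) == 5
assert R.shape == (5, 5) and np.count_nonzero(R) == 5
assert L[0, 0] == 1 and L[4, 4] == 5   # left diagonal: top-left to bottom-right
assert R[0, 4] == 1 and R[4, 0] == 5   # right diagonal: top-right to bottom-left
```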
FIG. 2 shows one configuration of the convolutional neural unit block of the invention, an improvement made on one block of the original resnet. The original resnet block stacks two 3 × 3 convolution kernels plus a skip connection with identity transform. In fig. 2 the first layer is instead replaced by four large-scale convolution kernels in different directions plus one 3 × 3 convolution kernel, so that correlations in different directions can be obtained, while the large-scale kernels enlarge the receptive field and gather more information. The branch outputs are then stacked together, a 1 × 1 convolution kernel compresses the dimension to obtain a feature map with the same dimension as the input, and this is added to the input. To reduce the number of convolution-kernel parameters, the number of convolution kernels in each of the 5 branches of the first layer equals one half of the number of input channels; after stacking, the number of 1 × 1 convolution kernels equals the number of channels of the input feature map, ensuring that the input and output feature maps have the same channel count and can be added directly. Since the two 3 × 3 convolution kernels of the original resnet block correspond to a 5 × 5 receptive field, the scale of the four directional kernels of the first layer is set to 5 × 1, and the remaining branch is a single 3 × 3 convolution kernel.
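As a check on the channel bookkeeping of the fig. 2 block, the numpy sketch below runs one block forward with random weights. It is an illustrative reconstruction only: the function names, the 'same'-padding choice, and the exact four directions (left-diagonal, right-diagonal, vertical, horizontal) are assumptions, not taken from the patent. With n input channels, the five half-width branches stack to 2.5n channels, the 1 × 1 convolution restores n, and the identity skip connection can then be added:

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'same'-padded 2-D correlation. x: (H, W, Cin), w: (kh, kw, Cin, Cout)."""
    kh, kw, cin, cout = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, kh - 1 - ph), (pw, kw - 1 - pw), (0, 0)))
    H, W, _ = x.shape
    out = np.zeros((H, W, cout))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.tensordot(xp[i:i + kh, j:j + kw], w,
                                     axes=([0, 1, 2], [0, 1, 2]))
    return out

def fig2_block(x, rng):
    """One forward pass of the fig. 2 unit block with random weights."""
    n = x.shape[-1]
    half = n // 2                       # each branch uses n/2 kernels
    # left- and right-diagonal 5x5 kernels: zeros off the diagonal
    ldiag = np.zeros((5, 5, n, half))
    rdiag = np.zeros((5, 5, n, half))
    for i in range(5):
        ldiag[i, i] = rng.standard_normal((n, half))
        rdiag[i, 4 - i] = rng.standard_normal((n, half))
    branches = [
        conv2d_same(x, ldiag),
        conv2d_same(x, rdiag),
        conv2d_same(x, rng.standard_normal((5, 1, n, half))),  # vertical
        conv2d_same(x, rng.standard_normal((1, 5, n, half))),  # horizontal
        conv2d_same(x, rng.standard_normal((3, 3, n, half))),  # plain 3x3
    ]
    y = np.concatenate(branches, axis=-1)                      # 2.5n channels
    y = conv2d_same(y, rng.standard_normal((1, 1, y.shape[-1], n)))  # back to n
    return x + y                                               # identity skip

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
y = fig2_block(x, rng)
assert y.shape == x.shape   # channel counts line up, so the skip add works
```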
FIG. 3 shows another convolutional neural unit block structure, proposed on the basis of FIG. 2: the last branch of the first layer is expanded into a stack of two 3 × 3 convolution kernels. This structure better extracts other forms of correlation in the feature map; the parameter settings are the same as in FIG. 2.
FIG. 4 shows another convolutional neural unit block proposed by the invention, designed to better extract correlations in the four different directions by stacking convolution kernels in different directions. The first layer splits into five branches: four branches in different directions, and one 3 × 3 convolution kernel to extract spatial correlation beyond the straight directions. Each directional branch becomes a stack of convolutions in two perpendicular directions so as to extract information from different angles: the first branch is a stack of a 1 × 3 horizontal convolution kernel and a 3 × 1 vertical convolution kernel; the second branch is a stack of a 3 × 1 vertical convolution kernel and a 1 × 3 horizontal convolution kernel; the third branch is a stack of a 1 × 3 right-diagonal convolution kernel and a 3 × 1 left-diagonal convolution kernel; the fourth branch is a stack of a 1 × 3 left-diagonal convolution kernel and a 3 × 1 right-diagonal convolution kernel. Since the first layer uses 3 × 1 and 1 × 3 convolution kernels and the second layer is a 3 × 3 convolution kernel, the same receptive field as before is ensured. In parameter count, this structure matches the convolutional neural unit shown in FIG. 2: the number of channels of the first-layer convolution kernels is set to half that of the input, and the second layer's output channel count is set to match the input feature map, making it convenient to add the input feature map to the result.
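The receptive-field claims above can be checked with the standard formula for stacked stride-1 convolutions, r = 1 + Σ(kᵢ − 1); the helper below is a generic illustration, not from the patent:

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field (along one axis) of stacked stride-1 convolutions."""
    return 1 + sum(k - 1 for k in kernel_sizes)

# Two 3-tap layers (e.g. a 3x1 branch followed by the 3x3 layer, measured
# along one axis) see 5 pixels, matching the 5x5 receptive field of the
# original two stacked 3x3 resnet kernels.
assert stacked_receptive_field([3, 3]) == 5
assert stacked_receptive_field([3, 3, 3]) == 7
```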
FIG. 5 shows an improvement on the bottleneck of the resnet structure, which originally stacks three layers of 1 × 1, 3 × 3, and 1 × 1 convolution kernels and adds a skip connection from the input directly to the output. The architecture of fig. 5 optimizes the middle layer: the 3 × 3 convolution kernel is decomposed into four directional convolution kernels plus one 3 × 3 convolution, so as to obtain feature correlations beyond the straight directions. To reduce the number of parameters, the number of convolution kernels in the first and second layers is halved overall, further lowering the parameter count.
A neural network is designed from the four neural unit blocks above; it comprises, connected in sequence: an m × m convolution kernel, c convolutional neural unit blocks, a convolution kernel with stride d, e convolutional neural unit blocks, a residual block with stride d, e convolutional neural unit blocks, a pooling layer with stride f and receptive field f × f, and a fully connected layer.
The neural network designed by the invention can be used for image classification, and comprises the following steps:
Step 1: construct a neural network; the network can be built with Python using TensorFlow or Keras;
step 2: training the neural network constructed in the step 1;
Step 3: perform data enhancement on the test-set pictures and input them into the neural network trained in step 2 to complete the classification of the pictures.
The convolutional neural unit block designed by the invention can replace any resnet block. Taking resnet32 as an example, image classification proceeds as follows:
s1: after the image is input, a feature map with 16 channels and the same size as the original image is obtained by a layer of convolution kernel with 16 channels and 3 multiplied by 3 receptive field; the parameter number is consistent with the content network parameter number, and the channel number is set to be 16 through experience.
S2: the feature map passes through 10 blocks as shown in fig. 2, yielding a feature map with 16 channels and unchanged scale.
S3: instead of a pooling layer, a convolution kernel with stride 2 achieves the same purpose as pooling; meanwhile, to retain sufficient feature information, the number of channels of the convolution kernel is doubled to 32. This step does not use the convolutional neural unit designed by the invention; it uses the same residual block as resnet32.
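The stride-2 convolution halves the spatial size just as a pooling layer would. Under the usual padding assumption for a 3 × 3 kernel (pad = 1, which the patent does not state explicitly), the output size works out as follows; the helper name is illustrative:

```python
def conv_out_size(size, k=3, stride=2, pad=1):
    """Spatial output size of a convolution: floor((size + 2*pad - k) / stride) + 1."""
    return (size + 2 * pad - k) // stride + 1

assert conv_out_size(32) == 16   # a 32x32 cifar10 map is halved to 16x16
assert conv_out_size(16) == 8    # and halved again by the next stride-2 block
```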
S4: a feature map with 32 channels is obtained by passing through 9 convolutional neural unit blocks as shown in fig. 2.
S5: a resnet residual block with stride 2 is applied, while the number of channels is doubled again to 64.
S6: a feature map with 64 channels is obtained by passing through 9 convolutional neural unit blocks as shown in fig. 2.
S7: a pooling layer with stride 8 and an 8 × 8 receptive field is applied.
S8: the fully connected layer then completes the classification prediction.
The parameter count of this convolutional neural network is reduced by about one eighteenth compared with the conventional resnet32, while the classification effect is similar to that of resnet32. For example, comparing the convolutional neural block of fig. 2 with a resnet block, and assuming the number of input channels is n, one resnet block requires 2 × 3 × 3 × n = 18n parameters; the first layer of the convolutional neural block of fig. 2 requires 4 × 5 × n × 0.5 + 3 × 3 × n × 0.5 = 14.5n parameters, and the second layer requires 1 × 1 × 2.5n = 2.5n, 17n in total. Compared with the original resnet block, the parameter count is thus reduced by one eighteenth.
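The per-input-channel accounting above can be replayed in a few lines. The function names are illustrative, and the counts follow the text's own bookkeeping (which omits the common factor of output channels from both sides of the comparison):

```python
def resnet_block_params(n):
    """Two stacked 3x3 kernels, counted per input channel as in the text."""
    return 2 * 3 * 3 * n

def fig2_block_params(n):
    first = 4 * 5 * n * 0.5 + 3 * 3 * n * 0.5  # four 5-tap directional branches + one 3x3, half width
    second = 1 * 1 * 2.5 * n                   # 1x1 kernels over the 2.5n stacked channels
    return first + second

n = 16
assert resnet_block_params(n) == 18 * n
assert fig2_block_params(n) == 17 * n   # 14.5n + 2.5n, i.e. 17/18 of the resnet block
```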
The convolutional neural network obtained by this design is used for image classification. An experiment is performed with the cifar10 data set, using stochastic gradient descent (SGD) as the training strategy: the initial learning rate is 0.1 and training runs for 250 rounds; at round 81 the learning rate is changed to 0.01, at round 121 to 0.001, and at round 181 to 0.0001. During training a momentum parameter of 0.9 is set. The loss function is the cross-entropy loss. Meanwhile, to reduce overfitting, data enhancement is applied to the cifar10 pictures, including random horizontal flipping and small horizontal and vertical translations. Finally, 93.09% accuracy is obtained on the cifar10 data set, 0.5% higher than the accuracy of resnet32, while the number of parameters that must be stored is reduced.
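The step schedule described above can be written as a simple piecewise function; this is a sketch, assuming 1-based round numbering:

```python
def learning_rate(epoch):
    """Learning rate for a given training round under the described step schedule."""
    if epoch < 81:
        return 0.1
    if epoch < 121:
        return 0.01
    if epoch < 181:
        return 0.001
    return 0.0001

assert learning_rate(1) == 0.1
assert learning_rate(81) == 0.01     # changed at round 81
assert learning_rate(121) == 0.001   # changed at round 121
assert learning_rate(250) == 0.0001  # after round 181, through round 250
```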
The diagonal convolution proposed by the invention directly obtains the correlation of the original feature map in the diagonal direction, and the neural network built from the designed convolutional neural unit blocks has improved adaptability to spatial transformations. Square convolution kernels can be decomposed into convolution kernels in different directions, reducing the number of parameters while preserving a large receptive field.

Claims (4)

1. An image classification method based on a convolutional neural unit block, characterized in that the convolutional neural unit block comprises n convolution kernels in different directions and one m × m convolution kernel, which are stacked and then followed by a 1 × 1 convolution kernels; the block also comprises a skip connection with identity transform, where a equals the number of channels of the input feature map; the convolutional neural unit block replaces the unit block in a classification network for image classification; the convolution kernels in different directions comprise a b × 1 left-diagonal convolution kernel and a 1 × b right-diagonal convolution kernel; the left-diagonal convolution kernel is equivalent to a b × b convolution kernel in which all positions except the left diagonal are kept at 0, and the right-diagonal convolution kernel is equivalent to a b × b convolution kernel in which all positions except the right diagonal are kept at 0.
2. The method of claim 1, wherein each of the n convolution kernels in different directions comprises a stack of convolutions in two perpendicular directions.
3. The image classification method based on the convolutional neural unit blocks as claimed in claim 1, wherein the neural network comprises, connected in sequence: an m × m convolution kernel, c convolutional neural unit blocks, a convolution kernel with stride d, e convolutional neural unit blocks, a residual block with stride d, e convolutional neural unit blocks, a pooling layer with stride f and receptive field f × f, and a fully connected layer.
4. The method for classifying neural network images based on convolutional neural unit blocks as claimed in claim 3, comprising the steps of:
step 1: constructing a neural network;
step 2: training the neural network constructed in the step 1;
Step 3: perform data enhancement on the test-set pictures and input them into the neural network trained in step 2 to complete the classification of the pictures.
CN201910170897.5A 2019-03-07 2019-03-07 Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method Active CN109858618B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910170897.5A CN109858618B (en) 2019-03-07 2019-03-07 Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910170897.5A CN109858618B (en) 2019-03-07 2019-03-07 Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method

Publications (2)

Publication Number Publication Date
CN109858618A CN109858618A (en) 2019-06-07
CN109858618B true CN109858618B (en) 2020-04-14

Family

ID=66900186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910170897.5A Active CN109858618B (en) 2019-03-07 2019-03-07 Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method

Country Status (1)

Country Link
CN (1) CN109858618B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796250A (en) * 2019-10-11 2020-02-14 浪潮电子信息产业股份有限公司 Convolution processing method and system applied to convolutional neural network and related components
CN112785663B (en) * 2021-03-17 2024-05-10 西北工业大学 Image classification network compression method based on convolution kernel of arbitrary shape
CN114677568B (en) * 2022-05-30 2022-08-23 山东极视角科技有限公司 Linear target detection method, module and system based on neural network

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105374007B (en) * 2015-12-02 2019-01-01 华侨大学 Merge the pencil drawing generation method and device of skeleton stroke and textural characteristics
CN108701249B (en) * 2016-01-25 2023-04-14 渊慧科技有限公司 Generating images using neural networks
CN105957059B (en) * 2016-04-20 2019-03-01 广州视源电子科技股份有限公司 Electronic component missing detection method and system
US10339445B2 (en) * 2016-10-10 2019-07-02 Gyrfalcon Technology Inc. Implementation of ResNet in a CNN based digital integrated circuit
CN106710589B (en) * 2016-12-28 2019-07-30 百度在线网络技术(北京)有限公司 Speech Feature Extraction and device based on artificial intelligence
CN107145939B (en) * 2017-06-21 2020-11-24 北京图森智途科技有限公司 Computer vision processing method and device of low-computing-capacity processing equipment
CN108520275A (en) * 2017-06-28 2018-09-11 浙江大学 A kind of regular system of link information based on adjacency matrix, figure Feature Extraction System, figure categorizing system and method
CN108062551A (en) * 2017-06-28 2018-05-22 浙江大学 A kind of figure Feature Extraction System based on adjacency matrix, figure categorizing system and method
CN108305261A (en) * 2017-08-11 2018-07-20 腾讯科技(深圳)有限公司 Picture segmentation method, apparatus, storage medium and computer equipment
CN107830846B (en) * 2017-09-30 2020-04-10 杭州艾航科技有限公司 Method for measuring angle of communication tower antenna by using unmanned aerial vehicle and convolutional neural network
CN108985443B (en) * 2018-07-04 2022-03-29 北京旷视科技有限公司 Action recognition method and neural network generation method and device thereof, and electronic equipment

Also Published As

Publication number Publication date
CN109858618A (en) 2019-06-07

Similar Documents

Publication Publication Date Title
US10769757B2 (en) Image processing apparatuses and methods, image processing systems and training methods
CN109858618B (en) Convolutional neural unit block, neural network formed by convolutional neural unit block and image classification method
Juefei-Xu et al. Local binary convolutional neural networks
CN106462724B (en) Method and system based on normalized images verification face-image
US20210192701A1 (en) Image processing method and apparatus, device, and storage medium
CN113159143B (en) Infrared and visible light image fusion method and device based on jump connection convolution layer
US11216913B2 (en) Convolutional neural network processor, image processing method and electronic device
CN110097178A (en) It is a kind of paid attention to based on entropy neural network model compression and accelerated method
CN110020639B (en) Video feature extraction method and related equipment
Ni et al. Semantic representation for visual reasoning
US20200057921A1 (en) Image classification and conversion method and device, image processor and training method therefor, and medium
CN111553246A (en) Chinese character style migration method and system based on multi-task antagonistic learning network
CN112184554A (en) Remote sensing image fusion method based on residual mixed expansion convolution
CN110874636B (en) Neural network model compression method and device and computer equipment
WO2022022001A1 (en) Method for compressing style transfer network, and style transfer method, apparatus and system
CN109886391B (en) Neural network compression method based on space forward and backward diagonal convolution
CN106991355A (en) The face identification method of the analytical type dictionary learning model kept based on topology
CN107679572A (en) A kind of image discriminating method, storage device and mobile terminal
CN113128527B (en) Image scene classification method based on converter model and convolutional neural network
CN111476835B (en) Unsupervised depth prediction method, system and device for consistency of multi-view images
CN110852367B (en) Image classification method, computer device, and storage medium
CN112307982A (en) Human behavior recognition method based on staggered attention-enhancing network
US20230316699A1 (en) Image semantic segmentation algorithm and system based on multi-channel deep weighted aggregation
Liu et al. RB-Net: Training highly accurate and efficient binary neural networks with reshaped point-wise convolution and balanced activation
CN113378721B (en) Symmetrical and local discrimination-based face correction method and system for generating countermeasure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant