CN113159301B - Image processing method based on binarization quantization model - Google Patents


Info

Publication number: CN113159301B
Application number: CN202110569275.7A
Authority: CN (China)
Prior art keywords: layer, sub, binarization, quantization model, output end
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other versions: CN113159301A (Chinese)
Inventors: 刘启和, 但毅, 周世杰, 张准, 董婉祾, 王钰涵, 严张豹
Original and current assignee: University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority to CN202110569275.7A; publication of application CN113159301A, grant and publication of CN113159301B

Links

Images

Classifications

    • G06N 3/045: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; combinations of networks
    • G06F 18/214: Pattern recognition; design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/084: Computing arrangements based on biological models; neural networks; learning methods; backpropagation, e.g. using gradient descent


Abstract

The invention discloses an image processing method based on a binarization quantization model, belonging to the technical field of image processing and comprising the following steps: S1, preprocessing an image set to obtain the initial input data of each image; S2, constructing a binarization quantization model; S3, training the binarization quantization model on the initial input data of each image to obtain a trained binarization quantization model; S4, inputting the initial input data of an image into the trained binarization quantization model to obtain the boundaries and attributes of the objects in the image, completing the processing of the image. The invention solves the problem that an existing full-precision model cannot be stored in the very small onboard memory of an unmanned aerial vehicle.

Description

Image processing method based on binarization quantization model
Technical Field
The invention relates to the technical field of image processing, in particular to an image processing method based on a binarization quantization model.
Background
Convolutional neural networks (CNNs) are widely used in the unmanned aerial vehicle (UAV) field, especially for image processing. Researchers have proposed CNN algorithms for aerial images, for example for image classification and object tracking tasks. However, applying a full-precision CNN model in the UAV field encounters difficulties: a full-precision CNN model needs a large amount of storage space and computational resources, while the onboard memory of drones currently on the market is very small, so a trained full-precision model cannot be transplanted onto them. Moreover, the computing module carried by a drone consumes considerable power, so with current battery capacities the drone's endurance drops sharply.
Disclosure of Invention
Aiming at the above defects in the prior art, the image processing method based on a binarization quantization model provided by the invention solves the problem that the storage memory of an unmanned aerial vehicle is very small and cannot hold an existing full-precision model.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that: an image processing method based on a binarization quantization model comprises the following steps:
s1, preprocessing an image set to obtain initial input data of each image;
s2, constructing a binarization quantization model;
s3, training the binarization quantization model by adopting the initial input data of each image to obtain a trained binarization quantization model;
and S4, inputting the initial input data of one image into the trained binarization quantization model to obtain the boundaries and attributes of the objects in the image, completing the processing of the image.
Further, step S1 comprises the following sub-steps:
s11, normalizing each image in the image set to obtain a normalized image pixel value;
s12, scaling the normalized image pixel point values to [-128, 128] to obtain the initial input data of each image.
Further, the method for performing normalization processing on the image in step S11 is as follows: adjusting the mean value of all pixel points in the image to be 0 and the variance to be 1;
the formula of the normalization process is:

x̂_i = (x_i - μ) / sqrt(σ² + ε), with μ = (1/m) · Σ_{i=1}^{m} x_i and σ² = (1/m) · Σ_{i=1}^{m} (x_i - μ)²,

wherein x̂_i is the i-th normalized image pixel point value, x_i is the i-th pixel point value, m is the total number of pixel points, and ε is a parameter that prevents the denominator from being zero;
In step S12, the normalized image pixel point values are scaled to [-128, 128]. The specific method is: each normalized image pixel point value is represented by a 256-bit one-dimensional binarization vector I = (i_1, i_2, ..., i_n, ..., i_256), wherein i_1, i_2, ..., i_n, ..., i_256 are the 256 components of the binarization vector I and each component takes values in {+1, -1}.
The beneficial effects of the above further scheme are: normalization ensures that the data in all dimensions lie within a common range of variation; it is a simplified mode of calculation that transforms a dimensional expression into a dimensionless pure number, so that small values in the output data are not swallowed by large ones.

Scaling the image pixel point values to [-128, 128] serves to further convert them into a 256-dimensional vector, which facilitates processing by the first convolution layer L_1.
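As an illustrative sketch of steps S11 and S12 (assuming NumPy; the function names and the particular ±1 encoding are ours, since the disclosure only fixes the vector length and the {+1, -1} component range):

```python
import numpy as np

def normalize_image(pixels, eps=1e-8):
    """S11: zero-mean, unit-variance normalization (eps keeps the denominator nonzero)."""
    x = pixels.astype(np.float64)
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def to_binary_vector(value):
    """S12: represent one normalized value, scaled to [-128, 128], as a
    256-component vector with entries in {+1, -1}.  The sign-count encoding
    here is an illustrative assumption: the first k components are +1 and
    the rest -1, so that sum(v) / 2 recovers the value."""
    v = int(np.clip(np.rint(value), -128, 128))
    k = v + 128                              # number of +1 components, in [0, 256]
    vec = np.full(256, -1, dtype=np.int8)
    vec[:k] = 1
    return vec
```

With this encoding the sum of the 256 components lies in [-256, 256], so half of it recovers a value in [-128, 128].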
Further, the binarization quantization model comprises: the first sub-binarization quantization model, the second sub-binarization quantization model, the third sub-binarization quantization model, the fourth sub-binarization quantization model, the fifth sub-binarization quantization model, the sixth sub-binarization quantization model and the seventh sub-binarization quantization model;
the input end of the first sub-binarization quantization model is used as the input end of the binarization quantization model, the A output end of the first sub-binarization quantization model is connected with the input end of the second sub-binarization quantization model, and the B output end of the first sub-binarization quantization model is respectively connected with the output end of the sixth sub-binarization quantization model and the input end of the seventh sub-binarization quantization model; the output end A of the second sub-binarization quantization model is connected with the input end of the third sub-binarization quantization model, and the output end B of the second sub-binarization quantization model is respectively connected with the input end of the sixth sub-binarization quantization model and the output end of the fifth sub-binarization quantization model; the output end A of the third sub-binarization quantization model is connected with the input end of the fourth sub-binarization quantization model, and the output end B of the third sub-binarization quantization model is respectively connected with the input end of the fifth sub-binarization quantization model and the output end of the fourth sub-binarization quantization model; and the output end of the seventh sub-binarization quantization model is used as the output end of the binarization quantization model.
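The connection topology above can be sketched as a forward pass. This is a hedged reading of the wiring: the A outputs feed the next encoder stage, the B outputs act as skip connections merged into the decoder path; merging by addition and the sub-model call signatures are our assumptions, since the disclosure only states which ends are connected.

```python
def forward(x, sub1, sub2, sub3, sub4, sub5, sub6, sub7, merge=lambda a, b: a + b):
    # Sub-models 1-3: each returns (A output, B output).
    a1, b1 = sub1(x)
    a2, b2 = sub2(a1)
    a3, b3 = sub3(a2)
    # Sub-models 4-6: decoder stages whose inputs merge a B output
    # with the previous decoder output, per the stated connections.
    d4 = sub4(a3)
    d5 = sub5(merge(b3, d4))
    d6 = sub6(merge(b2, d5))
    # Sub-model 7 produces the final output of the whole model.
    return sub7(merge(b1, d6))
```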
Further, the first sub-binarization quantization model comprises: the first convolution layer, the first linear rectifying layer, the second convolution layer, the second linear rectifying layer, the first maximum pooling layer, the first normalization layer and the first quantization layer;
the input end of the first convolution layer is used as the input end of the first sub-binarization quantization model, and the output end of the first convolution layer is connected with the input end of the first linear rectification layer; the input end of the second convolution layer is connected with the output end of the first linear rectifying layer, and the output end of the second convolution layer is connected with the input end of the second linear rectifying layer; the output end of the second linear rectifying layer is connected with the input end of the first maximum pooling layer and is used as the output end B of the first sub-binarization quantization model; the output end of the first maximum pooling layer is connected with the input end of the first normalization layer; the input end of the first quantization layer is connected with the output end of the first normalization layer, and the output end of the first quantization layer is used as the output end A of the first sub-binarization quantization model;
the convolution kernel size of the first convolution layer is 3 × 3, and its formula is: L_1(I) = I * W_b, W_b = Binarize(W), where Binarize(x) = +1 if x ≥ 0 and -1 otherwise; W is the weight matrix to be trained, W_b is the binarized weight matrix, Binarize() is the binarization function, I is the binarization vector, L_1(I) is the output of the first convolution layer, and * is the convolution operation;
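A minimal NumPy sketch of the binarized convolution (assuming the sign convention Binarize(x) = +1 for x ≥ 0, and implementing the single-channel convolution as cross-correlation, as deep-learning frameworks compute it; helper names are ours):

```python
import numpy as np

def binarize(w):
    # Binarize(W): every weight collapses to a single bit, +1 or -1.
    return np.where(w >= 0, 1.0, -1.0)

def conv2d_binarized(x, w):
    # L1(I) = I * W_b: 'valid' single-channel convolution with a
    # binarized 3x3 kernel.
    wb = binarize(w)
    h, wid = x.shape[0] - 2, x.shape[1] - 2
    out = np.zeros((h, wid))
    for i in range(h):
        for j in range(wid):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * wb)
    return out
```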
the formula of the first linear rectifying layer is: L_2(I) = ReLU(L_1(I)), where ReLU() is the activation function, ReLU(x) = max(0, x), and L_2(I) is the output of the first linear rectifying layer;
the convolution kernel size of the second convolution layer is 3 × 3, and its formula is: L_3(I) = L_2(I) * W_b, where L_3(I) is the output of the second convolution layer;
the formula of the second linear rectifying layer is: L_4(I) = ReLU(L_3(I)), where L_4(I) is the output of the second linear rectifying layer;
the first maximum pooling layer has one 2 × 2 filter with a stride of 2, and its formula is: L_5(I) = MaxPool(L_4(I)), where MaxPool() is the maximum pooling function and L_5(I) is the output of the first maximum pooling layer;
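In NumPy the 2 × 2 / stride-2 max pooling can be sketched as follows (the helper name is ours):

```python
import numpy as np

def maxpool_2x2(x):
    # MaxPool with one 2x2 filter and stride 2: each step outputs the
    # maximum of 4 values, halving both height and width of the map.
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2 * h, :2 * w].reshape(h, 2, w, 2).max(axis=(1, 3))
```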
the formula of the first normalization layer is: L_6(I) = BN(L_5(I)), where BN() is the batch normalization function and L_6(I) is the output of the first normalization layer;
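A per-batch sketch of BN() (gamma and beta default to the identity transform here; in the trained network they are learnable parameters):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # BN(): zero-mean / unit-variance normalization of the batch,
    # followed by the learnable scale gamma and shift beta.
    return gamma * (x - x.mean()) / np.sqrt(x.var() + eps) + beta
```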
the formula of the first quantization layer is: L_7(I) = Binarize(L_6(I)), where Binarize() is the quantization function and L_7(I) is the output of the first quantization layer;
in the forward propagation of the first quantization layer: q = Sign(r), i.e. q = +1 if r ≥ 0 and q = -1 otherwise;

in the backward propagation of the first quantization layer: g_r = g_q · 1_{|r| ≤ 1}, i.e. the gradient g_q arriving at the quantization layer is passed through unchanged where |r| ≤ 1 and set to zero elsewhere (the straight-through estimator), where r is the input of the quantization layer and q its binarized output.
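The forward/backward pair above is the straight-through estimator; a NumPy sketch (the variable names r and q follow the formulas, the function names are ours):

```python
import numpy as np

def quantize_forward(r):
    # Forward pass of the quantization layer: q = Sign(r) in {+1, -1}.
    return np.where(r >= 0, 1.0, -1.0)

def quantize_backward(grad_q, r):
    # Backward pass (straight-through estimator): the incoming gradient
    # passes through unchanged where |r| <= 1 and is cut off elsewhere.
    return grad_q * (np.abs(r) <= 1.0)
```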
further, the second sub-binarization quantization model comprises: a third convolution layer, a third linear rectifying layer, a fourth convolution layer, a fourth linear rectifying layer, a second maximum pooling layer, a second normalization layer and a second quantization layer;
the input end of the third convolution layer is used as the input end of the second sub-binarization quantization model, and the output end of the third convolution layer is connected with the input end of the third linear rectification layer; the input end of the fourth convolution layer is connected with the output end of the third linear rectifying layer, and the output end of the fourth convolution layer is connected with the input end of the fourth linear rectifying layer; the output end of the fourth linear rectifying layer is connected with the input end of the second maximum pooling layer and is used as the B output end of the second sub-binarization quantization model; the output end of the second maximum pooling layer is connected with the input end of the second normalization layer; the input end of the second quantization layer is connected with the output end of the second normalization layer, and the output end of the second quantization layer is used as the output end A of the second sub-binarization quantization model;
the third convolution layer has the same structure as the first convolution layer, the third linear rectifying layer has the same structure as the first linear rectifying layer, the fourth convolution layer has the same structure as the second convolution layer, the fourth linear rectifying layer has the same structure as the second linear rectifying layer, the second largest pooling layer has the same structure as the first largest pooling layer, the second normalizing layer has the same structure as the first normalizing layer, and the second quantifying layer has the same structure as the first quantifying layer.
Further, the third sub-binarization quantization model comprises: a fifth convolution layer, a fifth linear rectifying layer, a sixth convolution layer, a sixth linear rectifying layer, a third maximum pooling layer, a third normalization layer and a third quantization layer;
the input end of the fifth convolution layer is used as the input end of the third sub-binarization quantization model, and the output end of the fifth convolution layer is connected with the input end of the fifth linear rectification layer; the input end of the sixth convolution layer is connected with the output end of the fifth linear rectifying layer, and the output end of the sixth convolution layer is connected with the input end of the sixth linear rectifying layer; the output end of the sixth linear rectifying layer is connected with the input end of the third maximum pooling layer and is used as the B output end of the third sub-binarization quantization model; the output end of the third largest pooling layer is connected with the input end of the third normalizing layer; the input end of the third quantization layer is connected with the output end of the third normalization layer, and the output end of the third quantization layer is used as the output end A of the third sub-binarization quantization model;
the fifth convolution layer has the same structure as the first convolution layer, the fifth linear rectifying layer has the same structure as the first linear rectifying layer, the sixth convolution layer has the same structure as the second convolution layer, the sixth linear rectifying layer has the same structure as the second linear rectifying layer, the third maximum pooling layer has the same structure as the first maximum pooling layer, the third normalization layer has the same structure as the first normalization layer, and the third quantization layer has the same structure as the first quantization layer.
Further, the fourth sub-binarization quantization model comprises a seventh convolution layer, a seventh linear rectification layer, an eighth convolution layer, an eighth linear rectification layer and a first anti-pooling layer which are connected in sequence; the input end of the seventh convolution layer is used as the input end of a fourth sub-binarization quantization model; the output end of the first anti-pooling layer is used as the output end of a fourth sub-binarization quantization model;
the fifth sub-binarization quantization model comprises a ninth convolution layer, a ninth linear rectification layer, a tenth convolution layer, a tenth linear rectification layer and a second anti-pooling layer which are connected in sequence; the input end of the ninth convolution layer is used as the input end of a fifth sub-binarization quantization model; the output end of the second anti-pooling layer is used as the output end of a fifth sub-binarization quantization model;
the sixth sub-binarization quantization model comprises an eleventh convolution layer, an eleventh linear rectification layer, a twelfth convolution layer, a twelfth linear rectification layer and a third anti-pooling layer which are connected in sequence; the input end of the eleventh convolution layer is used as the input end of a sixth sub-binarization quantization model; the output end of the third inverse pooling layer is used as the output end of a sixth sub-binarization quantization model;
the seventh sub-binarization quantization model comprises a thirteenth convolution layer, a thirteenth linear rectification layer, a fourteenth convolution layer, a fourteenth linear rectification layer and a fifteenth convolution layer which are connected in sequence; the input end of the thirteenth convolution layer is used as the input end of a seventh sub-binarization quantization model; an output end of the fifteenth convolution layer serves as an output end of a seventh sub-binarization quantization model;
the seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth and fourteenth convolutional layers have the same structure as the first convolutional layer;
the operation formula of the fifteenth convolution layer is the same as that of the first convolution layer, but its convolution kernel size is 1 × 1;
the seventh linear rectifying layer, the eighth linear rectifying layer, the ninth linear rectifying layer, the tenth linear rectifying layer, the eleventh linear rectifying layer, the twelfth linear rectifying layer, the thirteenth linear rectifying layer and the fourteenth linear rectifying layer have the same structure as the first linear rectifying layer;
the first anti-pooling layer, the second anti-pooling layer and the third anti-pooling layer have the same structure and all adopt an anti-pooling function UnPool ().
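The disclosure names the function UnPool() without defining it; a common choice, assumed here, is nearest-neighbour expansion, the simplest inverse of 2 × 2 max pooling:

```python
import numpy as np

def unpool_2x2(x):
    # UnPool(): doubles height and width by copying each value into a
    # 2x2 block (nearest-neighbour expansion; an assumption, since the
    # disclosure does not define the operator).
    return np.kron(x, np.ones((2, 2)))
```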
The beneficial effects of the above further scheme are: features are extracted through multiple convolution layers, and the final convolution layer uses a 1 × 1 convolution kernel to compress the image dimensions; the quantization layers compress the byte size of the parameters; the rectifying layers make the output no longer linearly related to the input; and the pooling layers reduce the number of parameters to 1/4 of the original, further compressing the model.
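The compression claims above can be checked with back-of-envelope arithmetic (the 64-channel layer size below is an illustrative assumption, not a figure from the disclosure):

```python
# 2x2 / stride-2 pooling keeps one of every four activations.
h, w = 256, 256
pooled = (h // 2) * (w // 2)
reduction = pooled / (h * w)          # fraction of activations kept (1/4)

# Binarizing a 32-bit float weight to 1 bit divides its storage by 32.
weights = 3 * 3 * 64 * 64             # an illustrative 3x3, 64->64 conv layer
full_precision_bytes = weights * 32 // 8
binarized_bytes = weights * 1 // 8
compression = full_precision_bytes // binarized_bytes
```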
Further, step S3 comprises the following sub-steps:
s31, constructing a loss function;
s32, randomly inputting the initial input data of one image into the binarization quantization model, and calculating the gradient by means of the loss function;
s33, updating the binarized weight matrix and the learning rate in the binarization quantization model according to the gradient;
and S34, repeatedly executing the step S32 to the step S33 until the weight matrix and the learning rate after binarization are optimal, and obtaining the trained binarization quantization model.
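Steps S31 to S34 can be sketched as a toy training loop (illustrative throughout: a single scalar weight stands in for the binarized weight matrix, and the squared-error loss and the multiplicative learning-rate decay are our stand-ins for the formulas of the disclosure):

```python
import random

def train(samples, steps=1000, lr=0.1, decay=0.999):
    w = 0.0
    for _ in range(steps):
        x, y_true = random.choice(samples)   # S32: pick one image at random
        y_pred = w * x
        grad = 2.0 * (y_pred - y_true) * x   # gradient of the stand-in loss
        w -= lr * grad                       # S33: update the weight...
        lr *= decay                          # ...and shrink the learning rate
    return w                                 # S34: repeat until converged
```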
Further, the loss function in step S31 is:

Loss = -Σ [y_true · ln(y_pred) + (1 - y_true) · ln(1 - y_pred)],

wherein Loss is the loss function, y_true is the output of the binarization quantization model during training, and y_pred is the predicted output;
the formulas for updating the binarized weight matrix and the learning rate in step S33 are:

W_b^t = W_b^(t-1) - η_(t-1) · ∂Loss/∂W_b^(t-1),

η_t = η_(t-1) · β_1^(t-1),

wherein W_b^t is the binarized weight matrix obtained in the t-th training pass, W_b^(t-1) is the binarized weight matrix obtained in the (t-1)-th training pass, η is a hyper-parameter, W_b is the binarized weight matrix, η_t is the learning rate of the t-th training pass, η_(t-1) is the learning rate of the (t-1)-th training pass, and β_1^(t-1) is the exponential decay rate estimated for the first moment in the (t-1)-th training pass.
The beneficial effects of the above further scheme are: the gradient of the loss function with respect to the weights of the last layer is no longer related to the derivative of the activation function, but only proportional to the difference between the output value and the true value, so convergence is faster; and since back-propagation proceeds by successive multiplication, the updating of the whole weight matrix and of the learning rate is accelerated.

The binarized-weight and learning-rate update formulas let the learning rate be updated along with the weight values, so that the learning rate keeps decreasing and parameter convergence is accelerated.
In conclusion, the beneficial effects of the invention are as follows: the invention designs a lightweight image-processing model that is small, occupies little memory, retains high image-processing precision and increases calculation speed while reducing storage space. By changing the hierarchical structure of the model and modifying the quantization function, the compressed model can be smoothly transplanted onto the computing component of an unmanned aerial vehicle.
Drawings
FIG. 1 is a flow chart of an image processing method based on a binarization quantization model;
fig. 2 is a schematic structural diagram of a binarization quantization model.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art, but it should be understood that the invention is not limited to the scope of the embodiments; to those skilled in the art, various changes are apparent as long as they fall within the spirit and scope of the invention as defined and limited by the appended claims, and all inventions and creations making use of the inventive concept are under protection.
As shown in fig. 1, an image processing method based on a binarization quantization model includes the following steps:
s1, preprocessing an image set to obtain initial input data of each image;
step S1 includes the following substeps:
s11, normalizing each image in the image set to obtain a normalized image pixel value;
the method for normalizing the image in the step S11 is as follows: adjusting the mean value of all pixel points in the image to be 0 and the variance to be 1;
the formula of the normalization process is:

x̂_i = (x_i - μ) / sqrt(σ² + ε), with μ = (1/m) · Σ_{i=1}^{m} x_i and σ² = (1/m) · Σ_{i=1}^{m} (x_i - μ)²,

wherein x̂_i is the i-th normalized image pixel point value, x_i is the i-th pixel point value, m is the total number of pixel points, and ε is a parameter that prevents the denominator from being zero;
s12, the normalized image pixel point values are scaled to [-128, 128] to obtain the initial input data of each image.
In step S12, the normalized image pixel point values are scaled to [-128, 128]. The specific method is: each normalized image pixel point value is represented by a 256-bit one-dimensional binarization vector I = (i_1, i_2, ..., i_n, ..., i_256), wherein i_1, i_2, ..., i_n, ..., i_256 are the 256 components of the binarization vector I, each component takes values in {+1, -1}, and the relationship between the normalized image pixel point value x̂ and the binarization vector I is:

x̂ = (1/2) · Σ_{n=1}^{256} i_n

(the sum of the 256 components lies in [-256, 256], so half of it lies in [-128, 128]).
s2, constructing a binarization quantization model;
As shown in fig. 2, the specific structure of the binarization quantization model includes: a first sub-binarization quantization model, a second sub-binarization quantization model, a third sub-binarization quantization model, a fourth sub-binarization quantization model, a fifth sub-binarization quantization model, a sixth sub-binarization quantization model and a seventh sub-binarization quantization model;
the input end of the first sub-binarization quantization model is used as the input end of the binarization quantization model, the A output end of the first sub-binarization quantization model is connected with the input end of the second sub-binarization quantization model, and the B output end of the first sub-binarization quantization model is respectively connected with the output end of the sixth sub-binarization quantization model and the input end of the seventh sub-binarization quantization model; the output end A of the second sub-binarization quantization model is connected with the input end of the third sub-binarization quantization model, and the output end B of the second sub-binarization quantization model is respectively connected with the input end of the sixth sub-binarization quantization model and the output end of the fifth sub-binarization quantization model; the output end A of the third sub-binarization quantization model is connected with the input end of the fourth sub-binarization quantization model, and the output end B of the third sub-binarization quantization model is respectively connected with the input end of the fifth sub-binarization quantization model and the output end of the fourth sub-binarization quantization model; and the output end of the seventh sub-binarization quantization model is used as the output end of the binarization quantization model.
The input of the fifth sub-binarization quantization model is the B output of the third sub-binarization quantization model together with the output of the fourth sub-binarization quantization model.
The input of the sixth sub-binarization quantization model is the output of the fifth sub-binarization quantization model together with the B output of the second sub-binarization quantization model.
The input of the seventh sub-binarization quantization model is the output of the sixth sub-binarization quantization model together with the B output of the first sub-binarization quantization model.
The first sub-binarization quantization model comprises: the first convolution layer, the first linear rectifying layer, the second convolution layer, the second linear rectifying layer, the first maximum pooling layer, the first normalization layer and the first quantization layer;
the input end of the first convolution layer is used as the input end of the first sub-binarization quantization model, and the output end of the first convolution layer is connected with the input end of the first linear rectification layer; the input end of the second convolution layer is connected with the output end of the first linear rectifying layer, and the output end of the second convolution layer is connected with the input end of the second linear rectifying layer; the output end of the second linear rectifying layer is connected with the input end of the first maximum pooling layer and is used as the output end B of the first sub-binarization quantization model; the output end of the first maximum pooling layer is connected with the input end of the first normalization layer; the input end of the first quantization layer is connected with the output end of the first normalization layer, and the output end of the first quantization layer is used as the output end A of the first sub-binarization quantization model;
the convolution kernel size of the first convolution layer is 3 × 3, and its formula is: L_1(I) = I * W_b, W_b = Binarize(W), where Binarize(x) = +1 if x ≥ 0 and -1 otherwise; W is the weight matrix to be trained, W_b is the binarized weight matrix, Binarize() is the binarization function, I is the binarization vector, L_1(I) is the output of the first convolution layer, and * is the convolution operation;
the formula of the first linear rectification layer is: L_2(I) = ReLU(L_1(I)), where ReLU() is the activation function, ReLU(x) = max(0, x), and L_2(I) is the output of the first linear rectification layer;
the convolution kernel size of the second convolution layer is 3 × 3, and its formula is: L_3(I) = L_2(I) * W_b, where L_3(I) is the output of the second convolution layer;
the formula of the second linear rectification layer is: L_4(I) = ReLU(L_3(I)), where L_4(I) is the output of the second linear rectification layer;
the first maximum pooling layer has one 2 × 2 filter with a stride of 2, and its formula is: L_5(I) = MaxPool(L_4(I)), where MaxPool() is the maximum pooling function and L_5(I) is the output of the first maximum pooling layer; because the layer has one 2 × 2 filter with stride 2, each move of the filter outputs the maximum of 4 values, so the final result has half the length and half the width of L_4(I);
the formula of the first normalization layer is: L_6(I) = BN(L_5(I)), where BN() is the batch normalization function and L_6(I) is the output of the first normalization layer;
the formula of the first quantization layer is: L_7(I) = Binarize(L_6(I)), where Binarize() is the quantization function and L_7(I) is the output of the first quantization layer;
in the forward propagation of the first quantization layer: q = Sign(r), i.e. q = +1 if r ≥ 0 and q = -1 otherwise;
in the backward propagation of the first quantization layer: g_r = g_q · 1_{|r| ≤ 1}, i.e. the gradient g_q is passed through unchanged where |r| ≤ 1 and set to zero elsewhere (the straight-through estimator), where r is the input of the quantization layer and q its binarized output.
the second sub-binarization quantization model comprises: a third convolution layer, a third linear rectifying layer, a fourth convolution layer, a fourth linear rectifying layer, a second maximum pooling layer, a second normalization layer and a second quantization layer;
the input end of the third convolution layer is used as the input end of the second sub-binarization quantization model, and the output end of the third convolution layer is connected with the input end of the third linear rectification layer; the input end of the fourth convolution layer is connected with the output end of the third linear rectifying layer, and the output end of the fourth convolution layer is connected with the input end of the fourth linear rectifying layer; the output end of the fourth linear rectifying layer is connected with the input end of the second maximum pooling layer and is used as the B output end of the second sub-binarization quantization model; the output end of the second maximum pooling layer is connected with the input end of the second normalization layer; the input end of the second quantization layer is connected with the output end of the second normalization layer, and the output end of the second quantization layer is used as the output end A of the second sub-binarization quantization model;
the third convolution layer has the same structure as the first convolution layer, the third linear rectifying layer has the same structure as the first linear rectifying layer, the fourth convolution layer has the same structure as the second convolution layer, the fourth linear rectifying layer has the same structure as the second linear rectifying layer, the second maximum pooling layer has the same structure as the first maximum pooling layer, the second normalization layer has the same structure as the first normalization layer, and the second quantization layer has the same structure as the first quantization layer.
The third sub-binarization quantization model comprises: a fifth convolution layer, a fifth linear rectifying layer, a sixth convolution layer, a sixth linear rectifying layer, a third maximum pooling layer, a third normalization layer and a third quantization layer;
the input end of the fifth convolution layer is used as the input end of the third sub-binarization quantization model, and the output end of the fifth convolution layer is connected with the input end of the fifth linear rectification layer; the input end of the sixth convolution layer is connected with the output end of the fifth linear rectifying layer, and the output end of the sixth convolution layer is connected with the input end of the sixth linear rectifying layer; the output end of the sixth linear rectifying layer is connected with the input end of the third maximum pooling layer and is used as the B output end of the third sub-binarization quantization model; the output end of the third largest pooling layer is connected with the input end of the third normalizing layer; the input end of the third quantization layer is connected with the output end of the third normalization layer, and the output end of the third quantization layer is used as the output end A of the third sub-binarization quantization model;
the fifth convolution layer has the same structure as the first convolution layer, the fifth linear rectifying layer has the same structure as the first linear rectifying layer, the sixth convolution layer has the same structure as the second convolution layer, the sixth linear rectifying layer has the same structure as the second linear rectifying layer, the third maximum pooling layer has the same structure as the first maximum pooling layer, the third normalization layer has the same structure as the first normalization layer, and the third quantization layer has the same structure as the first quantization layer.
The fourth sub-binarization quantization model comprises a seventh convolution layer, a seventh linear rectification layer, an eighth convolution layer, an eighth linear rectification layer and a first anti-pooling layer which are connected in sequence; the input end of the seventh convolution layer is used as the input end of a fourth sub-binarization quantization model; the output end of the first anti-pooling layer is used as the output end of a fourth sub-binarization quantization model;
the fifth sub-binarization quantization model comprises a ninth convolution layer, a ninth linear rectification layer, a tenth convolution layer, a tenth linear rectification layer and a second anti-pooling layer which are connected in sequence; the input end of the ninth convolution layer is used as the input end of a fifth sub-binarization quantization model; the output end of the second anti-pooling layer is used as the output end of a fifth sub-binarization quantization model;
the sixth sub-binarization quantization model comprises an eleventh convolution layer, an eleventh linear rectification layer, a twelfth convolution layer, a twelfth linear rectification layer and a third anti-pooling layer which are connected in sequence; the input end of the eleventh convolution layer is used as the input end of a sixth sub-binarization quantization model; the output end of the third inverse pooling layer is used as the output end of a sixth sub-binarization quantization model;
the seventh sub-binarization quantization model comprises a thirteenth convolution layer, a thirteenth linear rectification layer, a fourteenth convolution layer, a fourteenth linear rectification layer and a fifteenth convolution layer which are connected in sequence; the input end of the thirteenth convolution layer is used as the input end of a seventh sub-binarization quantization model; an output end of the fifteenth convolution layer serves as an output end of a seventh sub-binarization quantization model;
the seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth and fourteenth convolutional layers have the same structure as the first convolutional layer;
the operation formula of the fifteenth convolutional layer is the same as that of the first convolutional layer, and its convolution kernel size is 1 × 1;
the seventh linear rectifying layer, the eighth linear rectifying layer, the ninth linear rectifying layer, the tenth linear rectifying layer, the eleventh linear rectifying layer, the twelfth linear rectifying layer, the thirteenth linear rectifying layer and the fourteenth linear rectifying layer have the same structure as the first linear rectifying layer;
the first anti-pooling layer, the second anti-pooling layer and the third anti-pooling layer have the same structure and adopt an anti-pooling function UnPool ().
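The patent names the anti-pooling (unpooling) function UnPool() but does not define it. One common realization, shown here purely as an assumption, is nearest-neighbour upsampling that doubles the height and width, mirroring the 2 × 2 / stride-2 max pooling on the encoder side:

```python
import numpy as np

def unpool_2x2(x):
    """A minimal UnPool() sketch: nearest-neighbour upsampling that
    doubles the height and width, mirroring 2 x 2 / stride-2 pooling.
    (Other realizations, e.g. index-based max unpooling, also exist.)"""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

y = unpool_2x2(np.array([[1.0, 2.0], [3.0, 4.0]]))
# y is 4 x 4: each value is expanded into a 2 x 2 block
```

Whichever variant the patent intends, the role in the decoder models (fourth to sixth sub-models) is the same: restore the spatial resolution lost by the corresponding pooling layer.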
S3, training the binarization quantization model by adopting the initial input data of each image to obtain a trained binarization quantization model;
step S3 comprises the following substeps:
s31, constructing a loss function;
the loss function in step S31 is:
[equation image: Figure BDA0003081999770000141]
wherein
[equation image: Figure BDA0003081999770000142]
is the loss function, y_true is the output of the binarization quantization model in the training process, and y_pred is the predicted output;
s32, randomly inputting initial input data of an image to the binarization quantization model, and calculating a gradient by adopting a loss function;
the formulas for updating the binarized weight matrix and the learning rate in step S32 are:
[equation image: Figure BDA0003081999770000143]
[equation image: Figure BDA0003081999770000144]
wherein W_b^t is the binarized weight matrix obtained in the t-th training iteration, W_b^(t-1) is the binarized weight matrix obtained in the (t-1)-th training iteration, η is a hyper-parameter, W_b is the binarized weight matrix, η_t is the learning rate of the t-th training iteration, η_(t-1) is the learning rate of the (t-1)-th training iteration, and
[equation image: Figure BDA0003081999770000145]
is the exponential decay rate of the first-moment estimate in the (t-1)-th training iteration.
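The exact update equations are given only as images in the source, so the following is a hedged sketch of a typical binarized-network training step in the BinaryConnect style (real-valued weights updated and clipped, then re-binarized, with a geometrically decaying learning rate); the patent's own update, which references a first-moment decay rate and so may be Adam-like, could differ in detail:

```python
import numpy as np

def train_step(w_real, grad, lr, decay=0.99):
    """One hedged training step (assumption, BinaryConnect-style):
    update the real-valued weights with the gradient, clip to [-1, 1]
    so they stay in the binarizable range, re-binarize, and decay the
    learning rate geometrically."""
    w_real = np.clip(w_real - lr * grad, -1.0, 1.0)
    w_bin = np.where(w_real >= 0, 1.0, -1.0)   # W_b = Binarize(W)
    return w_real, w_bin, lr * decay

w = np.array([0.2, -0.5, 0.9])
g = np.array([1.0, -1.0, 1.0])
w, wb, lr = train_step(w, g, lr=0.1)
```

The key point steps S32–S34 rely on is that gradients flow into the real-valued weights while only the binarized copy W_b is used in the forward pass.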
S33, updating the weight matrix and the learning rate after binarization in the binarization quantization model according to the gradient;
s34, repeatedly executing the step S32 to the step S33 until the weight matrix and the learning rate after binarization are optimal, and obtaining a trained binarization quantization model;
and S4, inputting the initial input data of one image into the trained binary quantization model to obtain the boundary and the attribute of the object in the image, and finishing the processing of the image.

Claims (7)

1. An image processing method based on a binarization quantization model is characterized by comprising the following steps:
s1, preprocessing an image set to obtain initial input data of each image;
s2, constructing a binarization quantization model;
s3, training the binarization quantization model by adopting the initial input data of each image to obtain a trained binarization quantization model;
s4, inputting initial input data of an image into the trained binary quantization model to obtain the boundary and the attribute of an object in the image, and finishing the processing of the image;
the binarization quantization model in the step S2 comprises: the first sub-binarization quantization model, the second sub-binarization quantization model, the third sub-binarization quantization model, the fourth sub-binarization quantization model, the fifth sub-binarization quantization model, the sixth sub-binarization quantization model and the seventh sub-binarization quantization model;
the input end of the first sub-binarization quantization model is used as the input end of the binarization quantization model, the A output end of the first sub-binarization quantization model is connected with the input end of the second sub-binarization quantization model, and the B output end of the first sub-binarization quantization model is respectively connected with the output end of the sixth sub-binarization quantization model and the input end of the seventh sub-binarization quantization model; the output end A of the second sub-binarization quantization model is connected with the input end of the third sub-binarization quantization model, and the output end B of the second sub-binarization quantization model is respectively connected with the input end of the sixth sub-binarization quantization model and the output end of the fifth sub-binarization quantization model; the output end A of the third sub-binarization quantization model is connected with the input end of the fourth sub-binarization quantization model, and the output end B of the third sub-binarization quantization model is respectively connected with the input end of the fifth sub-binarization quantization model and the output end of the fourth sub-binarization quantization model; the output end of the seventh sub-binarization quantization model is used as the output end of the binarization quantization model;
the first sub-binarization quantization model comprises: the device comprises a first convolution layer, a first linear rectification layer, a second convolution layer, a second linear rectification layer, a first maximum pooling layer, a first normalization layer and a first quantification layer;
the input end of the first convolution layer is used as the input end of the first sub-binarization quantization model, and the output end of the first convolution layer is connected with the input end of the first linear rectification layer; the input end of the second convolution layer is connected with the output end of the first linear rectifying layer, and the output end of the second convolution layer is connected with the input end of the second linear rectifying layer; the output end of the second linear rectifying layer is connected with the input end of the first maximum pooling layer and is used as the output end B of the first sub-binarization quantization model; the output end of the first maximum pooling layer is connected with the input end of the first normalization layer; the input end of the first quantization layer is connected with the output end of the first normalization layer, and the output end of the first quantization layer is used as the output end A of the first sub-binarization quantization model;
the second sub-binarization quantization model comprises: a third convolution layer, a third linear rectifying layer, a fourth convolution layer, a fourth linear rectifying layer, a second maximum pooling layer, a second normalization layer and a second quantization layer;
the input end of the third convolution layer is used as the input end of the second sub-binarization quantization model, and the output end of the third convolution layer is connected with the input end of the third linear rectification layer; the input end of the fourth convolution layer is connected with the output end of the third linear rectifying layer, and the output end of the fourth convolution layer is connected with the input end of the fourth linear rectifying layer; the output end of the fourth linear rectifying layer is connected with the input end of the second maximum pooling layer and is used as the B output end of the second sub-binarization quantization model; the output end of the second maximum pooling layer is connected with the input end of the second normalization layer; the input end of the second quantization layer is connected with the output end of the second normalization layer, and the output end of the second quantization layer is used as the output end A of the second sub-binarization quantization model;
the third convolution layer has the same structure as the first convolution layer, the third linear rectifying layer has the same structure as the first linear rectifying layer, the fourth convolution layer has the same structure as the second convolution layer, the fourth linear rectifying layer has the same structure as the second linear rectifying layer, the second maximum pooling layer has the same structure as the first maximum pooling layer, the second normalization layer has the same structure as the first normalization layer, and the second quantization layer has the same structure as the first quantization layer;
the third sub-binarization quantization model comprises: a fifth convolution layer, a fifth linear rectifying layer, a sixth convolution layer, a sixth linear rectifying layer, a third maximum pooling layer, a third normalization layer and a third quantization layer;
the input end of the fifth convolution layer is used as the input end of the third sub-binarization quantization model, and the output end of the fifth convolution layer is connected with the input end of the fifth linear rectification layer; the input end of the sixth convolution layer is connected with the output end of the fifth linear rectifying layer, and the output end of the sixth convolution layer is connected with the input end of the sixth linear rectifying layer; the output end of the sixth linear rectifying layer is connected with the input end of the third maximum pooling layer and is used as the B output end of the third sub-binarization quantization model; the output end of the third largest pooling layer is connected with the input end of the third normalizing layer; the input end of the third quantization layer is connected with the output end of the third normalization layer, and the output end of the third quantization layer is used as the output end A of the third sub-binarization quantization model;
the fifth convolution layer has the same structure as the first convolution layer, the fifth linear rectifying layer has the same structure as the first linear rectifying layer, the sixth convolution layer has the same structure as the second convolution layer, the sixth linear rectifying layer has the same structure as the second linear rectifying layer, the third maximum pooling layer has the same structure as the first maximum pooling layer, the third normalization layer has the same structure as the first normalization layer, and the third quantization layer has the same structure as the first quantization layer.
2. The image processing method based on the binary quantization model according to claim 1, characterized in that said step S1 comprises the sub-steps of:
s11, normalizing each image in the image set to obtain a normalized image pixel value;
s12, the normalized image pixel point values are scaled to [-128, 128] to obtain the initial input data of each image.
3. The image processing method based on the binarization quantization model as recited in claim 2, characterized in that the method for carrying out normalization processing on the image in the step S11 is as follows: adjusting the mean value of all pixel points in the image to be 0 and the variance to be 1;
the formula of the normalization process is:
[equation image: Figure FDA0003677114660000031]
wherein
[equation image: Figure FDA0003677114660000032]
is the i-th normalized image pixel point value, x_i is the value of the i-th pixel point before normalization, m is the total number of pixel points, and ε is a parameter that prevents the denominator from being zero;
the specific method of scaling the normalized image pixel point values to [-128, 128] in step S12 is: each normalized image pixel point value is represented by a 256-bit one-dimensional binary vector I = (i_1, i_2, …, i_n, …, i_256), where i_1, i_2, …, i_n, …, i_256 are the 256 components of the binary vector I, and the value range of each component is {+1, -1}.
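The normalization of step S11 (mean 0, variance 1, with a small ε guarding the denominator) can be sketched directly from the claim text; the 256-bit binary-vector encoding of step S12 is left as stated, since its exact bit layout is not specified:

```python
import numpy as np

def normalize_image(x, eps=1e-8):
    """Adjust the pixel values of one image to mean 0 and variance 1;
    eps prevents a zero denominator, as in the patent's formula."""
    mu = x.mean()
    var = x.var()
    return (x - mu) / np.sqrt(var + eps)

img = np.array([[10.0, 20.0], [30.0, 40.0]])
z = normalize_image(img)
# z.mean() is ~0 and z.var() is ~1
```

This is the standard per-image standardization used before feeding data to the binarization quantization model.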
4. The image processing method based on the binarization quantization model according to claim 1, wherein the convolution kernel size of the first convolution layer is 3 × 3, and its formula is: L1(I) = I * W_b, W_b = Binarize(W),
[equation image: Figure FDA0003677114660000041]
wherein W is the weight matrix to be trained, W_b is the binarized weight matrix, Binarize() is the binarization function, I is the binarized vector, L1(I) is the output of the first convolution layer, and * is the convolution operation;
the formula of the first linear rectifying layer is: L2(I) = ReLU(L1(I)), where ReLU() is the activation function,
[equation image: Figure FDA0003677114660000042]
and L2(I) is the output of the first linear rectifying layer;
the convolution kernel size of the second convolution layer is 3 × 3, and its formula is: L3(I) = L2(I) * W_b, where L3(I) is the output of the second convolution layer;
the formula of the second linear rectifying layer is: L4(I) = ReLU(L3(I)), where L4(I) is the output of the second linear rectifying layer;
the first maximum pooling layer is provided with a 2 × 2 filter with stride 2, and its formula is: L5(I) = MaxPool(L4(I)), where MaxPool() is the maximum pooling function and L5(I) is the output of the first maximum pooling layer;
the formula of the first normalization layer is: L6(I) = BN(L5(I)), where BN() is the Batch Normalization function and L6(I) is the output of the first normalization layer;
the formula of the first quantization layer is: L7(I) = Binarize(L6(I)), where Binarize() is the quantization function and L7(I) is the output of the first quantization layer;
in the forward propagation of the first quantization layer:
[equation image: Figure FDA0003677114660000051]
and in the backward propagation of the first quantization layer:
[equation image: Figure FDA0003677114660000052]
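The binarized convolution L(I) = I * W_b of claim 4 can be sketched in plain numpy. This is an assumption-laden illustration (single channel, stride 1, no padding — details the patent leaves open), with Binarize() taken as the sign function:

```python
import numpy as np

def binarize(w):
    # Binarize(): map the real-valued weight matrix W to {+1, -1}.
    return np.where(w >= 0, 1.0, -1.0)

def conv2d_binary(x, w_real):
    """Valid 2D convolution with binarized weights, L(I) = I * W_b.
    Plain-numpy sketch: single channel, stride 1, no padding."""
    w_b = binarize(w_real)
    kh, kw = w_b.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Slide the binarized 3 x 3 kernel over the input window.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w_b)
    return out

x = np.ones((4, 4))
w = np.array([[0.3, -0.2, 0.1]] * 3)   # hypothetical 3 x 3 real-valued kernel
y = conv2d_binary(x, w)                # 2 x 2 output
```

Because the kernel entries collapse to ±1, the multiply-accumulate reduces to signed additions, which is the efficiency argument behind binarized quantization models.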
5. the image processing method based on the binarization quantization model as recited in claim 1, characterized in that the fourth sub-binarization quantization model comprises a seventh convolution layer, a seventh linear rectification layer, an eighth convolution layer, an eighth linear rectification layer and a first anti-pooling layer which are connected in sequence; the input end of the seventh convolution layer is used as the input end of a fourth sub-binarization quantization model; the output end of the first anti-pooling layer is used as the output end of a fourth sub-binarization quantization model;
the fifth sub-binarization quantization model comprises a ninth convolution layer, a ninth linear rectification layer, a tenth convolution layer, a tenth linear rectification layer and a second anti-pooling layer which are connected in sequence; the input end of the ninth convolution layer is used as the input end of a fifth sub-binarization quantization model; the output end of the second anti-pooling layer is used as the output end of a fifth sub-binarization quantization model;
the sixth sub-binarization quantization model comprises an eleventh convolution layer, an eleventh linear rectification layer, a twelfth convolution layer, a twelfth linear rectification layer and a third anti-pooling layer which are connected in sequence; the input end of the eleventh convolution layer is used as the input end of a sixth sub-binarization quantization model; the output end of the third inverse pooling layer is used as the output end of a sixth sub-binarization quantization model;
the seventh sub-binarization quantization model comprises a thirteenth convolution layer, a thirteenth linear rectification layer, a fourteenth convolution layer, a fourteenth linear rectification layer and a fifteenth convolution layer which are connected in sequence; the input end of the thirteenth convolution layer is used as the input end of a seventh sub-binarization quantization model; an output end of the fifteenth convolution layer serves as an output end of a seventh sub-binarization quantization model;
the seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth and fourteenth convolutional layers have the same structure as the first convolutional layer;
the operation formula of the fifteenth convolution layer is the same as that of the first convolution layer, and the convolution kernel size is 1*1;
the seventh linear rectifying layer, the eighth linear rectifying layer, the ninth linear rectifying layer, the tenth linear rectifying layer, the eleventh linear rectifying layer, the twelfth linear rectifying layer, the thirteenth linear rectifying layer and the fourteenth linear rectifying layer have the same structure as the first linear rectifying layer;
the first anti-pooling layer, the second anti-pooling layer and the third anti-pooling layer have the same structure and all adopt an anti-pooling function UnPool ().
6. The image processing method based on the binary quantization model according to claim 1, characterized in that said step S3 comprises the sub-steps of:
s31, constructing a loss function;
s32, randomly inputting initial input data of an image to the binary quantization model, and calculating a gradient by adopting a loss function;
s33, updating the weight matrix and the learning rate after binarization in the binarization quantization model according to the gradient;
and S34, repeatedly executing the step S32 to the step S33 until the weight matrix and the learning rate after binarization are optimal, and obtaining the trained binarization quantization model.
7. The image processing method based on the binarization quantization model according to claim 6, wherein the loss function in step S31 is:
[equation image: Figure FDA0003677114660000061]
wherein
[equation image: Figure FDA0003677114660000062]
is the loss function, y_true is the output of the binarization quantization model in the training process, and y_pred is the predicted output;
the formulas for updating the binarized weight matrix and the learning rate in step S32 are:
[equation image: Figure FDA0003677114660000063]
[equation image: Figure FDA0003677114660000071]
wherein W_b^t is the binarized weight matrix obtained in the t-th training iteration, W_b^(t-1) is the binarized weight matrix obtained in the (t-1)-th training iteration, η is a hyper-parameter, W_b is the binarized weight matrix, η_t is the learning rate of the t-th training iteration, η_(t-1) is the learning rate of the (t-1)-th training iteration, and
[equation image: Figure FDA0003677114660000072]
is the exponential decay rate of the first-moment estimate in the (t-1)-th training iteration.
CN202110569275.7A 2021-05-25 2021-05-25 Image processing method based on binarization quantization model Active CN113159301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110569275.7A CN113159301B (en) 2021-05-25 2021-05-25 Image processing method based on binarization quantization model

Publications (2)

Publication Number Publication Date
CN113159301A CN113159301A (en) 2021-07-23
CN113159301B true CN113159301B (en) 2022-10-28

Family

ID=76877228


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019079895A1 (en) * 2017-10-24 2019-05-02 Modiface Inc. System and method for image processing using deep neural networks
CN110852964A (en) * 2019-10-30 2020-02-28 天津大学 Image bit enhancement method based on deep learning
CN111382788A (en) * 2020-03-06 2020-07-07 西安电子科技大学 Hyperspectral image classification method based on binary quantization network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239793B (en) * 2017-05-17 2020-01-17 清华大学 Multi-quantization depth binary feature learning method and device
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019079895A1 (en) * 2017-10-24 2019-05-02 Modiface Inc. System and method for image processing using deep neural networks
CN110852964A (en) * 2019-10-30 2020-02-28 天津大学 Image bit enhancement method based on deep learning
CN111382788A (en) * 2020-03-06 2020-07-07 西安电子科技大学 Hyperspectral image classification method based on binary quantization network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
An Efficient Binary Convolutional Neural Network with Numerous Skip Connections for Fog Computing;Lijun Wu 等;《IEEE Internet of Things Journal》;20210118;第8卷(第14期);1-10 *
Network Search for Binary Networks;Jiajun Du 等;《International Joint Conference on Neural Networks》;20190719;1-8 *
Research on FPGA Acceleration of Convolutional Neural Networks Based on Low-Precision Quantization; Qi Di; China Master's Theses Full-text Database, Information Science and Technology; 20200315 (No. 3); I135-416, abstract, sections 2.3 and 3 *
A Survey of Deep Learning Optimizer Methods and Learning-Rate Decay Schemes; Feng Yuxu et al.; Data Mining; 20180929; Vol. 8 (No. 4); 186-200 *
Multi-Attention Fusion U-shaped Network Method for Ground-Object Classification of Remote Sensing Images; Li Daoji et al.; Acta Geodaetica et Cartographica Sinica; 20200831; Vol. 49 (No. 8); 1051-1064 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant