CN114692815A - Method for optimizing low-bit model training
- Publication number
- CN114692815A (application CN202011617715.3A)
- Authority
- CN
- China
- Prior art keywords
- training
- model
- data
- network
- bit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The invention provides a method for optimizing the training of a low-bit model, aiming to overcome the defects of the prior art: the severe precision loss and difficult convergence of existing 2-bit models during training. The method comprises the following steps: S1, training a full-precision model: a full-precision model is trained on a data set; S2, training low-bit models: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a method for optimizing low-bit model training.
Background
In recent years, with the rapid development of science and technology, the era of big data has arrived. Deep learning, with the deep neural network (DNN) as its model, has achieved remarkable results in many key fields of artificial intelligence, such as image recognition, reinforcement learning, and semantic analysis. The convolutional neural network (CNN) is a typical DNN structure that can effectively extract hidden-layer features of an image and classify images accurately, and it has been widely applied to image recognition and detection in recent years.
In the prior art, the ReLU function is mostly adopted when training a full-precision model, because full-precision numbers cover a wide range of real values and can satisfy the numerical range required during training. When training at low bit widths, however, the representable range is limited by the bit width, so the model cannot converge effectively during training and the precision of the final model is unsatisfactory.
Commonly used terms in the art include:
Convolutional neural network (CNN): a type of feedforward neural network that performs convolution calculations and has a deep structure.
Quantization: the process of approximating a continuous range of values (or a large set of possible discrete values) of a signal by a finite number of (or fewer) discrete values.
Low bit: quantizing data to a bit width of 8, 4, or 2 bits.
Disclosure of Invention
To solve these problems, the invention aims to overcome the defects of the prior art, namely the severe precision loss and difficult convergence of existing 2-bit models during training.
The low-bit model is fine-tuned from a full-precision model: first, a full-precision model is trained on the data set until it reaches the target precision; the low-bit model is then fine-tuned on the basis of this full-precision model.
Specifically, the invention provides a method for optimizing low-bit model training, comprising the following steps:
S1, training a full-precision model: a full-precision model is trained on a data set;
S2, training low-bit models: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.
The step S1 further includes:
S1.1, training data:
the data set for training the model is ImageNet1000, a subset of the ImageNet data set with a training set of about 1.2 million images, a validation set of 50,000 images, a test set of 150,000 images, and 1,000 classes;
S1.2, establishing a model:
the basic neural network model adopted for training is MobileNet V1, a network built on depthwise separable convolutions;
S1.3, training the network:
the basic steps for training the network are: set the weight-decay coefficient to 0.0005, first train 60 epochs with the Adam optimizer, and then use the SGD optimizer until training is finished;
S1.4, testing the network: evaluate the trained network on the test set.
The step S2 further includes:
S2.1, data quantization: quantize the data to be quantized to obtain low-bit data;
S2.2, low-bit model training:
S2.2.1, train the 4-bit model;
S2.2.2, train the 2-bit model;
S2.2.3, test the network;
S2.2.4, output the network.
In step S2.1, quantization is performed according to Formula 1.
Description of variables: W_f is the full-precision data (an array); W_q is the simulated quantized data; max_w is the maximum value in the full-precision data W_f; min_w is the minimum value in W_f; and b is the bit width after quantization.
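Formula 1 itself is not reproduced in this text. As an assumption, a conventional min-max simulated quantization consistent with the variables defined above would be:

```latex
% A reconstruction consistent with W_f, W_q, max_w, min_w and b;
% the exact Formula 1 of the filing is not reproduced in the text.
W_q = \operatorname{round}\!\left(\frac{W_f - \min\nolimits_w}{\max\nolimits_w - \min\nolimits_w}\,\bigl(2^b - 1\bigr)\right)\cdot\frac{\max\nolimits_w - \min\nolimits_w}{2^b - 1} + \min\nolimits_w
```

This maps W_f onto 2^b evenly spaced levels between min_w and max_w and then back to the original scale, which is what "simulated quantized data" describes.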
In step S2.2.1, the 4-bit model is trained: during training, the weights and activations are quantized to 4 bits, and the weight-decay coefficient is set to 0, i.e., weight decay is not used; the model is trained with the Adam optimizer until convergence.
In step S2.2.2, the 2-bit model is trained: step S2.2.1 yields a model whose weights and activations are quantized to 4 bits; based on this model, a model with weights and activations quantized to 2 bits is trained. When training the 2-bit model, the weight-decay coefficient is set to 0.00005, the Adam optimizer is used for the first 60 epochs, then the learning rate is reduced to 0.0001 and the SGD optimizer is used to train the model until convergence.
Thus, the advantage of the present application is that the method is simple: by training a full-precision model on a data set, then training a 4-bit model and a 2-bit model in turn, and adopting different weight-decay coefficients and optimizers at different bit widths, it improves the precision of a convolutional neural network model under quantization.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a schematic flow diagram of the process of the present invention.
FIG. 2 is a schematic diagram of the training process of the low bit model of the present invention.
Detailed Description
In order that the technical contents and advantages of the present invention can be more clearly understood, the present invention will now be described in further detail with reference to the accompanying drawings.
As shown in Fig. 1, the present invention relates to a method for optimizing low-bit model training, comprising the following steps:
S1, training a full-precision model: a full-precision model is trained on a data set;
S2, training low-bit models: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.
Specifically, the method comprises the following steps:
1. Full-precision model training:
1) Training data:
The data set for training the model is ImageNet1000, a subset of the ImageNet data set with a training set of about 1.2 million images, a validation set of 50,000 images, a test set of 150,000 images, and 1,000 classes.
2) Model:
The basic neural network model adopted for training is MobileNet V1, a model built on depthwise separable convolutions.
3) Training the network:
The basic steps for training the network are: set the weight-decay coefficient to 0.0005, first train 60 epochs with the Adam optimizer, and then use the SGD optimizer until training is finished (a sketch of this schedule follows the list).
4) Testing the network:
Evaluate the trained network on the test set.
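As an illustration only, a minimal PyTorch-style sketch of this two-phase schedule; the MobileNet V2 model and fake data set stand in for MobileNet V1 and ImageNet1000, and the learning rates and phase-2 epoch count are assumptions, since neither is specified above:

```python
import torch
import torchvision
from torch.utils.data import DataLoader

# Stand-in data and model; the method described above uses ImageNet1000 and MobileNet V1.
dataset = torchvision.datasets.FakeData(size=256, image_size=(3, 224, 224), num_classes=1000,
                                        transform=torchvision.transforms.ToTensor())
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)
model = torchvision.models.mobilenet_v2(num_classes=1000)  # stand-in for MobileNet V1
criterion = torch.nn.CrossEntropyLoss()

def train_epochs(optimizer, epochs):
    """Plain supervised training loop for the given number of epochs."""
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

# Phase 1: Adam with weight decay 0.0005 for 60 epochs.
train_epochs(torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4), epochs=60)
# Phase 2: switch to SGD until training is finished (epoch count assumed here).
train_epochs(torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, weight_decay=5e-4),
             epochs=30)
```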
2. Low-bit model training:
Data quantization: the data to be quantized is quantized according to Formula 1 to obtain low-bit data.
Description of variables: W_f is the full-precision data (an array); W_q is the simulated quantized data; max_w is the maximum value in the full-precision data W_f; min_w is the minimum value in W_f; and b is the bit width after quantization.
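A minimal NumPy sketch of this simulated quantization, assuming the conventional min-max formula given in the reconstruction above:

```python
import numpy as np

def fake_quantize(w_f: np.ndarray, b: int) -> np.ndarray:
    """Simulated (fake) quantization: map w_f onto 2**b evenly spaced
    levels between its minimum and maximum, then back to the original scale."""
    min_w, max_w = w_f.min(), w_f.max()
    if max_w == min_w:                     # degenerate case: constant array
        return w_f.copy()
    step = (max_w - min_w) / (2 ** b - 1)  # width of one quantization level
    return np.round((w_f - min_w) / step) * step + min_w

w_f = np.array([-1.3, -0.2, 0.0, 0.7, 2.5])
print(fake_quantize(w_f, b=2))  # 2-bit: only 2**2 = 4 representable values
```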
The low-bit model training process is shown in Fig. 2; it is divided into two main steps:
1) Training the 4-bit model:
During training, the weights and activations are quantized to 4 bits, and the weight-decay coefficient is set to 0, i.e., weight decay is not used. The model is trained with the Adam optimizer until convergence.
2) Training the 2-bit model:
The first stage yields a model whose weights and activations are quantized to 4 bits. Based on this model, a model with weights and activations quantized to 2 bits is trained: the weight-decay coefficient is set to 0.00005, the Adam optimizer is used for the first 60 epochs, then the learning rate is reduced to 0.0001 and the SGD optimizer is used to train the model until convergence (a sketch of this staged schedule follows).
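A sketch of the staged schedule under stated assumptions: `set_bit_width` is a hypothetical hook that makes the model fake-quantize its weights and activations (e.g., with `fake_quantize` above) at the given bit width, and all learning rates except the stated 0.0001, as well as the epoch counts wherever "until convergence" is specified, are assumptions:

```python
import torch

def finetune_low_bit(model, train_loader, criterion, set_bit_width):
    """Staged low-bit fine-tuning as described above: 4-bit stage, then 2-bit stage."""
    def run(optimizer, epochs):
        for _ in range(epochs):
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()

    # Stage 1: weights/activations at 4 bits, weight decay 0 (i.e., none),
    # Adam until convergence (epoch count assumed).
    set_bit_width(model, 4)
    run(torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0), epochs=60)

    # Stage 2: weights/activations at 2 bits, weight decay 0.00005;
    # Adam for the first 60 epochs, then SGD at learning rate 0.0001
    # until convergence (epoch count assumed).
    set_bit_width(model, 2)
    run(torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-5), epochs=60)
    run(torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-5),
        epochs=30)
```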
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes may be made to the embodiment of the present invention by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (6)
1. A method for optimizing the training of a low-bit model, the method comprising the following steps:
S1, training a full-precision model: training a full-precision model based on a data set;
S2, training low-bit models: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.
2. The method for optimizing low-bit model training of claim 1, wherein said step S1 further comprises:
S1.1, training data:
the data set for training the model is ImageNet1000, a subset of the ImageNet data set with a training set of about 1.2 million images, a validation set of 50,000 images, a test set of 150,000 images, and 1,000 classes;
S1.2, establishing a model:
the basic neural network model adopted for training is MobileNet V1, a network built on depthwise separable convolutions;
S1.3, training the network:
the basic steps for training the network are: setting the weight-decay coefficient to 0.0005, first training 60 epochs with the Adam optimizer, and then using the SGD optimizer until training is finished;
S1.4, testing the network: evaluating the trained network on the test set.
3. The method for optimizing low-bit model training of claim 1, wherein said step S2 further comprises:
S2.1, data quantization: quantizing the data to be quantized to obtain low-bit data;
S2.2, low-bit model training:
S2.2.1, training the 4-bit model;
S2.2.2, training the 2-bit model;
S2.2.3, testing the network;
S2.2.4, outputting the network.
4. The method for optimizing low-bit model training of claim 3, wherein in said step S2.1, quantization is performed according to Formula 1, where W_f is the full-precision data (an array); W_q is the simulated quantized data; max_w is the maximum value in the full-precision data W_f; min_w is the minimum value in W_f; and b is the bit width after quantization.
5. The method for optimizing low-bit model training of claim 3, wherein in said step S2.2.1, the 4-bit model is trained as follows: during training, the weights and activations are quantized to 4 bits, and the weight-decay coefficient is set to 0, i.e., weight decay is not used; the model is trained with the Adam optimizer until convergence.
6. The method for optimizing low-bit model training of claim 5, wherein in said step S2.2.2, the 2-bit model is trained as follows: step S2.2.1 yields a model whose weights and activations are quantized to 4 bits; based on this model, a model with weights and activations quantized to 2 bits is trained, with the weight-decay coefficient set to 0.00005; the Adam optimizer is used for the first 60 epochs, then the learning rate is reduced to 0.0001 and the SGD optimizer is used to train the model until convergence.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202011617715.3A | 2020-12-31 | 2020-12-31 | Method for optimizing low-bit model training
Publications (1)

Publication Number | Publication Date
---|---
CN114692815A | 2022-07-01
Family
ID=82134849
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202011617715.3A (pending) | Method for optimizing low-bit model training | 2020-12-31 | 2020-12-31

Country Status (1)

Country | Link
---|---
CN | CN114692815A
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination